Exploiting the quantum mechanically derived force field for functional materials simulations

The computational design of functional materials relies heavily on large-scale atomistic simulations. Such simulations are often problematic for conventional classical force fields, which require tedious and time-consuming parameterization of interaction parameters. The problem can be solved using a quantum mechanically derived force field (QMDFF)—a system-specific force field derived directly from the first-principles calculations. We present a computational approach for atomistic simulations of complex molecular systems, which include the treatment of chemical reactions with the empirical valence bond approach. The accuracy of the QMDFF is verified by comparison with the experimental properties of liquid solvents. We illustrate the capabilities of our methodology to simulate functional materials in several case studies: chemical degradation of material in organic light-emitting diode (OLED), polymer chain packing, material morphology of organometallic photoresists. The presented methodology is fast, accurate, and highly automated, which allows its application in diverse areas of materials science.


INTRODUCTION
Many problems in computational chemistry and materials science require large-scale simulations at atomic resolution. State-of-the-art classical force fields (FFs) allow to study the structure and dynamics of systems that include billions of atoms 1 . Such studies, however, can be applied only to the relatively narrow range of chemical compounds for which FF parameterizations exist. Although some families of compounds, such as biological macromolecules or molecular liquids, are well parameterized and are simulated routinely 2 , less popular areas of chemical space often lack FF parameters. For example, the vast majority of organometallic complexes or fused heteroaromatic moieties, which are common in organic electronic devices such as solar cells, diodes, and transistors cannot be simulated by means of classical molecular dynamics (MD) without laborious and time-consuming development of interaction parameters for FF. Moreover, the study of functional materials usually implies not only structure or energy calculations but also the modeling of charge and energy transfer processes that involve molecules in ionic or excited states. Such exotic objects are currently beyond the scope of any predefined empirical potentials. Ab initio methods, on the other hand, can be applied to substances of any chemical nature and any electronic configuration. An issue, in that case, is prohibitively high demand for computational resources. A way to combine strong points of both classical and ab initio approaches is therefore required.
One of the possible solutions is to apply ab initio calculations to parameterize FF every time we starting simulations of the new system. Actually, ab initio data have always been included in the target values for parameter fitting in the FF development 3,4 . To bring this idea to its full fruition, the references to purely empirical (experimental) data should be excluded and an automatic procedure for the conversion of ab initio output to the FF input should be implemented. The algorithm needs to treat all intraand intermolecular interactions in a self-consisted fashion. Intermolecular terms are particularly challenging here because it is difficult to obtain interatomic repulsion potential or reliable dispersion energy from the first principles. A concept of the firstprinciples based FF has now several implementations, such as QuickFF 5 , MOF-FF 6 , or JOYCE/PICKY protocol 7,8 , not to mention numerous studies devoted to specific systems.
In the present work, we adopt an approach developed by Grimme 9 and called quantum mechanically derived force field (QMDFF). Our choice is based on the advanced treatment of the intermolecular interactions demonstrated by QMDFF, its applicability to a wide range of chemical compounds, and the availability of program tools for input preparation and basic simulations. Unlike other approaches, QMDFF generates intermolecular potential energy terms from the ab initio calculations of a single molecule in a vacuum, without the use of external databases or simulations of condensed phase. The amount of input data is very limited: the equilibrium structure of the molecule, the Hessian matrix, the atomic partial charges, and the covalent bond orders. Based on this input, a full-fledged FF is created by automated procedure, with the intra-and intermolecular potential energy terms, although without explicit account for atomic polarization (fixed point charges approximation). Being defined for a single configuration of the isolated molecule, QMDFF preserves high accuracy even for conformationally flexible systems and weakly bound intermolecular complexes 9 . QMDFF also showed good performance in predicting structures and energies across diverse molecular datasets, including transition metal complexes 9 . Wide coverage of chemical space and automated derivation of FF parameters make QMDFF a promising tool for MD simulations during the computational design of novel materials. QMDFF-based MD simulations of materials were indeed reported recently 10 in relation to the studies of OLEDs. However, the general question of applicability of FF parameters derived for an isolated molecule to condensed phase simulations remains poorly discussed in the literature. Accurate and consistent treatment of intra-and intermolecular noncovalent interactions implemented in QMDFF is encouraging, but further validation for bulky systems is required.
The number of potential applications of the FF can be greatly increased if it allows for the simulation of chemical reactions, i.e., formation or cleavage of covalent bonds. QMDFF is not a reactive FF in the sense that the molecular topology cannot change during the simulation and no new bonds can be formed, which distinguishes QMDFF from conventional reactive FFs like COMB 11 or ReaxFF 12 . Nevertheless, QMDFF provides potential opportunities in this respect, since all covalent interactions in QMDFF are anharmonic and the bonds can be broken. The continuous transition from the reactants to the products can then be modeled with the aid of the empirical valence bond (EVB) scheme 13 , which produces a total double-well potential energy curve as a combination of the diabatic states corresponding to the reactants and the products. In principle, the EVB scheme is able to add reactive capabilities to any nonreactive FF. There are two essential features of the QMDFF that make it especially beneficial for the EVB approach. The first is the ability to account for unusual bonding topology and electronic configuration occurring after the covalent bond is broken or formed. The second is the accurate description of anharmonic dissociation terms, which stems from the proper treatment of the harmonic stretching and atomization energies inherent to the FF definition. Accurate dissociation terms are expected to ensure a more accurate description of the whole reaction path with minimal need for empirical fitting.
EVB scheme with QMDFF potentials (EVB+QMDFF) indeed demonstrated impressive performance for several test cases of chemical reactions involving nontrivial nuclear reorganization and asymmetric energy barriers 14,15 . These test cases, however, included rather small molecules that can be considered only as simplified prototypes of real functional materials. One of our aims in the present work is to apply the EVB+QMDFF approach to the chemical reactions in complex molecular systems and materials, such as excited organic molecules undergoing chemical crosslinking reactions in the OLED emissive layer. Simulations of this kind are highly relevant in material design studies when target reactions or undesired degradation processes should be considered.
The applicability of MD simulation studies is limited not only by the intrinsic capabilities of the FF but also by the available computational methods and their performance, which is determined by the software implementation. The complicated form of the energy terms used in QMDFF makes it incompatible with conventional software packages for MD. Originally, QMDFF was incorporated into the QMSIM program developed by Shushkov and Grimme 16 . An adaptation of QMDFF to the RPMDrate 17 code has also been presented 15 . The latter aims at the calculation of the reaction rate constants using ring polymer MD approach 18,19 and can be efficiently applied only to the reactants with a very small number of atoms. QMSIM provides more convenient possibilities for the simulation of clusters and periodic systems. It is, however, still a serial program with only basic algorithms for MD simulations. In order to merge QMDFF capabilities with the computational efficiency of more mature software, we modified LAMMPS 20 program. LAMMPS is a popular open-source software package for MD simulations, which has been developed for many years and now offers rich possibilities for the simulation of crystalline, liquid, and amorphous systems. LAMMPS is based on scalable and well-optimized parallel C++ code and provides one of the best computational performances in the field 21-23 . Our custom version of LAMMPS is capable of supporting exotic functional forms of interatomic interaction potentials used in QMDFF. The core LAMMPS engine is supplemented by preprocessing tools that can write potential energy terms as numerical tables and feed them to LAMMPS. External tools are also responsible for the generation of starting conformation in LAMMPS format and merging FF for reactants and products following the EVB scheme. To further advance QMDFF capabilities toward the efficient simulation of materials, we also equipped preprocessing tool with the option to generate FF for polymer chain based on the ab initio calculations of the oligomers. The use of powerful and versatile simulation software is intended to unlock the potential of the QMDFF approach in relation to the real-life problems of material design.
To validate our computational workflow and clarify its application domain, we performed the following illustrative simulations: (1) Morphology of composite material consisting of polymer chains doped with transition metal complexes: We demonstrate the ability to build FF for polymers from ab initio calculations of two connected monomers. We simulate the morphology of amorphous poly(methyl methacrylate) (PMMA), both neat and in a mixture with small-molecule dopants. The mobility of the radical products of the degradation reaction in the polymer matrix is studied, and possible implications of this type of process on the material's stability are discussed. (2) Chemical degradation of the host material in OLED: Most of the detrimental degradation pathways during the OLED operation include photo-or electron-induced excitations in various situations. Especially for the deep-blue OLED devices, the high-energy triplet T 1 state plays a crucial role in materials deterioration 24 . For a well-known hole-conducting OLED host material 9,10-Bis(2-naphthyl)anthracene (ADN) 25 , we elucidate the influence of environment and entropic effects on the height of free energy barriers of the reactions and corresponding degradation rates. (3) Molecular organometallic photosensitive materials: We demonstrate automatic parameterization of hybrid organic-inorganic materials based on the hafnium oxide nanoparticles. We investigate the morphology of the amorphous condensed phase and calculate distributions of the active sites prone to irradiation-induced crosslinking reactions. Atomistic simulations provide estimates of the parameters suitable for the simulation of the polymerization processes occurring in photosensitive materials.
The present work is supposed to justify the use of QMDFF in simulations of condensed matter and give the outline of possible applications of QMDFF in computational materials science.

FF definition
A complete guide to the QMDFF potential energy terms, intrinsic parameters, and empirical fitting strategy can be found in ref. 9 . Here we do not repeat it, but focus on the selected features essential for our work, software implementation of QMDFF, and corresponding discussion items. The total energy in QMDFF can be represented as a sum of three parts: where E e,QM is the energy of the equilibrium structure, E intra represents the energy of the bonded interactions, and E NCI denotes the intra-and intermolecular noncovalent interactions. The E intra part contains bond stretching (str), angle bending (bend), in-plane and out-of-plane torsion (tors, oop) terms: where the sums run over all bonds and 1,3-interactions (str), bending (bend), torsional (tors), and inversion (oop) angles, and V X corresponds to the potential of type X.
The stretching term between atoms A and B is described with generalized Lennard-Jones potential: Here, r e,AB and r AB are equilibrium and current interatomic distances, and the parameter k str,AB is connected to the force constant near equilibrium. The exponent a has been fitted to the atomization energies, and the force constant is fitted to the molecule-specific ab initio Hessian matrix. The functional form of V str defined by Eq. (3) allows smooth dissociation of the bond. Although only equilibrium geometries are used for FF parameterization, due to the reasonable definition and proper fitting of global parameters, QMDFF features a good extrapolation ability. It reproduces dissociation energies of diatomic molecules and more complex structures with an accuracy of 5-10 % 9 . Noncovalent interactions are treated with the modified version of the standard intermolecular FF 26 . Total noncovalent energy consists of Pauli repulsion (rep), electrostatic (ES), hydrogen bonding (hbnd), halogen bonding (xbnd), London dispersion (disp), and (optionally) polarization (pol) terms 9 : The definition of hydrogen and halogen bonds in QMDFF is based on the atomic triplets, including donor, acceptor, and hydrogen/halogen atoms 9 . Being applied to intermolecular interactions, this approach cannot be easily combined with conventional neighbor searching algorithms used in the MD simulations. That is why we do not include the terms E hbnd and E xbnd in the present work. The possible drawback of this decision is the poor behavior of the FF in the (hopefully rare) situations when a specific kind of intermolecular interaction is present in the system. We also exclude polarization term, limiting ourselves to the additive FF with fixed atomic charges. Nonspecific Van der Waals (VdW) interactions include Pauli repulsion and London dispersion that follows D3 empirical dispersion correction scheme 27 : Here s 8 , β rep , and R 0 AB are global fit parameters and C AB 6 and C AB 8 are pair-specific parameters of D3 method 27 . f R 0 AB is Becke-Johnson rational damping factor 28 , Z eff are effective nuclear charges. The transition from bonded to nonbonded interactions is defined with the topological screening parameter ε rep/disp , which depends on the number of covalent bonds between atoms A and B. It has special values for 1-2, 1-3, 1-4, and 1-5 interactions and is equal to unity for more distant pairs of atoms. Scaling factor ω rep is the only empirical parameter not present in the original formulation of the QMDFF 9 . It is introduced to adapt the FF generated for the isolated molecules to the simulations of the condensed phase. The value ω rep = 1.4 was recommended in the QMSIM manual 16 . We tested this approach on the properties of liquid solvents. For the solvents with strong hydrogen and halogen bonding, the default parameterization procedure is not satisfactory, so additional fitting of the repulsion and, possibly, charges are required for every particular case. For organic and aprotic solvents, the value ω rep = 1.4 is indeed optimal for the calculation of density and heat of vaporization. Therefore, in the present work, we use that value of the parameter, since studied materials do not feature any prominent hydrogen or halogen bonds. See also Supplementary Fig. 1, Supplementary Tables 1 and 2, and Supplementary Note 1 for the detailed discussion. The total parameterization procedure requires a certain fitting to the results of ab initio calculations for every new molecule. Covalent bonding topology is determined on the basis of standard covalent atomic radii 29 and Wiberg-Mayer covalent bond orders 30,31 . Bonded terms are fitted to reproduce the equilibrium structure and vibration frequencies derived from the ab initio Hessian matrix. A special procedure determines rotatable bonds and fits torsional terms to the results of tight-binding DFT calculations 32,33 of corresponding prototypical fragments. Atomic partial charges are obtained by scaling of CM5 charges 34 based on Hirshfeld partitioning 35 of the electron density. During the original development of QMDFF, only 47 global empirical parameters have been adjusted 9 . After ab initio calculations of the target system, it is possible to simulate almost any molecule consisting of the first 86 elements of the periodic table.

Algorithm implementation
The flowchart of our LAMMPS-based implementation of QMDFF is illustrated in Fig. 1. A core modification of the LAMMPS source code is the ability to support 1-5 interactions, as it is required by the definition of the dispersion/repulsion in QMDFF (see Eqs. (5) and (6)). All potential energy terms are imported into LAMMPS as tables, so no further changes of the source code are needed. LAMMPS engine is supplemented by the utility LAMMPS-QMDFF written in python to provide the flow of data during FF parameterization, system construction, and MD simulations.
For every new molecule, the simulations start with the FF definition. Hessian matrix, partial charges, and bond orders are computed at the optimized geometry using the Gaussian software 36 . QMDFF tool is used to read the Gaussian output files, generate the FF, and print out the parameters in the text format. Up to that moment, we follow the procedure originally developed by Grimme 9 . The next steps are meant to expand the capabilities of QMDFF to simulate complex molecular systems and bulky materials. Given parameter files for every type of molecule, the FF definition for the total system is generated and printed out in Fig. 1 A flowchart of the LAMMPS implementation of the QMDFF force field. Blue units correspond to quantum chemistry (QC) software (Gaussian 16 36 was used) and green to the QMDFF tool. Our contribution is shown with red (modified version of LAMMPS) and orange (python tools) colors.
LAMMPS input format. This operation includes merging of the atom types, producing the required number of instances for every type of molecule, generating reasonable initial coordinates, and, possibly, modifying the FF to build a polymer chain from oligomer data. Tabulated potential energy terms written in text format can then be modified by the addition of restraining functions to prevent bonds with low dissociation energy from breaking during the initial thermalization and annealing procedure. When input files are prepared, the MD simulations can be carried out using the LAMMPS program in a conventional manner, including high-performance parallel execution and support of many-core and community-provided packages. QMDFF showed good performance and scalability for all the case studies present in the paper. Supplementary Fig. 2 shows that moderate discrepancies from the linear scaling are observed for the systems with tens of thousands of atoms.
To simulate chemical reactions according to the EVB approach, two systems must be defined, with the reactants and products having the same ordering of atoms, but different FF parameters. Starting from the ab initio relaxed potential energy scans that should be obtained beforehand, EVB coupling parameters (see section "Empirical valence bond approach") are fitted automatically using Nelder-Mead algorithm 37 and mean-squared error loss function between the ab initio and FF energies calculated for an array of values of the reaction coordinate. During the MD run, the solution of EVB equations (see section "Empirical valence bond approach") is performed by an external callback procedure, which calculates new energies and forces at every time step.
Initial coordinates can be generated in several ways. A predefined number of molecules can be placed randomly in the rectangular box, including mixtures of different molecules. The size of the box and positions of the molecules are adjusted to remove overlapping. Although the density of the initial structure is far from normal, it can be used as a starting point for the simulation. We also implemented the ability to construct a linear polymer chain from the coordinates of the monomer. Finally, the initial structure can be borrowed from the previous simulation, where some molecules are replaced by the molecules with the same number of atoms, but different FF parameters. This can be useful for starting EVB simulation in the bulk of equilibrated material.

Morphology of PMMA-based materials
Polymers and polymer-based composites represent a very important class of materials. FF that can cover a wide range of polymer compositions is, therefore, in demand by computational materials science. Such a FF must include the same parameters for similar atoms in monomeric units and provide special treatment for intra-and intermolecular interactions near the bonds connecting monomers. The procedure starting from ab initio calculation of a single molecule, implemented in QMDFF 9 , is not compatible with this approach. We enhanced the default procedure and created the model of polymeric chain on the basis of QMDFF parameterization of the methyl methacrylate (MMA) dimer (see Supplementary Fig. 3 and Supplementary Note 2).
Having a tool for PMMA modeling with QMDFF, we simulated the morphology of the materials using PMMA as a matrix (see Fig. 2). PMMA as a transparent, stable, and chemically inactive polymer has a lot of possible applications in research and industry. By the addition of luminescent dopants to PMMA, a variety of functional materials can be created 38 . This strategy was used to develop optical fibers 39,40 , solid-state lasers 41 , and sensors 42 , not to mention the widespread application of PMMA as a matrix material in laboratory practice when luminescent properties of molecules or nanoparticles are under investigation 43,44 . Hybrid materials of this kind consist of polymer chains doped with complexes containing heavy metal atom 40,42 , which makes atomistic simulations of these systems especially challenging. With the ability to treat macromolecules on the same basis as low molecular weight compounds, we can use QMDFF capabilities of automatic parameterization to routinely simulate hybrid polymer-based materials.
Our simulations started from ab initio calculations of the dopants (if needed) and the dimer of methacrylate (see Fig. 2b). We processed the results with qmdff program and obtained FF parameters. Then, we transformed the FF of the MA dimer and generated parameters for corresponding repeating and capping monomer units. On the basis of these parameters, the model of the syndiotactic PMMA chain of 500 monomers was then created. The simulation box contained one such chain in linear conformation and five dopant molecules (see Fig. 2c). The final morphology was obtained by annealing from 600 to 300 K during 2 ns, accompanied by further relaxation in the NPT ensemble during 2 ns. The morphology preparation procedure is illustrated in Supplementary Fig. 4. In the case of neat polymer (no dopants in the cell), the density for five independent runs was in the range 1.15-1.16 g cm −3 , which is close to the experimental estimates under normal conditions 45 (1.18 g cm −3 ). The typical equilibrium configuration of the doped material, suitable for further study of its properties at the atomistic level, can be seen in Fig. 2a.
Using a flexible approach provided by QMDFF for both closedand open-shell compounds, it is possible to simulate not only the quasi-equilibrium structure of doped polymer materials but also to study chemical transformations and degradation processes. In order to provide an illustrative example, we considered PMMA doped with imidazopyrazine-based Ir(III) complexes used as light emitters in blue OLED materials 43 (designated as Ir1 and Ir2 in Fig. 2b). One of the most abundant degradation channels present in OLED materials is homolytic dissociation of single C-N bonds in the first singlet excited state 24,46 . For compounds Ir1 and Ir2 this process affects bonds marked with red in Fig. 2b, leading to detachment of corresponding radical species. The only difference in the chemical structure of Ir1 and Ir2 is tert-butyl side group in the ligand; therefore, it is a convenient model system to study an impact of such a facile difference on the operational stability of the material. The main idea is as follows: the rate of irreversible dissociation in the solid matrix is governed not only by the initial radical formation but also depends on the kinetics of the subsequent recombination process. Additional bulky tert-butyl moiety is suggested to affect the size and shape of the mobile radical product, causing a noticeable difference in the radical separation rate. This concept, being recognized long ago in connection with OLED degradation 47 , has yet to be studied in detail.
To address this problem, we simulated structural relaxation and radical motion after the bond dissociation event in Ir1 and Ir2. Starting from the equilibrium snapshot of the intact system, we switched potentials and continued simulation with parameters corresponding to disconnected products. After the initial relaxation, radical products formed a contact pair separated by the VdW distance. After that, further structural relaxation during 1 ns was simulated and C-N bond length was monitored. For each MD run, the moment when the radicals get separated (determined by the first peak of the C-N pair distribution function) was recorded. The portion of the separated products among 45 runs as a function of time is plotted in Fig. 2c. In all cases, a 1 ns period of time was sufficient for the separation. Single exponent fit provides time constants 290 and 57 ps for Ir1 and Ir2, respectively. These times are far below the range of β-relaxation in PMMA (>1 ms at normal conditions 48 ) or noticeable diffusive displacement of the radicals as a whole. Separation of the products occurs mainly due to the relaxation of the local strains following the dissociation of the covalent bond. Surprisingly, tert-butyl separates faster than phenyl. We monitor radical separation by C-N distance, and its increase is caused not by the center-of-mass motion, but rather by the rotation of the radical inside the cage. Under these circumstances, tert-butyl moiety in meta-position serves as a lever bar, providing more asymmetric interaction with polymer matrix and transferring local structure relaxation into the rotational motion.
The presented simulations demonstrate how QMDFF can be used to provide mechanistic insights into the degradation reactions of light-emitting molecules in the solid matrix. Discussed mechanisms are not limited by the doped polymers and are expected to be important in the majority of amorphous OLED materials.

Degradation of OLED material
To demonstrate the possibility to investigate chemical reactions in organic materials using QMDFF in conjunction with the EVB approach, we addressed the degradation of 9,10-dinaphthylanthracene (ADN). ADN is well-known hole-conducting OLED material, existing in αand β-isomeric forms 25 . Operational degradation of ADN-based devices is one of several OLED degradation processes with known chemical origins. It was shown that irreversible chemical transformation of ADN molecule starts from the intramolecular linking reaction in the triplet state when a new bond is formed between two carbon atoms 49 (see structures in Fig. 3). Reactions of this kind are possibly responsible for the degradation of various OLED materials sharing the same structural features as ADN, so the study of crosslinking processes is of high importance in computational molecular design.
We used the developed automated methodology to simulate intramolecular linking reaction in ADN and calculate corresponding activation barriers. One of our goals was to test the capability of the EVB approach to reproduce ab initio potential energy profiles for the complex polyatomic system. We also aimed to calculate entropic contribution to the total free energy along the reaction path and estimate the impact of realistic material morphology under normal conditions. The results of the simulations can help to compare αand β-ADN molecules and investigate how relatively small changes in molecular topology can affect the stability of OLED materials.
For the considered crosslinking reaction, the most natural choice for the reaction coordinate is the distance between carbon atoms that become connected. We calculate energy profiles for αand β-ADN using three types of simulations. (i) Relaxed potential energy scan along the C-C distance. This type of calculation is identical to reaction path calculations, which are performed with QC methods. (ii) MD simulations at 300 K in a vacuum. The free energy profiles were calculated using steered MD technique, taking into account all intramolecular vibrations and conformational changes. (iii) MD simulations at 300 K in bulk. Free energy profile was derived by averaging nonequilibrium work over multiple trajectories. The results of these three types of calculations are presented in Fig. 3.
It can be seen in Fig. 3 that QMDFF was able to reproduce ab initio potential energy with reasonable accuracy. Different shapes of the curve for long C-C distances can be attributed to poor description of torsional barriers for naphthyl rotation. Nevertheless, the height, position, and shape of the peak are estimated accurately. The introduction of the entropic term resulted in higher free energy of the products for β-ADN (solid black line in Fig. 3) due to the hindered rotation of naphthyl residue after the formation of a new covalent bond. In the case of α-ADN free energy, the difference coincides with the reaction enthalpy since the rotation of naphthyl moiety is partially constrained by sterical clashes even in pristine molecules. Activation barriers have a substantial entropic contribution for both isomers. It can also be seen in Fig. 3 that the free energies of the products and transition states in bulk material are higher than in a vacuum. But the difference is surprisingly small and the free energy profile in the bulk does not show any additional features. In our opinion, the reason is the intramolecular nature of the considered reaction. In this case, initial, final and all intermediate conformations of ADN are expected to be less affected by the surroundings. The distribution of the conformations in the bulk material should be similar to that in the vacuum, as well as the average free energy profile. Nevertheless, profiles for the individual molecules can be very different (see Supplementary Figs. 5 and 6) due to different equilibrium conformations present in the amorphous solid matrix. In the case of bimolecular reactions, such as intermolecular crosslinking or bond cleavage, the entropic contribution is defined primarily by the mutual orientation of reactants or products, making a careful simulation of the material morphology inevitable. To the best of our knowledge, this is the first full-atom MD simulation of the OLED degradation process accounting for entropic effects in realistic morphology. Note that simulation is carried out on the triplet excited potential energy surface of the ADN molecule. MD simulations with QMDFF FF provide a convenient way to perform simulations of this kind for a wide range of organic materials. LAMMPS with implemented QMDFF provides a convenient way to perform simulations of this kind for a wide range of organic materials.
Although experimental observation of the degradation products was made only for a beta isomer of ADN 49 , our simulations support the existence of the same degradation channel for α-ADN. The difference in activation energies between the isomers is small and comes mainly from the entropic term, connecting the molecular structure with the stability of the OLED hole-conducting layer through the rotational mobility of the naphthyl residue.
We note that the quality of intermolecular interactions defined in the FF is of vital importance in MD simulations of various processes relevant for OLED manufacturing and operation: from deposition of OLED molecules 50 to charge or energy transfer 51,52 . Pair-specific dispersion energies based on the D3 scheme, combined with appropriately scaled repulsion terms, provide less empirical and generally more accurate parameterization than atom-type specific parameters used in conventional FFs. This makes QMDFF a promising tool for simulations of OLEDs, either regarding irreversible degradation or steady operation processes.

Light-activated organometallic materials
Another practically relevant problem that can be addressed with QMDFF is the simulation of photoresists. Rational design in this area can be greatly accelerated with the aid of computer simulations 53 . At the current level of technology achieving acceptable line edge roughness and critical dimension, uniformity implies considering explicit molecular structure of the material, so continuous theory alone is not sufficient and molecular simulations are required 54,55 .
On the other hand, the use of conventional MD simulations for many promising photoresists is hindered by the lack of reliable FF parameters due to the unusual chemical composition. The typical example is MORE-Molecular Organometallic Resists 56 , promising materials for lithography, which combine inorganic (usually metal oxide) core and organic ligands covalently bonded to the inorganic part. QMDFF has no restrictions on the FF parameters for a mixture of organic and inorganic compounds, so it can treat MORE materials on equal footing with conventional organic molecules. Using QMDFF, we were able to define FF in this problematic case following the same procedure as were used for organic substances in the preceding sections. This allowed for efficient and large-scale MD simulations of the bulk photoresist material.
We considered a typical metal-containing photoresist based on hybrid organic-inorganic nanoparticles 57 . Each nanoparticle is a small (2-3 nm) hafnium oxo-methacrylate (HfMAA) nanocluster consisted of HfO core and methacrylic acid ligand shell 58 . A typical cluster of this kind has a composition Hf 4 O 2 (OMc) 12 59 with the structure presented in Fig. 4a, b. The hybrid nature of this material allows for (relatively) efficient absorption of short-wavelength photons by hafnium core, followed by the decomposition of the ligand and formation of free radicals, which initiates radical polymerization reaction 57 . As a result, exposed parts of the sample lose their solubility providing a basis for the lithography process.
The development of the first-principles lithographic models should start from the simulation of the atomistic morphology of the bulk disordered phase. Following this idea, we made a model of amorphous HfMAA with 75 nanoparticles inserted in the simulation box. After initial heating, cooling, and subsequent geometry relaxation, we obtained an amorphous structure illustrated in Fig. 4c. The material consists of nearly spherical nanoparticles, which serve as coarse grains (~2 nm), leaving relatively large interparticle spaces. Analysis of mutual arrangement of the nanoparticles and active atomic sites prone to crosslinking reactions allows to develop coarse-grain models suitable for simulations of the resist behavior on time and length scales not accessible with atomistic approach 60,61 . The most straightforward procedure to approximate resist morphology is to consider each nanocluster as a spherical core-shell particle consisting of hardcore and penetrable ligand layer with a certain number of active sites 60 . Key parameters of this model can be derived from the radial pair distributions presented in Fig. 4d.
The radial distribution function (RDF) between centers of masses (COM) of the nanoparticles reveals the existence of the close order in the amorphous phase. The first and the second peaks denote the respective coordination shells. The onset of the distribution corresponds to the diameter of the core-the closest possible distance between two particles. Therefore, the radius of the core is defined as half of that distance: r core = 4.25 Å.
Estimation of the outer radius of the ligand layer R shell is not so straightforward. R shell is half the maximum distance between particles when a crosslinking reaction is possible. We can consider a reaction between the active sites from different nanoparticles as possible, if they are located sufficiently close to each other. The main crosslinking reaction between ligands proposed for HfMAA is the formation of a new bond between two carbon atoms 57 labeled as 1 and 2 in Fig. 4a. From the first minimum of their pair distribution function (see Supplementary Fig. 7), we can estimate cut-off radius d c = 6.35 Å and consider C1-C2 pairs as connected if the distance between them is less than d c . By doing so, we can calculate the number of connected atoms for each pair of nanoparticles. This number as a function of the COM-COM distance between nanoparticles is presented in Fig. 4d. We can see that this quasi-linear function goes to zero right near the first minimum of COM-COM RDF. The outer radius of the ligand layer R shell is defined as half that distance and equals~8.65 Å. From these observations, one can conclude that the ligand layers of any two nanoparticles from the first coordination shell always overlap, while for the second shell they never overlap. This is in contrast with the default assumption made in ref. 60 , where ligand layers start to overlap at the second minimum of RDF.
Despite somewhat different interpretations, our simulations agree well with the core-shell model suggested in ref. 60 . In Fig. 4d (solid line) we present the results of a simple model: two overlapping core-shell particles with a uniform continuous density of active sites inside the layer. The number of connections, in this case, is proportional to the volume of overlap between two spherical layers (see Supplementary Fig. 8). Even this simple model is able to reproduce the overall shape of the MD curve if we set R shell = 10.0 Å. The number of connections is slightly overestimated, since we have not considered edge effects for the active sites located near the boundary of the ligand layer and the presence of other nanoparticles.
HfMAA is a typical example of a state-of-the-art hybrid molecular photoresist. Keeping in mind the large amount of similar organometallic complexes with varying core composition and ligand nature, the ability of QMDFF to automatically parameterize new compounds seems to be valuable in computational material design. A realistic model of the material's morphology can provide key parameters for multiscale simulations of the lithography process.

DISCUSSION
Implementation of QMDFF in the versatile and powerful generalpurpose MD package LAMMPS provides an assess to various applications related to computer simulations and the discovery of novel functional materials. Highly optimized algorithms ensure computational efficiency and scalability on par with modern classical FFs. Moreover, while other FFs are suitable for a relatively narrow region of chemical space, QMDFF can be applicable for nearly every molecule since it utilizes interaction parameters obtained directly from quantum mechanical calculations. The latter is of great importance for the discovery of materials, which feature novel, diverse, and peculiar chemical scaffolds. In the present work, we demonstrated the applicability of the developed approach for real-life problems of functional materials. To the best of our knowledge, this is the first example of large-scale QMDFFbased MD simulations of complex materials with potential technological importance. We illustrated the possibility to address such problems in an automated fashion, even if the material features were very peculiar of chemistry, such as organometallic complexes.
The incorporation of the EVB approach in the QMDFF framework provides us with a powerful tool to study chemical transformations responsible for the operation and degradation of a broad variety of functional materials ranging from OLED degradation or polymer crosslinking to the chemical reactions accompanying etching processes in photoresist materials. The resulting approach is not equivalent to the full-fledged reactive FF, since it cannot predict unknown processes in the material and their final products. In order to use the EVB+QMDFF approach, one has to define certain reactions based on some a priori knowledge or chemical intuition and prepare input for both the reactants and products. From our point of view, this is a reasonable price for the universality of the FF, straightforward parameterization, and its computational speed and accuracy. The ability of QMDFF to describe chemical compounds also in the excited electronic state proved to be useful for ADN degradation study. The products in this case are just intermediate unstable species of diradical nature that seemingly do not exist in the ground state, and the whole reaction proceeds on the triplet surface. As far as we are aware, our simulations of the ADN degradation is the first example of the reaction involving large organic molecules in the bulk of the material studied using EVB.
After validating QMDFF on the set of pure solvents, we can delineate its limits of applicability. Parameters generated by QMDFF are not supposed to be used in the simulations of bulk solvents and materials if hydrogen, halogen, or, possibly, any other types of strong specific intermolecular interactions need to be considered. On the other hand, QMDFF can be used in simulations of organic solvents without specific interactions and consequently can be recommended for the simulations of organic and organometallic materials. The capability to process any chemical elements up to Z = 86 makes QMDFF almost irreplaceable in many applications, such as concerning molecules and materials with transition metals and other exotic elements. Until recently, the only solution in these cases was the universal FF 62 covering the entire periodic table, but it cannot provide the same level of accuracy as system-specific first-principles-based FFs. Therefore, the adaptation of the QMDFF to the condensed phase reported in the present work has a broad spectrum of applications and good prospects in the field of computational materials science.

METHODS EVB approach
EVB approach 13 was used to calculate adiabatic energy along the reaction path defined by the reaction coordinate q on the basis of two equilibrium diabatic states (reactants and products). Denoting potential energy terms for these states as F 1 (q) and F 2 (q), we can write down the so-called EVB matrix: F 1 ðqÞ CðqÞ CðqÞ F 2 ðqÞ : Here, C(q) is an empirical non-diagonal term that ensures the proper mixing of diabatic states. After diagonalization, we can set the lowest eigenvalue as a final adiabatic energy profile: An explicit implementation of the method requires the proper choice of the term C(q). A constant C for all values of q is often not sufficient. The next simplest choice is the Gaussian function that depends on the difference between two diabatic potentials: CðqÞ ¼ a exp Àb F 1 ðqÞ À F 2 ðqÞ À c ½ 2 : Equation (9) introduces three fitting parameters a, b, and c that can be optimized to match the reaction profile obtained from ab initio simulations. Although some cases require much more sophisticated approaches to define interstate coupling 15,[63][64][65] , in the present work we limited ourselves by Eq. (9), since this approximation proved to be robust and efficient in the studies of many chemical reactions 66,67 . Studying chemical reactions in organic materials, one usually does not have to accurately calculate absolute rates, but rather estimate reaction enthalpies and activation barriers, with account for entropic contribution and local molecular conformations. Therefore, some deviation of the derived PES from that of ab initio calculations can be tolerated. Using Eq. (9), we obtained smooth energy curves reasonably close to their ab initio counterparts (see Fig. 3), which supported our choice.

Simulation details
Ab initio calculations required for QMDFF parameterization were performed using PBE 68,69 exchange-correlation functional with GD3BJ 70 empirical dispersion correction. We used 6-31G(d,p) 71,72 basis set for the second-row elements and LanL2DZ 73 effective core potentials for metal atoms.
MD simulations were based on the tabulated potentials for intramolecular and VdW interactions, with conventional coulomb terms for electrostatics. Outer table boundary of 10.5 nm coincided with a cut-off radius of VdW and Coulomb potentials. Tables consisted of 500 bins with energies and forces calculated in the intermediate points using cubic splines. Long-range electrostatic interactions were treated with the Ewald approach 74 . Simulations in canonical and isothermal-isobaric ensembles involved Nose-Hoover-style non-Hamiltonian corrections of the equations of motion 75 with the time constant of 0.1 ps for thermostat and 1.0 ps for barostat. All covalent bonds involving hydrogen atoms were replaced by rigid constraints following SHAKE algorithm 76 . Timestep of the numerical integration was 0.5 fs throughout all simulations.
Free energy profiles for ADN were calculated with COLVARS 77 module using steered MD 78 approach with force constant 5000 kcal mol −1 Å −2 and speed 4.5 Å ns −1 . Total profile was obtained after averaging over 30 independent runs for different molecules in both directions (see Supplementary Figs. 5 and 6).

DATA AVAILABILITY
All files with the parameters, initial coordinates, and simulation setup required to reproduce case studies presented in the paper, as well as final coordinates (the snapshots from equilibrium MD trajectories) are available online at https://github. com/lammps-qmdff/lammps-qmdff.