## Abstract

The multifaceted physics of oxides is shaped by their composition and the presence of defects, which are often accompanied by the formation of polarons. The simultaneous presence of polarons and defects, and their complex interactions, pose challenges for first-principles simulations and experimental techniques. In this study, we leverage machine learning and a first-principles database to analyze the distribution of surface oxygen vacancies (V_{O}) and induced small polarons on rutile TiO_{2}(110), effectively disentangling the interactions between polarons and defects. By combining neural-network supervised learning and simulated annealing, we elucidate the inhomogeneous V_{O} distribution observed in scanning probe microscopy (SPM). Our approach allows us to understand and predict defective surface patterns at enhanced length scales, identifying the specific role of individual types of defects. Specifically, surface-polaron-stabilizing V_{O}-configurations are identified, which could have consequences for surface reactivity.

## Introduction

The rich and tunable physics of oxides depends on their precise chemical composition and on the presence of impurities, including atomic vacancies, interstitial atoms, and dopants^{1,2,3,4}. Defects at the atomic level frequently lead to the formation of polarons, which are localized charge carriers arising from the interplay between unbound charges and lattice phonons^{5,6,7}. In the specific case of so-called small polarons, the polaronic charge is localized almost entirely on one atomic site, surrounded by a sizable distortion of the local lattice structure^{8}. In conjunction with their inducing defects, these small polarons play a dominant role in a wide range of processes relevant to technological applications^{9} and fundamental phenomena such as charge carrier mobility^{10,11,12}, electron-hole recombination^{13,14}, and adsorption^{15,16,17}.

Most importantly, polarons are known to be highly relevant in the context of (photo)catalysis^{18,19,20,21} and single-atom catalysis^{22,23,24}. The localized charge carriers act as active centers, which enhance (photo)catalytic activity by providing sites that can readily adsorb and interact with reactant molecules^{25,26}. Although polaron formation may in principle occur on any site of the lattice, defects can act as attractive or repulsive centers, favoring specific polaronic configurations over others^{27}. In turn, the dynamics and distribution of the atomic defects are known to be altered by the polarons^{28}. Therefore, control over the spatial distribution of polaronic active centers becomes pivotal in optimizing (photo)catalytic performance.

While theoretical studies based on density functional theory (DFT) have elucidated excess charge localization in relation to the inducing defect in many materials^{29,30,31}, the specific role of subsurface and surface polarons, particularly in the presence of defects, on the archetypal redox-active oxide surface TiO_{2}(110) is still debated. Here, a problem arises from the complexity of the configuration space of point impurities, which DFT calculations struggle to explore because of their computational cost. As a consequence, either no exploration attempt is performed (i.e., most studies rely on the configuration randomly obtained in the DFT calculation)^{32,33}, or effective but costly approaches are adopted, such as molecular dynamics^{34,35}, Monte-Carlo-driven DFT simulations^{36}, or systematic explorations limited to a handful of localization sites^{37}. While other fitting methods such as cluster expansion^{38} have addressed the configurational problem of disordered impurities in (oxide) materials^{39,40,41}, the interactions arising from polarons and other charged defects have sizable contributions within large cutoff distances (≈10 Å)^{37}, resulting in a combinatorial divergence of possible cluster interactions^{42,43}. Thus, finding a method that effectively navigates the diverse defect-polaron configuration landscape has become a research imperative.

In this study, we focus on rutile TiO_{2}(110) and show how the spatial distribution of V_{O} measured by SPM can be successfully predicted and interpreted by first-principles calculations if the coupling between V_{O} and polarons is taken into account. To address this problem, we developed a strategy based on defect distribution descriptors and neural networks to predict the stability of specific polaron-vacancy patterns. Through an iterative optimization active learning cycle (similar in spirit to cluster expansion approaches studying atomic disorder^{39}), we systematically extended the DFT reference dataset and converged the machine learning (ML) model, to efficiently explore the defect-polaron configuration space. The model can capture the complexity of the V_{O}-polaron interactions with DFT accuracy and proposes alternative configurations showing remarkable energy stability. By feeding Markov-chain Monte-Carlo (MC) algorithms with the ML configuration energies, we simulate the annealing process leading to the formation of vacancies and polarons in the experimental samples. As a final result, we obtain large-area (>10 × 10 nm^{2}) surface morphologies resembling the SPM measurements. This analysis revealed physical properties of the polarons on TiO_{2}(110), where the formation of inhomogeneously distributed V_{O} is linked to an increased formation of surface polarons and, therefore, to the density of active sites.

## Results and discussion

### Defect distribution via DFT, experiment, and machine learning

Figure 1 shows the surface structure of reduced rutile TiO_{2}(110) as imaged by constant-current STM measurements (see panel b and Methods Section), together with the models predicted from DFT without taking polarons into account or by explicitly modeling their impact via machine learning (see panels a and c, respectively). The unreconstructed 1 × 1 rutile surface consists of alternating rows of under-coordinated (two-fold) oxygen atoms (the bridging oxygen atoms, O_{br}) and fivefold coordinated titanium atoms (Ti_{5c}) running along the [001] direction^{44,45}. Oxygen vacancies form easily on the O_{br} sites upon sputtering and annealing, up to a critical concentration of c_{V_{O}} ≃ 17%^{34}. At stronger reducing conditions, the surface undergoes a structural reconstruction^{46,47,48,49,50,51}. Every V_{O} releases two excess electrons that form polaronic states, localizing preferably on subsurface Ti sites^{27,35,52}. Thus, the V_{O} can be considered as a positively charged (2+) center. From simple electrostatic considerations that neglect the role of polarons, one would expect a purely repulsive interaction among the vacancies. In this picture, the configuration maximizing the V_{O}–V_{O} distances represents the most favorable vacancy distribution. For the critical concentration of c_{V_{O}} = 17%, this corresponds to a homogeneous configuration with a V_{O}–V_{O} distance of six lattice sites along the [001] row, and three lattice sites considering two oxygen vacancies on adjacent rows (see Fig. 1a).
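The homogeneous pattern described above can be illustrated with a short sketch. It assumes one O_{br} site per 1 × 1 surface cell and reads the "three lattice sites" on adjacent rows as a shift of three sites along [001]; the function names and the grid representation are hypothetical and serve only to check the concentration arithmetic.

```python
def homogeneous_vacancies(n_001, n_110, spacing=6, shift=3):
    """Place one V_O every `spacing` sites along each [001] row,
    shifting adjacent rows by `shift` sites (assumed reading of Fig. 1a)."""
    occupied = set()
    for row in range(n_110):
        offset = (row * shift) % spacing
        for site in range(offset, n_001, spacing):
            occupied.add((site, row))  # (position along [001], row index)
    return occupied


def concentration(vacancies, n_001, n_110):
    """Fraction of bridging-oxygen sites that are vacant."""
    return len(vacancies) / (n_001 * n_110)


# 6 x 4 supercell: 24 O_br sites, one vacancy per row
pattern = homogeneous_vacancies(6, 4)
c_vo = concentration(pattern, 6, 4)
print(sorted(pattern), f"c_VO = {c_vo:.1%}")
```

With four vacancies on 24 sites, the sketch gives 1/6 ≈ 16.7%, matching the critical concentration quoted in the text.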

DFT calculations confirm the homogeneous V_{O} distribution in Fig. 1a as the ground state configuration, as long as the formation of polarons is suppressed (i.e., the excess electrons are forced into spatially delocalized states at the bottom of the conduction band, rather than localized polaronic states). While this unphysical metallic solution (rutile TiO_{2} is an n-type semiconductor) is less stable than the polaronic solution, it simplifies the search for the optimal defect distribution via a two-step process: first, the optimal defect pattern is identified through DFT calculations in which polaron formation is suppressed; subsequently, polarons are introduced at random positions, or the most favorable polaron configuration is determined within the given defect distribution^{27}. While this approach reduces the combinatorial divergence of defect-polaron configurations, it relies on the assumption that the distribution of atomic defects is not affected by the polarons, which is not valid for most materials^{28}.

The experimental measurements do not support such a homogeneous V_{O} pattern. Figure 1b shows a typical image obtained from low-temperature STM measurements on a TiO_{2}(110) surface after a sputtering and annealing treatment that forms a high content of oxygen vacancies (c_{V_{O}} ≃ 14%, close to the critical value of 17%). At this temperature, the oxygen vacancies (imaged as bright spots along the dark [001] O_{br} rows) are immobile and appear in irregular patterns, quite far from any homogeneous distribution. The discrepancy with the simple models discussed above is a strong indication of the role that polarons can have in determining the optimal V_{O} surface structure. Simply adding polarons on a rigid V_{O} pattern (effectively decoupling V_{O} and polarons), as is usually done in standard DFT simulations, would not improve the situation.

Figure 1c reports the surface structure as predicted by our machine learning model, which allows simultaneously varying both V_{O} and polaron positions to find the configuration that minimizes the total energy of the system. The resulting V_{O} distribution is in good qualitative agreement with the inhomogeneous distribution found in experiments. Our methodology, described in detail in the following, is capable of capturing the effects of the polarons on the oxygen vacancy distribution, going beyond the simple picture relying on purely V_{O}–V_{O} interactions. Moreover, it allows us to consider large surface areas of about 250 nm^{2} (>15 × 15 nm^{2}), corresponding to a 54 × 24 supercell, extending considerably the limits of standard DFT simulations.

### Machine learning polaron and defect distributions

The methodology proposed here is structured in three parts. First, we train a feed-forward neural network^{53} to predict the DFT energy of the system depending on the configuration of the impurities. Due to the computational limitations of DFT calculations, we adopt relatively small unit cells in this step. Polaron-polaron/defect interactions in rutile TiO_{2} become negligible beyond ≃ 10 Å^{37}. Here, we used two supercells with different lateral extensions (6 × 4 and 12 × 2, see “Methods”) to include these long-ranged interactions along different crystallographic directions. Second, we use the trained model to search for low-energy configurations that were not included in the original set of data, adopting an active learning scheme^{54}. Finally, we use the actively trained model to obtain large-area predictions. In the following, we describe in detail the architecture of the machine-learning model and compare the ML predictions with experimental data on reduced TiO_{2}(110).

The training of the machine learning model requires a reference database built from many distinct polaron and atomic-defect configurations. Following the process described in detail in the Methods Section, we calculated the free energy for different configurations at the DFT+*U* level using VASP^{55,56,57}, with *U* = 3.9 eV on the *d* orbitals of Ti atoms^{27,58}. Polarons were localized at chosen surface Ti_{S0} (Pol_{S0}) and subsurface Ti_{S1} (Pol_{S1}) sites via occupation matrix control^{59}. We modeled 2367 symmetrically inequivalent polaron-V_{O} configurations in a 6 × 4 supercell (i.e., six and four times the [001] and [\(1\bar{1}0\)] lattice vectors, respectively), and 2155 configurations in a 12 × 2 supercell. To optimize the model, we randomly split the calculated configurations and energies into training and validation data sets, containing 80% and 20% of the configurations, respectively.

Figure 2 sketches the main features of our ML architecture predicting the stability of different defect-polaron configurations. A generic polaron-V_{O} distribution on the rutile TiO_{2} surface is depicted in Fig. 2a. The descriptor representing the configuration is constructed by first discretizing the space into a coarse grid (see Fig. 2b) and then encoding the spatial distribution of polarons and atomic defects. While this representation does not explicitly include the specific structural distortions accompanying the defect configuration, we account for them by training our model on relaxed structures that capture the distortions and their contribution to the energy. This is possible since the distortions accompanying a specific defect configuration are defined by the position of the defects in the lattice (i.e., in a DFT calculation, defining the defect configuration suffices to find the minimum energy structure). The discretized space simplifies the training of the ML model, as compared to using Euclidean distances^{27}. To improve the description of the interaction with polarons/defects at similar crystallographic displacements, we employ a one-hot encoding (i.e., a value of 1 for grid cells containing a defect, 0 otherwise) smeared to effectively encode the defect density. We achieve this smearing via multiple applications of a discrete Laplacian kernel to the one-hot encoding (Fig. 2c), mimicking a diffusion process. Then, to predict the energy of the whole system given a specific configuration, we split the total energy into contributions arising from a single defect/polaron impurity (Fig. 2d):

\[{E}_{{\rm{tot}}}=\sum_{i}{E}_{i}+\sum_{j}{E}_{j}+\sum_{k}{E}_{k}\]

Here, *E*_{tot} is the total energy of a given configuration, and *E*_{i}, *E*_{j}, and *E*_{k} are the virtual contributions of a single Pol_{S1}, Pol_{S0}, and V_{O} respectively. We use a feed-forward neural network to estimate the virtual contribution of a single defect/polaron (Fig. 2e). Finally, we sum over the virtual contributions to obtain the total configuration energy^{27}. The total energy can be computed by DFT calculations^{37}, while the virtual contributions are not directly accessible in the DFT data. Thus, we can train our model using the discretized defect-polaron positions as a descriptor, and the DFT energy as the target quantity. By training the ML model on DFT data obtained for the 6 × 4 unit cell (see Supplementary Fig. 1), we achieved a mean absolute error of 1.8 and 2.2 meV/V_{O} for the training and validation sets, respectively. By adding training data from the 12 × 2 unit cell (see Supplementary Fig. 2), the mean absolute error increased slightly (2.9 and 3.5 meV/V_{O} in training and validation, respectively). However, by using both sets of data in the training, the ML model can account for longer interactions in both the [001] and [\(1\bar{1}0\)] directions. For a detailed description of the training process see “Methods” Section.
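The decomposition of the total energy into per-impurity contributions can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the authors' JAX implementation: the network sizes, the tanh activation, the random (untrained) weights, and the four-component local descriptors are all hypothetical.

```python
import math
import random


def init_mlp(sizes, rng):
    """Random weights for a small fully connected network (illustrative, untrained)."""
    return [([[rng.gauss(0.0, 0.1) for _ in range(n_in)] for _ in range(n_out)],
             [0.0] * n_out)
            for n_in, n_out in zip(sizes[:-1], sizes[1:])]


def mlp(params, x):
    """Forward pass: tanh hidden layers (assumed), linear scalar output."""
    for layer, (weights, biases) in enumerate(params):
        x = [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]
        if layer < len(params) - 1:
            x = [math.tanh(v) for v in x]
    return x[0]


def total_energy(impurities, nets):
    """E_tot as the sum of 'virtual' contributions, one per impurity.
    impurities: list of (kind, local_descriptor) pairs."""
    return sum(mlp(nets[kind], descriptor) for kind, descriptor in impurities)


rng = random.Random(0)
# One network per impurity type, as in the defect-type-specific networks of Fig. 2e
nets = {kind: init_mlp([4, 8, 1], rng) for kind in ("Pol_S1", "Pol_S0", "V_O")}
config = [("V_O", [1.0, 0.2, 0.0, 0.1]),
          ("Pol_S1", [0.3, 0.0, 0.5, 0.0]),
          ("Pol_S0", [0.0, 0.7, 0.1, 0.2])]
e_tot = total_energy(config, nets)
print(e_tot)
```

Because only the summed energy is compared with DFT, the individual contributions are never fitted directly; the sum also makes the prediction independent of the order in which impurities are listed and applicable to cells of any size.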

Aiming for a comparison with the experimental measurements, we focus here on the low-energy configurations, which are more likely to get stabilized in real samples. To identify such stable configurations, we performed simulations that model the annealing process. In the preparation of the experimental samples, both polarons and oxygen vacancies diffuse on the sample during annealing. At lower temperatures, V_{O} on rutile are immobile, while polarons always show a certain degree of mobility, hopping/tunneling a few lattice sites around the equilibrium position^{34,45}. The simulated annealing can be implemented as a global optimization scheme^{60}. Candidate configurations are obtained by perturbing the current configuration, randomly displacing one defect/polaron impurity to any nearest neighbor site. The candidate configuration is either accepted or declined by virtue of the Metropolis-Hastings algorithm^{61} with the acceptance criterion based on the configuration energy—similar in spirit to large-scale defect distribution studies based on reverse MC^{62} (although in our approach the defect distribution is not fitted to minimize the deviation from experiment, but it relies entirely on DFT/ML data). Calculating the energy of the candidate configurations within the DFT framework would make this approach unfeasible, due to the computational cost of DFT calculations and the high number of energy evaluations required for a single optimization. Conversely, the ML model allows us to inspect the stability of an extremely high number of defect-polaron configurations (minimization of the energy requires on the order of 10^{3}−10^{6} energy evaluations depending on the size and initialization of the employed configuration) and enables the exploration of candidate structures.
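The annealing scheme described above can be sketched as a Metropolis loop with nearest-neighbor moves. In this minimal example, `toy_energy` is a hypothetical stand-in for the ML energy (a simple pairwise repulsion on a periodic grid), a single impurity species is used, and the grid size, step count, and seed are arbitrary choices.

```python
import math
import random

K_B = 8.617333262e-5  # Boltzmann constant in eV/K


def toy_energy(config, shape):
    """Hypothetical stand-in for the ML energy: pairwise repulsion (eV) on a periodic grid."""
    nx, ny = shape
    pts = sorted(config)
    energy = 0.0
    for a in range(len(pts)):
        for b in range(a + 1, len(pts)):
            dx = abs(pts[a][0] - pts[b][0])
            dy = abs(pts[a][1] - pts[b][1])
            energy += 0.1 / math.hypot(min(dx, nx - dx), min(dy, ny - dy))
    return energy


def anneal(config, shape, energy, steps, temperature, rng):
    """Metropolis random walk: move one impurity to a nearest-neighbor site per step."""
    nx, ny = shape
    current = set(config)
    e_cur = energy(current, shape)
    best, e_best = set(current), e_cur
    for _ in range(steps):
        x, y = rng.choice(sorted(current))
        dx, dy = rng.choice([(1, 0), (-1, 0), (0, 1), (0, -1)])
        target = ((x + dx) % nx, (y + dy) % ny)
        if target in current:
            continue  # target site already occupied: skip this move
        candidate = (current - {(x, y)}) | {target}
        d_e = energy(candidate, shape) - e_cur
        # Always accept downhill moves; accept uphill moves with Boltzmann probability
        if d_e <= 0 or rng.random() < math.exp(-d_e / (K_B * temperature)):
            current, e_cur = candidate, e_cur + d_e
            if e_cur < e_best:
                best, e_best = set(current), e_cur
    return best, e_best


rng = random.Random(1)
start = {(0, 0), (1, 0), (2, 0), (3, 0)}  # four clustered impurities on a 6 x 4 grid
best, e_best = anneal(start, (6, 4), toy_energy, steps=2000, temperature=1000.0, rng=rng)
print(sorted(best), round(e_best, 4))
```

Each step requires one energy evaluation, which is why a cheap surrogate for the DFT energy is needed for the 10^{3}−10^{6} evaluations mentioned in the text.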

We iterated annealing simulations following an active learning procedure. The initial DFT data set built by random configurations was progressively augmented by including the results from the annealing optimization (see Supplementary Fig. 4). Consequently, we obtained a final ML model refined to account for a broader range of configurations. The refined model is finally used to obtain large-area predictions (7.1 × 10.5 nm^{2} to collect statistics and 16 × 16 nm^{2} for visualizations) on the defect-polaron distributions, using again the simulated annealing approach. The qualitative agreement with the experimental data is shown in Fig. 1c for the 54 × 24 rutile TiO_{2}(110) supercell. In the following, we quantitatively analyze our results.

### Formation of V_{O}-polaron patterns and their mutual interaction

The analysis of the low-energy configurations (see Supplementary Fig. 5 for the energy distribution of all possible V_{O} configurations in the 6 × 4 cell) is summarized in Fig. 3. Figure 3a shows the improvement of the energies of the TiO_{2}(110) 6 × 4 cell obtained by treating V_{O}-polaron coupling at three different levels: (i) suppressing polarons (“No Polarons”); (ii) distributing polarons in random positions or in positions guided by physical intuition (“Random Polarons”); (iii) fully including the polaron–V_{O} interaction via our proposed ML protocol (“ML Polarons”). When polaron formation is suppressed, the ground state configuration is given by the vacancies being homogeneously distributed on the surface (C_{NoP}^{Hom} configuration, see “No Polarons” column in Fig. 3a). The “Random Polarons” column of Fig. 3a shows instead the energy of the system obtained by including polarons in random positions, enriched by adding specific, low-energy polaronic configurations that were suggested in previous studies^{27,34,37}. Here, the ground state configuration is given by a homogeneous distribution of Pol_{S1} in the homogeneous V_{O} pattern (labeled as C_{RandP}^{Hom} in Fig. 3a).

Treating polaron-V_{O} coupling at the ML level (third approach) results in V_{O} distributions with lower energy, indicating an alternative ground state for the system, where the homogeneous configuration is no longer the most stable one, as shown in the “ML Polarons” column of Fig. 3a. First, we note that the ML model identified a different order of Pol_{S1} showing better stability in the homogeneous V_{O} background (labeled as C_{ML}^{Hom}, see also Supplementary Fig. 6). Moreover, polaron configurations explored by the extensive ML search improve the stability of many other V_{O} patterns (see the energy levels in black in the “ML Polarons” column of Fig. 3a, lower than in the “Random Polarons” column). Importantly, two of these previously unexplored polaron configurations (labeled as C_{ML}^{0} and C_{ML}^{2}) resulted in energy values even lower than the homogeneous distribution, revealing an alternative ground state for the system. In addition, alternative V_{O} patterns were proposed by the ML search as low-energy configurations. One in particular (red line in Fig. 3a) is ranked as the second most stable configuration (C_{ML}^{1}). The polarons play a key role in stabilizing this V_{O} pattern: as further proof, we calculated the energy of this V_{O} pattern while artificially suppressing polaron formation and obtained a much lower stability (red line in the “No Polarons” column).

Interestingly, in all the low-energy configurations obtained from the ML-driven search (except for C_{ML}^{Hom}), we note the presence of at least one polaron on a surface Ti_{S0} site (configurations containing Pol_{S0} are highlighted in orange in Fig. 3a). Figure 3b and c compare the spatial distribution of the surface Pol_{S0} and subsurface Pol_{S1}. The formation of the surface polaron is particularly stable when occurring in the central Ti_{S0} site between two oxygen vacancies aligned along the [\(1\bar{1}2\)] direction (see top view in Fig. 3b). This [\(1\bar{1}2\)]-aligned V_{O}-polaron complex is indeed part of the ground state configuration obtained by our ML search (e.g., it is present in C_{ML}^{0,1,2}). Another remarkably stable complex is given by two vacancies aligned along the [\(1\bar{1}0\)] direction and one Pol_{S0} in their vicinity (see Supplementary Fig. 6). For instance, this complex appears in the configurations highlighted in purple in Fig. 3a (C_{ML}^{10,11}). The [\(1\bar{1}2\)] and [\(1\bar{1}0\)] alignments found in the ML search agree well with the experimental SPM measurements (compare Fig. 1b and c), which show a high coverage of such high-density V_{O} regions. In contrast, DFT predictions that neglect the polaron-V_{O} interaction or randomly distribute polarons favor homogeneous configurations.

### Comparison of a large-scale model and the experimental surface

Figure 4 shows our results obtained by ML-driven annealing simulations on large-area 24 × 16 cells (corresponding to 7 × 10 nm^{2}), which enable a direct comparison with the experiment. Visual inspection (Fig. 1b vs. c) already indicates that our ML treatment provides a V_{O} distribution that closely resembles the experimental one. We quantify this agreement by calculating autocorrelation functions (ACFs)^{63} for simulated annealing runs under different computational conditions and comparing them to the experimental ACF of the V_{O} distribution extracted from Fig. 1b (for details see Supplementary Fig. 8). The simulated annealing procedure starts from random V_{O}–polaron configurations, from which we obtain several large-area models (such as the one in Fig. 1c), all showing very similar characteristics. To complete our comparison, we also use the ML model to anneal a system where polaron formation is suppressed. This scheme, similar to the non-polaronic DFT approach of Fig. 1a, assumes a homogeneous V_{O} pattern but takes into account annealing-induced disorder effects.

The ACFs are shown in Fig. 4, where projections of V_{O} defect populations along the same and adjacent [001] rows are shown in the histograms in panels a and b, respectively. For oxygen vacancies lying on the same row, both the ML model and the experiments show that short V_{O}–V_{O} distances of 1 and 2 lattice sites are unlikely. The highest probability lies at a distance of 4 or 5 lattice sites for both the experimental and ML annealing including polaron–V_{O} interactions (see Fig. 4a and b red and blue data, respectively). By considering only the V_{O}–V_{O} repulsion as driving force (i.e., excluding polaron formation in the ML annealing procedure; see ML No Pol in Fig. 4a and b) and applying an identical annealing protocol as in the polaron-V_{O} interaction case, we find the probability maximum lying at a 6-site distance for in-row and 3-site distance in the adjacent row. This is further evidence for polarons’ role in stabilizing the V_{O} arrangement.
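The in-row and adjacent-row projections discussed above can be illustrated by counting V_{O}–V_{O} separations along [001]. The sketch below is a simple unnormalized pair count with periodic boundaries, which is an assumption; the normalization used for the ACFs in Fig. 4 is not specified here, and the function name and input format are hypothetical.

```python
from collections import Counter


def pair_distance_histogram(vacancies, n_001, n_110, row_offset):
    """Count ordered V_O pairs whose rows differ by `row_offset`, binned by their
    separation (in lattice sites) along [001], using periodic boundaries."""
    hist = Counter()
    for x1, r1 in vacancies:
        for x2, r2 in vacancies:
            if (x1, r1) == (x2, r2):
                continue  # skip self-pairs
            if (r2 - r1) % n_110 != row_offset % n_110:
                continue  # keep only the requested row separation
            d = abs(x2 - x1)
            hist[min(d, n_001 - d)] += 1
    return hist


# Homogeneous reference pattern on a 12 x 4 grid: 6-site spacing, rows shifted by 3
homogeneous = {(0, 0), (6, 0), (3, 1), (9, 1), (0, 2), (6, 2), (3, 3), (9, 3)}
same_row = pair_distance_histogram(homogeneous, 12, 4, row_offset=0)
adjacent_row = pair_distance_histogram(homogeneous, 12, 4, row_offset=1)
print(dict(same_row), dict(adjacent_row))
```

For the homogeneous pattern, all in-row pairs fall at a distance of 6 sites and all adjacent-row pairs at 3 sites, which is the signature reported for the annealing without polarons; an inhomogeneous pattern spreads these counts over shorter distances.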

As a result, the rutile TiO_{2} surface shows some areas with a locally low density of oxygen vacancies (down to 0%), alternating with highly dense areas (up to 20%, which is compatible with the 4-site-distance distribution). Our data suggest that the great stability of the [\(1\bar{1}2\)]- and [\(1\bar{1}0\)]-aligned V_{O}–polaron complexes contributes to this alternation of locally less and more reduced areas at a given c_{V_{O}}. To further corroborate this result, we performed additional DFT calculations modeling this strong inhomogeneity (see Supplementary Fig. 6 with configurations C_{ML}^{26-29}). We also note that this analysis reconciles the DFT prediction of the critical concentration at which the (1 × 2) surface reconstruction occurs, calculated as ~20%, with the experiments, which report an average concentration of 17%^{34}: locally dense areas can reach the calculated threshold while the surface average remains lower.

## Conclusions

In summary, we directly elucidated the impact of polarons on the structure of oxide surfaces, using the prototypical rutile TiO_{2}(110) surface as an example. Specifically, we designed a computational workflow to predict the distribution of polarons and oxygen vacancies on rutile TiO_{2}(110) by performing machine-learning-guided DFT calculations. While the defect configuration energy could in principle also be modeled by a linear model such as cluster expansion, non-linear models (such as neural networks) might turn out preferable due to the complexity of the interactions at play in oxide materials^{43}. MC-driven annealing simulations based on the ML data enabled the exploration of defect distributions on scales much larger than standard DFT allows. An analysis of the experimental SPM images yielded a direct validation of the theoretical predictions. While conventional approximations used in traditional DFT calculations result in homogeneous solutions, we were able to retrieve the inhomogeneity of the V_{O} distribution as detected by the experiments. Our analysis clarifies the peculiar inhomogeneous distribution of V_{O} on rutile TiO_{2}(110). Most importantly, the system shows a tendency towards the formation of high-density V_{O} patterns, alternating with low-density V_{O} regions. While larger defect-free areas are typically attributed to subsurface Ar impurities^{64}, the fluctuation of the local c_{V_{O}} observed here can partially be attributed to the interaction of polarons and V_{O}s.

These results suggest that surface reactivity could be optimized by tuning the annealing procedure to facilitate the formation of energetically more favorable, high-density V_{O} patterns, which promote surface-localized charges and their interaction with adsorbates^{26}. To elucidate the role of the surface polaron, further experiments are necessary. Resonant photoelectron diffraction does not rule out the formation of surface-localized charge carriers, even at low c_{V_{O}}^{65}. SPM measurements in the presence of CO adsorbates confirm the formation of the [\(1\bar{1}2\)]-aligned V_{O}–polaron complex^{26}, while STM measurements probing the filled states on the clean surface do show some disparity in comparison to simulated STM^{37}. The reasons for this discrepancy are manifold, ranging from temperature-induced effects^{66} to the electric field of the tip.

Furthermore, we expect our methodology to be applicable to any other polaronic system, even including multiple defects as sources of polarons, such as the perovskite SrTiO_{3}(001) surface^{67} exhibiting Sr adatom/vacancy and often doped by Nb atoms^{29}. Moreover, this methodology could be used to study the spatial distribution of defects (e.g., subsurface, bulk) that are not directly accessible by the experiments, such as interstitial titanium in rutile. Additionally, the stochastic optimization model could be further improved by considering realistic anisotropic diffusion probabilities along certain directions. This could be achieved by explicitly computing hopping and diffusion barriers, and incorporating these barriers into the annealing simulations.

## Methods

### DFT modeling

We performed DFT+*U* calculations using VASP^{55,56,57} on the rutile TiO_{2}(110) surface. We used standard projector augmented wave pseudopotentials for Ti (treating *d*- and *s*-orbitals as valence) and soft O pseudopotentials. We adopted a Hubbard *U* = 3.9 eV on the *d* orbitals of Ti atoms^{27,58}. The sampling of the reciprocal space included the Γ-point and the plane-wave energy cutoff was set to 400 eV.

The surfaces were modeled using five-layer-thick slabs (where the two bottom stoichiometric layers were fixed at their bulk positions) with lateral supercell sizes of 6 × 4 and 12 × 2. To partially account for the role of thermal effects in the stabilization of the V_{O} patterns during the annealing treatment in the experiments, we used an expanded [001] lattice vector. Specifically, the low-T lattice constant of 2.953 Å^{44} was expanded to 2.968 Å (high T, corresponding to 500–600 K) in accordance with thermal expansion coefficient measurements^{68,69}. This strain of +0.5% is well below the crossover point of +3%, where surface polaron formation is favored over subsurface polaron formation^{66}.

Within the supercells, we removed 4 surface bridging oxygen atoms (in random positions) from every slab, obtaining a c_{V_{O}} of approximately 17%. To assess non-polaronic solutions, we performed spin-unpolarized DFT calculations, constraining the excess electrons to spatially delocalized states at the bottom of the conduction band. To model the polaronic structure, we followed a three-step procedure. Initially, we removed bridging oxygen atoms from a pristine structure to generate a specific oxygen vacancy pattern. This structure was relaxed while all excess charge carriers were kept delocalized by employing a spin-unpolarized relaxation. After retrieving the structural properties of the oxygen vacancy configuration, we introduced polaronic distortions at selected sites via occupation matrix control^{59}, using distinct occupation matrices for Pol_{S1} and Pol_{S0} sites^{27}. Finally, we performed an unconstrained relaxation starting from the structures and wave functions determined in the previous step. As a starting point for building our dataset, we chose 43 symmetrically inequivalent V_{O} configurations and pre-relaxed them at the delocalized level. Afterward, we added polarons at random positions to these relaxed structures, while keeping the polaron densities in each layer within the ranges suggested by our previous works^{27,37} (i.e., most configurations have fewer than 4 surface polarons). After the initial dataset was constructed, we systematically extended it via simulated annealing with our ML model. With this active learning procedure, we extended our dataset by roughly 200 configurations.

### ML model training and optimization of defect configurations

The machine learning model is implemented in the framework of JAX^{70}. Here, we describe the model optimization based on the study of configurations in the 6 × 4 supercell. Our ML model consists of a featurisation (see Fig. 2a–d) and defect-type-specific feed-forward neural networks (see Fig. 2e). The featurisation consists of a one-hot encoding on a grid, specifying the defect configuration, followed by smearing via multiple applications of a discrete Laplacian kernel:

\[{u}_{i,j}^{n+1}={u}_{i,j}^{n}+D\left({u}_{i+1,j}^{n}+{u}_{i-1,j}^{n}+{u}_{i,j+1}^{n}+{u}_{i,j-1}^{n}-4{u}_{i,j}^{n}\right)\]

Here \({u}_{i,j}^{n}\) represents the defect density encoding for the defect *u* ∈ {V_{O}, Pol_{S0}, Pol_{S1}} at grid position *i*, *j* after iteration *n*. *D* is treated as a hyperparameter and was set to 0.1. The Laplacian kernel was applied for 8 iterations. The local environment range was set to include 6 sites in the [001] direction and 4 sites in the \([1\bar{1}0]\) direction, as this choice provided the best results in the hyperparameter optimization (see Supplementary Fig. 3). Each defect-specific feed-forward neural network consisted of 3 layers of width 94, 32, and 10, respectively. We optimized the machine learning model using stochastic gradient descent and backpropagation on an augmented dataset that includes all symmetrically equivalent representations of the training defect configurations. We randomly split this dataset into 80% training data and 20% validation data and optimized the model parameters by minimizing the mean squared error of the energy prediction on the training data. Before training, the energies and the descriptors were rescaled to [0, 1] by min-max scaling. Using an early stopping mechanism, the best model was selected based on the lowest validation error within the optimization procedure. The mean squared error during training as well as a scatter plot of DFT and ML energies are displayed in Supplementary Figs. 1 and 2.
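The smearing step can be sketched as follows, using *D* = 0.1 and 8 iterations from the text. The standard five-point stencil and the periodic boundaries are assumptions of this sketch, and the function name and nested-list grid are hypothetical; the actual implementation in JAX may differ.

```python
def smear(grid, d=0.1, iterations=8):
    """Repeatedly apply u <- u + d * (sum of 4 neighbors - 4 u) with periodic
    boundaries (assumed five-point stencil), mimicking a diffusion process."""
    n_rows, n_cols = len(grid), len(grid[0])
    u = [row[:] for row in grid]  # copy so the input stays untouched
    for _ in range(iterations):
        # The comprehension reads the old `u` everywhere before it is rebound
        u = [[u[i][j] + d * (u[(i + 1) % n_rows][j] + u[(i - 1) % n_rows][j]
                             + u[i][(j + 1) % n_cols] + u[i][(j - 1) % n_cols]
                             - 4.0 * u[i][j])
              for j in range(n_cols)]
             for i in range(n_rows)]
    return u


# One-hot encoding of a single defect on a 4 x 6 grid (rows: [1-10], columns: [001])
one_hot = [[0.0] * 6 for _ in range(4)]
one_hot[1][2] = 1.0
smeared = smear(one_hot)
for row in smeared:
    print(" ".join(f"{v:.3f}" for v in row))
```

With periodic boundaries the total density is conserved, so the smeared map still integrates to the number of defects while nearby grid cells receive similar values, which is the stated purpose of the smearing.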

To ensure sufficient accuracy when using the model for exploration, we applied an active learning procedure as depicted in Supplementary Fig. 4. Here, we performed an iterative training-testing loop to further improve the reliability, data efficiency, and scope of the proposed model. Since our main interest lies in the determination of low-energy polaron-defect complexes, we used the model to optimize defect configurations in three cases. First, we searched for global optima by allowing all defects to diffuse during the optimization. Second, we added local minima at fixed polaron layer densities by restricting polaron movement to intra-layer hopping. Third, we explored local minima in which the V_{O} configuration was fixed and only the polarons were relaxed. Within these three exploration cases, we extracted the most stable configurations and confirmed them by performing comparative DFT calculations of the proposed polaron configurations.
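The training-testing loop described above can be sketched generically as follows. This is only an illustration of the control flow under stated assumptions: `train`, `propose`, and `label` are hypothetical stand-ins for model fitting, ML-driven exploration, and the DFT reference calculation, and the toy functions below have no physical meaning.

```python
def active_learning(dataset, train, propose, label, rounds=5):
    """Iterate train -> explore -> label -> extend the dataset."""
    dataset = dict(dataset)  # configuration -> reference energy
    for _ in range(rounds):
        model = train(dataset)
        # Keep only configurations that have not been labelled yet.
        candidates = [c for c in propose(model) if c not in dataset]
        if not candidates:
            break  # exploration no longer finds unseen configurations
        for c in candidates:
            dataset[c] = label(c)  # stand-in for a DFT calculation
    return dataset


# Toy stand-ins (all hypothetical): configurations are the integers 0..9.
def toy_label(c):
    return (c - 7) ** 2  # "reference" energy with its minimum at c = 7


def toy_train(dataset):
    # Nearest-neighbour surrogate: predict the energy of the closest known point.
    def model(c):
        nearest = min(dataset, key=lambda k: abs(k - c))
        return dataset[nearest]
    return model


def toy_propose(model):
    # Stand-in for the optimizer: the two configurations predicted lowest.
    return sorted(range(10), key=model)[:2]


initial = {0: toy_label(0), 9: toy_label(9)}
final = active_learning(initial, toy_train, toy_propose, toy_label)
```

The loop stops either after a fixed number of rounds or once exploration proposes nothing new, which is one simple stopping rule among several possible ones.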

The optimization of configurations is performed via simulated annealing^{60}, where we use the predicted energy of our ML model to determine the Metropolis acceptance probability *P*_{accept}.

Here *E*(**x**) is the ML-predicted energy associated with configuration **x**. Configuration \({{{{\bf{x}}}}}^{{\prime} }\) is generated from configuration **x** in a random walk, by randomly choosing one defect in configuration **x** and randomly displacing it to an adjacent site. The temperature variable *T* in the Metropolis criterion was set to 1000 K (similar to the annealing temperatures in the sample preparation). Even though the diffusion processes of the respective defects during the optimization are physically motivated, they do not necessarily represent the physical process by which the observed defect patterns form. Defect transport mechanisms such as inter-row hopping of oxygen vacancies have not been reported^{71}, but including them may improve optimization efficiency or help to overcome energy barriers. Differences between polaron and vacancy hopping rates were also ignored, which potentially affects the final outcome of the optimization. The specific temperature or temperature ramp employed in the simulated annealing was observed to affect the outcome in a similar way.
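The optimization loop can be illustrated with a short sketch. It assumes the standard Metropolis form \(P_{\mathrm{accept}} = \min\{1, \exp[-(E({\bf{x}}^{\prime}) - E({\bf{x}}))/(k_{\mathrm{B}}T)]\}\), a single defect species on a periodic 6 × 4 grid, and a fixed temperature. The energy function `toy_energy` is a hypothetical stand-in for the ML model, and all function names are illustrative rather than taken from the authors' code.

```python
import math
import random

K_B = 8.617333262e-5  # Boltzmann constant in eV/K
N_ROWS, N_COLS = 6, 4  # assumed grid size for this sketch


def metropolis_accept_probability(e_old, e_new, temperature):
    """Standard Metropolis rule: accept downhill moves, else Boltzmann factor."""
    delta = e_new - e_old
    if delta <= 0.0:
        return 1.0
    return math.exp(-delta / (K_B * temperature))


def propose_move(config, rng):
    """Pick one defect at random and displace it to a random adjacent site."""
    idx = rng.randrange(len(config))
    i, j = config[idx]
    di, dj = rng.choice([(1, 0), (-1, 0), (0, 1), (0, -1)])
    target = ((i + di) % N_ROWS, (j + dj) % N_COLS)
    if target in set(config):
        return config  # blocked move: keep the configuration unchanged
    return config[:idx] + [target] + config[idx + 1:]


def toy_energy(config):
    """Hypothetical energy: 0.2 eV penalty per nearest-neighbour defect pair."""
    pairs = 0
    for a in range(len(config)):
        for b in range(a + 1, len(config)):
            di = abs(config[a][0] - config[b][0])
            dj = abs(config[a][1] - config[b][1])
            di, dj = min(di, N_ROWS - di), min(dj, N_COLS - dj)
            if di + dj == 1:
                pairs += 1
    return 0.2 * pairs


def anneal(config, energy, temperature=1000.0, steps=2000, seed=0):
    """Random-walk Metropolis search that tracks the lowest energy visited."""
    rng = random.Random(seed)
    current, e_current = list(config), energy(config)
    best, e_best = current, e_current
    for _ in range(steps):
        trial = propose_move(current, rng)
        e_trial = energy(trial)
        p = metropolis_accept_probability(e_current, e_trial, temperature)
        if rng.random() < p:
            current, e_current = trial, e_trial
            if e_current < e_best:
                best, e_best = current, e_current
    return best, e_best


start = [(0, 0), (0, 1), (1, 0), (1, 1)]  # four clustered defects
best, e_best = anneal(start, toy_energy)
```

Restricting which defects `propose_move` may select, or which target sites it may choose, is one way the three exploration cases (all defects mobile, intra-layer polaron hopping only, fixed vacancies) could be expressed in such a loop.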

### Experimental setup

SPM was performed using STM in an ultrahigh vacuum (UHV) chamber with a base pressure below 2 × 10^{−11} mbar; the whole chamber, equipped with an Omicron qPlus low-temperature head, was suspended using 36 bungee cords for efficient vibration damping^{72}. Stiff qPlus sensors^{73} (*k* = 1800 N⋅m^{−1}, *Q* = 5000–30,000, *f*_{0} = 25–45 kHz) with a sharp W tip^{74} were used to collect the tunneling current (*I*_{t}) and the frequency shift (Δ*f*) signals; deflection detection was achieved using a cryogenic preamplifier in vacuum^{75}. W tips were treated on a Cu(110) surface, which decorated the apex with a sharp, conductive Cu pyramid, and were subsequently used for imaging the rutile TiO_{2}(110) surface. Tip sharpness was indicated by the small frequency shifts (Δ*f* between −1 and 0 Hz) recorded during STM imaging of a Cu(110) test sample.

Sample preparation was performed in a separate UHV chamber (connected to the measurement chamber via a gate valve for in-situ transfer) with a base pressure below 1 × 10^{−10} mbar. Surfaces were cleaned by cycles of sputtering and UHV annealing that consequently reduced the samples and introduced V_{O}s to the surface. A typical cleaning cycle consisted of sputtering with 1.5 keV Ar^{+} ions for 10 min with an ion current of 1 μA ⋅ cm^{−2}, and subsequently annealing the sputtered surfaces in UHV up to 700 °C. Before each measurement, 3–5 cleaning cycles were performed. The over-reduction of the surface was occasionally remedied by annealing the sample to 750 °C in 5 × 10^{−7} mbar of O_{2} shower for 10 min. When the reduction level was too high, the rutile TiO_{2} samples were re-oxidized ex situ at 800 °C in O_{2} flow and reintroduced to UHV for cleaning.

Figure 1b displays the *z*-channel of a feedback-controlled unoccupied-states STM image taken at a sample temperature of 14 K. Imaging parameters: sample bias *V*_{S} = +0.9 V, grounded tip, tunneling current set-point *I*_{t} = 20 pA, oscillation amplitude *A* = 500 pm.

The contrast in Fig. 1b corresponds to a typical unoccupied-states STM imaging contrast over a reduced rutile TiO_{2}(110) surface, which is dominated by electronic rather than geometric considerations: 1 eV above the Fermi level, the conduction band consists of Ti 3d states and defect V_{O} states, while the O states constitute the valence band^{76}. Therefore, the probability of electron tunneling from the tip to the surface is highest above the Ti_{5c} rows and the V_{O}s, which appear bright under these STM conditions. In contrast, tunneling is less likely above the O_{br} rows, which appear dark even though they geometrically protrude highest from the surface. Note that in Fig. 1b individual Ti_{5c} atoms can be recognized as spheres forming a row along the [001] direction, while V_{O}s are recognized as isolated, bright spheres.

## Data availability

The data presented in this article are available on Zenodo^{77} or from the corresponding author upon request.

## Code availability

A minimal example of the code used to produce the presented results is available from GitHub or Zenodo^{77}.

## References

1. Rousseau, R., Glezakou, V.-A. & Selloni, A. Theoretical insights into the surface physics and chemistry of redox-active oxides. *Nat. Rev. Mater.* **5**, 460–475 (2020).
2. Franceschi, G. & Diebold, U. *Oxide Surfaces* (Elsevier, 2023). https://doi.org/10.1016/B978-0-12-819728-8.00059-0.
3. Jupille, J. & Thornton, G. *Defects at Oxide Surfaces*, *Springer Series in Surface Sciences* (Springer International Publishing, 2015). https://doi.org/10.1007/978-3-319-14367-5.
4. Strand, J. & Shluger, A. L. On the structure of oxygen-deficient amorphous oxide films. *Adv. Sci.* **11**, 2306243 (2023).
5. Franchini, C., Reticcioli, M., Setvin, M. & Diebold, U. Polarons in materials. *Nat. Rev. Mater.* **6**, 560–586 (2021).
6. Emin, D. *Polarons* (Cambridge University Press, 2013). http://www.cambridge.org/9780521519069.
7. Alexandrov, A. S. & Devreese, J. T. *Advances in Polaron Physics*, *Springer Series in Solid-State Sciences* (Springer International Publishing, 2010). http://link.springer.com/10.1007/978-3-642-01896-1.
8. Stoneham, A. M. et al. Trapping, self-trapping and the polaron family. *J. Phys. Condens. Matter* **19**, 255208 (2007).
9. Pastor, E. et al. Electronic defects in metal oxide photocatalysts. *Nat. Rev. Mater.* **7**, 503–521 (2022).
10. Kick, M., Grosu, C., Schuderer, M., Scheurer, C. & Oberhofer, H. Mobile small polarons qualitatively explain conductivity in lithium titanium oxide battery electrodes. *J. Phys. Chem. Lett.* **11**, 2535–2540 (2020).
11. Chen, M. et al. The impacts of dopants on the small polaron mobility and conductivity in hematite - the role of disorder. *Nanoscale* **15**, 1619–1628 (2023).
12. Smart, T. J. & Ping, Y. Effect of defects on the small polaron formation and transport properties of hematite from first-principles calculations. *J. Phys. Condens. Matter* **29**, 394006 (2017).
13. Cheng, C., Zhu, Y., Fang, W.-H., Long, R. & Prezhdo, O. V. CO adsorbate promotes polaron photoactivity on the reduced rutile TiO_{2}(110) surface. *JACS Au* **2**, 234–245 (2022).
14. Cheng, C., Zhu, Y., Zhou, Z., Long, R. & Fang, W.-H. Photoinduced small electron polarons generation and recombination in hematite. *npj Comput. Mater.* **8**, 1–8 (2022).
15. Sokolović, I. et al. Resolving the adsorption of molecular O_{2} on the rutile TiO_{2}(110) surface by noncontact atomic force microscopy. *Proc. Natl Acad. Sci. USA* **117**, 14827–14837 (2020).
16. Tanner, A. J. et al. Polaron-adsorbate coupling at the TiO_{2}(110)-carboxylate interface. *J. Phys. Chem. Lett.* **12**, 3571–3576 (2021).
17. Yim, C. M. et al. Visualization of water-induced surface segregation of polarons on rutile TiO_{2}(110). *J. Phys. Chem. Lett.* **9**, 4865–4871 (2018).
18. Cheng, C., Zhou, Z. & Long, R. Time-domain view of polaron dynamics in metal oxide photocatalysts. *J. Phys. Chem. Lett.* **14**, 10988–10998 (2023).
19. Ren, Z., Shi, Z., Feng, H., Xu, Z. & Hao, W. Recent progresses of polarons: fundamentals and roles in photocatalysis and photoelectrocatalysis. *Adv. Sci.* 2305139 https://onlinelibrary.wiley.com/doi/10.1002/advs.202305139 (2023).
20. Dohnálek, Z., Lyubinetsky, I. & Rousseau, R. Thermally-driven processes on rutile TiO_{2}(110)-(1 × 1): a direct view at the atomic scale. *Prog. Surf. Sci.* **85**, 161–205 (2010).
21. Tanner, A. J. & Thornton, G. TiO_{2} polarons in the time domain: implications for photocatalysis. *J. Phys. Chem. Lett.* **13**, 559–566 (2022).
22. Sombut, P. et al. Role of polarons in single-atom catalysts: case study of Me1 [Au1, Pt1, and Rh1] on TiO_{2}(110). *Top. Catal.* **2**, 1–16 (2022).
23. Geiger, J. & López, N. Coupling metal and support redox terms in single-atom catalysts. *J. Phys. Chem. C* **126**, 13698–13704 (2022).
24. Geiger, J., Sabadell-Rendón, A., Daelman, N. & López, N. Data-driven models for ground and excited states for single atoms on ceria. *npj Comput. Mater.* **8**, 1–8 (2022).
25. Cao, Y. et al. Scenarios of polaron-involved molecular adsorption on reduced TiO_{2}(110) surfaces. *Sci. Rep.* **7**, 6148 (2017).
26. Reticcioli, M. et al. Interplay between adsorbates and polarons: CO on rutile TiO_{2}(110). *Phys. Rev. Lett.* **122**, 016805 (2019).
27. Birschitzky, V. C., Ellinger, F., Diebold, U., Reticcioli, M. & Franchini, C. Machine learning for exploring small polaron configurational space. *npj Comput. Mater.* **8**, 1–9 (2022).
28. Zhang, D., Han, Z. K., Murgida, G. E., Ganduglia-Pirovano, M. V. & Gao, Y. Oxygen-vacancy dynamics and entanglement with polaron hopping at the reduced CeO_{2}(111) surface. *Phys. Rev. Lett.* **122**, 096101 (2019).
29. Ellinger, F., Shafiq, M., Ahmad, I., Reticcioli, M. & Franchini, C. Small polaron formation on the Nb-doped SrTiO_{3}(001) surface. *Phys. Rev. Mater.* **7**, 064602 (2023).
30. Österbacka, N., Ambrosio, F. & Wiktor, J. Charge localization in defective BiVO_{4}. *J. Phys. Chem. C* **126**, 2960–2970 (2022).
31. Sun, L., Huang, X., Wang, L. & Janotti, A. Disentangling the role of small polarons and oxygen vacancies in CeO_{2}. *Phys. Rev. B* **95**, 245101 (2017).
32. Reticcioli, M., Diebold, U., Kresse, G. & Franchini, C. *Handbook of Materials Modeling* (Springer International Publishing, 2019). http://link.springer.com/10.1007/978-3-319-50257-1_52-1, https://link.springer.com/10.1007/978-3-319-50257-1.
33. Pham, T. D. & Deskins, N. A. Efficient method for modeling polarons using electronic structure methods. *J. Chem. Theory Comput.* **16**, 5264–5278 (2020).
34. Reticcioli, M. et al. Polaron-driven surface reconstructions. *Phys. Rev. X* **7**, 031053 (2017).
35. Kowalski, P. M., Camellone, M. F., Nair, N. N., Meyer, B. & Marx, D. Charge localization dynamics induced by oxygen vacancies on the TiO_{2}(110) surface. *Phys. Rev. Lett.* **105**, 146405 (2010).
36. Han, Z. K., Yang, Y. Z., Zhu, B., Ganduglia-Pirovano, M. V. & Gao, Y. Unraveling the oxygen vacancy structures at the reduced CeO_{2}(111) surface. *Phys. Rev. Mater.* **2**, 035802 (2018).
37. Reticcioli, M., Setvin, M., Schmid, M., Diebold, U. & Franchini, C. Formation and dynamics of small polarons on the rutile TiO_{2}(110) surface. *Phys. Rev. B* **98**, 045306 (2018).
38. Sanchez, J., Ducastelle, F. & Gratias, D. Generalized cluster description of multicomponent systems. *Phys. A: Stat. Mech. Appl.* **128**, 334–350 (1984).
39. Cao, L., Li, C. & Mueller, T. The use of cluster expansions to predict the structures and properties of surfaces and nanostructured materials. *J. Chem. Inf. Model.* **58**, 2401–2413 (2018).
40. Barroso-Luque, L. et al. Cluster expansions of multicomponent ionic materials: formalism and methodology. *Phys. Rev. B* **106**, 144202 (2022).
41. Yang, J. H., Chen, T., Barroso-Luque, L., Jadidi, Z. & Ceder, G. Approaches for handling high-dimensional cluster expansions of ionic systems. *npj Comput. Mater.* **8**, 1–11 (2022).
42. Wang, K., Cheng, D. & Zhou, B.-C. Generalization of the mixed-space cluster expansion method for arbitrary lattices. *npj Comput. Mater.* **9**, 1–11 (2023).
43. Kasamatsu, S. et al. Facilitating ab initio configurational sampling of multicomponent solids using an on-lattice neural network model and active learning. *J. Chem. Phys.* **157**, 104114 (2022).
44. Diebold, U. The surface science of titanium dioxide. *Surf. Sci. Rep.* **48**, 53–229 (2003).
45. Setvin, M. et al. Direct view at excess electrons in TiO_{2} rutile and anatase. *Phys. Rev. Lett.* **113**, 086402 (2014).
46. Onishi, H. & Iwasawa, Y. Reconstruction of TiO_{2}(110) surface: STM study with atomic-scale resolution. *Surf. Sci.* **313**, L783–L789 (1994).
47. Li, M. et al. Oxygen-induced restructuring of the TiO_{2}(110) surface: a comprehensive study. *Surf. Sci.* **437**, 173–190 (1999).
48. Li, M., Hebenstreit, W. & Diebold, U. Morphology change of oxygen-restructured surfaces by UHV annealing: formation of a low-temperature structure. *Phys. Rev. B* **61**, 4926–4933 (2000).
49. McCarty, K. F. & Bartelt, N. C. The 1 × 1/1 × 2 phase transition of the TiO_{2}(110) surface-variation of transition temperature with crystal composition. *Surf. Sci.* **527**, L203–L212 (2003).
50. Wang, Q., Oganov, A. R., Zhu, Q. & Zhou, X. F. New reconstructions of the (110) surface of rutile TiO_{2} predicted by an evolutionary method. *Phys. Rev. Lett.* **113**, 266101 (2014).
51. Mochizuki, I. et al. Structure determination of the rutile-TiO_{2}(110)-(1 × 2) surface using total-reflection high-energy positron diffraction (TRHEPD). *Phys. Chem. Chem. Phys.* **18**, 7085–7092 (2016).
52. Deskins, N. A., Rousseau, R. & Dupuis, M. Localized electronic states from surface hydroxyls and polarons in TiO_{2}(110). *J. Phys. Chem. C* **113**, 14583–14586 (2009).
53. LeCun, Y., Bengio, Y. & Hinton, G. Deep learning. *Nature* **521**, 436–444 (2015).
54. Behler, J. Four generations of high-dimensional neural network potentials. *Chem. Rev.* **121**, 10037–10072 (2021).
55. Kresse, G. & Furthmüller, J. Efficiency of ab-initio total energy calculations for metals and semiconductors using a plane-wave basis set. *Comput. Mater. Sci.* **6**, 15–50 (1996).
56. Kresse, G. & Joubert, D. From ultrasoft pseudopotentials to the projector augmented-wave method. *Phys. Rev. B* **59**, 1758–1775 (1999).
57. Blöchl, P. E. Projector augmented-wave method. *Phys. Rev. B* **50**, 17953–17979 (1994).
58. Dudarev, S. & Botton, G. Electron-energy-loss spectra and the structural stability of nickel oxide: An LSDA+U study. *Phys. Rev. B* **57**, 1505–1509 (1998).
59. Allen, J. P. & Watson, G. W. Occupation matrix control of d- and f-electron localisations using DFT + U. *Phys. Chem. Chem. Phys.* **16**, 21016–21031 (2014).
60. Kirkpatrick, S., Gelatt, C. D. & Vecchi, M. P. Optimization by simulated annealing. *Science* **220**, 671–680 (1983).
61. Metropolis, N., Rosenbluth, A. W., Rosenbluth, M. N., Teller, A. H. & Teller, E. Equation of state calculations by fast computing machines. *J. Chem. Phys.* **21**, 1087–1092 (1953).
62. Ji, W. et al. Oxygen vacancy distributions and electron localization in a CeO_{2}(100) nanocube. *Inorg. Chem. Front.* **9**, 275–283 (2022).
63. Setvin, M. et al. A multitechnique study of CO adsorption on the TiO_{2} anatase (101) surface. *J. Phys. Chem. C* **119**, 21044–21052 (2015).
64. Potapenko, D. V., Li, Z., Kysar, J. W. & Osgood, R. M. Nanoscale strain engineering on the surface of a bulk TiO_{2} crystal. *Nano Lett.* **14**, 6185–6189 (2014).
65. Krüger, P. et al. Defect states at the TiO_{2}(110) surface probed by resonant photoelectron diffraction. *Phys. Rev. Lett.* **100**, 055501 (2008).
66. Reticcioli, M., Diebold, U. & Franchini, C. Modeling polarons in density functional theory: lessons learned from TiO_{2}. *J. Phys. Condens. Matter* **34**, 204006 (2022).
67. Sokolović, I., Schmid, M., Diebold, U. & Setvín, M. Incipient ferroelectricity: a route towards bulk-terminated SrTiO_{3}. *Phys. Rev. Mater.* **3**, 034407 (2019).
68. Kirby, R. K. Thermal expansion of rutile from 100 to 700 K. *J. Res. Natl Bureau Stand. Sect. A Phys. Chem.* **71A**, 363–369 (1967).
69. Hummer, D. R., Heaney, P. J. & Post, J. E. Thermal expansion of anatase and rutile between 300 and 575 K using synchrotron powder X-ray diffraction. *Powder Diffr.* **22**, 352–357 (2007).
70. Bradbury, J. et al. JAX: composable transformations of Python+NumPy programs. http://github.com/google/jax (2018).
71. Zhang, Z. et al. Imaging intrinsic diffusion of bridge-bonded oxygen vacancies on TiO_{2}(110). *Phys. Rev. Lett.* **99**, 126105 (2007).
72. Schmid, M., Setvín, M. & Diebold, U. Device for suspending a load in a vibration-insulated manner (2019).
73. Giessibl, F. J. The qPlus sensor, a powerful core for the atomic force microscope. *Rev. Sci. Instrum.* **90**, 011101 (2019).
74. Setvín, M. et al. Ultrasharp tungsten tips—characterization and nondestructive cleaning. *Ultramicroscopy* **113**, 152–157 (2012).
75. Huber, F. & Giessibl, F. J. Low noise current preamplifier for qPlus sensor deflection signal detection in atomic force microscopy at room and low temperatures. *Rev. Sci. Instrum.* **88**, 073702 (2017).
76. Diebold, U., Anderson, J. F., Ng, K. O. & Vanderbilt, D. Evidence for the tunneling site on transition-metal oxides: TiO_{2}(110). *Phys. Rev. Lett.* **77**, 1322 (1996).
77. Birschitzky, V., Reticcioli, M. & Franchini, C. Dataset: machine learning based prediction of polaron-vacancy patterns on the TiO_{2}(110) surface. https://zenodo.org/records/10991103 (2024).

## Acknowledgements

This research was funded in part by the Austrian Science Fund (FWF) 10.55776/F81. For open access purposes, the author has applied a CC BY public copyright license to any author-accepted manuscript version arising from this submission. The computational results have been achieved using the Vienna Scientific Cluster (VSC). C.F. acknowledges the NextGenerationEU-Piano Nazionale Resistenza e Resilienza (PNRR) CN-HPC grant no. (CUP) J33C22001170001, SPOKE 6 - Multiscale Modelling & Engineering. K.P. acknowledges the NRDIO-Hungary grant no. FK124100 and a Bolyai Fellowship of the Hungarian Academy of Sciences.

## Author information

### Contributions

V.B. implemented the ML model and performed calculations together with M.P. and M.R. V.B. wrote the first draft under the supervision of C.F. and M.R. C.F. and M.R. conceptualized and supervised the work. I.S. and M.S. performed experiments under the supervision of U.D. All authors contributed to reviewing and editing the final draft.

## Ethics declarations

### Competing interests

The authors declare no competing interests.

## Additional information

**Publisher’s note** Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

## Rights and permissions

**Open Access** This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

## About this article

### Cite this article

Birschitzky, V.C., Sokolović, I., Prezzi, M. *et al.* Machine learning-based prediction of polaron-vacancy patterns on the TiO_{2}(110) surface.
*npj Comput Mater* **10**, 89 (2024). https://doi.org/10.1038/s41524-024-01289-4

DOI: https://doi.org/10.1038/s41524-024-01289-4