Introduction

The rich and tunable physics of oxides depends on their precise chemical composition and on the presence of impurities, including atomic vacancies, interstitial atoms, and dopants in the material1,2,3,4. Defects at the atomic level frequently lead to the formation of polarons, which are localized charge carriers arising from the coupling between unbound charges and lattice phonons5,6,7. In the specific case of so-called small polarons, the polaronic charge is localized almost entirely on one atomic site, surrounded by a sizable distortion of the local lattice structure8. In conjunction with their inducing defects, these small polarons play a dominant role in a wide range of processes relevant to technological applications9 and fundamental phenomena such as charge carrier mobility10,11,12, electron-hole recombination13,14, and adsorption15,16,17.

Most importantly, polarons are known to be highly relevant in the context of (photo)catalysis18,19,20,21 and single-atom catalysis22,23,24. The localized charge carriers act as active centers, which enhance (photo)catalytic activity by providing sites that can readily adsorb and interact with reactant molecules25,26. Although polaron formation may in principle occur on any site of the lattice, the defects can act as attractive or repulsive centers, favoring specific polaronic configurations over others27. In turn, the dynamics and distribution of the atomic defects are known to be altered by the polarons28. Therefore, control over the spatial distribution of polaronic active centers becomes pivotal in optimizing (photo)catalytic performance.

While theoretical studies based on density functional theory (DFT) have elucidated excess charge localization in relation to the inducing defect in many materials29,30,31, the specific role of subsurface and surface polarons, particularly in the presence of defects, on the archetypal redox-active oxide surface TiO2(110) is still debated. Here, a problem arises from the complexity of the configuration space of point impurities, which DFT calculations struggle to cover because of their computational cost. As a consequence, either no exploration attempt is performed (i.e., most studies rely on the configuration randomly obtained in the DFT calculation)32,33, or effective but costly approaches are adopted, such as molecular dynamics34,35, Monte-Carlo-driven DFT simulations36, or systematic explorations limited to a handful of localization sites37. While other fitting methods such as cluster expansion38 have addressed the configurational problem of disordered impurities in (oxide) materials39,40,41, the interactions arising from polarons and other charged defects have sizable contributions within large cutoff distances (≈10 Å)37, resulting in a combinatorial divergence of possible cluster interactions42,43. Thus, finding a method that effectively navigates the diverse defect-polaron configuration landscape has become a research imperative.

In this study, we focus on rutile TiO2(110) and show how the spatial distribution of oxygen vacancies (VO) measured by scanning probe microscopy (SPM) can be successfully predicted and interpreted by first-principles calculations if the coupling between VO and polarons is taken into account. To address this problem, we developed a strategy based on defect distribution descriptors and neural networks to predict the stability of specific polaron-vacancy patterns. Through an iterative active learning cycle (similar in spirit to cluster expansion approaches studying atomic disorder39), we systematically extended the DFT reference dataset and converged the machine learning (ML) model to efficiently explore the defect-polaron configuration space. The model can capture the complexity of the VO-polaron interactions with DFT accuracy and proposes alternative configurations showing remarkable energy stability. By feeding Markov-chain Monte-Carlo (MC) algorithms with the ML configuration energies, we simulate the annealing process leading to the formation of vacancies and polarons in the experimental samples. As a final result, we obtain large-area (>10 × 10 nm2) surface morphologies resembling the SPM measurements. This analysis revealed physical properties of the polarons on TiO2(110), where the formation of inhomogeneously distributed VO is linked to an increased formation of surface polarons and, therefore, to the density of active sites.

Results and discussion

Defect distribution via DFT, experiment, and machine learning

Figure 1 shows the surface structure of reduced rutile TiO2(110) as imaged by constant-current STM measurements (see panel b and Methods Section), together with the models predicted from DFT without taking polarons into account or by explicitly modeling their impact via machine learning (see panels a and c, respectively). The unreconstructed 1 × 1 rutile surface consists of alternating rows of under-coordinated (two-fold) oxygen atoms (the bridging oxygen atoms, Obr) and fivefold coordinated titanium atoms (Ti5c) running along the [001] direction44,45. Oxygen vacancies form easily on the Obr sites upon sputtering and annealing, up to a critical concentration of c\({}_{{{{\rm{{V}}}_{\mathrm{O}}}}}\) ≈ 17%34. Under more strongly reducing conditions, the surface undergoes a structural reconstruction46,47,48,49,50,51. Every VO releases two excess electrons that form polaronic states, localizing preferably on subsurface Ti sites27,35,52. Thus, the VO can be considered as a positively charged (2+) center. From simple electrostatic considerations (while neglecting the role of polarons), one would expect a purely repulsive interaction among the vacancies. In this picture, the configuration maximizing the VO–VO distances represents the most favorable vacancy distribution. For the critical concentration of c\({}_{{{{\rm{{V}}}_{\mathrm{O}}}}}\) = 17%, this corresponds to a homogeneous configuration with a VO–VO distance of six lattice sites along the [001] row, and of three lattice sites between oxygen vacancies on adjacent rows (see Fig. 1a).

Fig. 1: Oxygen vacancy distribution on rutile TiO2(110) obtained by various methods.

a Schematic representation of the most favorable VO distribution in non-polaronic DFT calculations as obtained from a 6 × 4 (~1.8 × 2.6 nm2) supercell. The schematic depiction is generated by showing the Obr bridging atoms as black regions and Ti5c rows and VO as white. The inset displays the structural model of rutile TiO2(110). The distance-maximizing VO distribution (six sites in row, three sites in adjacent row) and the 6 × 4 supercell are indicated. b Unoccupied-states, constant-current STM image of a clean, reduced rutile TiO2(110) surface (imaging parameters in the Figure) depicting Ti5c rows and VOs as bright, while Obr rows are depicted as dark. More details on the contrast formation are given in the Methods. Areas with locally low and high VO concentrations (c\({}_{{{{\rm{{V}}}_{\mathrm{O}}}}}\)) are marked with solid and dashed red boxes, respectively. The crystallographic directions are consistent in all panels. c ML-predicted schematic representation of the surface oxygen vacancy distribution, where the interaction of surface and subsurface polarons (PolS0 and PolS1, respectively) and VOs is modeled in a 54 × 24 (~16 × 16 nm2) supercell. Orange and yellow markers show the position of surface and subsurface polarons in the ML prediction.

DFT calculations confirm the homogeneous VO distribution in Fig. 1a as the ground state configuration, as long as the formation of polarons is suppressed (i.e., the excess electrons are forced into spatially delocalized states at the bottom of the conduction band, rather than localized polaronic states). While this unphysical metallic solution (rutile TiO2 is an n-type semiconductor) is less stable than the polaronic solution, it simplifies the search for the optimal defect distribution via a two-step process: first, the optimal defect pattern is identified through DFT calculations in which polaron formation is suppressed; subsequently, polarons are introduced at random positions, or the most favorable polaron configuration is searched for within the given defect distribution27. While this approach reduces the combinatorial divergence of defect-polaron configurations, it relies on the assumption that the distribution of atomic defects is not affected by the polarons, which is not valid for most materials28.

The experimental measurements do not support such a homogeneous VO pattern. Figure 1b shows a typical image as obtained from low-temperature STM measurements on a TiO2(110) surface after a sputtering and annealing treatment to form a high content of oxygen vacancies (c\({}_{{{{\rm{{V}}}_{\mathrm{O}}}}}\) ≈ 14%, close to the critical value of 17%). At this temperature, the oxygen vacancies (imaged as bright spots along the dark [001] Obr rows) are immobile and appear in irregular patterns, quite far from any homogeneous distribution. The discrepancy with the simple models discussed above is a strong indication of the role that polarons can have in determining the optimal VO surface structure. Simply adding polarons on a rigid VO pattern (effectively decoupling VO and polarons), as usually done in standard DFT simulations, would not improve the situation.

Figure 1c reports the surface structure as predicted by our machine learning model, which allows simultaneously varying both VO and polaron positions to find the configuration that minimizes the total energy of the system. The resulting VO distribution is in good qualitative agreement with the inhomogeneous distribution found in the experiments. Our methodology, described in detail in the following, is capable of capturing the effects of the polarons on the oxygen vacancy distribution, going beyond the simple picture relying purely on VO–VO interactions. Moreover, it allows us to consider large surface areas of about 250 nm2 (>15 × 15 nm2), corresponding to 54 × 24 supercells, considerably extending the limits of standard DFT simulations.
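The quoted dimensions can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: it uses the expanded [001] lattice constant of 2.968 Å given in the Methods and assumes a [1\(\bar{1}\)0] lattice constant of about 6.5 Å, a value inferred here from the ~1.8 × 2.6 nm2 size quoted for the 6 × 4 cell rather than stated explicitly in the text. The function name is hypothetical.

```python
# Lattice constants in nm.
# A_001_NM: expanded [001] lattice constant quoted in the Methods (2.968 Angstrom).
# B_110_NM: ASSUMPTION, inferred from the ~1.8 x 2.6 nm^2 size of the 6 x 4 cell.
A_001_NM = 0.2968
B_110_NM = 0.65


def supercell_size_nm(n_001, n_110):
    """Return the lateral dimensions (in nm) of an n_001 x n_110 surface supercell."""
    return n_001 * A_001_NM, n_110 * B_110_NM


length_nm, width_nm = supercell_size_nm(54, 24)
area_nm2 = length_nm * width_nm
print(f"54 x 24 supercell: {length_nm:.1f} x {width_nm:.1f} nm, area {area_nm2:.0f} nm^2")
```

Under these assumptions, the 54 × 24 supercell spans roughly 16 × 15.6 nm, consistent with the area of about 250 nm2 stated above.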

Machine learning polaron and defect distributions

The methodology proposed here is structured in three parts. First, we train a feed-forward neural network53 to predict the DFT energy of the system depending on the configuration of the impurities. Due to the computational limitations of DFT calculations, we adopt relatively small unit cells in this step. Polaron-polaron/defect interactions in rutile TiO2 become negligible beyond ≈10 Å37. Here, we used two supercells with different lateral extensions (6 × 4 and 12 × 2, see “Methods”) to include these long-ranged interactions along different crystallographic directions. Then, we use the trained model to search for low-energy configurations that were not included in the original set of data, adopting an active learning scheme54. Finally, we use the actively trained model to obtain large-area predictions. In the following, we describe in detail the architecture of the machine-learning model and compare the ML predictions with experimental data on reduced TiO2(110).

The training of the machine learning model requires a reference database built from many distinct polaron and atomic-defect configurations. By following the process described in detail in the Methods Section, we calculated the free energy for different configurations at the DFT+U level using VASP55,56,57, with U = 3.9 eV on the d orbitals of Ti atoms27,58. Polarons were localized at chosen surface TiS0 (PolS0) and subsurface TiS1 (PolS1) sites via occupation matrix control59. We modeled 2367 symmetrically inequivalent polaron-VO configurations in a 6 × 4 supercell (i.e., six and four times the [001] and [1\(\bar{1}\)0] lattice vectors, respectively), and 2155 configurations in a 12 × 2 supercell. To optimize the model, we randomly split the calculated configurations and energies into training and validation data sets, containing 80% and 20% of the configurations, respectively.
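A random 80/20 split of this kind can be sketched as follows. This is a minimal illustration using only the Python standard library, not the code used in this work; the function name, the seed, and the rounding convention for the training-set size are assumptions.

```python
import random


def split_configurations(n_configs, train_fraction=0.8, seed=0):
    """Randomly split configuration indices into training and validation sets."""
    indices = list(range(n_configs))
    random.Random(seed).shuffle(indices)          # reproducible shuffle
    n_train = int(train_fraction * n_configs)     # assumed rounding: floor
    return indices[:n_train], indices[n_train:]


# 2367 configurations were computed for the 6 x 4 supercell.
train_idx, val_idx = split_configurations(2367)
print(len(train_idx), len(val_idx))
```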

Figure 2 sketches the main features of our ML architecture predicting the stability of different defect-polaron configurations. A generic polaron-VO distribution on the rutile TiO2 surface is depicted in Fig. 2a. The descriptor representing the configuration is constructed by first discretizing the space into a coarse grid (see Fig. 2b) and then encoding the spatial distribution of polarons and atomic defects. While this representation does not explicitly include the specific structural distortions accompanying the defect configuration, we account for them by training our model on relaxed structures that capture the distortions and their contribution to the energy. This is possible since the distortions accompanying a specific defect configuration are defined by the position of the defects in the lattice (i.e., in a DFT calculation, defining the defect configuration suffices to find the minimum energy structure). The discretized space simplifies the training of the ML model, as compared to using Euclidean distances27. To improve the description of the interaction with polarons/defects at similar crystallographic displacements, we employ a one-hot encoding (i.e., a value of 1 for grid cells containing a defect, 0 otherwise) that is smeared to effectively encode the defect density. We achieve this smearing via multiple applications of a discrete Laplacian kernel to the one-hot encoding (Fig. 2c), mimicking a diffusion process. Then, to predict the energy of the whole system given a specific configuration, we split the total energy into contributions arising from a single defect/polaron impurity (Fig. 2d):

$$E_{\mathrm{tot}}=\sum\limits_{i}^{N_{\mathrm{S1}}}E_{i}+\sum\limits_{j}^{N_{\mathrm{S0}}}E_{j}+\sum\limits_{k}^{N_{\mathrm{VO}}}E_{k}$$
(1)

Here, Etot is the total energy of a given configuration, and Ei, Ej, and Ek are the virtual contributions of a single PolS1, PolS0, and VO, respectively. We use a feed-forward neural network to estimate the virtual contribution of a single defect/polaron (Fig. 2e). Finally, we sum over the virtual contributions to obtain the total configuration energy27. The total energy can be computed by DFT calculations37, while the virtual contributions are not directly accessible in the DFT data. Thus, we train our model using the discretized defect-polaron positions as a descriptor, and the DFT energy as the target quantity. By training the ML model on DFT data obtained for the 6 × 4 unit cell (see Supplementary Fig. 1), we achieved a mean absolute error of 1.8 and 2.2 meV/VO for the training and validation sets, respectively. By adding training data from the 12 × 2 unit cell (see Supplementary Fig. 2), the mean absolute error increased slightly (2.9 and 3.5 meV/VO in training and validation, respectively). However, by using both sets of data in the training, the ML model can account for longer-range interactions in both the [001] and [\(1\bar{1}0\)] directions. For a detailed description of the training process, see the “Methods” Section.
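The descriptor construction described above (one-hot encoding on a periodic grid, followed by repeated Laplacian smearing) can be illustrated with a minimal pure-Python sketch. It is not the implementation used in this work: the five-point Laplacian stencil, the step size alpha, the number of smearing steps, the grid dimensions, and all function names are assumptions chosen for illustration.

```python
def one_hot_grid(n_rows, n_cols, sites):
    """Grid with 1.0 at every (row, col) hosting a defect and 0.0 elsewhere."""
    grid = [[0.0] * n_cols for _ in range(n_rows)]
    for r, c in sites:
        grid[r][c] = 1.0
    return grid


def laplacian_step(grid, alpha=0.2):
    """One explicit diffusion step, x <- x + alpha * Laplacian(x), periodic boundaries.

    The five-point stencil and alpha are assumptions; alpha <= 0.25 keeps values >= 0.
    """
    n_rows, n_cols = len(grid), len(grid[0])
    new = [[0.0] * n_cols for _ in range(n_rows)]
    for r in range(n_rows):
        for c in range(n_cols):
            neighbours = (grid[(r - 1) % n_rows][c] + grid[(r + 1) % n_rows][c]
                          + grid[r][(c - 1) % n_cols] + grid[r][(c + 1) % n_cols])
            new[r][c] = grid[r][c] + alpha * (neighbours - 4.0 * grid[r][c])
    return new


def smear(grid, n_steps=3, alpha=0.2):
    """Apply the Laplacian step several times, mimicking a short diffusion process."""
    for _ in range(n_steps):
        grid = laplacian_step(grid, alpha)
    return grid


# Toy example: two vacancies on a 4 x 6 periodic grid.
vacancies = [(0, 0), (2, 3)]
smeared = smear(one_hot_grid(4, 6, vacancies))
total_weight = sum(sum(row) for row in smeared)
```

With periodic boundaries the smearing redistributes weight without creating or destroying it, so the summed descriptor still equals the number of defects; neighbouring cells acquire non-zero values that encode proximity to a defect.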

Fig. 2: Machine learning model architecture.

a A defect structure consisting of oxygen vacancies and polarons in a supercell. b The supercell is converted into a discretized grid, where each cell encodes whether it contains a defect/polaron. c Smearing of the one-hot encoding. d The supercell is partitioned into the local environment of each defect. e The local environment descriptors are fed through a feed-forward neural network to predict the energy contribution of each defect. The sum of the individual defect contributions gives the total energy of the system.
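The decomposition of the total energy into per-defect contributions predicted by defect-type-specific networks can be sketched as follows. This is a toy illustration under stated assumptions, not the trained model: the network size, the tanh activation, the random untrained weights, the three-component descriptors, and all function names are hypothetical.

```python
import math
import random


def make_mlp(n_in, n_hidden, rng):
    """Random (untrained) parameters of a one-hidden-layer feed-forward network."""
    return {
        "w1": [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)],
        "b1": [0.0] * n_hidden,
        "w2": [rng.uniform(-1, 1) for _ in range(n_hidden)],
        "b2": 0.0,
    }


def mlp_energy(params, descriptor):
    """Virtual energy contribution of one defect from its local descriptor."""
    hidden = [math.tanh(sum(w * x for w, x in zip(row, descriptor)) + b)
              for row, b in zip(params["w1"], params["b1"])]
    return sum(w * h for w, h in zip(params["w2"], hidden)) + params["b2"]


def total_energy(networks, defects):
    """Sum the per-defect contributions, using one network per defect type."""
    return sum(mlp_energy(networks[kind], descriptor) for kind, descriptor in defects)


rng = random.Random(0)
networks = {kind: make_mlp(3, 4, rng) for kind in ("PolS1", "PolS0", "VO")}

# Hypothetical local descriptors (e.g. flattened smeared-grid values around each defect).
defects = [("VO", [1.0, 0.2, 0.0]), ("PolS1", [0.3, 0.1, 0.4]), ("PolS0", [0.0, 0.5, 0.2])]
contributions = [mlp_energy(networks[kind], d) for kind, d in defects]
e_total = total_energy(networks, defects)
```

Because the total is a plain sum over defects, the prediction does not depend on the order in which defects are listed, and the same networks can be evaluated on supercells of any size.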

Aiming for a comparison with the experimental measurements, we focus here on the low-energy configurations, which are more likely to be stabilized in real samples. To identify such stable configurations, we performed simulations that model the annealing process. In the preparation of the experimental samples, both polarons and oxygen vacancies diffuse on the sample during annealing. At lower temperatures, VO on rutile are immobile, while polarons always show a certain degree of mobility, hopping/tunneling a few lattice sites around the equilibrium position34,45. The simulated annealing can be implemented as a global optimization scheme60. Candidate configurations are obtained by perturbing the current configuration, randomly displacing one defect/polaron impurity to any nearest neighbor site. The candidate configuration is either accepted or rejected according to the Metropolis-Hastings algorithm61, with the acceptance criterion based on the configuration energy. This is similar in spirit to large-scale defect distribution studies based on reverse MC62 (although in our approach the defect distribution is not fitted to minimize the deviation from experiment, but relies entirely on DFT/ML data). Calculating the energy of the candidate configurations within the DFT framework would make this approach unfeasible, due to the computational cost of DFT calculations and the high number of energy evaluations required for a single optimization. Conversely, the ML model allows us to inspect the stability of an extremely high number of defect-polaron configurations (minimization of the energy requires on the order of \(10^{3}\)–\(10^{6}\) energy evaluations, depending on the size and initialization of the employed configuration) and enables the exploration of candidate structures.
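The annealing scheme described above can be illustrated with a minimal sketch. The example below is not the simulation used in this work: it replaces the ML energy by a toy pairwise repulsion between defects on a single periodic row, and the geometric cooling schedule, the temperature units, the step count, and all names are assumptions. Since the nearest-neighbor proposal is symmetric, the Metropolis-Hastings criterion reduces to the plain Metropolis rule.

```python
import math
import random


def toy_energy(sites, n_sites):
    """Stand-in for the ML energy: pairwise 1/d repulsion on a periodic row (assumption)."""
    energy = 0.0
    for i in range(len(sites)):
        for j in range(i + 1, len(sites)):
            d = abs(sites[i] - sites[j])
            energy += 1.0 / min(d, n_sites - d)   # minimum-image distance
    return energy


def anneal(sites, n_sites, energy_fn, n_steps=5000, t_start=1.0, t_end=0.01, seed=0):
    """Simulated annealing with nearest-neighbor moves and Metropolis acceptance."""
    rng = random.Random(seed)
    current = list(sites)
    e_current = energy_fn(current, n_sites)
    best, e_best = list(current), e_current
    for step in range(n_steps):
        # Geometric cooling from t_start to t_end (assumed schedule).
        temperature = t_start * (t_end / t_start) ** (step / (n_steps - 1))
        idx = rng.randrange(len(current))
        new_site = (current[idx] + rng.choice((-1, 1))) % n_sites
        if new_site in current:
            continue                               # target site already occupied
        candidate = list(current)
        candidate[idx] = new_site
        delta = energy_fn(candidate, n_sites) - e_current
        # Metropolis rule: always accept downhill moves, uphill with Boltzmann probability.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, e_current = candidate, e_current + delta
            if e_current < e_best:
                best, e_best = list(current), e_current
    return best, e_best


start = [0, 1, 2, 3]                               # four clustered defects on 24 sites
best, e_best = anneal(start, 24, toy_energy)
```

In the actual workflow the role of `toy_energy` would be played by the trained ML model, which is what makes the large number of energy evaluations affordable.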

We iterated the annealing simulations following an active learning procedure. The initial DFT data set, built from random configurations, was progressively augmented by including the results from the annealing optimization (see Supplementary Fig. 4). Consequently, we obtained a final ML model refined to account for a broader range of configurations. The refined model is finally used to obtain large-area predictions (7.1 × 10.5 nm2 to collect statistics and 16 × 16 nm2 for visualizations) of the defect-polaron distributions, again using the simulated annealing approach. The qualitative agreement with the experimental data is shown in Fig. 1c for the 54 × 24 rutile TiO2(110) supercell. In the following, we quantitatively analyze our results.

Formation of VO-polaron patterns and their mutual interaction

The analysis of the low-energy configurations (see Supplementary Fig. 5 for the energy distribution of all possible VO configurations in the 6 × 4 cell) is summarized in Fig. 3. Figure 3a shows the improvement of the energies of the TiO2(110) 6 × 4 cell as obtained by treating VO-polaron coupling at three different levels: (i) suppressing polarons (“No Polarons”); (ii) distributing polarons in random positions or in positions guided by physical intuition (“Random Polarons”); (iii) full inclusion of the polaron–VO interaction via our proposed ML protocol (“ML Polarons”). By suppressing polaron formation, the ground state configuration is given by the vacancies being homogeneously distributed on the surface (C\({}_{{{{\rm{NoP}}}}}^{{{{\rm{Hom}}}}}\) configuration, see “No Polarons” column in Fig. 3a). The “Random Polarons” column of Fig. 3a shows instead the energy of the system obtained by including polarons in random positions and enriched by adding specific, low-energy polaronic configurations that were suggested in previous studies27,34,37. Here, the ground state configuration is given by a homogeneous distribution of PolS1 in the homogeneous VO pattern (labeled as C\({}_{{{{\rm{RandP}}}}}^{{{{\rm{Hom}}}}}\) in Fig. 3a).

Fig. 3: Analysis of VO-polaron configurations in a 6 × 4 TiO2(110) cell.

a Comparison of selected low-energy vacancy-polaron configurations as obtained by different treatments of the polaron–VO interaction. For a comparison of all configurations and their labeling, refer to Supplementary Fig. 6. The change in energy for all low-energy configurations is displayed in Supplementary Fig. 7. “No Polarons” refers to DFT calculations suppressing polaron formation. “Random Polarons” refers to the reference DFT data set, built by including polarons in random positions or guided by physical intuition. “ML Polarons” indicates the DFT energies of configurations identified in the ML search. Total energies ΔE are shown using the homogeneous VO distributions (from “No Polarons” and “Random Polarons”) as references (note the large energy gain ΔEpol of −3.23 eV between non-polaronic and polaronic solutions with homogeneous VO patterns). Dashed lines connect identical VO configurations. VO configurations found in the ML search are displayed in red. The occurrence of TiS0 polarons is highlighted in orange and purple for [1\(\bar{1}\)2]- and [1\(\bar{1}\)0]-aligned oxygen vacancies, respectively. The most important VO-polaron complexes are shown schematically in top view at the bottom of the Figure. Only the most stable polaronic configuration per VO arrangement is shown. b, c Top and side views of the polaronic isocharge surfaces of the [1\(\bar{1}\)2]-aligned VO-PolS0 complex (top), and of the PolS1 in the homogeneous VO-distribution (bottom).

Treating polaron-VO coupling at the ML level (third approach) results in VO distributions with lower energy, indicating an alternative ground state for the system, where the homogeneous configuration is no longer the most stable one, as shown in the “ML Polarons” column of Fig. 3a. First, we note that the ML model identified a different ordering of PolS1 showing better stability in the homogeneous VO background (labeled as C\({}_{{{{\rm{ML}}}}}^{{{{\rm{Hom}}}}}\), see also Supplementary Fig. 6). Moreover, polaron configurations explored by the extensive ML search improve the stability of many other VO patterns (see the energy levels in black in the “ML Polarons” column of Fig. 3a, lower than in the “Random Polarons” column). Importantly, two of these previously unexplored polaron configurations (labeled as C\({}_{{{{\rm{ML}}}}}^{{{{\rm{0}}}}}\) and C\({}_{{{{\rm{ML}}}}}^{{{{\rm{2}}}}}\)) resulted in energy values even lower than the homogeneous distribution, revealing an alternative ground state for the system. Moreover, alternative VO-patterns were proposed by the ML search as low-energy configurations. One in particular (red line in Fig. 3a) is ranked as the second most stable configuration (C\({}_{{{{\rm{ML}}}}}^{{{{\rm{1}}}}}\)). The polarons play a key role in stabilizing this VO-pattern. As further proof, we calculated the energy of this VO-pattern while artificially suppressing polaron formation, and found it to be much less stable (red line in the “No Polarons” column).

Interestingly, in all the low-energy configurations obtained from the ML-driven search (except for C\({}_{{{{\rm{ML}}}}}^{{{{\rm{Hom}}}}}\)), we note the presence of at least one polaron on a surface TiS0 site (configurations containing PolS0 are highlighted in orange in Fig. 3a). Figure 3b and c compare the spatial distribution of the surface PolS0 and subsurface PolS1. The formation of the surface polaron is particularly stable when occurring in the central TiS0 site between two oxygen vacancies aligned along the [1\(\bar{1}\)2] direction (see top view in Fig. 3b). This [1\(\bar{1}\)2]-aligned VO-polaron complex indeed represents the ground state configuration obtained by our ML search (e.g., it is present in C\({}_{{{{\rm{ML}}}}}^{{{{\rm{0,1,2}}}}}\)). Another remarkably stable complex is given by two vacancies aligned along the [\(1\bar{1}0\)] direction and one PolS0 in their vicinity (see Supplementary Fig. 6). For instance, this complex appears in the configurations highlighted in purple in Fig. 3a (C\({}_{{{{\rm{ML}}}}}^{{{{\rm{10,11}}}}}\)). The [1\(\bar{1}\)2] and [\(1\bar{1}0\)] alignments found in the ML search agree well with the experimental SPM measurements (compare Fig. 1b and c) showing a high coverage of such high-density VO regions. In contrast, DFT predictions that neglect the polaron-VO interaction or randomly distribute polarons favor homogeneous configurations.

Comparison of a large-scale model and the experimental surface

Figure 4 shows our results as obtained by ML-driven annealing simulations on large-area 24 × 16 cells (corresponding to 7 × 10 nm2), which enable a direct comparison with the experiment. Visual inspection (Fig. 1b vs. c) already indicates that our ML treatment provides a VO distribution that closely resembles the experimental one. We quantify this agreement by calculating autocorrelation functions (ACFs)63 for simulated annealings under different computational conditions and comparing them to the experimental ACF of the VO distributions extracted from Fig. 1b (for details see Supplementary Fig. 8). The simulated annealing procedure starts from random VO–polaron configurations, from which we obtain several large-area models (such as the one in Fig. 1c), all showing very similar characteristics. To complete our comparison, we also use the ML model to anneal a system where polaron formation is suppressed. This scheme, similar to the non-polaronic DFT approach of Fig. 1a, assumes a homogeneous VO pattern but takes into account annealing-induced disorder effects.

Fig. 4: Autocorrelation functions of the VO positions as extracted from Fig. 1b (Exp) and ML-based ACFs with (ML Pol) and without polarons (ML No Pol).

Comparison of experimental and simulated VO autocorrelation functions along a single (a) and adjacent (b) [001]-aligned Obr row. Experimental autocorrelation functions are averaged to remove remaining anisotropies. Simulated ones are averaged over all symmetrically equivalent most stable configurations from 60 differently seeded simulated annealing runs performed in 24 × 16 supercells. ML Pol and ML No Pol are started from identically seeded VO patterns, with 2\({N}_{{{{{\rm{V}}}}}_{{{{\rm{O}}}}}}\) polarons and no polarons, respectively. Autocorrelation functions are rescaled to account for c\({}_{{{{\rm{{V}}}_{\mathrm{O}}}}}\) differences in experiment (14.2%) and simulation (16.7%).

The ACFs are shown in Fig. 4, where projections of VO defect populations along the same and adjacent [001] rows are shown in the histograms in panels a and b, respectively. For oxygen vacancies lying on the same row, both the ML model and the experiments show that short VO–VO distances of 1 and 2 lattice sites are unlikely. The highest probability lies at a distance of 4 or 5 lattice sites for both the experiment and the ML annealing including polaron–VO interactions (see Fig. 4a and b, red and blue data, respectively). By considering only the VO–VO repulsion as the driving force (i.e., excluding polaron formation in the ML annealing procedure; see ML No Pol in Fig. 4a and b) and applying the same annealing protocol as in the polaron-VO interaction case, we find the probability maximum at a 6-site distance within a row and at a 3-site distance in the adjacent row. This is further evidence for the role of polarons in stabilizing the VO arrangement.
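For illustration, an in-row autocorrelation of this type can be computed from a binary occupancy vector as sketched below. This is a simplified stand-in for the analysis detailed in Supplementary Fig. 8: the periodic boundary, the normalization per vacancy, and the function name are assumptions.

```python
def row_autocorrelation(occupancy, max_lag):
    """In-row VO autocorrelation for lags 1..max_lag on a periodic row.

    occupancy: list of 0/1 flags, 1 marking an oxygen vacancy on an Obr site.
    Returns, for each lag, the number of vacancy pairs at that separation
    divided by the number of vacancies (assumed normalization).
    """
    n_sites = len(occupancy)
    n_vacancies = sum(occupancy)
    acf = []
    for lag in range(1, max_lag + 1):
        pairs = sum(occupancy[i] * occupancy[(i + lag) % n_sites] for i in range(n_sites))
        acf.append(pairs / n_vacancies)
    return acf


# Homogeneous reference pattern: one vacancy every six sites along a 24-site row.
homogeneous_row = [1, 0, 0, 0, 0, 0] * 4
acf = row_autocorrelation(homogeneous_row, max_lag=8)
```

For this homogeneous pattern the only non-zero entry within the first eight lags is the one at a distance of six sites, which is the signature expected when only VO–VO repulsion is considered.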

As a result, the rutile TiO2 surface shows some areas with a locally low density of oxygen vacancies (down to 0%), alternating with highly dense areas (up to 20%, which is compatible with the 4-site-distance distribution). Our data suggest that the great stability of the [1\(\bar{1}\)2]- and [\(1\bar{1}0\)]-aligned VO–polaron complexes contributes to this alternation of locally less and more reduced areas at a given c\({}_{{{{\rm{{V}}}_{\mathrm{O}}}}}\). To further corroborate this result, we performed additional DFT calculations modeling this strong inhomogeneity (see Supplementary Fig. 6 with configurations C\({}_{{{{\rm{ML}}}}}^{{{{\rm{26-29}}}}}\)). We also note that this analysis reconciles the DFT prediction for the critical concentration at which the (1 × 2) surface reconstruction occurs, calculated as ~20%, with the experiments reporting an average concentration of 17%34, two values that are otherwise in apparent disagreement.

Conclusions

In summary, we directly elucidated the impact of polarons on the structure of oxide surfaces, using the example of the prototypical rutile TiO2(110) surface. Specifically, we designed a computational machinery to predict the distribution of polarons and oxygen vacancies on rutile TiO2(110) by performing machine-learning-guided DFT calculations. While the defect configuration energy could in principle also be modeled by a linear model such as cluster expansion, non-linear models (such as neural networks) might turn out to be preferable due to the complexity of the interactions at play in oxide materials43. MC-driven annealing simulations based on the ML data enabled the exploration of defect distributions on scales much larger than standard DFT allows. An analysis of the experimental SPM images yielded a direct validation of the theoretical predictions. While conventional approximations used in traditional DFT calculations result in homogeneous solutions, we were able to retrieve the inhomogeneity of the VO distribution as detected by the experiments. Our analysis clarifies the peculiar inhomogeneous distribution of VO on rutile TiO2(110). Most importantly, the system shows a tendency towards the formation of high-density VO patterns, alternating with low-density VO regions. While larger defect-free areas are typically attributed to subsurface Ar impurities64, the fluctuation of the local c\({}_{{{{\rm{{V}}}_{\mathrm{O}}}}}\) observed here can partially be attributed to the interaction of polarons and VOs.

These results suggest that surface reactivity could be optimized by tuning the annealing procedure to facilitate the formation of energetically more favorable, high-density VO patterns, which promote surface-localized charges and their interaction with adsorbates26. To elucidate the role of the surface polaron, further experiments are necessary. Resonant photoelectron diffraction does not rule out the formation of surface-localized charge carriers, even at low c\({}_{{{{\rm{{V}}}_{\mathrm{O}}}}}\)65. SPM measurements in the presence of CO adsorbates confirm the formation of the [1\(\bar{1}\)2]-aligned VO–polaron complex26, while STM measurements probing the filled states on the clean surface do show some disparity in comparison to simulated STM37. The reasons for this discrepancy are manifold, ranging from temperature-induced effects66 to the electric field of the tip.

Furthermore, we expect our methodology to be applicable to any other polaronic system, even those including multiple defects as sources of polarons, such as the perovskite SrTiO3(001) surface67, which exhibits Sr adatoms/vacancies and is often doped with Nb atoms29. Moreover, this methodology could be used to study the spatial distribution of defects (e.g., subsurface, bulk) that are not directly accessible by the experiments, such as interstitial titanium in rutile. Additionally, the stochastic optimization model could be further improved by considering realistic anisotropic diffusion probabilities along certain directions. This could be achieved by explicitly computing hopping and diffusion barriers, and incorporating these barriers into the annealing simulations.

Methods

DFT modeling

We performed DFT+U calculations using VASP55,56,57 on the rutile TiO2(110) surface. We used standard projector augmented wave pseudopotentials for Ti (treating d- and s-orbitals as valence) and soft O pseudopotentials. We adopted a Hubbard U = 3.9 eV on the d orbitals of Ti atoms27,58. The sampling of the reciprocal space included the Γ-point and the plane-wave energy cutoff was set to 400 eV.

The surfaces were modeled using five-layer-thick slabs (where the two bottom stoichiometric layers were fixed at their bulk positions) with lateral supercell sizes of 6 × 4 and 12 × 2. To partially account for the role of thermal effects in the stabilization of the VO patterns during the annealing treatment in the experiments, we used an expanded [001] lattice vector. Specifically, the low-T lattice constant of 2.953 Å44 was expanded to 2.968 Å (high T, corresponding to 500–600 K) in accordance with thermal expansion coefficient measurements68,69. This strain of +0.5% is well below the crossover point of +3%, where surface polaron formation is favored over subsurface polaron formation66.
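The quoted strain follows directly from the two lattice constants, as the short check below illustrates (variable names are hypothetical; the numbers are those given above).

```python
# [001] lattice constants in Angstrom, as quoted in the text.
A_LOW_T = 2.953    # low-temperature value
A_HIGH_T = 2.968   # expanded value used in the calculations

# Crossover strain (in percent) above which surface polarons are favored, from the text.
CROSSOVER_PERCENT = 3.0

strain_percent = 100.0 * (A_HIGH_T - A_LOW_T) / A_LOW_T
print(f"Applied strain: +{strain_percent:.2f}% (crossover at +{CROSSOVER_PERCENT:.0f}%)")
```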

Within the supercells, we removed 4 surface-bridging oxygen atoms (in random positions) from every slab, obtaining a \(c_{\mathrm{V_O}}\) of approximately 17%. To assess non-polaronic solutions we performed spin-unpolarized DFT, constraining the excess electrons in spatially delocalized states at the bottom of the conduction band. To model the polaronic structure, we followed a three-step procedure: Initially, we removed bridging oxygen atoms from a pristine structure to generate a specific oxygen vacancy pattern. This structure was relaxed while all excess charge carriers were kept delocalized by employing a spin-unpolarized relaxation. After retrieving the structural properties of the oxygen vacancy configuration, we introduced polaronic distortions at selected sites via occupation matrix control59, using distinct occupation matrices for PolS1 and PolS0 sites27. Finally, we performed an unconstrained relaxation starting from the structures and wave functions determined in the previous step. As a starting point for building our dataset, we chose 43 symmetrically inequivalent VO configurations and pre-relaxed them at the delocalized level. Afterward, we added polarons at random positions to these relaxed structures, while keeping the polaron densities in each layer within the ranges suggested by our previous works27,37 (i.e., most configurations have fewer than 4 surface polarons). After the initial dataset was constructed, we systematically extended the dataset via simulated annealing with our ML model. With this active learning procedure, we extended our dataset by roughly 200 configurations.
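As a rough illustration of the vacancy concentration quoted above, the sketch below counts bridging-oxygen sites and draws a random vacancy pattern. It assumes one bridging oxygen site per (1 × 1) surface unit cell, so that both the 6 × 4 and the 12 × 2 supercells offer 24 sites; all function names are hypothetical and the snippet does not reproduce the symmetry analysis used to select inequivalent patterns.

```python
import random


def bridging_sites(n_a: int, n_b: int):
    """List bridging-oxygen sites, assuming one site per (1x1) surface cell."""
    return [(i, j) for i in range(n_a) for j in range(n_b)]


def vacancy_concentration(n_vacancies: int, n_a: int, n_b: int) -> float:
    """Fraction of bridging-oxygen sites that are vacant."""
    return n_vacancies / (n_a * n_b)


def random_vacancy_pattern(n_vacancies: int, n_a: int, n_b: int, seed=None):
    """Pick n_vacancies distinct bridging-oxygen sites at random."""
    rng = random.Random(seed)
    return sorted(rng.sample(bridging_sites(n_a, n_b), n_vacancies))


for shape in [(6, 4), (12, 2)]:
    c_vo = vacancy_concentration(4, *shape)
    pattern = random_vacancy_pattern(4, *shape, seed=0)
    print(f"{shape[0]} x {shape[1]}: c_VO = {100 * c_vo:.1f}%, vacancies at {pattern}")
```

Under this assumption, 4 vacancies out of 24 sites give 16.7%, consistent with the approximately 17% stated in the text.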

ML model training and optimization of defect configurations

The machine learning model is implemented in JAX70. Here, we describe the model optimization based on the study of configurations in the 6 × 4 supercell. Our ML model consists of featurisation (see Fig. 2a–d) and defect-type-specific feed-forward neural networks (see Fig. 2e). The featurisation consists of a one-hot encoding on a grid, specifying the defect configuration, followed by smearing via multiple applications of a discrete Laplacian kernel:

$${u}_{i,j}^{n+1}={u}_{i,j}^{n}+D\left({u}_{i+1,j}^{n}+{u}_{i-1,j}^{n}+{u}_{i,j+1}^{n}+{u}_{i,j-1}^{n}-4{u}_{i,j}^{n}\right)$$
(2)

Here \({u}_{i,j}^{n}\) represents the defect density encoding for the defect type u ∈ {VO, PolS0, PolS1} at grid position i, j after iteration n. D is treated as a hyperparameter and was set to 0.1. The Laplacian kernel was applied for 8 iterations. The local environment range was set to include 6 sites in the [001] direction and 4 sites in the \([1\bar{1}0]\) direction, as this choice provided the best results in the hyperparameter optimization (see Supplementary Fig. 3). Each defect-specific feed-forward neural network consisted of 3 layers of width 94, 32, and 10, respectively. We optimized the machine learning model using stochastic gradient descent and backpropagation on an augmented dataset that includes all symmetrically equivalent representations of the training defect configurations. We randomly split this dataset into 80% training data and 20% validation data and optimized the model parameters by minimizing the mean squared error of the energy prediction on the training data. Before training, the energies and the descriptors were rescaled to [0, 1] by min-max scaling. Using an early stopping mechanism, the best model was selected based on the lowest validation error within the optimization procedure. The mean squared error during training as well as a scatter plot of DFT and ML energies are displayed in Supplementary Figs. 1 and 2.
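The smearing step of Eq. (2) can be sketched in a few lines. The version below is a minimal plain-Python illustration, not the JAX implementation used in this work: it assumes periodic boundary conditions on the supercell grid (the boundary treatment is not specified above), and the function names and the example vacancy positions are hypothetical.

```python
def one_hot_grid(sites, n_rows, n_cols):
    """One-hot encoding of one defect type on an n_rows x n_cols grid."""
    grid = [[0.0] * n_cols for _ in range(n_rows)]
    for i, j in sites:
        grid[i][j] = 1.0
    return grid


def laplacian_step(u, d=0.1):
    """One application of Eq. (2); periodic boundaries are an assumption."""
    n_rows, n_cols = len(u), len(u[0])
    new = [[0.0] * n_cols for _ in range(n_rows)]
    for i in range(n_rows):
        for j in range(n_cols):
            neighbours = (
                u[(i + 1) % n_rows][j] + u[(i - 1) % n_rows][j]
                + u[i][(j + 1) % n_cols] + u[i][(j - 1) % n_cols]
            )
            new[i][j] = u[i][j] + d * (neighbours - 4.0 * u[i][j])
    return new


def smear(u, d=0.1, n_iter=8):
    """Apply the Laplacian kernel n_iter times (D = 0.1, 8 iterations in the text)."""
    for _ in range(n_iter):
        u = laplacian_step(u, d)
    return u


# Example: four vacancies at arbitrary positions on a 6 x 4 grid.
vo_density = smear(one_hot_grid([(0, 0), (2, 1), (3, 3), (5, 2)], 6, 4))
total = sum(sum(row) for row in vo_density)
for row in vo_density:
    print(" ".join(f"{value:.3f}" for value in row))
```

With periodic boundaries the kernel only redistributes density, so the sum over the grid stays equal to the number of defects, and for D = 0.1 all entries remain non-negative because the central weight 1 − 4D is positive.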

To ensure sufficient accuracy when using the model in the case of exploration, we applied an active learning procedure as depicted in Supplementary Fig. 4. Here, we performed an iterative training-testing loop to further improve the reliability, data efficiency, and scope of the proposed model. Since our main interest lies in the determination of low-energy polaron-defect complexes, our model was used for the optimization of defect configurations in various cases. We searched for global optima of configurations by allowing all defects to diffuse during the optimization. Local minima of fixed polaron layer densities were added by restricting polaron movement to intra-layer hopping. Also, local minima of cases where the VO-configuration was fixed and only polarons were relaxed, were explored. Within these three exploration cases, we extracted and confirmed the most stable configurations by performing a comparative DFT calculation of the proposed polaron configurations.

The optimization of configurations is performed via simulated annealing60, where we use the predicted energy of our ML model to determine the Metropolis acceptance probability Paccept:

$${P}_{{{{\rm{accept}}}}}=\min \left(1,\exp ((E({{{\bf{x}}}})-E({{{{\bf{x}}}}}^{{\prime} }))/{k}_{{{{\rm{B}}}}}T)\right)$$
(3)

Here E(x) is the ML predicted energy associated with configuration x. Configuration \({{{{\bf{x}}}}}^{{\prime} }\) is generated from configuration x in a random walk, by randomly choosing one defect in configuration x and randomly displacing it to an adjacent site. The temperature variable T in the Metropolis criterion was set to 1000 K (similar to the annealing temperatures in the sample preparation). Even though the diffusion processes of the respective defects during the optimization are physically motivated, they do not necessarily represent the physical process of the formation of observed defect patterns. Defect transport mechanisms such as inter-row hopping of oxygen vacancies have not been reported71, but allowing them in the random walk may improve optimization efficiency or help the search overcome energy barriers. Discrepancies between polaron and vacancy hopping rates were also ignored, which potentially affects the final outcome of the optimization. Similar effects were observed for the specific temperature or temperature ramp employed in the simulated annealing.
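The acceptance rule of Eq. (3) and the single-defect random walk can be illustrated as follows. This is a simplified sketch under stated assumptions: it treats a single defect species on one periodic two-dimensional grid, rejects moves onto occupied sites, and replaces the ML energy model by a hypothetical toy pair repulsion (`toy_energy`). None of these names or simplifications are taken from the actual implementation.

```python
import math
import random

K_B = 8.617333262e-5  # Boltzmann constant in eV/K


def acceptance_probability(e_old, e_new, temperature):
    """Metropolis rule of Eq. (3): min(1, exp((E(x) - E(x')) / (kB T)))."""
    delta = e_old - e_new
    if delta >= 0.0:
        return 1.0
    return math.exp(delta / (K_B * temperature))


def propose_move(config, shape, rng):
    """Displace one randomly chosen defect to a random adjacent site."""
    n_rows, n_cols = shape
    idx = rng.randrange(len(config))
    i, j = config[idx]
    di, dj = rng.choice([(1, 0), (-1, 0), (0, 1), (0, -1)])
    target = ((i + di) % n_rows, (j + dj) % n_cols)
    if target in config:  # assumption: no double occupation of a site
        return config
    trial = list(config)
    trial[idx] = target
    return trial


def toy_energy(config, shape):
    """Hypothetical stand-in for the ML model: pairwise repulsion in eV."""
    n_rows, n_cols = shape
    energy = 0.0
    for a in range(len(config)):
        for b in range(a + 1, len(config)):
            di = abs(config[a][0] - config[b][0])
            dj = abs(config[a][1] - config[b][1])
            di, dj = min(di, n_rows - di), min(dj, n_cols - dj)
            energy += 0.1 / (di + dj)
    return energy


def anneal(config, shape, energy_fn, temperature=1000.0, n_steps=2000, seed=0):
    """Fixed-temperature Metropolis walk that tracks the lowest energy seen."""
    rng = random.Random(seed)
    energy = energy_fn(config, shape)
    best, e_best = list(config), energy
    for _ in range(n_steps):
        trial = propose_move(config, shape, rng)
        e_trial = energy_fn(trial, shape)
        if rng.random() < acceptance_probability(energy, e_trial, temperature):
            config, energy = trial, e_trial
            if energy < e_best:
                best, e_best = list(config), energy
    return best, e_best


shape = (6, 4)
start = [(0, 0), (0, 1), (1, 0), (1, 1)]
best, e_best = anneal(start, shape, toy_energy)
print("start energy:", toy_energy(start, shape), "best energy:", e_best)
```

In the actual workflow the energy function would be the trained ML model, and the move set would distinguish vacancy and polaron sites as well as the inter-layer and intra-layer restrictions described above.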

Experimental setup

SPM was performed using STM in an ultrahigh vacuum (UHV) chamber with a base pressure below 2 × 10−11 mbar; the whole chamber, equipped with an Omicron qPlus low-temperature head, was suspended using 36 bungee cords for efficient vibration damping72. Stiff qPlus sensors73 (k = 1800 N m−1, Q = 5000–30,000, f0 ∈ [25, 45] kHz) with a sharp W tip74 were used to collect the tunneling current (It) and the frequency shift (Δf) signals; deflection detection was achieved using a cryogenic preamplifier in vacuum75. W tips were treated on a Cu(110) surface until they were decorated with a sharp, conductive Cu pyramid at the apex, and were subsequently used for imaging the rutile TiO2(110) surface. Tip sharpness was indicated by the low frequency shifts (Δf ∈ [0, −1] Hz) recorded during STM imaging of a Cu(110) test sample.

Sample preparation was performed in a separate UHV chamber (connected to the measurement chamber via a gate valve for in-situ transfer) with a base pressure below 1 × 10−10 mbar. Surfaces were cleaned by cycles of sputtering and UHV annealing that consequently reduced the samples and introduced VOs to the surface. A typical cleaning cycle consisted of sputtering with 1.5 keV Ar+ ions for 10 min with an ion current of 1 μA cm−2, and subsequently annealing the sputtered surfaces in UHV up to 700 °C. Before each measurement, 3–5 cleaning cycles were performed. The over-reduction of the surface was occasionally remedied by annealing the sample to 750 °C in 5 × 10−7 mbar of O2 shower for 10 min. When the reduction level was too high, the rutile TiO2 samples were re-oxidized ex situ at 800 °C in O2 flow and reintroduced to UHV for cleaning.

Figure 1b displays the z-channel of a feedback-controlled unoccupied-states STM image taken at a sample temperature of 14 K. Imaging parameters: sample bias VS = +0.9 V, grounded tip, tunneling current set-point It = 20 pA, oscillation amplitude A = 500 pm.

The contrast in Fig. 1b corresponds to a typical unoccupied-states STM imaging contrast over a reduced rutile TiO2(110) surface, which is dominated by electronic rather than geometric considerations: 1 eV above the Fermi level, the conduction band consists of Ti 3d states and defect VO states, while the O states constitute the valence band76. Therefore, the highest probability of electron tunneling from the tip to the surface is above the Ti5c rows and VOs, which appear bright under these STM conditions. On the other hand, tunneling is less likely above the Obr rows, which appear dark even though they geometrically protrude highest from the surface. Note that in Fig. 1b individual Ti5c atoms can be recognized as spheres forming a row along the [001] direction, while VOs are recognized as isolated, bright spheres.