Modelling atomic and nanoscale structure in the silicon–oxygen system through active machine learning

Silicon–oxygen compounds are among the most important ones in the natural sciences, occurring as building blocks in minerals and being used in semiconductors and catalysis. Beyond the well-known silicon dioxide, there are phases with different stoichiometric composition and nanostructured composites. One of the key challenges in understanding the Si–O system is therefore to accurately account for its nanoscale heterogeneity beyond the length scale of individual atoms. Here we show that a unified computational description of the full Si–O system is indeed possible, based on atomistic machine learning coupled to an active-learning workflow. We showcase applications to very-high-pressure silica, to surfaces and aerogels, and to the structure of amorphous silicon monoxide. In a wider context, our work illustrates how structural complexity in functional materials beyond the atomic and few-nanometre length scales can be captured with active machine learning.


INTRODUCTION
Elemental silicon and its oxide, silica (SiO 2 ), are widely studied building blocks of the world around us: 1 from minerals in geology to silicon-based computing architectures; thin-film solar cells in which amorphous silicon is the active material; 2 or zeolite catalysts based on the SiO 2 parent composition. 3Some of these materials have a single phase and are precisely defined on the atomic scale, whereas others show longer-ranging, hierarchical structures and varying degrees of disorder.For example, silica aerogels contain pores with sizes of 5 to 100 nm, leading to very low thermal conductivity and making aerogels promising candidates for thermal insulation. 4Under pressure, SiO 2 shows amorphousamorphous transitions to structures exceeding sixfold coordination, 5 crystallisation from the amorphous phase under shock compression, 6 and conversely the formation of complex disordered phases from crystalline SiO 2 . 7[10] A material in the binary silicon-oxygen system which is in fact dominated by such interfaces is the so-called silicon monoxide (SiO).4][15] Initial applications of SiO have been in protective layers for mirrors 16 or dielectrics for thin-film capacitors; 17 more recently, the same material has emerged as a promising anode material for lithium-ion batteries. 18,19However, to be able to fully exploit SiO in next-generation energystorage solutions, it would be valuable to understand the features of the nanoscopic structure on an atomistic level.
To develop atomic-scale models of complex materials such as SiO, molecular-dynamics (MD) computer simulations have become a central research tool.7][28][29] Alongside established, empirically fitted potentials based on physical models, alternatives based on large datasets and machine learning (ML) have emerged in recent years.These models have been fitted to silicon 30 as well as silica 31 and also to the more complex silica-water system. 32ML potentials promise the accuracy of first-principles methods such as densityfunctional theory (DFT) for a small fraction of the cost.ML potentials are now firmly established in the field of computational materials science and their application to homogeneous phases has been well documented.
In the present work, we describe a unified computational model for the Si-O system that we have obtained with the help of an active-learning scheme for local environments.We extract representative atomic environments from large-scale simulations and embed them in a melt-quenched amorphous matrix, allowing us to sample representative environments for the fitting of accurate ML potentials.Our final model shows high accuracy across a wide configurational space including highpressure silica, silica surfaces, and mixtures of silica and silicon.We showcase the usefulness of the method by creating fully atomistically resolved, 10-nm-scale structure models of SiO.
FIG. 1.An active-learning workflow for complex atomistic structures.(a) Overview of the procedure to obtain the database.After merging structures from the Si-GAP-18 (Ref.30) and SiO2-GAP-22 (Ref.31) databases, the process was split into three tracks, aiming to describe high-pressure silica, silica surfaces, and mixed Si-O systems with different stoichiometric compositions.In each of the tracks, small-scale MD simulations were used to sample new structures by active learning. 33In the last step, large-scale simulations were performed, where atoms with high uncertainty were recognised by a committee error.(b) The concept of our amorphous matrix embedding approach.First, we extract the wider environment of an atom with high uncertainty.Then we keep the atom of interest, as well as the direct environment fixed, and we melt and anneal the outer environment.As a result, we obtain a small-scale structural model which has the atom of interest and its local environment embedded into an amorphous matrix.This sample can be fed into the training database.

Active learning for nanoscale structure
5][36] We initialised the protocol with two existing datasets for silicon (Bartók  et al., Ref. 30) and silica (Erhard et al.,Ref. 31 ) respectively, and we then gradually explored the relevant configurational space using the active-learning workflow illustrated in Fig. 1.Quantum-mechanical reference ("training") data for energies and forces were obtained with the strongly constrained and appropriately normed (SCAN) 37 exchange-correlation functional for DFT, which shows excellent performance for elemental silicon 38 and the various silica polymorphs. 31ur active-learning workflow follows three main tracks: high-pressure bulk silica, silica surfaces, and nonstoichiometric SiO x systems (Fig. 1a).The individual tracks are kept separate during initial training, i.e., they do not share their newly generated training data; how-ever, in the end, all structures are merged into one comprehensive database.
The single subtracks are further divided into stages.In the first stage, we added initial structures, e.g., for crystalline high-pressure polymorphs or surfaces models.In the next stage, we fitted moment tensor potential (MTP) models 33 to the database and used these MTPs to explore configurational space in MD and to identify new structures by active learning 39 .Energies and forces for new structures were computed with DFT and added to the database.This process was iterated until the extrapolation threshold (Supplementary Material) was not exceed during the MD trajectories anymore.
The third stage, highlighted in red in Fig. 1a, is the most important part of our workflow, and is based on large-scale simulations in each track.We used 2-4 MTPs trained on the same database to estimate a per-atom committee error, as is commonly done for neural-network potentials. 40For atoms with high uncertainty (Supplementary Material), we extracted the environments into smaller, "DFT-sized" cells by an approach that we call amorphous matrix embedding (Fig. 1b).After identifying an atom with high uncertainty, we cut out a cube containing the corresponding environment of the atom.This cube has a size which is feasible for performing DFT computations; it is generally chosen larger than twice the cut-off of the potential.After extracting the cube, the atoms within the cutoff radius of the atom with high uncertainty are kept fixed.The remaining structure is molten in an ML-MD simulation to create an amorphous matrix and smooth boundaries.Details of the procedure can be found in the Supplementary Material.
The final database was obtained by merging the data of all tracks together, including some additional samples such as clusters and vacancies.This database contains 11,428 structures with a total of ≈ 1.3 million atoms (Supplementary Material).For validation, we held out 5% of these structures from training, selected at random.

Performance
The final potential is a complex non-linear ACE model, obtained by summation of one linear and seven non-linear ACE terms (Methods).This approach allows a more flexible description than just a linear or Finnis-Sinclair-like embedding, at only moderately higher computational expense.The resulting potential has a test-set root mean square error (RMSE) of 16.7 meV atom −1 for energies and 306 meV Å−1 for forces.These errors are averaged over the full dataset, however, and so they are not in themselves sufficient to characterise the quality of the potential.For example, they refer to a highly heterogeneous set of structures, with target energy values spanning more than 8 eV atom −1 , and a range of forces of 40 eV Å−1 covered by the database.Furthermore, the numerical accuracy of the potential in certain parts of configurational space (e.g., crystalline polymorphs) is far more important than in others (e.g., liquid and amorphous structures).In Table I, we therefore show the performance of our model on different separate test sets.The complex non-linear ACE is compared to our previous silica GAP model described in Ref. 31 ("SiO 2 -GAP-22" in the following), and also to simpler ACE models fitted to the new database using linear and Finnis-Sinclair-like embeddings, respectively.Indeed, the complex non-linear ACE potential is the only one among the three which achieves comparable errors to SiO 2 -GAP-22 for amorphous and crystalline structures.In contrast, for amorphous elemental silicon, mixed-stoichiometry as well as high-pressure phases the complex non-linear ACE is significantly more accurate than SiO 2 -GAP-22, since these structures are not part of the GAP database.This table therefore indicates the main challenge -and its solution -in the present model compared to the previous GAP: both are highly accurate for crystalline (≈ 1 meV atom −1 ) and bulk amorphous (≈ 5 meV atom −1 ) SiO 2 , but our ACE model caters to a much wider range of scenarious outside of the 1:2 stoichiometric composition.

Phase diagram of SiO2
Figure 2 shows the phase diagram of SiO 2 calculated by thermodynamic integration 45,46 using the ACE potential compared to a CALPHAD phase diagram from literature. 42The ACE and CALPHAD predictions agree well throughout, and for the boundary between quartz and coesite we observe almost quantitative agreement.In contrast, the cristobalite and tridymite phases seem to be over-stabilised.At 0 GPa, the melting point is notably overestimated (about 2,400 K, compared to ≈ 2, 000 K experimentally 42 ); moreover, the phase stability regions of both phases are more extended than in the reference.To illustrate the sensitivity of the analysis to small errors in predicted energies, we added a fictitious energy penalty of 5 meV atom −1 for cristobalite and tridymite (Supplementary Fig. S1a); in this case, the transition lines already agree much better with the CALPHAD reference than before.Further numerical tests showed that the tridymite-cristobalite transition line, in particular, is strongly affected by small shifts in energy (Supplementary Fig. S1b-f).We thus conclude that the quantitative deviation seen in Fig. 2 is due to the inaccuracy of the underlying exchange-correlation functional, rather than indicating a shortcoming of the ACE approach.
High-pressure structural transitions of SiO2 Figure 3 characterises high-pressure properties of silica.In Fig. 3a, we show energy-volume curves of αquartz, coesite, stishovite, α-PbO 2 -type, and pyrite-type silica as predicted by our ACE model and compared with DFT data, with which they agree well.In addition, we tested the behaviour of the model for rosiaite-type silica, which was recently observed in experiment 47 and predicted theoretically 48 for direct compression of α-quartz.
In contrast to the structures mentioned before, this particular polymorph is not part of the training database.Nevertheless, the ACE model reproduces DFT data for this structure similarly well as for the other polymorphs.
Figure 3b shows an enthalpy-pressure diagram at 0 K.For lower pressures, there is a transition from α-quartz to coesite between 2.5 and 3.0 GPa, consistent with the predicted phase diagram (Fig. 2), followed by a transition to stishovite at 5.5-6.0GPa.At higher pressures of ≈ 110 GPa, we observe the transition from stishovite to α-PbO 2 -type silica.Experimentally, rather than stishovite (rutile type), the structurally closely related CaCl 2 (distorted rutile) type polymorph of silica is stable.The transition from CaCl 2 -to α-PbO 2 -type silica was observed at 120 GPa and 2400 K. 49 Given that our enthalpy data correspond to a temperature of 0 K, both values agree well with each other.For the transition of α-PbO 2to pyrite-type silica, our ACE model predicts a pressure of ≈ 246 GPa, in good agreement with the experimentally determined transition pressure of ≈ 260 GPa at 1800 K. 50 Finally, rosiaite-type silica 47 is correctly identified as metastable over the pressure range studied.
Figure 3c shows the pressure evolution of the average coordination number (CN) of silicon in amorphous silica, extracted from an MD simulation at room temperature and under isostatic pressure.The ACE results agree well with experiment up to about 50 GPa, 5,43 and with ab initio MD 44 results over the whole pressure range.The good agreement with experiment is particularly true for the data from Ref. 43.Above 50 GPa, the ACE underestimates the average CN: at 175 GPa the experiment predicts it to be about 7; the ACE simulation predicts it to be 6.Importantly, this does not mean that there are no sevenfold-coordinated environments, but there is a residual number of 5-fold coordinated atoms as well, lowering the average (Fig. 3d).A possible reason for the good agreement with the ab initio result, but the deviation from the experimental results, might be the limited time scales in our simulations, which hinder a complete transition into higher-coordinated environments.Moreover, we note that calculated X-ray-Raman spectra of the ab initio structures from Ref. 44 are in good agreement with experiment indicating a lower CN.Other MD simulations also showed slightly lower CNs than the experimental values. 51Figure 3e shows three different 7fold coordinated environments extracted from the simulations.A CN of 7 in amorphous silica might be surprising since silicon is sixfold-coordinated in all crystalline silica polymorphs that are stable in this pressure range.However, the pyrite-type phase, which becomes thermodynamically stable at ≈ 240-260 GPa contains silicon atoms with a 6+2-fold environment.A recent study found certain, but limited, similarities between these seven-fold environments in the glass and pyrite silica. 51n Supplementary Fig. S2, we show two additional structural fingerprints which have been commonly analysed in experiment: the position of the first sharp diffraction peak and the Si-O bond length.For both cases, our simulations show good agreement with experiment.

SiO2 surfaces and aerogels
Figure 4 tests the ability of the potential to accurately predict surface energies.We begin with validation for different α-quartz surfaces: we created surface slab models, relaxed them with the ACE models, and evaluated the energetics, and therefore the surface energy per area, with DFT single-point computations (Fig. 4a).The ACE results agree well with DFT, especially considering that the training database does not contain all the surface terminations shown.Whilst these surface energies can be computed with DFT, realistic amorphous surface energies are much harder to calculate due to the required system sizes.Therefore, Fig. 4b validates the potential on 125 small-scale surface structures of amorphous SiO 2 , each containing 192 atoms.The surface models are created based on bulk structures from Ref. 31; the latter had been generated in melt-quench simulations with different interatomic potentials and therefore span a range of energies.Regardless of the starting structure, the ACE model captures the surface energy for all surfaces models very well: the total RMSE is about 0.01 eV/ Å2 , and only a slight underestimation compared to DFT is seen.Moreover, there are no clear outliers although the various surface energies indicate a large diversity of the surface structures.
The amorphous surfaces shown are already very complex, but in reality they are often not flat as here.They have curvature, for example when occurring inside pores, and such complex structures can no longer be directly validated with DFT.Fig. 4c therefore shows how well atomic environments in various porous amorphous structures are covered by the dataset.These structures were prepared by straining amorphous structures at elevated temperatures to the desired density.To validate the performance of the potential on this model, we show the linear extrapolation grade according to the maxvol selection. 55,57An extrapolation grade above 1 corresponds to atomic environments that have not been covered in the training database.This does not mean that the potential is no longer reliable, as there is a certain range of more or less reliable extrapolation, but as the extrapolation grade increases, non-physical behaviour and failure of the potential becomes more and more likely. 39or all porous structures, regardless of density, we find that the maximum extrapolation value is less than 1.Thus, we observe no extrapolation in any of the considered cases.This indicates an accurate description of the potential for a variety of curved surfaces.

Elemental silicon
Whilst our ACE model is designed for the binary Si-O system, we show in Table II the performance for diamond-like elemental silicon compared to both DFT and experiment.The bulk modulus of diamond is very well reproduced, whereas the vacancy formation energy is underestimated by about 30 %.The experimental surface energies are well recovered by the ACE model, but this may be partly due to serendipity, because the SCAN ground-truth data show poorer quality (Table II).We also computed the linear thermal expansion coefficient of diamond-type silicon in the quasiharmonic approxi- mation.We found almost perfect agreement between the ACE prediction and experiment; in particular, the unusual negative expansion coefficient below 130 K is reproduced (Supplementary Fig. S3).Finally, melt-quench simulations were performed to generate amorphous silicon structures (Supplementary Fig. S4).The agreement with the experimental structure factor is as good as that for a GAP-18-generated structure from Ref. 58.In addition, we are able to achieve lower quenching rates with the ACE than with the GAP, and for quenching rates as low as 10 10 K/s, we observed crystallisation.
Compared to Si-GAP-18, 30 we observe higher errors with respect to the reference data (Table II).This is not surprising as our database contains two elements and has a strong focus on the configurational space of SiO 2 .However, for properties such as the bulk modulus, the agreement with experiment is better than that of GAP-18.The reason for this is in the underlying SCAN data, which appears to provide a more accurate description of silicon than the GGA functionals used previously.Due to the lower accuracy in reproducing the SCAN data, the potential has some shortcomings for higher pressure polymorphs: the bc8 phase is erroneously predicted to be stable at elevated pressure (Supplementary Fig. S5).However, very-high-pressure silicon phases were not the scope of the present work -instead, the focus in this case is on the accurate description of ambient-pressure silicon environments as a constituent part of mixed binary phases and nanostructures.

SiO and mixed silicon-silica systems
Whilst the results so far have served to demonstrate the usefulness of the appraoch -both in terms of development of efficiently generated datasets and the fitting within the ACE framework -we are now in a position to study an actual application problem.To this end, Fig. 5a shows structural models of SiO.Experimentally, amorphous SiO is obtained by deposition of SiO from the gas phase. 62In contrast, we created our models by melt-quench simulations.SiO phases are known to be  metastable with respect to Si and SiO 2 .For example, a recent DFT-based crystal-structure prediction study explored possible ordered phases of homogeneous SiO, and found for a range of structures that these are metastable compared to the crystalline mixture of Si and SiO 2 . 63e verified that our ACE potential similarly reproduces the metastability of the ambient pressure phases (Supplementary Fig. S6).In good agreement with these results, our melt-quenched structures show a clear segregation between amorphous silicon (blue) and amorphous silica (red).With decreasing quench rate, the number of silicon grains decreases while the grain size increases.5c shows the ratio between the volume of the silicon grains divided by the interface area; details are given in the Methods section.In the approximation of spherical particles, the grain diameter is d = 6 • V Si,grains /A interface .From this we can estimate average grain diameters between 24 Å and 54 Å for the tested quench rates.These grain diameters agree very well with transmission electron microscopy measurements, which indicated diameters of 30 to 40 Å. 14 Figure 5d shows the excess energies of the structures referenced to to α-quartz and diamond-type silicon.The SiO structures were relaxed by optimisation of the cell size as well as the atom positions at 0 K.As experimental reference, we show the standard enthalpy of formation of SiO. 56The structures generated by quench rates of 5×10 12 and 2×10 12 K/s have energies comparable to experiment.Indeed, we can even create structures that are energetically more favourable than in experiment, noting again that our procedure to produce the structures deviates significantly from the experimental one.
But is this really an improvement compared to existing, empirically fitted interatomic potentials?Indeed, there are already two potentials implemented in LAMMPS 65 which are able to perform simulations of mixed systems: the Munetoh potential 53 and a charge-optimised many-body (COMB) potential. 29For both potentials, we tested the same procedure to generate structural models of SiO.The Munetoh potential yielded a homogeneous structure without observable segregation into silicon and SiO 2 , and the resulting structure factor (Supplementary Fig. S7) deviates strongly from experiment.For the COMB potential, we observed pore formation at elevated temperatures, finally resulting in a strongly increased simulation-cell size.Therefore, we only equilibrated our best-matching structure at room temperature and analysed the change in structure factor (Supplementary Fig. S6): again, we observed a strong deviation from experiment, indicating that the structure is very different from the ACE model prediction.

DISCUSSION
Understanding the microscopic nature of interfaces and nanostructured matter is essential to advancing materials research.Here, we have presented an active-learning scheme that we term "amorphous matrix embedding" that can realistically represent environments from largescale simulations in DFT-accessible cells, enabling fast and accurate atomistic modelling of heterogeneous materials.We used the approach to develop a general-purpose interatomic potential for binary Si-O phases with varied compositions that is able to describe the trifecta of modelling challenges in this material system: very high pressure phases (relevant to geology), surfaces (relevant to catalysis), and mixed stoichiometric compositions with nanoscale heterogeneity (relevant to battery systems).
Using the ACE approach, we observe a speed-up of about two orders of magnitude compared to the more established GAP framework.This makes it possible to access long time scales and large length scales with DFTlike accuracy.Of course, there are some shortcomings of this potential, e.g., the lower accuracy for pure silicon compared to the state of the art -but this use-case is not the focus of our work, as there are already competitive GAP and ACE models available. 30,35In our case, they quality of the underlying meta-GGA data might cause an out-performance compared to earlier ML-potentials fitted with more economical GGA labels.
We hope that our work, and the dataset and resources developed therein, will advance the modelling of porous silica nanostructures as well as of high-pressure silica.For the Si-SiO 2 interface, alternative interatomic potential models are scarce and the higher quality potentials come with an expensive charge-equilibration term.Our tests showed that the ACE describes silicon monoxide in much closer agreement with experiment than existing empirical models.
We view the present database and ML potential model as a starting point for wider-ranging studies in this important material system.In the future, higher accuracy for the mixed system might be achieved by using chargeequilibration schemes coupled with ML potentials. 66owever, this would come with much longer computing times as well as worse scaling for larger systems.Moreover, in the future, we will include lithium in the potential to investigate the battery performance of SiO on the atomistic scale.

METHODS
Machine-learning potential fitting.We used two frameworks for fitting ML potential models.While constructing the reference database, we used Moment Tensor Potentials 33 with active learning 57 as implemented in the MLIP package. 39For the final potential fit, we used the nonlinear Atomic Cluster Expansion (ACE) 34,35 as implemented in PACEMAKER. 36For ACE, we tested a range of combinations of embeddings, and found the following to be suitable: with ϕi being atomic properties, which are expanded by the ACE basis functions (for details see Ref. 34 ).The exponents of the embeddings include fractional exponents and higher integer powers of fi ∈ {1/8, 1/4, 3/8, 3/4, 7/8, 2}.We found that especially fractions between 0 and 1 improved the behaviour of the potential.This approach goes beyond the previously suggested linear embedding (only the first term) and Finnis-Sinclair (the first two terms) type embedding, 35 and is referred to as "complex" embedding in Table I.For the expansion of the atomic properties ϕi we used 600 basis functions with 5700 parameters.As radial basis we employed Bessel functions.A κ value of 0.01, which gives the ratio between force and energy weights value, was used during fitting.For optimisation we used the BFGS algorithm for 2000 steps.
DFT computations.All DFT computations were performed using VASP 67,68 and the projector augmented-wave method. 69,70For calculations we used the SCAN functional 37 with an energy cutoff of 900 eV and a k-spacing of 0.23 Å−1 .Surface calculations were performed with dipole corrections.We note that these convergence parameters are optimised for silica; however, we found them to be also well converged for mixed phases and for pure silicon structures.Only for veryhigh-pressure silicon allotropes, a higher k-spacing would provide a relevant advantage; however, since these are not in the scope of the present work, we neglect these inaccuracies.
MD workflows.Simulation protocols were implemented using the atomic simulation environment (ASE) 71 and the OVITO Python interface. 72While optimisation and smallcell MD were partially performed with ASE, large-scale MD and statics simulations were carried out using LAMMPS. 65he time step was 1 fs.For NVT simulations, we used a Nosé-Hoover thermostat with temperature damping constant of 100 fs; for NPT simulations, we added a Nosé-Hoover barostat with a pressure damping constant of 1,000 fs.
Amorphous structural models as starting point for compression simulations were created by the melt-quench procedure described in Ref.31, now using the ACE potential.The compression was performed under isostatic conditions.In each step, the pressure was initially increased by 1 GPa within 2.5 ps of simulation time, followed by equlibration over 2.5 ps at the new pressure.This procedure was iteratively repeated.Coordination numbers were determined after equilibration.
The aerogel structures were created by a similar protocol as in Ref. 31.An initial structure was randomized at 6,000 K for 10 ps, instantly cooled to 4,000 K and kept there for 100 ps.From this temperature, the liquid was cooled to 300 K with a quench rate of 10 13 K s −1 .During the equilibration at 4,000 K and up to half of the quenching process, the cells were additionally extended to the desired density.
The mixed structures were created using the same protocol as in Ref. 31 for producing amorphous structures.The volume of the silicon grains was determined within OVITO by deleting all silicon atoms and usage of the ConstructSurfaceMesh modifier on the remaining oxygen atoms.The corresponding interface area was extracted in the same way.
Phase diagram calculations.Thermodynamic integration was carried out as implemented in calphy. 45,46We used 50,000 equilibration steps, 800,000 switching steps for the switching to the Einstein crystal, as well as 300 steps/K for the thermodynamic integration to calculate the temperature dependence.Due to numerical issues, we fixed the spring constants of the Einstein crystal to 2 eV/ Å2 for oxygen and 4 eV/ Å2 for silicon.We carefully checked the influence of this constraint on the final results and found it to be negligible.
Structure factors.Faber-Ziman structure factors were obtained by summation of the Fourier transformations of the partial radial distribution functions calculated with OVITO.The corresponding partial structure factors were weighted by atomic form factors taken from Ref. 73.For the high-pressure structures, we used a cut-off radius of 20 Å for the radial distribution function, and analysed a single snapshot (without time averaging).For the SiO structures, we used a cut-off of 80 Å and an average over 10 snapshots.

DATA AVAILABILITY
The potential parameter files, the reference data with SCAN labels, and additional supporting data will be provided openly through Zenodo upon journal publication.
The Supplementary Material is split into two parts.In the first part we present figures referenced in the main article.In the second part methodological details about the database generation, the active learning and calculation of various properties are given.

I. DATABASE
The composition of the database (DB) and the weights used for the fit are summarised in Supplementary Tab.S1.The corresponding energy and force range can be seen in the scatter plot in Supplementary Figure S8.The DB contains a large fraction of structures used for training the SiO 2 GAP-22 13 (27%) as well as the Si GAP-18 14 (22%).Structures added in this work represent highpressure SiO 2 phases (both crystalline and amorphous), surfaces and charge-neutral vacancies, as well as Si/SiO 2 interfaces, quenched structures of Si-SiO 2 mixtures and clusters of [SiO] x .These additional structures were partially designed on purpose and partially obtained by active learning (AL), either using small-scale models and the AL technique implemented in the MLIP package 15 or using commitee voting coupled to our amorphous matrix embedding, see Supplementary Sec.I C 1.In both techniques we sample configuration space by molecular dynamics simulatons (MD).
In the following details concerning the structure generation and related simulation parameters will be given.

A. Manually prepared data
Most of the manually prepared data is based on crystal structures or other input structures, which have been rattled and deformed.If not stated differently, these structures have been prepare by applying variations of +/-2.5 % to the lattice parameters a,b and c and additionally by applying variations of +/-5 % to the angles α, β and γ.Afterwards the atoms have been displaced using the ASE rattle function with a standard deviation of 0.01 Å. 16

fcc and hcp silicon
We added fcc and hcp structure of silicon under highcompression to the database to improve the repulsive behaviour of the potential at short distances.

High pressure silica
We included simple unit cells, as well as, 2x2x2 supercells of pyrite and α-PbO 2 -type silica in our database.

Silica surfaces
Table S2 shows a list of surfaces, which we included into our database.Each of these surfaces has been included 30 times in the database, 10 times by rattling the structures with an average displacement of 0.01 Å, 10 time with an average displacement of 0.05 Å and 10 times with 0.1 Å. S2.Crystalline structures with surface orientations, which are included in the database.

Silica vacancies
Vacancy structures are created by removing randomly atoms according to one formula unit of SiO 2 from αquartz and amorphous structure models.

Crystalline-amorphous Si-SiO2 interfaces
Crystalline-amorphous interfaces between silicon and silica have been created by merging two crystalline bulk supercells together.One of these supercells was kept fixed, while the other was melted.A repulsive wall was used to prevent atoms from the molten phase to diffuse into the crystalline phase.

Clusters
[SiO]x clusters have been added in this work.These have been generated by iteratively performing MD simulations-.We started by an [SiO] 2 cluster and added five snapshots from five MD simulations (so in total 25) to the database.Within these MD simulations the temperature was increased from 100 to 2000 K.After adding the structures to the database, we refitted the potential and repeated the simulations with clusters containing one additional formula unit of SiO resulting in a [SiO] i+1 cluster.We repeated the process up to a system size of [SiO] 32 .
Supplementary Table S1.Composition of the total database subdivided in various structure types with various compositions.We also show, which parts of the database are taken from references and which parts are new.For the structures, which are collected by active learning, we indicate which active learning type was used.We also give the fitting weights for each of the structure types.In the small-scale active learning we used tools implemented in the MLIP package. 15We performed MD simulations and recognised new structure using the maxvol algorithm as implemented in that code.We used an extrapolation threshold of 1.5 and a stopping threshold of 3.0.This active learning process was repeated iteratively.We performed around 100 simulations in parallel, collected newly found structures and reduced them to the most relevant by the maxvol algorithm.Then we calculated corresponding forces and energies by DFT, refitted the potential and repeated the MD simulations.
In the following we will briefly give the simulation details of the performed MD simulations.We performed quenching simulations under various conditions.All quenching simulations have in common, that they start with a randomization phase for 10 to 100 ps at 6000 K and under NVT conditions.This is followed by an equilibration at 4000 K under NPT conditions for 100 ps and subsequent quenching to room temperature with various rates (10 12 to 10 14 K/s).We performed quenches with the SiO 2 compositions and mixed SiO x compositions under ambient pressure using the quenching procedure explained above (external pressure of 0 GPa).For the SiO 2 simulations we used various crystalline unit-and supercells as input.For the mixed simulations we stacked together various SiO 2 and Si crystalline cells to achieve different compositions.

High-pressure amorphous silica
Similar to the ambient pressure simulations, we also performed quenching simulations for the high-pressure phases.However, this time we used pressure between 0-200 GPa in the NPT part.After convergences of the quenching simulations we performed compression simulations using amorphous input structures under NPT conditions starting from a pressure of 0 GPa going to a pressure of 200 GPa.The temperature was held constant at a random value between 0 and 1000 K. Supplementary Figure S8.Scatter plot for the forces (bottom) and energies (top) of the training and testing set.The total dataset was randomly divided into the training (95%) and the testing set (5%).On top of the scatter plots we show the distribution of data points.

SiO2 surfaces
We used the surface structures, which have been created manually, as input for MD simulations.In the MD simulations we annealed the surfaces under NVT conditions from 50 K to 3000 K and cooled them back down to 50 K afterwards.

C. Large-scale active learning
In the large-scale MD simulations we used a committee-error to examine the uncertainty of each atom.Based on the committee-error we decide whether a structure is needed to be added to the database.In the case that we want to add the structure we extract a small-scale model by amorphous matrix embedding.In this section we will briefly explain the amorphous matrix embedding method and then give the details for the MD simulations.

Amorphous matrix embedding
Amorphous matrix embedding is used to enable active learning (AL) on large-scale systems.For DFT calculations, systems sizes are essentially restricted to a few hundreds of atoms.Therefore conventional AL is performed using models of this size.During large-scale simulations, however, local atomic environments may partially deviate from environments encountered in small-scale simulations.It is therefore desirable to include environments that are likely to appear only in large-scale simulations.Amorphous matrix embedding is a technique that allows us to do so.
The basic idea of the amorphous matrix embedding technique is to locally identify unknown environments in a large-scale simulation, extract this environment including its surrounding (a total of 200 -400 atoms), freeze the atoms in the relevant environment and amorphize the surrounding using periodic boundary conditions, see Fig. 1b in the research paper.As measure for the uncertainty of an environment, we use committee voting of several moment-tensor potentials (MTPs) with respect to forces on a particular atoms.In particular, we define the uncertainty u α (1) Here is the i-th component of the force acting on atom α as obtained by one of the committee members (potentials) for particular configuration along a trajectory; STD denotes the standard deviation.
Based on this, the amorphous matrix embedding is performed now according to the following procedure.First, we loop over all atoms and find atoms with an uncertainty above a certain threshold (1-2 eV/ Å).In principle, this indicates that the local environment of the atom is unknown to the potential.However, if the uncertainty is very high, the structure might be very unfavourable and contain atoms with very high forces.This would worsen the DFT convergence and afterwards the potential fit.Therefore, we introduce an upper threshold (5 eV/ Å), which is not limited to the atom itself, but also to atoms in the direct environment.In the case that one of the atoms exceeds that threshold, we do not use this environment.Another issue is that similar structures might be picked in subsequent steps of the MD trajectory.We use smooth overlap of atomic positions similarities as implemented in DScribe 17 to add only structures, which are not similar to previously selected atoms (similarity<0.9-0.95).
Finally, for atomic environments that fullfill all the requirements, we create a new box with a box size of 13 Å plus an additional margin of 1 Å.The reason for this box size is that it is larger than twice the cutoff while still being DFT feasible.We cut out a cube with a side length of 13 Å around the atom of interest and fill this cube into the newly created box.Depending on the scenario, the composition of the box is adjusted by deleting atoms outside of the region of interest.To fix the artificial boundaries we annealed the atoms outside of the cutoff up to 2000 K to 6000 K while keeping the inner atoms fixed.Afterwards, the structures were quenched to 300 K. To prevent atoms from outside the cutoff to enter the cutoff area, we applied a repulsive potential centred around the atom of interest.
For surfaces further special adjustments had to be made.If the atom was close to the surface, we estimated the normal vector of the surface by, Here x j is the position of atom j.The box, which needed to be extracted, was then rotated in a way that the normal vector was oriented along the z-direction.Additional vacuum layer of 5 Å along each direction have been placed into the direction of the normal vector.After the molecular dynamics simulations some of the surfaces structures have not been periodically connected into the x and y directions.These cases were handled instead of slab structures as cluster structures and additional vacuum into the x-and y-direction was added, to be able to use the dipole correction in VASP.
2. Quenching under ambient pressure (SiO2 and SiOx) We used the same protocol for quenches as mentioned in the small-scale active learning part.As input structure for the SiO 2 simulations we used β-cristobalite supercells, while for the SiO x simulations we used supercells of the cell used for the small-scale simulations.For the latter one an exemplary structure is shown in Supplementary Fig. S9.Supplementary Figure S9.Initial structure for the largescale SiOx interpolation models.The structure is a supercell of α-cristobalite structures merged with silicon structures of diamond-type.

Silica surfaces
To perform the large-scale active learning, we used the same standard quenching protocol as mentioned above.However, the NPT part was replaced by a part, where the initially liquid structure was constantly strained to larger volumes inducing pores into the structure.

Vacancies
As input structures we used quartz supercells and amorphous structures containing around 65,000 atoms including 50 formula units of randomly selected vacancies.These structures have been annealed from room temperature to 3000 K and back to room temperature.

Compression to very-high-pressures
We used amorphous models containing 65,000 atoms, which we compressed in MD simulations up to 200 GPa at a constant randomly selected temperature between 0 and 1000 K.

Clusters
We used a box containing 10,000 SiO molecules in gaseous state with a density of 0.011 g/cm 3 as input structure.This box was compressed to a density of ≈2 g/cm 3 , while keeping the temperature constant at 1400 K.

II. CALCULATION OF PHYSICAL PROPERTIES A. Phase diagram calculation
For the determination of the phase diagram we used calphy 18 , which implements thermodynamic integration using reversible scaling 19 and nonequilibrium calculations of free energy differences.The basic idea is that we are switching the Hamiltonian of our system continuously to the Hamiltonian of a references system with known free energy.Calphy uses either the Einstein crystal for crystalline structures or the Uhlenbeck-Ford model 20 for liquid systems.Finally, we receive an absolute Gibbs free energy for a certain temperature and pressure.
This Gibbs free energy is used as a reference energy, from which we start to perform thermodynamic integration using reversible scaling.By this we are able to find the temperature dependence of the Gibbs free energy for a given pressure or although not used here also the pressure dependence for a given temperature.Details of the theoretical background can be found in the corresponding publication. 18e used input structures with a cell size of around ≈ 15,000 atoms.For the calculations we used 50,000 equilibration steps and 800,000 switching steps for switching between our systems and the corresponding reference systems.For the thermodynamic integration we used 300 steps/K.The force constants for the Einstein crystal were set to 2 eV/ Å2 for oxygen and 4 eV/ Å2 for silicon, to avoid numerical instabilities.We carefully checked the dependence of the final results on the force constants and can exclude notable impact (∆G <0.01 meV/atom).
In our work, we sampled the free energies in a temperature range between 600 K and ∼2600-3000 K for various polymorphs.An example is shown in Supplementary Fig. S10a.Here, we show the Gibbs free energies for four different structure types of silica at a pressure of 0 GPa.For cristobalite we started integration at 1500 K up to shortly above the melting point.The tridymite interval was split into two integrations intervals to achieve higher accuracy.One integration was performed from 1500 K to 600 K and the other integration was performed from 1500 K to 2600 K.The melt was integrated down from 4000 K to 1500 K. Unfortunately, we cannot integrate quartz in the same way since α-quartz is dynamically switching to β-quartz in the simulations.Therefore, the α-quartz/β-quartz transition was determined using classical molecular dynamics simulations.This is illustrated in Supplementary Fig. S11a-b.α-quartz as well as βquartz input structures have been equilibrated in molecular dynamics simulations at various temperatures for 100 ps.Already around 30 K away from the transition points the final densities agree very well.The transition temperature was determined by finding the maximum slope of the following approximating function, .
(3) The parameters a, b, c, d, e, T 1 and T 2 are fitted to the results from the molecular dynamics simulations.This was repeated for several pressures.Corresponding transition temperatures in dependence of the pressure are shown in Supplementary Fig. S11c.A line is shown, which was fitted to the average transition temperatures.This line was also used in the phase diagram plots in Figure 2. Based on this transition line we performed thermodynamic integration for α-quartz from 600 K to 50 K below the α-β-quartz transition line, while for βquartz we started 50 K above this line.Therefore, we do not have data for quartz in a range of 100 K around the transition line.
This whole process was repeated for several pressures as it is indicated in Supplementary Fig. S10b for αquartz.Corresponding to the pressure the temperature integration interval was adjusted for each polymorph.Finally, to calculate the phase diagram, we fitted our data points to a polynomial of third order to achieve a equation for the Gibbs free energy dependence of the pressure.This was done for temperatures between 600 and 2900 K in 1 K steps.Exemplary, this is shown in Supplementary Fig. S10 for α-quartz for temperatures of 650, 1000, 1500 K.
Supplementary Figure S1 emphasises the difficulties of achieving accurate agreement between calculated and measured phase diagrams.We show how decreasing the stability of certain phases change the phase diagram compared to the original one.The original phase diagram is shown as thin red line in the background, while the modified one is shown by the black lines.For quartz, coesite and the melt we added 2 meV/atom to the free energy since the coexistence lines are less sensitive to a shift of the energy compared to cristobalite and tridymite.These are shown in Supplementary Fig. S1d-f.For cristobalite and tridymite we added only 0.1 meV/atom.The corre-sponding change of the transition lines is shown in Supplementary Fig. S1b-c.Additionally, in Supplementary Fig. S1a we show how the phase diagram changes, when we add 5 meV/atom to cristobalite and tridymite.The additional blue line in the background shows the CAL-PHAD reference data from 1 .There are several points we can conclude from this.First, we can conclude that there is a high uncertainty in the determination of the cristobalite-tridymite transition line.Especially, with respect to the error of potential to the corresponding DFT data (≈1 meV/atom).Second, it shows how essential the choice of the DFT exchange-correlation functional is, since depending on the functional this phase diagram would look totally different.The reason for this is that the difference between two exchange-correlation functionals can be more than 100 meV/atom for silica. 13Thirdly, we can see that also the SCAN exchange-correlation functional is not perfect.Destabilising cristobalite and tridymite would improve the match with the experimental phase diagram significantly.Therefore, it seems like that the SCAN energies for these polymorphs are predicted slightly too low.

B. Surface Energies
Table II and Fig. 4 both in the main manuscript, show surfaces energies of silicon and silica.We consider only stoichiometric slabs.These surface energies γ have been calculated by the following formula, where A is the total surface area, N is the number of particles in the slab, E ref is the bulk reference energy and E slab is the potential energy of the slab.The reference energy of the α-quartz surface is the energy of the minimised α-quartz unit cell per atom, correspondingly the reference energy of the diamond surface is the minimised diamond unit cell.The reference energy of the amorphous sample is the bulk energy of the same relaxed amorphous structure without surfaces.
C. Enthalpy Fig. 3 in the main manuscript shows the enthalpy for several phases.The enthalpy H is given by, where E is the inner energy, p is the pressure and V is the volume.The volume dependence of the energy has been determined by a Birch-Murnaghan fit to the energyvolume curve of each polymorph.p(V ) was given by the corresponding derivative.The energy-volume curves have been calculated by variing the volume by ±20% for α-quartz and coesite, by ±25% for stishovite and ±30% for all other phases.The corresponding structures have been structurally minimised allowing changes of the positions as well of the box shape, however, keeping the volume fixed.Coordination numbers have been determined by the integral over the first peak of the partial Si-O radial distribution function.The Si-O bond distances is given by the first peak position of the partial Si-O radial-distribution function.
Supplementary Table S3.Comparison of the accuracy of the GAP 13 with various ACE models for the training and testing set, as well for a range of separate smaller datasets.The errors given in the table are root mean square errors for energies and forces.The unit of the energy error is meV/atom and of the force error meV/ Å.

FIG. 2 .
FIG. 2. Temperature-pressure phase diagram of SiO2.(a) Phase diagram calculated based on experimental data, adapted from the literature (Ref.42).(b) The same phase diagram calculated based on predictions from our Si-O ACE model and thermodynamic integration.

FIG. 3 .
FIG. 3. Silica at megabar pressures.(a) Energy-volume curves for high-p silica polymorphs with energies referenced to α-quartz.Markers indicate SCAN DFT results; lines indicate the ACE prediction.(b) Enthalpy differences for various silica polymorphs referenced to stishovite.(c) Compression of a vitreous silica structure.Results for the average silicon coordination number are compared to experimental measurements from Refs. 5 ("Exp.1") and 43 ("Exp.2") and ab initio MD simulations from Ref. 44.The distribution of coordination numbers at selected pressures is indicated by violin points.(d) Snapshots of the compression simulation showing coordination polyhedra for different coordination numbers (only).(e) Visualisation of the coordination environments of selected 7-fold coordinated silicon atoms.

FIG. 4 .
FIG. 4. Surfaces and aerogels.(a) Surface energies of various α-quartz surfaces in relaxed and unrelaxed state.The black lines are the ACE results while the red lines are the DFT single point results for the relaxed/unrelaxed structures.The structure pictures are showing the unrelaxed surfaces.(b) Surface energies of amorphous models calculated with ACE and DFT.Amorphous structures were taken from Ref. 31, and have been created by various interatomic potentials: the BKS, 52 CHIK, 41 SiO2-GAP-22, 31 Munetoh, 53 and Vashishta 54 models.The structures were relaxed with ACE and sliced at various points, followed by another ACE relaxation.DFT surface energies were determined without further relaxation.(c) Exemplary porous amorphous silica structures with various densities.Additionally, we show the distribution of the linear extrapolation grade 55 and the maximum extrapolation grade (red line) for each structure.For all structures, the maximum extrapolation grade of the atoms is below one.

FIG. 5 .
FIG. 5.Nanoscale segregation in amorphous silicon monoxide.(a) Visualisation of SiO structures generated by quenching from the melt at rates between 10 13 and 10 12 K s −1 .Colour-coding is based on the nearest-neighbour Si-O coordination numbers, which are four in SiO2 and zero in elemental silicon.Accordingly, SiO2-like and Si-like regions are indicated in red and blue, respectively.(b) Structure factor for the 5 × 10 12 K s −1 simulation.(c) Relation between grain volume of the silicon grains and the interface area between silicon and silica.With increasing quench rate, the grain size of the structures decreases.(d) Energy of the SiO structures referenced to α-quartz and to diamond-type silicon, compared to the experimental standard enthalpy of formation for SiO.56

56
FIG. 5.Nanoscale segregation in amorphous silicon monoxide.(a) Visualisation of SiO structures generated by quenching from the melt at rates between 10 13 and 10 12 K s −1 .Colour-coding is based on the nearest-neighbour Si-O coordination numbers, which are four in SiO2 and zero in elemental silicon.Accordingly, SiO2-like and Si-like regions are indicated in red and blue, respectively.(b) Structure factor for the 5 × 10 12 K s −1 simulation.(c) Relation between grain volume of the silicon grains and the interface area between silicon and silica.With increasing quench rate, the grain size of the structures decreases.(d) Energy of the SiO structures referenced to α-quartz and to diamond-type silicon, compared to the experimental standard enthalpy of formation for SiO.56 Figure 5b shows the structure factor, S(Q), determined at 300 K for the structure quenched with 5 × 10 12 K/s.The S(Q) data for the other structures are shown in Supplementary Fig. S7.The structure factor of the structures generated by a quenching rate of 5×10 12 K/s agrees best with the experimental structure factor from Ref. 64. Figure

TABLE I .
41 model performance.We report energy root mean square error (RMSE) values in meV atom −1 on different test sets.We characterise three ACE models, fitted to the same dataset but with increasing model complexity (Methods).'a'indicatesamorphous structures.The CHIK41and GAP 31 generated structures are taken from Ref. 31.
a Structural models generated using ACE-MD.

TABLE II .
Properties of diamond-type silicon.We show SCAN DFT values for reference, as well those obtained with the complex ACE potential; both computations are compared to the GAP-18 model, the corresponding reference data (PW91) and to experimental data ("Expt.").

Table S4 .
Timing and speedup of various ACE potentials compared to the GAP from Ref.13.Timings have been obtained for 192 atoms cells over 100 time steps.