Field-free spin–orbit torque perpendicular magnetization switching in ultrathin nanostructures

Magnetic-field-free current-controlled switching of perpendicular magnetization via spin–orbit torque (SOT) is necessary for developing a fast, long data retention, and high-density SOT magnetoresistive random access memory (MRAM). Here, we use both micromagnetic simulations and atomistic spin dynamics (ASD) simulations to demonstrate an approach to field-free SOT perpendicular magnetization switching without requiring any changes in the architecture of a standard SOT-MRAM cell. We show that this field-free switching is enabled by a synergistic effect of lateral geometrical confinement, interfacial Dyzaloshinskii–Moriya interaction (DMI), and current-induced SOT. Both micromagnetic and atomistic understanding of the nucleation and growth kinetics of the reversed domain are established. Notably, atomically resolved spin dynamics at the early stage of nucleation is revealed using ASD simulations. A machine learning model is trained based on ~1000 groups of benchmarked micromagnetic simulation data. This machine learning model can be used to rapidly and accurately identify the nanomagnet size, interfacial DMI strength, and the magnitude of current density required for the field-free switching.


INTRODUCTION
Magnetization switching through spin-orbit torque (SOT) is of great recent interests due to its potential applications in SOT magnetoresistive random access memory (SOT-MRAM), which is expected to have faster write speed, lower write energy, and higher endurance than the currently used spin-transfer torque MRAM 1,2 . An SOT-MRAM cell integrates a magnetic tunnel junction (MTJ) on top of a heavy-metal layer. During writing, an in-plane charge current flowing into the heavy-metal layer is converted to a perpendicular spin current via the spin Hall effect (SHE) [3][4][5] , as schematically shown in Fig. 1a. The spin current then flows into the overlaying magnetic layer and switch its magnetization via the SOT. SOT has been used to switch both the perpendicular 6 and in-plane 7 magnetization, but perpendicular switching is more attractive for SOT-MRAM applications because it is faster 8 and more scalable 9 . However, SOT-mediated perpendicular magnetization switching typically requires a simultaneous application of an in-plane bias magnetic field 6,7 . This inplane magnetic field is not desirable for practical SOT-MRAM application, because it reduces the thermal stability of perpendicular magnetization and because it may cause crosstalk among neighboring SOT-MRAM cells. . Thus far, many approaches have been proposed for achieving a magnetic-field-free SOT-mediated perpendicular magnetization switching. An early proposed approach is to fabricate asymmetric multilayer stack, such as engineering a thickness gradient into the magnetic layer 10 or its overlaying oxide 9 or heavy-metal layer underneath 11 . Such structural asymmetry generates a unidirectional effective perpendicular magnetic field, leading to a deterministic switching. Approaches proposed thereafter aim to introduce a built-in in-plane bias magnetic field, generated by an in-plane magnetized reference layer in the MTJ 12,13 or an antiferromagnetic heavy-metal with in-plane magnetized sublattices 14 . Other approaches include (1) adding a ferroelectric layer below the heavy-metal to harness the effect of electric-fieldswitchable spontaneous polarization 15 or piezoelectric strain 16 ; (2) adding another heavy-metal layer to generate competing spin currents 17 ; (3) applying an out-of-plane charge current from the top terminal of the MTJ to generate an assisting STT 18 ; and (4) engineering the geometry of the ferromagnetic layer 19,20 . Most recently, micromagnetic simulations 21,22 have suggested that a magnetic-field-free SOT perpendicular magnetization switching can also be achieved by simultaneously controlling the lateral size of the magnetic layer, the strength of the Dyzaloshinskii-Moriya interaction (DMI) arising from the ferromagnet/heavy-metal interface, and the magnitude of applied charge current density within an intermediate range. Compared with earlier proposed approaches, this approach does not require any changes in the architecture of a standard perpendicular SOT-MRAM cell, which is important to practical applications.
In this article, we use both micromagnetic and atomistic spin dynamics (ASD) simulations (see "Methods") to study the nucleation of the reversed magnetic domain and its subsequent growth during the two-step switching. Notably, the ASD model enables predicting the evolution of atomistic spin moment of each atom based on the actual lattice structure of the material, which leads to a generally more accurate description of the nucleation process, especially at its early stage when embryos (clusters of several atoms) are too small to be accurately described by micromagnetic model. We show that the lateral geometrical confinement, interfacial DMI, and current-induced SOT synergistically lead to a deterministic nucleation and growth process. Accordingly, it is found that the field-free SOT perpendicular switching occurs when the lateral size of the nanomagnet, the interfacial DMI strength, and the magnitude of applied current are all within an intermediate range. Furthermore, a decision-treebased machine learning model is trained using data from about 1000 groups of benchmarked micromagnetic simulations, which is then used to rapidly and accurately identify the regime of successful two-step switching from a ternary parameter space comprising the lateral size of magnetic layer, interfacial DMI strength, and magnitude of charge current density.

RESULTS
Field-free SOT perpendicular switching in a standard SOT-MRAM cell Consider a 40 nm diameter MgO/Co 20 Fe 60 B 20 (1.1) bilayer nanodisks on top of a Pt layer as an example. Figure 1b shows the evolution of the average perpendicular magnetization <m z > of the 1.1-nm-thick Co 20 Fe 60 B 20 (CoFeB) nanodisk under a timeinvariant current J c , simulated using both the μ-Pro and MuMax3 software (see "Methods"). The interfacial DMI strength D is set to be 0.45 mJ/m 2 based on experimental measurement 23 . When J c is relatively small (top panel of Fig. 1b), the SOT is insufficient to rotate the initially upward magnetization significantly. When J c is large (middle panel), the SOT is large enough to switch the magnetization by 90°to the direction of net spin polarization σ (which is along the +x-axis, see Fig. 1a). This is consistent with the conventional belief that the SOT alone typically cannot reverse the polarity of an out-of-plane magnetization due to the inability to break mirror symmetry 9 . However, an unconventional SOT switching appears under a moderate magnitude of current, showing a reversal of the out-of-plane magnetization component. As shown in the bottom panel of Fig.  1b, <m z > equilibrates at a negative value (approximately −0.17) under J c = 1.4 × 10 13 A/m 2 . This indicates that a majority portion of local magnetization vectors were flipped from up to down, and thereby provides a basis for realizing a two-step switching under a pulsed charge current. Specifically, if turning off the current when <m z > equilibrates at 3.5 ns after turning on the current (see Fig.  1c) or evolves to its lowest negative value (at 0.3 ns, see Fig. 1d), the magnetization state will deterministically relax to its new equilibrium state with a downward magnetization (<m z > approximately −1).
Deterministic > 90°perpendicular switching under a time-invariant current Thus, flipping a major portion (>50%) of local magnetization vectors from up to down (and vice versa) in a deterministic manner using SOT is the basis for the two-step switching, and is called ">90°switching" hereafter for convenience. Figure 2a-d shows the local magnetization distributions at several key time stages, including the initially upward magnetization with tilted vectors along the disk edges (0 ns), where the tilting occurs due to the co-action of interfacial DMI and lateral geometrical confinement (ref. 24 , also see "Methods"); the nucleation of downward domain (0.07 ns); the critical state at which <m z >~0 (0.18 ns), and the equilibrium state (0.71 ns), all under the time-invariant J c of 1.4 × 10 13 A/m 2 . Starting from the initially quasiuniform state where inwards tilting edge magnetization, the domain nucleation always occurs in the bottom left region (the third quadrant) of the disk under a J c flowing along the −y direction. This is because only in this region, the out-of-plane SOT τ SOT z and the out-of-plane SHE effective field H SOT z are both negative (as shown in Fig. 2f and explained in "Methods"). Such a nucleation mechanism is similar to that in a conventional SOT perpendicular magnetization reversal in the presence of in-plane bias magnetic field [25][26][27] . However, the growth kinetics of the downward domain we present herein under zero magnetic field (Fig. 2b-d) is different. Specifically, the curvature of the domain wall changes from an initially negative value (Fig. 2b) to almost zero when <m z >~0 (Fig.  2c), and then to a positive value (Fig. 2d). Figure 2e shows the trajectories of the domain wall motion during the entire process of >90°switching. It can be seen that the growth of downward domain is more favorable in the second and third quadrants of the disk, which is due to the fact that the out-of-plane SOT τ SOT z is z is negative in front of the wall, local magnetization vectors therein rotate deterministically from up to down. As a result, the domain wall moves forwards in a unidirectional manner and drives the magnetization distribution across the critical state of <m z >~0 (Fig.  2c). Second, for most of the time, the τ tot z at the two ends of the domain wall (point "1" and "3" in Fig. 2e) are larger than that in the center (point "2"), leading to the formation of a positive domain wall curvature at equilibrium. Repeatable 180°perpendicular switching induced by either unipolar or bipolar current pulses The above analyses show that the nucleation and growth of the downward domain during the >90°switching (step 1) and the subsequent relaxation to <m z > approximately −1 (step 2) are both deterministic. We note that achieving such a two-step SOT perpendicular switching does not require the breaking of mirror symmetry. This feature allows for switching a perpendicular magnetization repeatedly with either a unipolar or bipolar ( Supplementary Fig. 2) current pulse, which is not possible to achieve with other field-free SOT switching approaches and therefore adds flexibility in the way of writing information. Figure  3a shows the evolution of both the <m z > and the total magnetic free energy density change Δf tot (see "Methods") under a unipolar current pulse. As seen, the forward switching (up to down, <m z > changing from 1 to −1) and the backward switching (down to up, <m z > changing from −1 to 1) show exactly the same profile of Δf tot (t). As seen from the zoom-in view of Δf tot (t) in Fig. 3b, for both the forward and backward switching, the realization of the >90°switching (step 1) requires overcoming an energy density barrier for the nucleation of an oppositely oriented domain and another barrier (albeit small for this particular case) for pushing the domain wall forward at the critical state of <m z >~0, both driven by the out-of-plane SOT torque τ SOT z as discussed above. Figure 3c shows the magnetization distributions at the initial and final stage of the switching as well as the two transitional states (where Δf tot peaks). It can be seen that the evolution of magnetization distributions during backward switching is symmetric to that in the forward switching. This explains why the free energy evolution during the backward switching is exactly the same as that in forward switching as shown in Fig. 3b. However, for the backward switching, the magnetization is initially pointing down and the edge magnetization tilts outwards due to the interfacial DMI, which is exactly opposite to the initial state in the case of forward switching. Therefore, the domain nucleation occurs at the bottom right region (the fourth quadrant) of the disk where the out-of-plane SOT τ SOT z and the out-of-plane SHE effective field H SOT z are both positive since the polarity of the current J remains unchanged. Likewise, the growth of upward domain is more favorable in the fourth and the first quadrants where the τ SOT z is positive.

DISCUSSIONS
Below we show that achieving a >90°out-of-plane SOT switching under a time-invariant current requires a balanced combination of the geometrical confinement (disk diameter), interfacial DMI (strength D), and SOT (magnitude of charge current J c ). Let us first discuss the role of confinement under a constant D of 0.45 mJ/m 2 , which is an experimentally measured value for CoFeB/Pt interface 23 . Figure 4a shows the switching diagram as a function of disk diameter and J c , where the region of successful switching is colored by the equilibrium value of <m z >. It can be seen that the desirable >90°switching appears not only within an intermediate range of J c (as have shown in Fig. 1b) but also an intermediate range of disk diameter (40-63 nm). At smaller disk diameters, the CoFeB disk has a larger perpendicular magnetic anisotropy (PMA), defined as K PMA ¼ Kinter d þ K OOP shape . Here, K inter is the magnetic interfacial anisotropy (~1.3 mJ/m 2 for CoFeB) 28 and d is the disk thickness; the out-of-plane magnetic shape anisotropy K OOP shape is larger in smaller disks due to stronger geometrical confinement (see "Methods"). The enhanced PMA facilitates the alignment of magnetization along the z-axis, thereby reducing the extent of interfacial-DMI-induced magnetization tilting along the disk edge. As shown in Fig. 4b, the magnitude of the normalized in-plane magnetization m IP along the disk edge (see schematic in the inset) decreases as the disk diameter reduces. This in turn makes it harder for the reversed domain to nucleate due to the smaller τ SOT z and H SOT z 25,26 , demonstrated by the larger threshold J c . This trend continues until m IP becomes smaller than a threshold (~7.22 × 10 −2 ), below which the nucleation of reversed domain is not possible. As the disk diameter increases, m IP also increases and gradually approaches its saturation value (~7.31 × 10 −2 ). The latter is linearly proportional to D 24 . A larger m IP will make it easier for the reversed domain to nucleate and hence facilitate the >90°switching, demonstrated by the decreasing J c as well as a more negative <m z > at equilibrium (Fig. 4a). However, the volume fraction of the tilted edge magnetization (denoted as V t , see inset of Fig. 4b) decreases with increasing disk diameter, because the interfacial DMI primarily tilts magnetization along the outermost edge of the disk. When V t becomes smaller than a threshold (~10.88%), the domain nucleation is also disabled, in other words, the numbers of flipped magnetization vectors are too small to trigger a nucleation. This is analogous to a classical nucleation process where the size of the unstable embryo cluster cannot grow large enough to exceed the critical nucleus size.
Furthermore, the fact that there exists a lower and an upper bound for the disk diameter suggests that the desirable >90°p erpendicular SOT switching may be more robust when the disk diameter is far away from the two bounds. In that case, both the m IP and V t can maintain a sufficiently large value, which synergistically facilitate the nucleation and thereby lead to a more robust switching. To test this hypothesis, we compared the success rate of the switching in the presence of a thermal fluctuation field at room temperature (see "Methods") for the cases of 40, 50, and 60 nm based on 100 groups of repeated simulations for each case. It is found that the success rate in the case of 50 nm is as high as 98%, which is much higher than that in the case of 40 nm (50%) and 60 nm (54%) (see Supplementary Fig.  3). In this regard, the switching diagram in Fig. 4a allows us to infer the robustness of the switching under other combinations of current densities and disk diameters. We also note that this finding (that is, higher success rate in an intermediate disk size) is different from the perpendicular magnetization switching driven by current-induced spin-transfer torques 28 or voltage-controlled magnetic anisotropy 29,30 , where the switching in larger-volume single-domain nanomagnets generally has a higher success rate due to the enhanced thermal stability 31,32 . Figure 4c shows the switching diagram as a function of the interfacial DMI strength D and the J c under a constant disk  Fig. 4). However, when D is relatively large (1.2 mJ/m 2 ≤ D ≤ 3.2 mJ/m 2 ), increasing D increases the J c , despite that both the m IP and the V t still increase. We attribute the larger J c to the change in the domain wall structure and the kinetic switching path when D is relatively large ( Supplementary  Fig. 5). Specifically, a large D tends to destabilize both the initial upward (downward) magnetization 33 and the domain wall, thereby making the domain wall motion more turbulent (Supplementary Video 1) which in turn reduces the mobility of the domain wall 34,35 . As a result, a larger J c will be required to move the domain wall. This is reminiscent of the Walker breakdown for the magnetic field or current-induced motion of 180°magnetic domain walls, where domain wall exhibits lower mobility if the field or current is so large that the domain wall structure becomes less stable during motion 36,37 . When D is even larger (≥3.4 mJ/m 2 ), the equilibrium spin structure under an intermediate J c displays an almost pure Néel wall that separates upward and downward domains (note that mixed Néel and Bloch wall features appear at lower D values). Once the current is turned off, this spin structure will relax to a metastable or stable (if D is sufficiently large 33 ) Néel stripe domain, where upward and downward domains are roughly half and half with <m z >~0 ( Supplementary Fig. 6), rather than relaxing to the desirable downward single-domain with <m z > approximately −1.
We also simulated the J c vs. D switching diagrams for other disk diameters in the range of 40-63 nm. It is found that the switching diagrams in the cases of other disk diameters all exhibit two features that are the same as the above-discussed 40 nm diameter case, due to similar physical mechanisms. First, as D increases, the critical J c required for the >90°switching decreases (increases) when D is relatively small (large). Second, the relaxation to single domain of reversed polarity in the second step is prohibited if D is too large. The numerical calculation of such switching diagram is tedious because one needs to search for the desirable range of J c for each combination of D and disk diameter. To accelerate the calculation of switching diagram, we train a decision-tree-regression-based machine learning model (see "Methods") from about 1000 groups of micromagnetic simulation datasets. The machine learning model is then tested with 80 datasets. Each dataset is comprised of an input vector X i (i = 1, 2, 3, representing the disk diameter, D, and J c , respectively) and an output scalar variable Y representing the equilibrium value of <m z >. The trained machine learning model can be used to rapidly regress the Y as a function of X i . As one example, Fig.  4e shows the machine-learning-model predicted J c vs. D switching diagram for the disk diameter of 40 nm, which is well consistent with the diagram calculated via micromagnetic simulations except one outlier (c.f., Fig. 4c). Remarkably, the machine-learning-predicted diagram, which also contains many more data points than the micromagnetic-simulations-based diagram, was calculated in <2 min in a laptop. This is much faster than that by micromagnetic simulations, where the calculation of one switching diagram takes~250 groups of simulations and each group typically takes~4 h to complete with 16 cores running simultaneously on state-of-the-art supercomputers. The prediction accuracy of our machine learning model is >90% with a mean square error of~0.01 (Fig. 4f), and can be further improved with more training datasets. Such machine learning model is thus particularly suited for accelerating the prediction of such switching diagram, e.g., J c vs. X (where X = disk size, D, K inter , M s , etc.), in which case large quantities of micromagnetic simulations are normally needed.
Having shown that the >90°perpendicular switching occurs through a deterministic nucleation and lateral growth of reversed domain (Fig. 2), we further perform ASD simulations to analyze the early-stage nucleation kinetics at which the embryos (clusters of several atoms) are too small to accurately describe with a continuum micromagnetic model. A multilayer stack (similar to the structure in Fig. 1a) with 50 nm diameter MgO/Co bilayer nanodisks on top of a Pt stripe is considered as the model system. There are two reasons for considering Co (instead of CoFeB) as the magnetic layer herein. First, since the goal is to reveal the nucleation and growth kinetics of the reversed domain (within which m z < 0) at the atomic scale, a pure metal Co would be a better model system than the solid-solution CoFeB. Second, it will show that the proposed two-step switching is applicable to other ferromagnets.
We first present the micromagnetic simulations results on the >90°switching in Co as the baseline of the discussion. As shown in Fig. 5a, after a brief incubation period (during which m z > 0), the volume fraction (V r ) of the reversed domain (within which m z < 0) increases gradually from 0 to above 60% at saturation, when moderate charge currents are applied (the range of J c is from 1.1 × 10 13 to 1.5 × 10 13 A/m 2 ). A larger J c yields a larger out-ofplane torque τ tot z which in turn leads to faster nucleation (that is, shorter incubation period) and faster growth. Notably, the growth of the reversed domain (described by the evolution of V r at later time stages) well fits the Kolmogorov-Avrami equation 38,39 , given by V r ¼ V 0 ð1 À e À t=tc ð Þ n Þ, where V 0 is a fitting parameter; t c is the characteristic time for the V r to saturate, which decreases from 0.24 to 0.12 ns as J c increases; n is the effective dimension of domain growth and is best fitted with values around 2, indicating a two-dimensional domain growth as expected. . c Distributions of the normalized perpendicular atomic spin moment S z in three different nucleation clusters P 1 , P 2 , and P 3 (indicated in d) at the same time stage of 0.055 ns. The central atoms are marked by circles (whose radii are not to scale with the atomic radius of cobalt). The lattice parameter of the hexagonal lattice is 0.25 nm. e The evolution trajectories of the normalized atomic spin moment of these three central atoms from 0 ns (at which S z~1 ) to 0.06 ns (at which S z < 0). The schematic in the right shows the polar angle θ of the normalized atomic spin moment. Figure 5b shows the nucleation and growth kinetics of reversed domain simulated using ASD modeling, where a 2D hexagonal lattice of 36,096 atoms is used to approximately describe the 1.1-nm-thick hcp Co nanodisk. The evolution of the average atomic spin moment and free energy terms during the entire twostep switching process are also simulated by ASD modeling (see Supplementary Fig. 7). As shown in Fig. 5b, the nucleation and growth of the reversed domain display generally similar kinetic features to those obtained using micromagnetic modeling. Specifically, the percentage of atoms (N r ) with flipped atomic spin moment (S z < 0, see "Methods") displays a similar saturation value. The growth kinetics of the reversed domain can likewise be fitted using the Kolmogorov-Avrami equation. For early nucleation stage (0.055 ns), Fig. 5c shows the density maps of the perpendicular atomic spin moment S z at three different nucleation clusters (labeled as P 1 , P 2 , and P 3 , as indicated in Fig. 5d) at the same time stage. It can be seen that the switching is fastest at P 3 and then P 2 , which is also indicated by the evolution trajectories of the atomic spin moment in the central atoms (marked by circles) at these three points from 0 to 0.06 ns (Fig. 5e). This locally variant switching speeds are mainly due to the fact that the downwardpointing effective SOT field, the magnitude of which is proportional to |S y |, is the largest at cluster P 3 while smallest at P 1 at this early stage of nucleation. As a result, the magnetization at P 3 can be flipped more quickly.
In summary, we have computationally demonstrated a field-free SOT-mediated perpendicular magnetization switching, the implementation of which does not require any changes in the standard architecture of an SOT-MRAM cell, thus being promising for practical applications. The full 180°switching is achieved through a >90°perpendicular switching (m z changes from~1 to <0, or vice versa from approximately −1 to >0) plus a subsequent precession relaxation to the revered direction. The simulations indicate that there are two key critical steps for achieving the >90°p erpendicular switching, including the nucleation of the reverse domain and its successful growth to exceed the critical "half-up half-down" spin state (e.g., Fig. 2c). Successful realization of both steps relies on striking the balance of the interfacial DMI, geometrical confinement, and SOT by tuning the interfacial DMI strength (D), lateral size of the magnetic layer, and magnitude of charge current density (J c ), respectively.
Our analyses show that the growth of the reversed domain is two-dimensional and can be well described by classical theory of growth kinetics, similarly to the growth of polarization domain in ferroelectric thin films 40 . Our ASD simulations, which resolve the evolution of magnetic moment in every single magnetic atom in a confined system, provide an atomistic picture of the nucleation kinetics of SOT-mediated perpendicular magnetization switching in general. Last but not the least, we have trained a decision-treebased machine learning model using about 1000 groups of micromagnetic simulations that were benchmarked using the open-source micromagnetic simulation package MuMax3. This machine learning model along with the datasets can evolve to an accurate and efficient computational design tool that can be used to quickly determine whether the field-free switching can occur and what are the values of average magnetization at equilibrium under a given set of design parameters (D, size, J c , key magnetic parameters, etc.).

Micromagnetic simulations using μ-Pro
Most of the micromagnetic simulations in this work are performed using the commercial μ-Pro® package (mupro.co/contact), which is CPU (Central Processing Unit) based and parallelized using Message Passing Interface. Consider a MgO/Co 20 Fe 60 B 20 (1.1)/Pt multilayer structure, where the CoFeB disk is discretized using a cuboid-shaped cell of 1 nm × 1 m × 0.55 nm. We note that the thickness of the CoFeB (1.1 nm) is significantly smaller than the magnetic exchange length l ex ¼ .54 nm, where A is the Heisenberg exchange coefficient, μ 0 is the vacuum permeability, M s is saturation magnetization. In combination with the fact that the interfacial DMI field does not modify the gradient of magnetization along the thickness direction (shown later), we expect the magnetization distribution to be spatially uniform along the thickness direction, which is confirmed by micromagnetic simulations. The simulations performed using a smaller cell size along the z-axis (i.e., 0.275 nm) yield the same results. Other magnetic nanostructures with sufficiently strong room temperature PMA, which can arise due to either the magnetic interface anisotropy (e.g., 0.5-nm-thick Fe 80 Co 20 , ref. 41 ) or the intrinsic magnetocrystalline anisotropy such as FePt 32 , can also be used to demonstrate the proposed field-free SOT perpendicular magnetization switching.
The magnetization in each cell M = M s m, where m = (m x , m y , m z ) is the normalized magnetization vector. The evolution of the normalized magnetization m is obtained through solving the Landau-Lifshitz-Gilbert (LLG) equation with an SOT term τ SOT , where α is the Gilbert damping coefficient and γ 0 is the gyromagnetic ratio. The effective magnetic field H eff ¼ À 1 the total magnetic free energy density f tot = f stray + f anis + f DMI + f exch is the sum of the densities of the magnetic stray field energy, perpendicular anisotropy energy, interfacial DMI energy, and Heisenberg exchange energy, respectively. Each energy density term is associated with an effective magnetic field. Therefore, one can also write H eff = H stray + H anis + H DMI + H exch . Among them, the magnetic stray field H stray is obtained through solving magnetostatic equilibrium equation ∇ Á μ 0 H stray þ M À Á ¼ 0 under a finite-size boundary condition, using a Fast Fourier Transform accelerated convolution theorem 42 . The anisotropy field H anis ¼ À 1 μ 0 Ms ∂fanis ∂m , where the uniaxial magnetic anisotropy energy density f anis ¼ Kinter For the ultrathin (1.1-nm thick) CoFeB in this work, it is the contribution from the magnetic interface anisotropy Kinter d that outweighs the out-of-plane shape anisotropy K OOP shape . The latter can be evaluated based on numerically calculated out-of-plane stray field. Experimentally, a room temperature thermally stable perpendicular magnetization has been demonstrated in ultrathin (1.3 nm or thinner) CoFeB nanodisks with diameter as small as 40 nm 28 (down to 20 nm according to a very recent report 43  , where k B is the Boltzmann constant, T = 298 K is the temperature, ΔV is the volume of each simulation cell, Δt is the time interval in real unit. η = η(r, t) = (η x , η y , η z ), where η x , η y , η z are Gaussian-distributed random number with a mean of zero, which are uncorrelated both in space and time.
Let us further consider that the input charge current flowing along the −y direction (J c,−y ) of the Pt layer. Due to the SHE 3 , electrons with opposite spin orientations are deflected transversely towards the top and bottom surfaces along the z-axis (yielding a spin current, denoted as J s,z ) as well as the two lateral surfaces along the x-axis (yielding a spin current J s,x ). We note that the spin current is a flow of angular momentum, and that the net charge currents along both x and z are zero. Since the magnetic nanodisk is laid on the top surface of the Pt layer, only the zaxis spin current J s,z , which has a spin polarization σ along x according to the right-hand rule of the SHE 3 , can be injected into the nanomagnet (Fig. 1a). In this case, the transfer of angular momentum from the Pt to nanomagnet can only occur along the z-axis, and in turn switch the magnetization in the latter via the SOT 44 . Experimental demonstration of this switching scheme can be found in for example refs. 8,9,15,45 . The spin-orbit torque τ SOT is expressed as 8,25,26 , where the corresponding effective field H SOT = H 0 (m × σ s,x ). Here, the M. Dai and J.-M. Hu prefactor H 0 ¼ μ B JcθSH γ 0 edMs , where μ B is the Bohr magneton; J c is the density of the charge current; θ SH is the spin hall angle, e is the elementary charge, and d is the thickness of CoFeB layer. Therefore, the z-component of the τ SOT is given by τ SOT z ¼ Àγ 0 H 0 m x m z , while the z-component of the H SOT is H SOT z ¼ ÀH 0 m y . In an initially +z magnetized disk (m z > 0) with inward tilted magnetization (see Fig. 2a

Micromagnetic simulation using MuMax3
We used MuMax3, which is GPU (Graphics Processing Unit) based, to validate the micromagnetic simulation results from the μ-Pro package. The results obtained from both packages are well consistent (e.g., see Figs. 1 and 3). In MuMax3, the spin torque term τ SL is Slonczewski type, given as 52 , Þþ Λ 2 À1 ð ÞmÁσ ð Þ . By setting Λ = 1, ϵ0 ¼ 0, the expression of τ SL in Eq. (3) can be written as, Given that μ B ¼ e h 2me , the expression of τ SL is the same as that of τ SOT in Eq. (2). This allows us to model the SOT switching using MuMax3.

Machine learning the switching diagram
The 1069 groups of micromagnetic simulation datasets comprised of tabulated {X i , Y} are divided into 989 training groups and 80 testing groups, where X i (i = 1, 2, 3) refers to the disk diameter, interfacial DMI strength D, and the charge current J c ; the scalar variable Y refers to the <m z > at equilibrium. Ten different machine learning models that are available in Scikit-learn 53 (including linear regression, Gaussian process regression, polynomial regression, etc.) were separately trained to regress Y as a function of X i . Among them, the decision-tree regression model delivers the highest accuracy. The accuracy is represented by R 2 score.
where y and f are the <m z > calculated from micromagnetic simulation and machine learning, respectively.

Atomistic spin dynamics (ASD) simulations
The ASD simulations are performed using our in-house package called AtomMag (under development) which has both a GPU-accelerated and a CPU-based version. The physical validity of AtomMag is tested by making comparison to Fidimag 54 , an existing open-source ASD simulation package. The testing results are summarized in Supplementary Fig. 8. Consider a MgO/Co/Pt multilayer structure, where the Co disk is described by arranging 36,096 atoms in a 2D hexagonal lattice with lattice parameters a = 2.5 Å, corresponding to a 50-nm disk diameter. The magnetic moment of each atom is given as μ i = μ s S i (i = 1, 2…36,096), where S i is the unit vector of the atomic spin and μ s is the magnitude of spin magnetic moment. The evolution of the unit spin direction S i is obtained through solving the atomistic LLG Equation with a spin-orbit torque term τ i SOT , where α is the Gilbert damping coefficient and γ = γ 0 /μ 0 is the gyromagnetic ratio. The effective magnetic field at each atom site where H is the total spin Hamiltonian; H tot ¼ H dipole þ H anis þ H DMI þ H exch is the sum of the Hamiltonians of magnetic dipole-dipole interaction, uniaxial magnetic anisotropy, interface DMI and Heisenberg exchange interaction of all atoms. Since each Hamiltonian is associated with an effective magnetic field, one can write 55 . Among them, where r ij is the distance between atom site i and atom site j, b r ij is the unit vector along the direction of r ij = r j −r i .
where K 0 is the anisotropy constant, u is the direction vector of the magnetic easy axis.
where D ij is the Dzyaloshinskii-Moriya vector between two neighboring spins at atom site i and j. For interfacial DMI, D ij ¼ D 0 b z r ij , where D 0 is the interfacial DMI constant.
where J ij is the exchange parameter describing the Heisenberg-type exchange coupling between two spins at atom site i and j. For hexagonal Co, first-principles calculations indicate that the second nearest-neighbor exchange parameter is negligibly small 56 . Therefore, the summation in Eq. (10) is truncated to include the nearest-neighbor exchange coupling only. Furthermore, the thermal fluctuation field B i therm is given by 57 , Similarly to the above-described micromagnetic model, the Gaussiandistributed stochastic vector η = η(r, t) = (η x , η y , η z ) is generated via Box-Muller transform of the uniformly distributed random numbers. The latter is generated using the high-performance GPU-accelerated random number generator cuRAND. In contrast to the fact that the peak amplitude of the thermal fluctuation field in a micromagnetic model is dependent on the size of the simulation cell, the thermal fluctuation field is directly applied to every single spin in ASD modeling. Therefore, temperaturedependent magnetic properties (e.g., see refs. 58,59 ) can be accurately modeled via ASD simulations. In this work, the main purpose of performing ASD simulations is to more accurately model the early-stage nucleation kinetics (Fig. 5b-e), e.g., in which area the spins flip the fastest? What is the intrinsic time scale of nucleation? To address these questions, B i therm is not included, otherwise the associated fluctuation on every atomic spin will obscure the analysis. Finally, the SOT τ i SOT is expressed as 60 , where β FL is the strength of field-like torque term, β DL is the strength of damping-like torque, σ is the orientation of spin polarization, which is along +x direction in our case. The parameters in ASD simulations can be derived from their continuumscale counterparts used in micromagnetic simulations as follows: 61 J ¼ 2az A ffiffi Here, we are considering a 2D hexagonal lattice with a thickness of only one monolayer (~4.0 Å) for reducing the computational cost (which is common for ASD simulations 60,61 ), by assuming the variation of magnetization along the thickness direction is negligible. This approximation would lead to significantly stronger demagnetization (dipolar) field in the system than it is supposed to be in a 1.1-nm-thick Co sample, which in turn reduce the PMA. To balance the stronger demagnetization field, we use a larger anisotropy constant K 0 = 5.50 × 10 −23 J. It is shown that under this combination of parameters, the extent of edge magnetization tilting (m IP~0 .3, c.f., Fig. 4b) is the same as that in the micromagnetic simulations of 1.1-nm-thick Co film. Strength of field-like torque and damping-like torque is proportional to