Surface diffusion-limited lifetime of silver and copper nanofilaments in resistive switching devices

Silver/copper-filament-based resistive switching memory relies on the formation and disruption of a metallic conductive filament (CF) with relatively large surface-to-volume ratio. The nanoscale CF can spontaneously break after formation, with a lifetime ranging from few microseconds to several months, or even years. Controlling and predicting the CF lifetime enables device engineering for a wide range of applications, such as non-volatile memory for data storage, tunable short/long term memory for synaptic neuromorphic computing, and fast selection devices for crosspoint arrays. However, conflictive explanations for the CF retention process are being proposed. Here we show that the CF lifetime can be described by a universal surface-limited self-diffusion mechanism of disruption of the metallic CF. The surface diffusion process provides a new perspective of ion transport mechanism at the nanoscale, explaining the broad range of reported lifetimes, and paving the way for material engineering of resistive switching device for memory and computing applications.

The manuscript "Surface diffusion-limited lifetime of silver/copper nanofilaments in resistive switching devices" demonstrates simulations as well as experimental validation for filament lifetimes in Resistive switching devices (ReRAM or CBRAM). The manuscript is clear, to the point and explains for the first time on an atomic scale what happens when and how conductive paths are broken.
Using their simulation and model, the authors were able to predict lifetimes of CF for a wide range of diameters and sizes. What is more, the authors experimentally validate those lifetimes, not only for their own silver filament-based devices but also using previously demonstrated work from other groups.
This work is highly relevant for memory-, access device-and neuromorphic computing-research fields and significantly aids the design of devices such that specific characteristics are obtained. The trade of between life-time and energy for forming and erasing a memory state is thus clearly shown. This relation is crucial for optimizing devices for specific applications.
I do have a couple of (minor) questions/comments: 1. Can the authors be sure that the distance between AgNW is similar each time in the experimental data? The compliance current is used to control the final thickness of the CF. In the simulations, this distance (between top and bottom electrodes) is set to a certain value (10 nm). However, in the experimental data, a number of AgNW (randomly) exist in the silk dielectric. It seems that once a filament has formed, this will draw all the current (and perhaps still grow in diameter) and no other filaments will form, is that correct? And if so, the CF will not always grow between similar (electrode/AgNW) distances as is modelled, won't that affect the comparison with the model? 2. If I understand correctly, the MD simulation shows how the atoms diffuse which indicates that energy minimization is responsible for that, either driven by the gradient of atomic vacancy concentration or chemical potential. Using this knowledge, a differential equation is presented that describes this relation which in turn is used for the second, morphological model. Is MD in any way also used to verify the other model, for instance having an identical CF with similar dimensions of height and diameter for the two models as presented in 3. Some of the current issues with memristive (ReRAM) devices, such as non-linear conductance tuning and stochastic behaviour, are hampering efficient operation in neuromorphic applications, so I am wondering whether this model can also be used to optimize forming and breaking conditions as such that linear/analogue tuning of the conductance is achieved/improved? I could imagine the correct voltage and current pulses necessary to achieve a certain conductance can be predicted, resulting in accurate tuning without read-actions between the tuning (which significantly decreases the efficiency of those neuromorphic arrays). For some neuromorphic applications for instance, only a relatively short life-time (~mins) is necessary as long as the conductance tuning is done efficiently and predictable. Can the authors comment on that?
4. Finally, would it also be possible to use this model to calculate the exact energy necessary for switching? That could also be a valuable metric to know, to design devices for (energy) efficient memory and neuromorphic applications.
The manuscript is a very interesting account on molecular dynamics modelling of Ag conductive filaments and switching data on nanorods embedded in a silk matrix sandwiched between Au electrodes. The key message is that the lifetime of the device "ON" state, that is, the device low resistance state, is governed by the size of the filament. The filament size can be controlled by the compliance current, presumably due to some Joule heating/electromigration cooperative phenomena.
The key point for the explanation of device performance is given by surface diffusion, and several movies corresponding to simulations of conductive filaments of various diameters show a power law dependence in which the lifetime scales with the fourth power of the filament diameter. This in turn is in agreement with Herring's law. This all sits very well together, and seems to explain well the rupture phenomena. I do have a couple of points to make though. Surface diffusion and Herring's law seem to explain the observed phenomena, but I would think that the Gibbs-Thomson equation leading to Ostwald ripening phenomena could bear similar results. In Ostwald ripening, the critical issue is the radius of curvature, the smaller the less stable. To be more specific, the vapor pressure of the nanoparticle is inversely proportional to its radius, and therefore the atoms just want to leave the particle. From that perspective, and in qualitative terms, the manuscript does not add anything new; I could say that the observed results are indeed to be expected. My first point then is: could we explain the observed phenomena by the Gibbs-Thomson equation, and what is missing from that formalism to properly capture the observed results? Very carefully pointed out is that in matrices, the surface (or should it be interface?) diffusion processes should change.
Could we still talk about universal behavior? How does that fit in with Ostwald-ripening (see for example [1])? My second point is: I am assuming that the devices can switch multiple times. From the learnings of this work, and knowing that the devices "die" most often in the "ON" state, can one find a compromise between endurance and retention? In summary I find the paper appropriate for Nature Comms, however to have a more general impact, the Gibbs-Thomson phenomenon should be discussed/ruled out/accommodated so the reader can connect different fields and gather a more broad perspective. From a more area specific impact, I would like to see some discussion on retention x endurance. Could this work shed some light into this aspect? Are those two issues inextricably connected? After comments from the authors addressing these two points I find the paper suitable for publication. [1] see, for example: dx.doi.org/10.1021/ja309034n, Simo et al., J. Am. Chem. Soc. 2012, 134, 18824−18833

Reply:
We thank the Referee for the insightful comments on our work. All the Referee's comments have been very helpful to improve the presentation of our findings. Following the Referee's concerns and suggestions as well as other Referees' comments, we have extended our study to make the work more comprehensive and fully support our conclusions. More detailed answers to the Referee's comments are reported in the following.  (Refs. 17 and 26). In certain conditions, it might be possible for the out diffusion to become dominant during the dissolution of filaments. In the present study, the consideration on host materials is limited to their impacts on interface conditions and surface diffusions rates. It may therefore be helpful to further check the dependence of out diffusion on host materials, e.g. solid electrolytes with high solubility of ions, to strengthen the discussions.

Reply:
We agree with the Referee that the contribution of the out-diffusion of filament atoms might be affected by the properties of the host materials. However, we believe that, at least for the material systems considered in our manuscript, namely Ag or Cu in silicon oxide, metal oxide or polymer host materials, surface diffusion, rather than out-diffusion, dominates the filament dissolution. This conclusion is based on the following two points: 1) Recent high resolution TEM observations reveal the presence of Ag or Cu clusters rather than homogeneously distributed Ag within the doped dielectric layer [R1-R4] . This was shown for Ag-doped SiO x (x<2) [R1] , Cu-doped SiO x (x<2) [ 2) The experimental size-dependent lifetime supports surface diffusion as opposed to out-diffusion as the fundamental mechanism for filament dissolution. The surface diffusion theory predicts a lifetime ~0 with an exponent n = 4, which is consistent with the experimental results in Fig. 5c. On the other hand, one should expect an exponent n = 2 for out-diffusion as reported by previous simulation studies [R8,R9] .
We agree with the Referee that out-diffusion might play a non-negligible role in other materials systems, such as when the metallic element (Ag, Cu) has a high solubility in the host materials, e.g., metal-chalcogenide materials, Ag 2 S for Ag filament [R10] , or CuS for Cu filament [R11] . In these cases, a different analysis and numerical approach should be conducted to properly take out-diffusion into account. However, in our experiments, we focused on polymer (silk) as dielectric layer between two AgNW electrodes. Ag atoms are more likely to cluster together rather than diffuse homogeneously as single atoms, given the low solubility of , and others, which all share a low solubility with the active metals Ag or Cu. In fact, at room temperature (300 ℃), the relative solubility of Ag in SiO 2 is 10 -6 , namely three orders of magnitude smaller than in Ag 2 S beta phase, where the solubility is 10 -3 [R15,R16] . Thus, our main finding about the leading role of surface diffusion in controlling the lifetime of RRAM should be restricted to these material systems with relatively-low solubility, rather than solid electrolyte materials with relatively-high solubility.
To clarify this point, we added one paragraph in the section 'Discussion' (second paragraph of the section, page 7). In the new paragraph, we describe the additional evidences for surface-diffusion as the fundamental mechanism for retention, as opposed to out-diffusion, namely i) TEM observations of clustering in the asdeposited state, and ii) the different size-dependent law for out diffusion (~0 2 instead of ~0 4 ). We also added a more comprehensive study of size-dependent lifetime for out-diffusion in the Supplementary Information (newly added Part X). References R8-R9 were added accordingly.
[ Reply: We agree with the Referee that asymmetric filament shapes and asymmetric electrodes are more common and more realistic in state-of-the-art resistive switching devices. We have added more simulation results regarding asymmetric electrodes and filament shape in the Supplementary Information, as detailed in the following.
We extended the simulation analysis in the Supplementary Part V, addressing the effect of different electrode materials. This has been modeled by varying the boundary conditions, namely the contact angle between the filament and the electrode, which is dictated by the wettability of the filament/electrode materials (see Fig. S4b). Simulation results show that changing the electrode materials does not significantly affect the size dependence of the lifetime ( Fig. S8a and S8b). On the other hand, when the top electrode is hydrophobic with respect to the filament, no stable capillary bridge can be formed anymore, thus the lifetime follows Herring's law across the whole range of the initial diameter 0 (Fig. S8c).
In the newly added Supplementary Part VI, we addressed the effect of the different initial filament shapes.
To this purpose, we varied the conical angle , defined in the Supplementary Part III. Similar size dependences of the CF lifetime for various  can be observed in Fig. S9. A cylindrical filament ( = 0 ∘ ) has a shorter lifetime compared to the conical filament with higher angle, e.g. = 5 ∘ and = 10 ∘ . However, the general trend of lifetime as a function of the initial filament diameter does not change. From the simulation results in  3. This study plots the lifetime dependence on filament size by considering the filament geometry with a variable of d/h (Fig. 4b). In this case, a filament length of 10 nm was assumed, but typical filament length could have a large variation, ranging from ~1 nm (if the filament has a critical switching region) to tens of nm. Will the variation in filament length significantly impact the results, or they will collapse into a similar curve? The authors are recommended to check on this.
Reply: As the filament disconnection generally takes place at the bottleneck, the minimum filament diameter 0 is the most relevant factor for the filament lifetime. Any variation of the filament length has less impact than variations of 0 . We addressed the impact of the filament length on lifetime in the newly added Supplementary Part VII and Fig. S10. From these results, it can be seen that a relatively small variation of filament length plays a minor role in the lifetime.
With respect to the original version of the manuscript, we decided to change the axis scale in Fig. 4b, where data are now plotted as a function of 0 , instead of 0 /ℎ. This is because the original figure might suggest that is only a function of 0 /ℎ, i.e., equal 0 /ℎ but different dimension scale would lead to the same retention time. This is of course not correct, since, according to our theoretical analysis in Eqs. (1) and (2) and based on Herring's scaling law, the retention time follows the scaling rule ~4, where and are the scaling factors of the time and the filament size, respectively. The variable 0 /ℎ only defines the initial aspect ratio of the filament. Assuming a constant 0 /ℎ and scaling the size of the filament by a factor , the shape evolution of the filament would be identical, although on a different time scale ~4. For instance, a filament with initial diameter 0 = 0.2 and filament length ℎ = 10 shows the same evolution process (e.g., fragmentation in clusters, etc.) as a filament with initial diameter 0 = 80 and filament length ℎ = 4 , except for their dimension scaling of = 400 and lifetime scaling of = 2.56 × 10 10 . According to our simulation, the lifetime of the former case is about 10 , thus, a silver nanowire with diameter 80 will have a lifetime of 2.56 × 10 5 s (several days), which is consistent with recent results [R17] . We added one paragraph in the main manuscript (last paragraph of the sub-section 'Size-dependent lifetime', page 6) to explain the use of the variable 0 /ℎ for initial shape identity and the extrapolation to the lifetime for silver nanowires.
[R17] Deignan, G. & Goldthorpe, I. A. The dependence of silver nanowire stability on network composition and processing parameters. RSC Adv. 7, 35590-35597 (2017). It should be noted that such bipolar switching behavior only applies for a stable (non-volatile) filament, which is the case for relatively large compliance current (hence diameter of the filament). On the other hand, the device operated in the volatile switching regime leads to bi-directional switching where the filament formation in the on state can take place by ion migration either from the positively-biased top electrode, or from the positively-biased bottom electrode, as indicated in Fig. 1e. Fig. 5c, where lifetime exceeding 1 second is marked as nonvolatile. It may be inappropriate.

Reply:
We agree with the Referee that the boundary between nonvolatile and volatile behaviors is not straightforward, as it derives from an assumed criterion for the filament lifetime. To avoid misunderstanding, we decided to delete the dashed line in Fig. 5c and avoid an arbitrary separation between volatile and nonvolatile regimes. preventing the formation of other filaments because of the compliance system which causes limitation of the voltage to maintain a certain compliance current I C . After the formation, the filament can then be electrically or spontaneously disconnected, which represent the cases of nonvolatile and volatile switching, respectively. In the subsequent set operation, the filament would likely form in the same active region, thus the distance between the Ag NWs remains the same. It is true that each device has a different distance between the active NWs, as a result of the random location of NWs in the host material. In our previous experimental work [R13] , we found that the threshold voltage for switching shows a remarkably uniform distribution, thus suggesting a relatively small variation of the Ag NW distance at the weakest point. As a result, the assumption of a constant h throughout the simulation seems a reasonable assumption. We added one paragraph in the revised manuscript (last paragraph of the sub-section 'Volatile and non-volatile switching,' pages 3 and 4) to clarify this point. It should also be mentioned that the filament lifetime depends only weakly on the filament length ℎ, as reported by simulation results in the newly added Part VII of the Supplementary Information. Therefore, random variations of filament length from device to device should negligibly affect the lifetime.
On the other hand, the filament diameter 0 at the critical position dominates both the lifetime of the filament and the electrical conductance of the volatile device. More simulation results are provided in the newly added Part VI of the Supplementary Information to support this point. In particular, from the reported simulation results, the filament lifetime is less sensitive to the filament height ℎ than to the bottleneck diameter 0 , thus supporting the assumption of a constant filament length ℎ in the simulation of our devices.
[ show strikingly similar evolutions. The time evolution of the filament gap length according to the two models is also shown in the movie. It should be noted that the MD simulation is extremely time consuming and can only be conducted in practical times by assuming a high accelerating temperature in the simulation (e.g., 800 K in Fig. S7). The parameters for MD simulations are solely calculated from first principle, while the parameters for the numerical model are obtained from the literature and, in part, from the best fitting of the experimental data.
For these reasons, the timescales of the two models are not consistent with each other, therefore we assumed a normalized time axis in the comparison of the two simulations in Fig. S7c.  Reply: We agree with the Referee that linear and analogue tuning of the ReRAM conductance is crucial for the understanding and control of the resistive device for both memory and neuromorphic applications.
Unfortunately, the processes of electrical force induced increasing/decreasing the filament conductance are generally governed by the field-induced migration, which is not included in our model at the present stage. We currently have some ideas about how to deal with this limitation, however extending the numerical model to describe field-induced drift in addition to surface diffusion would take quite some time and probably be a challenging project on its own.
In this work, we focused on filament spontaneous disconnection induced by surface diffusion, where the gradient of the surface atomic chemical potential is acting as the primary driving force. From this conceptual view, we may extrapolate that, under an applied voltage and the consequent electric field, the bulk atoms might be still hard to move while the surface atoms can migrate by surface drift driven by the gradient of the electrostatic potential, i.e., the electric field. By describing the cooperation and competition between drift and diffusion of the surface atoms, the evolution of filament for memory set/reset processes, or synaptic potentiation/depression, can be predicted by the model, thus allowing to optimize the linear/analogue tuning of the conductance. We are currently extending the model along this line, however the model development and validation with experimental data are still in the early stages.
 4. Finally, would it also be possible to use this model to calculate the exact energy necessary for switching?
That could also be a valuable metric to know, to design devices for (energy) efficient memory and neuromorphic applications.

Reply:
We agree with the Referee that the switching energy is a valuable metric. As mentioned in the previous point, the filament evolution induced by field-induced drift is not yet included in our model. Therefore, a direct calculation of the switching energy E = V*I*t is not yet feasible by the model. However, from the MD simulation, we can extract the system total energy of the initially formed filament (Fig. 2a) and the reduced filament energy after some relaxation time. This energy difference also provides the minimum energy to be provided to the system during the switching operation. The corresponding energy relaxation curve as a function of time was added and commented in Supplementary Part II, where the energy decrease after relaxation is about 65 eV, or about 10 aJ. Note, however, that this is just the free energy difference, or G, rather than the real energy expense needed to accelerate the switching transition to the on state, which will instead depend on the voltage, current, and time to achieve the transition. Reply: Indeed our surface diffusion mechanism has its basic physical origin in the Gibbs-Thomson equation. In the Gibbs-Thomson effect, the vapor pressure of a nanoparticle increases with the inverse of its radius [R18], [R19] .
Ostwald ripening is one example of the Gibbs-Thomson effect, where atoms released by small particles may attach to larger particles, which then grow at the expense of smaller particles. Similar to Ostwald ripening, also our picture for the filament disconnection can be viewed as a consequence of the Gibbs-Thomson effect, which drives the minimization of the surface energy. However, there is a major difference between our model and the Ostwald ripening. In Ostwald ripening, the transfer of atoms from small to large particles occurs via outdiffusion in the gas (or liquid, or solid) phase surrounding the particles. Our model for the filament evolution, instead, is entirely based on surface diffusion, as opposed to out-diffusion. It would cost too much energy for atoms to leave the filament and get dispersed in the host material, especially for relatively low solubility of filament metallic atoms in the host material. Therefore, we believe that, although our model for filament evolution and the Ostwald ripening share the same physical origin, i.e., the Gibbs-Thomson effect, they are totally different processes.
We conclude that, although our model has its physical origin in the Gibbs-Thomson effect, and although some of the simulation results are not unexpected based on the Gibbs-Thomson effect, our model is indeed new, because it allows, for the first time, to predict the time evolution of a filament based on its geometry, materials, and conditions (e.g., temperature). Most importantly, through our model we can derive a universal law bridging together many experimental results in the literature under the same physical picture for the first time.
 My first point then is: could we explain the observed phenomena by the Gibbs- Thomson equation, and what is missing from that formalism to properly capture the observed results? Very carefully pointed out is that in matrices, the surface (or should it be interface?) diffusion processes should change. Could we still talk about universal behavior? How does that fit in with Ostwald-ripening (see for example [1])?
Reply: Gibbs-Thomson effect has several equivalent forms. The most popular version of the Gibbs-Thomson effect gives the vapor pressure at the liquid-vapor interface as a function of the local curvature radius according to [R18] : where 0 is the vapor pressure for a flat surface ( 1 , 2 = ∞), the surface tension, is the density of vapor, the density of liquid, and 1 and 2 are the principal radii of the curved surface. The Gibbs-Thomson effect can be extended to solid-state nanoparticles to describe the surface concentration of the particle atoms by [R19] : where is the particle radius, 0 is the surface concentration of atoms in an infinitely large particle,  is the atomic volume, is the Boltzmann constant and is the temperature. For instance, Eq. (R2) can be used to describe Ostwald ripening of nanoparticles in the solid state. [R20,R21] Note that Eq. (R1) and Eq. (R2) are somehow equivalent, as they both predict that larger particles grow at the expense of smaller ones. The reason is that the surface of smaller particle has larger pressure (Eq. (R1)) or higher concentration (Eq. (R2)) of the particle atoms. However, neither Eq. (R1) nor Eq. (R2) can be used to directly predict the surface diffusion phenomenon and the evolution of the filament shape, which is instead obtained by Eqs. (1) and (2) in our numerical model. Yet, a key term in Eqs. (1) and (2) is the sum of inverse principal radii, similar to Eq. (R1). We thus conclude that, although our model has its roots in the Gibbs-Thomson effect, it stands as an original model, for both the formalism and the application. In other words, we are not reinventing the physics, but we are providing a new tool to physically describe an important phenomenon in RRAM technology, that is the universal volatile behavior.
We agree with the Referee that the key element in our theory, which differentiates our model from the Ostwald ripening, is that 'in matrices, the surface (or should it be interface?) diffusion processes should change'.
We rephrase this concept by saying that, in our system made of Ag or Cu filament within silk or SiO 2 , the main physical process driving the filament shape evolution and controlling the device lifetime is the surface diffusion, rather than out-diffusion as in Ostwald ripening. This is because atomic movement at the surface (or interface) is easier than movement of atoms across the surrounding host material. While out-diffusion is essential to describe Ostwald ripening of separate particles [R20,R21] , surface diffusion is the most appropriate mechanism to minimize the surface-volume ratio within a single filament, or when particles are touching one to each other [R22,R23] . The