The influence of the brittle-ductile transition zone on aftershock and foreshock occurrence

Aftershock occurrence is characterized by scaling behaviors with quite universal exponents. At the same time, deviations from universality have been proposed as a tool to discriminate aftershocks from foreshocks. Here we show that the change in rheological behavior of the crust, from velocity weakening to velocity strengthening, represents a viable mechanism to explain statistical features of both aftershocks and foreshocks. More precisely, we present a model of the seismic fault described as a velocity weakening elastic layer coupled to a velocity strengthening visco-elastic layer. We show that the statistical properties of aftershocks in instrumental catalogs are recovered at a quantitative level, quite independently of the value of model parameters. We also find that large earthquakes are often anticipated by a preparatory phase characterized by the occurrence of foreshocks. Their magnitude distribution is significantly flatter than the aftershock one, in agreement with recent results for forecasting tools based on foreshocks.


I. INTRODUCTION
Earthquakes occur in brittle regions of the crust characterized by a velocity weakening friction, which is at the origin of the stick-slip behaviour.The distribution of friction along the fault plane is highly heterogeneous with strong spots, usually called asperities [1].Asperities are expected to be surrounded by weak zones with a rheological behavior better described by a velocity strengthening friction.When the stress accumulated in the surroundings of the hypocenter overcomes the local friction, an abrupt slip takes place and stress is redistributed in the surrounding regions.The stress redistribution along the brittle, velocity weakening, part of the crust triggers the occurrence of other earthquakes, the aftershocks.They follow well established empirical laws that can be put in the form of power laws with quite universal values for the exponents [2].In particular the aftershock rate exhibits a roughly hyperbolic decay with time since the mainshock, an empirical law known as the Omori-Utsu law [3].
At the same time, the stress redistributed by the mainshock in velocity strengthening regions induces some slow deformations, commonly defined as afterslip In ref. [4], we have demonstrated this proportionality in a model with only two elastically coupled degrees of freedom.The first described the fault displacement, with an heterogeneous velocity weakening friction, while the second corresponded to the ductile region displacement, with a velocity strengthening friction.This very simple description can model different tectonic contexts and suggests that the coupling with a velocity strengthening layer and the heterogeneity in the fault friction are the two key ingredients controlling aftershock triggering.The same two ingredients are central in the pre-slip hypothesis [5][6][7] according to which small earthquakes, usually named foreshocks [8,9], are expected to anticipate the mainshock occurrence.According to this hypothesis, because of friction heterogeneity, there are small regions on the fault that have less resistive power than the large fault and can break before it, in presence of an underlying slow deformation process.This mechanism can produce an increase of the seismic activity, as the occurrence time of the mainshock is approaching but, because of the limited number of foreshocks, it is very difficult to be identified [10][11][12][13].Nevertheless, accurate investigations before some recent large earthquakes have elightened the presence of foreshocks together with a phase of slow slip of the plate interface [14][15][16].Other precursory patterns are observed if one considers the distribution in space of foreshocks [17][18][19][20] and/or their distribution in magnitude [21][22][23][24].In particular, very recently, Gulia & Wiemer [24] have shown that the magnitude distribution during aftershock activity is steeper than during foreshock activity.This result is however achieved for only two mainshocks and by means of different selection criterions for the foreshock identification [25].
In this article we show that friction heterogeneities and the slow deformation of a velocity-strengthening layer are sufficient ingredients to explain the whole ensemble of instrumental findings regarding the organization in time, space and magnitude of both aftershocks and foreshocks.To this extent we combine the model of two blocks of ref. [4] with the description of the fault plane originally proposed by Burridge & Knopoff (BK) [26]: a two-dimensional elastic interface with many degrees of freedom, each being subject to a velocity weakening friction law.Therefore our model of the fault consists in a collection of sliding blocks connected to a more ductile region, itself treated as an extended interface subject to velocity strengthening rheology.This system has a clear geophysical justification and allows us to study the organization of simulated earthquakes not only over time but also in space and in magnitude.We find that the model reproduces the most relevant empirical laws observed for instrumental aftershocks and foreshocks, quite independently of the precise value of model parameters.

II. THE MODEL
The model we propose is composed by a first layer H that represents the brittle part of the fault.H is elastically coupled to a second layer U that mimics the ductile region below the fault and is driven by the tectonic dynamics at the (very small) velocity V 0 .Each layer is an extended object made of many interacting degrees of freedom labelled i = 1, 2, . . ., N , organized on a square lattice.For simplicity we assume a motion restricted along the V 0 direction, with scalar displacements h i (t) in the layer H and u i (t) in the layer U.In Fig. 1 we present a schematic description corresponding to a one-dimensional cut of the mechanical model along the V 0 direction.The model also extends in the other direction, which is orthogonal to V 0 .
From continuum mechanics, the elastic cost of the displacement field is k h j =i (h j − h i )/r 2 ij , where r ij is the distance between points i and j.The constitutive equations for the displacements h i in the layer H are obtained from the balance between the elastic forces and the velocity weakening friction force τ h : To improve the efficiency of our numerical scheme, we restrict the sum in Eq. ( 1) to nearest neighbors |r ij | = 1, which corresponds to replacing the elastic force with the discrete Laplacian k h ∇ 2 h i .The total stress on i simplifies to

Brittle zone
Fault plane

Ductile region
FIG. 1.The mechanical model.Mechanical sketch of the model (one-dimensional cut: the other direction is orthogonal to the plane).This is the direct extension of Fig. 1 from [4]: each fault is modeled as a two-dimensional layer (and no longer as a single block).The fault plane H is subject to velocity weakening friction τ h , in the form of randomly placed pinning points (red disks) with varying pinning strength τ th i (disk radius).The ductile region U is subject to velocity strengthening friction τu, and is pulled at constant velocity V0 by distant regions.Within this ductile region, interactions are visco-elastic (Maxwell model), with dashpots having viscosity ηu and elasticity ku.The relative elongations of dashpots around site i is denoted zi = ϕi − ui − (ϕi−1 − ui−1).The two layers are connected elastically with a stiffness k.
is balanced by the friction τ h ).We also apply this short-range approximation to the layer U, which is however more ductile.For this reason we assume that the visco-elastic interactions [27] in U are implemented assuming that neighbouring degrees of freedom are connected by means of a dashpot and a spring placed in series (Fig. 1) [28,29].The constitutive equations for the layer U reads: where z i is the visco-elastic degree of freedom and the dot indicates a temporal derivative.The visco-elastic force k u (∇ 2 u i − z i ) has an intrinsic time scale t η = η/k u .When u i moves, for times shorter than t η the dashpot variable z i remains frozen, so the term k u (∇ 2 u i − z i ) acts as a genuine elastic stress and the layer U is solid-like.At longer times the variable z i (t) evolves to suppress the visco-elastic force (z i = ∇ 2 u i ) and the layer U displays a liquid-like behaviour.
Finally we have to define the form of the friction forces.For the force τ u of the ductile layer U, we assume a velocity strengthening friction, taking the stationary form of the rate-and-state friction (RSF) law [30][31][32]: where σ N is the effective normal stress, µ c is the friction coefficient when the block U slides at the steady velocity V c and A > 0 for a velocity strengthening material.For the friction in the brittle fault H, a random Coulomb failure criterion is adopted.As soon as the force overcomes a local random frictional stress threshold τ th i , the position h i becomes unstable and moves forward by a random amount (∆h) i .Slips of this kind are the bulk of earthquakes and occur on the very fast timescale t s , typically of the order of seconds.It is reasonable to assume that t s is the shortest time scale in the problem, and by far, t s t η , i.e. we assume it is instantaneous.Thus during an earthquake the layer U behaves elastically and Eq.( 2) can be approximated by the term k u z i being constant at these time scales, it plays no role in the dynamics of τ u , u i .As a consequence, for each slip (∆h) i at position i in H there are slips (∆u) j = q rij (∆h) i at all positions j in the layer U, where q rij is a decreasing function of the distance r ij .In general, the precise form of the q rij depends on the details of the dynamics of h i (t), z i (t), 0 < t < t s and can be quite complicated.Indeed, when we apply the RSF laws combined with all other equations (Eq.(3) in particular) to compute the true form of q rij , we find a very fast decay as a function of r ij , and thus decide to neglect terms that are not nearest neighbor to the slipping site.Thus in practice we use a short range form for q rij : q rii = q 0 , q rij = q 1 if |r ij | = 1 and 0 for all others.After the earthquake, at times t > t η , the dashpots of the layer U are relaxed and have dissipated some elastic stress (the k u ∇ 2 u i term is exactly compensated by −k u z i ).In this phase the u i 's are decoupled (η żi = 0) and Eq. ( 2) becomes Implementing the velocity strengthening friction (Eq.( 4)), Eq. ( 6) admits an explicit solution [4,33].More precisely, the time t R = Aσ N k0V0 represents the long timescale associated with the afterslip of the layer U, and for t η < t < t R one obtains where ρ 0 = Aσ N k+k0 is a characteristic length and D is a constant.Conversely, at later times t > t R the logarithmic motion becomes linear u i (t) ∼ V c t with V c = k0 k+k0 V 0 .To summarize, there are four timescales: (1) The slip time scale t s , which characterizes the duration of a single earthquake, (2) t η related to the visco-elastic response in the layer U , (3) t R which corresponds to the posteismic phase and (4) the inter-sequence time scale t d ∼ ∆h/V c which corresponds to the typical waiting time between consecutive seismic sequences.
We assume an infinite time separation (t s t R t d ), which is a realistic approximation for geophysical parameters together with t s t η < t R .Under this hypothesis three distinct phases are identified: coseismic phase (t ∼ t s t η ), post-seismic phase t ∼ t R , (t s t t d ) and interseismic phase t ∼ t d t R .Furthermore assuming that the local displacement ∆h is a constant independent of the position i, the temporal evolution of the model can be numerically implemented via a cellular automaton, for which each slip is infinitely fast.In this approximation the dynamics of the layer H at location i is completely characterized by the two contributions to the stress acting on that site, namely the intra-layer stress f i (t) = k h ∇ 2 h i and the inter-layer stress g i = k(u i − h i ).The sum f i + g i is thus the total stress acting on block i.The details of the evolution of the variables f i and g i are given in the Methods Section.In general, when f i + g i ≥ τ th i there is a slip in the site i and the stress evolves at i and at nearest neighboring sites j: The stress drop ∆f is extracted from a Gaussian distribution with average value ∆f and standard deviation σ.
During the coseismic phase, the stress evolution is driven by all the slips in layer H. Conversely, during the postseismic phase the stress evolution is driven by the ductile behavior of the layer U .More precisely, since u i evolves according to Eq.( 7) one has g i (t) = g i (t 0 )Φ(t − t 0 ) where Φ(t) is a logarithmic decreasing function of time.During the interseismic phase the stress g i (t) grows linearly in time at the very slow tectonic rate k 0 V c .
Since the specific value of ∆f is not relevant, we set ∆f = 1 and the model presents only three parameters: σ, Θ and .The standard deviation σ quantifies the level of friction heterogeneity whereas Θ quantifies the elastic interaction between the two layers and in the limiting case Θ = 0 the layer H is decoupled from the layer U. Finally, the parameter ∝ 1 − j q rij = 1 − q 0 − 4q 1 controls the amount of dissipation.In absence of friction in the layer U (τ u = 0) and neglecting k 0 from Eq.( 5), mechanical equilibrium imposes j q rij = 1.However in general, for a finite k 0 and taking into account the inelastic deformations in the U layer (the z i dynamics), N j=1 q rij < 1. Accordingly, controls the value of an upper magnitude cut-off m U −1.5log 10 (see Suppl.Fig. 3).In the main text we present results for a fixed value of = 0.008 which allows us to explore a sufficiently large magnitude range without finite size effects.The role of and of the system size L is explicitly investigated in Supplementary Figures.

III. FUNDAMENTAL QUANTITIES AND THEIR STATISTICAL FEATURES IN INSTRUMENTAL CATALOGS
A key quantity is the seismic moment M 0 = AD, where A is the fractured area and D is the average displacement.In spring-block models, A corresponds to the number of blocks which have slipped at least once during the earthquake and M 0 = i n i ∆h, where n i is the number of slips performed by the i-th block during the earthquake.We next introduce the moment magnitude m = (2/3) log 10 M 0 .In instrumental catalogs m is distributed according to the Gutenberg-Richter (GR) law: P (m) ∼ 10 −bm , with quite a universal value [34] of b 1.It is worth noticing that the GR law corresponds to a power law decay of the distribution of the seismic moment . Furthermore M 0 is related to the fractured area A by the scaling relation M 0 ∼ A 3/2 equivalent to the proportionality between m and the logarithm of A, m = γ 0 log 10 A + cnst, with quite a universal coefficient γ 0 = 1 [35,36].

IV. COMPARISON WITH PREVIOUS SPRING-BLOCK MODELS
The description of a seismic fault in terms of spring and blocks was originally proposed by Burridge & Knopoff (BK) [26].Bak & Tang [37] have enlightened the similarity between the BK model and the evolution of a simple cellular automaton model, the BTW model [38].In the BTW model the stress of each block increases in time with a constant rate ḟ , which models the tectonic loading, and when it reaches a uniform threshold f th , an earthquake starts by distributing stress to surrounding blocks.In the limit ḟ → 0, once the bond network is assigned the BTW model does not have tunable parameters and is usually considered the paradigmatic example of self-organized system, since it spontaneously evolves towards a state where the size of avalanches is power law distributed.Identifying an avalanche with an earthquake, since the earthquake size is proportional to M 0 , self-organized criticality provides a theoretical explanation for the GR law even, if it gives a too small, non-realistic value of b.Olami, Feder and Christensen (OFC model) [39] have subsequently shown that, keeping the limit ḟ → 0, the BK model can be exactly mapped in a cellular automaton.The model we present coincides with the OFC model in the limit cases Θ = 0 and σ = 0 and, in turn, the OFC model coincides with the BTW model when = 0. Interestingly, the OFC model presents an intermediate range of values such that M 0 is power law distributed with a b value close to one.On the other hand, for any finite value of , in the OFC model M 0 ∝ A leading to γ 0 = 2/3 for the coefficient of the m − logA scaling, different from γ 0 1 of instrumental catalogs.
Many modifications of the original OFC model have been proposed in the literature [2,40], and we group them in three classes: I) Those introducing a second time scale besides ḟ ; II) Those introducing heterogeneity in the friction thresholds f th ; III) Those introducing both a second time scale and friction heterogeneity.A second time scale is usually implemented in order to reproduce the temporal decay of the aftershock number which, indeed, can be attributed to a variety of time-dependent stress transfer mechanisms [41].Major examples of class I models are those implementing a viscous relaxation [42][43][44][45] or a reductions in fault friction by means of RSF laws [44].Concerning class II, the relevance of frictional heterogeneities in earthquake triggering has been deeply investigated [46] and, in particular, the OFC model with a random f th corresponds to the quenched Edwards-Wilkinson (qEW) model [28,29,47].This is a typical model for driven elastic interfaces in a random media and, in this case, it is well established that the seismic moment is power law distributed with b independently of the value of [2].Nevertheless, statistical patterns of seismic occurrence are better reproduced by class III models as shown in ref. [29,44,[48][49][50][51][52][53][54][55][56].
According to the value of the parameters Θ and σ, our model can belong to the different classes.In particular our conjecture is that class III models, and in particular the model we present with finite values of Θ > 0 and σ > 0, belongs to the same universality class of seismic occurrence.This conjecture is supported by the results of the subsequent section.In particular we observe that for finite values of Θ and σ our model is very similar to the Viscoelastic quenched Edwards-Wilkinson (VqEW) model introduced by Jagla et.al. [28].The key difference lies in the functional form of Φ(t).Indeed in our model the use of a realistic velocity strengthening rheology induces a logarithmic variation of Φ(t) with time, which is the crucial ingredient leading to the Omori-Utsu hyperbolic decay of the aftershock rate.In the VqEW model instead, an exponential relaxation of Φ(t) is obtained.

V. RESULTS
For each earthquake we record the occurrence time t, the hypocentral coordinates i (i.e. the coordinate of the block which nucleates the instability), the magnitude m and the fractured area A. The simulated catalog contains about 10 7 earthquakes, however we exclude the first 10% of events so that results are independent of initial conditions.In the main text we present results for different values of Θ and σ, keeping = 0.008 fixed.Results for different are discussed in the Supplementary Notes.
The full separation of time scales allows us to clearly distinguish separate seismic sequences.We define a seismic sequence as the set of earthquakes triggered by the relaxation of the layer U, according to Eq. ( 7), i.e. the set of earthquakes triggered during the postseismic phase.A new sequence starts at much later times when an earthquake is triggered during the interseismic phase with the slow stress rate increase k 0 V c .Interestingly, as it is often observed in instrumental catalogs [57], this first earthquake in the sequence is not always the largest one.We adopt the convention used to classify events of real seismic sequences: the mainshock is the largest event in the sequence, the foreshocks are all events occurring before it and the aftershocks are all the subsequent ones.In Fig. 2a we plot an excerpt of the whole catalog.
The lag time between two consecutive sequences depends on the specific value of t d .Before studying the features of aftershocks and foreshocks, we investigate the behavior of the global catalog.
In Fig. 3 we plot the magnitude distribution P (m) for different values of Θ and σ.
In particular the OFC model [39] (corresponding to Θ = σ = 0) gives an exponential decay with b = 0.12 ± 0.02 up to a system-size dependent upper cut-off m U , whereas for the qEW model (Θ = 0, σ > 0) we find b = 0.40 ± 0.02 independently of , with m U controlled by .Surprisingly, even for small values of Θ, the presence of the velocity strengthening layer U induces a dramatic and robust change in the b-value.For Θ ≥ 0.1, in very good agreement with instrumental catalogs, we always observe b = 1.06 ± 0.05 for intermediate magnitudes ranging from a lower cut-off m L related to lattice-specific details, up to an upper cut-off m U .Keeping Θ > 0.1 fixed we also find that the result b = 1.06 is independent of σ (inset of Fig. 3), except for the singular choice σ = 0 where the magnitude distribution presents a non monotonic behavior (not shown).The parameters Θ and σ only affect the value of m U , which increases with them (Fig. 3).In particular m U tends to m L when (Θ, σ) are very small, shrinking to zero the range where b ≈ 1.We also note an initial exponential decay P (m) ∼ 10 −b m for small magnitudes (smaller than m L ), with b monotonically increasing with Θ from b = 0.65 for Θ = 0.1 to b = 1.42 when Θ = 1.In particular, we find an intermediate range of Θ values (Θ ∈ [0.4,0.6]) where b b, i.e. for which the b 1 regime extends down to small magnitudes.
Realistic b-values of the GR law have already been found by Jagla et.al. [52,53] in models of only one layer but including viscoelastic couplings or aging effect in the static friction coefficients [29,[48][49][50][51][52][53][54][55][56], which in both cases results in an effective additional degree of freedom per lattice site (all these models are spatially extended).
Let us now consider the properties of aftershocks and foreshocks.Results do not depend on the specific values of Θ and σ, thus we only present them for intermediate value of Θ = 0.5 and for σ = 5.With these parameters the GR law is obeyed over a sufficienly large magnitude range.The spatial organization of a typical fore-main-aftershock sequence is plotted in Figs.2c,2d which present the contour of the area fractured by a mainshock (here m M = 5.1) and the contours of fracured area of the largest aftershocks and foreshocks (m > 1.5).Fig. 2d just indicates that the whole sequence is concentrated in a narrow region of the fault plane close to the mainshock epicenter, while a zoom inside this region (Fig. 2c) provides details of the spatial organization of events.First of all we observe that most aftershocks occur close to the border of the mainshock's fractured area.This is consistent with the gap hypothesis according to which the increase of stress on the border of the fractured area triggers the aftershocks, whereas the stress reduction inside the fractured region strongly reduces their occurrence probability.This scenario is strongly supported by recent observations of the aftershock organization after big mainshocks [58].The same analysis of ref. [58] for the distribution of the aftershock hypocentral distance, from the contour of the mainshock fractured area, is presented in Suppl.Fig. 5.In our model also, foreshocks occur close to the border of the area that will be fractured by the mainshock.In order to be more quantitative we plot (inset of Fig. 5) the number of aftershocks n aft (m M ) and foreshocks n fore (m M ) as a function of the mainshock magnitude.We find an exponential behavior n aft (m M ) ∼ 10 αm M which is also observed in instrumental catalogs and known as the productivity law [59,60].Also in this case we find quantitative agreement with the value α 1 observed in instrumental catalogs.The inset of Fig. 5 also shows an exponential behavior n fore (m M ) ∼ 10 αm M for the foreshock number with α 1, a result also observed in instrumental catalogs [18,19].We also find that the number of foreshocks is usually about one hundred times smaller than the aftershock one and we remark that only for the largest mainshock magnitude m M we do have a sufficient number of aftershocks (n aft (m M ) > ∼ 1000) to study their statistical features inside a single main-aftershock sequence.For this reason, to improve the statistics, we group sequences according to their mainshock's magnitude, as it is usually done in instrumental catalogs.More precisely we consider the magnitude distribution of aftershocks (foreshocks) occurring after (before) a mainshock with magnitude m ∈ (m M , m M + 1].Results (Fig. 5) confirm that aftershock magnitudes are distributed according to the GR law with b 1. Interestingly we observe that also foreshocks follow the GR law but with a significantly smaller b-value b 0.8.This result is consistent with the existence of an inverse relation between b-value and local stress level, as indicated by many laboratory measurements and field observations [61][62][63].Accordingly, a smaller b-value (larger proportion of large earthquakes) is expected to be observed before the occurrence of the mainshock and close to its hypocenter, as a signature of high stress conditions.Indeed, several studies report the decrease of the b-value while approaching the mainshock, and identifies it as a precursory pattern which can improve mainshock forecasting [21][22][23][24].Our study represents the first identification of this pattern in a mechanical model simultaneously presenting realistic features of aftershock occurrence.We further note that our measure of the b-value is not biased by the foreshock selection criterion (since we have a perfect separation of sequences) nor is it affected by problems of magnitude completeness (we have access to the smallest earthquakes), which are typical of instrumental catalogs and can be responsible for spurious behaviours of the b-value.
In Fig. 6 we plot the number of aftershocks (foreshocks) n af t (t|m M ) (n f ore (t|m M )) as function of the time t since (before) the mainshock with magnitude m ∈ [m M , m M + 0.8), divided by the total number of mainshocks with m ∈ [m M , m M + 0.8).We find that the aftershock organization in time follows the Omori-Utsu law n aft (t) ∼ t −p with p = 1 over several decades.The inverse Omori law [64,65] n fore (t) ∼ t −p , with p = 1, is also found to characterize the temporal organization of foreshocks.It is worth noticing that, at variance with the aftershock occurrence, a clear temporal behavior cannot be extracted from a single foreshock sequence because of the very small number of foreshocks (we find at most 32 foreshocks during one sequence).Thus the inverse Omori law is only observed after stacking many sequences.The vertical shift of curves for different mainshock magnitudes is consistent with the productivity law, in agreement with the inset of Fig. 5.At short times there is an abrupt transition from an about flat behavior to the 1/t decay.We expect that assuming a finite ratio t η /t R would smooth this transition and help better reproduce instrumental observations.
In Fig. 7 we plot the density of aftershocks or foreshocks ρ(δr, m M ) as a function of the distance δr between their hypocenter and their mainshock's hypocenter, grouping events by intervals of mainshock magnitude m M ∈ [m M , m M +0.8).There is a clear dependence on m M , and at the same time for any m M the aftershocks and foreshocks share very similar spatial distributions, in agreement with instrumental findings [17,18,20].Foreshocks occur mostly over the area fractured by the mainshock, supporting the idea that their spatial organization contains information on the size of the incoming mainshock (in that given region) [17,18].Concretely, we find that ρ(δr, m M ) obeys the Aftershocks and foreshocks magnitude distributions.We report the total number of aftershocks n af t (m, mM ) (open symbols) and foreshocks n f ore (m, mM ) (filled symbols), with magnitude m, grouped by their mainshock's magnitude mM (see legend).We always consider Θ = 0.5, σ = 5, L = 1000 and = 0.008.Lines correspond to the GR law with b = 1.05 (green dot-dashed) and b = 0.83 (magenta dashed).(Inset) The aftershock number n af t (mM ) (empty symbols) and the foreshock number n f ore (mM ) (filled symbols) versus mM .The green line is the productivity law with α = 1.
FIG. 6.The direct and inverse Omori law.The number of aftershocks n aft (t, mM ) (b) and the number of foreshocks n fore (t, mM ) (a) as function of the time t since (and before) the mainshock occurrence.Different colors correspond to different mainshock magnitude classes mM .The dashed line is the hyperbolic Omori-Utsu decay 1/t.The wide range of the vertical scale makes difficult to appreciate the difference between the foreshock number and the corresponding aftershock number.This difference is better ennlightened by results plotted in the inset of Fig. 5.We always consider Θ = 0.5, σ = 5, L = 1000 and = 0.008.Different colors correspond to different mainshock magnitude classes mM .In the inset we show the same data after rescaling by the size of the aftershock area L(mM ) = 10 γm M and γ = 0.54.We always consider Θ = 0.5, σ = 5, L = 1000 and = 0.008.
with L(M m ) ∼ 10 γm M and γ 0.57±0.05.Similar collapses are observed in instrumental catalogs [17,18,[66][67][68], although a smaller value γ 0.5 is usually observed.A second difference lies in the decay of the scaling function Q(δr): in our model it is exponential while power-law tails are reported in instrumental catalogs [68].This overly fast decay can be attributed to our approximate modelling of elastic interactions within each layer, the correct long-range interaction expected from elasto-static theory [2] being replaced (see Eq. 1) with the short-range term k h ∇ 2 h i .Indeed under this short-range approximation aftershocks can be triggered only within or at the boundary of the rupture zone, as was shown in [28,29].Finally we note that this spatial clustering law can be related to the m-logA scaling of Fig. 4, with γ = 1/(2γ 0 ).Indeed since aftershocks are mostly distributed on the border of the area fractured by the mainshock, one has L(m M ) ∼ √ A ∼ √ 10 m M /γ0 ∼ 10 m M γ .

VI. CONCLUSIONS AND PERSPECTIVES
We have implemented a minimal model for earthquake triggering, modelling the interaction between the brittle part of the crust (an elastic and velocity weakening region) and the ductile zone (a visco-elastic region with velocity strengthening rheology).We assume short-range elasticity and infinite time separation, which allows to develop a cellular automaton model controlled by only two parameters, Θ and σ.Very interestingly, we find that as soon as Θ and σ are sufficiently different from zero, we recover the established statistical features of aftershock occurrence, with realistic values of the parameters b, α, γ, p.This robustness strongly suggests that our model captures the universality class of instrumental earthquakes.Our model thus provides useful insights on the mechanisms of aftershock triggering.For example, a deviation from the stationary behavior of the b-value is found during foreshock sequences, supporting its interpretation as a precursory pattern for the mainshock occurrence.
Although our model misses some features of instrumental earthquakes, such as the power law decay of the spatial density ρ, it can be very useful.Thanks to its simplicity, we can easily produce very complete synthetic catalogs to test forecasting hypothesis, or mechanisms of stress evolution and how it is related to foreshocks, mainshocks or aftershocks.It is also possible to extend the single fault model presented in this study to a more realistic description as a network of faults.One could then study the interaction between different branches of the network.

A. Derivation of the Cellular automaton rules
We consider two square layers of sides L = 1000.In our lattice geometry there are exactly 4 neighbors j for each site i: the stress diffusion terms of the type (∇ 2 h) i at site i with positions x, y thus stand for (∇ 2 h) x,y = (h x+1,y + h x−1,y + h x,y+1 + h x,y−1 − 4h x,y ).We use absorbing boundary conditions (h = 0 is fixed at the boundary) which means that some stress is absorbed at the boundaries, and the slip cannot propagate further.
We now recall the main assumptions of our continuous model, before explicitly deriving the corresponding cellular automaton.These assumptions are summarized in the mechanical sketch of Fig. 1, from which the equations can be derived.The stress at site i in the layer H is the sum of intra-layer and inter-layer stresses, respectively: The total stress f i + g i at site i is balanced by a velocity-weakening (Coulomb failure style) friction force τ h , which takes a new random value, denoted τ th i , after each slip: where G(τ ) is a gaussian distribution with average 1 and standard deviation σ.As long as τ h ≥ f i + g i the site i is pinned, that is ḣi = 0.The constitutive equations operate on various time scales: The first two equations are the force balance between applied stresses and local friction force, and are thus instantaneous.The third is the internal stress dynamics of the visco-elastic layer U , evolving over an intermediate time scale t η .The fourth is the time evolution of the velocity-strengthening friction, slowly evolving over a time scale t R .We recall the constants: -Initialization -At time t = 0 we assign a local frictional threshold τ th i extracted from G(τ ).We also choose the initial value f i (0) of the local stress at random in the interval [0, τ th i ) and suppose that at u i (0) = h i (0) in all sites.-Interseismic phase -At time scales larger than t s , t η , t R , we have ui = V c and the equations above simplify.Using Eq. ( 15) we get τ u = µ c .At these long time scales (t t η ) we have η żi = 0 so that using Eq. ( 14), z i = ∇ 2 u i .Using Eq. ( 13), this combines to yield k 0 V 0 t = (k + k 0 )V c t, which explains the necessary definition V c = k0V0 k+k0 .We finally have f i + g i = const.+ kV c t, and using Eq. ( 12), we can compute the distance to failure (time before failure): with t 0 the time at the beginning of this phase.The site i * corresponding to the smallest value of t (drive) i is thus identified as the hypocenter of the next earthquake.An amount of stress τ th i * − f i * (t 0 ) − g i * (t 0 ) is then added to all sites and the coseismic phase is entered, with exactly one site being unstable (the one where f i * (t) + g i * (t) = τ th i * ).-Coseismic phase -Each site with f i (t) + g i (t) ≥ τ th i is unstable and slips of a constant amount ∆h, h i → h i + ∆h.A slip in the layer H at site i induces a coseismic slip u j → u j + q rij ∆h inside the U layer.As explained in the main text, we set q r = 0 for r > 1, i.e. we only keep the local coseismic slip q rii = q 0 > 0 and the nearest neighbor coseismic slip q rij = q 1 > 0 (when |r ij | = 1).Because of the ductile nature of the layer U there is some dissipation occurring during the coseismic slip, in the sense that the total coseismic slip is less than the slip: = 1 − q 0 − 4q 1 > 0 ( = 0 would be the dissipationless case).This coseismic slip is considered instantaneous and corresponds to the following stress evolution, for the site i itself and for its first neighbors j: At this time scale, the internal degrees of freedom z i are fixed and do not evolve.By introducing the parameters we can factorize: Which shows that the coseismic slip stabilizes g i but increases the g j stresses.After a slip the block h i is in a different frictional condition, i.e. a new value of τ th i is extracted from the distribution G(τ ).If τ th i is such that f i (t)+g i (t) ≥ τ th i then the process of Eq.s (23-26) is iterated immediately, until f i (t) + g i (t) < τ th i .Because of the stress redistribution, nearest-neighbor sites j can be unstable and slip at the same time.In practice, we perform the updates of Eqs.(23)(24)(25)(26) on all sites for which f j (t) + g j (t) ≥ τ th j , until all sites satisfy f j (t) + g j (t) < τ th j .We follow a sequential updating scheme, which implies the slip of just one unstable block at a time.Preliminary results with an updating scheme where all unstable blocks simultaneously slip indicate no important difference.
Shortly after the end of the earthquake, the visco-elastic couplings (internal degrees of freedom z i of the layer U ) relax their stress (over a time scale t η = η/k u ), which in practice means that η ż = 0 = k u (∇ 2 u i − z i ).As we explained in the main text, the way in which this relaxation affects the stress in the layer U has already been included implicitly in the coslip dynamics, via the coefficients q 0 , q 1 , so that on longer time scales we can simply consider that k u (∇ 2 u i − z i ) = 0.In particular, Eq. ( 5) simplifies into Eq.( 6), i.e. the blocks u i become independent.
-Post-seismic phase -At the end of an avalanche the h i are stuck, so that the intra-layer stress f i remains constant.However the g i may evolve.Indeed, the u i are subject to a velocity-strengthening rheology and evolve according to Eq. ( 6), which can be integrated to give Eq.( 7), that we recall here: u i (t) = u i (t 0 ) + ρ 0 log 1 + D t−t0 t R , where D = exp (−g(t 0 )/Aσ N ) is a constant (over time, but different for each site).This solution can be checked, using Eq. ( 15) on one side and Eq. ( 6) on the other.One may also consult our solution for the case of a two-blocks model [4], which is the same since all sites u i evolve independently during the afterslip (in the two-block model there was only one block u).Writing g i (t) = g i (t 0 ) + k(u i (t) − u i (t 0 )), we can identify g i (t) = g i (t 0 )Φ(t − t 0 ) (27) with and the local stress value evolves, at times t ≥ t 0 , according to the equation We note that this happens independently in all the sites (there is no stress transfer between sites, which makes the evolution very simple to compute).It is important to remark that Φ(t) is a monotonically decreasing function of t.
For g i (t 0 ) < 0 (which does happen), D is large and at times t − t 0 t R , Φ(t − t 0 ) ∼ 0. Thus Φ decreases from 1 to ≈ 0. This relaxation of the inter-layer stress g i (t) happens to all sites simultaneously.If for a site i, f i (t 0 ) > τ th i , there will be a time t af t > t 0 such that f i (t af t ) + g i (t af t ) = τ th i .For some sites (depending on the slips dynamics) this condition is not fulfilled and no such time exists.
Computationally, we compute the time t af t for all sites where it is defined by inverting Eq. ( 29).Then we pick the smallest t af t , that we may call t * af t , and relax the stress in all sites according to Eq. ( 29) using t = t * af t .At this point there is thus exactly one site that became unstable (the one with the smallest t af t ), i.e. a new earthquake is triggered.We then proceed with the coseismic phase, until the earthquake is complete.This is the physical mechanism which triggers aftershocks.After a number of aftershocks, there is a point where no site fulfills the condition f i (t 0 ) > τ th i , i.e. no time t af t can be defined.In that case we perform relaxation "for infinite time" (several O(t R )), or more concretely we set g i = 0 at all sites.At this point the fore-main-aftershock sequence is finished and the interseimic phase resumes, triggering a new sequence.

FIG. 2 .
FIG. 2. The numerical catalog (a) A typical example of a part of the simulated catalog containing five sequences.We plot the magnitude of each event m versus its occurrence time in units of t d .(b) A zoom on the second sequence plotted in panel (a).(c) We plot the contour of the area fractured by the mainshock (black line) of the m > 1.5 aftershocks (blue lines) and foreshocks (green lines) for the sequence plotted in panel (b).Red rhombus, cyan and green triangles indicate the hypocentral location of the mainshock, of the m < 1.5 aftershocks and foreshocks, respectively.We include only aftershocks up to the time t = 0.2tR after the mainshock occurrence.(d) As in panel (c) for the whole fault plane, during the temporal window of the sequence considered in panel (b).

FIG. 3 .
FIG. 3. The magnitude distribution.Magnitude distribution P (m) for different values of Θ and σ.In the main panel we fix σ = 5.0 and change Θ, except for the OFC model with σ = 0.In the inset we fix Θ = 0.5 and change σ.Lines correspond to the GR law P (m) ∝ 10 −bm either with b = 1.06 (green dashed), consistent with the instrumental value, or b = 0.12 (turquoise dot-dashed) or b = 0.40 (black dotted).

FIG. 7 .
FIG.7.The spatial clustering of aftershocks and foreshocks.The spatial density of aftershocks ρ aft (δr, mM ) (open symbols) and foreshocks ρ fore (δr, mM ) (filled symbols) as function of the hypocentral distance from the mainshock epicenter δr.Different colors correspond to different mainshock magnitude classes mM .In the inset we show the same data after rescaling by the size of the aftershock area L(mM ) = 10 γm M and γ = 0.54.We always consider Θ = 0.5, σ = 5, L = 1000 and = 0.008.