Abstract
The distribution of seismic moment is of central interest for evaluating earthquake hazard, in particular regarding the most extreme events. We make use of likelihood-ratio tests to compare the simple Gutenberg-Richter power-law (PL) distribution with two statistical models that incorporate an exponential tail, the so-called tapered Gutenberg-Richter (Tap) and the truncated gamma, when fitted to the global CMT earthquake catalog. Although the Tap distribution does not introduce any significant improvement of fit with respect to the PL, the truncated gamma does. Simulated samples of this distribution, with parameters β = 0.68 and m_{c} = 9.15 and reshuffled in order to mimic the time occurrence of the order statistics of the empirical data, are able to explain the temporal heterogeneity of global seismicity both before and after the great Sumatra-Andaman earthquake of 2004.
Introduction
The Gutenberg-Richter (GR) law is not only of fundamental importance in statistical seismology^{1} but also a cornerstone of nonlinear geophysics^{2} and complex-systems science^{3}. It simply states that, for a given region, the magnitudes of earthquakes follow an exponential probability distribution. As the (scalar) seismic moment is an exponential function of magnitude, when the GR law is expressed in terms of the former variable, it translates into a power-law distribution^{4,5}, i.e.,

f(M) ∝ 1/M^{1+β},
with M the seismic moment, f(M) its probability density (fulfilling ∫f(M)dM = 1), the sign “∝” denoting proportionality, and the exponent 1 + β taking values close to 1.65. This simple description provides rather good fits of the available data in many cases^{6,7,8,9}, with, remarkably, only one free parameter, β. A totally equivalent characterization of the distribution uses the survivor function (or complementary cumulative distribution), defined as

S(M) = ∫_{M}^{∞} f(M′)dM′,
for which the GR power law takes the form S(M) ∝ 1/M^{β}.
The power-law distribution has important physical implications, as it suggests an origin in a critical branching process or a self-organized critical state^{3,10,11}. Nevertheless, it also presents some conceptual difficulties, due to the fact that the mean value 〈M〉 provided by the distribution turns out to be infinite^{4,12}. These elementary considerations imply that the GR law cannot be naively extended to arbitrarily large values of M, and one needs to introduce additional parameters to describe the tail of the distribution, presumably arising from finite-size effects. A major problem, however, is that the change from power law to a faster decay seems to take place at the highest values of M that have been observed, for which the statistics are very poor^{13}.
Kagan^{7} has enumerated the requirements that an extension of the GR law should fulfil; in particular, he considered, among others: (i) the so-called tapered (Tap) Gutenberg-Richter distribution (also called the Kagan distribution^{14}), with a survivor function given by

S_{tap}(M) = (a/M)^{β} e^{−(M−a)/θ}, for M ≥ a (with a the lower cutoff),
and (ii) the (left) truncated gamma (TrG) distribution, for which the density is

f_{trg}(M) ∝ (1/M)^{1+β} e^{−M/θ}.
Note that both expressions have essentially the same functional form, but the former refers to the survivor function and the latter to the density. As f(M) = −dS(M)/dM, differentiation of S_{tap}(M) in (i) shows the difference between both distributions. In both cases, the parameter θ represents a crossover value of seismic moment, signalling a transition from power law to exponential decay; so, θ gives the scale of the finite-size effects on the seismic moment. The corresponding value of (moment) magnitude (sometimes called the corner magnitude) can be obtained from m_{c} = (2/3)(log_{10} θ − 9.1), when the seismic moment is measured in N · m^{15,16}.
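For concreteness, the conversion between corner moment θ and corner magnitude m_{c} can be sketched as below, using Kanamori's relation m = (2/3)(log_{10} M − 9.1) for M in N · m (helper names are ours):

```python
import math

def magnitude_to_moment(m):
    # Kanamori relation: M = 10^(1.5 m + 9.1), with M in N·m
    return 10.0 ** (1.5 * m + 9.1)

def moment_to_magnitude(M):
    # Inverse relation: m = (2/3) (log10 M - 9.1)
    return (2.0 / 3.0) * (math.log10(M) - 9.1)

# Corner moment corresponding to the corner magnitude m_c = 9.15
theta = magnitude_to_moment(9.15)   # about 6.7e22 N·m
```

The completeness threshold M > 5.3 × 10^{17} N · m used later corresponds, through the same relation, to magnitude m > 5.75.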
Kagan^{7} also argues that available seismic catalogs do not allow the reliable estimation of θ, except in the global case (or for large subsets of it); in particular, he recommends the use of the centroid moment tensor (CMT) catalog^{17,18}. From his analysis of global seismicity, and comparing the values of the likelihoods, Kagan^{7} concludes that the tapered GR distribution gives a slightly better fit than the truncated gamma distribution, for which, in addition, the estimation procedure is more involved. In any case, the β-value seems to be universal (at variance with θ), see also refs 9, 19 and 20.
Nevertheless, the data analyzed by Kagan^{7}, from 1977 to 1999, comprise a period of relatively low global seismic activity, with no event above magnitude 8.5; in contrast, the period 1950–1965 witnessed 7 such events^{21}. Starting with the great Sumatra-Andaman earthquake of 2004, and followed since then by 5 more earthquakes with m ≥ 8.5 (up to the time of submitting this article), the current period seems to have returned to the higher activity levels of the past.
Main et al.^{22} and Bell et al.^{23} have re-examined the problem of the seismic moment distribution including recent global data (shallow events only). Using a Bayesian information criterion (BIC), Bell et al.^{23} compare the plain GR power law with the tapered GR distribution, and conclude that, although the tapered GR gives a significantly better fit before the 2004 Sumatra event, the occurrence of this event changes the balance of the BIC statistics, making the GR power law more suitable; that is, the power law is more parsimonious, or, simply, it is enough for describing global shallow seismicity when the recent mega-earthquakes are included in the data. Similar results have been published in ref. 24.
In the present paper we revisit the problem with more recent data, including also the truncated gamma distribution, using other statistical tools, and reaching somewhat different conclusions: when the data include periods of high seismic activity, the tapered GR distribution indeed does not introduce any significant improvement with respect to the power law^{23}, but the truncated gamma does.
Data, Models and Maximum Likelihood Estimation
As Main et al.^{22} and Bell et al.^{23}, we analyze the global CMT catalog^{17,18}, in our case for the period between January 1, 1977 and October 31, 2013, with the values of the seismic moment converted into N · m (1 dyn · cm = 10^{−7} N · m). As in those references, we restrict the analysis to shallow events (depth < 70 km) and, in order to avoid incompleteness, to magnitude m > 5.75 (equivalent to M > 5.3 × 10^{17} N · m). This yields 6150 events.
As statistical tools, we use maximum likelihood estimation (MLE) for fitting, and likelihood-ratio (LR) tests for the comparison of different fits. Maximum likelihood estimation is the best-accepted method for fitting probability distributions, as it yields estimators which are invariant under reparameterizations, and which are asymptotically efficient for regular models, in particular for exponential families^{25} (the three models under consideration here are regular, and the PL and the TrG belong to the exponential family). When maximum likelihood is used under a wrong model, what one finds is the closest model to the true distribution in terms of the Kullback-Leibler divergence^{25}.
Model-selection tests based on the likelihood ratio have the advantage that the ratio is invariant with respect to changes of variables (if these are one-to-one^{25}). Moreover, for comparing the fit of models in pairs, the LR test is preferable to the computation of differences in BIC or AIC (Akaike information criterion), as the test relies on the fact that the distribution of the LR is known under a suitable null hypothesis, which provides a significance level (or level of risk) for its value. So, LR tests constitute probability-based model selection (in contrast to BIC and AIC). Note, however, that the log-likelihood ratio is equal to the difference in BIC or AIC when the number of parameters of the two models is the same.
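This last remark can be checked directly: defining BIC = n_{p} ln N − 2l for a model with n_{p} parameters and maximum log-likelihood l, the penalty terms cancel when both models have the same number of parameters. A minimal sketch, with illustrative numbers (not taken from our fits):

```python
import math

def bic(loglik, n_params, n_data):
    # Bayesian information criterion: BIC = n_params * ln(N) - 2 * loglik
    return n_params * math.log(n_data) - 2.0 * loglik

# Two hypothetical models with the same number of parameters (2),
# fitted to the same N = 6150 events
l1, l2, N = -100.0, -97.0, 6150
delta_bic = bic(l1, 2, N) - bic(l2, 2, N)
log_lr = l2 - l1
# The penalty terms cancel: BIC_1 - BIC_2 = 2 * (l2 - l1)
```

For AIC (penalty 2 n_{p} instead of n_{p} ln N) the cancellation is identical.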
In order to perform MLE it is necessary to specify the densities of the distributions, including the normalization factors. In our case, all distributions are defined for M above the completeness threshold a, i.e., for M > a, being zero otherwise (as mentioned above, a is fixed to 5.3 × 10^{17} N · m). For the power-law (PL) distribution (which yields the GR law for the distribution of M), Eq. (1) reads

f_{pl}(M; β) = (β/a)(a/M)^{1+β},
with β > 0. For the tapered Gutenberg-Richter,

f_{tap}(M; β, θ) = (β/M + 1/θ)(a/M)^{β} e^{−(M−a)/θ},
with β > 0 and θ > 0. And for the left-truncated (and extended to β > 0) gamma distribution,

f_{trg}(M; β, θ) = (θ/M)^{1+β} e^{−M/θ} / [θ Γ(−β, a/θ)],
with −∞ < β < ∞ and θ > 0, and with Γ(γ, z) = ∫_{z}^{∞} x^{γ−1}e^{−x}dx the upper incomplete gamma function, defined for z > 0 when γ < 0.
We summarize the parameterization of the densities as f(M; Θ), where Θ = {β, θ} for the Tap and TrG distributions and Θ = β for the power law. Note that for the TrG distribution, it is clear that the exponent β is a shape parameter and θ is a scale parameter; in fact, these parameters play the same role in the Tap distribution, which turns out to be a mixture of two truncated gamma distributions, one with shape parameter β and the other with β − 1, but with common scale parameter θ. Exactly,

f_{tap}(M; β, θ) = (1 − c) f_{trg}(M; β, θ) + c f_{trg}(M; β − 1, θ), with c = e^{a/θ}(a/θ)^{β} Γ(1 − β, a/θ)

(in our case, the contribution of the second TrG will be only about 0.14%). In contrast, the power law lacks a scale parameter. In all cases the completeness threshold a is a truncation parameter, but it is kept fixed and is therefore not a free parameter.
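As an illustration, the three log-likelihood functions can be coded as below; this is a sketch based on our reading of the densities above (function names are ours). Since SciPy provides the upper incomplete gamma function only for positive first argument, Γ(−β, z) is obtained through the recurrence Γ(s, z) = (Γ(s + 1, z) − z^{s}e^{−z})/s:

```python
import numpy as np
from scipy.special import gammaincc, gamma

A = 5.3e17  # completeness threshold a, in N·m

def upper_inc_gamma(s, z):
    # Upper incomplete gamma Γ(s, z) for -1 < s < 0, via the recurrence
    # Γ(s, z) = (Γ(s + 1, z) - z**s * exp(-z)) / s
    return (gammaincc(s + 1.0, z) * gamma(s + 1.0) - z ** s * np.exp(-z)) / s

def loglik_pl(M, beta):
    # f(M) = (beta/a) (a/M)^(1+beta), for M > a
    M = np.asarray(M)
    return np.sum(np.log(beta / A) + (1.0 + beta) * np.log(A / M))

def loglik_tap(M, beta, theta):
    # f(M) = (beta/M + 1/theta) (a/M)^beta exp(-(M - a)/theta)
    M = np.asarray(M)
    return np.sum(np.log(beta / M + 1.0 / theta)
                  + beta * np.log(A / M) - (M - A) / theta)

def loglik_trg(M, beta, theta):
    # f(M) = (theta/M)^(1+beta) exp(-M/theta) / (theta * Γ(-beta, a/theta))
    M = np.asarray(M)
    norm = theta * upper_inc_gamma(-beta, A / theta)
    return (np.sum((1.0 + beta) * np.log(theta / M) - M / theta)
            - M.size * np.log(norm))
```

For θ much larger than the observed moments, both `loglik_tap` and `loglik_trg` reduce numerically to `loglik_pl`, which is the nesting property exploited by the tests below.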
Other authors consider the upper-truncated power-law distribution^{13,26}, given by f(M; β, θ) ∝ 1/M^{1+β} for a < M < θ, and zero otherwise; then θ becomes a truncation parameter. We disregard this model because such an abrupt truncation is unphysical^{7}, because the occurrence of one single earthquake with size larger than the resulting value of θ invalidates the selected model, and because the fact that the support of the distribution involves the unknown parameter θ leads to a violation of the regularity conditions under which standard likelihood theory holds^{25}.
The knowledge of the probability densities allows the direct computation of the likelihood function as

L(Θ) = ∏_{i=1}^{N} f(M_{i}; Θ),

where the M_{i} are the N observational values of the seismic moment. Maximization of the likelihood function with respect to the values of the parameters leads to the maximum-likelihood estimates of these parameters, and to the value of the likelihood at its maximum. Note that the independence assumption that is implicit in the expression for L(Θ) arises in fact as the maximum-entropy solution when there is no information about dependence^{27}. If the data cannot be considered independent, the MLE results will just describe a marginal distribution f(M; Θ) of the sample under consideration, and inference about the underlying population will not be possible, as the sample may not be representative of the population. In any case, the results of MLE for our three models are reported in Table 1, and an illustration of the corresponding fits is provided in the Supplementary Information (SI). Although the TrG model has the highest likelihood, one still has to perform a proper model comparison.
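For the power law, the maximization can be carried out analytically, giving the well-known closed-form estimator β̂ = 1/⟨ln(M_{i}/a)⟩. A quick sketch of a check on synthetic power-law data, generated by inverse-transform sampling (names are ours):

```python
import numpy as np

A = 5.3e17  # completeness threshold a (N·m)

def mle_beta_pl(M):
    # Maximizing L(beta) = prod_i (beta/a)(a/M_i)^(1+beta) gives the
    # closed form beta_hat = 1 / mean(ln(M_i / a))
    return 1.0 / np.mean(np.log(np.asarray(M) / A))

# Synthetic power-law sample: S(M) = (a/M)^beta = u  =>  M = a u^(-1/beta)
rng = np.random.default_rng(42)
M = A * rng.random(50_000) ** (-1.0 / 0.68)
beta_hat = mle_beta_pl(M)   # should be close to the true value 0.68
```

The Tap and TrG likelihoods have no such closed form and must be maximized numerically over (β, θ).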
Model Comparison
A powerful method for the comparison of pairs of models is the likelihood-ratio test, especially suitable when one model is nested within the other, which means that the first model is obtained as a special case of the second one. This is the case of the power-law distribution with respect to the other two distributions; indeed, the power law is nested both within the Tap and within the truncated gamma, as taking θ → ∞ in any of the two leads to the power-law distribution. This is easily seen taking into account that S_{tap}(M) = (a/M)^{β}e^{−(M−a)/θ}, or just taking the limit in the expression for f_{tap}(M) above. For the truncated gamma distribution, when taking the θ → ∞ limit in f_{trg}(M) one needs to use that, for γ < 0 (γ ≠ −1, −2, …), z^{γ}/Γ(γ, z) → −γ when z → 0, see ref. 28.
Given two probability distributions, 1 and 2, with 1 nested within 2, the likelihood-ratio test evaluates the ratio ℒ_{2}/ℒ_{1}, where ℒ_{2} is the likelihood (at its maximum) of the “bigger” or “full” model (either Tap or TrG) and ℒ_{1} corresponds to the nested or null model (power law in our case). Taking logarithms we get the log-likelihood ratio

l_{r} = ln(ℒ_{2}/ℒ_{1}) = l_{2} − l_{1},

with l_{j} = Σ_{i=1}^{N} ln f_{j}(M_{i}; Θ̂_{j}), where f_{j} denotes the probability density function of distribution j for j = 1, 2, and Θ̂_{j} its maximum-likelihood estimate. In order to compare the fit provided by the two distributions, it is necessary to characterize the distribution of l_{r}.
Let n_{1} and n_{2} be the number of free parameters in models 1 and 2, respectively. In general, if the models are nested, and under the null hypothesis that the data come from the simpler model, the probability distribution of the statistic 2l_{r} in the limit N → ∞ is a chi-squared distribution with a number of degrees of freedom equal to n_{2} − n_{1} > 0. So, for n_{2} = n_{1} + 1, model 2 is significantly better than model 1 when

2l_{r} > 3.84,

with a level of risk equal to 0.05 (3.84 being the 0.95 quantile of the chi-squared distribution with one degree of freedom). Note that the chi-squared distribution provides a penalty for model complexity, as the “range” or “scale” of the distribution is given directly by the number of degrees of freedom. This likelihood-ratio test constitutes the best option to choose among models 1 and 2, in the sense that its convergence to the asymptotic distribution is faster than for any other test^{29}. The null and alternative hypotheses correspond to accepting model 1 or 2, respectively, although the acceptance of model 1 does not imply the rejection of 2; it simply means that the “full” model 2 does not bring any significant improvement with respect to the simpler model 1, which is more parsimonious.
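The decision rule just described can be sketched as follows (illustrative log-likelihood values, not taken from Table 1; function names are ours):

```python
from scipy.stats import chi2

def lr_test(l_null, l_full, df=1, alpha=0.05):
    # 2 l_r is compared with a chi-squared distribution with
    # df = n2 - n1 degrees of freedom; the null (simpler) model is
    # rejected when the p-value falls below the risk level alpha.
    stat = 2.0 * (l_full - l_null)
    p_value = chi2.sf(stat, df)
    return stat, p_value, p_value < alpha

# A log-likelihood gain of 2.5 (stat = 5.0 > 3.84) is significant at
# the 0.05 level; a gain of 0.5 (stat = 1.0) is not.
```

For df = 1, `chi2.ppf(0.95, 1)` recovers the critical value 3.84 quoted above.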
On the other hand, when the nesting of distribution 1 within 2 takes place in such a way that the parameter space of the former lies on a boundary of the parameter space of distribution 2, the approach just explained for the asymptotic distribution of 2l_{r} is not valid^{30,31}. This happens when testing both the Tap and the TrG distributions against the power-law distribution, as the θ → ∞ limit of the latter corresponds to the boundary of the parameter space of the two other distributions; then, what one should obtain for 2l_{r} is a mixture of a chi-squared distribution and a Dirac delta function. Nevertheless, this latter result is also inapplicable in our case, as the power-law distribution does not fulfil the sufficient conditions stated in ref. 30, due to the divergence of the second moment^{32}. This illustrates part of the difficulty of performing proper model selection when fractal-like distributions are involved^{33}. In order to obtain the distribution of 2l_{r}, and from there the p-values of the LR tests, we are left with the simulation of the null hypothesis. We advance that the results seem to indicate that the distribution of 2l_{r}, for high percentiles, is close to a chi-squared with one degree of freedom, so that Eq. (10) is approximately valid, but we lack theoretical support for this fact.
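The simulation of the null hypothesis can be sketched as below: synthetic power-law catalogs are generated, both models are fitted to each, and the empirical distribution of 2l_{r} is accumulated. This is an illustrative, much smaller version of the procedure (sample sizes, the Nelder-Mead optimizer, and all names are our choices, not necessarily those of the actual study):

```python
import numpy as np
from scipy.optimize import minimize

A = 5.3e17  # completeness threshold a (N·m)

def loglik_pl(M, beta):
    # log-likelihood of f(M) = (beta/a)(a/M)^(1+beta)
    return np.sum(np.log(beta / A) + (1.0 + beta) * np.log(A / M))

def loglik_tap(M, beta, theta):
    # log-likelihood of f(M) = (beta/M + 1/theta)(a/M)^beta e^(-(M-a)/theta)
    return np.sum(np.log(beta / M + 1.0 / theta)
                  + beta * np.log(A / M) - (M - A) / theta)

def lr_stat_tap_vs_pl(M):
    # 2 l_r = 2 (l_tap - l_pl); the PL MLE is closed-form, the Tap fit
    # is done numerically over (beta, ln theta)
    b_pl = 1.0 / np.mean(np.log(M / A))
    nll = lambda p: (np.inf if p[0] <= 0
                     else -loglik_tap(M, p[0], np.exp(p[1])))
    res = minimize(nll, x0=[b_pl, np.log(10.0 * M.max())],
                   method="Nelder-Mead")
    return max(0.0, 2.0 * (-res.fun - loglik_pl(M, b_pl)))

rng = np.random.default_rng(0)
stats = []
for _ in range(200):                          # 200 synthetic catalogs
    M = A * rng.random(300) ** (-1.0 / 0.68)  # power-law null, N = 300
    stats.append(lr_stat_tap_vs_pl(M))
q95 = np.percentile(stats, 95)  # empirical critical value at risk 0.05
```

The empirical 95th percentile can then be compared with the chi-squared critical value 3.84.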
Let us proceed, using this method, by comparing the performance of the power-law and Tap fits when applied to global shallow seismic activity, for time windows always starting in 1977 and ending at the successive times indexed by the abscissa in Fig. 1(a) (as in ref. 23). The log-likelihood ratio of these fits (times 2) is shown in the figure, together with the critical region of the test. In agreement with Bell et al.^{23}, we find that: (i) the power-law fit can be safely rejected in favour of the Tap distribution for any time window ending between 1984 and 2004; and (ii) the results change drastically after the occurrence of the great 2004 Sumatra earthquake, after which the power law cannot be rejected at the 0.05 level. So, for parsimony reasons, the power law becomes preferable to the Tap distribution for time windows ending later than 2004. The fact that, for these time windows, the Tap distribution cannot be distinguished from the power law is also in agreement with previous results showing that the contour lines in the likelihood maps of the Tap distribution are highly asymmetric and may be unbounded for smaller levels of risk^{7,24,34}.
When we compare the power-law fit with the truncated gamma, using the same test for the same data, the results are more significant, see Fig. 1(b). The situation previous to 2004 is nearly the same, with an extremely poor performance of the power law; but after 2004, despite a big jump again in the value of the likelihood ratio, the power law remains unacceptable at the 0.05 level. It is only after the great Tohoku earthquake of 2011 that the p-value of the test enters slightly into the non-rejection region, keeping values very close to the 0.05 limit. From here we conclude that, in order to find an alternative to the power-law distribution, the truncated gamma distribution is a better option than the Tap distribution, as it is more clearly distinguishable from the power law (for this particular dataset).
At this point, a direct comparison between these two distributions (Tap and TrG) seems pertinent. In this case we may use Vuong's likelihood-ratio test for non-nested models^{35,36}. As the number of parameters is the same for both models, their log-likelihood ratio coincides with the difference in BIC or AIC, but the LR procedure incorporates a statistical test which specifies the distribution of the statistic under consideration. Unfortunately, the results are inconclusive, as no significant difference shows up. This is not surprising if one considers that the LR test for non-nested models is less powerful than the LR test for nested models used above.
In order to check the possible influence of the heterogeneous populations present in global seismicity, associated with different tectonic zones, we have separately analyzed subduction zones, similarly to what is done in ref. 37, using Flinn-Engdahl's regionalization^{6}. The results of the LR tests are qualitatively the same, with the main difference that the values of l_{trg} − l_{pl} become somewhat smaller (not shown); nevertheless, as long as a time window of several years is considered, the power-law hypothesis can always be rejected except after the Tohoku earthquake. The resulting MLE parameters for the TrG are and ( N · m) for N = 4067 events. Then, the slightly larger value of with respect to the global case (Table 1) makes the power law a bit harder to reject.
Simulated Data with Temporal Reshuffling
As we have seen, in contrast to the Tap, the TrG distribution does bring an improvement with respect to the PL, so we concentrate on further comparisons between TrG and PL. With the purpose of gaining further insight, we simulate random samples following the truncated gamma distribution, with the parameters obtained from the MLE of the complete dataset (Table 1), and with the same truncation parameter a and number of points (N = 6150). To avoid the conclusions depending on the time correlation of magnitudes in the empirical data, we reshuffle the simulated data in such a way that the temporal occurrence of the order statistics of the seismic moment is the same as for the empirical data; in other words, the largest simulated event is assigned to take place at the time of the 2011 Tohoku earthquake (the largest of the CMT catalog^{23}), the second largest at the time of the 2004 Sumatra event, and so on. In this way, we model earthquake seismic moments as arising from a gamma distribution with fixed parameters, with occurrence times given by the empirical times, and with practically the same seismic-moment correlations as the empirical data.
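The simulation-plus-reshuffling step can be sketched as below. Rejection sampling from the power law is our choice of sampler, not necessarily the one used in the actual study, and all names are ours:

```python
import numpy as np

A = 5.3e17                        # completeness threshold a (N·m)
BETA, MC = 0.68, 9.15             # TrG parameters from Table 1
THETA = 10.0 ** (1.5 * MC + 9.1)  # corner moment for corner magnitude m_c

def sample_trg(n, beta, theta, rng):
    # Rejection sampling: propose from the pure power law (inverse
    # transform) and accept with probability exp(-(M - a)/theta); this
    # targets the TrG density ∝ M^(-1-beta) exp(-M/theta) on M > a.
    out = np.empty(0)
    while out.size < n:
        M = A * rng.random(n) ** (-1.0 / beta)
        keep = rng.random(n) < np.exp(-(M - A) / theta)
        out = np.concatenate([out, M[keep]])
    return out[:n]

def reshuffle_like(sim, emp):
    # Place the k-th largest simulated value at the position (time slot)
    # of the k-th largest empirical value, so the temporal occurrence of
    # the order statistics matches the empirical catalog.
    ranks = np.argsort(np.argsort(emp))
    return np.sort(sim)[ranks]

rng = np.random.default_rng(0)
sim = sample_trg(6150, BETA, THETA, rng)  # one synthetic catalog
```

With the empirical catalog at hand, `reshuffle_like(sim, M_emp)` would return the simulated moments arranged in the empirical temporal order.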
We simulate 1000 datasets with N = 6150 each. The results, summarized in Fig. 2 using boxplots^{38}, show that the behaviour of the empirical data is not atypical in comparison with this gamma modelling. In nearly all time windows the empirical value lies between the first and third quartiles of the simulated data, although before 2004 the empirical values are close to the third quartile, whereas after 2004 they lie just below the median. This leads us to compute the statistics of the jump in the log-likelihood ratio between 2004 and 2005. The estimated probability of a jump larger than the empirical value is around 4.5%, which is not far from what one could accept under the gamma modelling explained above. Thus, a TrG distribution with fixed parameters is able to reproduce the empirical findings, if the peculiar time ordering of the magnitudes of the real events is taken into account. Notice also that, although the simulated data come from a TrG distribution, they are not distinguishable from a power law for about half of the simulations of the last time windows, as the critical region is close to the median indicated by the boxplots.
We can also compare the evolution of the estimated parameters for the empirical dataset and for the reshuffled TrG simulations, again with good agreement, see Fig. 3. There, it is clear that although the exponent β reaches very stable values relatively soon (around 1990), the scale parameter θ (equivalent to m_{c}) is largely unstable, and the occurrence of the biggest events makes its value increase.
As a complementary check we invert the situation, simulating 1000 synthetic power-law datasets with β = 0.685 (Table 1), a = 5.3 × 10^{17} N · m, and N = 6150, on which the same time reshuffling is performed, in such a way that the temporal order of the order statistics is preserved. In this case, the results of the simulations lead, on average, to much smaller values of the log-ratio than the empirical data, which lie at the limit of rejection for many time windows, see Fig. 4. So, a power-law distribution with temporal reshuffling cannot account for the empirical results as clearly as a truncated gamma distribution. Doing the same with a Tap distribution, one finds something in between, see SI.
Discussion
Testing different statistical models for the distribution of the seismic moment of global shallow seismicity (using the CMT catalog), we have found that, in contrast to the Tap distribution, the truncated gamma brings a significant improvement with respect to the power law. Moreover, in order to reproduce the time evolution of the statistical results, it suffices that independent seismic moments following a truncated gamma distribution with fixed parameters β = 0.68 and m_{c} = 9.15 are reshuffled so that the peculiar empirical time sequence of magnitudes is maintained (note that after reshuffling independence is broken). So, despite the fact that the future occurrence of more and larger mega-earthquakes could significantly change the value of the parameter m_{c}^{13}, the current value is enough to explain the available data. Although ref. 13 claims that no fewer than 45,000 events are necessary for the reliable estimation of m_{c}, our simulations with 6150 events indicate otherwise, see for instance the last boxplot for the estimation of m_{c} in Fig. 3, which yields a mean value of 9.11, with a standard deviation of 0.24, totally consistent with the results in Table 1. We conclude that the fundamental problem in the estimation of m_{c} is not the number of available data but the temporal heterogeneity of the seismic moment distribution. We have also found, with a similar reshuffling procedure, that a power-law distribution cannot account for the empirical findings. Direct comparison of Figs 2 and 4 shows how the TrG distribution outperforms the power law. Additionally, it would be very interesting to investigate whether the high values of the likelihood ratio attained before the 2004 Sumatra event could be employed to detect the end of periods of low global seismic activity. Certainly, more case studies would be necessary for that purpose.
As extra arguments in favour of the truncated gamma distribution over the tapered GR, we can bring not statistical evidence but physical plausibility and statistical optimality. On the one hand, the former distribution can be justified as arising from a branching process that is slightly below its critical point^{12,39}. Further reasons that may support the truncated gamma are that it arises (i) as the maximum-entropy outcome under the constraints of a fixed (arithmetic) mean and a fixed geometric mean of the seismic moment^{40}; (ii) as the closest distribution to the power law, in terms of the Kullback-Leibler divergence, when the mean seismic moment is fixed^{41}; and (iii) as a stable distribution under a fragmentation process with a power-law transition rate^{41}. We are not aware of similar theoretical support in favour of the Tap distribution. On the other hand, it is straightforward to check that the truncated gamma belongs to the exponential family^{25}, in contrast to the Tap distribution. And it is well known that estimators in the exponential family achieve the Cramér-Rao lower bound for any sample size, in contrast to other regular models, where the bound is only achieved asymptotically.
Additional Information
How to cite this article: Serra, I. and Corral, Á. Deviation from power law of the global seismic moment distribution. Sci. Rep. 7, 40045; doi: 10.1038/srep40045 (2017).
Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
References
1. T. Utsu. Representation and analysis of earthquake size distribution: a historical review and some new approaches. Pure Appl. Geophys. 155, 509–535 (1999).
2. B. D. Malamud. Tails of natural hazards. Phys. World 17(8), 31–35 (2004).
3. P. Bak. How Nature Works: The Science of Self-Organized Criticality. Copernicus, New York (1996).
4. L. Knopoff & Y. Kagan. Analysis of the theory of extremes as applied to earthquake problems. J. Geophys. Res. 82, 5647–5657 (1977).
5. A. Corral. Scaling and universality in the dynamics of seismic occurrence and beyond. In A. Carpinteri & G. Lacidogna, editors, Acoustic Emission and Critical Phenomena, pages 225–244. Taylor and Francis, London (2008).
6. Y. Y. Kagan. Universality of the seismic moment-frequency relation. Pure Appl. Geophys. 155, 537–573 (1999).
7. Y. Y. Kagan. Seismic moment distribution revisited: I. Statistical results. Geophys. J. Int. 148, 520–541 (2002).
8. A. Deluca & A. Corral. Fitting and goodness-of-fit test of non-truncated and truncated power-law distributions. Acta Geophys. 61, 1351–1394 (2013).
9. Y. Y. Kagan. Earthquakes: Models, Statistics, Testable Forecasts. Wiley (2014).
10. D. Vere-Jones. A branching model for crack propagation. Pure Appl. Geophys. 114, 711–725 (1976).
11. I. Main. Statistical physics, seismogenesis, and seismic hazard. Rev. Geophys. 34, 433–462 (1996).
12. A. Corral & F. Font-Clos. Criticality and self-organization in branching processes: application to natural hazards. In M. Aschwanden, editor, Self-Organized Criticality Systems, pages 183–228. Open Academic Press, Berlin (2013).
13. G. Zöller. Convergence of the frequency-magnitude distribution of global earthquakes: Maybe in 200 years. Geophys. Res. Lett. 40, 3873–3877 (2013).
14. D. Vere-Jones, R. Robinson & W. Yang. Remarks on the accelerated moment release model: problems of model formulation, simulation and estimation. Geophys. J. Int. 144(3), 517–531 (2001).
15. H. Kanamori. The energy release in great earthquakes. J. Geophys. Res. 82(20), 2981–2987 (1977).
16. H. Kanamori & E. E. Brodsky. The physics of earthquakes. Rep. Prog. Phys. 67, 1429–1496 (2004).
17. G. Ekström, M. Nettles & A. M. Dziewonski. The global CMT project 2004–2010: Centroid-moment tensors for 13,017 earthquakes. Phys. Earth Planet. Int. 200–201, 1–9 (2012).
18. T. A. Chou, A. M. Dziewonski & J. H. Woodhouse. Determination of earthquake source parameters from waveform data for studies of global and regional seismicity. J. Geophys. Res. 86, 2825–2852 (1981).
19. C. Godano & F. Pingue. Is the seismic moment-frequency relation universal? Geophys. J. Int. 142, 193–198 (2000).
20. Y. Y. Kagan. Earthquake size distribution: Power-law with exponent β ≡ 1/2? Tectonophys. 490, 103–114 (2010).
21. T. Lay. Why giant earthquakes keep catching us out. Nature 483, 149–150 (2012).
22. I. G. Main, L. Li, J. McCloskey & M. Naylor. Effect of the Sumatran mega-earthquake on the global magnitude cut-off and event rate. Nature Geosci. 1, 142 (2008).
23. A. F. Bell, M. Naylor & I. G. Main. Convergence of the frequency-size distribution of global earthquakes. Geophys. Res. Lett. 40, 2585–2589 (2013).
24. E. L. Geist & T. Parsons. Undersampling power-law size distributions: effect on the assessment of extreme natural hazards. Nat. Hazards 72, 565–595 (2014).
25. Y. Pawitan. In All Likelihood: Statistical Modelling and Inference Using Likelihood. Oxford UP, Oxford (2001).
26. M. Holschneider, G. Zöller & S. Hainzl. Estimation of the maximum possible magnitude in the framework of a doubly truncated Gutenberg-Richter model. Bull. Seismol. Soc. Am. 101(4), 1649–1659 (2011).
27. T. Broderick, M. Dudík, G. Tkačik, R. E. Schapire & W. Bialek. Faster solutions of the inverse pairwise Ising problem. arXiv:0712.2437 (2007).
28. NIST Digital Library of Mathematical Functions. http://dlmf.nist.gov/8.7#E3 (2014).
29. P. McCullagh & D. R. Cox. Invariants and likelihood ratio statistics. Ann. Statist. 14(4), 1419–1430 (1986).
30. S. G. Self & K.-Y. Liang. Asymptotic properties of maximum likelihood estimators and likelihood ratio tests under nonstandard conditions. J. Am. Stat. Assoc. 82, 605–610 (1987).
31. C. J. Geyer. On the asymptotics of constrained M-estimation. Ann. Statist. 22(4), 1993–2010 (1994).
32. J. del Castillo & A. López-Ratera. Saddlepoint approximation in exponential models with boundary points. Bernoulli 12(3), 491–500 (2006).
33. Y. Y. Kagan. Why does theoretical physics fail to explain and predict earthquake occurrence? In P. Bhattacharyya & B. K. Chakrabarti, editors, Modelling Critical and Catastrophic Phenomena in Geoscience, Lecture Notes in Physics 705, pages 303–359. Springer, Berlin (2006).
34. Y. Y. Kagan & F. Schoenberg. Estimation of the upper cutoff parameter for the tapered Pareto distribution. J. Appl. Probab. 38A, 158–175 (2001).
35. Q. H. Vuong. Likelihood ratio tests for model selection and non-nested hypotheses. Econometrica 57(2), 307–333 (1989).
36. A. Clauset, C. R. Shalizi & M. E. J. Newman. Power-law distributions in empirical data. SIAM Rev. 51, 661–703 (2009).
37. Y. Y. Kagan, P. Bird & D. D. Jackson. Earthquake patterns in diverse tectonic zones of the globe. Pure Appl. Geophys. 167(6), 721–741 (2010).
38. B. Rosner. Fundamentals of Biostatistics. Cengage Learning, Boston, 8th edition (2016).
39. K. Christensen & N. R. Moloney. Complexity and Criticality. Imperial College Press, London (2005).
40. I. G. Main & P. W. Burton. Information theory and the earthquake frequency-magnitude distribution. Bull. Seismol. Soc. Am. 74(4), 1409–1426 (1984).
41. D. Sornette & A. Sornette. General theory of the modified Gutenberg-Richter law for large seismic moments. Bull. Seismol. Soc. Am. 89(4), 1121–1130 (1999).
42. G. Casella & R. L. Berger. Statistical Inference. Duxbury, Pacific Grove CA, 2nd edition (2002).
Acknowledgements
We are grateful to J. del Castillo, Y. Y. Kagan, I. G. Main, M. Naylor, and F. Schoenberg for their feedback. Research expenses were funded by projects FIS2012-31324, FIS2015-71851-P, and MTM2012-31118 from the Spanish MINECO, 2014SGR-1307 from AGAUR, and the Collaborative Mathematics Project from La Caixa Foundation (I.S.).
Author information
Contributions
Both authors discussed the problem. I.S. performed the statistical analysis. A.C. wrote a draft of the paper.
Ethics declarations
Competing interests
The authors declare no competing financial interests.
Supplementary information
Rights and permissions
This work is licensed under a Creative Commons Attribution 4.0 International License. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/