Testing the power-law hypothesis of the interconflict interval

War is an extreme form of collective human behaviour characterized by coordinated violence. We show that this nature of war is substantiated in the temporal patterns of conflict occurrence, which obey a power law. The focal metric is the interconflict interval (ICI), the interval between the end of a conflict in a dyad (i.e. a pair of states) and the start of the subsequent conflict in the same dyad. Using elaborate statistical tests, we confirm that ICI samples compiled from the history of interstate conflicts from 1816 to 2014 follow a power-law distribution. We then demonstrate that the power-law properties of ICIs can be explained by a hypothetical model assuming an information-theoretic formulation of the Clausewitz thesis on war: the use of force is a means of interstate communication. Our findings help us to understand the nature of wars between regular states, the significance of which has increased since the Russian invasion of Ukraine in 2022.


Introduction
Investigating the statistical patterns behind historical cases of interstate conflicts is pivotal, especially from the following perspectives. First, if identified, these patterns encourage an inductive approach to conflict mechanisms. This approach, used to formulate hypotheses based on experimental or empirical observations and to test their predictions, has been reliably employed in the natural sciences. In international relations theory, in contrast, the theory-based approach, which attempts to derive a novel theory from plausible assumptions by deduction, has been more favored for solving, for instance, war's inefficiency puzzle (Fearon 1995). Second, robust statistical patterns are a prerequisite for estimating the probability of the future occurrence of an interstate conflict in a specified dyad (a pair of states) (Herge et al. 2017). Forecasting the future occurrence of an interstate conflict is beneficial for scholarly understanding of conflict processes and for supporting policymaking by international organizations to prevent, manage, or resolve armed conflict, or by individual states to establish national security. From this practical perspective, finding robust statistical patterns in any aspect of interstate conflict is desirable.
In fact, the prediction of event occurrences, such as war, alliance formation, and revolutions, is at the heart of international relations theory, and scholars in the discipline have realized that a proper arrangement of time is required to achieve sophisticated predictions (Beck et al. 1998; Crescenzi & Enterline 2001). Whether we should explicitly theorize the relationship between time and the events under study (Crescenzi & Enterline 2001; Carter & Signorino 2010) or not (Beck 2010), it is now standard practice to account for time in modeling the occurrence of events by including splines or polynomials (Beck et al. 1998; Carter & Signorino 2010) or by using Cox duration models (Metzger & Jones 2022). These attempts suggest that time plays a vital role in international events, the most notable of which is war. Therefore, exploring the temporal structures of conflict processes is a prerequisite to understanding and predicting wars.
Notably, scholars of complexity science, rather than those of mainstream international relations theory, have enthusiastically sought the statistical patterns of conflict. Outstanding findings were made by the English physicist Lewis Fry Richardson more than three-quarters of a century ago (Richardson 1948; Richardson 1960). He found that the severity of war, measured by battle deaths, followed the power law. These findings were later confirmed in more detailed studies.
This study shows that the power-law property also resides in the temporal aspects of interstate conflicts. This study focuses on the temporal structure of conflict occurrence rather than the spatial structure, such as the scale of war, because the temporal structure is thought to be more fundamentally related to decision-making by states regarding military action, which is the core process of conflict occurrence. Additionally, robust statistical patterns in the temporal structure can be used more directly to predict the future occurrence of conflict because prediction is a task along the temporal dimension. As a key quantity in our analysis, we define the inter-conflict interval (ICI) as the interval between the end of a conflict in a dyad and the start of the subsequent conflict in the same dyad. ICI samples eligible for statistical analysis were obtained from a dataset provided by the Correlates of War Project. Using an elaborate statistical method to test the power-law hypothesis, we confirmed that the ICIs follow a power-law distribution. We formulated a hypothetical model that accounted for the power-law distribution of ICIs under minimal assumptions. We then tested the prediction of this model, which states that the power law holds true for individual dyads.

Results
Terminology: interstate war, militarized interstate dispute, and interstate conflict
First, we specify the definitions of interstate wars, militarized interstate disputes, and interstate conflicts. Precise definitions of the first two terms are provided by the Correlates of War (COW) Project (https://correlatesofwar.org/). An interstate war is a series of sustained battles between the armed forces of two or more states that have resulted in at least 1,000 battle deaths (Maoz et al. 2019). A militarized interstate dispute is a historical case of conflict in which the threat, display, or use of military force short of interstate war by one state is explicitly directed towards the government, official representatives, official forces, property, or territory of another state (Jones et al. 1996; Maoz et al. 2019). In this study, we use the third term, interstate conflict, to express the union of militarized interstate disputes and interstate wars.
In a militarized interstate dispute or an interstate war, military action taken by one or both states is often preceded by political issues between them, such as conflicting national interests or disagreements over foreign policy. Previous studies on the severity of interstate conflict have focused only on interstate wars (Cederman et al. 2011; Cirrilo & Taleb 2016; Clauset 2018). Including both militarized interstate disputes and interstate wars in our analysis is appropriate if we take the view that the use of force, in any form, is a way to resolve international issues. Our view resonates with the famous thesis of Prussian general and war philosopher Carl von Clausewitz (Howard 2002): "WAR IS MERELY THE CONTINUATION OF POLICY BY OTHER MEANS" (Clausewitz 1832). He derived a corollary from this thesis, arguing that: "The political object---the original motive of the war---will thus determine both the military objective to be reached and the amount of effort it requires." (Clausewitz 1832). The use of force in an actual war must be proportional to the political objectives. Thus, Clausewitz's arguments motivated us to address militarized interstate disputes (MIDs) and interstate wars without distinction.

Dataset of interstate conflict
We used Dyadic MID Version 4.02 (MID4.02), a dataset provided by the COW Project (Maoz et al. 2019). The dataset records interstate conflicts between 1816 and 2014. Each interstate conflict is specified by a dyad (a pair of states) engaged in the conflict and the start and end dates of the conflict.

Inter-conflict interval
We sought to identify robust statistical patterns behind the temporal structure of the occurrence of interstate conflicts. The inter-conflict interval (ICI) is the critical quantity to this end and is defined as the interval between the end of a conflict in a dyad and the start of the next conflict in the same dyad (Fig. 1). We obtained 2,369 ICI samples from MID4.02, each measured in days. These ICI samples were collected from all dyads.
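The definition above can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used to build the 2,369 samples: the record layout `(state_a, state_b, start, end)`, the function name `interconflict_intervals`, and the rule of skipping overlapping conflicts are all assumptions made here, and the three records are invented placeholders rather than MID4.02 entries.

```python
from collections import defaultdict
from datetime import date

def interconflict_intervals(conflicts):
    """Map each dyad to its list of ICIs in days.

    `conflicts` holds hypothetical (state_a, state_b, start, end) records.
    """
    by_dyad = defaultdict(list)
    for state_a, state_b, start, end in conflicts:
        # A dyad is unordered, so (A, B) and (B, A) share one key.
        by_dyad[tuple(sorted((state_a, state_b)))].append((start, end))
    icis = {}
    for dyad, spans in by_dyad.items():
        spans.sort()  # chronological order by start date
        gaps = []
        for (_, prev_end), (next_start, _) in zip(spans, spans[1:]):
            gap = (next_start - prev_end).days
            if gap > 0:  # assumption: overlapping conflicts yield no ICI
                gaps.append(gap)
        icis[dyad] = gaps
    return icis

# Invented records for two placeholder states "A" and "B".
records = [
    ("A", "B", date(2000, 3, 1), date(2000, 3, 10)),
    ("B", "A", date(2000, 1, 1), date(2000, 1, 31)),
    ("A", "B", date(2001, 3, 10), date(2001, 4, 1)),
]
ici = interconflict_intervals(records)
```

Pooling the per-dyad lists gives the all-dyad sample, and keeping them separate gives the inputs for a dyad-level analysis.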

Testing the power-law hypothesis of ICIs
First, the distribution of these 2,369 ICIs binned with a width of 365.25 days (~one year) was examined in log-log and linear-log plots. Falling into a straight line in a log-log or linear-log plot is characteristic of a power-law or exponential distribution, respectively. The linear regression results in both plots suggested that the ICIs followed a power-law distribution ($R^2 = 0.9179$, Fig. 2a) instead of an exponential distribution ($R^2 = 0.6905$, Fig. 2b). We also examined the power-law fitting using a more elaborate statistical test proposed by Clauset et al. (2009), which we call the Clauset-Shalizi-Newman (CSN) test. The empirical distributions of the ICI samples would have an artificial upper bound because of the limitations of the recording period, even if they were generated from power-law distributions with infinitely extended tails (see Materials and Methods for further details). To consider the possible existence of artificial upper bounds, we used a modified version of the CSN test (mCSN test).
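The first, coarse check described above (straight-line fits in log-log and linear-log coordinates) can be sketched as follows. The function names and the synthetic counts are assumptions for illustration; the counts are constructed to decay exactly as a power law and are not the binned MID4.02 data.

```python
import math

def r_squared(xs, ys):
    """Coefficient of determination of a least-squares straight-line fit."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy * sxy / (sxx * syy)

def compare_plots(bin_centers, counts):
    """R^2 in the log-log plot (power law) and the linear-log plot (exponential)."""
    kept = [(x, c) for x, c in zip(bin_centers, counts) if c > 0]  # log needs c > 0
    xs = [x for x, _ in kept]
    log_xs = [math.log10(x) for x in xs]
    log_cs = [math.log10(c) for _, c in kept]
    return {
        "log-log": r_squared(log_xs, log_cs),
        "linear-log": r_squared(xs, log_cs),
    }

# Synthetic counts following an exact power law (illustration only).
centers = [float(k) for k in range(1, 51)]
counts = [1000.0 * k ** -1.3 for k in centers]
scores = compare_plots(centers, counts)
```

A higher R² in the log-log plot only suggests a power law; it is the likelihood-based test that carries the statistical weight.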
The power-law hypothesis to be examined using the mCSN test is mathematically expressed as follows: $p(\tau) = \tau^{-\alpha} / Z(\alpha)$ ($\tau_{\min} \le \tau \le \tau_{\max}$). Here, the argument $\tau$ takes an integer value; $\alpha$ is the exponent of the power-law distribution; $\tau_{\min}$ and $\tau_{\max}$ are the lower and upper bounds of the range where the power law holds, respectively; and $Z(\alpha)$ is the normalization constant, equal to the partition function. The details of the mCSN test are provided in the Materials and Methods. In brief, we first estimated the exponent $\alpha$ and lower bound $\tau_{\min}$, and then calculated the $p$-value. Let $\hat{\alpha}$ and $\hat{\tau}_{\min}$ be the estimated values of $\alpha$ and $\tau_{\min}$, respectively. The upper bound $\tau_{\max}$ was used as the control parameter. Therefore, $\hat{\alpha}$ and $\hat{\tau}_{\min}$ as well as the $p$-value were given as functions of $\tau_{\max}$. Clauset et al. (2009) proposed a conservative decision criterion: if $p \le 0.1$, the power-law hypothesis is ruled out; otherwise, it is plausible. The same criterion was used in this study.
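To make the estimation step concrete, the following sketch fits the exponent of a discrete power law on a bounded integer range by maximum likelihood. It covers only the exponent estimate for fixed bounds; the search over the lower bound and the p-value calculation of the mCSN test are not reproduced. The function names, the golden-section search, and the synthetic sample are assumptions made here for illustration.

```python
import math
import random

def log_likelihood(alpha, sum_log, n, tau_min, tau_max):
    """Log-likelihood of p(tau) = tau**-alpha / Z(alpha) on tau_min..tau_max."""
    z = sum(tau ** -alpha for tau in range(tau_min, tau_max + 1))
    return -alpha * sum_log - n * math.log(z)

def fit_exponent(samples, tau_min, tau_max, lo=0.01, hi=5.0, tol=1e-5):
    """Maximum-likelihood exponent via golden-section search.

    The log-likelihood is concave in alpha, so a bracketing search suffices.
    """
    data = [t for t in samples if tau_min <= t <= tau_max]
    n = len(data)
    sum_log = sum(math.log(t) for t in data)
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - ratio * (b - a)
        d = a + ratio * (b - a)
        f_c = log_likelihood(c, sum_log, n, tau_min, tau_max)
        f_d = log_likelihood(d, sum_log, n, tau_min, tau_max)
        if f_c < f_d:
            a = c  # maximum lies in [c, b]
        else:
            b = d  # maximum lies in [a, d]
    return (a + b) / 2.0

def sample_discrete_power_law(n, alpha, tau_min, tau_max, rng):
    """Draw n integers from the bounded discrete power law."""
    values = list(range(tau_min, tau_max + 1))
    weights = [v ** -alpha for v in values]
    return rng.choices(values, weights=weights, k=n)

# Synthetic data with a known exponent of 1.5 (illustration only).
rng = random.Random(0)
synthetic = sample_discrete_power_law(2000, 1.5, 1, 1000, rng)
alpha_hat = fit_exponent(synthetic, 1, 1000)
```

Including the upper bound in the normalization sum is what distinguishes this likelihood from the unbounded case.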
The results of the mCSN tests are shown in Fig. 3. The $p$-value exceeded the criterion of 0.1 (indicated by the horizontal dashed line in Fig. 3a) for $\tau_{\max}$ up to slightly longer than 20,000 days (~55 years) (Fig. 3a). In Figs. 3b and c, we observe that $\hat{\alpha}$ and $\hat{\tau}_{\min}$ are almost constant with respect to $\tau_{\max}$; $\hat{\alpha}$ is approximately 1.3 and $\hat{\tau}_{\min}$ is approximately 250 days (<1 year). From these observations, we conclude that the ICI obeys the power law for the range of 250-20,000 days. Approximately 80% of the ICI samples were within this range for $\tau_{\max} = 20{,}000$ (Fig. 3d).

Information-theoretic model of interstate conflict
Next, we built a hypothetical model that accounted for the observed power-law properties of ICIs. Consider a dyad of states A and B. Suppose that the $n$-th conflict $C_n$ is provoked by either state. We refer to the state that triggers conflict as the provoker and the opponent state as the withstander. Conflict $C_n$ is characterized by the time of its occurrence and the military actions taken during the conflict. Let this time and the military actions be represented by stochastic variables $T_n$ and $M_n$, respectively. For simplicity, we assume that the period bounded by the start and end of a conflict contracts to a point. Therefore, $T_n$ takes the real value $t_n \in \mathbb{R}^1$. In contrast, corresponding to the various possibilities of the course of a war, $M_n$ would take multidimensional values $m_n$ that would be categorical or numerical. Furthermore, each military action may be led by either the provoker or the withstander, as contingent switching between offense and defense is common during the course of war. Nevertheless, in the following discussion, we formally deal with $M_n$ without addressing its mathematical details.
After the settlement of conflict $C_n$, a post-conflict order is established, whether or not it is what the provoker desires. Then, either state, if discontent with this order and wishing to change it to one more favorable to itself, intends to provoke the next conflict $C_{n+1}$. The provoker of conflict $C_{n+1}$ may or may not be the same as that of conflict $C_n$. The time of conflict $C_{n+1}$ and military actions taken during this conflict are represented by the stochastic variables $T_{n+1}$ and $M_{n+1}$, respectively.
Fig. 4: (a) The left-hand side shows a graphical model of the causal relationships between $T_n$, $M_n$, $T_{n+1}$, and $M_{n+1}$. The right-hand side describes our hypothesis that the amount of information transferred from $\{T_n, M_n\}$ to $\{T_{n+1}, M_{n+1}\}$ is equivalent to the amount of information mutually exchanged between the two states through their engagement in consecutive conflicts $C_n$ and $C_{n+1}$. (b) A graphical model representing the causal relation between $T_n$ and $T_{n+1}$, which is obtained by marginalizing the graphical model on the left side of (a) over $M_n$ and $M_{n+1}$ and corresponds to the probability $p(t_n, t_{n+1})$.
The end (purpose) of war is to attain a political objective, and military action is a means to achieve this objective. Both the provoker and the withstander conceive their own purposes. The provoker's purpose is to compel the other to submit to its will, whereas the withstander's purpose is to compel the provoker to withdraw. The variable $T_n$ describes when a political disagreement between the two states becomes critical and either or both states decide to resolve this by force. In this respect, $T_n$ reflects the purpose of the war. In fact, $T_n$ encodes when the purpose is conceived but does not say what it is. Because the means should be aligned with the purpose, $M_n$, rather than $T_n$, reflects what the purpose is. As conceiving a purpose precedes choosing the means, $T_n$ causally precedes $M_n$. In summary, the causal relationships between $T_n$, $M_n$, $T_{n+1}$, and $M_{n+1}$ are expressed by the graphical model shown on the left side of Fig. 4a, which corresponds to the joint probability $p(t_n, m_n, t_{n+1}, m_{n+1})$.
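According to information theory, the amount of information carried by stochastic variables $\{T_n, M_n\}$ is measured by entropy:
$$H[T_n, M_n] = -\int dt_n\, dm_n\, p(t_n, m_n) \log p(t_n, m_n). \tag{1}$$
The amount of information successfully received by the stochastic variables $\{T_{n+1}, M_{n+1}\}$ out of the total sent by $\{T_n, M_n\}$, which is precisely the entropy $H[T_n, M_n]$, is measured by mutual information:
$$I[T_n, M_n; T_{n+1}, M_{n+1}] = \int dt_n\, dm_n\, dt_{n+1}\, dm_{n+1}\, p(t_n, m_n, t_{n+1}, m_{n+1}) \log \frac{p(t_n, m_n, t_{n+1}, m_{n+1})}{p(t_n, m_n)\, p(t_{n+1}, m_{n+1})}. \tag{2}$$
It is reasonable to hypothesize that the amount of information transferred from $\{T_n, M_n\}$ to $\{T_{n+1}, M_{n+1}\}$ corresponds to the amount of information mutually exchanged between the two states through their engagement in consecutive conflicts $C_n$ and $C_{n+1}$ (see the illustration on the right-hand side of Fig. 4a).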
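For readers who prefer a numerical picture of these two quantities, the sketch below evaluates entropy and mutual information for a small discrete joint distribution. The discretization into a probability table and the function names are assumptions for illustration; the model itself works with the variables described above, not with 2-by-2 tables.

```python
import math

def entropy(joint):
    """Shannon entropy (in nats) of a joint distribution given as a table."""
    return -sum(p * math.log(p) for row in joint for p in row if p > 0)

def mutual_information(joint):
    """Mutual information (in nats) between the row and column variables."""
    row_marginal = [sum(row) for row in joint]
    col_marginal = [sum(col) for col in zip(*joint)]
    total = 0.0
    for i, row in enumerate(joint):
        for j, p in enumerate(row):
            if p > 0:
                total += p * math.log(p / (row_marginal[i] * col_marginal[j]))
    return total

# Fully dependent variables: everything sent is received.
dependent = [[0.5, 0.0], [0.0, 0.5]]
# Independent variables: nothing is received.
independent = [[0.25, 0.25], [0.25, 0.25]]

mi_dependent = mutual_information(dependent)
mi_independent = mutual_information(independent)
```

In the dependent case the mutual information equals the entropy of either variable (log 2), which is the sense in which all of the information sent is received; in the independent case it vanishes.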
Equation (2) can be arranged into a sum of two terms (Eq. (3)). Thus, the total amount of information exchanged between the two states through their engagement in consecutive conflicts $C_n$ and $C_{n+1}$ is equivalent to the sum of the mutual information $I[T_n, T_{n+1}]$, exchanged at the national strategy level, and a second term, exchanged at the military operation level. The latter is particularly relevant to the extent to which battle lessons from military operations taken during conflict $C_n$ influence those taken during conflict $C_{n+1}$. We considered conflicts to be event units, and the success or failure of military operations conducted during each conflict was outside the scope of this study. That is, our main interest is the communication between the two states at the national strategy level. Therefore, we focus on $I[T_n, T_{n+1}]$. In doing so, we marginalize the graphical model in Fig. 4a over $M_n$ and $M_{n+1}$ to obtain the graphical model in Fig. 4b, which corresponds to the probability $p(t_n, t_{n+1}) = \int dm_n\, dm_{n+1}\, p(t_n, m_n, t_{n+1}, m_{n+1})$.
Thus, our consideration leads to the intriguing notion that the amount of information exchanged between the two states at the national strategy level depends only on the relative timing of their engagement in consecutive conflicts. In the present study, we followed this notion without further verification. Future studies should address historical cases of interstate conflicts to verify this notion empirically.
The interval between conflicts $C_n$ and $C_{n+1}$, now given by $T_{n+1} - T_n$, also serves as a stochastic variable. Once $p(t_n, t_{n+1})$, the joint probability of $T_n$ and $T_{n+1}$, is known, $p(t_{n+1} - t_n)$, the distribution of $T_{n+1} - T_n$, can be easily calculated. Therefore, we determined the functional form of $p(t_n, t_{n+1})$. Information theory states that a probability distribution that actually exists maximizes entropy. In general, entropy maximization is performed under constraints that specify the objects or phenomena of interest. To define the constraints in our case, we assume that states A and B, struggling with their national interests and survival, will behave according to the trade-off between the principle of promptness and the principle of seriousness.
The need for the first principle of promptness can be easily understood. Suppose that the status quo is unfavorable for state A. The longer this status continues, the more state A will incur losses in national interest. To prevent further losses, state A intends to take military action in any form against state B to change the status quo as promptly as possible. The principle of promptness implies a behavioral tendency to avoid wasting time.
The second principle, seriousness, is closely related to communication in an information-theoretic sense. Recall Clausewitz's fundamental thesis: "War is merely the continuation of policy by other means." We now interpret this thesis from the perspective of modern information theory, rephrasing it as follows. The use of military force is a means of interstate communication. To formulate interstate communication through force within the framework of information theory, it is useful to note Clausewitz's argument:
"War is no pastime; it is no mere joy in daring and winning, no place for irresponsible enthusiasts. It is a serious means to a serious end, …" (Clausewitz 1832). This implies that the state responds seriously to an opponent's move. (Serious responses do not necessarily mean rational responses; see Discussion.) From an information-theoretic perspective, a pair of states acts in such a way that the communication between variables $\{T_n, M_n\}$ and $\{T_{n+1}, M_{n+1}\}$ (Fig. 4a) is as efficient as possible. Even after marginalization (Fig. 4b), the remaining variables $T_n$ and $T_{n+1}$ should be as mutually dependent as possible. The principle of seriousness implies that there is no room for behavioral redundancy in the theater, where rival states act per their national interests and survival.
To achieve 'a serious means to a serious end,' the principle of promptness alone is inadequate. Suppose that conflict occurs at a high frequency, following this principle; however, the timing of each conflict occurrence is statistically independent of that before it (this is the case if a conflict occurs following a Poisson process). This implies that conflict occurs only erratically, which is the opposite of seriousness.
The constraints for entropy maximization to determine the functional form of $p(t_n, t_{n+1})$ are defined by the principles above. For mathematical simplicity, we consider the case where $t_n$ and $t_{n+1}$ take continuous values: $-\infty < t_n < t_n + \Delta \le t_{n+1} < +\infty$, where $\Delta$ (> 0) is the minimum length of ICI. The constraint representing the principle of promptness is defined as the force required to reduce $T_{n+1} - T_n$. Because $T_{n+1} - T_n$ is a stochastic variable, its statistical mean is reduced, not its raw value. There are several types of statistical means, such as the arithmetic or geometric mean. Therefore, the question arises: what kind of statistical mean should we choose? More specifically, what kind of statistical mean of $T_{n+1} - T_n$ do states act to reduce? We leave aside this problem and instead consider the generalized mean, which can express a variety of statistical means by varying the parameterization. We later demonstrate that the parameterization is determined by the second principle.
The generalized mean of $T_{n+1} - T_n$ is given by Eq. (4), where $q$ is the parameter characterizing the generalized mean and $\Delta$ (> 0) is the minimum length of the possible interval between conflicts $C_n$ and $C_{n+1}$. By varying $q$, Eq. (4) yields various statistical means. For example, Eq. (4) is equal to the arithmetic mean for $q = 1$ and approaches the geometric mean for $q \to 0$.
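As a worked check of the two limits quoted above, assume the standard power-mean form $M_q[X] = (\mathbb{E}[X^q])^{1/q}$ for a positive variable $X$ standing in for the interval. This form is an assumption made here for illustration; the parameterization in Eq. (4), for example its dependence on $\Delta$, may differ.

```latex
\begin{align}
M_1[X] &= \mathbb{E}[X]
  && \text{(arithmetic mean)} \\
\log M_q[X] &= \frac{1}{q}\,\log \mathbb{E}\!\left[e^{\,q \log X}\right]
  = \frac{1}{q}\,\log\!\left(1 + q\,\mathbb{E}[\log X] + O(q^2)\right) \\
\lim_{q \to 0} M_q[X] &= \exp\!\left(\mathbb{E}[\log X]\right)
  && \text{(geometric mean)}
\end{align}
```

The second line expands the exponential to first order in $q$, and the third uses $\log(1 + q\,a) / q \to a$ as $q \to 0$.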
The joint entropy of $T_{n+1}$ and $T_n$ is hence given by Eq. (5), whose right-hand side consists of three terms. The first term on the right-hand side represents Shannon's entropy. The second term is introduced according to the first principle of promptness and expresses the force required to reduce the generalized mean of $T_{n+1} - T_n$; the coefficient $\beta$ (> 0) controls the strength of this force. The third term, where $\lambda$ is a Lagrange multiplier, ensures the normalization condition that $p(t_n, t_{n+1})$ sums to unity. Maximizing entropy (5) with respect to $p(t_n, t_{n+1})$, with rescaling, yields the distribution in Eq. (6), which includes a normalization factor. As expected, Eq. (6) becomes equal to the exponential distribution $p(t_n, t_{n+1}) \propto \exp[-\beta (t_{n+1} - t_n)]$ for $q = 1$ and approaches the power-law distribution $p(t_n, t_{n+1}) \propto (t_{n+1} - t_n)^{-\beta}$ for $q \to 0$ (Visser 2013). For $p(t_n, t_{n+1})$ to be normalized, $q$ should be nonzero positive.
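To see how a single family can interpolate between the two limits named above, consider the following hypothetical one-parameter family in $\tau = t_{n+1} - t_n$. It is chosen here only because it reproduces both limits; it is an assumption for illustration and is not claimed to be the form of Eq. (6).

```latex
\begin{align}
f_q(\tau) &= \exp\!\left[-\beta\,\frac{(\tau/\Delta)^{q} - 1}{q}\right],
  \qquad \tau \ge \Delta,\; q > 0, \\
f_1(\tau) &= e^{\beta}\,\exp\!\left[-\beta\,\tau/\Delta\right]
  \;\propto\; \exp\!\left[-\beta\,\tau/\Delta\right], \\
\lim_{q \to 0} f_q(\tau) &= \exp\!\left[-\beta \log(\tau/\Delta)\right]
  = (\tau/\Delta)^{-\beta}.
\end{align}
```

The last line uses $(x^q - 1)/q \to \log x$ as $q \to 0$. With $\Delta = 1$ the two limits reduce to $\exp(-\beta\tau)$ and $\tau^{-\beta}$.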
Next, we demonstrate that the value of $q$ is determined by the second principle of seriousness. As previously discussed, this principle makes stochastic variables $T_n$ and $T_{n+1}$ mutually dependent as much as possible. Information theory states that the mutual dependence between stochastic variables can be estimated by mutual information:
$$I[T_n, T_{n+1}] = \int dt_n\, dt_{n+1}\, p(t_n, t_{n+1}) \log \frac{p(t_n, t_{n+1})}{p(t_n)\, p(t_{n+1})}, \tag{8}$$
where $p(t_n) = \int dt_{n+1}\, p(t_n, t_{n+1})$ and $p(t_{n+1}) = \int dt_n\, p(t_n, t_{n+1})$ are marginal probabilities. Using the forms of Eqs. (6) and (7), and taking $\Delta \to 0$, we can analytically calculate the right-hand side of Eq. (8) to obtain $I[T_n, T_{n+1}] = \Phi(q, \beta) + \text{constant}$, where the analytical form of $\Phi(q, \beta)$, given by Eq. (10), involves the gamma function $\Gamma(\cdot)$. The principle of seriousness argues that mutual information $I[T_n, T_{n+1}]$ should be maximized. Fig. 5 shows $\Phi(q, \beta)$ as a function of $q$ (> 0) and $\beta$ (≥ 1). For each value of $\beta$, $\Phi(q, \beta)$ is maximized for $q \to +0$. Thus, the principle of seriousness, which is embodied by the maximization of mutual information, leads to the power-law distribution of $\tau \equiv t_{n+1} - t_n$:
$$p(\tau) \propto \tau^{-\beta} \quad (\tau \ge \Delta). \tag{11}$$
Fig. 5: Mutual information $I[T_n, T_{n+1}] = \Phi(q, \beta) + \text{constant}$ as a function of $q$ and $\beta$. The analytical form of $\Phi(q, \beta)$ is given by Eq. (10). The curves in the coordinate plane plot $\Phi(q, \beta)$ as a function of $q$ for different values of $\beta$ (varied from 1.0 to 3.0 in 0.01 increments, with the lowest curve for $\beta = 1.0$).

Dilution of the power-law process: relation between the model and observation
Because the interval between consecutive occurrences of conflict, but not the duration of each conflict itself, was of interest, we prescribed the duration of each conflict to be contracted to a point in time. With this mathematical simplification, the conflict occurrence in each dyad can be viewed as a point process (Fig. 6a, filled black circles).
Our information-theoretic model predicts that the point-to-point intervals of this process follow a power-law distribution. However, we should be aware of the possibility of a recording bias. Some cases of militarized interstate disputes may have been overlooked in the collection of data and were not recorded. Therefore, the point process compiled from the dataset is obtained by diluting the original point process generated by the model (Fig. 6, blank red circles).
Therefore, it is necessary to examine whether the diluted point process also follows the power law and, if so, whether the power-law exponent for the diluted point process is equal to that for the original point process. The power-law distribution given by Eq. (11) has a lower bound $\Delta$ (> 0) in the domain. For $\Delta \to 0$, the diluted point process follows the same power law as the original process, owing to the scale-invariant property of the power-law distribution. However, in reality, $\Delta$ would be slightly greater than zero because a minimum length of time is required (for example, to redeploy resources) before the invocation of the next conflict. To examine whether the point process obtained by diluting the original power-law process with $\Delta > 0$ also follows the power law, we conducted the following numerical experiments. A sample of the point process is generated so that point-to-point intervals follow a power-law distribution with $\Delta = 1$. The generated point process is then diluted with probability $r$; that is, each point is retained with probability $r$ and discarded with probability $1 - r$. The point-to-point intervals collected from the diluted process then undergo the mCSN test to calculate the $p$-value and estimate the fitted power-law exponent $\hat{\alpha}$.
The experimental results are shown in Figs. 6b and c. The $p$-value (Fig. 6c) and fitted power-law exponent $\hat{\alpha}$ (Fig. 6b) are plotted as functions of the probability $r$. For any $r$, the $p$-value averaged over 100 calculations is substantially larger than the criterion of 0.1 (Fig. 6c), indicating that the power-law hypothesis for the diluted point process is plausible. The fitted power-law exponent decreases from the original value $\beta$ as $r$ decreases (Fig. 6b). These results demonstrate that point-to-point intervals collected from the diluted process, which model observed ICIs, also follow a power-law distribution, although its exponent is reduced from the original $\beta$.
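The dilution experiment can be mimicked with a short simulation. This sketch departs from the procedure described above in ways that should be kept in mind: intervals are drawn from a continuous power law rather than tested with the mCSN procedure, the exponent is estimated with the closed-form continuous maximum-likelihood (Hill-type) estimator at a fixed lower bound, and no p-value is computed. All function and variable names are hypothetical.

```python
import math
import random

def sample_intervals(n, beta, delta, rng):
    """Draw n intervals with density proportional to tau**-beta on tau >= delta (beta > 1)."""
    # Inverse-CDF sampling; 1 - rng.random() lies in (0, 1], so the power is finite.
    return [delta * (1.0 - rng.random()) ** (-1.0 / (beta - 1.0)) for _ in range(n)]

def dilute(intervals, keep_prob, rng):
    """Keep each point of the process with probability keep_prob; return new intervals."""
    times, t = [], 0.0
    for tau in intervals:
        t += tau
        times.append(t)
    kept = [time for time in times if rng.random() < keep_prob]
    return [later - earlier for earlier, later in zip(kept, kept[1:])]

def hill_exponent(intervals, tau_min):
    """Continuous maximum-likelihood exponent for intervals at or above tau_min."""
    tail = [tau for tau in intervals if tau >= tau_min]
    return 1.0 + len(tail) / sum(math.log(tau / tau_min) for tau in tail)

rng = random.Random(1)
original = sample_intervals(20000, beta=1.5, delta=1.0, rng=rng)
diluted = dilute(original, keep_prob=0.5, rng=rng)
beta_original = hill_exponent(original, 1.0)
beta_diluted = hill_exponent(diluted, 1.0)
```

Each diluted interval is a sum of one or more original intervals, so the diluted sample is shifted toward longer intervals and the exponent fitted at the same lower bound comes out smaller.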

Mixture of power-law distributions
Our information-theoretic model predicts that conflict occurrences in each dyad follow the power law. In addition, the power-law exponent may differ for each dyad because it originates from a predefined value for $\beta$, which is not necessarily consistent for every dyad. We will later see that the power-law exponent inferred from the real data varies from dyad to dyad. Therefore, the distribution of ICIs collected from all dyads, which we have shown to follow a power-law distribution (Fig. 2), should be a mixture of power-law distributions originating from different dyads, with possibly different exponents. Hence, it is necessary to verify whether a mixture of power-law distributions can be approximated accurately using a single power-law distribution. We have verified this; detailed descriptions are provided in Materials and Methods.

Testing the power-law hypothesis in individual dyads
Our information-theoretic model predicts that ICIs collected from individual dyads will follow separate power laws. To test this prediction, we applied the mCSN test to seven dyads: CHN-RUS, CHN-US, GMY-FRN, IND-PAK, IRN-IRQ, ISR-SYR, and RUS-US. We used the following abbreviations: CHN (China), FRN (France), GMY (Germany), IND (India), IRN (Iran), IRQ (Iraq), ISR (Israel), PAK (Pakistan), RUS (Russia), SYR (Syria), and US (the United States). We chose these seven dyads because they provided $N \ge 20$ ICI samples, a number likely sufficient for statistical examination.
The results of the mCSN tests are presented in Table 1. In this test, $\tau_{\max}$ was chosen as $\max_i \tau_i$, which was the maximum ICI sample for each dyad. Noticeably, the power-law hypothesis of ICIs was plausible for all the dyads we examined ($p > 0.1$ for every dyad). The ratio $N_D/N$, where $N$ and $N_D$ are the total number of ICIs and the number of ICIs equal to or larger than the estimated lower bound $\hat{\tau}_{\min}$, respectively, was substantially large (>0.7) for every dyad, indicating that the power law holds for a wide range of ICI.
We also compared the power-law hypothesis with the alternative hypothesis that the ICIs follow an exponential distribution. From a set of ICI samples such that $\hat{\tau}_{\min} \le \mathrm{ICI} \le \tau_{\max}$, 100 pseudo datasets were synthesized using the bootstrap process. We calculated the maximum log-likelihood of the exponential and power-law distributions for each synthesized dataset. A paired $t$-test was conducted to examine whether the maximum log-likelihood of the power-law distribution ($\log L^{(\mathrm{p.l.})}$) was significantly larger than that of the exponential distribution ($\log L^{(\exp)}$). The results summarized in Table 2 show that the power-law distribution is significantly more plausible than the exponential distribution for every dyad. Thus, we concluded that the ICIs in each dyad followed a power-law distribution, which is consistent with the predictions of our model.
Table 1: $N$: the number of ICI samples for each dyad. $\hat{\tau}_{\min}$: the estimated value of $\tau_{\min}$. $N_D$: the number of ICI samples within the domain $\hat{\tau}_{\min} \le \tau \le \tau_{\max}$. $N_D/N$: the ratio of ICI samples within the domain. $\hat{\alpha}$: the estimated value of the power-law exponent $\alpha$. The bottom row lists the $p$-value of the mCSN test. For $p$-values larger than the criterion of 0.1, as indicated by the asterisk (*), the power-law hypothesis is plausible.

Table 2: Columns: CHN-RUS, CHN-US, FRN-GMY, IND-PAK, IRN-IRQ, ISR-SYR, and RUS-US. The upper row lists the mean difference $\langle \log \hat{L}^{(\mathrm{p.l.})} \rangle_B - \langle \log \hat{L}^{(\exp)} \rangle_B$ for each dyad. The mean $\langle \log \hat{L}^{(\mathrm{p.l.})} \rangle_B$ was calculated by averaging the log-likelihood for the power-law hypothesis over $B = 100$ pseudo series of ICIs generated using the bootstrap process. The mean $\langle \log \hat{L}^{(\exp)} \rangle_B$ of the log-likelihood for the exponential-distribution hypothesis was calculated similarly. Positive values of the quantity $\langle \log \hat{L}^{(\mathrm{p.l.})} \rangle_B - \langle \log \hat{L}^{(\exp)} \rangle_B$ indicate that the power-law hypothesis is more likely than the exponential-distribution hypothesis. The bottom row lists the $p$-value of the paired $t$-test for each dyad to demonstrate the significance of the positivity or negativity of this quantity.
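The bootstrap comparison of the two hypotheses can be sketched as below. Two simplifications are assumed here and differ from the analysis described above: both distributions are continuous and bounded only from below (the upper bound is ignored), and the function returns the paired t statistic rather than a p-value. The names and the synthetic input, which consists of evenly spaced quantiles of a power law with exponent 1.5, are hypothetical.

```python
import math
import random

def loglik_power_law(xs, tau_min):
    """Maximized log-likelihood of a continuous power law on tau >= tau_min."""
    n = len(xs)
    sum_log = sum(math.log(x / tau_min) for x in xs)
    alpha = 1.0 + n / sum_log  # closed-form maximum-likelihood exponent
    return n * math.log((alpha - 1.0) / tau_min) - alpha * sum_log

def loglik_exponential(xs, tau_min):
    """Maximized log-likelihood of an exponential shifted to start at tau_min."""
    n = len(xs)
    excess = sum(x - tau_min for x in xs)
    rate = n / excess  # closed-form maximum-likelihood rate
    return n * math.log(rate) - rate * excess

def bootstrap_comparison(xs, tau_min, n_boot=100, seed=0):
    """Mean log-likelihood difference (power law minus exponential) and paired t statistic."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_boot):
        resample = rng.choices(xs, k=len(xs))
        diffs.append(loglik_power_law(resample, tau_min)
                     - loglik_exponential(resample, tau_min))
    mean = sum(diffs) / n_boot
    sd = math.sqrt(sum((d - mean) ** 2 for d in diffs) / (n_boot - 1))
    return mean, mean / (sd / math.sqrt(n_boot))

# Synthetic input: quantiles of a power law with exponent 1.5 and lower bound 1.
quantiles = [(1.0 - (i + 0.5) / 200) ** -2.0 for i in range(200)]
mean_diff, t_stat = bootstrap_comparison(quantiles, tau_min=1.0)
```

A positive mean difference with a large t statistic corresponds to the situation in which the power-law hypothesis is favored.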
The estimated power-law exponent $\hat{\alpha}$ varies from dyad to dyad, ranging from ~1.0 to ~2.0 (Table 1). These estimated values were robust, as confirmed by bootstrap analysis (Fig. 7). The variation of $\hat{\alpha}$ across dyads, with $\hat{\alpha}$ robustly estimated in each dyad, supports the notion that the distribution of total ICIs, which has been shown to obey the power law with an exponent of ~1.3 (Fig. 2), is a mixture of power-law distributions with variable exponents.

Figure 7:
The power-law exponent varies from dyad to dyad. The estimated power-law exponent $\hat{\alpha}$ for each dyad is indicated by the filled black bar. To confirm the stability of this estimation, 100 pseudo series of ICIs were synthesized using the bootstrap process, for each of which the power-law exponent was re-estimated. The filled blue circle and error bar indicate the mean and standard deviation of $\hat{\alpha}$ calculated using the bootstrap process.
The mCSN test ensures that the power-law hypothesis is more plausible than the exponential-distribution hypothesis in the estimated domain $\hat{\tau}_{\min} \le \tau \le \tau_{\max}$. However, this does not necessarily exclude the possibility that the exponential-distribution hypothesis is more plausible than the power-law hypothesis in another domain. This may occur especially when the number of ICIs is small, as in the present case. To examine this possibility, we conducted an mCSN test of the exponential-distribution hypothesis. The results are summarized in Table S1. In contrast to the mCSN test of the power-law hypothesis, which gives a $p$-value larger than the criterion of 0.1 for all seven dyads, the $p$-value of the mCSN test of the exponential-distribution hypothesis is below the criterion for three dyads (CHN-US, IRN-IRQ, and RUS-US). Therefore, the exponential-distribution hypothesis for the estimated domains was excluded for these dyads. The mean difference $\langle \log \hat{L}^{(\mathrm{p.l.})} \rangle_B - \langle \log \hat{L}^{(\exp)} \rangle_B$ calculated using the bootstrap process was positive for two dyads (CHN-US and FRN-GMY) (Table S2), implying that the power-law hypothesis is more likely than the exponential-distribution hypothesis in the estimated domains for these dyads. Furthermore, the estimated parameter of the exponential distribution varied more widely across the dyads (Fig. S1) than the estimated $\hat{\alpha}$ (Fig. 7), which implies a less robust estimation of that parameter. Although it is difficult to judge which hypothesis is more plausible, comparing the results shown in Table 1, Table 2, and Fig. 7 with those shown in Table S1, Table S2, and Fig. S1 strongly suggests that fitting a power-law distribution to the ICI samples for each dyad is more suitable.

ICIs are independent and identically distributed
Our information-theoretic model predicts that the ICIs in each dyad are generated independently from an identical power-law distribution. In contrast, the interval $\tau_n$ between the timing of the $(n-1)$-th and $n$-th fatal attacks in insurgency and terrorism approximately follows a power-law progress curve $\tau_n = \tau_1 n^{-b}$, most typically with escalation ($b > 0$) and sometimes with de-escalation ($b < 0$) (Johnson et al. 2011; Johnson et al. 2013). We conducted the following statistical experiment to confirm that the actual generation of ICIs in each dyad was independent and identically distributed and that the observed power-law distribution of ICIs was due to neither escalation nor de-escalation. Let $\tau_i$ be the $i$-th ICI generated in a certain dyad and $C^{(1)}(\boldsymbol{\tau})$ be the first-order autocorrelation calculated for the ICI series $\boldsymbol{\tau} = \{\tau_1, \cdots, \tau_N\}$ (see Materials and Methods for details). If series $\boldsymbol{\tau}$ followed escalation or de-escalation, $C^{(1)}(\boldsymbol{\tau})$ would be significantly high. From this series, $B = 10{,}000$ pseudo series were synthesized by bootstrapping. These pseudo series follow independent and identically distributed processes. We then calculated the distribution of the first-order autocorrelations over these pseudo series. For this distribution, which normally has a single peak around zero, a rejection area is defined on the right with a significance level of 0.1, which is a conservative value. This rejection area (red shaded in Fig. 8) corresponds to the possibility that the positive correlation between $\tau_i$ and $\tau_{i+1}$ is significantly high, as is the case for escalation and de-escalation. Another rejection area was defined on the left at the same significance level (0.1). This area (blue shaded in Fig. 8), for which the negative correlation between $\tau_i$ and $\tau_{i+1}$ is significantly high, indicates the tendency that a longer ICI is followed by a shorter ICI, and vice versa, thereby producing oscillatory progress.
We can test the null hypothesis that series $X$ has no nontrivial (i.e., significantly positive or negative) first-order autocorrelation by examining the location of the value of $A^{(1)}(X)$ in the distribution; if it enters either of the rejection areas, the null hypothesis is ruled out. For all seven dyads (CHN-RUS, CHN-US, GMY-FRN, IND-PAK, IRN-IRQ, ISR-SYR, and RUS-US), the value of $A^{(1)}(X)$, indicated by the vertical lines in Fig. 8, is outside the rejection areas. Therefore, the null hypothesis is not rejected for these dyads. It is unlikely that a series without a first-order autocorrelation will have a higher-order autocorrelation. Thus, we concluded that the actual generation of ICIs in each dyad follows an independent and identically distributed process, which is consistent with the predictions of our model.

Discussions
The severity of war, measured by battle deaths, has been well documented to follow a power law since Richardson's proposition in his seminal works (Richardson 1948; Richardson 1960). This study demonstrated that a power law also resides in the temporal aspects of interstate conflicts. To this end, we defined the inter-conflict interval (ICI) as a critical quantity for exploring temporal statistics. We found that the ICIs compiled from the history of interstate conflicts from 1816 to 2014 follow a power-law distribution. We then proposed an information-theoretic model to account for our empirical findings. The model assumes that a pair of states constituting a dyad, struggling with their national interests and survival, will act to balance the principles of promptness and seriousness. The former and latter principles are mathematically formulated as constraints to reduce the generalized mean of the ICI and to maximize the mutual information between the timings of consecutive occurrences of conflict, respectively. Under these constraints, entropy maximization yields a point process with a point-to-point interval that obeys the power law. The model predicts that the ICIs in each dyad are independently generated from an identical power-law distribution, although the power-law exponent may vary from dyad to dyad. We statistically analyzed individual dyads separately and obtained results consistent with the predictions of the model.
To test the power-law hypothesis for ICIs collected from all dyads or collected separately from individual dyads, we used a modified version of the rigorous statistical method proposed by Clauset et al. (2009). This method, which we call the mCSN test, calculates a $p$-value and judges a specific hypothesis (e.g., the power-law hypothesis or the exponential-distribution hypothesis) for ICIs to be plausible if the obtained $p$-value is larger than the criterion of 0.1; otherwise, the hypothesis is ruled out. Notably, the $p$-value for ICIs collected from all dyads, plotted as a function of $x_{\max}$ (Fig. 3a), shows a conspicuous trough at approximately 9,000 days (~25 years); the $p$-value around this trough is only slightly larger than 0.1, indicating that the power-law hypothesis is barely plausible there. We suppose that this trough is attributable to the interwar period bounded by the end of WWI (1918) and the beginning of WWII (1939). The power-law hypothesis of ICIs presupposes that the conflict process in each dyad is independent of those in other dyads. However, this premise was apparently violated during WWI and WWII, when many countries became involved in war almost simultaneously and automatically, according to the side of the opposing camps they had taken. The resulting excess number of ICIs, whose lengths were comparable to that of the interwar period, eventually caused a substantial deviation in the tail shape of the distribution from the power law. To confirm this supposition, we removed ICIs related to either of the world wars by retaining only ICIs that ended before the start of 1914 or started after the end of 1945, and then applied the mCSN test to the remaining samples. With this prescription, the trough disappeared (Supplementary Materials, Fig. S2). This implies that WWI and WWII, in which interstate wars occurred worldwide and cooperatively, were historically unique events.
Except during WWI and WWII, the results obtained in the present study support the idea that conflict processes in individual dyads are independent of each other. Nevertheless, examining in more detail the influence of conflict processes in some dyads on others, if any, is of substantial interest, as recent studies suggest that higher-order interactions, as well as pairwise (i.e., first-order) interactions between states, affect conflict occurrence.
The ICI samples collected from all dyads were well fitted by a power-law distribution with an exponent of ~1.3 (Fig. 3b). In general, the value of the power-law exponent is related to the frequency of event occurrence; the larger the power-law exponent, the more frequent the events. As the dataset includes interstate conflicts over the past 200 years (1816-2014), a question arises: Has the power-law exponent remained constant or changed over the past 200 years? A recent study conducting out-of-sample cross-validation demonstrated that causal models of war vary across historical periods (Jenke & Gelpi 2017). To address this question, we also examined the power-law hypothesis by dividing the entire period (1816-2014) into the following eras: (i) the first half of the 19th century (1816-1858); (ii) the second half of the 19th century (1859-1899); (iii) the first half of the 20th century, lasting from 1900 to just after WWII (1946); (iv) the Cold War era (1947-1989); and (v) the post-Cold War era (1990-2014). The results demonstrate that, as time passes, the value of the power-law exponent tends to increase ($\hat{\alpha}$ = 1.18, 1.0, 1.22, 1.7, and 1.71 for eras (i), (ii), (iii), (iv), and (v), respectively; see Supplementary Materials, Table S3 and Fig. S3). A gradual increase in the frequency of conflict over the last 200 years is the most naïve interpretation of these observations. However, the observed increase in the power-law exponent may instead be attributable to a recording bias: some interstate conflicts, especially those in older eras, may have been overlooked when compiling the data.
The $p$-value for the third era (iii), lasting from 1900 to 1946, was 0.0286 (< 0.1) (Table S3). Therefore, the power-law hypothesis is not plausible for this era. As demonstrated in Fig. S2, the ruling out of the power-law hypothesis was most likely caused by the inclusion of the interwar period in this era. Therefore, we trimmed the final years (1939-1946) of this era. ICIs compiled from this trimmed era, lasting from 1900 to 1938, no longer involved an excess number of ICIs with lengths comparable to that of the interwar period. Indeed, we obtained $p = 0.449$ (> 0.1) for this trimmed era, denoted (iii'), confirming the plausibility of the power-law hypothesis (Table S3).
This study demonstrates that the ICI, the interval bounded by consecutive conflicts occurring in the same dyad, follows the power law. In contrast, Richardson's earlier works (Richardson 1944, 1945) suggested that the timing of onset of full-scale wars (interstate wars, in our terminology), occurring anywhere in the world, obeys a Poisson process; that is, the interval between the onsets of consecutive wars occurring anywhere in the world follows an exponential distribution. Therefore, we sought to examine whether the timing of onset of interstate conflicts, counted without specifying the dyad, also follows an exponential distribution. To this end, we defined the dyad-unconditioned inter-conflict interval (DUC-ICI, Fig. S4). Considering the possibility that the rate of conflict occurrence may change over the years (Clauset 2018), we used the above division of the entire period into five eras. DUC-ICI samples were compiled separately for each era and then subjected to the mCSN tests. The results of the mCSN tests for the power-law hypothesis (Fig. S5, Table S4, and Table S5) and the exponential-distribution hypothesis (Fig. S6, Table S6, and Table S7) indicate that the DUC-ICIs for each era are more likely to follow an exponential distribution than a power-law distribution, consistent with Richardson's suggestion.
We do not ask whether the action taken by either state is a rational means of achieving its political objective, whether the political objective itself is reasonable, or whether it is achieved as intended by settling the conflict.This contrasts with the game-theoretical approach to interstate wars, which assumes that actors behave rationally.This approach has been favored in mainstream international relations theory.For instance, in his game-theoretic model with the assumption that states are 'rational' actors, James Fearon (1995) demonstrated that 'inefficient' (in the sense that they cannot reach a deal that is mutually less costly than an armed confrontation) war can take place between them due to a lack of communication, intentional or unintentional.Our information-theoretic model, which regards armed violence as a means of communication in and of itself, may appear contradictory to Fearon's model as we argue that serious acts do not necessarily conform to rational ones.Nevertheless, we also argue that states only consider the timing of the previous conflict in deciding when to initiate an armed conflict, thereby disregarding its means and costs.Therefore, our insights resonate with the motivation behind the Fearon model.
Our information-theoretic model argues that ICIs are independently generated from an identical power-law distribution in each dyad.The absence of a first-order autocorrelation for a series of actual ICIs in the individual dyads supports this notion.However, the present study did not examine whether autocorrelation resides in the size of interstate conflicts; for instance, every time a conflict occurs, its size grows or shrinks.This issue will be addressed in future studies.
Our model is based on an information-theoretic formulation of the hypothesis that military force is a form of interstate communication. The lines of evidence obtained by the statistical analysis of the MID4.02 dataset support the plausibility of this hypothesis. This hypothesis might contradict the widespread view that interstate war arises from a lack of communication between states. However, from an information-theoretic perspective, the observed power-law property of ICI is the hallmark of maximally efficient communication through violent means.
Power laws are ubiquitously observed in the time course of human behavior, such as email and surface-mail correspondence and web browsing (Barabási 2005; Vazquez et al. 2006). What makes our findings unique is the argument that the power-law property of ICI arises from the interaction between rival states, which we model as a form of communication in the information-theoretic framework. Here, we draw on Clausewitz's statement, which supports this argument: "War, however, is not the action of a living force upon a lifeless mass … but always the collision of two living forces" (Clausewitz 1823). Two classes of models, priority queuing models and modulated Markov processes, have been discussed to account for the power-law property of inter-event intervals empirically observed in human behavior. In contrast to our information-theoretic model, these models assume that individuals behave independently. Priority queuing models (Barabási 2005; Vazquez et al. 2006; Wu et al. 2010; Vajna et al. 2013) assume that a person has a prioritized list of tasks and, at each time, executes a task selected probabilistically from this list according to its priority. The waiting time of a task from its entry into the list to its execution follows the power law (of exponent 1.0 or 1.5). Each individual's prioritized list of tasks is independent of those of others. In modulated Markov process models, the power law is accounted for as a consequence of the combination of Poisson processes, which model the behavior of every individual as the sporadic execution of tasks with circadian or weekly cycles. Thus, modulated Markov process models also lack the perspective of the interaction between living agents. The above overview suggests that priority queuing and modulated Markov process models cannot explain the observed power-law properties of ICIs, which are thought to be an essential consequence of the interaction between rival states. The interactions between rival agents are at the heart of our information-theoretic model. This implies that our information-theoretic model is more favorable than priority queuing models or modulated Markov processes for accounting for the power-law properties of ICIs.
A relevant example can be found in the completely different field of neuroscience, where inter-spike intervals (ISIs) in neuronal spike trains have been observed to follow a power law (Kemuriyama et al. 2010; Tsubo et al. 2012). Computational neuroscientists have examined the power-law properties of ISIs using the principles of information theory (Tsubo et al. 2012). Neurons, like states, are communicators, and information processing in the brain is the totality of the communication between neurons. Power laws may be a hallmark of communication between real-world actors, such as states or neurons.
This study focuses on armed conflicts between normal states.This contrasts with the recent trend in the discipline, which is devoted to asymmetric warfare, such as insurgency or terrorism, rather than armed conflicts between normal states (Clauset et  However, the full-scale war in Ukraine, started by the Russian invasion on February 24, 2022, disenchanted us from the illusion that armed conflict between normal states may be outdated (Kaldor 2013).

Dataset
This study used the dataset MID4.02, which can be downloaded from a public repository run by the COW Project (https://correlatesofwar.org).The dataset records militarized interstate disputes (MIDs) and interstate wars from 1816 to 2014.Each MID or interstate war in the dataset is specified with a dyad (a pair of states) engaged in the interstate conflict, the start and end days of the conflict, and values for other covariates.For instance, the covariate WAR takes 1 for interstate war and 0 for MID.

Inter-conflict interval (ICI)
The inter-conflict interval (ICI) is defined as the interval between the end of a conflict in a dyad and the start of the next conflict in the same dyad (Fig. 1). Let $t_i^{(\mathrm{start})}$ and $t_i^{(\mathrm{end})}$ be the start and end times of the $i$-th conflict that occurred in a certain dyad, respectively. ICIs were collected from this dyad by calculating $t_{i+1}^{(\mathrm{start})} - t_i^{(\mathrm{end})}$ in ascending order of $i$.
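As a minimal sketch of this definition, the following Python snippet computes the ICIs of one dyad from a list of conflict start and end dates. The function name `interconflict_intervals`, the made-up dates, and the rule of skipping non-positive gaps are assumptions made for illustration; they are not taken from MID4.02 or from the original analysis.

```python
from datetime import date

def interconflict_intervals(conflicts):
    """Return the ICIs (in days) of one dyad.

    `conflicts` is a list of (start, end) date pairs. Hypothetical helper,
    not part of any MID4.02 tooling.
    """
    ordered = sorted(conflicts)  # ascending order of start date
    icis = []
    for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:]):
        gap = (next_start - prev_end).days
        if gap > 0:  # assumption: overlapping or contiguous conflicts give no ICI
            icis.append(gap)
    return icis

# Made-up records for one dyad (not real data).
example = [
    (date(1900, 1, 1), date(1900, 3, 1)),
    (date(1901, 3, 1), date(1901, 3, 15)),
    (date(1905, 6, 1), date(1905, 6, 30)),
]
print(interconflict_intervals(example))  # [365, 1539]
```

Sorting by start date before taking differences mirrors the "ascending order of $i$" in the definition.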

Goodness-of-fit test for the power-law hypothesis
Clauset et al. (2009) proposed a goodness-of-fit test to examine whether a given set of samples $\{x_1, \cdots, x_n\}$ follows a power-law distribution. This test, which we call the Clauset-Shalizi-Newman (CSN) test, was designed to examine the power-law properties of spatial features, such as war size, earthquake magnitude, and urban population. Samples of spatial features, if they follow power-law distributions, include a number of large-sized events because the long tails characterizing power-law distributions imply the likely occurrence of large-sized events. Therefore, the power-law hypothesis to be examined by the original CSN test is mathematically expressed as $p(x) = x^{-\alpha} / \zeta(\alpha, x_{\min})$ ($x_{\min} \le x$), where $\zeta(\alpha, x_{\min}) = \sum_{x = x_{\min}}^{+\infty} x^{-\alpha}$ is the generalized zeta function. Note that the domain, $x_{\min} \le x$, has no upper bound.
In contrast, caution is required when applying the CSN test to temporal features such as ICIs.Sampling ICIs from the dataset MID4.02 is restricted by the recording period used to construct this dataset, which is approximately 200 years, from 1816 to 2014.Therefore, even if ICIs were generable from power-law distributions without upper bounds, the lengths of the ICI samples collected from MID4.02 would never exceed the recording period.This means that the empirical distribution of ICIs has an upper bound, above which no sample exists.Furthermore, individual dyads had their own ages, some of which were much shorter than the recording period.For instance, the Russia-Ukraine dyad was approximately 23 years old in 2014 (the final year of the recording period).The length of the ICI samples collected from the dyads never exceeded their age.Consequently, the empirical distribution of the ICIs collected from all dyads would have an effective upper bound that might be much smaller than the recording period, above which the power-law distribution would no longer fit the data well.
Therefore, to fit a power-law distribution to temporal features, such as ICIs, we must consider the upper bound $x_{\max}$ in addition to the lower bound $x_{\min}$. The power law implies the likely occurrence of long-term events for temporal features. However, such long-term events could not be recorded because the recording period was limited. In contrast, when recording spatial features, large-sized events, such as huge wars (WWI or WWII), huge earthquakes, or megacities, would never be overlooked.
To examine the power-law hypothesis of the ICIs, the original CSN test should be modified by considering the possible presence of upper bounds. The procedure for the modified CSN (mCSN) test, used to examine the power-law hypothesis for ICIs in the present study, was as follows. Let $p(x)$ be the probability distribution of variable $x$. We consider the case where $x$ takes discrete values measured in days. The power-law hypothesis to be examined by the mCSN test is mathematically expressed in the following form: $p(x) = x^{-\alpha} / Z(\alpha)$, where $\alpha$ is the power-law exponent and $Z(\alpha) = \sum_{x = x_{\min}}^{x_{\max}} x^{-\alpha}$ is the normalization factor equivalent to the partition function. Let $D = \{x_1, \cdots, x_n\}$ be the data. Samples that are smaller than $x_{\min}$ or larger than $x_{\max}$, if they exist, are excluded from $D$ because we want to test the hypothesis in the domain $x_{\min} \le x \le x_{\max}$. The log-likelihood is then given as

$$\log L(\alpha) = -\alpha \sum_{i=1}^{n} \log x_i - n \log Z(\alpha).$$

The value of the power-law exponent $\alpha$ is determined using the maximum likelihood estimate (MLE). The estimation can be performed by direct numerical maximization of $\log L(\alpha)$. The model fitted by MLE is denoted as ℳ.
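The maximization step can be sketched as follows. This is a minimal illustration, assuming the log-likelihood $-\alpha \sum_i \log x_i - n \log Z(\alpha)$ of the bounded discrete power law with fixed bounds; the function names, the golden-section search, and the search interval $[0, 5]$ are illustrative choices, not the authors' implementation.

```python
import math

def log_likelihood(alpha, data, x_min, x_max):
    """log L(alpha) = -alpha * sum(log x_i) - n * log Z(alpha)."""
    z = sum(x ** -alpha for x in range(x_min, x_max + 1))
    return -alpha * sum(math.log(x) for x in data) - len(data) * math.log(z)

def fit_alpha(data, x_min, x_max, lo=0.0, hi=5.0, tol=1e-6):
    """Maximize log L over alpha by golden-section search on [lo, hi].

    The bracket [lo, hi] is an assumption. log Z(alpha) is convex in alpha,
    so log L is concave and a one-dimensional bracketing search suffices.
    """
    data = [x for x in data if x_min <= x <= x_max]  # keep the tested domain
    ratio = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    while b - a > tol:
        c = b - ratio * (b - a)
        d = a + ratio * (b - a)
        if log_likelihood(c, data, x_min, x_max) > log_likelihood(d, data, x_min, x_max):
            b = d  # the maximum lies in [a, d]
        else:
            a = c  # the maximum lies in [c, b]
    return (a + b) / 2

# Toy data with counts exactly proportional to x^-1 on 1..4 (12 : 6 : 4 : 3),
# so the estimate should be close to 1.
toy = [1] * 12 + [2] * 6 + [3] * 4 + [4] * 3
print(fit_alpha(toy, 1, 4))
```

A derivative-free search is used here only to keep the sketch dependency-free; any one-dimensional optimizer would serve the same purpose.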
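The distance between data $D$ and the hypothesis is measured by the Kolmogorov-Smirnov (KS) statistic $d_{\mathrm{KS}}$ defined by

$$d_{\mathrm{KS}} = \max_{x_{\min} \le x \le x_{\max}} \left| S(x) - P(x) \right|,$$

where $S(x) = (\text{the number of } x_i \ge x) / n$ is the cumulative distribution function (CDF) for the empirical data and $P(x) = \sum_{x' = x}^{x_{\max}} p(x')$ is the CDF for the fitted model ℳ.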
A large number $B$ of power-law distributed data sets, $D_1, \cdots, D_B$, are synthesized from ℳ. Each data set has the same number $n$ of elements as the empirical data $D$. We fit each synthetic data set $D_b$ to its own power-law model ℳ$_b$ and then calculate the KS statistic $d_b$ for $D_b$ relative to ℳ$_b$. We then count the fraction of synthetic data sets for which $d_b$ is larger than $d_{\mathrm{KS}}$, which serves as the $p$-value of this test. Clauset, Shalizi, and Newman (2009) set a conservative decision criterion for the test: if $p \le 0.1$, the power-law hypothesis for the data $D$ is ruled out; otherwise, it is plausible. We conducted the goodness-of-fit test for the ICI samples with $B = 10{,}000$ synthetic data sets.
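The bootstrap procedure above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: the bounds `x_min` and `x_max` are held fixed rather than estimated, the exponent is fitted by a coarse grid search, and the number of synthetic data sets defaults to 200 rather than 10,000. All function names are hypothetical.

```python
import math
import random

def pmf(alpha, x_min, x_max):
    """Bounded discrete power law p(x) = x^-alpha / Z(alpha)."""
    w = [x ** -alpha for x in range(x_min, x_max + 1)]
    z = sum(w)
    return [v / z for v in w]

def fit_alpha(data, x_min, x_max):
    """Coarse grid-search MLE over alpha in (0, 4] (assumption, for brevity)."""
    s = sum(math.log(x) for x in data)
    def loglik(a):
        z = sum(x ** -a for x in range(x_min, x_max + 1))
        return -a * s - len(data) * math.log(z)
    return max((0.05 * k for k in range(1, 81)), key=loglik)

def ks_statistic(data, alpha, x_min, x_max):
    """Largest gap between the empirical and model fractions at values >= x."""
    counts = [0] * (x_max - x_min + 1)
    for x in data:
        counts[x - x_min] += 1
    d, s_emp, s_mod = 0.0, 1.0, 1.0
    for c, q in zip(counts, pmf(alpha, x_min, x_max)):
        d = max(d, abs(s_emp - s_mod))
        s_emp -= c / len(data)
        s_mod -= q
    return d

def bootstrap_p_value(data, x_min, x_max, n_boot=200, seed=0):
    """Fraction of synthetic data sets whose KS statistic exceeds the observed one."""
    rng = random.Random(seed)
    data = [x for x in data if x_min <= x <= x_max]
    alpha = fit_alpha(data, x_min, x_max)
    d_obs = ks_statistic(data, alpha, x_min, x_max)
    support = list(range(x_min, x_max + 1))
    weights = pmf(alpha, x_min, x_max)
    exceed = 0
    for _ in range(n_boot):
        synth = rng.choices(support, weights=weights, k=len(data))
        a_b = fit_alpha(synth, x_min, x_max)  # refit each synthetic data set
        if ks_statistic(synth, a_b, x_min, x_max) > d_obs:
            exceed += 1
    return exceed / n_boot

toy = [1] * 12 + [2] * 6 + [3] * 4 + [4] * 3  # counts proportional to x^-1
print(bootstrap_p_value(toy, 1, 4))
```

Refitting each synthetic data set before computing its KS statistic matters: comparing synthetic data against the originally fitted model would bias the $p$-value upward.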

Mixture of power-law distributions well approximated by a single power-law distribution
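We show that the likelihood of a mixture of power-law distributions can be made as close as possible to that of a single power-law distribution. Although the argument is not mathematically rigorous, it provides an intuitive understanding of why a mixture of power-law distributions can be approximated using a single power-law distribution in several cases. Consider a mixture of power-law distributions:

$$p(x) = \sum_{k=1}^{K} \pi(k)\, p(x \mid k),$$

where $\pi(k)$ is the mixing weight and $p(x \mid k)$ is the power-law distribution with exponent $\alpha_k$ (> 1),

$$p(x \mid k) = \frac{x^{-\alpha_k}}{\zeta(\alpha_k, x_{\min})}.$$

We assume that the domain of each component distribution has a lower bound $x_{\min}$ but extends infinitely to the right without an upper bound. The log-likelihood of the data $D = \{x_1, \cdots, x_n\}$ is

$$\log L_{\mathrm{mix}} = \sum_{i=1}^{n} \log \sum_{k=1}^{K} \pi(k)\, p(x_i \mid k).$$

Using Jensen's inequality, one can bound this as

$$\log L_{\mathrm{mix}} \ge \sum_{i=1}^{n} \sum_{k=1}^{K} \pi(k) \log p(x_i \mid k) = -\bar{\alpha} \sum_{i=1}^{n} \log x_i - n \sum_{k=1}^{K} \pi(k) \log \zeta(\alpha_k, x_{\min}),$$

where $\bar{\alpha} \equiv \sum_{k=1}^{K} \pi(k)\, \alpha_k$. The right-hand side is hence denoted by $\mathcal{F}_{\mathrm{mix}}$. The log-likelihood for a single power-law distribution of the exponent $\bar{\alpha}$ is given as follows:

$$\begin{aligned} \log L_{\mathrm{single}} &= \sum_{i=1}^{n} \log \frac{x_i^{-\bar{\alpha}}}{\zeta(\bar{\alpha}, x_{\min})} \\ &= -\bar{\alpha} \sum_{i=1}^{n} \log x_i - n \log \zeta\!\left(\sum_{k=1}^{K} \pi(k)\, \alpha_k,\; x_{\min}\right) \\ &\ge -\bar{\alpha} \sum_{i=1}^{n} \log x_i - n \sum_{k=1}^{K} \pi(k) \log \zeta(\alpha_k, x_{\min}) = \mathcal{F}_{\mathrm{mix}}. \end{aligned} \tag{19}$$

To derive the inequality in the third row of Eq. (19), Jensen's inequality was used, together with the convexity of $\log \zeta(\alpha, x_{\min})$ in $\alpha$. According to probabilistic machine-learning theories (Bishop 2006), we can solve for $\pi(k)$ and $\alpha_k$ by maximizing $\mathcal{F}_{\mathrm{mix}}$. Because $\log L_{\mathrm{single}} \ge \mathcal{F}_{\mathrm{mix}}$, an increase in $\mathcal{F}_{\mathrm{mix}}$ raises the lower bound on $\log L_{\mathrm{single}}$. Therefore, maximizing $\mathcal{F}_{\mathrm{mix}}$ causes the single power-law distribution to fit the data more closely. If $\log L_{\mathrm{single}} \ge \log L_{\mathrm{mix}}$, then the single power-law distribution inherently fits the data at least as well as the mixture. Now, consider the case where $\log L_{\mathrm{single}} < \log L_{\mathrm{mix}}$. Since $\log L_{\mathrm{mix}} > \log L_{\mathrm{single}} \ge \mathcal{F}_{\mathrm{mix}}$, the mixture actually fits the data better than a single power-law distribution. Nevertheless, as $\mathcal{F}_{\mathrm{mix}}$ becomes as close to $\log L_{\mathrm{mix}}$ as possible by its maximization, $\log L_{\mathrm{single}}$, which lies between them, approaches $\log L_{\mathrm{mix}}$. This implies that the mixture can be approximated using a single power-law distribution.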

Testing the power-law hypothesis of ICIs in a single dyad
The set of 2,369 ICIs collected from all dyads, for which the power-law hypothesis was examined using the mCSN test, was a collection of subsets of ICIs collected from individual dyads. Our information-theoretic model predicts that the power-law hypothesis holds for individual dyads. To test this prediction, we examined whether the ICIs collected from a single dyad followed a power-law distribution. For this purpose, $x_{\max}$ is set to the maximum ICI.
We examined seven dyads (CHN-RUS, CHN-US, GMY-FRN, IND-PAK, IRN-IRQ, ISR-SYR, and RUS-US), each of which provided enough ICIs for statistical analysis. Nevertheless, the number was relatively low (from 20 to 39 ICIs), which may have caused an overestimation of the $p$-value of the mCSN test. Therefore, an obtained $p$-value larger than 0.1, which implies the plausibility of the power-law hypothesis, does not necessarily mean that competing hypotheses, typically the exponential-distribution hypothesis, are ruled out. To confirm that the power-law hypothesis is more likely than the exponential-distribution hypothesis, we compared the log-likelihood between the power-law and exponential-distribution hypotheses.
Let $X = \{x_1, \cdots, x_n\}$ be the set of ICIs collected from a certain dyad, where $x_i$ ($i = 1, \cdots, n$) denotes the $i$-th ICI. We conducted the mCSN test to estimate the lower bound $x_{\min}$ and the power-law exponent $\alpha$, while choosing the upper bound as $x_{\max} = \max_i x_i$.
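The log-likelihood comparison can be illustrated with the following Python sketch. It assumes a bounded discrete power law $p(x) \propto x^{-\alpha}$ and a bounded discrete exponential $p(x) \propto e^{-\lambda x}$ on the same domain, a fixed lower bound, and a mean difference taken over bootstrap resamples. The function names, the search intervals for $\alpha$ and $\lambda$, and the synthetic toy data are assumptions, not the authors' implementation.

```python
import math
import random

def golden_max(f, lo, hi, tol=1e-6):
    """Location of the maximum of a concave function f on [lo, hi]."""
    ratio = (math.sqrt(5) - 1) / 2
    while hi - lo > tol:
        c, d = hi - ratio * (hi - lo), lo + ratio * (hi - lo)
        if f(c) > f(d):
            hi = d
        else:
            lo = c
    return (lo + hi) / 2

def max_loglik_power(data, x_min, x_max):
    """Maximized log-likelihood of p(x) = x^-alpha / Z(alpha)."""
    s = sum(math.log(x) for x in data)
    def ll(a):
        z = sum(x ** -a for x in range(x_min, x_max + 1))
        return -a * s - len(data) * math.log(z)
    return ll(golden_max(ll, 0.0, 5.0))  # assumed search interval for alpha

def max_loglik_exp(data, x_min, x_max):
    """Maximized log-likelihood of p(x) = exp(-lam * x) / Z(lam)."""
    s = sum(x - x_min for x in data)  # shifting by x_min cancels in p(x)
    def ll(lam):
        z = sum(math.exp(-lam * (x - x_min)) for x in range(x_min, x_max + 1))
        return -lam * s - len(data) * math.log(z)
    return ll(golden_max(ll, 0.0, 1.0))  # assumed search interval for lam (per day)

def mean_loglik_difference(data, x_min, n_boot=100, seed=0):
    """Bootstrap mean of (power-law log-likelihood) - (exponential log-likelihood)."""
    rng = random.Random(seed)
    data = [x for x in data if x >= x_min]
    x_max = max(data)  # upper bound set to the longest observed interval
    total = 0.0
    for _ in range(n_boot):
        sample = rng.choices(data, k=len(data))
        total += max_loglik_power(sample, x_min, x_max) - max_loglik_exp(sample, x_min, x_max)
    return total / n_boot

# Synthetic toy data (not ICIs): counts proportional to x^-2 and to 0.85^x.
heavy = [x for x in range(1, 101) for _ in range(round(200 * x ** -2.0))]
light = [x for x in range(1, 41) for _ in range(round(100 * 0.85 ** x))]
print(mean_loglik_difference(heavy, 1), mean_loglik_difference(light, 1))
```

A positive mean difference favors the power law and a negative one favors the exponential distribution, which is the sign convention used for the mean differences reported in the supplementary tables.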

Testing independent generation of the ICI series from an identical distribution
Our information-theoretic model also predicts that the ICI series for each dyad is generated independently from an identical power-law distribution. To validate this, we conducted a statistical test to examine whether the autocorrelation was significantly different from zero. The first-order autocorrelation of $X$ is given by

$$A^{(1)}(X) = \frac{1}{n-1} \sum_{i=1}^{n-1} \frac{(x_i - \mu)(x_{i+1} - \mu)}{\sigma^2},$$

where $\mu \equiv \sum_{i=1}^{n} x_i / n$ and $\sigma^2 \equiv \sum_{i=1}^{n} (x_i - \mu)^2 / n$ are the mean and the variance, respectively (Goh & Barabási 2008). If the ICIs are independently and identically distributed, the autocorrelation theoretically vanishes. However, as the number of ICIs is limited (from 20 to 39 ICIs) in the dyads examined, $A^{(1)}(X)$ actually takes either a positive or negative value. Therefore, we tested the null hypothesis that $A^{(1)}(X)$ is approximately zero.
To this end, we generated $B = 10{,}000$ pseudo series $X_b$ ($b = 1, \cdots, B$) from $X$ by bootstrapping. Each pseudo series satisfied the independent and identically distributed conditions. We then calculated the first-order autocorrelations $A^{(1)}(X_b)$ for these pseudo series and examined their distribution. The 10% left and 10% right areas of this distribution were selected as rejection areas. The null hypothesis is rejected if $A^{(1)}(X)$ enters either the left or the right rejection area. If $A^{(1)}(X)$ entered the right rejection area, it was considered significantly positive. A positive $A^{(1)}(X)$ indicates the tendency of ICIs to become progressively longer or shorter. If $A^{(1)}(X)$ entered the left rejection area, it was considered significantly negative. A negative $A^{(1)}(X)$ implies an oscillating ICI series. If $A^{(1)}(X)$ enters neither the right nor the left rejection area, the null hypothesis cannot be ruled out. It is unlikely that higher-order autocorrelations are significantly positive or negative while the first-order autocorrelation vanishes. Therefore, if the above statistical test does not reject the null hypothesis, we conclude that the ICI series is free from autocorrelation, that is, the ICIs are independently generated from an identical distribution.
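The test described above can be sketched as follows. This is a minimal illustration, assuming resampling with replacement for the bootstrap and a normalization of the lag-one covariance by $n - 1$; the function names and the toy series are hypothetical.

```python
import random

def autocorr1(series):
    """First-order autocorrelation with a common mean and variance.

    Assumption: the lag-one covariance is averaged over the n - 1 pairs.
    """
    n = len(series)
    mu = sum(series) / n
    var = sum((x - mu) ** 2 for x in series) / n
    if var == 0:
        return 0.0  # constant series: treated as having no autocorrelation
    cov = sum((a - mu) * (b - mu) for a, b in zip(series, series[1:])) / (n - 1)
    return cov / var

def iid_test(series, n_boot=10_000, level=0.1, seed=0):
    """Compare the observed autocorrelation with a bootstrap null distribution.

    Returns (observed, lower threshold, upper threshold, verdict).
    """
    rng = random.Random(seed)
    null = sorted(
        autocorr1(rng.choices(series, k=len(series))) for _ in range(n_boot)
    )
    lower = null[int(level * n_boot)]                          # left rejection area
    upper = null[min(int((1 - level) * n_boot), n_boot - 1)]   # right rejection area
    observed = autocorr1(series)
    if observed > upper:
        verdict = "positive"
    elif observed < lower:
        verdict = "negative"
    else:
        verdict = "not rejected"
    return observed, lower, upper, verdict

# Toy series (not real ICIs): steadily growing and strictly alternating intervals.
print(iid_test(list(range(1, 31)))[3])
print(iid_test([1, 10] * 15)[3])
```

Resampling destroys any ordering in the series, so the pseudo series approximate what the autocorrelation would look like if the same interval lengths had been generated independently.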

Fig. S1.
The exponential-distribution parameter $\lambda$ was estimated for each dyad using the mCSN test. The estimated parameter $\hat{\lambda}$ for each dyad is indicated by the black horizontal bar. To examine the stability of this estimation, 100 pseudo series of ICIs are synthesized by bootstrapping, for each of which the parameter value is re-estimated. The filled blue circle and error bar indicate the mean and standard deviation of these values, respectively.

From 2,369 ICI samples collected from all dyads over the entire period (1816-2014), those supposed to be related to WWI, WWII, and the interwar period were removed. The same mCSN test procedures as those for Fig. 3 were applied to the remaining samples.

$n$: the number of ICI samples for each dyad. $\hat{x}_{\min}$: the estimated value of $x_{\min}$. $n_D$: the number of ICI samples within the domain $\hat{x}_{\min} \le x \le x_{\max}$. $n_D / n$: the ratio of ICI samples within the domain. $\hat{\lambda}$: the estimated value of $\lambda$. The bottom row lists the $p$-value of the mCSN test. For a $p$-value larger than the criterion of 0.1, as indicated by the asterisk (*), the exponential-distribution hypothesis is plausible. A $p$-value below the criterion is colored red.

The upper row lists the mean difference $\langle \log \hat{L}^{(\mathrm{p.l.})} \rangle_B - \langle \log \hat{L}^{(\mathrm{exp})} \rangle_B$ for each dyad. The mean $\langle \log \hat{L}^{(\mathrm{p.l.})} \rangle_B$ was calculated by averaging the log-likelihood for the power-law hypothesis over $B = 100$ pseudo series of ICIs generated using the bootstrap process. The mean $\langle \log \hat{L}^{(\mathrm{exp})} \rangle_B$ of the log-likelihood for the exponential-distribution hypothesis was calculated similarly. Positive values of the quantity $\langle \log \hat{L}^{(\mathrm{p.l.})} \rangle_B - \langle \log \hat{L}^{(\mathrm{exp})} \rangle_B$, colored red, indicate that the exponential-distribution hypothesis is less likely than the power-law hypothesis. The bottom row lists the $p$-value of the paired $t$-test for each dyad to demonstrate the significance of the negativity or positivity of this quantity.

The upper row lists the mean difference $\langle \log \hat{L}^{(\mathrm{p.l.})} \rangle_B - \langle \log \hat{L}^{(\mathrm{exp})} \rangle_B$ for each era. The bottom row lists the $p$-value of the paired $t$-test for each era to demonstrate the significance of the negativity or positivity of this quantity. Notations and calculations used here are the same as those used for Table 2.

The upper row lists the mean difference $\langle \log \hat{L}^{(\mathrm{p.l.})} \rangle_B - \langle \log \hat{L}^{(\mathrm{exp})} \rangle_B$ for each era. The bottom row lists the $p$-value of the paired $t$-test for each era to demonstrate the significance of the negativity or positivity of this quantity. Notations and calculations used here are the same as those used for Table S2.
Bohorquez et al. 2009; Cederman et al. 2011; Friedman 2015; Cirrilo & Taleb 2016; Gonzalez 2016; Clauset 2018; Spagat et al. 2018; Cunen et al. 2020).The power-law distribution of war sizes, characterized as fat-tailed, implies the possible occurrence of black swan events, such as World War I (WWI) or World War II (WWII).It has also been shown that the severity of other forms of human violence, such as civil war, insurgency, or terrorist attacks, follows the power law (Clauset et al. 2007; Bohorquez et al. 2009; Clauset & Gleditsch 2012; Johnson et al. 2013).Finding statistical patterns in the severity of war and other human violence has inspired the exploration of the mechanism for the escalation of violence, typically attributed to 'critical phenomena' resulting from the operation of positive feedback loops (Cederman, 2003; Bohorquez et al. 2009; Cederman et al. 2011; Johnson et al. 2011; Clauset & Gleditsch 2012; Johnson et al. 2013; DiVita 2020; Johnson-Restrepo et al. 2020).This robust statistical pattern is also used to infer the actual number of casualties of inadequately recorded wars or to examine the risk of the future occurrence of huge wars such as WWI or WWII (Scharpf et al. 2014; Friedman 2015; Cirrilo & Taleb 2016; Clauset 2018; Cunen et al. 2020).

Figure 1: Inter-conflict intervals (ICIs).The ICI is the interval between the end of a conflict in a dyad and the start of the next conflict in the same dyad.Each conflict is indicated by the red rectangle.

Figure 2: Distribution of 2,369 ICI samples collected from all dyads, shown in log-log (a) and linear-log (b) plots. The bin width for the distribution was chosen as 365.25 days (~one year). The dashed blue line in each panel indicates the linear regression results.

Figure 3: Results of the mCSN test applied to 2,369 ICI samples collected from all dyads. This test reveals the plausibility of the power-law hypothesis expressed in the form $p(x) \propto x^{-\alpha}$ for $x_{\min} \le x \le x_{\max}$, where $x_{\min}$ and $x_{\max}$ are the lower and upper bounds of the domain in which the power law holds, respectively. The upper bound $x_{\max}$ is treated as a control parameter, and the optimal values of $\alpha$ and $x_{\min}$ are estimated for each value of $x_{\max}$. (a) The $p$-value of the mCSN test is plotted as a function of $x_{\max}$. The horizontal dashed line indicates the criterion of 0.1; for a $p$-value above it, the power-law hypothesis is plausible. (b) The estimated power-law exponent $\hat{\alpha}$ is plotted as a function of $x_{\max}$. (c) The estimated lower bound $\hat{x}_{\min}$ is plotted as a function of $x_{\max}$. (d) The ratio of ICIs (out of the total, 2,369) that fall in the power-law holding domain ($\hat{x}_{\min} \le x \le x_{\max}$) is plotted as a function of $x_{\max}$.

Figure 4: Graphical models describing the causal relations between stochastic variables representing consecutive occurrences of conflicts   and  +1 .Stochastic variables   and   represent the time of occurrence of conflict   and the military operations taken during the course of this conflict, respectively.(a) A graphical model representing the causal relations between   ,   ,  +1 , and  +1 is shown on the left side, which corresponds to the joint probability (  ,   ,  +1 ,  +1 ).The illustration on the right-

Figure 6: (a) Illustration of a point process whose point-to-point intervals are supposed to follow a power-law distribution. The chain of filled black circles represents an original point process supposed to follow a power-law distribution. This process is diluted by probabilistically maintaining or discarding each point. The chain of maintained points, indicated by the open red circles surrounding them, constitutes a diluted point process. The original point process models true occurrences of conflict in history, regardless of whether they are recorded in the dataset. The diluted point process models conflict occurrences that are actually recorded in the dataset. (b) An original point process following a power-law distribution ($p(x) \propto x^{-\alpha}$ for $x = 1, 2, 3, \cdots$) is diluted with the maintaining probability $q$ (hence with the discarding probability $1 - q$). We applied the mCSN test to the point-to-point intervals of 100 processes obtained by the probabilistic dilution. The power-law exponent $\hat{\alpha}$ estimated by this test is plotted as a function of $q$ (upper and lower panels for $\alpha = 1.5$ and $\alpha = 2.0$, respectively). The horizontal dotted lines in both panels indicate the power-law exponent $\alpha$ of the original process. (c) The $p$-value of the mCSN test is plotted as a function of $q$ (upper and lower panels for $\alpha = 1.5$ and $\alpha = 2.0$, respectively). The horizontal dotted lines in both panels indicate the criterion of 0.1; for a $p$-value above it, the power-law hypothesis is plausible. In (b) and (c), the error bars indicate the standard deviations.

Figure 8: ICIs in each dyad are independently generated from an identical distribution. From the actual series of ICIs in each dyad, 10,000 pseudo series were synthesized by bootstrapping. ICIs in each pseudo series conform to independent generation from an identical distribution. The curve in each panel shows the distribution of first-order autocorrelations calculated for the 10,000 pseudo series. The left and right 10% areas (blue and red shading, respectively) are the rejection areas for the null hypothesis that the actual series of ICIs has no first-order autocorrelation. The vertical solid line in each panel indicates the first-order autocorrelation $A^{(1)}(X)$ of the actual series.

Table 1 :
Results of the mCSN test of the power-law hypothesis expressed in the form $p(x) = x^{-\alpha} / Z(\alpha)$ for $x_{\min} \le x \le x_{\max}$. Here, the value of $x_{\max}$ is chosen as the maximum length of ICI samples, and the normalization factor is given by $Z(\alpha) = \sum_{x = x_{\min}}^{x_{\max}} x^{-\alpha}$.
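According to probabilistic machine-learning theories (Bishop 2006), we can solve for $\pi(k)$ and $\alpha_k$ by maximizing $\mathcal{F}_{\mathrm{mix}}$. For $\log L_{\mathrm{single}} \ge \mathcal{F}_{\mathrm{mix}}$, an increase in $\mathcal{F}_{\mathrm{mix}}$ leads to an increase in $\log L_{\mathrm{single}}$. Therefore, maximizing $\mathcal{F}_{\mathrm{mix}}$ causes the single power-law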

Table S1 .
Results of the mCSN test of the exponential-distribution hypothesis expressed in the form $p(x) = e^{-\lambda x} / Z(\lambda)$ for $x_{\min} \le x \le x_{\max}$. Here, the value of $x_{\max}$ is chosen as the maximum length of ICI samples, and the normalization factor is given by $Z(\lambda) = \sum_{x = x_{\min}}^{x_{\max}} e^{-\lambda x}$.

Table S5 .
Results of the mCSN test of the exponential-distribution hypothesis for the DUC-ICIs for each of the following eras: (i) 1816-1858, (ii) 1859-1899, (iii) 1900-1946, (iv) 1947-1989, and (v) 1990-2014. Notations used here are the same as those used for Table S1. For a $p$-value of the mCSN test larger than the criterion of 0.1, as indicated by the asterisk (*), the exponential-distribution hypothesis is plausible. A $p$-value below the criterion is colored red.