Explicit size distributions of failure cascades redefine systemic risk on finite networks

How big is the risk that a few initial failures of nodes in a network amplify to large cascades that span a substantial share of all nodes? Predicting the final cascade size is critical to ensure the functioning of a system as a whole. Yet, this task is hampered by uncertain and missing information. In infinitely large networks, the average cascade size can often be estimated by approaches building on local tree and mean-field approximations. Yet, as we demonstrate, in finite networks this average need not be a likely outcome. Instead, we find broad and even bimodal cascade size distributions. This phenomenon persists for system sizes up to 10^7 and for different cascade models, i.e., it is relevant for most real systems. To show this, we derive explicit closed-form solutions for the full probability distribution of the final cascade size. We focus on two topological limit cases: the complete network, representing a dense network with a very narrow degree distribution, and the star network, representing a sparse network with an inhomogeneous degree distribution. These topologies are of great interest, as they either minimize or maximize the average cascade size and are common motifs in many real-world networks.
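The broad and bimodal distributions described above can be explored with a short Monte Carlo sketch. The model below is purely illustrative and not the paper's exact parametrization: it assumes uniform thresholds on [0, 1] and a hypothetical linear load function λ[k] = 0.2 + 1.2k/N that grows with the number k of failed nodes. Because the load is non-decreasing in k, the cascade on a complete network can be resolved by scanning the sorted thresholds.

```python
import random

def cascade_size(thresholds, load):
    """Final number of failures on a complete network.

    A node fails once the current load exceeds its threshold. Since the
    load only grows with each failure, scanning the sorted thresholds
    against the growing load yields the final cascade size.
    """
    thresholds = sorted(thresholds)
    k = 0
    while k < len(thresholds) and thresholds[k] < load(k):
        k += 1
    return k

def simulate(N, load, trials=10000, seed=0):
    """Sample the final cascade size distribution by Monte Carlo."""
    rng = random.Random(seed)
    return [cascade_size([rng.random() for _ in range(N)], load)
            for _ in range(trials)]

# Illustrative parameters (assumptions, not the paper's choices):
# initial load 0.2, each failure adds 1.2/N of load to every survivor.
N = 50
load = lambda k: 0.2 + 1.2 * k / N
sizes = simulate(N, load)
```

With parameters closer to the critical regime (load slope near the threshold density), the histogram of `sizes` becomes broad or bimodal rather than concentrated around its mean.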


Supporting Information (SI)
Derivation of the cascade size distribution for complete networks

We offer two different representations of the final cascade size distribution for complete networks: one that enumerates all possible combinations of failures that could have led to a cascade outcome, and one that can be deduced from the inclusion-exclusion principle and whose implementation requires fewer computational resources. Both representations rely on combinatorial arguments, which we explain next. First, we note that every non-failed node in the network is in exactly the same state. A node carries the load λ[k] if k other nodes have failed already. If ρ = k/N is the final cascade outcome, we know that all N − k surviving nodes must have thresholds larger than this load λ[k]. The probability of this event is (1 − F(λ[k]))^{N−k}, where F denotes the cumulative distribution function of the thresholds, since we assume that the thresholds are independently distributed. There are \binom{N}{k} different ways to choose which k of the N nodes fail. The remaining k nodes fail altogether with a probability p_k that needs to be determined.
In summary, we can write the cascade size distribution as

\[ P\left(\rho = \frac{k}{N}\right) = \binom{N}{k} \left(1 - F(\lambda[k])\right)^{N-k} p_k. \tag{1} \]

The probability p_k that all k of the remaining nodes fail is equal to the probability that all nodes fail in a network consisting of only those k nodes, since non-failed nodes do not influence the amount of load that any of the k nodes carries. Whether the load that a node carries is enough to cause its failure is determined by the node's threshold, while the entire configuration of all thresholds defines the order of the nodes' failures.
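To make the structure of p_k concrete, the two smallest cases can be written out by direct case analysis, using only the definitions above:

```latex
% k = 1: the single node must fail under the initial load \lambda[0].
p_1 = F(\lambda[0]).

% k = 2: either both nodes fail initially, or one fails initially and
% the other fails after receiving the redistributed load \lambda[1].
p_2 = F(\lambda[0])^2 + 2\, F(\lambda[0]) \left( F(\lambda[1]) - F(\lambda[0]) \right)
    = 2\, F(\lambda[0])\, F(\lambda[1]) - F(\lambda[0])^2.
```

Already for k = 2 the ordering of failures matters: the second node may fail either under the initial load or only after the first failure has raised the load to λ[1].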
But not all threshold configurations lead to exactly k failures. For instance, it is not enough that all k nodes have thresholds below the maximal load λ[k − 1]: the failures must also be able to occur in a consistent order, since the load increases only step by step with every additional failure. We therefore have to identify all threshold configurations that lead to the failure of exactly k nodes and sum the probabilities of all those configurations in order to calculate p_k. Since each node's threshold is independently distributed, the probability of each such configuration factorizes. A formal proof is given below.

Proof of Equation (4) for complete networks
We prove that the final cascade size distribution P(ρ = k/N) on complete networks can be expressed as given in Eqn. (1), with the p_k as given in Eqn. (4). The probabilistic inclusion-exclusion principle (1) states that for events A_1, ..., A_n in an arbitrary probability space, at least one of these events occurs with probability

\[ P\left(\bigcup_{i=1}^{n} A_i\right) = \sum_{j=1}^{n} (-1)^{j+1} \sum_{I_j} P\left(\bigcap_{l \in I_j} A_l\right), \]

where I_j = {l_1, ..., l_j} is a set containing exactly j distinct indices l_i ∈ {1, ..., n}. In case the probability q_j = P(⋂_{l ∈ I_j} A_l) only depends on the number j of events that are intersected, we have

\[ P\left(\bigcup_{i=1}^{n} A_i\right) = \sum_{j=1}^{n} (-1)^{j+1} \binom{n}{j} q_j, \]

as there exist \binom{n}{j} different subsets of {1, ..., n} with j elements.
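The inclusion-exclusion principle is easy to verify numerically. The following sketch checks both sides of the identity on a small, hypothetical finite probability space (a uniform measure on twelve points with three illustrative events), using exact rational arithmetic:

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical finite probability space: uniform measure on {0, ..., 11}.
omega = set(range(12))
P = lambda event: Fraction(len(event), len(omega))

# Three illustrative events: multiples of 2, multiples of 3, numbers < 5.
A = [{x for x in omega if x % 2 == 0},
     {x for x in omega if x % 3 == 0},
     {x for x in omega if x < 5}]

# Left-hand side: probability that at least one event occurs.
lhs = P(set.union(*A))

# Right-hand side: alternating sum over all non-empty index sets I_j.
rhs = sum((-1) ** (j + 1) *
          sum(P(set.intersection(*(A[l] for l in I)))
              for I in combinations(range(len(A)), j))
          for j in range(1, len(A) + 1))

assert lhs == rhs
```

Because the measure is uniform, every intersection probability is a ratio of set sizes, so the check is exact rather than approximate.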
To prove Equation (4), we need to define appropriate events A_j so that p_k = P(⋃_{j=1}^{k} A_j) and q_j = F(λ[k − j])^j p_{k−j}, where we associate the index j with j = k − l. Then Equation (4) follows immediately from the stated principle:

\[ p_k = \sum_{j=1}^{k} (-1)^{j+1} \binom{k}{j} F(\lambda[k-j])^j \, p_{k-j}. \tag{4} \]

First, we recall that p_k is the probability that all k nodes fail in a network consisting of k nodes.

Derivation of the cascade size distribution for star networks

We simply add the probabilities for all the different cases and obtain the final cascade size distribution for star networks. The first summand considers the case when the center does not fail, while the second term adds the probability for the case when the center fails initially. Then, each of the other k − 1 failures of leaves can either occur initially or because of a load distribution by the center. The size of this load might depend on the number l of nodes that fail initially together with the center, since these nodes cannot receive load after the failure of the center. The index j in the third term finally takes into account the events in which j leaves have failed before the center.
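For complete networks, the inclusion-exclusion recursion for p_k, p_k = Σ_{j=1}^{k} (−1)^{j+1} C(k, j) F(λ[k−j])^j p_{k−j} with p_0 = 1, combined with Eqn. (1) yields a numerically exact algorithm for the full distribution. The sketch below assumes uniform thresholds on [0, 1] and an illustrative linear load function; both choices are assumptions for demonstration, not the paper's specific parametrization.

```python
from math import comb

def cascade_distribution(N, F, lam):
    """Exact final cascade size distribution on a complete network.

    p[k] follows the inclusion-exclusion recursion
      p_k = sum_{j=1}^{k} (-1)^{j+1} C(k, j) F(lam(k-j))^j p_{k-j},
    with p_0 = 1, and the distribution of rho = k/N is
      P(rho = k/N) = C(N, k) (1 - F(lam(k)))^{N-k} p_k.
    """
    p = [1.0]
    for k in range(1, N + 1):
        p.append(sum((-1) ** (j + 1) * comb(k, j)
                     * F(lam(k - j)) ** j * p[k - j]
                     for j in range(1, k + 1)))
    return [comb(N, k) * (1 - F(lam(k))) ** (N - k) * p[k]
            for k in range(N + 1)]

# Illustrative choices (assumptions): uniform thresholds, linear load.
N = 20
F = lambda x: min(max(x, 0.0), 1.0)   # CDF of thresholds uniform on [0, 1]
lam = lambda k: 0.1 + 0.8 * k / N     # hypothetical load after k failures
dist = cascade_distribution(N, F, lam)
assert abs(sum(dist) - 1.0) < 1e-9    # probabilities sum to one
```

The normalization check is a useful sanity test: the alternating signs in the recursion can cause cancellation, so for very large N a higher-precision arithmetic type may be preferable to floats.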