Reducing Cascading Failure Risk by Increasing Infrastructure Network Interdependency

Increased coupling between critical infrastructure networks, such as power and communication systems, will have important implications for the reliability and security of these systems. To understand the effects of power-communication coupling, several researchers have studied interdependent network models and reported that increased coupling can increase system vulnerability. However, these results come from models whose cascading mechanisms differ substantially from those found in actual power and communication networks. This paper reports on two sets of experiments that compare the network vulnerability implications resulting from simple topological models and models that more accurately capture the dynamics of cascading in power systems. First, we compare a simple model of topological contagion to a model of cascading in power systems and find that the power grid shows a much higher level of vulnerability, relative to the contagion model. Second, we compare a model of topological cascades in coupled networks to three different physics-based models of power grids coupled to communication networks. Again, the more accurate models suggest very different conclusions. In all but the most extreme case, the physics-based power grid models indicate that increased power-communication coupling decreases vulnerability. This is the opposite of what one would conclude from the coupled topological model, in which zero coupling is optimal. Finally, an extreme case in which communication failures immediately cause grid failures suggests that if systems are poorly designed, increased coupling can be harmful. Together these results suggest design strategies for reducing the risk of cascades in interdependent infrastructure systems.


Introduction
Understanding the reliability and security implications of increased coupling between interdependent power, water, transportation and communication infrastructure systems is critical, given the vital services that these infrastructures provide and the continuing threats posed by natural disasters and terrorist attacks [3,4]. This is particularly true for the coupling between electric power and communications networks, given the essential nature of electric power to modern societies, the rapid growth of smart grid technology [5], and the potential for cascading failure to lead to catastrophic blackouts [6]. Smart grid systems, such as advanced metering infrastructure and microprocessor-based controls, can be valuable tools for mitigating these risks [7]. But automation can also introduce new failure mechanisms: cyber-attacks may reach a larger number of critical components [8] and outages may propagate between the connected networks, increasing the risk of massive failures.
In order to quantify the risks and benefits of network interdependency, models are needed that at least approximately represent the potential for cascading within a power grid, as well as between power and communication networks. A variety of models have been suggested for understanding the mechanisms by which failures, ideas, and diseases propagate within individual networks [9-14]. Simple models clearly show that different types of networks can respond very differently to random failures and attacks [15-20], with some suggesting that contagion-style models can be used for power systems analysis [21-24].
However, power grids differ in important ways from these simple models. In a contagion-style model, failures propagate locally: when component i fails, the next component to fail is typically topologically connected to i. Power grids are engineered networks, in which energy flows from generators to loads through power lines (edges), each of which has a limit on the amount of electrical flow it can tolerate. When node (substation) or edge (transmission line) failures occur, power re-routes according to Kirchhoff's and Ohm's laws. This rerouting increases flows along parallel paths, which can subsequently trigger long chains of component failures, potentially leading to a wide-area blackout [6,25]. As a result of this process, failures propagate non-locally: the next component to fail may be hundreds of miles or tens of edges distant from the previous failure. Thus, overly simple topological models can lead to misleading conclusions [26] (Figure 1).
On the other hand, simple models can often suggest new approaches to a particular problem, particularly when there is limited understanding, as is the case with vulnerability in interdependent networks. Motivated, at least in part, by increasing interdependency between power and communications networks, a number of recent studies suggest that interdependency can increase vulnerability in network structures that were otherwise relatively robust [1,2,28,29]. Others have found nonlinear relationships between the level of coupling between interdependent networks and network performance, suggesting that there exists an optimal level of coupling [24,27,30]. More recent results suggest that under some conditions, coupling between networks can improve network performance [31].
However, the results above come from models that diverge from real infrastructure networks in important ways. First, the topological structures found in infrastructure networks differ notably from standard abstract models [32], largely due to geographical and cost constraints [33]. Second, the physical mechanisms of cascading within networks (see Figure 1) and between interdependent networks (see Figure 3) are notably different from those of the percolation-style models in [1,27,34]. Recent results suggest that modeling the physics of power flows can have important impacts on the implications from interdependent infrastructure network models [35,36]. In order to understand the extent to which insights from abstracted network models can be useful for particular types of interdependent networks (such as power and communications networks), comparisons are needed between simple models and those that more accurately capture the topology, physics, and coupling of particular infrastructure systems.
Therefore, the goal of this paper is to understand the impact of network topology, cascading mechanisms (physics), and coupling on infrastructure network vulnerability, using power grids coupled to communication networks as an illustrative test case. Two sets of numerical experiments combine to address this goal. The first set of experiments compares the relative vulnerability of different types of networks to random disturbances for two different models of intra-network cascading: a simple contagion model and a model that more accurately captures the mechanisms of cascading in power grids. This comparison shows how the use of different cascading failure models can change one's conclusions about network vulnerability. The second set measures the vulnerability impacts of increased coupling between interdependent networks given different models of cascading mechanisms, comparing the conclusions that would be reached from the different models.

Results
In each experiment that follows, we measure the response of different types of networks to random node failures. Following the notation in [1], p is the fraction of the n nodes in each graph that remain in service immediately after an initial, random set of approximately (1 − p)n node failures. The ultimate impact of each cascade is measured by the number of nodes remaining within the largest (giant) connected component of the graph, |GC|, after the cascade has subsided. Also, following the notation in [1], we measure network robustness (or conversely vulnerability) by estimating the probability of the post-cascade giant component including more than half of the nodes, i.e., Pr(|GC| > 0.5n). In this paper, each estimate of Pr(|GC| > 0.5n) comes from 1000 random sets of outages.
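The robustness estimate described above can be sketched as a small Monte Carlo loop. This is only an illustration under stated assumptions: the graph is an undirected edge list over nodes 0..n-1, no cascade model is applied (only the initial random outages), and the names `giant_component_size` and `estimate_robustness` are hypothetical rather than taken from the authors' code.

```python
import random
from collections import deque

def giant_component_size(n, edges, alive):
    """Size of the largest connected component among surviving nodes."""
    adj = {v: [] for v in range(n) if alive[v]}
    for a, b in edges:
        if alive[a] and alive[b]:
            adj[a].append(b)
            adj[b].append(a)
    seen, best = set(), 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        queue, size = deque([start]), 0
        while queue:  # breadth-first search over one component
            v = queue.popleft()
            size += 1
            for w in adj[v]:
                if w not in seen:
                    seen.add(w)
                    queue.append(w)
        best = max(best, size)
    return best

def estimate_robustness(n, edges, p, trials=1000, seed=0):
    """Estimate Pr(|GC| > 0.5 n) after about (1 - p) n random node failures.
    Assumption: no cascade is simulated, only the initial outages."""
    rng = random.Random(seed)
    n_fail = round((1 - p) * n)
    hits = 0
    for _ in range(trials):
        failed = set(rng.sample(range(n), n_fail))
        alive = [v not in failed for v in range(n)]
        if giant_component_size(n, edges, alive) > 0.5 * n:
            hits += 1
    return hits / trials
```

In a fuller experiment, a cascade model would run between the initial outages and the giant-component measurement.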
In order to compare networks with different topological structures, five topologies were studied. Power network data come from a model of the Polish power grid [37], which has n = 2383 nodes (buses) and m = 2886 edges (transmission lines or transformers), after removing parallel edges. For comparison, four synthetic networks were generated according to the standard Erdős-Rényi (ER) [38], random regular (RR) [39], preferential attachment (scale-free, SF) [40], and square lattice generating functions [41]. In each case, the generating functions were adapted to produce graphs with exactly n = 2383 nodes and m = 2886 edges, and therefore the same average nodal degree (i.e., k = 2m/n ≈ 2.422). In order to create power network data from each of the synthetic graphs, each of the generators (sources) and loads (sinks) in the Polish network was randomly assigned to one node in each network, and each edge a → b was assigned an impedance of Z_ab = i pu, where i = √−1. The line limits were assigned in order to ensure that no single edge failure would initiate a cascading failure (see Materials and Methods).
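As a quick check of the average degree quoted above: each edge contributes to the degree of two nodes, so the degrees sum to 2m, and dividing by n gives

```latex
\[
k = \frac{2m}{n} = \frac{2 \times 2886}{2383} = \frac{5772}{2383} \approx 2.422
\]
```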

Intra-network cascading
The first set of experiments compares the robustness of these five topologies given two different models of cascade propagation. The first model is a simple model of topological cascading, proposed by Watts in [10] (see Figure 1A). In this model, after the initial set of (1 − p)n node failures, node i fails if the fraction of node i's neighbors that are in a failed state exceeds some threshold φ_i. In our experiment, each φ_i is randomly drawn from a uniform distribution over (0, 1).
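The threshold rule above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes synchronous updates (all nodes are checked against the same snapshot in each round), and the function name `watts_cascade` is hypothetical.

```python
import random

def watts_cascade(n, edges, initially_failed, thresholds):
    """Run a Watts-style threshold cascade; return the final set of failed nodes.
    Assumption: synchronous rounds; isolated nodes never fail by contagion."""
    neighbors = [[] for _ in range(n)]
    for a, b in edges:
        neighbors[a].append(b)
        neighbors[b].append(a)
    failed = set(initially_failed)
    while True:
        newly_failed = []
        for v in range(n):
            if v in failed or not neighbors[v]:
                continue
            frac = sum(w in failed for w in neighbors[v]) / len(neighbors[v])
            if frac > thresholds[v]:  # fraction of failed neighbors exceeds phi_v
                newly_failed.append(v)
        if not newly_failed:
            return failed
        failed.update(newly_failed)

def random_thresholds(n, seed=0):
    """Draw each threshold phi_i uniformly at random from [0, 1)."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(n)]
```

A call such as `watts_cascade(n, edges, failed, random_thresholds(n))` would then give the post-cascade failure set, from which |GC| can be measured.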
The second model more accurately represents the dynamics of cascading overloads on transmission lines (edges) in power grids (see [42]). In this physics-based cascading model, the failure of edges results in the redistribution of power flows along parallel paths according to Kirchhoff's and Ohm's laws, using the "dc power flow" linearization of the non-linear power flow equations (see SI Text). This new distribution of flows can cause edges to be overloaded, possibly inducing further edge failures (see Figure 1B). If edge failures cause the network to fracture into separate connected components, power sources (generators) and power sinks (loads) adjust to arrive at a new balance between supply and demand. Once started, cascades continue until no overloaded edges remain.
Figure 2 shows the robustness of the networks to random failures of various sizes (from p = 0.65 to p = 1), for these two models. The results show some notable similarities. For both models of cascading, the power grid and lattice structures are the most vulnerable, and the scale-free topology is the most robust. In fact, the relative order of the five networks is nearly identical in Figures 2A and 2B. On the other hand, the power grid model accentuates the vulnerability differences between the different topologies, and changes the nature of the transition in p. In the power grid model, we do not observe the rapid, second-order phase transition that is apparent in the topological model; transitions as p → 1 are more gradual. The midpoint of the transition, p_c = {p : Pr(|GC| > 0.5n | p) = 0.5}, for the power grid and lattice structures increases, indicating that these networks produce large cascades after a smaller number of node failures.
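One plausible way to locate the transition midpoint p_c from a sampled robustness curve is linear interpolation between the two sampled values of p that bracket a probability of 0.5. The helper below is a hypothetical sketch; the paper does not state how p_c was extracted from the simulation results.

```python
def transition_midpoint(ps, probs, level=0.5):
    """Interpolate the p at which the robustness curve crosses `level`.
    `ps` must be sorted in increasing order; returns None if no crossing exists."""
    for i in range(len(ps) - 1):
        y0, y1 = probs[i], probs[i + 1]
        if y0 != y1 and min(y0, y1) <= level <= max(y0, y1):
            # Linear interpolation between the two bracketing samples.
            return ps[i] + (level - y0) * (ps[i + 1] - ps[i]) / (y1 - y0)
    return None
```

A larger p_c means that the curve drops below 0.5 after fewer initial node failures.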

Inter-network cascading
In the second set of experiments, we consider a pair of interdependent networks (i.e., power grid and communications network, denoted hereafter by N_P and N_C, respectively), in which a fraction q (degree of coupling) of the n nodes in N_P are coupled to nodes in N_C. As in the first set of experiments, two different types of models are compared.
The first model is an implementation of the interdependent cascade/percolation model proposed in [1]. In this model, when a node fails, its edges in networks N_P and N_C immediately fail. If the removed edges result in unconnected clusters in N_P (or N_C), then the edges linking the clusters in N_C (or N_P) fail. This cascading process continues until both N_C and N_P have the same set of clusters. Henceforth, this model will be referred to as the "Coupled Topological Model" (see Figure 3A).
While it is clear that the interdependency created by smart grid technology will impact the vulnerability of power networks to attacks and failures, it is not clear exactly what mechanisms of inter-network cascading will exist as interdependency increases. Therefore, in a second set of coupled network models, we model three different possibilities for the nature of this coupling. In all three "Smart Grid" (SG) models, cascades are allowed to propagate within the power grid, as in the previous model, with the exception that the communication network is used to collect measurements from and distribute control commands to nodes in N_P, if there is a grid-comm connection at this node (Figure 3). In the smart grid models, measurements are used to monitor for overloaded transmission lines (edges in N_P). When overloads occur, the measurements are used to select optimal control signals (load and generation shedding), which are distributed to generators and loads at nodes in N_P that are connected to N_C (see SI Text).
In the first smart grid model, "Ideal Smart Grid," communications nodes continue to operate, even if nodes in N_P fail. This corresponds to the case where N_C has reliable battery backup systems that allow it to continue to operate when power failures occur, as is common practice in the design of modern SCADA (Supervisory Control and Data Acquisition) systems.
In our second smart grid model, "Non-ideal SG," communication nodes fail with a probability that is proportional to the amount of local load shedding. For example, if the power supply for a node in N_C has had 50% load shedding, the comm node has a 0.5 probability of failure. We further assume that there is a centrally located control center that manages the network. The communications system is modeled such that grid components can be monitored and controlled only when there is a functional communications network path between the control center and a particular grid node. If comm node/edge failures cause N_C to fracture into clusters, signals can only pass within the cluster where the control center is located (see Figure 3).
Finally, in our third smart grid model, "Vulnerable Smart Grid," generators and loads fail immediately when the communications node to which they are connected fails.
Figure 4: Robustness of the Polish network to random failures, with varying levels of coupling, q. Panel (A) shows results from four different models of cascading in power grids, three of which are coupled to communications systems, after 5% of nodes initially failed (p = 0.95). In this case we measured robustness with the fraction of load served after the cascade had subsided (P_T). Panel (B) reports analogous results from the coupled topological model, for several different failure sizes, with robustness measured as in Fig. 2.
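The Coupled Topological Model described above can be sketched as an alternating pruning loop. This is a simplified illustration, not the implementation from [1]: it assumes full one-to-one coupling (q = 1), with node i in the power network depending on node i in the comm network, and the function names are hypothetical.

```python
def components(n, edges, alive):
    """Map each surviving node to a component label, using union-find."""
    parent = list(range(n))

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for a, b in edges:
        if alive[a] and alive[b]:
            parent[find(a)] = find(b)
    return {v: find(v) for v in range(n) if alive[v]}

def coupled_topological_cascade(n, edges_p, edges_c, failed):
    """Prune edges alternately until both networks share the same clusters.
    Returns the size of the largest surviving cluster.
    Assumption: q = 1, with node i coupled to node i in the other network."""
    alive = [v not in failed for v in range(n)]
    ep = {e for e in edges_p if alive[e[0]] and alive[e[1]]}
    ec = {e for e in edges_c if alive[e[0]] and alive[e[1]]}
    while True:
        comp_p = components(n, ep, alive)
        # Comm edges that bridge two different power clusters fail.
        new_ec = {(a, b) for a, b in ec if comp_p[a] == comp_p[b]}
        comp_c = components(n, new_ec, alive)
        # Power edges that bridge two different comm clusters fail.
        new_ep = {(a, b) for a, b in ep if comp_c[a] == comp_c[b]}
        if new_ep == ep and new_ec == ec:
            break
        ep, ec = new_ep, new_ec
    sizes = {}
    for label in components(n, ep, alive).values():
        sizes[label] = sizes.get(label, 0) + 1
    return max(sizes.values(), default=0)
```

When the loop stops, every surviving edge in each network lies inside a single cluster of the other network, so the two cluster partitions coincide.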
To build semi-realistic coupled network topologies, we used the data for the Polish power grid for N_P and connected a fraction q of the n nodes to a comm network N_C. Because both are geographically embedded networks, N_P and N_C are likely to be somewhat, but not perfectly, correlated. To simulate this correlation, N_C was initialized to be identical to N_P, and then 10% of the edges in N_C were randomly rewired.
After initializing the data and models, the various models were, as before, subjected to random node failures, and the performance of the networks was measured. For the Coupled Topological results, we measured network performance using the giant component probability Pr(|GC| > 0.5n). For the power grid models, we used an analogous measure of performance: the probability that the network can serve at least 50% of the load in the network, after the cascade has subsided. Figure 4 shows the results for fixed failure sizes (1 − p), and varying levels of coupling, q. For q = 0 (i.e., uncoupled networks), the smart grid models produce results that are identical to the uncontrolled power grid, since cascading occurs only within the power grid and the comm network neither helps nor harms the system. As q increases, the robustness of the Ideal and Non-ideal Smart Grid models increases monotonically. For the Vulnerable Smart Grid model, robustness decreases monotonically with q. In contrast, for the Coupled Topological model, robustness initially decreases with q, and then increases, with the "optimal" level of coupling being q = 0 for all p. It is interesting to note that the behavior of both types of model contrasts sharply with the results in [27], which suggest that there exists an optimal level of coupling between q = 0 and q = 1.
In order to compare the Non-ideal Smart Grid model to the Coupled Topological model in more detail for different types of topological structures, we took the four additional network topologies from Figure 2, and connected them to correlated comm networks, using the same method used with the Polish power network. Both models, for q = 1, were subsequently subjected to random node failures as before, measuring the robustness of the networks to different disturbance sizes (with varying p).
Figure 5 shows the results. In all five networks, the Coupled Topological model indicates that interdependency increases vulnerability relative to the simple contagion model. For the Non-ideal Smart Grid model, interdependency decreases vulnerability in every network, relative to the uncoupled power grid model in Figure 2.

Discussion
Together, these results have important implications both for the emerging science of interdependent networks and for the design of intelligent infrastructure systems.
First, the power grid and topological models show some qualitative similarities. The relative vulnerability of the networks to random failures is similar across the various models studied in this paper. Lattices are consistently the most vulnerable and scale-free networks are consistently the most robust. Power grids perform only slightly better than lattice topologies. However, this is where the similarities end. When we measured the effect of network coupling on performance, increased coupling consistently increased network robustness in all but the most extreme (and unrealistic) power grid model. For the Ideal and Non-ideal Smart Grid models, the most robust configuration was the fully coupled case, q = 1. In the Coupled Topological model, q = 0 was the optimal level of coupling, with increased coupling generally decreasing performance until q > 0.7. For every attack size, and every topological structure, interdependency increased vulnerability in the coupled topological model and decreased vulnerability in the more realistic smart grid models. The reason that vulnerability decreased in the smart grid models is that links between the networks performed valuable functions in arresting the spread of cascades. When components were overloaded, and thus at risk of cascading, the comm network facilitated valuable system-wide control functions. Since these beneficial functions of the comm network are not modeled in the coupled topological model, coupling tends to increase vulnerability. These differences indicate that models of network interdependency can lead to misleading conclusions if they do not adequately describe the beneficial functions of coupling as well as the various mechanisms of cascading within and between the coupled systems.
Finally, the results suggest good design practices for intelligent cyber-physical systems, such as in the smart grid case. In the case of the Ideal and Non-ideal Smart Grid models, coupling increased robustness because of the limited ways in which cascades could propagate between the two networks. In practice, limits on inter-network cascades can be implemented by sound engineering practices that reduce the chance of failures propagating between networks. One good example of this is adding reliable power backup systems to critical components, such as the communication systems at power substations or traffic signals along critical transportation corridors [43].

Generating power and communications network topologies
Data for the Polish power network were obtained from [37], and then modified slightly to remove parallel links and to adjust the network so that no single transmission-line outage would initiate a cascade, a common operating principle for power systems. The topologies (edge endpoints) for the synthetic ER, RR, SF, and lattice graphs were initialized using standard generating functions, and then edges were randomly removed in order to produce final graphs with exactly the same size as the Polish network. Edge removals that would result in the graph separating into non-connected subgraphs were avoided in order to ensure that the graphs were initially fully connected. Similarly, duplicate edges and self-loops were removed for consistency with the power grid data.
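The edge-removal step described above might look like the following sketch. It assumes a connected input graph over nodes 0..n-1 and a target of at least n − 1 edges; the names `is_connected` and `prune_to_size` are hypothetical, and the per-removal connectivity check is chosen for clarity rather than speed.

```python
import random
from collections import deque

def is_connected(n, edges):
    """Return True if all n nodes are reachable from node 0."""
    adj = [[] for _ in range(n)]
    for a, b in edges:
        adj[a].append(b)
        adj[b].append(a)
    seen, queue = {0}, deque([0])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in seen:
                seen.add(w)
                queue.append(w)
    return len(seen) == n

def prune_to_size(n, edges, m_target, seed=0):
    """Randomly remove edges until m_target remain, never disconnecting the graph.
    Self-loops and duplicate edges are dropped first."""
    rng = random.Random(seed)
    unique = sorted({tuple(sorted(e)) for e in edges if e[0] != e[1]})
    candidates = unique[:]
    rng.shuffle(candidates)
    kept = set(unique)
    for e in candidates:
        if len(kept) <= m_target:
            break
        kept.discard(e)
        if not is_connected(n, kept):
            kept.add(e)  # removal would split the graph, so undo it
    return sorted(kept)
```

A single pass over the shuffled candidates suffices, because an edge that is a bridge when it is tested remains a bridge after further removals.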
Correlated communication network data were generated by initially copying the corresponding power network, and then randomly rewiring 10% of the endpoints, excluding rewirings that created self-loops or duplicate edges. The two networks were connected by inter-network links with probability q ∈ [0, 1]. The resulting interlinks produce a correlated pair of graphs [44,45] (as shown in Figure 3) that are at least similar to the correlated topologies of power grid and communication networks.
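A minimal sketch of the rewiring step is given below. It reflects one possible reading of the procedure, namely moving one endpoint of a randomly chosen 10% of the edges; the function name `rewire_edges`, the retry bound, and the choice of which endpoint to keep are assumptions made for illustration.

```python
import random

def rewire_edges(n, edges, fraction=0.10, seed=0):
    """Copy `edges` and rewire one endpoint of a random `fraction` of them,
    rejecting self-loops and duplicate edges."""
    rng = random.Random(seed)
    edge_list = [tuple(sorted(e)) for e in edges]
    present = set(edge_list)
    n_rewire = round(fraction * len(edge_list))
    for idx in rng.sample(range(len(edge_list)), n_rewire):
        a, b = edge_list[idx]
        for _ in range(1000):  # bounded retries so the sketch always terminates
            keep = rng.choice((a, b))
            new = rng.randrange(n)
            candidate = tuple(sorted((keep, new)))
            if new != keep and candidate not in present:
                present.discard((a, b))
                present.add(candidate)
                edge_list[idx] = candidate
                break
    return edge_list
```

The inter-network links could then be drawn by coupling each node independently with probability q.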

Generating synthetic power grid data
After building graphs that were identical in size to the 2383-node Polish power grid, we generated synthetic power grid data for each synthetic graph. We did this by giving each edge (transmission line) a normalized impedance of Z_ij = √−1, such that the power flow between the two nodes, under our linearized dc power flow assumptions, is P_ij = θ_i − θ_j, where θ_x is the phase angle of the sinusoidal voltage at node x (see SI Text). Line limits were assigned to ensure that no single outage resulted in a cascading failure.

Model of cascading failure in power grids
Our model of cascading failure (DCSIMSEP) is based on the model in [42], and similar to models in [6,46,47]. This model is closely related to the random fuse networks studied in [48]. The model initially computes the paths of power flow along transmission lines given the locations and production/consumption of generators/loads using the dc power flow equations (SI):
\[ P_g - P_d = B\theta \tag{1} \]
\[ F_{ij} = \frac{\theta_i - \theta_j}{x_{ij}} \tag{2} \]
where P_g and P_d are vectors of power generation and load; B is a weighted Laplacian matrix encoding the network topology; θ is a vector of voltage phase angles; F_ij is the power flow from i to j; and x_ij = Im(Z_ij) is proportional to the inductance in the transmission line. When a component fails, flows are re-computed according to (1) and (2). If the revised power flow on a line exceeds its flow capacity, the line will open (disconnect) in an amount of time that is proportional to the overload. This changes the configuration of the network, with new flows from (1) and (2). If the network separates into islands, there may not exist a feasible solution to (1) due to an imbalance between supply and demand. To rectify this imbalance, a combination of generator adjustments and load cutting is used to arrive at a new, feasible solution of (1).
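The flow-recomputation loop can be illustrated with a small, dependency-free sketch. It uses the standard dc approximation (net injections equal the weighted Laplacian times the phase angles, and each line flow is the angle difference divided by the reactance). It is not DCSIMSEP: it assumes the network stays connected, trips all overloaded lines at once instead of timing each trip by its overload, and does not rebalance islands. The function names are hypothetical.

```python
def dc_flows(n, lines, injection):
    """Solve P = B theta with node 0 as the angle reference; return flow per line.
    `lines` maps (i, j) -> reactance x_ij; `injection[i]` is generation minus load.
    Assumes a connected network with balanced injections."""
    B = [[0.0] * n for _ in range(n)]
    for (i, j), x in lines.items():
        B[i][i] += 1 / x
        B[j][j] += 1 / x
        B[i][j] -= 1 / x
        B[j][i] -= 1 / x
    # Reduced system without the reference node, augmented with the injections.
    A = [B[r][1:] + [injection[r]] for r in range(1, n)]
    k = n - 1
    for c in range(k):  # Gauss-Jordan elimination with partial pivoting
        pivot = max(range(c, k), key=lambda r: abs(A[r][c]))
        if abs(A[pivot][c]) < 1e-12:
            raise ValueError("network is split into islands")
        A[c], A[pivot] = A[pivot], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    theta = [0.0] + [A[r][k] / A[r][r] for r in range(k)]
    return {(i, j): (theta[i] - theta[j]) / x for (i, j), x in lines.items()}

def overload_cascade(n, lines, injection, capacity):
    """Trip every overloaded line, recompute flows, and repeat until none remain."""
    lines = dict(lines)
    while True:
        flows = dc_flows(n, lines, injection)
        overloaded = [e for e, f in flows.items() if abs(f) > capacity[e] + 1e-9]
        if not overloaded:
            return lines, flows
        for e in overloaded:
            del lines[e]
```

The point of the sketch is that removing one line changes flows on lines that need not be adjacent to it, which is the non-local propagation that distinguishes power grids from contagion models.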

Smart Grid Models
The three smart grid models each depend on an optimization problem that seeks to minimize the amount of load shedding and power generation reductions necessary to arrive at a feasible solution to (1) and (2), with the added constraint that each F_ij has to be within the flow capacity limits for this link. The ideal smart grid model uses perfect information about each node to solve this problem, optimally choosing adjustments to the available generators and loads, independent of where they are in the network. If there is no communication link to a particular node, the ideal smart grid model does not gather data about flows from this location, and assumes that it has no ability to control generators or loads at this node. The configuration of the communications network itself does not affect the ideal smart grid model.
The non-ideal smart grid model adds to this a model of the communications system. In this case the optimizer can only control and monitor nodes when there is a comm network path between a particular grid node and the "control center" node. When the path to node i is broken, the optimization formulation is adapted to exclude generation and load at node i from the set of control variables, and it ignores the flow constraints adjacent to i (e.g., the flow constraint on edge i → j), unless an adjacent node (e.g., j) is connected to the control center. In addition, the Non-ideal Smart Grid model assumes that if there is load shedding at grid node i, the adjacent comm node will fail with a probability that is equal to the fraction of load shedding.
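The two communication-side rules of the non-ideal model, reachability from the control center and comm failure with probability equal to the load-shedding fraction, can be sketched as follows. The sketch assumes that comm node i sits at grid node i and that `coupled` is the set of grid nodes with a grid-comm link; the function names are hypothetical, and the optimization itself is not shown.

```python
import random
from collections import deque

def controllable_nodes(n, comm_edges, comm_alive, coupled, control_center):
    """Grid nodes that the control center can still monitor and control."""
    if not comm_alive[control_center]:
        return set()
    adj = [[] for _ in range(n)]
    for a, b in comm_edges:
        if comm_alive[a] and comm_alive[b]:
            adj[a].append(b)
            adj[b].append(a)
    reached, queue = {control_center}, deque([control_center])
    while queue:  # breadth-first search within the control center's cluster
        v = queue.popleft()
        for w in adj[v]:
            if w not in reached:
                reached.add(w)
                queue.append(w)
    return {v for v in reached if v in coupled}

def fail_comm_nodes(comm_alive, shed_fraction, rng):
    """Fail each working comm node with probability equal to its local
    load-shedding fraction; already-failed nodes stay failed."""
    return [alive and rng.random() >= shed_fraction[v]
            for v, alive in enumerate(comm_alive)]
```

Generators and loads at nodes outside the returned set would be removed from the optimizer's control variables.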
The vulnerable smart grid model adds to this the rather extreme assumption that if a comm node fails, the generation and load at that node will fail.

Figure 1: Illustrative comparison of cascade propagation in (A) topological contagion and (B) power grid models. In topological models of cascading (e.g., the contagion model in [10] or the sandpile model from [27]), cascades propagate from the initiating failure (1) to neighboring nodes (2). In a power grid, the initiating failure (1) causes increased loads along parallel paths (3), which may subsequently fail [26].

Figure 3: Comparative illustration of the (A) "Coupled Topological" model and the (B) "Non-ideal Smart Grid" model. In the Coupled Topological model an initiating disturbance (1) causes (2) edge failures in the power grid as well as (3) node and edge failures in the communications (comm) network. As a result, the size of the giant component is reduced to 0.8n. In the Non-ideal SG model the initiating failure potentially causes overloads (4), which cause an edge failure and (5) a loss of power at the "sink" node. This may cause (6) comm node and (7) link failures. Comm link failures fracture the comm network, preventing messages from being passed to the control center.

Figure 5: Probability of observing a GC in the (A) Coupled Topological cascading model (the degree of coupling is q = 1); and (B) Non-ideal Smart Grid model.