Effects of void nodes on epidemic spreads in networks

We present pair approximation models for susceptible–infected–recovered (SIR) epidemic dynamics in a sparse network based on a regular network. Two processes are considered, namely, a Markovian process with a constant recovery rate and a non-Markovian process with a fixed recovery time. We derive an implicit analytical expression for the final epidemic size and explicitly show the epidemic threshold in both Markovian and non-Markovian processes. As the connection rate decreases from that of the original network, the epidemic threshold at which the epidemic phase changes from disease-free to endemic increases, and the final epidemic size decreases. Additionally, for comparison between sparse and heterogeneous networks, the pair approximation models were applied to a heterogeneous network with a degree distribution. The obtained phase diagram reveals that, upon increasing the degree of the original random regular networks and decreasing the effective connections by introducing void nodes accordingly, the final epidemic size of the sparse network approaches that of the random network with an average degree of 4. Thus, introducing void nodes into the network leads to a more heterogeneous network and reduces the final epidemic size.

The spread of infectious diseases, such as measles, influenza, Ebola, and SARS (severe acute respiratory syndrome), has threatened human societies, which now face severe difficulties with COVID-19 (SARS-CoV-2). Many mathematical models and methods have been developed to understand epidemic dynamics and the effect of prevention strategies, such as vaccination and social distancing [1–16]. For example, the degree-based mean-field model [17–20] is the most popular mean-field approximation model; it describes epidemic spreading in networks by using a single-order approximation, although it can be applied to some heterogeneous topologies obeying a degree distribution. Alternatively, pair approximation models explicitly express the epidemic process both at the node and at the link level. House and Keeling 21,22 first developed a pair approximation model for a susceptible-infected-recovered (SIR) epidemic model with network clusters and discussed the basic reproduction number and final epidemic size. Bauch 23 established a pair approximation for a susceptible-infected-susceptible (SIS) epidemic model and analyzed its basic reproduction number. Recently, Kuga et al. 24 established a theoretical framework of pair approximation for the vaccination game, in which both dynamic processes, namely epidemic spread and individual actions that help prevent harmful social behaviors, are quantitatively evaluated. These epidemic models in networks have resulted in a much better understanding of the role of contact heterogeneity and clustering of contacts. Besides these, there are many other well-established epidemic models. The edge-based compartmental model (EBCM) 25 provides a compact expression that captures SIR dynamics with arbitrary transmission and infection processes in configuration-like networks. On the other hand, the message-passing approach 26 leads to a complicated system including a large number of integro-differential equations. Compared with the EBCM or message-passing model, pair approximation models permit a more intuitive understanding of epidemic dynamics.
For mathematical simplicity, many studies have assumed that the waiting times of the disease spreading process in a network follow an exponential distribution, i.e., that the process is Markovian. In reality, disease spreading processes are far more complicated. For example, the recovery time of malaria obeys a delta distribution, whereas that of smallpox follows a gamma distribution 27,28. An epidemic spreading process that does not obey an exponential distribution is called a non-Markovian epidemic process. The mathematical description, theoretical analysis, and numerical simulation of a non-Markovian process are much more complicated than those of a Markovian process. Van Mieghem et al. 29 showed that a Weibull-distributed recovery time strongly affects the threshold of an SIS epidemic model in networks. Cator et al. 30 examined an SIS epidemic model with non-exponential distributions of the infection and recovery periods and showed that the functional form of the prevalence in the quasi-steady state was the same as that in the Markovian SIS model. Kiss et al. 31 derived an SIR pairwise model in which the recovery time follows a delta distribution and gave a general expression for the final epidemic size. Röst derived a new pair approximation model for gamma- and uniformly distributed infectious periods. Li et al. 33 adapted a preventive rewiring effect to the non-Markovian SIR dynamics with a fixed infectious period. Wilkinson and Sharkey 34 investigated a network-based, non-Markovian epidemic model that contains the classic Kermack-McKendrick, pairwise, message-passing, and spatial models as special cases. They also explained how systems of delay and ordinary differential equations can provide upper and lower bounds on the probability that an individual is infected at a given time for the Poisson contact process and infectious period distribution.
In addition, various time effects, such as inter-event time and memory, on infection dynamics have been verified by infectious disease models 35,36.
When void nodes are introduced into a network, the links attached to the void nodes are eliminated, and a new network is formed that exhibits a different topology from the original network. As in site percolation theory, the topology of the new network depends on how many void nodes are introduced into the original network. In the research field of the spatial prisoner's dilemma, several precursors should be noted (see Refs. 37,38) in which cooperation can be enhanced by considering a site-diluted lattice. This is because a void site saves a cooperator from being exploited by neighboring defectors. Regarding epidemiology, the outcomes of epidemic dynamics, i.e., the epidemic threshold and final epidemic size, are expected to change when the network topology changes. Wang et al. 39 studied the effect of cutting links on the final epidemic size in complex networks using the discrete SIR epidemic model. They evaluated a simple case in which connections among individuals are randomly removed and a more complex case in which each individual retains at least a few connections after the contact reduction. Valdez et al. [40–42] proposed an adaptive SIR model in which a link-activation-deactivation strategy, different from the link-rewiring approach, is introduced into the discrete SIR model to demonstrate the social distancing effect. These studies focused on how changes in network topology caused by link cutting and link rewiring affect epidemic dynamics. In contrast, the present study focuses on the effect of a sparse network, in which void nodes are introduced, on the epidemic threshold and final epidemic size. Furthermore, we establish how these values for the sparse network compare with those of other complex networks.
In this paper, we present the pair approximation models for Markovian and non-Markovian SIR dynamics in a sparse network. We establish a set of ordinary differential equations (ODEs) for the Markovian process and a set of delay differential equations (DDEs) for the non-Markovian process. We derive the implicit analytical expression for the final epidemic size and present the comparison of the final epidemic size obtained for the sparse network model and that obtained for the heterogeneous degree network model. The manuscript is organized as follows. "Model derivation" section presents a description of the model and assumptions for deductive analysis. "Discussion" section provides the deductive results and discussion. "Conclusions" section summarizes the findings of this work.

Model derivation
We consider a sparse network based on a random regular network with infinitely many nodes. These nodes represent the population and void nodes, and each node has Q links. Each node except the void node (B) has a state at any time t, which can be either a susceptible (S), infected (I), or recovered (R) state. The Markovian process is considered first. When a susceptible node is connected to an infected node, an event whose probability depends on the number of S-I pairs, S changes to I at the disease transmission rate β. Each infected node I changes to a recovered node R at the recovery rate γ, which is equal to the inverse of the average recovery time. Each recovered node R becomes immune and is not reinfected. The notations [X](t), [XY](t), and [XYZ](t) are used to denote the expected fraction of nodes in state X, pairs in state X-Y, and triples in state X-Y-Z, respectively, where X, Y, Z ∈ {S, I, R, B}. All notations used in the model are summarized in Table 1.
These equations are exact but unclosed. To close the system, the third-order quantities have to be expressed in terms of second-order state variables as follows 26,27 : where μ is expressed as (Q − 1)/Q because the fact that a susceptible individual S is known to have at least one X or Y neighbor does not change the expected number of Y or X individuals amongst the other Q − 1 neighbors. Furthermore, the following constraints are required: Additionally, the initial condition is hypothetically defined as: Here, x is the fraction of void nodes, and α is the connection coefficient, which indicates how the void nodes are distributed in the network. If the void nodes are distributed homogeneously in the network, then α = 1 − x. However, if the void nodes are distributed so that the fraction of B-B links is zero, then α = (1 − 2x)/(1 − x), which is the minimum value. Here, the relationship x < 0.5 must be satisfied. In addition, counting the number of effective connections through which the disease may be transmitted among the population, the average degree <Q> is calculated as αQ.
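The two limiting cases of the connection coefficient can be checked numerically. The following minimal Python sketch evaluates α and the effective average degree <Q> = αQ for a given void fraction x. The function names are hypothetical and introduced only for illustration; the expression for the case without B-B links, α = (1 − 2x)/(1 − x), is the one quoted in the Discussion.

```python
def alpha_homogeneous(x: float) -> float:
    """Connection coefficient for homogeneously placed void nodes: alpha = 1 - x."""
    if not 0.0 <= x < 1.0:
        raise ValueError("void fraction x must lie in [0, 1)")
    return 1.0 - x


def alpha_no_bb(x: float) -> float:
    """Connection coefficient when no two void nodes are linked:
    alpha = (1 - 2x) / (1 - x), valid only for x < 0.5."""
    if not 0.0 <= x < 0.5:
        raise ValueError("void fraction x must lie in [0, 0.5)")
    return (1.0 - 2.0 * x) / (1.0 - x)


def effective_degree(alpha: float, Q: int) -> float:
    """Average number of effective (non-void) connections, <Q> = alpha * Q."""
    return alpha * Q


if __name__ == "__main__":
    Q = 8  # degree of the original random regular network (illustrative value)
    for x in (0.0, 0.1, 0.25, 0.4):
        a_hom, a_nbb = alpha_homogeneous(x), alpha_no_bb(x)
        print(f"x={x:.2f}  homogeneous: alpha={a_hom:.3f}, <Q>={effective_degree(a_hom, Q):.2f}"
              f"  no B-B: alpha={a_nbb:.3f}, <Q>={effective_degree(a_nbb, Q):.2f}")
```

For any admissible x > 0, the case without B-B links gives the smaller α, because every link of a void node then removes a connection from a member of the population.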
To where r = γ/β is the relative recovery rate, i.e., the inverse of the basic reproduction number R₀.
Using In the steady state (t → ∞), there are no infected individuals, since they spontaneously become recovered individuals with no chance of reinfection in the present model. Therefore, the constraints in Eqs. (8) and (9)
α(s^(Q−2) + ··· + s^2 + s) = r + 1 − α. (25)
The left-hand side of Eq. (25) is an increasing function of s that vanishes at s = 0 and attains the value α(Q − 2) at s = 1. Therefore, a real solution exists for 0 < s < 1, and this is the only solution as long as r < α(Q − 1) − 1. The phase transition that occurs at the critical relative recovery rate is expressed as follows: The final fractions are expressed as follows: These final fractions are expressed so that the fraction of void nodes is included in the system. However, the final epidemic size should be counted among the population, whose fraction is 1 − x. Therefore, the final epidemic size is expressed as 1 − s^Q.
Next, the pair approximation model for the non-Markovian process is derived considering a fixed infection period σ equal to 1/γ. Here, the number of infected nodes at time t is replenished by β[SI](t) and is depleted by β[SI](t − σ). Thus, the dynamics of infected nodes is expressed as: In addition, the deletion of the S-I link that was produced at a time (t − σ), i.e., ( βµ [ links changing to S-R links owing to recovery is discounted. Defining the S-I links that are produced at time t as SI (t), the discount of the S-I links is expressed as the following evolution equation 31,32 : The integration over [t − σ, t] leads to: Therefore, the dynamics of the S-I links is expressed as: The same constraints and initial conditions that were assumed for the Markovian process are also considered here.
To which is equivalent to The non-trivial solution is then given by: Therefore, the phase transition that occurs at the critical value of βσ, which corresponds to the basic reproduction number R₀, is expressed as: The final fractions are calculated via the same expressions used for the Markovian process, i.e., Eqs. (27) and (28).
Figure 1 shows the final epidemic size as a function of the connection coefficient α and the infection parameter. For the Markovian process, the infection parameter is the inverse of the relative recovery rate, 1/r (which is equal to β/γ). For the non-Markovian process, it is βσ. Both infection parameters correspond to the basic reproduction number R₀. As the connection coefficient α decreases from its value in the original network (α = 1), the epidemic threshold at which the epidemic phase changes from disease-free to endemic (expressed in Eqs. (26) and (41)) increases, and the final epidemic size decreases. In other words, introducing the void nodes decreases the number of effective people-to-people connections, thereby suppressing the epidemic spreading in the network. These results are consistent with previous work 39. Comparing the Markovian and non-Markovian processes, as shown in Fig. 1c, the final epidemic size in the Markovian process is smaller than that in the non-Markovian process even when the infection parameter β/γ is equal to βσ.
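The dependence of the Markovian final epidemic size on α can be reproduced numerically from the implicit relation in Eq. (25), α(s^(Q−2) + ··· + s^2 + s) = r + 1 − α, together with the final size 1 − s^Q and the condition r < α(Q − 1) − 1 stated in the text. The following Python sketch is only an illustration: the function names are hypothetical, and bisection is our choice of root finder, not a method described in the paper.

```python
def eq25_lhs(s: float, alpha: float, Q: int) -> float:
    """Left-hand side of Eq. (25): alpha * (s + s^2 + ... + s^(Q-2))."""
    return alpha * sum(s ** k for k in range(1, Q - 1))


def critical_r(alpha: float, Q: int) -> float:
    """Critical relative recovery rate; an outbreak requires r < alpha*(Q-1) - 1."""
    return alpha * (Q - 1) - 1.0


def solve_s(r: float, alpha: float, Q: int, tol: float = 1e-12) -> float:
    """Root s in (0, 1) of Eq. (25), found by bisection.
    Returns 1.0 (no outbreak) when r is at or above the critical value."""
    if Q < 3 or r <= 0.0 or not 0.0 < alpha <= 1.0:
        raise ValueError("expected Q >= 3, r > 0 and 0 < alpha <= 1")
    if r >= critical_r(alpha, Q):
        return 1.0
    target = r + 1.0 - alpha
    lo, hi = 0.0, 1.0  # the LHS is increasing: below target at 0, above at 1
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if eq25_lhs(mid, alpha, Q) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def final_epidemic_size(r: float, alpha: float, Q: int) -> float:
    """Final epidemic size among the non-void population, 1 - s^Q."""
    return 1.0 - solve_s(r, alpha, Q) ** Q


if __name__ == "__main__":
    Q, r = 4, 1.0  # illustrative values
    for alpha in (1.0, 0.9, 0.8, 0.7, 0.6):
        print(f"alpha={alpha:.1f}  r_c={critical_r(alpha, Q):.2f}  "
              f"final size={final_epidemic_size(r, alpha, Q):.4f}")
```

With Q = 4 and r = 1, lowering α from 1 to 0.6 moves the critical value α(Q − 1) − 1 from 2 to 0.8, so the outbreak shrinks and then disappears, which matches the trend described for Fig. 1.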

Discussion
As shown in the mathematical expressions for the epidemic threshold (Eqs. (26) and (41)) and final epidemic size (Eqs. (23) and (38)), the fraction of void nodes x introduced into the original network does not explicitly affect the epidemic dynamics. However, the connection coefficient α, which strongly affects the final epidemic size, as shown in Fig. 1, depends on the fraction of void nodes. Therefore, the fraction of void nodes indirectly influences the final epidemic size. When the void nodes are randomly and homogeneously distributed in the network, the number of connections among individuals decreases linearly with the fraction of void nodes, and α is equal to 1 − x. As a result, Fig. 2 (1-a) and (1-b) agree well with the inverted figures in Fig. 1a,b, respectively. On the other hand, when the void nodes are efficiently distributed so that the fraction of [BB] connections is zero and the fraction of [SB] connections increases, the effective connections decrease according to α = (1 − 2x)/(1 − x). As shown in Fig. 2 (2-a) and (2-b), the final epidemic size and epidemic threshold are significantly lower than in the case of random distribution. Moreover, the difference in final epidemic size between the Markovian and non-Markovian processes is shown in Fig. 2 (1-c) and (2-c). The critical curves for the Markovian and non-Markovian processes are derived by combining Eqs. (26) and (41) with the connection coefficient α.
As mentioned earlier, the introduction of void nodes can significantly change the outcome of epidemic dynamics. Furthermore, a question remains as to what type of network corresponds to the epidemic dynamics in a network in which void nodes are introduced into a random regular network. Erdős-Rényi (ER) random networks 43 are often used for comparison with random regular networks. However, the degree distribution of the ER random network is a Poisson distribution, which is somewhat heterogeneous. Therefore, the two networks are compared to investigate the effect of the heterogeneous degree distributions. To compare these results with the final epidemic size corresponding to a heterogeneous degree network, the final epidemic size associated with an ER random network was additionally derived (as described in the Supplementary Material). Figure 3 shows the final epidemic size as a function of the inverse of the relative recovery rate, i.e., 1/r. The network topologies are created by introducing the void nodes into different random regular networks with Q = 4, 8, and 16, so that <Q> = 4. In ER random networks, the degree distribution obeys the Poisson distribution with an average value of 4. In both Markovian and non-Markovian processes, upon increasing the degree of the original random regular network and decreasing the connection coefficient α accordingly, the final epidemic size of the sparse network approaches that of the random network with <Q> = 4. In other words, by introducing void nodes into the random regular network, a heterogeneous degree distribution emerges in a network that originally had a constant degree. As a result, the epidemic dynamics of the sparse network are similar to those of the ER random network.
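One way to make this convergence plausible is to look at the effective degree distribution itself. Under the assumption (ours, not stated in the paper) that each of the Q neighbors of a node is independently non-void with probability α, the effective degree follows a binomial distribution with parameters Q and α, which tends to a Poisson distribution with mean <Q> as Q grows at fixed <Q> = αQ. The Python sketch below, with hypothetical function names, measures the total variation distance between the two distributions for Q = 4, 8, and 16 at <Q> = 4.

```python
from math import comb, exp, factorial


def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(K = k) for K ~ Binomial(n, p)."""
    return comb(n, k) * p ** k * (1.0 - p) ** (n - k)


def poisson_pmf(k: int, lam: float) -> float:
    """P(K = k) for K ~ Poisson(lam)."""
    return exp(-lam) * lam ** k / factorial(k)


def total_variation(Q: int, mean_degree: float, k_max: int = 60) -> float:
    """Total variation distance between Binomial(Q, mean_degree/Q) and
    Poisson(mean_degree), with the sum truncated at k_max."""
    alpha = mean_degree / Q  # assumed independent per-link retention probability
    distance = 0.0
    for k in range(k_max + 1):
        b = binomial_pmf(k, Q, alpha) if k <= Q else 0.0
        distance += abs(b - poisson_pmf(k, mean_degree))
    return 0.5 * distance


if __name__ == "__main__":
    for Q in (4, 8, 16):
        print(f"Q={Q:2d}  alpha={4.0 / Q:.2f}  TV distance to Poisson(4) = "
              f"{total_variation(Q, 4.0):.4f}")
```

The distance shrinks as Q increases, in line with the observation that the sparse network built from a higher-degree regular network behaves more like the ER random network.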

Conclusions
The pair approximation models for Markovian and non-Markovian SIR dynamics in a sparse network with the introduction of void nodes were presented. A set of ODEs for the Markovian process and DDEs for the non-Markovian process were established. The implicit analytical expression for the final epidemic size was derived, and the final epidemic size for the sparse network model and heterogeneous degree network model were found to agree with each other.