Web malware spread modelling and optimal control strategies

The popularity of the Web improves the growth of web threats. Formulating mathematical models for accurate prediction of malicious propagation over networks is of great importance. The aim of this paper is to understand the propagation mechanisms of web malware and the impact of human intervention on the spread of malicious hyperlinks. Considering the characteristics of web malware, a new differential epidemic model which extends the traditional SIR model by adding another delitescent compartment is proposed to address the spreading behavior of malicious links over networks. The spreading threshold of the model system is calculated, and the dynamics of the model is theoretically analyzed. Moreover, the optimal control theory is employed to study malware immunization strategies, aiming to keep the total economic loss of security investment and infection loss as low as possible. The existence and uniqueness of the results concerning the optimality system are confirmed. Finally, numerical simulations show that the spread of malware links can be controlled effectively with proper control strategy of specific parameter choice.

networks are infecting mobile devices easily since they are often platform-independent. Moreover, financial purposes enormously motivate cyber-attackers to use websites to conduct phishing attacks that attempt to acquire personal or financial information such as usernames, passwords, and credit card details. For instance, some spoofed websites or links are intentionally designed to seem official, or even these sites are legitimate, but have been compromised by malware, SQL injection or other malicious techniques. Typically, phishing is carried out by the way that the user views a phishing message, in spoofing emails or instant messages, and is tricked into clicking a link that leads to a malicious website 9 . Consequently, it is important to make a trial on better understanding of the diffusion of malicious URLs for improving the safety and reliability of devices and networks.
In the past decades, a variety of epidemic models were developed to address the diffusion of disease infections [10][11][12] and population dynamics [13][14][15][16] . Especially, spatial effects on herbivore populations are recently studied in structured populations in ecosystems [17][18][19] . Inspired by the research of biological epidemics 20 , malware epidemiology similarly aims to study the dynamics of malware spread over time and analyze the factors affecting its propagation process 21,22 . Much effort has also been done in the area of developing mathematical models for malware spread 23 , and most existing models for malicious code are based on deterministic epidemic models 24,25 . For instance, some earlier mathematical models were obtained by the compartmental approach, such as epidemic SIS, SIR and SIRS models 26,27 , which differ by considering whether the acquired immunity is permanent or not. Modification of theses models generated guides for infection prevention by using the concept of epidemiological threshold 28,29 . Some dynamical models were further proposed to give estimations for temporal evolutions of infected nodes depending on network parameters considering topological aspects of the network. But, in most of previous works, susceptible computers were assumed to be instantaneously infective as soon as they were infected and later recovered with a permanent or temporary acquired immunity. In fact, however, a device receiving malware messages will not immediately become infective until the user activates it by clicking on the hyperlink address and successfully accessing the malware websites. On the other hand, in spite of much work having been devoted in the past decades to understanding the spreading behavior of malware 30,31 , those models were actually limited to model the propagation of computer viruses and Internet worms. As far as we are concerned, few work focuses on addressing the characteristics of web malware and their propagation dynamics. Besides, empirical results indicate that human dynamics have effects on web malware diffusion. However, little is known about how human behaviors have influences on web malware outbreak and propagation when user's security awareness is considered. Therefore, this paper aims to establish an elementary dynamical model (relatively simple in the form of ordinary differential equations) to address how web malware spread with the impact of users' security awareness, and develop proper prevention strategies with human interventions by the optimal control theory.

Results
A compartment-based model. Web malware and propagation mechanisms. Generally speaking, web malware is a specific kind of malicious programs that use web pages to implement destructions. They usually employ the vulnerabilities of browsers to achieve viral implants by using some malicious codes written in Script. There are different variants of web malware that infect websites, such as iframe viruses. Most of them use iframe HTML code to cause damage by injecting iframe tags into the website 32 . Web threats are able to cause a broad range of risks, such as financial damages, damage of company reputation, and loss of consumer confidence in e-commerce and online banking 33 . Furthermore, multiple types of web malware benefit cybercriminals by stealing confidential information for subsequent sale and help absorb infected devices into botnets.
Attackers exploit the vulnerabilities of browsers or webpages to design and proliferate malicious viruses. Distribution of malicious programs has been largely expanded beyond traditional channels like email viruses to harder-to-avoid approaches like automated "drive-by downloads" launched by infected webpages (see Fig. 1). There are mainly the following several ways for the spread of web threats over networks.
Taking fraudulent methods. In this way, phishing and spam are taken to lure users to malicious (often spoofed) websites which can collect information by injecting malware. Network attackers use phishing, DNS poisoning or other means to make them appear to originate from a trusted source 34 .
Using social engineering. One fundamental method is to write and forward tempting messages or emails containing the addresses of infected websites. More specifically, malware developers employ social engineering such as enticing subject lines that reference popular personalities, sports, pornography, world events and other hot topics to design malicious links. Once users receive these types of deceptive information and are enticed to click on the hyperlinks which direct to the malicious websites, web viruses will be automatically downloaded and activated, resulting in personal information leak, such as accounts and passwords.
Infecting legitimate websites. By this way, legitimate websites infected by web malware will unknowingly transmit malware threats to visitors or alter search results to take users to malicious websites. Upon loading the page, the user's browser passively runs a malware downloader in a hidden HTML frame without any user interaction.
Compartments and parameters. In this section, we aim to develop a new compartmental model to characterize the propagation of web-based malware. For convenience, the devices through which malware propagates are also called as nodes in the sequel. In our model, a host under consideration is assumed to be in one of four states: susceptible(S), delitescent(D) (not yet infective), infected(I), or recovered(R). The state of a node is actually changing over time, i.e., switching among the above four states, because of the proliferation of malware links and the defense of antiviruses. A susceptible node first goes through a delitescent period before being infectious, and a typical pathway of malicious link infection is S → D → I → R → S (see Fig. 2). Next, several assumptions and parameters are introduced. If a user successfully visits the malware website by clicking on the hyperlink within deceptive messages or spam emails, the host will get infected. In the following, an infected host is assumed to be able to forward the malicious messages through users' contact lists. Susceptible nodes are assumed to immediately become delitescent as soon as they receive messages containing malicious URLs. Note that user behaviors will play a significant role in affecting the proliferation of malicious hyperlinks. Obviously, vigilant users have a high probability to identity and eliminate malware messages, and update recent security patches to fix bugs for system immunization. Based on this consideration, a parameter η is introduced to depict the probability of a D-node leaving for the recovered compartment. On the other hand, users without enough security awareness are probably enticed to click on the malicious links and get infected by the malware automatically downloaded from the insecure websites. Hence, a parameter ε is introduced to describe the probability of that a D-node leaves the delitescent compartment for the infected compartment. There is also another case that the states of some D-nodes may keep unchanged, because users may neglect the received malware links and do not take any measures to deal with them.
Infected devices by malware intrusion may exhibit certain symptoms, such as strange disruptions, battery draining and performance clogging. Once abnormal behavior is found, users will take security measures to detect and immune their systems. Thus, we introduce a parameter γ to describe the probability that an I-node gets immunization and turns to be recovered.
Immunity is observed when anti-malicious software is run after a node gets affected by malware. However, this kind of immunity is usually temporary. Specifically, when a node is recovered from the infected class, it recovers temporarily, acquiring temporary immunity with certain probability. Because of malware evolution or secure update failure, R-nodes will become back susceptible to malicious infections again. Considering this, a The clients or terminals will get infected once they visit the compromised webpages on the web server which has been intruded. parameter ζ is introduced to depict the probability of a R-node leaving for the susceptible compartment owning to immunity failure.
Infective devices will send malicious link copies to their neighboring nodes. For different kinds of malware, the rate of infecting susceptible nodes may be distinguished. This is an important concern for establishing an effective model. Here, the infection rate λ is defined as the probability that an S-node receives the malicious link sent by a neighboring I-node within a unit time.
Model formulation and analysis. As a matter of fact, the number of nodes in each compartment is dynamically changing over time. Thus, four variables S(t), D(t), I(t) and R(t) are introduced to describe the numbers of susceptible, delitescent, infected and recovered nodes at time t, respectively. The network size at time t is denoted by For simplicity, we assume that all newly-connected nodes are susceptible. The parameter b denotes the rate of nodes that are newly connected to the network within per unit time, and the parameter d is the disconnection probability that a node leaves the network per unit time. By applying the mean-field technique to the above assumptions, a compartmental model can be formulated as where the parameters ε, η, γ, b, d, ζ are nonnegative, and ε + η < 1.
Adding the equations of system (1) leads to N′ (t) = b − dN(t), which can be explicitly solved as where N(0) represents the initial number of nodes over the network. It can be easily observed that N(t) is varying over time if N(0) ≠ b/d. This corresponds to the fact that real networks are always evolving, owing to certain nodes dynamically connected to or disconnected from the network. While for the special case N(0) = b/d, the size of the network will keep constant due to a balance of newly-connected and disconnected nodes. The explicit solution also indicates that for the case N(0) < b/d the total network size N(t) will strictly increase to the final saturation number of b/d. Actually, the numbers of terminal devices over real networks will also reach saturation by some technological constraints, such as IP addresses, network bandwidth, and communication channel congestions.
Remark 1: Note that if b = d = 0 then system (1) reduces to model web malware propagation over a static network. And, for the case ζ = 0 model (1) looks similar to the classical SEIR model with demographics in epidemiology 35 , however, we mainly consider it for the characteristics of web malware propagation and incorporate the impact of human intervention into the model by introducing the appropriate parameter η . Thus, the above SDIRS model is essentially a newly-formulated model for web malware propagation with varying network size.
Propagation threshold. The propagation threshold of model (1) (usually also called as basic reproduction number in epidemic models which can be explained as the average number of secondary infections produced by a single infected node during its infection time) is calculated as (see Methods A for detailed calculations) Note that all the parameters in system (1) except for ζ have impact on the propagation threshold R 0 . This can be explained by that the parameter ζ which describes the probability of a R-node losing temporary immunity does not reflect the infective ability of current propagating web malware. By viewing the parameters in (2) as variables, then it obviously follows by the expression of (2) that R 0 is strictly decreasing with respect to the parameters γ, η, d, respectively, while R 0 is strictly increasing with respect to another two parameters b and λ, respectively. For the parameter ε, straightforward calculations yield ε η ε η ε Thus, the threshold R 0 is monotonically increasing with respect to ε. The parameters ε, γ and η are important since they reflect human intervention on malware infection process. Figure 3(a,b) show values of R 0 as a function of two varying parameters ε and γ (respectively, ε and η) with other parameters specifically given. The propagation threshold R 0 plays a significant role in determining the dynamics of system (1). It follows by calculations that system (1)   Stability analysis. We intend to address the stability of the equilibria of system (1). Firstly, we define the global stability of an equilibrium for system (1) with respect to be an equilibrium of system (1), then it is said to be globally asymptotically stable with respect to Ω 0 if it is Lyapunov stable and for each initial value . Then, we theoretically prove the global stability of the malware-free equilibrium  0 of system (1) with respect to Ω if R 0 < 1 (see Methods B). This means that under the model (1) once the threshold R 0 < 1 (under the comprehensive effect of all parameters), then the web malware (for any initial state within Ω) is bound to eventually disappear from the network. In this case, the web malware itself may have low diffusion ability, e.g., the malicious links can be easily recognized, so users will neither click on them nor forward them to other friends. Besides, high security awareness of users also benefits the reduction of R 0 even if the web malware has strong infective ability. Figure 4 numerically illustrates the analytical results. The parameters used for numerical simulations are chosen  For the case R 0 < 1, the global dynamics of (1) in Ω has been completely determined. Its epidemiological implication is that the number of infected nodes over the network vanishes in time so web malware finally disappears from the network. While for R 0 > 1, the web malware will persist. The web malware is said to be endemic if the infected nodes over the network persist above a certain positive level for sufficiently long time. It can be well captured and analyzed through the notion of uniform persistence. System (1) is said to be uniformly persistent (see refs 36 and 37) if there exists a constant 0 < c < 1 such that any solution (S(t), D(t), I(t), R(t)) with (S(0), D(0), Thus, the web malware is endemic if system (1) is uniformly persistent. And, we can easily prove that system (1) is uniformly persistent by using Theorem 4.3 in ref. 38 (refer to the proof of Proposition 3.3 in ref. 39). In this case, both the numbers of infected and delitescent nodes persist above a certain positive level.
For the infected equilibrium  ⁎ of system (1), we theoretically prove its asymptotical stability if R 0 > 1 and further discuss the global stability of the special case ζ = 0 under certain assumptions (see Methods C). This means that under the effects of parameters in model (1), once the threshold R 0 > 1, then the number of nodes infected by web malware will finally keep a steady level. This case reflects the kind of web malware which may evolve or have strong infectivity, and thus there exists a game between web malware and antivirus software. Figure 5 numerically illustrate the stability of  ⁎ . For the set of parameter values specifically given in Fig. 5, the value of R 0 is computed as 1.1582 > 1, and the corresponding infected equilibrium is  ≈ ⁎ (7455, 309, 294, 1941). Figure 5(a) shows the evolutions of system (1) with a specific set of initial values, from which it can be seen that all the components of system (1) eventually converge to corresponding infected equilibrium states, respectively. In order to explore how the solutions evolve with different starting points, Fig. 5 Parameter analysis. For the case R 0 > 1, let Ψ = ε ζ ζ γ ε γ η ε Fig. 6(a), it is obviously shown that greater infection rate λ benefits the propagation of web malware, resulting in keeping a final higher number of infected devices. It can be also seen in Fig. 6(a) that the infected component of malware equilibrium possesses significant difference when λ ∈ [0.00005, 0.001], while I * has inconspicuous increase while λ belongs to the interval (0.001, 0.01). By taking several different sets of parameters, the evolutions of I(t) are also shown in Fig. 6(b), which indicates that some web malware (characterized by choosing appropriate parameters) is possible to intrude the whole network.
By viewing the parameter ζ as a variable, straightforward calculations yield Scientific RepoRts | 7:42308 | DOI: 10.1038/srep42308 Thus, Ψ is strictly increasing with respect to ζ. Besides, ζ is not incorporated in R 0 , and ε has positive effect on both Ψ and R 0 . Therefore, I * is strictly increasing with respect to ζ and ε, respectively. Figure 7(a,b) show how the parameters ζ and ε contribute web malware spread, respectively. The number of infected nodes undergoes a drastic change in the early time, and then would finally keep a balance. Higher values of ζ and ε will result a greater eventual level of malware-infected nodes, however, when both parameters ζ and ε reach great enough, the infected component of the malware equilibrium possesses less obvious increase.
In contrast, the parameter γ has obvious negative effect on Ψ , and both γ, η have negative effects on R 0 . Furthermore, note that Ψ does not incorporate η, thus I * is strictly decreasing with respect to γ and η, respectively. Figure 8(a,b) show how the parameters γ and η inhibit the propagation of web malware, respectively. As γ and η increase, the level of infected nodes possesses less reduction, which indicates that the security investment is not proportional to the effectiveness of malware prevention. In other words, when the amount of security investment achieves a certain extent, user's security awareness and the effects of anti-malware measures grow slowly.
Optimal control and strategies. In system (1), there are four state variables S(t), D(t), I(t) and R(t). All the parameters in system (1) are constant, however, the real parameters should be time-varying. Thus, in this section, some of these parameters are considered to be controllable, and how to control the dynamic systems is worth studying 40,41 . We will use the control theory to obtain proper strategies for preventing malware spread over networks. First, we assume that the parameter η is controllable, and the variable function η(t) is introduced to reflect the probability that a D-node turns to be a R-node with the influence of user awareness at time t.
]} indicate an admissible control set.  The economic impact of malware attacks worldwide is dramatically increasing. As we all know, malware would cause massive direct damages and costs, such as labor costs, costs of repairing and cleansing infected systems, loss of user productivity, loss of revenue due to loss or degraded performance of system, and other costs directly incurred as the result of a malware attack. In order to effectively avoid malware attacks, updated anti-malware or firewall are widely deployed at both the organizational level and the individual level to defend against malware threats. But, these preventive measures also cost much security investment. Next, we aim to minimize the total cost of direct loss and security investment.
The optimal problem. The more nodes are infected by malware, the greater economic loss is. Thus, the financial loss caused by malware can be considered to be relevant to the number of infected nodes. We introduce a function F loss (I(t)) to describe the economic loss caused by the malware-infected nodes over the network. For simplicity, we suppose that the average loss caused by a single infected node per unit time is a suitable constant φ. Then, the whole loss caused by all the infected nodes within a unit time is φI(t) which is proportional to the infected node number. Then, we can compute the loss function across the time interval [0, T] as follows In addition, we also suppose that the level of user security awareness grows with the increasing of security investment. So, inversely, the cost investment function, denoted by F cost (η(t)), is also monotonically increasing with the value of η(t) which reflects the level of user security awareness. Here, we define where ϕ is an appropriate coefficient. The greater ϕ is, the more security investment costs for same improving of user security awareness. The square of the control variable reflects the severity of the size effects of control.
In the sequel, we propose an optimal control problem to minimize the following objective functional where J(η) is the sum of direct loss and preventive security investment. For the sake of deriving an optimal solution pair, we need to define the Lagrangian and Hamiltonian for the optimal control problem (3) and (4). In fact, the Lagrangian of the optimal problem is given by Figure 8. Consider system (1) with parameters b = 100, d = 0.01, λ = 0.0002, ζ = 0.1, η = 0.35, ε = 0.25,  and initial vector (1000, 1000, 3000, 2000). Scientific RepoRts | 7:42308 | DOI: 10.1038/srep42308 Next, we need to seek a suitable η(t) such that the integral of the above Lagrangian arrives the minimum. To do this, we define the Hamiltonian H for the control problem as follows where λ 1 (t), λ 2 (t), λ 3 (t) and λ 4 (t) are the adjoint functions to be determined suitably.
Theorem 1. Consider system (4) with the objective functional (3), then there exists an optimal control  η Proof. Note that the control variable and the state variables in system (4) are nonnegative. Besides, the coefficients involved in system (4) are bounded and each state variable of system (4) is bounded on the finite time interval, so we can employ the result in ref. 42 (pp. 182) to confirm the existence of an optimal control to system (4).
First, the set of control and corresponding state variables is nonempty. All the right parts of the equations of system (4) are continuous, bounded and can be written as a linear function of η with coefficients depending on time and states. In this minimizing problem, the necessary convexity of the objective functional in η(t) is satisfied.
]} is apparently convex and closed. Besides, the optimal system is bounded which determines the compactness needed for the existence of the optimal control. Additionally, the integrand of the objective function (3), i.e., I(t) + ϕη 2 (t)/2, is convex on the control η(t). And, it is easy to confirm that there exists a constant ρ > 1 and positive numbers v 1 and v 2 such that J(η(t)) ≥ v 1 |η| ρ/2 + v 2 . Thus, we conclude that there exists an optimal control.
To find the optimal solution, the Pontryagin's maximum principle is applied to show the existence of an optimal control. Theorem 2. Let S * (t), D * (t), I * (t) and R * (t) be optimal state solutions associated with the optimal control variable η * (t) for the optimal control problem. Then, there exist adjoint variables λ 1 (t), λ 2 (t), λ 3 (t) and λ 4 (t) that satisfy with transversality conditions Furthermore, the optimal control η * (t) is given by 2 4 Proof. First, we use the Hamiltonian (5) to determine the adjoint equations and the transversality conditions. By setting S(t) = S * (t), D(t) = D * (t), I(t) = I * (t) and R(t) = R * (t), and differentiating the Hamiltonian (5) with respect to the state variables S, D, I and R, we obtain By the optimality conditions, we have Scientific RepoRts | 7:42308 | DOI: 10.1038/srep42308 It follows by the above identity that 2 4 Considering the property of the control set , we obtain So we have the optimal control η * (t) which can be written in the following compact notation Here, the formula (9) for η * is called as the characterization of the optimal control. The optimal control and states can be found by solving the optimality system consisting of the state system (4) with boundary conditions, the adjoint system (6) and (7), and the characterization of the optimal control (9). To solve the optimality system, we use the initial and transversality conditions together with the characterization of the optimal control η * given by (9).
By substituting the values of η * (t) into the control system (4), we get the following system To find out the optimal control and the state system, we need to numerically solve the above system (10).
Numerical algorithm. In this section, we apply an iterative approach called Gauss-Seidel-like implicit finite-difference method to solve the optimality system. First, we discretize the time interval [0, T] into n sub-intervals at the points t k = kδ, k = 0, 1, … , n(nδ = T), where δ is the time step. It is well known that the derivative of a differentiable function x(t) is defined by Thus, the time derivative of the state variable can be approximated by its first-order forward-difference when the time step δ is small enough, e.g., In the sequel, we denote S k = S(t k ),  Then, the above state values can be used to solve the adjoint equations by approximating the time derivative of the adjoint variables using their first-order backward-differences because of the transversality conditions. Thus, we derive Next, we can formulate an algorithm to solve the optimality system and get the optimal control by certain calculations. It follows by (11) and (12) that Then, by some calculations, it follows by (9) that the value of the optimal control at time t k + 1 is formulated as Numerical simulations. In this section, we aim to do some numerical simulations for the optimality system by using the above iterative method. In order to compare the numerical results of system (1) and the control system, we consider the same parameters, i.e., b = 100, d = 0.01, λ = 0.00005, ζ = 0.1, η = 0.5, ε = 0.2, γ = 0.2, and φ = 0.001, ϕ = 30. Through certain calculations, we plot Fig. 9(a) that shows the evolutions of the numbers of nodes in each compartment with the optimal control shown in Fig. 9(b). We can see in Fig. 9(b) that the optimal control η(t) increases in the early time and finally tends to a constant. This means that we should enlarge the security investment in the process of control. Figure 9(c) illustrates the evolution of infected nodes with optimal control, compared to the number of infected nodes without control. In order to explore the influence of parameter ϕ, we design a numerical experiment with ϕ as a variable. Consider other parameters given above, Fig. 10 shows the dynamics of system (3) and (4) with five different values of ϕ. It is shown in Fig. 10(b) that the control variable decreases and approaches the equilibrium earlier with the increase of ϕ, while Fig. 10(a) shows that the number of infected nodes increases for greater value of ϕ. This indicates that security investment should be properly cut down when the cost arrives high enough.

Discussion
In this study, we introduce several parameters to describe the spread processes of web malware based on their mechanism analysis, and develop a new compartmental SDIRS model with varying network size to model the spread of web malware over networks. We compute the propagation threshold of the model and carry out its sensitivity analysis. The properties of the elementary model system are also carefully analyzed. If the threshold is below unity, the global stability of the malware-free equilibrium is theoretically proved. The malware equilibrium is proved to be locally stable if the threshold exceeds unity. Although we study the long behavior of this model, it can be only used to describe web malware spread within a short time interval since the parameters in the SDIRS model are assumed to be constant.
Practical parameters are actually varying with time. So, based on the newly established SDIRS model, we consider the parameter η to be varying and controllable. Aiming to keep the total economic loss of security investment and infection loss as low as possible, we propose an objective functional and study the optimal control strategy towards the η parameter. Through theoretical analysis and the Pontryagin's maximum principle, the expression of the optimal control is explicitly given. Numerical simulations show the effectiveness of taking the control strategy on inhibiting the spread of web malware over networks. Also, we suggest that users should enhance their awareness levels of network security, such as being able to discriminate malicious links and not to click on strange hyperlinks, installing updated anti-virus software on devices, keeping browsers updated and installing patches immediately. We develop the model (1) based on a homogeneously mixed assumption of the propagation network. It can be applied to model the proliferation of web malware over complete or regular networks. But, most real-world networks, such as the WWW and Internet, have been empirically found to be highly structured rather than simply homogeneously, e.g., each device may have heterogeneous malicious hyperlinks. The compartment-based models suffer from a common defect of not making full use of the knowledge concerning the structure of the propagation network. As a result, it is worth understanding the impact of network topology on the web malware prevalence. In recent years, network(node)-based models have already been considered and developed to model infectious disease diffusion over complex networks 44,45 , such as spatial epidemics 46 and waterborne diseases 47 . Thus, our future work is to formulate further novel network-based models by incorporating the influence of network topology on web malware spread.

Methods
Calculation of the threshold. Van . For the functions in the above system, five assumptions (A1)-(A5) are described below.
 is set to zero, then all eigenvalues of Df(x 0 ) have negative real parts, where Df(x 0 ) is the derivative [∂ f i /∂ x i ](i.e., Jacobian matrix) evaluated at x 0 which is a (locally asymptotically) stable equilibrium.
Then, van den Driessche and Watmough proved a useful lemma (see Lemma 1 in the ref. 48). That is, if the above assumptions (A1)-(A5) are satisfied, then the derivatives DF(x 0 ) and DV(x 0 ) are partitioned as where F and V are the m × m matrices defined by Then the SDIRS model can be rewritten as follows  It is easy to verify that the functions satisfy assumptions (A1)-(A5).
is set to zero, then all eigenvalues of Df(x 0 ) have negative real parts.
Then, it follows by the above result (see also Lemma 1 in the ref. 48) that λ ε . Then, the threshold is the spectral radius of the matrix FV −1 , i.e., Proof of the global stability of malware-free equilibrium E 0 . Theorem 3. The malware-free equilibrium point  0 of model system (1) is globally asymptotically stable with respect to Ω if R 0 < 1. Proof. We proceed by use of the Lyapunov direct method with undetermined coefficients. Denote  = . b d / Consider the following candidate function where ω 1 , ω 2 , ω 3 are positive constants to be determined. Clearly, it follows by D(t) ≥ 0, The time derivative of  along an orbit of system (1) , then we need to find appropriate ω 2 such that the following two inequalities are satisfied represents the initial size of the network. Next, we proceed by considering two cases.
Case 2: c > 0. In this case, N(0) > b/d and N(t) is strictly decreasing. That is, the network size keeps reducing to the minimum limit of b/d.
The proof completes by following the above two cases.
Proof of the stability of malware equilibrium  ⁎ . Theorem 4. The malware equilibrium  ⁎ of system Proof. Here, we use the method of first approximation to show the asymptotic stability of  ⁎ . By certain calculations, the Jacobian matrix of (1) at a point  = ∈Ω S D I R ( , , , ) can be derived as Thus, the Jacobian matrix of (1) at the malware equilibrium  ⁎ is Scientific RepoRts | 7:42308 | DOI: 10.1038/srep42308 Next, we only need to confirm the matrix  ⁎ J ( ) is stable, namely, the real parts of all its eigenvalues are negative. This is usually done by checking the Routh-Hurwitz conditions, but here verification of the inequalities in the Routh-Hurwitz conditions for  ⁎ J ( ) is technically rather complicated. So, we use another criteria for the stability of matrices. That is, for an m × m matrix A with real entries to be stable, it is necessary and sufficient that: (1) the second compound matrix (See Methods D) A [2] is stable; . This result was developed by Li et al. 39 by using the spectral properties of the second compound matrices (also see Lemma 5.1 in ref. 39).
Thus, it remains to show that  ⁎ J ( ) satisfies the above conditions (1) and (2). The second compound matrix  ⁎ J ( ) [2] of the Jacobian matrix  ⁎ J ( ) is  For the diagonal matrix P = diag (I * , (γ + d)/(ε)I * , t 1 I * , S * , t 2 S * , t 3 S * ), where t 1 , t 2 , t 3 are positive real constants to be determined, then the matrix  ⁎ J ( ) [2] is similar to This verifies the above condition (2) and completes the proof.

Remark:
If ζ = 0, then system (1) can be also reduced to model the case that recovered nodes have permanent subject to the restriction s(t) + d(t) + i(t) + r(t) = 1. In system (16), the term b/N(t) represents the percentage of newly-connected S-nodes over the whole network within unit time, λN(t) means the average infection rate of I-nodes over the whole network per unit time. Next, we denote / ( ), : ( ), and assume that  b, λ  keep constant here, which indicates that in this case b and λ are actually changing with the varying network size.
Observe that the variable r(t) does not appear in the first three equations of (16) and note that the identity s(t) + d(t) + i(t) + r(t) = 1 implies =  d b. This allows us to attack (16) by studying the subsystem From physical considerations, we study (16)  . It can be verified that Γ is positively invariant with respect to (17). We denote by ∂ Γ and Γ the boundary and the interior of Γ , respectively. Note that system (17) is essentially equivalent to a special case (α = 0) of system (2.3) in ref. 39. Thus, we can similarly address the global stability of the malware-free equilibrium (respectively, malware equilibrium) of system (17) with respect to Γ (respectively, Γ ) by the method given in ref. 39.
Scientific RepoRts | 7:42308 | DOI: 10.1038/srep42308 Compound matrices. For an n × n matrix A and integer 1 ≤ k ≤ n, the k-th additive compound matrix of A is denoted by A [k] . This is an N × N matrix, = ( ) where B(k) is the kth exterior power of an n × n matrix B and D + denotes the right-hand derivative. Some details for the definition and properties of additive compound matrices together with their connections to differential equations can be referred to the papers 49,50 .
The entries in A [2] are linear relations of those in A. Let A = (a ij ). For any integer = … ( ) i n 1, , 2 , let (i) = (i 1 , i 2 ) be the ith member in the lexicographic ordering of integer pairs such that 1 ≤ i 1 < i 2 ≤ n. Then, the entry in the ith row and the jth column of Z = A [2] is defined by Pertinent to our purpose, for n = 4, the second additive compound matrix A [2] of an n × n matrix A = (a ij ) is For any integer 1 ≤ k ≤ n, the kth additive compound matrix A [k] of A is defined canonically. Some properties of the additive compound matrices and further applications can be found in the refs 50 and 51.