Rumor spreading model considering rumor credibility, correlation and crowd classification based on personality

The study of rumor spreading or rumor controlling is important and necessary because rumors can cause serious negative effects on society. The process of rumor spreading is influenced by many factors. In this paper, we suggest that people with different personalities will behave differently after hearing rumors. Thus, we divide the population into two types: radical people and steady people. Furthermore, we suggest that the credibility of rumors and the correlation between rumors and people’s lives are important factors that will influence the spread of rumors. Based on these considerations, we propose the SEIsIrR model. We establish differential equations to describe the dynamics of the rumor spreading process in homogeneous and heterogeneous networks. Using the Jacobian matrix and next generation matrix, we obtain the spreading threshold of the SEIsIrR model and discuss the relationship of the spreading threshold between homogeneous networks and heterogeneous networks. We employ a real rumor dataset obtained from Twitter to verify the SEIsIrR model and perform numerical simulations in Watts-Strogatz (WS) networks and Barabasi-Albert (BA) networks to verify the obtained spreading thresholds and discuss the impacts of these factors on the rumor spreading process and the differences in the rumor spreading processes between WS networks and BA networks. The simulation results show that these factors influence the speed and range of rumor spreading.

Xuelong Chen* & Nan Wang
With the development of the internet, rumor spreading has become easier and faster 1,2 . Social networking service (SNS) is the main platform of rumor spreading due to its numerous users and complex network structure. Everyone in social networks is both a spreader and a recipient of information. So, a rumor is easily produced and spread widely, and some rumors cause great panic in society 3,4 . For example, a rumor that nuclear leakage caused by the Fukushima nuclear accident in Japan would pollute salt was widely spread on the internet in 2011 and caused panic buying of salt. Since rumor spreading can cause serious consequences, an in-depth investigation of rumor spreading in complex social networks has significance 5 and can help governments or managers of SNS to control rumor spreading.
The process of rumor spreading in social networks is similar to the process of epidemic spreading; thus, most studies of rumor spreading are based on epidemic models. Daley and Kendall formalized the first rumor spreading model, namely, the DK model, in 1964; the model is based on the classic SIR epidemic model 6. In the DK model, the crowd is divided into three groups: people who have not heard rumors (ignorants), people who spread rumors (spreaders), and people who stop spreading rumors (stiflers). As complex network theory developed, the effect of complex network structure on the process of rumor spreading was explored. Moreno et al. 7 introduced mean-field equations to characterize the rumor spreading process in complex networks. Nekovee et al. 8 contrasted the rumor spreading process in a random network with that in a scale-free network; the results showed that the rumor spreading model exhibits different spreading thresholds in different networks.
In addition, rumor spreading is a social contagion process, in which people's behaviors and social environments may influence the process of rumor spreading. Thus, some researchers considered people's behaviors and social environments in rumor spreading. Zhao et al. proposed a rumor spreading model that considers the forgetting mechanism, which expressed that spreaders may convert to stiflers without contacting others.
These researchers also verified that the forgetting rate, which changes over time, has a significant impact on rumor spreading 9-11. Corresponding to the forgetting mechanism, some scholars considered the remembering mechanism and arrived at the SIHR model 12,13. Wang et al. 14 believed that trust between people can influence rumor spreading and proposed a rumor spreading model that considers the trust mechanism to analyze this influence. Additionally, two or more rumors may spread at the same time. Some scholars extended the classic single-rumor spreading models to coupled spreading models, which consider two rumors 15-17. Moreover, some people may refute the rumors that they have heard. In light of this, Zan et al. 18 focused on the counterattack mechanism and analyzed the influence of networks' self-resistance on rumor spreading. Similarly, Zhang et al. 19 designed the $IS_1S_2C_1C_2R_1R_2$ propagation model to study rumor refutation in complex social networks. When people hear rumors, most of them will not immediately become spreaders; instead, they contemplate whether the rumors are true. In view of this, Huo et al. 20 proposed the XWYZ model. Compared with the SIR model, this model considers a new group W, which denotes people who hesitate to spread rumors. Xia et al. 21 proposed the SEIR model with the hesitating mechanism and introduced the fuzziness of a rumor's content as a parameter of this model. Hu et al. 22 believed that there are three attitudes towards rumors, namely, supportive, opposed and hesitant, and they verified that people who hesitate to spread rumors have a positive effect on the spreading of rumors. Moreover, some researchers suggested that individuals' decisions and actions depend on their states of mind and surroundings when they hear rumors 23,24. Sahafizadeh et al. 25 indicated that group propagation, where a group means a small community in which people may not directly know each other but all members can view messages sent to the group, has an impact on rumor spreading.
The differences among people and some of their attributes, such as education level and ability to identify information, may influence their decisions when they hear rumors. Afassinou introduced the population's education rate in the rumor spreading model and divided the ignorants into two parts, namely, educated ignorants and non-educated ignorants; the simulation results showed that the education rate influenced the process of rumor spreading 26. Similarly, Wang et al. 27 divided the ignorants into two parts according to their ability to identify information; the simulation results verified that the division is significant. In addition, Ma et al. 28 applied individuals' degree of knowledge mastery and degree of rationality to depict individuals' diverse characteristics and concluded that disseminating rumors is difficult due to these diverse characteristics. Li et al. 29 discussed individuals' sensitivity to rumor spreading based on the SIS and SIR models and concluded that lower sensitivity can inhibit the spread of rumors. Furthermore, the connection between individuals also influences rumor spreading. Cheng et al. 30 developed a model with a dynamic spreading probability, which is related to the strength of individuals' connections in a social network, and verified that the strength of connection substantially impacts the process of rumor spreading.
However, we find that the majority of previous studies did not consider rumor credibility in rumor spreading models. Although some scholars considered the importance and fuzziness of rumors 21,31, these two attributes cannot properly represent the credibility of rumors. The fuzziness and importance of a rumor depict only features of its wording, whereas credibility describes how persuasive the rumor is to people. In addition to textual content, measuring the credibility of a message requires the consideration of additional factors, such as information sources 32. Therefore, it is appropriate to include credibility in the model.
Although crowd classification has been introduced into rumor spreading models in some papers 26,27, crowd classification based on personality is a relatively new perspective. An individual's personality describes intrinsic human traits that influence external behavior 33. We hypothesize that a crowd can be divided into two categories based on personality. After hearing a rumor, some people are radical, that is, they easily believe what they heard. Conversely, some people are steady and calm and are likely to contemplate and seek confirmation before they decide whether the rumor is true and whether to spread it. Besides, people are more likely to focus on news that is relevant to their lives. Some studies have shown that information that is relevant to people's lives is more easily spread 34. For example, salt is relevant to people's lives; the rumor that salt would be polluted by the Fukushima nuclear accident was widely spread in China in 2011. Thus, the degree of correlation between rumors and people's lives is an important factor that influences the rumor spreading process and should be taken into account in the study of rumor spreading. However, previous studies did not adequately consider this factor.
The remainder of this paper is structured as follows: In the section titled "Rumor spreading model", the description of our model is given. We establish differential equations to describe the dynamics of this model. The spreading thresholds of this model in homogeneous and heterogeneous networks and their relationship are also discussed in this section. In the section titled "Verification and numerical simulation", the model is validated by real data, and numerical simulations are performed to illustrate the rumor spreading process and the impacts of different factors on the process based on our model. A brief conclusion is given in the last section.

Rumor spreading model
In this section, a rumor spreading model, which considers rumor credibility, correlation between rumors and people's lives and crowd classification based on personality, is proposed and referred to as the SEIsIrR model.
Based on the classic SIR model and the hesitating mechanism, we divide individuals into 5 classes according to the process of rumor spreading as follows.
• Steady ignorant. This class includes people who do not know the rumor; if they hear the rumor, they prefer to contemplate it and seek confirmation before making decisions.
• Radical ignorant. This class includes people who do not know the rumor; if they hear the rumor, they are most likely to believe it and spread it without contemplating it or seeking confirmation.
• Exposed. This class includes people who know the rumor but hesitate to believe it and do not spread it.
• Spreader. This class includes people who spread the rumor.
• Stifler. This class includes people who know the rumor but never spread it or stop spreading it.
These 5 classes are represented as Is, Ir, E, S and R, respectively. Each individual belongs to exactly one of the 5 classes. In addition, individuals' classes are not fixed and can change as the rumor spreads. The transition rules for the 5 classes are explained as follows:
(1) Rule 1: From class Ir to class S. If a rumor is more credible and more relevant to people's lives, people who belong to Ir are more likely to switch to class S when they hear the rumor. Thus, we assume that when an individual in class Ir comes into contact with an individual in class S and gets to know the rumor, he or she will switch to class S with probability γαλ, where γ is the credibility of the rumor; α is the correlation coefficient, which depicts the degree of correlation between the rumor and people's lives; and λ is the spreading probability, which depicts the probability that an ignorant individual learns the rumor via contact with an individual in class S 22. These 3 parameters are continuous variables that are constrained as 0 ≤ γ < 1, 0 < α ≤ 1 and 0 ≤ λ ≤ 1. The higher γ is, the more credible the rumor is. When γ = 1, the rumor is fully credible, which means that everyone who knows the rumor will believe it; however, this situation is impossible in real life. Therefore, we limit γ to less than 1. The higher α is, the more relevant the rumor is to people's lives. α = 1 implies that the rumor is closely bound to people's lives.
(2) Rule 2: From class Is to classes S and E. When an individual in class Is comes into contact with an individual in class S, he or she will switch to class S or class E with different probabilities. Since individuals in class Ir are more likely to spread rumors than individuals in class Is, we introduce the parameter μ (0 < μ ≤ 1) to depict the ratio of the spreading desire of individuals in class Is to that of individuals in class Ir. The probability that individuals in class Is will switch to class S can be defined as γαλμ. By this definition, the parameter μ should only play a role in the switching from ignorants to spreaders, and individuals in class E are not spreaders because they hesitate to spread the rumor even though they have heard it. Therefore, it is reasonable that μ does not act on the switching from Is to E. On the other hand, individuals in class Is are calmer than individuals in class Ir; so, they may switch to class E with the probability γ(1-γ)αλ. Here γ = 1 means that the rumor is fully credible, and γ = 0 means that the rumor is fully unreliable; none of the individuals will switch to class E in either case. When γ takes the median value (γ = 0.5), the rumor is most doubtful and the number of individuals who will switch to class E attains its maximum. In other situations, higher and lower credibility, such as γ = 0.7 and γ = 0.3, may produce a similar degree of hesitation. Therefore, we suggest that more credible and less credible rumors have a similar influence on the probability that individuals will switch from Is to E. According to reality and intuitive experience, a certain relationship between the rumor credibility and the proportion of people who hesitate should exist. We assume that the proportion of people who hesitate is positively correlated with γ(1-γ), which will be verified by real-world data and the simulation in the section titled "Verification and numerical simulation". Thus, it is a reasonable assumption that the probability that individuals in class Is will switch to class E is γ(1-γ)αλ.
(3) Rule 3: From class E to classes S and R. Individuals in class E can switch to class S or R depending on the class of the individuals with whom they are in contact. Individuals in class E will decide whether to spread rumors when they are influenced by others. Therefore, we assume that individuals in class E switch to class S with the probability θ when they are in contact with an individual in class S and switch to class R with the probability φ when they are in contact with an individual in class R.
(4) Rule 4: From class S to class R. When an individual in class S comes into contact with an individual in class S, E or R, the former will switch to class R with the probability $\eta_1$ because a spreader will realize that the rumor is not new when he or she meets another person who knows the rumor. Additionally, individuals in class S will switch to class R at the rate $\eta_2$ due to the forgetting mechanism, which means that some individuals may forget the rumor in the spreading process.
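The per-contact probabilities in Rules 1 and 2 can be tabulated with a few lines of Python. This is only an illustrative sketch: the function name `contact_probabilities` and the dictionary keys are hypothetical, and the parameter ranges are the ones stated in the rules above.

```python
def contact_probabilities(gamma, alpha, lam, mu):
    """Transition probabilities for an ignorant individual who meets a spreader.

    gamma: rumor credibility, alpha: correlation coefficient,
    lam: spreading probability, mu: spreading desire ratio (Is relative to Ir).
    """
    if not (0 <= gamma < 1 and 0 < alpha <= 1 and 0 <= lam <= 1 and 0 < mu <= 1):
        raise ValueError("parameter outside the range stated for the model")
    base = gamma * alpha * lam
    return {
        "Ir->S": base,                # Rule 1: radical ignorant becomes a spreader
        "Is->S": base * mu,           # Rule 2: steady ignorant becomes a spreader
        "Is->E": base * (1 - gamma),  # Rule 2: gamma * (1 - gamma) * alpha * lam
    }


if __name__ == "__main__":
    # Hypothetical parameter values, chosen only to show the shape of Is->E in gamma.
    for g in (0.3, 0.5, 0.7):
        print(g, contact_probabilities(g, alpha=0.8, lam=0.6, mu=0.5))
```

With the other parameters fixed, the hesitation probability is largest at γ = 0.5 and takes the same value at γ = 0.3 and γ = 0.7, which is the symmetry the text attributes to the factor γ(1-γ).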
Based on transition rules 1-4, we construct a flow diagram of the rumor spreading process, as shown in Fig. 1. The meanings of the parameters of the SEIsIrR model are summarized in Table 1.
Because the period from the birth of a rumor to its disappearance is relatively short, the change in the total number of individuals is minimal and negligible in this period. So, based on the premise that the analysis results will not be affected and to simplify the discussion, we assume that the total number of individuals is N and will remain constant during the process of rumor spreading.
Because complex networks such as social networks are usually the carriers of rumor spreading, we discuss the rumor spreading process based on the SEIsIrR model in complex networks to validate the model. Various complex networks exist, such as the Watts-Strogatz (WS) network, Erdős-Rényi (ER) network and Barabasi-Albert (BA) network. These networks can be divided into two categories according to the degree distribution, namely, homogeneous networks and heterogeneous networks. In the next subsections, we will discuss the spreading threshold of the SEIsIrR model in homogeneous and heterogeneous networks.

Spreading threshold of SEIsIrR model in homogeneous networks. Let $\rho_{Is}(t)$, $\rho_{Ir}(t)$, $\rho_E(t)$, $\rho_S(t)$ and $\rho_R(t)$ be the densities of individuals in classes Is, Ir, E, S and R, respectively, at time t in homogeneous networks. These densities satisfy the normalization condition

$$\rho_{Is}(t) + \rho_{Ir}(t) + \rho_E(t) + \rho_S(t) + \rho_R(t) = 1. \tag{1}$$

According to the transition rules of the SEIsIrR model, we establish mean-field equations, Eq. (2), to describe the evolution of these densities in the homogeneous networks, where $\bar{k}$ denotes the average degree of the homogeneous networks. The steady state of Eq. (2) is expressed as

$$E_0 = (x,\ y,\ 0,\ 0,\ 1 - x - y),$$

where x and y denote the density of individuals in class Is and the density of individuals in class Ir, respectively, when no individuals exist in classes E and S. The density of individuals in class R is then 1-x-y according to Eq. (1).
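The display of the mean-field system is not legible in this text. One system that follows directly from transition rules 1-4 is sketched below; it is a reconstruction under the assumption that every contact-driven term carries the factor $\bar{k}$ while the forgetting term $\eta_2\rho_S$ does not, and it should not be read as a verbatim copy of Eq. (2).

```latex
\begin{aligned}
\frac{d\rho_{Is}}{dt} &= -\bar{k}\,\gamma\alpha\lambda\,(\mu + 1 - \gamma)\,\rho_{Is}\,\rho_S,\\
\frac{d\rho_{Ir}}{dt} &= -\bar{k}\,\gamma\alpha\lambda\,\rho_{Ir}\,\rho_S,\\
\frac{d\rho_E}{dt}    &= \bar{k}\,\gamma(1-\gamma)\alpha\lambda\,\rho_{Is}\,\rho_S
                         - \bar{k}\,\theta\,\rho_E\,\rho_S
                         - \bar{k}\,\varphi\,\rho_E\,\rho_R,\\
\frac{d\rho_S}{dt}    &= \bar{k}\,\gamma\alpha\lambda\,(\mu\,\rho_{Is} + \rho_{Ir})\,\rho_S
                         + \bar{k}\,\theta\,\rho_E\,\rho_S
                         - \bar{k}\,\eta_1\,\rho_S\,(\rho_S + \rho_E + \rho_R)
                         - \eta_2\,\rho_S,\\
\frac{d\rho_R}{dt}    &= \bar{k}\,\varphi\,\rho_E\,\rho_R
                         + \bar{k}\,\eta_1\,\rho_S\,(\rho_S + \rho_E + \rho_R)
                         + \eta_2\,\rho_S.
\end{aligned}
```

Two quick checks support this form. The right-hand sides sum to zero, so the normalization condition is preserved, and every right-hand side vanishes when $\rho_E = \rho_S = 0$, so any state $(x, y, 0, 0, 1-x-y)$ is a steady state.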
We employ the Jacobian matrix method to discuss the stability of Eq. (2) and obtain the rumor spreading threshold in the homogeneous networks.
The Jacobian matrix of Eq. (2) is evaluated at $E_0 = (x, y, 0, 0, 1 - x - y)$, and the eigenvalues of $J(E_0)$ are calculated by solving its characteristic equation, Eq. (6). Because the first 3 eigenvalues are 0 and the last eigenvalue is non-positive, whether point $E_0$ is stable depends on the fourth eigenvalue.
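For readers who want to retrace this step, the linearization can be written out under the assumption that the mean-field rates follow rules 1-4 with every contact term multiplied by $\bar{k}$. The matrix below is a reconstruction under that assumption, not the original display. The variable order is $(\rho_{Is}, \rho_{Ir}, \rho_E, \rho_S, \rho_R)$, $c = 1 - x - y$, and $\Lambda$ (a symbol introduced here) denotes an eigenvalue, to avoid a clash with the spreading probability λ.

```latex
J(E_0) =
\begin{pmatrix}
0 & 0 & 0 & -\bar{k}\gamma\alpha\lambda(\mu + 1 - \gamma)\,x & 0 \\
0 & 0 & 0 & -\bar{k}\gamma\alpha\lambda\,y & 0 \\
0 & 0 & -\bar{k}\varphi c & \bar{k}\gamma(1-\gamma)\alpha\lambda\,x & 0 \\
0 & 0 & 0 & \bar{k}\gamma\alpha\lambda(\mu x + y) - \bar{k}\eta_1 c - \eta_2 & 0 \\
0 & 0 & \bar{k}\varphi c & \bar{k}\eta_1 c + \eta_2 & 0
\end{pmatrix},
\qquad
\Lambda^{3}\,\bigl(\Lambda + \bar{k}\varphi c\bigr)\,
\bigl(\Lambda - [\bar{k}\gamma\alpha\lambda(\mu x + y) - \bar{k}\eta_1 c - \eta_2]\bigr) = 0 .
```

The roots are $\Lambda_1 = \Lambda_2 = \Lambda_3 = 0$, $\Lambda_4 = \bar{k}\gamma\alpha\lambda(\mu x + y) - \bar{k}\eta_1(1-x-y) - \eta_2$ and $\Lambda_5 = -\bar{k}\varphi(1-x-y) \le 0$, which matches the description in the text: three zero eigenvalues, one non-positive eigenvalue, and a single eigenvalue whose sign decides stability.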
In similar studies, the basic reproduction number (denoted as $R_0$) is usually employed to describe the number of secondary cases caused by a spreader in a completely ignorant population 35. $R_0 < 1$ means that each spreader produces less than one new infection on average, so the number of spreaders does not increase and the rumor will fade. Conversely, $R_0 > 1$ means that the rumor will infect more people and can continue to spread in the crowd.
Based on the fourth eigenvalue, we define $R_0$ as follows:

$$R_0 = \frac{\bar{k}(\mu x + y)\gamma\alpha\lambda}{\bar{k}\eta_1(1 - x - y) + \eta_2}. \tag{7}$$

If $R_0 < 1$, then $\bar{k}(\mu x + y)\gamma\alpha\lambda < \bar{k}\eta_1(1 - x - y) + \eta_2$, that is, the fourth eigenvalue of $J(E_0)$ is negative. Since all eigenvalues are non-positive in this case, the equilibrium point $E_0$ is stable, which means that the rumor will not spread further. Conversely, if $R_0 > 1$, the fourth eigenvalue of $J(E_0)$ is positive, and the equilibrium point $E_0$ is not stable, which means that the rumor will continuously spread in the network.
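The threshold behavior can be illustrated numerically. The sketch below integrates a mean-field system written directly from rules 1-4 with a forward Euler step and evaluates the threshold expression $\bar{k}(\mu x + y)\gamma\alpha\lambda / (\bar{k}\eta_1(1-x-y) + \eta_2)$. The function names, the Euler scheme, the step size and all parameter values in the usage lines are assumptions made for illustration; they are not the settings used in the paper.

```python
def r0_homogeneous(k, x, y, gamma, alpha, lam, mu, eta1, eta2):
    """Threshold expression for a homogeneous network with average degree k."""
    return k * (mu * x + y) * gamma * alpha * lam / (k * eta1 * (1 - x - y) + eta2)


def integrate(k, rho0, gamma, alpha, lam, mu, theta, phi, eta1, eta2,
              dt=0.01, steps=5000):
    """Forward-Euler integration of the assumed mean-field system.

    rho0 is (Is, Ir, E, S, R) at t = 0. Returns the spreader trace and the final state.
    """
    i_s, i_r, e, s, r = rho0
    trace = [s]
    for _ in range(steps):
        known = e + s + r  # anyone who already knows the rumor
        d_is = -k * gamma * alpha * lam * (mu + 1 - gamma) * i_s * s
        d_ir = -k * gamma * alpha * lam * i_r * s
        d_e = (k * gamma * (1 - gamma) * alpha * lam * i_s * s
               - k * theta * e * s - k * phi * e * r)
        d_s = (k * gamma * alpha * lam * (mu * i_s + i_r) * s + k * theta * e * s
               - k * eta1 * s * known - eta2 * s)
        d_r = k * phi * e * r + k * eta1 * s * known + eta2 * s
        i_s += dt * d_is
        i_r += dt * d_ir
        e += dt * d_e
        s += dt * d_s
        r += dt * d_r
        trace.append(s)
    return trace, (i_s, i_r, e, s, r)


if __name__ == "__main__":
    # Hypothetical parameters; only lam differs between the two runs.
    common = dict(gamma=0.5, alpha=0.5, mu=0.5, eta1=0.1, eta2=0.2)
    start = (0.396, 0.594, 0.0, 0.01, 0.0)
    for lam in (0.06, 0.3):
        r0 = r0_homogeneous(10, 0.4, 0.6, lam=lam, **common)
        trace, _ = integrate(10, start, lam=lam, theta=0.3, phi=0.3, **common)
        print(f"lam={lam}: R0={r0:.2f}, peak spreader density={max(trace):.4f}")
```

In the run with the threshold expression below 1 the spreader density only decays from its initial value, whereas in the run above 1 it first rises and then falls as stiflers accumulate.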
Spreading threshold of SEIsIrR model in heterogeneous networks. Nodes in heterogeneous networks have different degrees. For example, the node degree of the BA network follows a power-law distribution. For a better analysis, we divide the heterogeneous population into several homogeneous groups according to the nodes' degree. Each group is also divided into 5 classes (Is, Ir, E, S and R), as mentioned in the previous sections.
(2020) 10:5887 | https://doi.org/10.1038/s41598-020-62585-9 | www.nature.com/scientificreports
The densities of individuals with degree k in classes Is, Ir, E, S and R at time t are defined for each degree group, and they satisfy the corresponding normalization condition. If we use $p(k)$ to denote the degree distribution function of a network, which means the ratio of individuals with degree k to the whole population, then the overall density of class Is can be expressed in terms of $p(k)$ and the degree-specific densities, and the overall densities of the other classes can be expressed with the same pattern. The probability that nodes with degree k connect to nodes with degree $k'$ can be represented as $p(k'/k)$, and to simplify the discussion, we consider uncorrelated heterogeneous networks, in which $p(k'/k) = k'p(k')/\bar{k}$ and $\bar{k}$ is the average degree of the network. Thus, the probability that a node with degree k connects to a spreader node at time t is obtained by summing over the degrees $k'$ of its possible neighbors. Based on the transition rules of the SEIsIrR model and the mean-field method, we establish the mean-field equations in heterogeneous networks, Eq. (8). Eq. (8) attains a steady state when the time derivatives of all densities are 0, which defines the steady state $E_0$ of the heterogeneous system.
According to the next generation matrix method 35, we can calculate the basic reproduction number by the formula $R_0 = \rho(FV^{-1})$, where $FV^{-1}$ is referred to as the next generation matrix and $\rho(X)$ is the spectral radius of X. F and V can be instantiated by the differentials of $\mathcal{F}$ and $\mathcal{V}$, respectively, at $E_0$, where $\mathcal{F}$ denotes the rate of production of new infections, and $\mathcal{V}$ denotes the rate of change in the 5 classes. $\mathcal{V}$ can be obtained by the formula $\mathcal{V} = \mathcal{V}^- - \mathcal{V}^+$, where $\mathcal{V}^-$ denotes the outflow rate of individuals in the 5 classes and $\mathcal{V}^+$ denotes the inflow rate of individuals in the 5 classes.
In this paper, individuals who switch from class E to class S are not considered to be new infections because individuals in classes E and S have already heard rumors, which means that they are infected individuals 35. Based on the next generation matrix and Eq. (8), we obtain $\mathcal{F}$ and $\mathcal{V}$ as Eq. (10) and Eq. (11). By differentiating Eq. (10) and Eq. (11) at $E_0 = (x', y', 0, 0, p(k) - x' - y')$, we obtain the matrices F and V, and then the inverse of V. The basic reproduction number for Eq. (8) is the spectral radius of the next generation matrix $FV^{-1}$ 35, and the basic reproduction number for the heterogeneous population is the sum of the basic reproduction numbers for the homogeneous groups with different degrees. Therefore, we can obtain $R_0$ for the entire population, given as Eq. (15). Here, we use $p(k)$ to convert $x'$ and $y'$ into x and y, so that the $R_0$ of the homogeneous network and that of the heterogeneous network can be further compared with each other.

Relationship between the spreading thresholds in homogeneous networks and heterogeneous networks.
In the above analysis, we have obtained the spreading thresholds of the SEIsIrR model in the homogeneous networks and the heterogeneous networks, which are expressed in Eq. (7) and Eq. (15). Comparing these two equations, we obtain $R_0 = \bar{k}(\mu x + y)\gamma\alpha\lambda/\eta_2$ in both the homogeneous networks and the heterogeneous networks as $\eta_1$ approaches 0. Therefore, the basic reproduction number in the heterogeneous networks is equivalent to that in the homogeneous networks when $\eta_1$ approaches 0. In this limit, that is, when the probability that individuals in class S switch to class R through contact approaches 0, the basic reproduction number depends on the average degree of a network rather than on its degree distribution.
To further analyze the numerical relationship between $R_0$ in the homogeneous network and $R_0$ in the heterogeneous network, we obtain other forms of $R_0$ for the two kinds of networks by further conversion of Eq. (7) and Eq. (15), given as Eq. (16) and Eq. (17), respectively. To make $R_0$ in the homogeneous networks comparable to $R_0$ in the heterogeneous networks, we set the average degree $\bar{k}$ of the two networks to the same value. Comparing Eq. (16) and Eq. (17), $R_0$ in the heterogeneous networks is less than $R_0$ in the homogeneous networks. This result shows that the spreading threshold $R_0$ of heterogeneous networks is less than that of homogeneous networks when the parameters and the average degree are the same in both networks. In other words, rumors spread more widely in the homogeneous networks.
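The direction of this inequality can be checked numerically under explicit assumptions. The sketch below assumes that each degree group has a basic reproduction number of the same form as the homogeneous expression, with the group degree k in place of $\bar{k}$, and that the population value is the $p(k)$-weighted sum over groups with the same x and y in every group. Both points are assumptions made here; the exact form of Eq. (15) is not reproduced in this text. The function names and all numbers are hypothetical.

```python
def r0_group(k, x, y, gamma, alpha, lam, mu, eta1, eta2):
    """Assumed per-degree-group value: homogeneous form with degree k."""
    return k * (mu * x + y) * gamma * alpha * lam / (k * eta1 * (1 - x - y) + eta2)


def r0_weighted(degree_dist, **params):
    """Assumed population value: p(k)-weighted sum of the per-group values."""
    return sum(p * r0_group(k, **params) for k, p in degree_dist.items())


if __name__ == "__main__":
    # Toy two-degree network with the same mean degree (10) as a homogeneous one.
    dist = {2: 0.5, 18: 0.5}
    mean_k = sum(k * p for k, p in dist.items())
    params = dict(x=0.3, y=0.5, gamma=0.5, alpha=0.5, lam=0.3, mu=0.5,
                  eta1=0.5, eta2=0.2)
    print("heterogeneous:", r0_weighted(dist, **params))
    print("homogeneous:  ", r0_group(mean_k, **params))
    # With eta1 = 0 the per-group value is linear in k, so the two coincide.
    no_contact = dict(params, eta1=0.0)
    print("eta1 = 0:", r0_weighted(dist, **no_contact), r0_group(mean_k, **no_contact))
```

Under these assumptions the per-group value is a concave function of k whenever $\eta_1(1-x-y) > 0$, so averaging over a spread of degrees gives a smaller number than evaluating at the mean degree, and the two agree when $\eta_1 = 0$. This is consistent with both statements in the text, but it does not replace the comparison of Eq. (16) and Eq. (17).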

Verification and numerical simulation
In this section, we verify the SEIsIrR model by using a real rumor dataset of Twitter named Pheme and numerical simulations. Additionally, the impacts of different parameters on the rumor spreading process and the differences between the rumor spreading processes in a homogeneous network and those in a heterogeneous network are also discussed.
The WS network is a typical homogeneous network. Compared with the ER network, the degree distribution of the WS network is more concentrated, which is more consistent with the nature of a homogeneous network.
The WS network is closer to social networks in reality. As for heterogeneous networks, the BA network is the typical and widely recognized representative. Therefore, we simulate the process of rumor spreading on the WS network and the BA network in the numerical simulations.

Model verification.
Verification by actual data. The Pheme dataset is used to verify the rumor spreading model. First, we obtain categories that correspond to the classes in our model by classifying the tweets in the Pheme dataset according to their properties. Second, we adjust the parameters in our model to make the curve obtained by the model simulation best fit the curve generated by the data from the Pheme dataset. Last, we analyze the reasons for the slight difference between the two curves.
The Pheme dataset contains 4824 tweets associated with 9 different events on Twitter 36 . We choose two events: Charlie Hebdo and Sydney Siege. The reason for choosing these two events is that they contain the most tweets, and the larger the number of tweets is, the more obvious the transmission characteristics are.
The Pheme dataset has two types of tweets for each event: source tweet and reply tweet. A source tweet is an initial tweet, and a reply tweet is a response to a source tweet. Table 2 illustrates some tweets' properties and possible values that we use to classify tweets. The property "support" denotes the attitudes of source tweets. The property "response-vs-source" denotes the attitudes of reply tweets to source tweets. The property "certainty" denotes the degree of certainty with which the tweets are expressed.
The data in the Pheme dataset reflect all individuals involved in rumor spreading and their behaviors over a certain period of time. In our model, Is and Ir denote people who are not currently involved in rumor spreading, and R denotes people who know the rumor but never spread it or stop spreading it. So, capturing the changes in the number of individuals in classes Is, Ir and R is difficult in the Pheme dataset. Depending on the tweets' properties, we label some tweets in the Pheme dataset as rumor tweets or hesitating tweets, which correspond to classes S and E in our model.
For example, a source tweet is a rumor tweet if the value of its "support" property is supporting, and a reply to that source tweet is a rumor tweet if the value of "response-vs-source" is agreed. Regardless of the values of the other properties, if the value of "certainty" is uncertain, the tweet is judged to be a hesitating tweet. Although antirumors are outside the scope of this paper, a reply tweet to an antirumor tweet is judged to be a rumor tweet if the reply tweet has a "response-vs-source" property value of "disagreed". So, the antirumor tweets in the Pheme dataset are considered in the labeling process. The process of labeling tweets can be summarized as the following pseudo code.

    if x is source tweet:
        if x.support == supporting:
            x = rumor tweet
        else if x.support == denying:
            x = antirumor tweet
    else if x is reply tweet:
        if x.response-vs-source == appeal for more information or x.certainty == uncertain:
            x = hesitating tweet
        else if x.source == rumor tweet:
            if x.response-vs-source == agreed:
                x = rumor tweet
            if x.response-vs-source == disagreed:
                x = antirumor tweet
        else if x.source == antirumor tweet:
            if x.response-vs-source == agreed:
                x = antirumor tweet
            if x.response-vs-source == disagreed:
                x = rumor tweet

We can label tweets according to their properties; however, the values of specific parameters of the SEIsIrR model, such as γ, cannot be obtained from the Pheme dataset. Therefore, we adjust the parameters' values so that the model-derived curve best fits the curve derived from the real data, and we take the best-fitting values as the parameters' values. We perform multiple simulations with different parameters' values, which are iteratively changed from 0.1 to 1, and obtain the specific values in Table 3, with which the model-derived curve resembles the real-data-derived curve. In addition to the parameter adjustment, the initial population classification should also be set for the verification process.
Because the ratio of the number of radical ignorants to the number of steady ignorants cannot be obtained from the Pheme dataset, the ratio is set to 6:4 after many experiments.
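The labeling pseudo code can be turned into a small runnable function. The sketch below is one possible rendering: the `Tweet` dataclass, its field names and the label constants are hypothetical and do not mirror the actual layout of the Pheme annotation files, and tweets that match no rule are left unlabeled (`None`), which is an assumption about a case the pseudo code does not cover.

```python
from dataclasses import dataclass
from typing import Optional

RUMOR, ANTIRUMOR, HESITATING = "rumor", "antirumor", "hesitating"


@dataclass
class Tweet:
    is_source: bool
    support: Optional[str] = None             # source tweets only
    response_vs_source: Optional[str] = None  # reply tweets only
    certainty: Optional[str] = None           # reply tweets only
    source_label: Optional[str] = None        # label already assigned to the source tweet


def label_tweet(tweet: Tweet) -> Optional[str]:
    """Return the label of a tweet, or None if no rule applies."""
    if tweet.is_source:
        if tweet.support == "supporting":
            return RUMOR
        if tweet.support == "denying":
            return ANTIRUMOR
        return None
    # Reply tweets: hesitation takes precedence over agreement or disagreement.
    if (tweet.response_vs_source == "appeal for more information"
            or tweet.certainty == "uncertain"):
        return HESITATING
    if tweet.source_label in (RUMOR, ANTIRUMOR):
        opposite = ANTIRUMOR if tweet.source_label == RUMOR else RUMOR
        if tweet.response_vs_source == "agreed":
            return tweet.source_label
        if tweet.response_vs_source == "disagreed":
            return opposite
    return None


if __name__ == "__main__":
    source = Tweet(is_source=True, support="denying")
    reply = Tweet(is_source=False, response_vs_source="disagreed",
                  certainty="certain", source_label=label_tweet(source))
    print(label_tweet(source), label_tweet(reply))
```

Counting the tweets labeled as rumor and hesitating in successive time windows would then give curves comparable to the densities of classes S and E.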
Table 2. Tweet properties and possible values used to classify tweets.
Tweet type | Property | Values
Source tweets | support | supporting, denying, underspecified
Reply tweets | response-vs-source | agreed, disagreed, appeal for more information
Reply tweets | certainty | certain, somewhat-certain, uncertain, underspecified

For the Charlie Hebdo event, Fig. 2(a) shows how the number of rumor tweets and hesitating tweets changed in the first 6 hours in the Pheme dataset. We choose the first 6 hours because this duration is considered a complete rumor spreading process. Although secondary transmission may occur after a few hours, it may constitute another spreading process. Fig. 2(b) shows the simulated evolution of the densities of individuals in classes E and S in the BA network. Similarly, Fig. 3 shows how the number of rumor tweets and hesitating tweets changed in the Pheme dataset and the simulated dynamics in the BA network for the Sydney Siege event. The BA network is chosen for the simulation because it is closer to real social networks. A comparison of Fig. 2(a,b) or of Fig. 3(a,b) reveals that the dynamics of the real-world dataset and of our model are similar. The slight difference is attributed to the network structure, that is, the BA network is not exactly the same as the real-world network.
Verification by numerical simulations. Assume that a WS network consists of 10 000 nodes, which means that a rumor is spreading among 10 000 individuals, and that the average degree of the network is 10. For the BA network, we set the number of edges that connect to each new node to 5, which means that each new node chooses 5 existing nodes to connect to. The node degree of the BA network follows a power-law distribution. For ease of comparison, the number of nodes and the average degree of the BA network are set to 10 000 and 10, respectively, which is consistent with the size and average degree of the WS network. Based on these settings of the network structure parameters, we apply the NetworkX package in Python to generate the initial WS and BA networks. In the simulation process, the value of parameter t represents one tenth of the number of iterations of the simulation, and the 5 classes are dynamically updated in every iteration according to the transition rules among the 5 classes. Additionally, we assume that at the initial time (t = 0), the number of individuals in class S is 10, and the remaining individuals are people who have not heard the rumor, that is, they belong to class Ir or Is.
The final density of individuals in class R can be used to denote the final scale of rumor spreading because it is the density of individuals who have ever known the rumor throughout the rumor lifecycle. Thus, we can conclude that the final scale of rumor spreading in the WS network is larger than that in the BA network. However, the rumor spreading speed at the beginning stage in the BA network is faster than that in the WS network. These conclusions confirm that the central nodes, which have a large number of connected edges in the BA network, accelerate the spreading of rumors and restrain further spreading of rumors once they switch to class R. Therefore, the final scale of rumor spreading in the BA network is smaller than that in the WS network.
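The simulation setup can be sketched in plain Python. The paper generates its networks with NetworkX; to keep the example dependency-free, the sketch below grows a BA-style graph with a small hand-written preferential-attachment routine instead. The update scheme (synchronous steps in which every node contacts one randomly chosen neighbor) is an assumption, since the text does not specify it, and all default parameter values are hypothetical except the 10 initial spreaders and the 6:4 radical-to-steady ratio mentioned in the text.

```python
import random
from collections import Counter


def barabasi_albert(n, m, rng):
    """Preferential attachment: each new node links to m distinct existing nodes."""
    adj = {v: set() for v in range(n)}
    targets = list(range(m))  # the first new node links to the m seed nodes
    pool = []                 # each node appears once per incident edge
    for v in range(m, n):
        for u in targets:
            adj[v].add(u)
            adj[u].add(v)
        pool.extend(targets)
        pool.extend([v] * m)
        chosen = set()
        while len(chosen) < m:
            chosen.add(rng.choice(pool))
        targets = sorted(chosen)
    return {v: sorted(nbrs) for v, nbrs in adj.items()}


def simulate(adj, steps, rng, n_seeds=10, radical_share=0.6,
             gamma=0.5, alpha=0.5, lam=0.5, mu=0.5,
             theta=0.3, phi=0.3, eta1=0.3, eta2=0.1):
    """Run the transition rules on a graph; returns one Counter of classes per step."""
    nodes = list(adj)
    state = {v: "Ir" if rng.random() < radical_share else "Is" for v in nodes}
    for v in rng.sample(nodes, n_seeds):
        state[v] = "S"
    history = [Counter(state.values())]
    for _ in range(steps):
        nxt = dict(state)
        for v in nodes:
            me, other = state[v], state[rng.choice(adj[v])]
            draw = rng.random()
            if me == "Ir" and other == "S":            # Rule 1
                if draw < gamma * alpha * lam:
                    nxt[v] = "S"
            elif me == "Is" and other == "S":          # Rule 2
                p_spread = gamma * alpha * lam * mu
                p_hesitate = gamma * (1 - gamma) * alpha * lam
                if draw < p_spread:
                    nxt[v] = "S"
                elif draw < p_spread + p_hesitate:
                    nxt[v] = "E"
            elif me == "E":                            # Rule 3
                if other == "S" and draw < theta:
                    nxt[v] = "S"
                elif other == "R" and draw < phi:
                    nxt[v] = "R"
            elif me == "S":                            # Rule 4
                if other in ("S", "E", "R") and draw < eta1:
                    nxt[v] = "R"
                elif rng.random() < eta2:              # forgetting
                    nxt[v] = "R"
        state = nxt
        history.append(Counter(state.values()))
    return history


if __name__ == "__main__":
    rng = random.Random(0)
    graph = barabasi_albert(2000, 5, rng)
    final = simulate(graph, 200, rng)[-1]
    print({cls: final[cls] / 2000 for cls in ("Is", "Ir", "E", "S", "R")})
```

With m = 5 the routine adds exactly 5 edges per new node, so the average degree is 2·5·(n-5)/n, which is close to 10 for large n, in line with the setting described above. A WS network would need a separate generator (for example, `watts_strogatz_graph` in NetworkX, whose rewiring probability is not given in the text).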
To verify Eq. (7) and Eq. (15) and explore the influence of the value of $R_0$ on rumor spreading, we set $\rho_S(0) = 100$ to highlight the change in $\rho_S(t)$ ($\rho_S(0)$ is set to 100 only for Fig. 5) and simulate the evolution of the density of individuals in class S when $R_0 < 1$ and when $R_0 > 1$. Fig. 5(a) shows the two cases for the WS network. In the case of $R_0 < 1$, we set the spreading probability λ = 0.06 and obtain $R_0 = 0.81$ based on Eq. (7). According to the definition of $R_0$, the density of individuals in class S decreases when $R_0 < 1$, which means that the rumor does not spread further. In the case of $R_0 > 1$, we set the spreading probability λ = 0.3. By Eq. (7), we obtain $R_0 = 4.05$, which is greater than 1. This means that, on average, every spreader infects more than one ignorant individual when $R_0 > 1$; thus, the rumor can be propagated. Fig. 5(b) shows the same two cases for the BA network. Similar to the WS network, if $R_0 < 1$, the curve falls until the density of individuals in class S approaches 0, which means that the rumor does not spread further. Conversely, if $R_0 > 1$, the curve rises and then gradually falls, which means that more individuals are infected by the rumor.
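The two reported pairs (λ = 0.06, R_0 = 0.81) and (λ = 0.3, R_0 = 4.05) can be cross-checked with a short calculation. The sketch below assumes that, with all other parameters fixed, Eq. (7) makes R_0 proportional to λ. That assumption is inferred from the two reported values rather than stated here, and the variable names are illustrative.

```python
import math

# (lambda, R0) pairs reported in the text for the WS network, Fig. 5(a).
reported = [(0.06, 0.81), (0.3, 4.05)]

# Assumption (not stated in this section): with all other parameters fixed,
# R0 is proportional to lambda, i.e. R0 = c * lambda.
factors = [r0 / lam for lam, r0 in reported]
consistent = math.isclose(factors[0], factors[1], rel_tol=1e-9)

c = factors[1]               # implied proportionality factor
lambda_critical = 1.0 / c    # lambda at which R0 = 1 under the assumption

print(f"implied factors: {factors}, consistent: {consistent}")
print(f"implied critical lambda: {lambda_critical:.4f}")
```

Both pairs imply the same factor of about 13.5, so under the proportionality assumption the threshold R_0 = 1 corresponds to a spreading probability of roughly 0.074, which lies between the two simulated values.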

Discussion on the impacts of parameters.
For consistency, the simulations in this subsection use the network generation parameters given in the sub-subsection titled "Verification by numerical simulations".
Impact of credibility of rumor. Fig. 6 shows the evolution of the densities of individuals in classes E, S and R over time with different γ in the WS network. As shown in Fig. 6(a), the peak value of $\rho_E(t)$ increases with γ when γ < 0.5 and decreases with γ when γ > 0.5. When a rumor is fully credible or completely untrustworthy, no hesitation occurs. As γ approaches 0.5, the rumor becomes more confusing and more people doubt it. Therefore, the maximum value of $\rho_E(t)$ appears when γ = 0.5. As shown in Fig. 6(b), the peak value of $\rho_S(t)$ is positively correlated with γ, and the time for $\rho_S(t)$ to attain its peak decreases as γ increases. As shown in Fig. 6(c), the larger the value of γ is, the larger the final value of $\rho_R(t)$ is and the shorter the time to attain a steady state is. Thus, we can conclude that an increase in γ increases both the speed and the final scale of rumor spreading. Fig. 7 illustrates the changes in the densities of individuals in classes E, S and R over time with different γ in the BA network. The changes are similar to those in the WS network shown in Fig. 6; that is, the maximum density of individuals in class E appears when γ is 0.5 (refer to Figs. 6(a) and 7(a)). However, the curves with different γ in the BA network are closer to each other than those in the WS network (refer to Figs. 6(a,b) and 7(a,b)), which means that the differences in the time to attain their peak values with different γ are smaller in the BA network than in the WS network. Central nodes in the BA network accelerate the spread of rumors and reduce the influence of an increase in γ on rumor spreading.
Impact of crowd classification. Fig. 8 shows the evolution of the densities of individuals in classes E, S and R over time with different $\rho_{Ir}(0)$ in the WS network, where $\rho_{Ir}(0)$ represents the initial proportion of individuals in class Ir. Since classes E and R have no individuals at the initial time, that is, $\rho_E(0) = \rho_R(0) = 0$, the sum of $\rho_{Is}(0)$ and $\rho_{Ir}(0)$ is a definite value; that is, given the value of $\rho_{Ir}(0)$, the initial population classification is determined. $\rho_{Ir}(0)$ is set to 0.3995, 0.4995 and 0.5995, which means that the initial number of individuals in class Ir is less than, equal to and greater than the initial number of individuals in class Is, respectively. As shown in Fig. 8, the smaller $\rho_{Ir}(0)$ is, the higher the peak of $\rho_E(t)$ is. Conversely, the larger $\rho_{Ir}(0)$ is, the higher the peak of $\rho_S(t)$ is and the larger the final value of $\rho_R(t)$ is. According to these phenomena, if the number of radical people in a crowd increases, the speed of rumor spreading increases, the number of spreaders increases, and the number of people who hesitate to spread rumors decreases. Fig. 9 shows the corresponding results in the BA network. The peak values in the WS network are greater than those in the BA network, and the time to reach the peak values in the BA network is slightly shorter. The reason for these differences is the existence of central nodes in the BA network.
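The three initial splits used in the crowd classification experiment can be checked with a few lines of arithmetic. The sketch assumes the default setup from the simulation settings (10 000 individuals, 10 initial spreaders, classes E and R empty at t = 0); the variable names are illustrative.

```python
N = 10_000               # network size from the simulation settings
initial_spreaders = 10   # individuals in class S at t = 0
ignorant_total = N - initial_spreaders   # everyone else is in Ir or Is

splits = {}
for rho_Ir0 in (0.3995, 0.4995, 0.5995):
    n_Ir = round(rho_Ir0 * N)        # radical ignorants at t = 0
    n_Is = ignorant_total - n_Ir     # steady ignorants at t = 0
    splits[rho_Ir0] = (n_Ir, n_Is)
    print(f"rho_Ir(0) = {rho_Ir0}: Ir = {n_Ir}, Is = {n_Is}")
```

With 9990 ignorant individuals in total, the middle value 0.4995 is exactly the even split between the two classes, and the other two values shift 1000 individuals in either direction.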
Impact of correlation coefficient. Fig. 10 illustrates how the densities of individuals in classes E, S and R change over time with different α in the WS network. As shown in Fig. 10, with an increase in parameter α, both the peak values of $\rho_E(t)$ and $\rho_S(t)$ and the final value of $\rho_R(t)$ increase, and the time to attain their steady states decreases. The higher the correlation coefficient between a rumor and people's lives is, the larger the spreading scale of the rumor is and the faster the rumor spreads. This conclusion accords with reality. Fig. 11 describes the changes in the densities of individuals in classes E, S and R over time with different α in the BA network. The overall trends of the curves in Fig. 11 are the same as those in Fig. 10. However, each of the curves in Fig. 10(a,b) is symmetrical about its peak value, whereas each of the curves in Fig. 11(a,b) is asymmetrical. The nodes in the WS network have almost the same number of connecting edges, so the increasing processes of $\rho_E(t)$ and $\rho_S(t)$ are similar to their decreasing processes.
Comprehensive impact of credibility of a rumor and correlation coefficient. In this part, we investigate the combined impact of the credibility of a rumor and the correlation coefficient between a rumor and people's lives on rumor spreading to further understand the rumor spreading mechanism in the SEIsIrR model. The relationship between the final density of individuals in class R, denoted as $\rho_R(\infty)$, and the two parameters γ and α is shown in Fig. 12, where we set $0.1 \le \gamma \le 0.9$ and $0.1 \le \alpha \le 0.9$. The value of $\rho_R(\infty)$ will be very close to 0 or 1 when these two parameters are less than 0.1 or greater than 0.9. As shown in Fig. 12, $\rho_R(\infty)$ attains its maximum value of 0.98 when γ = 0.9 and α = 0.9. When γ and α are greater than 0.5, the curve falls slowly with a decrease in γ and α. However, when γ and α are less than 0.5, the curve falls rapidly with a decrease in γ and α. Therefore, we can conclude that $\rho_R(\infty)$ is more sensitive to γ and α when γ and α are less than 0.5. Comparing the trend of a surface on the α coordinate axes with that on the γ coordinate axes, the similar
Conclusion
Inspired by real life and based on previous studies, we first extended the classic SIR model to propose the SEIsIrR model, which considers crowd classification based on personality, the degree of correlation between rumors and people's lives, and rumor credibility. Second, we applied the Jacobian matrix and the next generation matrix to analyze the spreading threshold of the SEIsIrR model in homogeneous and heterogeneous networks and discussed the relationship between the spreading threshold in homogeneous networks and that in heterogeneous networks.
Last, we validated the model with real data and numerical simulations, simulated the rumor spreading process under different parameters, and explored the impacts of crowd classification, rumor credibility and the correlation coefficient between rumors and people's lives on the rumor spreading process in the WS and BA networks using Python. We obtained two main results: (1) The simulation results showed that when the credibility is closer to its intermediate value, more people hesitate to spread rumors, and the higher the credibility of a rumor is, the larger the final scale of rumor spreading is. If a social network has more radical people, rumors spread faster and reach a larger final scale. The higher the degree of correlation between a rumor and people's lives is, the larger the spreading scale of the rumor is and the faster the rumor spreads. (2) Under the same conditions, rumors spread faster in heterogeneous networks than in homogeneous networks. However, the scale of rumor spreading in heterogeneous networks is smaller than that in homogeneous networks. The central nodes in heterogeneous networks generate these differences.
In the future, we will consider the following two possible extensions. First, a more fine-grained population classification from the perspective of personality would be closer to social reality, so we might obtain the probability distribution of people with different personalities by analyzing real-world data in order to extend the SEIsIrR model and make it more practical. Second, the credibility and the degree of correlation of a rumor vary for different people, so we may further refine the characteristics of the correlation and the rumor credibility instead of characterizing each by a single parameter.