
Supplementary Information for

Modelling disease outbreaks in realistic urban social networks

S. Eubank, H. Guclu, V.S.A. Kumar, M.V. Marathe,
A. Srinivasan, Z. Toroczkai, N. Wang, and The EpiSims Team

Related web sites

Background information and access to data and software

Additional analysis of the social network

Methods: EpiSims modelling technology

Methods: Smallpox modelling results

Note on the degree distribution of a Random Geometric Graph

Future Directions



Background information and access to data and software

TRANSIMS

TRANSIMS is a large-scale, agent-based traffic simulation tool that has been under development at Los Alamos National Laboratory for the past 8 years, involving a team of about 40 people at a total cost of 30 million US dollars. It is sponsored by the US DOT (Department of Transportation), US DOD (Department of Defense) and LANL (Los Alamos National Laboratory). TRANSIMS is capable of simulating second-by-second movements of every person and every vehicle through the transportation network of a large metropolitan area. IBM Business Consulting has created a commercial version of TRANSIMS, an integrated suite of products containing an easy-to-use Graphical User Interface for the modelling functions, a GIS-based network editor, 3D data visualization and animation software, and a reporting system. They are releasing the latest version of TRANSIMS to both non-commercial and commercial users. The update includes the TRANSIMS-DOT Modelling Interface, the TRANSIMS-DOT Network Editor, the TRANSIMS-DOT Visualizer, and improvements to the TRANSIMS-LANL Version 3 modules. IBM Business Consulting Services is preparing to assist the transportation planning and research communities in making the transition to this new technology. For information on how to receive TRANSIMS-DOT, please contact
     Naveen Lamba, 

     IBM Business Consulting Services
     Naveen.Lamba@us.ibm.com
     Phone: 703-322-5656
     http://www.transims.net

LANL has in the past offered an academic software license to TRANSIMS for a nominal fee. We intend to provide such a license for both the newest version of TRANSIMS and EpiSims. The Department of Transportation has announced that TRANSIMS will be available in mid-February, 2004 for academic use at a cost of around $1000 US.

More than half a dozen universities, including Texas A&M and Open University, already have such licenses. TRANSIMS has been used successfully by these universities to carry out transportation planning research. In addition, they have developed detailed semester-long course material based on TRANSIMS for training transportation scientists. As an example, Professor Laurence R. Rilett, an associate research engineer with Texas Transportation Institute and the E.B. Snead II Associate Professor of Civil Engineering at Texas A&M University, has been using TRANSIMS for the past five years. See http://itstexas.tamu.edu/presentations/2001annualmeeting/Rilett.pdf for a recent talk by Prof. Rilett on this topic. The TRANSIMS-LANL source code is also available to universities through Los Alamos for US$1000 for teaching, research and development. Contact Mr. Charles Gibson, gibson_charles_e@lanl.gov, in the Technology Transfer office. The computational resources required to develop a model for a city of a million or so people should be available at most universities.

The Federal Highway Administration and DOT have decided to use TRANSIMS as their primary transport planning modelling environment. This is an unprecedented step: the current models in use in Metropolitan Planning Offices (MPOs) are about 30 years old. It provides a perfect opportunity to produce social contact networks for various cities over the next 5-10 years. IBM has been given the initial contract for this process, and more than $100 million has been allocated for the initial deployment in 5-10 cities over the next 2-4 years.

For more information and technical details on TRANSIMS see the bibliography below and proceedings volumes from the Transportation Research Board (TRB) annual conferences, where TRANSIMS has been regularly presented for the last eight years.  These papers describe the entire spectrum of research: from modelling methods to efficient algorithms to studies and statistical analysis of data. The research papers and technical reports provide a very detailed account of methods and calibration processes used in TRANSIMS.  Kai Nagel, who was our colleague at Los Alamos and currently is at ETH Zurich, is teaching a similar course at ETH and is also writing a book based on the work done at Los Alamos and ETH.

Calibration and validation of TRANSIMS have been carried out at various levels. A structural validation of the simulation is done so that the models produce known traffic invariants such as flow density patterns, jam wave propagation, etc. At the level of data matching, TRANSIMS is designed so that each of its modules produces data sets that can be matched to known or collected macroscopic observables. These include: activities and population densities in an entire city, the time-varying number of people occupying various locations, time-varying traffic density split by trip purpose and various modal choices over highways and other major roads, turn counts, number of trips going between zones in a city, etc. The data set used for Portland has been validated at all these levels. TRANSIMS has been used in three 1-2 year case studies, for Albuquerque, Dallas/Ft. Worth, and Portland. Each of these studies guided its development to more sophisticated use cases. Technical reports on the Dallas and Portland case studies are available on LANL's TRANSIMS web site.

TRANSIMS bibliography

Some of the published papers on different aspects of TRANSIMS are listed below. Many more papers, reports and user documentation for TRANSIMS are available on the web site: http://transims.tsasa.lanl.gov/documents.html 

In particular, papers 9-18 set the theoretical and physical basis of the simulation itself. The other papers and publications are concerned with the computational problems of the simulation (such as efficiency of the algorithms, parallelization, etc.).
 

  1. C. Barrett, K. Bisset, R. Jacob, G. Konjevod, M. V. Marathe. Classical and Contemporary Shortest Path Problems in Road Networks: Implementation and Experimental Analysis of the TRANSIMS Router, in Proceedings of the European Symposium on Algorithms, Lecture Notes in Computer Science, Springer, 126-138, 2002.
  2. C. Barrett, R. Jacob, M. V. Marathe. Formal-Language-Constrained Path Problems. SIAM Journal of Computing, 30(3): 809-837, 2000.
  3. R.R. Jacob, M. V. Marathe, K. Nagel. A Computational Study of Routing Algorithms for Realistic Transportation Networks, ACM Journal of Experimental Algorithms, 4, Article No. 6 (1999).
  4. P. Wagner and K. Nagel. Microsimulation and the Physical Basis for TRANSIMS. Microscopic Modelling of Travel Demand: The Home-to-Work Problem, Transportation Research Board Preprint, (1999).
  5. P M Simon and K Nagel, Simple queuing model applied to the city of Portland, International Journal of Modern Physics, 10, 941-960 (1999).
  6. J. Esser and K. Nagel, Census-based travel demand generation for transportation simulations. In Traffic and Mobility: Simulation-Economics-Environment, Eds. W. Brilon, F. Huber M. Schreckenberg, H. Wallentowitz, Springer, (1999), p. 135-148.
  7. K. Nagel, M. Rickert, P. M. Simon and M Pieck. The dynamics of iterated transportation simulations. Presented at the TRIannual Symposium on Transportation ANalysis (TRISTAN-3), San Juan, Puerto Rico, (1998).
  8. M. Rickert and K. Nagel, Issues of simulation-based route assignment, International Symposium on Transportation and Traffic Theory (ISTTT) (1999).
  9. T P Kelly and K Nagel. Relaxation criteria for iterated traffic simulations, International Journal of Modern Physics C9, 113-132 (1998).
  10. P M Simon, K Nagel, Simplified cellular automaton model for city traffic, Physical Review E, 58, 1286-1295 (1998).
  11. K Nagel, D E Wolf, P Wagner, and P Simon, Two-lane traffic rules for cellular automata: A systematic approach, Physical Review E58,  1425-1437 (1998).
  12. K Nagel. Experiences with iterated traffic microsimulations in Dallas, Overview Paper, in Traffic and Granular Flow, (1997).
  13. M Rickert, K Nagel. Experiences with a simplified microsimulation for the Dallas/Fort-Worth area, International Journal of Modern Physics C, 8, 1009 (1997).
  14. K Nagel, C L Barrett. Using microsimulation feedback for trip adaptation for realistic traffic in Dallas, International Journal of Modern Physics C, 8, 505-525 (1997).
  15. P Wagner, K Nagel, D E Wolf, Realistic multi-lane traffic rules for cellular automata, Physica A, 234, 687-698 (1997).
  16. M. Rickert, K. Nagel, M. Schreckenberg and A. Latour. Two lane traffic simulations using cellular automata, Physica A, 231(4) 534-550 (1996).
  17. K. Nagel, Particle hopping models and traffic flow theory, Physical Review E, 53(5), 4655 (1996).
  18. K Nagel and S Rasmussen. Traffic at the edge of chaos, Artificial Life IV, edited by R.A. Brooks and P. Maes, MIT Press Cambridge MA, p. 222-235, (1994).

Social network estimates

An instance of the static people-people contact graph (228 MB) referred to in the paper can be downloaded by clicking here. This is an ASCII text file compressed with gzip. For each vertex, it gives the vertex id and degree, followed by a list of its neighbours' identifiers and the duration of contact with each neighbour in hours. For example, the data for the first two vertices in the file is:
63470	7
30497 3.25028
63471 15.0003
63472 10.5003
119348 4.1667
838564 2.15031
934839 5.31691
1042269 0.166977
63471 2
63470 15.0003
63472 13.3336
Vertex (person) id 63470 has 7 neighbours. Neighbour 30497 had 3.25028 hours of contact with 63470 during the course of a day.
Neighbour 63471 is the second vertex in the file. Note that each link is reproduced twice in this file: once for each endpoint.
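
The record layout described above can be read with a few lines of code. The following is a minimal sketch, not part of the EpiSims or TRANSIMS software: the function name parse_contact_graph and the in-memory sample are illustrative assumptions, and the parser assumes every header line is followed by exactly as many neighbour lines as its degree states.

```python
import io


def parse_contact_graph(lines):
    """Return {vertex_id: {neighbour_id: contact_hours}} from an iterable of text lines."""
    graph = {}
    it = iter(lines)
    for header in it:
        if not header.strip():
            continue  # tolerate blank lines between records
        # Header line: vertex id and degree, separated by whitespace (tab or space).
        vertex, degree = (int(tok) for tok in header.split())
        neighbours = {}
        for _ in range(degree):
            # Neighbour line: neighbour id and hours of contact.
            neighbour, hours = next(it).split()
            neighbours[int(neighbour)] = float(hours)
        graph[vertex] = neighbours
    return graph


# The two records quoted above, used as a self-contained input.
SAMPLE = (
    "63470\t7\n"
    "30497 3.25028\n"
    "63471 15.0003\n"
    "63472 10.5003\n"
    "119348 4.1667\n"
    "838564 2.15031\n"
    "934839 5.31691\n"
    "1042269 0.166977\n"
    "63471 2\n"
    "63470 15.0003\n"
    "63472 13.3336\n"
)

graph = parse_contact_graph(io.StringIO(SAMPLE))
print(len(graph[63470]), graph[63470][30497])
```

For the compressed download, the same function can be fed the lines of a file opened with Python's gzip.open(path, "rt"), where path is wherever the file was saved. Because each link appears once per endpoint, graph[63470][63471] and graph[63471][63470] hold the same duration.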

The full bipartite dynamic social network data can be obtained from LANL by sending an e-mail request to episims-data@lanl.gov accompanied by a one-page, formal proposal for the use of the data. This data contains the following information: for every person (approximately 1.6 million of them), characterized by an integer index, it lists the entrance and exit times during a 24 hour period for every location (characterized by a location integer index) that the person visited during the day (there are a total of approximately 181,000 locations in the Portland data). The data is too large to download via the internet (approximately 12GB), but it will be sent via mail to researchers upon our acceptance of the proposal. This is for a single-day population activity. As more network instances are generated, we intend to make them available through the same mechanism.

The census data that goes into the TRANSIMS simulation is publicly available from the census bureau at www.census.gov. Land use and survey data are typically held by urban planning organisations. The Department of Transportation lists data resources and contact information for the 35 largest US urban areas at http://tmip.fhwa.dot.gov/clearinghouse/docs/landuse/compendium/iurd.stm.

EpiSims

The EpiSims software is not as mature as TRANSIMS. It is research software subject to almost daily change and lacking a user interface. We intend to create a stable, documented version covered by an academic license similar to that for TRANSIMS. In the meantime, we have included in this Supplementary Information details on the modelling technology and the parameters used for our studies sufficient for the motivated reader to reproduce our models.

Generic cities

As we have tried to make clear in this paper, we view the social networks created by TRANSIMS as a single instance of a stochastic process defined in an enormous space of possibilities. Our work on characterizing this instance has been undertaken with the goal of eventually generating, through a generalized random graph process, graphs that are constrained to look like social networks produced by the TRANSIMS data. Preliminary studies show that it is indeed possible to parameterize the process in such a way as to model specific cultural, geographic, or demographic attributes of cities. The study, along with the network data, will be posted on the EpiSims web site in the near future.

Additional analysis of the social network

The following are additional measurements based on the social network data.

Clustering coefficients by degree


Figure 1. Clustering coefficients by degree for a) the people-contact graph and b) the locations network (after discarding the direction of edges in the latter).
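
As an illustration of the quantity plotted in Figure 1, the sketch below averages a clustering coefficient over all vertices of the same degree. It assumes the standard local clustering coefficient (the fraction of pairs of a vertex's neighbours that are themselves linked); the source does not state which definition was used, and the function name and toy graph are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def clustering_by_degree(adj):
    """Average local clustering coefficient per degree.

    adj maps each vertex to the set of its neighbours in an undirected graph.
    Vertices of degree 0 or 1 are skipped because the coefficient is undefined.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for vertex, nbrs in adj.items():
        k = len(nbrs)
        if k < 2:
            continue
        # Count the links that exist among the neighbours of this vertex.
        links = sum(1 for a, b in combinations(nbrs, 2) if b in adj.get(a, ()))
        sums[k] += 2.0 * links / (k * (k - 1))
        counts[k] += 1
    return {k: sums[k] / counts[k] for k in sums}


# Toy graph: a triangle 1-2-3 with an extra vertex 4 attached to vertex 3.
toy = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}
by_degree = clustering_by_degree(toy)
```

In the toy graph, the two degree-2 vertices have coefficient 1 and the degree-3 vertex has coefficient 1/3.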



Occupancies by activity type and time of day


Figure 2. Each location in EpiSims is associated with several activities. We consider the primary activity for each location (namely, the activity done by the majority of people visiting that location) and observe the number of people with this activity at the location (i.e., the temporal degree) as time progresses. The figures above show the temporal degrees of a few locations with different activity types. Each plot shows the temporal degree of 5 randomly chosen locations. The horizontal axis shows time over a 24 hour period, with 0 being approximately 11:00 p.m. The primary activity is indicated in the figure. The vertical axis represents the number of individuals at that location as the day progresses. The temporal degrees match our expectations. For example, figure a) shows 5 randomly chosen location blocks labelled homes: the degree decreases considerably during mid-day when people leave for work, then goes back up when they start returning home (some people work night shifts and do not return home, leading to a small difference compared to the early morning hours). The locations where the primary activity is work fill up with people during mid-day and empty again by night when they leave; see figure b). The small dips around noon correspond to people leaving workplaces for lunch. Similarly, for schools, figure e), and colleges, figure f), there is a mid-day fill-up, with the difference that in colleges the evening classes create a second peak. All these data are the result of the people mobility generated by the TRANSIMS simulation engine; they are not inputs to the simulation. The fact that these curves follow our expectations is a validation of the simulation.
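
The temporal degree described in the caption can be computed directly from entrance and exit times. The sketch below assumes a hypothetical record layout of (person, location, entry hour, exit hour) tuples; the actual layout of the LANL data may differ, and the toy visits are invented for illustration.

```python
def temporal_degree(visits, location, times):
    """Number of people present at `location` at each time in `times`.

    visits is an iterable of (person, location, t_in, t_out) tuples with times
    in hours; a person counts as present when t_in <= t < t_out.
    """
    return [
        sum(1 for _, loc, t_in, t_out in visits if loc == location and t_in <= t < t_out)
        for t in times
    ]


HOME, WORK = 10, 20  # hypothetical location indices
visits = [
    (1, HOME, 0.0, 8.0), (1, WORK, 9.0, 17.0), (1, HOME, 18.0, 24.0),
    (2, HOME, 0.0, 7.5), (2, WORK, 8.5, 12.0), (2, WORK, 13.0, 17.5),
    (2, HOME, 18.5, 24.0),
]
sample_times = [6.0, 10.0, 12.5, 20.0]
home_curve = temporal_degree(visits, HOME, sample_times)
work_curve = temporal_degree(visits, WORK, sample_times)
```

Here person 2 leaves work between hours 12 and 13, so the work curve dips at 12.5, mirroring the lunch-time dips noted in the caption.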

Mixing rates by age


CAPTION: Age distribution of the population

 

Figure 3. Demographics are likely to play a very important role in determining the efficacy of any strategies for disease control. We show here mixing rates for one important demographic attribute, age. Figures a) - o) show the average number of contacts that a particular age group has with the rest of the population. Each graph corresponds to a specific age group (say A). The horizontal axis covers all the age groups. For a particular age group (say B) on the horizontal axis, the corresponding value on the vertical axis is the average number of contacts that group A makes with group B. The particular age group (A) for which results are plotted is stated above the plot. The empirical observations agree quite well with our intuition. They also show the degree of mixing among parts of the population that belong to various age groups. From figures a) - g) it follows that youngsters typically meet other youngsters (siblings, schools, etc.). Figures i) - o) show that adults have most of their contacts with other adults (work places). The very high peaks in f), g), and i) show that teenagers have most of their contacts with other teenagers of the same age (16-18) and mix little with others. One reason for studying these distributions is to see whether it is possible to design targeted vaccination strategies that are derived from the demographic attributes of a population. Mixing properties among various age groups help specialize the targeting methodology.
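
The mixing rate defined in the caption (the average number of contacts that a member of group A has with members of group B) can be sketched as follows. The function name, the group labels, and the toy data are assumptions made for illustration only.

```python
from collections import Counter, defaultdict


def mixing_rates(adj, group_of):
    """Average number of contacts a member of group A has with group B.

    adj maps each person to the set of people they contact; group_of maps each
    person to a group label. Returns {(A, B): average contacts per member of A}.
    """
    group_size = Counter(group_of.values())
    totals = defaultdict(int)
    for person, contacts in adj.items():
        for other in contacts:
            totals[(group_of[person], group_of[other])] += 1
    return {pair: n / group_size[pair[0]] for pair, n in totals.items()}


# Toy population: two teenagers who meet each other, one of whom also meets an adult.
adj = {1: {2, 3}, 2: {1}, 3: {1}}
group_of = {1: "16-18", 2: "16-18", 3: "adult"}
rates = mixing_rates(adj, group_of)
```

Note that the measure is not symmetric: in the toy data a teenager has on average 0.5 contacts with adults, while the single adult has 1 contact with teenagers.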

Local views of the social network


Figure 4. These figures display different representations of the same small part of the person-person contact graph, rendered using the Tulip graph drawing program. Vertices represent individual people, and edges are present if the two people came into contact for at least an hour during the day. Vertices are coloured by their distance in the graph from one of a few individuals. Vertex colours are the same in each panel. Edges are coloured corresponding to the colour of vertices at either end.

EpiSims Modelling Technology

EpiSims is designed to simulate many different diseases. Hence we have developed a disease model capable of representing the characteristics of many diseases by appropriate parameter selection. Here we describe the general within-host progression and between-host transmission models used for the study results reported below, together with the actual parameter settings used in the smallpox and plague models. In addition, we describe a planned revision of the models to address limitations of the current models.

Disease Model

Within Host
Each individual in the simulation who is exposed (either through exposure to an initial release or through contact with an infected and infectious person) will progress through a series of disease stages. An exposed individual will either become infected or not with a probability based upon the disease model and the individual's demographics. Individuals who become infected either develop a clinical case of the disease or not. For instance, some fraction of those infected with smallpox never develop a fever or symptoms of the disease. As above, the probability of developing clinical symptoms depends upon victim characteristics.

Those individuals who develop clinical cases may develop different variants of the disease with correspondingly different disease courses and different epidemiological implications. Five different manifestations of smallpox (infection with variola major) are identified in the literature: modified, ordinary, flat, early hemorrhagic, and late hemorrhagic smallpox.

EpiSims uses a single parameter, the disease load, to represent the effect of a disease upon a host. The load in EpiSims is intended to be analogous to viral titre in a throat swab, number of spores or bacteria present, concentration of toxin, etc. However, it need not reproduce such clinical aspects of these loads as distribution throughout the body. It is merely a parameter that is used to determine whether a person is infected, symptomatic, too sick for normal activities, infectious, or dead. The higher the disease load, the sicker the person.

An isolated contaminated person or location's load grows or shrinks at predetermined rates. All locations share a single common exponential growth or decay rate, specified by the user in the disease description. For example, the amount of virus present in the environment would decay exponentially if there were no sources (infected people); the amount of bacteria might grow exponentially; while the number of spores would remain fixed.

Each person has his or her own schedule determining the growth and decay of the disease. The schedule provides for piecewise exponential change in the load (or, put another way, piecewise linear growth of the logarithm of the load). The growth schedule is parameterized by an arbitrary number of time delays (t0, t1, t2, ...) and growth rates (γ0, γ1, γ2, ...). Any of the growth rates can be zero or negative if desired. We use this growth schedule to represent many different effects, including:

The final component of the within-host disease model is a set of threshold values for determining the effect of the load on an individual. The thresholds we use determine whether an individual is: infected (LI), symptomatic (LS), staying home (LHS), infectious (LC), or dead (LD).

It is important to note that EpiSims does not assume any ordering of these values, except that LD is the largest and LI the smallest. Thus the relative value of the "staying home" and infectious thresholds can dramatically alter the course of the simulation, as it should. On the other hand, the thresholds in this model are global values, so there is no individual variation in disease related behaviour.
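
A piecewise exponential growth schedule of this kind can be sketched as below. This is an illustration under stated assumptions, not the EpiSims implementation: it assumes that growth rate i applies for the duration of time delay i, that rates are expressed in log10(load) units per hour, and that the load stays constant once the schedule is exhausted. The schedule values are invented.

```python
def log10_load(t, initial_log10, delays, rates):
    """Logarithm (base 10) of the load after t hours.

    delays[i] is the length in hours of segment i and rates[i] is the growth
    rate of log10(load) during that segment (assumed interpretation). The log
    of the load is piecewise linear, so the load is piecewise exponential.
    """
    log_load = initial_log10
    remaining = t
    for duration, rate in zip(delays, rates):
        step = min(remaining, duration)
        log_load += rate * step
        remaining -= step
        if remaining <= 0:
            break
    return log_load


# Hypothetical schedule: 24 h latent (no growth), 48 h growth, 24 h decline.
delays = [24.0, 48.0, 24.0]
rates = [0.0, 0.1, -0.05]
trajectory = [log10_load(t, 0.5, delays, rates) for t in (24.0, 48.0, 72.0, 96.0, 200.0)]
```

With these values the log-load stays at 0.5 for a day, climbs to 5.3 over the next two days, and then declines to 4.1, where it remains.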

The course of an individual case of disease in EpiSims runs like this:

Transmission model

Environment-mediated transmission
In EpiSims, an infectious person contaminates his or her environment, in a process analogous to sneezing or coughing. The contamination may be restricted to a small region near the infected person, and/or it may spread to an entire EpiSims location, which is roughly the size of an apartment building, office building, or shopping mall. Transmission occurs as uninfected people absorb virus (or bacteria, spores, etc.) from a contaminated location.

Geographical locations, as well as people, have a disease load associated with them, representing the level of contamination of the location. Disease load in a location has an associated exponential growth rate, which may be positive, negative, or zero. This allows EpiSims to model non-infectious diseases, transmission of disease between people who are never in direct contact, or diseases with non-human vectors. The simulation can be initialized by contaminating a specific location at a specific time and/or by assigning a non-zero load to one or more people.

Disease transmission among people is accomplished by contaminating the environment they share. There are two steps:
  1. infectious people contaminate their local environment by shedding load.
  2. people present at a contaminated location absorb load from it.

There are two corresponding parameters controlling the interaction of each person with his or her local environment:
  1. shedding rate, βS, the fraction of the individual's load that is shed to the environment per hour.
  2. absorption rate, βA, the fraction of the environment's load that is absorbed by an individual per hour.
These parameters are specific to the individual. In this way EpiSims can model some behavioural differences that affect transmission rates. For example, children can be given higher absorption and shedding rates than adults if the available data so indicate.
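
The two-step exchange can be sketched for a single location and a single time step as follows. This is a simplified illustration, not the EpiSims code: it assumes that shedding does not reduce the shedder's own load, that people absorb one after another from whatever load remains, and that people at or above the infection threshold do not absorb. All names and numeric values are hypothetical.

```python
def exchange_step(loads, env_load, shed_rate, absorb_rate,
                  contagious_at, infected_at, dt=1.0):
    """One time step of environment-mediated transmission at one location.

    loads maps person -> load; shed_rate and absorb_rate map person -> fraction
    per hour; dt is the step length in hours. Returns (new_loads, new_env_load).
    """
    new_loads = dict(loads)
    # Step 1: infectious people contaminate the environment by shedding load.
    for person, load in loads.items():
        if load >= contagious_at:
            env_load += shed_rate[person] * load * dt
    # Step 2: people who are not yet infected absorb load, one after another;
    # absorbed load is removed from the environment.
    for person in new_loads:
        if new_loads[person] < infected_at:
            taken = absorb_rate[person] * env_load * dt
            new_loads[person] += taken
            env_load -= taken
    return new_loads, env_load


loads = {"a": 1.0e6, "b": 0.0}          # "a" is infectious, "b" is susceptible
shed = {"a": 1.0e-4, "b": 1.0e-4}
absorb = {"a": 0.05, "b": 0.05}
after, env = exchange_step(loads, 0.0, shed, absorb,
                           contagious_at=1.0e6, infected_at=10 ** 0.5)
```

In this example person "a" sheds about 100 units of load into an initially clean environment, and person "b" absorbs 5% of it in the same hour, which exceeds the infection threshold used here.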

The shedding and absorption parameters can be set from an estimate of how long a person must be in close contact with an infectious person before becoming infected. The product of the shedding rate and the threshold for contagion (βS LC) gives the minimum amount of load transferred from the infectious person to the environment per hour. The product of the absorption rate with the shedding rate and the threshold load for contagion (βA βS LC) gives the overall rate of absorption by the exposed person. The minimum time to become infected is the threshold for infection divided by the overall absorption rate, or min(TI) = LI / (βA βS LC). If more detailed information is available, for example the probability of infection as a function of time spent in close contact with an infectious person, it can be incorporated into the model.
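
As a worked example of this relation, the sketch below solves it for the absorption rate that yields a chosen time to infection. It uses the thresholds and shed rates tabulated later in this document (log10 LI = 0.5, log10 LC = 6.0, shed rates 0.0001 and 0.001); the resulting absorption rate is an illustration of the arithmetic, not a value reported by the authors, and the function names are hypothetical.

```python
def min_time_to_infect(l_infect, l_contagious, beta_s, beta_a):
    """Threshold for infection divided by the overall absorption rate (hours)."""
    return l_infect / (beta_a * beta_s * l_contagious)


def absorption_rate_for(target_hours, l_infect, l_contagious, beta_s):
    """Absorption rate that makes the time to infection equal target_hours."""
    return l_infect / (target_hours * beta_s * l_contagious)


L_I = 10 ** 0.5   # infection threshold, from log10(LI) = 0.5
L_C = 10 ** 6.0   # contagion threshold, from log10(LC) = 6.0

# Absorption rate giving infection after 1 hour of contact with a minimally
# infectious ordinary-smallpox case (shed rate 0.0001 per hour).
beta_a = absorption_rate_for(1.0, L_I, L_C, beta_s=1.0e-4)

# With the same absorption rate, the higher hemorrhagic shed rate (0.001 per
# hour) shortens the time to infection tenfold.
t_hemorrhagic = min_time_to_infect(L_I, L_C, beta_s=1.0e-3, beta_a=beta_a)
```

Under these assumptions the absorption rate comes out at roughly 0.032 per hour, and contact with a minimally infectious hemorrhagic case would infect in about 0.1 hours.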

The scaling of transmission probability with number of susceptibles is not well understood. Clearly, a single infectious person is unlikely to infect everyone in a large stadium, although she or he might infect everyone in a household. Yet the same person might infect everyone in a conference room. The mechanistic transmission model we have outlined here has the property that an infectious person infects roughly the same absolute number of people per unit  time, regardless of the number of susceptibles present. This is because load is absorbed from the environment in a round robin fashion, and it is removed from the environment by absorption. Once a person has absorbed enough load to be infected, that person no longer absorbs load. This models diseases for which the primary caregiver is most likely to be infected and people who are not in close contact are unlikely to be infected.

The transmission model in EpiSims can reflect both the distance between a susceptible individual and an infectious person and the duration of the exposure. If the interaction is of sufficient duration given the distance at which it occurred, the susceptible individual is assumed to have been exposed.

Because of the limitations of the process that produces our social network estimates, we do not have data for proximity of people, other than that they are in the same (possibly very large) location. We do have good estimates of the duration of contact. It seems as though the dependence on distance is very coarse: one mode of transmission occurs for close ranges (< 6 feet) and another for larger distances. We have developed an ad hoc model that takes advantage of this coarseness. A large location is split into small sublocations. An infectious person transmits disease to everyone in the same sublocation at a much higher rate than to others in different sublocations of the same location.

Basic reproduction number

Traditionally, epidemiology has focused on the basic reproduction number, a measure of how many people would be directly infected by a single infectious individual inserted into a population of susceptibles.

The basic reproductive number is a convolution of several quantities evaluated at the introduction of the first infectious person:
Indeed, it is possible to define a spectrum of reproductive numbers related to the characteristic directions in the space of demographics along which disease spreads. In addition, the same convolution of quantities taken at other times will in general yield a time varying reproductive number. The reproductive number is not an input parameter for EpiSims. Instead we suggest that the demographic-dependent person-person transmission rates are the most easily observable invariant quantities, and allow the simulated movement of people to produce the time dependent mixing rates. For diseases with fairly lengthy incubation periods, and thus clearly distinct waves of infection, a value for the basic reproductive number can be estimated from EpiSims output as the ratio of the size of the first two cohorts infected in the simulation.
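
The cohort-ratio estimate mentioned above can be sketched as follows. The sketch assumes that waves of infection are cleanly separated and can be delimited by a fixed window length, which is a simplification introduced here; the function name, the window length, and the infection times are hypothetical.

```python
def estimate_r0(infection_days, wave_length):
    """Ratio of the size of the second cohort to the size of the first.

    Cohort 0 holds infections in [0, wave_length) days and cohort 1 holds
    infections in [wave_length, 2 * wave_length) days.
    """
    first = sum(1 for t in infection_days if 0 <= t < wave_length)
    second = sum(1 for t in infection_days if wave_length <= t < 2 * wave_length)
    if first == 0:
        raise ValueError("no infections in the first cohort")
    return second / first


# Hypothetical output: 4 people infected at the release, 12 infected in the next wave.
infection_days = [0, 0, 0, 0, 13, 14, 14, 15, 15, 16, 16, 17, 18, 18, 19, 20]
r0 = estimate_r0(infection_days, wave_length=12)
```

For the invented data above the estimate is 12 / 4 = 3.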


Smallpox Modelling Results

We have chosen disease parameters for this study based on data available in the open literature, particularly Fenner et al. (Smallpox and its eradication, F. Fenner, D.A. Henderson, I. Arita, Z. Jezek, I.D. Ladnyi, World Health Organization, 1988, also available online), and on private communications with many experts in smallpox modelling. There are, of course, differences of opinion on the correct value of many of these parameters. The ranges of values we have chosen represent our understanding of the best estimates available at the time the studies were undertaken. Our intention with this model is to demonstrate the range of possibilities available in individual based models, and to conduct sensitivity studies for certain parameters.

Parameters for smallpox

It is not sufficiently precise to say an individual has smallpox, because each manifestation has a distinct disease course and poses a different set of issues for response. For example, the flat and hemorrhagic forms disproportionately affect pregnant and immuno-suppressed individuals. Patients with these forms have shorter prodromal periods, are more highly infectious, and will present with different symptoms than ordinary smallpox patients. The correlation between these differences is particularly unfortunate -- if enough people are exposed to the virus, it is likely that individuals with these short incubation time forms of smallpox will be the first to show up in the health care system. They will be haemorrhaging from all body orifices and progressively under every inch of their skin and they will die at a much higher rate than those with ordinary smallpox. Because their symptoms are not as obvious as the classic pox and there will not yet have been any diagnoses of ordinary smallpox, they are likely to be misdiagnosed at first. Misdiagnosis is particularly dangerous in these cases, as they are highly infectious almost as soon as they start presenting symptoms (much sooner than ordinary smallpox patients). Our model for smallpox attempts to capture some of these correlations and provide for the possibility that a subpopulation such as immuno-compromised people might serve as a backbone for disease transmission.

The transmission model quantifies the types of interaction necessary for a susceptible individual to be classified as exposed to a disease and potentially infected with it. In this context we define exposure to mean that the individual has received enough smallpox virions or viral particles that he or she could develop smallpox. Although the threshold is commonly taken to be a single virion for smallpox, not everyone who is in proximity to an infectious person will receive this dose. For example, CDC guidelines recommend prioritizing for vaccination people who have had "face-to-face close contacts ( ~6.5 feet or 2 meters) or household contacts to smallpox patients after the onset of the smallpox patient's fever." (CDC Interim Smallpox Response Plan and Guidelines, Draft 2.0 - 11/21/01) They also distinguish among non-household contacts with fewer than 1 hour of contact, 1-3 hours of contact, and more than 3 hours of contact. Thus, we chose to set transmission parameters so that 1 hour of contact was often sufficient to infect, and more than 3 hours of contact almost always sufficient.

Several variables affect the relationship between exposure and the duration and distance characterizing the interaction between the infectious person and susceptible individual. For example, it is well known that smallpox victims tend to be most infectious during the first few days of rash, with contagion diminishing as the disease progresses. Different manifestations of smallpox (e.g. modified, ordinary, flat, early hemorrhagic, or late hemorrhagic smallpox) also imply different levels of transmissibility. Consequently, a smallpox transmission model should take into account the particular disease manifestation present in the infectious person.


Table 1. Threshold values of log10(load) used for the experiments reported below. See text for explanation of their use.

threshold    meaning               value
log(LI)      infected              0.5
log(LS)      symptomatic           5.6
log(LHS)     stay home (early)     5.9
             stay home (late)      6.2
             stay home (never)     7.9
log(LC)      infectious            6.0
log(LD)      dead                  8.0
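
To make the use of these thresholds concrete, the sketch below maps a log10(load) value to a set of status flags using the Table 1 values. It assumes that a state applies when the load is at or above its threshold, and it treats the early, late, and never stay-home levels as alternatives selected per call; how EpiSims chooses among them is not specified here.

```python
# Values of log10(load) from Table 1.
THRESHOLDS = {"infected": 0.5, "symptomatic": 5.6, "infectious": 6.0, "dead": 8.0}
STAY_HOME = {"early": 5.9, "late": 6.2, "never": 7.9}


def status(log10_load, stay_home_kind="late"):
    """Return a dict of boolean flags for a given log10(load)."""
    flags = {name: log10_load >= level for name, level in THRESHOLDS.items()}
    flags["stays_home"] = log10_load >= STAY_HOME[stay_home_kind]
    return flags


early = status(6.1, "early")
late = status(6.1, "late")
```

At log10(load) = 6.1 a person is infectious in both cases, but only stays home under the early threshold. This is the kind of ordering effect between the staying-home and infectious thresholds noted in the description of the disease model.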

We have chosen to represent two variant forms of the disease, both caused by variola major: ordinary smallpox and hemorrhagic smallpox, which is both more infectious and more lethal. Tables 2 and 3 show parameter values for the two variants. Note that the variant manifested by each person is determined before the simulation is run. We assign 2.4% of the general population hemorrhagic parameters, as well as 21% of pregnant women, and 10% of HIV cases (which are drawn uniformly from the population independent of demographics at a rate of 0.14%).
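
The pre-simulation assignment of variants can be sketched as a per-person random draw. The rates are those quoted above; how overlapping categories are combined (for example, a pregnant woman who is also HIV positive) is not stated, so taking the highest applicable rate is an assumption made here, and the function names are hypothetical.

```python
import random


def hemorrhagic_probability(pregnant, hiv_positive):
    """Probability of the hemorrhagic variant (highest applicable rate, an assumption)."""
    p = 0.024                      # general population
    if hiv_positive:
        p = max(p, 0.10)           # HIV cases
    if pregnant:
        p = max(p, 0.21)           # pregnant women
    return p


def assign_variant(pregnant, hiv_positive, rng):
    """Draw 'hemorrhagic' or 'ordinary' for one person before the simulation runs."""
    if rng.random() < hemorrhagic_probability(pregnant, hiv_positive):
        return "hemorrhagic"
    return "ordinary"


rng = random.Random(0)
draws = [assign_variant(False, False, rng) for _ in range(100000)]
general_fraction = draws.count("hemorrhagic") / len(draws)
```

Over many draws the fraction of hemorrhagic assignments in the general population approaches 0.024.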

The growth rates and time delays for ordinary smallpox are chosen so that the following hold:
The parameters for the hemorrhagic variant are adjusted so that all victims die. 30% of the deaths occur between 12 hours and 36 hours after the end of the incubation period (modelling an early hemorrhagic variant); the rest are 5-7 days after the incubation period. Post exposure vaccination has no effect on people who develop a hemorrhagic variant.


Table 2. Variations in disease manifestation

disease parameter           value                                    meaning
incubation period           N(12,2), cutoff at 7 and 17 days         truncated normal distribution; time between infection and appearance of fever
prodromal period            10%: 3 days; 80%: 4 days; 10%: 5 days    length of symptomatic, non-infectious period (fever, rash)
pregnancy related           0.0125                                   fraction of females between 15 and 45 who are pregnant
                            0.21                                     fraction of those pregnant who will get a hemorrhagic variant
                            0.45                                     fraction of non-hemorrhagic pregnant who will die
HIV related                 2300                                     number of people in population who are HIV positive
                            0.10                                     fraction of HIV+ who will get a hemorrhagic variant
hemorrhagic rate            0.024                                    fraction of general population who will get hemorrhagic variant
death rate                  0.30                                     fraction of non-hemorrhagic, non-pregnant who will die if infected
environmental decay rate    $10^{-6}$/hour                           rate of load decay outside host

       
Table 3. Parameters affecting disease transmission. Absorption rates are set so that a minimally infectious person infects a susceptible after 1 hour of close contact 95% of the time, and after anywhere from 3 minutes to 18 hours 5% of the time.

transmission parameter             value
shed rate: normal smallpox         0.0001
shed rate: hemorrhagic variant     0.001
time to infect (95%)               1 hour
time to infect (5%)                3 min - 18 hours
infectious period (10%)            3 days
infectious period (80%)            4 days
infectious period (10%)            5 days


Variola is assumed to be inactivated very quickly in the environment, so that direct contact is required for transmission. Its decay rate outside a host is taken to leave only one virion in a million viable after an hour.
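
As a worked illustration, if one assumes that load outside a host decays exponentially (an assumption made here for the arithmetic, not a statement about how EpiSims implements decay), the one-in-a-million survival after an hour fixes the decay constant. The symbols $\lambda$ and $t_{1/2}$ are introduced only for this illustration.

```latex
\begin{align*}
L(t) &= L(0)\,e^{-\lambda t}, \qquad \frac{L(1\,\text{h})}{L(0)} = 10^{-6} \\
\lambda &= 6 \ln 10 \approx 13.8\ \text{per hour} \\
t_{1/2} &= \frac{\ln 2}{\lambda} \approx 0.050\ \text{h} \approx 3\ \text{minutes}
\end{align*}
```

Under this assumption, environmental load halves roughly every three minutes, which is consistent with treating direct contact as necessary for transmission.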

As discussed above, EpiSims requires shedding and absorption rates for each person to accomplish disease transmission. For the disease parameter values given in the tables above, shedding rates of $\beta_S = 10^{-4}$ per hour for people with the normal variety, and 10 times higher for the hemorrhagic variety, which involves more coughing, are reasonable. The absorption rate can now be determined using estimated transmission probabilities as a function of time. For example, assume that 1 hour of close contact with an infectious person leads to infection in 95% of the people. The remaining 5% will become infected in durations spread uniformly from 3 minutes to 18 hours, as shown in Table 3.
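
The assumed distribution of contact time needed for infection can be sketched as follows. This is a minimal illustration of one reading of the 95%/5% split, not EpiSims code: the function name is hypothetical, and the sketch does not derive absorption rates from the sampled durations.

```python
import random


def sample_time_to_infect_hours(rng):
    """Draw the close-contact duration (in hours) needed to infect one susceptible.

    Assumed reading of Table 3: 95% of susceptibles are infected after
    exactly 1 hour; the remaining 5% need a duration drawn uniformly
    between 3 minutes and 18 hours.
    """
    if rng.random() < 0.95:
        return 1.0
    return rng.uniform(3.0 / 60.0, 18.0)


# Usage: draw durations for a small group of susceptibles.
rng = random.Random(2004)
durations = [sample_time_to_infect_hours(rng) for _ in range(5)]
```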

There are several possibilities for choosing whom to infect. The load could be spread among everyone present, concentrated onto one victim, or anything in between. For the studies reported in this paper, environmental load was distributed among all present, though it had no effect on those who were immune.

Although not directly parameterized, EpiSims can control the infectivity ratio defined as the quotient of peak infectivity at the beginning of the infectious period to minimum infectivity, given by the threshold for contagion, $L_C$. In the experiments reported here, the ratio is 10.

As noted above, R0, the basic reproductive number, is not a parameter of this model. R0 depends on the social network, disease transmission parameters, and the identities of the first infected people. We can estimate a value from the results of each simulation run as the ratio of the number of people in the second cohort of infected people to the number in the first cohort. With the parameters listed herein, varying the initially infected people and the viral load that causes people to stay home sick, estimated values for R0 vary from 1.0 to 3.4.
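
The cohort-ratio estimate described above can be sketched as follows. The record format (a mapping from each infected person to the person who infected them) and the function name are assumptions for illustration; the first cohort is taken to be the initially infected people and the second cohort those infected directly by them.

```python
def estimate_r0(infector_of):
    """Estimate R0 as (size of second cohort) / (size of first cohort).

    infector_of maps each infected person's id to the id of the person
    who infected them, or to None for an initially infected person.
    """
    first_cohort = {p for p, src in infector_of.items() if src is None}
    if not first_cohort:
        raise ValueError("no initially infected people in the records")
    second_cohort = {p for p, src in infector_of.items() if src in first_cohort}
    return len(second_cohort) / len(first_cohort)


# Usage: two index cases infect three people between them.
records = {1: None, 2: None, 3: 1, 4: 1, 5: 2, 6: 3}
r0 = estimate_r0(records)  # 3 / 2 = 1.5
```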

Simulated Behaviour

One of the most important assumptions in any smallpox model is whether infectious people are mixing normally in the population. There are two relevant questions: can infectious people be diagnosed by the casual observer (i.e., are people infectious before the appearance of characteristic pox?); and are people incapacitated before they become infectious? This study was designed to examine the sensitivity of results to variation in this assumption. We undertook to model two (probably unrealistic) extreme cases: one in which no one who is infectious is mixing with the general population and another in which no one's behaviour is affected at all by the disease. In addition, we modelled one more realistic case between these two extremes. We have not explored more of this variability because we would first like to refine the overall model.

Table 1 shows parameter values for disease thresholds. We have examined three different thresholds for staying home sick, or self-isolation. The effect of the values we have chosen is as follows:

Simulated Response Protocol

Every person who becomes symptomatic in the simulation is placed on a list. Every simulated day, if contact tracing is in effect, a subset of the people on the list is chosen for contact tracing. The subset contains the minimum of an absolute number of people or a fraction of the people on the list. In the experiments reported here, we use the fraction 0.8 and set the absolute threshold at either 10,000 or 1,000. These are probably unrealistically large numbers, but they allow us to estimate the best case results of a targeted vaccination strategy. Note that because vaccination is done on a household basis, the actual number of people vaccinated per day can be several times as large as the absolute threshold.
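
The daily selection rule can be sketched as follows. The text specifies only the size of the subset; choosing its members uniformly at random, and truncating the fractional quota to an integer, are assumptions of this sketch, as is the function name.

```python
import random


def choose_for_tracing(symptomatic, fraction, absolute_cap, rng):
    """Pick today's subset of the symptomatic list for contact tracing.

    The subset size is the minimum of an absolute number of people and
    a fraction of the people currently on the list.
    """
    quota = min(absolute_cap, int(fraction * len(symptomatic)), len(symptomatic))
    # Assumption: members are chosen uniformly at random without replacement.
    return rng.sample(symptomatic, quota)


# Usage with the values from the experiments: fraction 0.8, cap 1,000.
todays_list = list(range(5000))
traced = choose_for_tracing(todays_list, 0.8, 1000, random.Random(1))
```

With 5,000 people on the list, the fraction would allow 4,000 but the absolute threshold of 1,000 is the binding limit.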

Table 4. The activity dependent fraction of contacts of symptomatic individuals who can be traced. For example, no contacts made while shopping will be identified, but all household contacts will be identified.

activity             fraction traced
household            1.0
office visit         1.0
work                 0.7
school               0.7
college              0.7
social recreation    0.1
shopping or other    0.0


Table 5. Effects of post exposure vaccination: the number of days after exposure within which a vaccination will still prevent death (the "head start"). In the experiments here, everyone age 30 or over is assumed to have been vaccinated. Post exposure vaccination is assumed to have no effect on immuno-compromised individuals.

vaccination status       head start
unvaccinated             2 days
previously vaccinated    4 days


Table 6. Fraction of people who comply with a quarantine order.

state of health    compliance rate
asymptomatic       0.70
symptomatic        0.85



We consider the effects of delay in response, assuming the response is instituted 14, 17, or 20 days after the attack. Mass vaccination is assumed to require four days.

Social network

The activity data currently available from TRANSIMS resolves locations at which activities are performed down to roughly four per city block. Using the social network derived from this level of resolution creates unrealistically high contact rates. Instead EpiSims breaks each location into several cells. Individuals are assigned to one of the cells containing other people performing similar activities at that location. The maximum number of people allowed in a cell varies by activity type, as shown in Table 7. The values chosen below are nothing more than reasonable guesses. The larger these values, the more opportunity for mixing and transmission.

Table 7. Maximum occupancy of each cell in a location where the indicated activity is conducted.

activity                 cell occupancy
work                     50
shopping                 50
college                  40
social recreation        30
school                   25
office visit or other    10
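
The subdivision of a location into cells can be sketched as follows. The occupancy limits are those of Table 7; the assignment rule itself (filling cells in order until each reaches its limit) and the function name are assumptions for illustration, since the text does not specify how individuals are distributed among cells.

```python
# Maximum cell occupancy by activity type, from Table 7.
MAX_CELL_OCCUPANCY = {
    "work": 50,
    "shopping": 50,
    "college": 40,
    "social recreation": 30,
    "school": 25,
    "office visit or other": 10,
}


def assign_cells(people, activity):
    """Split the people doing one activity at one location into cells.

    Assumption: cells are filled in order, each up to the maximum
    occupancy for the activity type.
    """
    cap = MAX_CELL_OCCUPANCY[activity]
    return [people[i:i + cap] for i in range(0, len(people), cap)]


# Usage: 60 pupils at one school location occupy three cells.
cells = assign_cells(list(range(60)), "school")
sizes = [len(c) for c in cells]  # [25, 25, 10]
```

Only people in the same cell are treated as contacts, so larger limits mean more opportunity for mixing.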


1) Evolution of smallpox epidemics

Figure 5. Above are four frames of a movie showing a side-by-side comparison of a baseline case (on the left) with a targeted vaccination and quarantine strategy, in which symptomatic people are interviewed, most of their recent contacts identified, and the contacts and their households are vaccinated and sent to quarantine. The full movie below shows 6 frames per day at 4 hour intervals for 70 days. The bars represent number infected at each location, and the colour represents the fraction of infected people who are infectious. The attack is presumed to be covert, so until people become symptomatic around day 10 they continue with their normal activities (and there is no difference between the two movies). The attack site, a university, is readily visible as a very large spike in the downtown area which grows and decays every day as students gather on campus and disperse throughout the city. Above the views of Portland are strip charts displaying the cumulative number of people infected and dead as a function of time, and for the targeted response, the number vaccinated and quarantined, which are important for determining feasibility of the response. These are not labelled, but the colour coding corresponds to the labelled text below the charts. Note that the scale on the leftmost chart is a factor of one hundred different from the rightmost.

Smallpox propagation movie:

This link points to a 250 MB, 1280x720 pixel AVI file. If your browser is not set up to play movies, try opening the AVI file directly with Windows Media Player or QuickTime Player.

Note on the degree distribution of a Random Geometric Graph

A random geometric graph (RGG) is obtained by randomly distributing points in the unit square and connecting with edges only those points which are within a distance R of each other. The existence of a peak around the mean degree and an exponentially decaying tail for large degrees comes from the fact that physical proximity has a typical cost to it, requiring time and space, and a person simply cannot be in close proximity to 1.6 million people during the day (ref. 14). In spirit, an RGG is similar: the distance R represents a cost beyond which edges do not exist (one can relax this by introducing a distribution around R).
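
The construction can be sketched in a few lines. This is a minimal illustration, not the code used for the paper; the function names, the number of points, and the radius are arbitrary choices. Ignoring boundary effects, each of the other N - 1 points falls within distance R with probability πR², so the degree distribution is approximately Poisson, peaked near a mean of (N - 1)πR².

```python
import math
import random


def rgg_degrees(n, radius, rng):
    """Return the degree of each node in a random geometric graph.

    n points are placed uniformly at random in the unit square, and two
    points are joined by an edge when they lie within `radius` of each other.
    """
    points = [(rng.random(), rng.random()) for _ in range(n)]
    degrees = [0] * n
    r2 = radius * radius
    for i in range(n):
        xi, yi = points[i]
        for j in range(i + 1, n):
            dx = xi - points[j][0]
            dy = yi - points[j][1]
            if dx * dx + dy * dy <= r2:
                degrees[i] += 1
                degrees[j] += 1
    return degrees


def degree_histogram(degrees):
    """Count how many nodes have each degree."""
    hist = {}
    for d in degrees:
        hist[d] = hist.get(d, 0) + 1
    return dict(sorted(hist.items()))


# Usage: compare the observed mean degree with the boundary-free estimate.
degs = rgg_degrees(500, 0.05, random.Random(42))
mean_degree = sum(degs) / len(degs)
estimate = (500 - 1) * math.pi * 0.05 ** 2
hist = degree_histogram(degs)
```

The observed mean falls slightly below the estimate because points near the edge of the square have part of their neighbourhood outside it.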

Future Directions

As a result of discussions with modellers and epidemiologists, we have identified several areas in which the current EpiSims technology needs improvement. The design below will permit much easier comparison with other models in the field while still retaining the individual based resolution characteristic of EpiSims. The remainder of this section consists of excerpts from a design document. It does not represent the current state of EpiSims.

Requirements for a model:
Desirable properties:

Drawbacks of the current model:
Benefits of the current model:

Different individuals manifest infection by the same infectious agent in different ways. A primary goal of EpiSims is to capture the dependence of disease manifestation on demographics. To this end, each person in the population is assigned:
The assignment is consistent with user specified probabilities conditional on demographics. A variety of assignments may be produced by varying the random seed. During the course of the simulation, each location (at the finest resolution simulated) is also associated with a disease state. Some aspects of a location's disease state may be modified by exogenous events created by the user (e.g. contamination, decontamination); others reflect transmission dynamics internal to the simulation.

In addition to the conditional probabilities for the assignments described above, the user must also specify the following:

Planned implementation of Disease Manifestations

Each possible Disease Manifestation must be specified by the user prior to using EpiSims. A Disease Manifestation is a Markov Chain consisting of a finite set of Disease States together with transition probabilities among them and a distribution of residence times in each state. Most Disease States are associated with values for the following attributes relevant to the spread of disease:
The value of some attributes affects the dynamics of disease transmission directly. For example, infectivity is related to the probability of transmission as explained below. Others may affect the behaviour of an infected person: symptom levels reflect the severity of symptoms, the likelihood of health-care-seeking behaviour, and the likelihood of correct diagnosis; the degree of incapacitation affects whether a person stays home from work, shopping, or other activities. Detailed interpretation of these attributes is provided below. In addition to these attributes, the user can assign each state a unique name. Simulation outputs and analysis tools will refer to states by this name.

Note that the Disease State does not contain a "recovered" attribute. The simulation maintains information about each person's history, including whether an individual has ever been infected and whether he or she is currently infected. These can be combined into the notion of "recovered". Alternatively, it is possible to specify a transition from one Disease Manifestation to another upon "recovery". That is, a person who has contracted the disease and recovered may have a different reaction the next time.

As mentioned above, the finest resolution of location also includes a form of Disease State. A location's Disease State contains enough information to represent contamination. Thus, at least the "infectivity" attribute of a Location's Disease State should be maintained, although other attributes are ill defined. Possibly, the "symptom" attribute could be used to specify whether contamination could be detected. Note that, unlike a person's Disease State, a Location will probably cycle through many infections in the course of the simulation. Whenever an infectious person is present, the Location will become contaminated. This contamination may decay quickly if the residence time in the infected state specified by the user is short.

Residence Times
Every Disease State except the special dead and uninfected states is associated with a probability distribution of residence times. The user may choose from a predetermined set of distributions and assign any necessary parameters.

State Transitions
The user may specify an arbitrary number of transitions out of each Disease State into others. Associated with each transition is a probability. Optionally, each transition may also be associated with a set of Treatment Ids. When a person leaves a Disease State, she or he will pick a new state from among those whose transitions are labelled with the person's treatment id, with

Planned Implementation of Disease Transmission


A transmission rate function returns the (baseline) probability of a person's becoming infected per minute of contact as a function of his/her disease transmission type and the type of an infectious person at the same location. That is, if exactly one susceptible of transmission type $j$ and one infectious person of transmission type $k$ have been in a work location for one minute, the base probability that the susceptible has become infected is given by $\rho_{\text{work}}(j,k)$. The susceptible or infectious "person" may in fact be a location. Note that the transmission rate function need not be symmetric between susceptible and infective, and that it may be activity specific.

Planned Adjustments to Transmission Rates
The reason the probability returned by the transmission rate function is called a "baseline" is that it is further adjusted by duration of contact, number of people in the location, infectivity, and susceptibility.
Duration of Contact
If more or less time than one minute has passed, the probability is adjusted as for a Poisson process, using the survival rate and assuming the probability of infection in each time interval is independent. Thus if the base probability for infection per minute is $p$, the probability in $t$ time units is $1 - (1-p)^t$.
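
The duration adjustment can be checked numerically with a short sketch. The function names and the example values are illustrative assumptions. The second function writes the same quantity in exponential form, which is the form that extends naturally to several infectives.

```python
import math


def prob_infected(p_per_minute, minutes):
    """Probability of infection after `minutes` of contact.

    Each minute is treated as an independent trial with probability
    p_per_minute, so the chance of escaping every trial is (1 - p)^t.
    """
    return 1.0 - (1.0 - p_per_minute) ** minutes


def prob_infected_exp_form(p_per_minute, minutes):
    """The same probability written as 1 - exp(t ln(1 - p)); requires p < 1."""
    return 1.0 - math.exp(minutes * math.log(1.0 - p_per_minute))


# Usage: an illustrative base probability of 1% per minute over one hour.
p_hour = prob_infected(0.01, 60)  # roughly 0.45
```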
Number of People Present
If more than one infective is present, the probability is scaled under the assumption that each infective spreads disease independently. Thus if there are $N_i$ infectives of transmission type $i$, each with probability of transmission $p_i$, the overall probability of transmission in time $t$ will be $1 - \exp\{t \sum_i N_i \ln(1-p_i)\}$. If $M$ susceptibles are present, we divide the probability of transmission by the scale factor $M^{\alpha}$, where $\alpha$ is a user specified exponent. Each susceptible present undergoes a Bernoulli trial with the probability relevant to that person.
Infectivity and Susceptibility
If the infective has infectivity $r$, and the susceptible has susceptibility $s$, the base transmission probability is adjusted to be $s r \rho_{\text{work}}(j,k)$. The user should ensure that all possible resulting probabilities are less than unity. Taking into account the different levels of infectivity associated with each Disease State, if there are $n_{k,r}$ infectious people of type $k$ with infectivity $r$, then the probability of infecting a single susceptible of type $j$ in time $t$ would be

$$p(t) = 1 - \exp\left[ t \sum_{\text{types } k} \sum_{\text{infectivity } r} n_{k,r} \ln\left(1 - r \rho_{\text{work}}(j,k)\right) \right].$$

Total probability of transmission
Putting everything together, the probability of infecting a person of transmission type $j$ with susceptibility $s$ in a location with activity type $a$ with $M$ susceptibles and $n_{k,r}$ infectious people of type $k$ with infectivity $r$ during a time $t$, subject to user specified scaling in susceptibles $\alpha$, is:

$$p_{j,s}(t) = \frac{1 - \exp\left[ t \sum_{\text{types } k} \sum_{\text{infectivity } r} n_{k,r} \ln\left(1 - r s \rho_a(j,k)\right) \right]}{M^{\alpha}}$$
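
The total transmission probability can be sketched as a single function. This is an illustration of the formula under stated assumptions rather than the planned EpiSims implementation: the function name, argument names, and data layout (dictionaries keyed by type and infectivity) are all hypothetical.

```python
import math


def transmission_probability(t, s, rho, n_infectious, m_susceptibles, alpha):
    """Probability that one susceptible is infected during time t.

    t              -- contact time, in the units of the base rate (minutes)
    s              -- susceptibility of the susceptible person
    rho            -- dict: infective type k -> base rate rho_a(j, k) for the
                      susceptible's type j and the location's activity a
    n_infectious   -- dict: (type k, infectivity r) -> number present
    m_susceptibles -- number of susceptibles present (M >= 1)
    alpha          -- user specified scaling exponent
    """
    log_escape = 0.0
    for (k, r), count in n_infectious.items():
        q = r * s * rho[k]
        if not 0.0 <= q < 1.0:
            raise ValueError("adjusted probability must be in [0, 1)")
        # Each infective contributes independently to the chance of escape.
        log_escape += count * math.log(1.0 - q)
    return (1.0 - math.exp(t * log_escape)) / (m_susceptibles ** alpha)


# Usage: two infectives of one type, three susceptibles, alpha = 1.
p = transmission_probability(
    t=30, s=1.0, rho={"adult": 0.01},
    n_infectious={("adult", 1.0): 2}, m_susceptibles=3, alpha=1.0,
)
```

Each susceptible present would then undergo a Bernoulli trial with the probability returned for that person.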
Planned Disease Manifestation Constraints
There is a single consistency constraint on allowed values of attributes for a Disease State: non-zero incapacitation implies non-zero prodrome or symptoms. In particular, the following constraints are NOT imposed:
  1. infectious => symptomatic or prodromal
  2. dead => uninfectious (corpses can be hazardous)
In addition, the transition probabilities for each state must sum to unity by Treatment Id.

Planned Implementation of Behavioural Thresholds

Some actions taken at run time during the simulation depend on thresholds set by the user. For example, when a person becomes incapacitated, he or she will skip some normal activities. Which activities are skipped depends on the value of the person's incapacitation versus user specified thresholds. Similarly, symptomatic people may be mis-diagnosed if their level of symptoms is not above a user specified threshold. Also, symptomatic people may seek over-the-counter remedies or emergency care as the level of their symptoms rises. The user may specify a threshold value for incapacitated for staying home from any of the defined activity types. Furthermore, the user may specify any of the following thresholds for the symptomatic and prodrome attributes:
The user may specify two sets of thresholds – when more than a user specified number of people have been diagnosed with the disease the second set of thresholds will be used. This allows for the increased likelihood of correct diagnosis when the disease is known to be present in the community.

Planned Implementation of Prophylaxis and Treatment

Prophylaxis (before infection) is modelled by changing the person's susceptibility. The user must specify a distribution of susceptibilities to use. As usual, this distribution may be conditioned on people's demographics. The variability in susceptibility post-prophylaxis allows one to model variable efficacy.

During the course of the simulation, an individual may seek treatment as described above. Availability of treatment is constrained by the simulation based on available resources (in an as-yet-to-be-determined way) and on level of symptoms (also to be determined). The simulation will determine whether each individual seeking treatment receives it, and also what kind of treatment is given. Examples of possible treatments might include:
The effect of treatment (after infection) is specified by the user in the Disease Manifestation Model. Each state transition may be labelled with a set of Treatment Ids. If it is not labelled, the transition is available to any individual. A labelled transition is only available to individuals who have received treatment at one of the ids included in the set.
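
The treatment-dependent choice of the next Disease State can be sketched as follows. The data layout, function names, state names, and treatment ids are hypothetical, and the use of an explicit "untreated" id in the usage example is an assumption of this sketch. It follows the two rules stated in the design: unlabelled transitions are open to everyone, and the probabilities available under a given Treatment Id must sum to unity.

```python
import random


def available_transitions(transitions, treatment_id):
    """Keep transitions that are unlabelled or labelled with treatment_id.

    Each transition is a tuple (next_state, probability, treatment_ids),
    where treatment_ids is a set of ids, or None if unlabelled.
    """
    return [(nxt, p) for nxt, p, ids in transitions
            if ids is None or treatment_id in ids]


def next_state(transitions, treatment_id, rng):
    """Sample the next Disease State for a person with the given treatment id."""
    options = available_transitions(transitions, treatment_id)
    total = sum(p for _, p in options)
    if abs(total - 1.0) > 1e-9:
        raise ValueError("transition probabilities must sum to unity "
                         "for each Treatment Id")
    u = rng.random()
    cumulative = 0.0
    for nxt, p in options:
        cumulative += p
        if u < cumulative:
            return nxt
    return options[-1][0]  # guard against floating-point round-off


# Usage: illustrative (made-up) probabilities for leaving one state.
out_of_symptomatic = [
    ("recovered", 0.7, {"untreated"}),
    ("dead", 0.3, {"untreated"}),
    ("recovered", 0.9, {"vaccinated"}),
    ("dead", 0.1, {"vaccinated"}),
]
state = next_state(out_of_symptomatic, "vaccinated", random.Random(7))
```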





#cparse("${superIncludes}/super.body-bottom.fhtml")