key: cord-016238-bs1qk677
authors: Venkatachalam, Sangeeta; Mikler, Armin R.
title: An Infectious Disease Outbreak Simulator Based on the Cellular Automata Paradigm
date: 2006
journal: Innovative Internet Community Systems
DOI: 10.1007/11553762_20
sha: 
doc_id: 16238
cord_uid: bs1qk677

In this paper, we propose the use of the Cellular Automata paradigm to simulate an infectious disease outbreak. The simulator facilitates the study of the dynamics of epidemics of different infectious diseases, and has been applied to study the effects of spread vaccination and ring vaccination strategies. Fundamentally, the simulator loosely simulates the SIR (Susceptible Infected Removed) and SEIR (Susceptible Exposed Infected Removed) models. The Geo-spatial model with global interaction and our approach of global stochastic cellular automata are also discussed. The global stochastic cellular automata take into account the demography and culture of a region. The simulator can be used to study the dynamics of disease epidemics over large geographic regions. We analyze the effects of distance and interaction on the spread of various diseases.

The emergence and re-emergence of diseases such as influenza and SARS have drawn increased attention to public health in general and epidemiology specifically. With the ever-increasing population and the ability to travel longer distances in a short time, the spread of communicable diseases in a society has accelerated [16, 17]. The growing diversity of the population and globalization are leading to increasing interaction among individuals. Constant exposure to public health threats is raising people's concern and necessitates proactive action to prevent disease outbreaks. Further, greater emphasis on infections and epidemics is rooted in the imminent threat arising from bioterrorism. As a result, Public Health professionals have been focusing on identifying the factors in the social, physical and epidemiological environment that aid the faster spread of diseases.

As the significance of Public Health is being recognized, the role of epidemiologists has become more prominent. Epidemiology deals with the study of the cause, spread, and control of diseases. The goal of epidemiologists is to implement mechanisms for surveillance, monitoring, prevention and control of different diseases. To accomplish this, epidemiologists need to deal with large data sets of disease outbreaks. These data sets are often spatially and/or temporally distributed. It is in fact ironic that, for epidemiologists to study the dynamics of different diseases, it is vital for an outbreak to occur. Epidemiologists have been studying and analyzing these data sets primarily with statistical tools. Given the vast variety of infectious diseases, epidemiological expertise is needed for every disease. Statistical tools prove to be inadequate and fragmentary when focusing on large spatial domains. These tools have been deemed limited, particularly in view of an emerging global computational infrastructure that facilitates high performance computing. Hence, it is imperative to develop new tools that take advantage of today's computational power and help epidemiologists analyze and understand the spatial spread of diseases. Such computational tools also enhance the quality of information, accelerate the generation of answers to specific questions and facilitate prediction. They will take on an important role in surveillance, monitoring, prevention and control of different diseases.

In the domain of computational tools, the Cellular Automata (CA) paradigm has been in use for several decades [14]. Nevertheless, in the field of modeling epidemics, this paradigm has rarely been utilized to its full potential [1, 10, 14, 8]. A cellular automaton, as defined by Lyman Hurd, is a discrete dynamical system in which space, time, and the states of the system are discrete [15]. A CA can be viewed as an array of similar processing units called cells. The cells are arranged in a regular manner and constitute a regular spatial lattice. Figure 1 shows a regular lattice of cells. The fundamental property of each cell is its state, and the states of cells change based on an update rule, either local or global. The update rule is applied synchronously throughout the lattice, and the state transition of a cell is based on a few nearby cells, known as its neighborhood. For a two-dimensional lattice the most commonly defined neighborhoods are the von Neumann and Moore neighborhoods, as shown in figure 1 [15]. In the von Neumann neighborhood, the state of cell $C_{i,j}$ depends on the states of the four neighboring cells, namely $C_{i+1,j}$, $C_{i-1,j}$, $C_{i,j+1}$, $C_{i,j-1}$. In the Moore neighborhood, the state of cell $C_{i,j}$ depends on the states of the eight neighboring cells, namely the four von Neumann neighbors and the four diagonal cells $C_{i+1,j+1}$, $C_{i+1,j-1}$, $C_{i-1,j+1}$, $C_{i-1,j-1}$.
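The two neighborhoods can be illustrated with a short Python sketch. The function name `neighbors` and the decision to give border cells fewer neighbors (no wrap-around) are assumptions made for illustration; the text does not state how lattice boundaries are handled.

```python
# Offsets (di, dj) relative to a cell C[i][j].
VON_NEUMANN = [(1, 0), (-1, 0), (0, 1), (0, -1)]
MOORE = VON_NEUMANN + [(1, 1), (1, -1), (-1, 1), (-1, -1)]


def neighbors(i, j, rows, cols, offsets=MOORE):
    """Return the in-lattice neighbor coordinates of cell (i, j).

    Assumption: cells on the border simply have fewer neighbors
    (no wrap-around).
    """
    result = []
    for di, dj in offsets:
        ni, nj = i + di, j + dj
        if 0 <= ni < rows and 0 <= nj < cols:
            result.append((ni, nj))
    return result
```

An interior cell gets eight neighbors with `MOORE` and four with `VON_NEUMANN`; a corner cell gets three and two respectively.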

As mentioned before, the CA's evolution is based on a global update rule applied uniformly to all the cells. The signature of this rule can be thought of as a state transition from time t-1 to t. As shown in figure 2, the state of the center cell changes to the state that is in the majority among the cells in its neighborhood. The update rule determines whether a CA behaves deterministically or stochastically. Stochastic behavior arises from probabilistic update rules, which lead to non-deterministic state transitions.
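A majority rule of this kind can be sketched in Python as follows. This is only an illustration under stated assumptions: binary states 0/1, a Moore neighborhood without wrap-around, and ties that leave the cell unchanged. None of these details are specified in the text, and `majority_step` is a hypothetical name.

```python
def majority_step(grid):
    """Apply one synchronous majority update to a 2-D grid of 0/1 states.

    Assumptions: Moore neighborhood, border cells use only their
    in-lattice neighbors, and a tie keeps the current state.
    """
    rows, cols = len(grid), len(grid[0])
    # Write into a copy so every cell reads the states of time t-1.
    new_grid = [row[:] for row in grid]
    for i in range(rows):
        for j in range(cols):
            ones = zeros = 0
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    if di == 0 and dj == 0:
                        continue  # skip the cell itself
                    ni, nj = i + di, j + dj
                    if 0 <= ni < rows and 0 <= nj < cols:
                        if grid[ni][nj] == 1:
                            ones += 1
                        else:
                            zeros += 1
            if ones > zeros:
                new_grid[i][j] = 1
            elif zeros > ones:
                new_grid[i][j] = 0
    return new_grid
```

Reading from `grid` and writing to `new_grid` is what makes the update synchronous: no cell sees a neighbor's state from time t while computing its own.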

Our efforts to design and implement a Cellular Automata based simulator have been necessitated by the need to study the dynamics of the spread of a vast number of infectious diseases. Towards this goal, this paper proposes the use of the CA paradigm to simulate an infectious disease outbreak. Specifically, this paper focuses on the design and evaluation of EPI-SIM, a global disease outbreak simulator. The following section summarizes some of the research efforts in modeling disease epidemics and highlights the principal approaches. The design of EPI-SIM is discussed in Section 3. Section 4 presents the experimental analysis and results of the simulator. Section 5 discusses the Geo-Spatial model and the approach towards the global model to account for different demographics. Section 6 concludes the paper with a summary and directions for future work in the area of modeling infectious disease outbreaks.

Most of the work in modeling infectious disease epidemics is mathematically inspired and based on differential equations and the SIR/SEIR model [3]. Differential equation and SIR models rely on the assumption of a constant population and neglect spatial effects [5, 6]. They often fail to consider the individual contact/interaction process, assume that populations are homogeneously mixed, and do not include variable susceptibility. Considerable research has been conducted in SIR (Susceptible, Infectious, Recovered) modeling of infectious diseases using a set of differential equations. Both partial and ordinary differential equation models are deterministic in nature and neglect stochastic or probabilistic behavior [8]. Nevertheless, these approaches/models have been shown to be effective in regions of small population [8]. Other approaches for modeling disease epidemics have used mean-field-type (MFT) approximations [12]. Even though the MFT models are similar to the differential equations, they add a probabilistic nature by introducing different probabilities for the mixing among individuals. However, according to Boccara [5], mean field approximations tend to neglect spatial dependencies and correlations and assume that the probability of a cell being susceptible or infective is proportional to the density of the corresponding population. This approach relies on quantitative measures to predict local interaction. Boccara and Cheong [5] study the SIS model of the spread of infectious disease in a population of moving individuals, thereby introducing non-uniform population density. In every update the cells take up a state of being either susceptible or infectious and randomly choose a cell location to move to. Ahmed et al. [2] model variations in population density by allowing cyclic host movement. Other approaches model variable susceptibility of the population by inducing immunity in the population.
Ahmed et al. [1] introduce incubation and latency time, and suggest that these parameters have an accelerating impact on the spread of a disease epidemic. Nevertheless, the underlying assumption is spontaneous infection of individuals. Boccara and Cheong [6] concentrate on SIR epidemic models and take into consideration the fluctuation in the population due to births and deaths, exhibiting cyclic behavior with primary emphasis on moving individuals. Di Stefano et al. [8] have developed a lattice gas cellular automata model to analyze the spread of epidemics of infectious diseases. The model is based on individuals, where individuals can change their state independently of others and can move from one cell to another. However, this approach does not consider the infection time-line of latency, incubation period, and recovery, which have been shown to be important for modeling a disease epidemic.

In our model the basic unit of the cellular automata is a cell, which may represent an individual or a small sub-population. For each cell we use the Moore (8-cell) neighborhood definition. Each cell is characterized by its own probability for risk of exposure, its probability of contracting the disease, and its state. Unlike in the SIR model, every cell comes in contact with the cells in its defined neighborhood. The time-line for infection that we consider is shown in figure 3. However, the Moore neighborhood is restricted in modeling population demographics and travel patterns. This limitation is eliminated in the next version of the simulator with a global neighborhood, which will be proposed in future publications. The following sections discuss the definitions, features and rules of the model and simulator.

In order to understand the functioning of the simulator, we define the finite number of states a cell can exist in, and define the infectious time-line. The following section describes the different states and definitions considered in the model.

The susceptible state 'S' is defined as the state in which the cell is capable of contracting a disease from its neighbors. In the infectious state 'I', the cell is capable of passing on the infection to its neighbors. In the recovered state 'R', the cell is capable neither of passing on the infection nor of contracting it.

Infectivity ψ at any given time is defined as the probability that a susceptible individual becomes infected if it has an infectious cell as a neighbor. Latency λ is defined as the time period between the cell becoming infected and it becoming infectious. The infectious period θ is the period of time during which the infected cell is capable of spreading the disease to other cells. The recovery period ρ is defined as the time period the cell takes to recover, during which it is capable neither of passing on the infection nor of catching it.

The following rules are applied to the CA for simulating the spread of the disease. The rules describe the state transitions of individual cells.

1. The state of a cell changes from susceptible S to latent L when it comes in contact with an infected cell in its defined neighborhood. The cell acquires the disease from the infected neighbor with the probability given by the infectivity parameter ψ. The cell remains in the latent state for the number of time steps (updates) defined by the latency parameter λ.
2. The state of the cell changes from latent L to infectious I after being in state L for the given λ. In this model we assume for simplicity that every cell exposed to the pathogen will become infectious. In the state I, the cells are capable of passing on the infection to neighboring cells. For example, if for a disease D, λ = 2 units, then after two time steps the cell will enter the infectious state I.
3. After a time period defined by the infectious period θ, the state of the cell changes from infectious I to recovered or removed R. Once a cell enters the state R, it is no longer capable of passing on the infection.
4. From the state R, the cell's state either changes back to susceptible S or remains R, signifying complete immunity. Whether the 'healing mode' is turned on determines if the transition from state R to state S takes place.
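The rules above can be sketched as one synchronous update in Python. This is a minimal illustration, not the EPI-SIM implementation. The function name `step`, the per-cell timers, the absence of wrap-around at the border, and the choice of one independent infection trial per infectious neighbor are all assumptions.

```python
import random

S, L, I, R = "S", "L", "I", "R"


def step(states, timers, psi, lam, theta, rho, healing, rng):
    """One synchronous update of the S -> L -> I -> R (-> S) rules.

    states: 2-D list of state labels.
    timers: 2-D list of time steps already spent in the current state.
    Returns the new (states, timers) pair.
    Assumptions: Moore neighborhood without wrap-around, and one
    independent trial with probability psi per infectious neighbor.
    """
    rows, cols = len(states), len(states[0])
    new_states = [row[:] for row in states]
    new_timers = [[t + 1 for t in row] for row in timers]
    for i in range(rows):
        for j in range(cols):
            state = states[i][j]
            nxt = state
            if state == S:
                # Rule 1: contact with an infectious neighbor.
                for di in (-1, 0, 1):
                    for dj in (-1, 0, 1):
                        ni, nj = i + di, j + dj
                        if (di or dj) and 0 <= ni < rows and 0 <= nj < cols:
                            if states[ni][nj] == I and rng.random() < psi:
                                nxt = L
            elif state == L and new_timers[i][j] >= lam:
                nxt = I  # Rule 2: latency has elapsed.
            elif state == I and new_timers[i][j] >= theta:
                nxt = R  # Rule 3: infectious period has elapsed.
            elif state == R and healing and new_timers[i][j] >= rho:
                nxt = S  # Rule 4: only when healing mode is on.
            if nxt != state:
                new_states[i][j] = nxt
                new_timers[i][j] = 0
    return new_states, new_timers
```

With `lam = 2`, a cell that becomes latent at one update turns infectious exactly two updates later, matching the example given in rule 2.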

When modeling a disease epidemic, a few parameters that are considered important are the neighborhood radius, contact between individuals, infection probability (variable susceptibility), immunity, latency, infectious period and recovery period. The simulator is highly parameterized to let the user change and modify these parameters. The neighborhood of every cell can be changed from an 8-cell neighborhood to a 4-cell neighborhood depending on the region being simulated and the contacts among the individuals of the region. As mentioned, the infection probability, represented as the infectivity ψ, is a significant parameter for the spread of a disease. In our model, ψ is based on the virulence of the disease and the contact rate among individuals. For some diseases individuals attain lifetime immunity after being infected, while for diseases like the common cold, individuals attain temporary immunity. To take this fact into consideration, the simulator has a healing mode. With the healing mode enabled, the simulation forces cells to turn susceptible again after the recovery state; with healing turned off, a cell attains complete lifetime immunity. As mentioned above, the infection time-line is also an important factor in modeling a disease epidemic. Thus the time periods of latency λ, infectious period θ, and recovery ρ are all expressed in time units; for example, a latency of two days can be represented as λ = 2 units. The simulator allows the user to step through the simulation one time step at a time, or to execute it continuously. We will see in the next section how changing these parameters can change the dynamics of the spread of diseases.
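One way to picture this parameterization is a small settings object. The following Python sketch is purely illustrative: the class name, field names and validation checks are assumptions and do not describe the simulator's actual interface.

```python
from dataclasses import dataclass


@dataclass
class SimulationParameters:
    """User-adjustable settings; all names here are hypothetical."""

    infectivity: float          # psi, a probability in [0, 1]
    latency: int                # lambda, in time units
    infectious_period: int      # theta, in time units
    recovery_period: int        # rho, in time units
    neighborhood_size: int = 8  # 8 (Moore) or 4 (von Neumann)
    healing: bool = False       # True: recovered cells become susceptible

    def __post_init__(self):
        if not 0.0 <= self.infectivity <= 1.0:
            raise ValueError("infectivity must be a probability")
        if self.neighborhood_size not in (4, 8):
            raise ValueError("neighborhood_size must be 4 or 8")
        periods = (self.latency, self.infectious_period, self.recovery_period)
        if min(periods) < 0:
            raise ValueError("time periods must be non-negative")

    def state_after_recovery(self):
        """Healing on gives temporary immunity, off gives lifetime immunity."""
        return "S" if self.healing else "R"
```

A disease with temporary immunity would be configured with `healing=True`, and one with lifetime immunity with the default `healing=False`.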

An epidemic is a severe outbreak of an infectious disease that spreads rapidly to many people. For example, the occurrence of influenza in a region is considered an epidemic. When a disease spreads to larger geographic regions or throughout the world, it is known as a pandemic.

Similarly, an endemic disease is defined as a disease that is always present in a certain group of the population. Using our model we show both an epidemic and an endemic. An epidemic is characterized by exponential growth in the number of infected individuals in a population. In the case of an endemic, the number of infected individuals fluctuates around a mean and there is no exponential growth.

Experiments were conducted on a 140 by 140 grid cellular automata with different values of ψ, λ and θ. The results in this section represent the mean over multiple random experiments and different random graphs of the same type. The analysis of the results in this section has been conducted with reference to the above definitions.

As mentioned earlier, ψ is an important factor in the analysis of spread of a disease. Figure 4 

Vaccination has contributed significantly towards the eradication and reduction of the effect of many infectious diseases [7]. The following experiments were conducted on the simulator by vaccinating about 5% of the population at random and infecting a few cells. Figure 5(b) shows the growth of infected individuals in a vaccinated and a non-vaccinated population. It shows that the growth of infected individuals in a population with only 5% of the population vaccinated is considerably lower than the growth in a non-vaccinated population.

We study the effects of the spatial distribution of the population by vaccinating a part of the population using random vaccination and ring vaccination. Every time a new vaccine is discovered, the question arises as to how the vaccine should be distributed to minimize the spread of a disease and maximize the effect of vaccination. Thus, in this experiment we compare random vaccination, which is also known as the uniform strategy [9], and ring vaccination. The number of vaccine doses at our disposal is often limited, so for the purpose of the experiment we consider N doses of vaccine to be available to vaccinate the population, where N is about 5% of the population. In random vaccination, the N doses are randomly distributed to individuals in the population, independently of one another. In ring vaccination, individuals are vaccinated in a ring surrounding an area. The thickness and circumference of the ring depend on N. As Figure 6 shows, with random vaccination many more individuals are infected than with ring vaccination. This experiment validates the result shown by Fukś and Lawniczak in [9].
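The difference between the two strategies lies only in which N cells receive a dose. The Python sketch below illustrates one possible selection of cells for each strategy. The function names, the `center` and `inner_radius` arguments, and the rule of filling the ring outward from the inner radius until the doses run out are assumptions; the text states only that the thickness and circumference of the ring depend on N.

```python
import math
import random


def random_vaccination(rows, cols, n_doses, rng):
    """Pick n_doses distinct cells uniformly at random."""
    cells = [(i, j) for i in range(rows) for j in range(cols)]
    return set(rng.sample(cells, n_doses))


def ring_vaccination(rows, cols, n_doses, center, inner_radius):
    """Pick the n_doses cells nearest to, but not inside, a circle.

    Assumption: the ring grows outward from inner_radius around center,
    so its thickness follows from the number of doses available.
    """
    ci, cj = center
    outside = [
        (math.hypot(i - ci, j - cj), i, j)
        for i in range(rows)
        for j in range(cols)
        if math.hypot(i - ci, j - cj) >= inner_radius
    ]
    outside.sort()  # nearest cells outside the circle come first
    return {(i, j) for _, i, j in outside[:n_doses]}
```

For a 140 by 140 lattice, 5% of the population corresponds to 980 doses, which both functions distribute over exactly 980 distinct cells.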

The model described previously is limited by its neighborhood. The model considers a neighborhood of 8 cells, because of which after some time the number of susceptibles decreases and the neighborhood saturates. In such a situation, varying the infectivity parameter plays no role and has the same effect on the spread of the disease. Also, a disease in which an infective can spread the infection to twelve other individuals in one time step cannot be simulated. Another important issue is that the movement of people, migration, or travel is not considered. Some models that we saw in the previous section deal with the movement of individuals from one cell to another in the defined neighborhood, where again the neighborhood is restricted. The saturation of the neighborhood occurs due to overlapping neighborhoods, when more than one cell is infected in a neighborhood. Cells in a neighborhood may get infected more than once in one time unit.

The Geo-spatial model is designed for simulating the global outbreak of a disease in an environment with global interaction. Even for this model the basic unit of the CA is a cell, which represents an individual. The neighborhood as defined for this model is global: in a region of n cells, every cell has n-1 neighbors.

Fig. 7. Represents saturation of neighborhood.

For the functioning of this model, the definitions of the cell states and the simulation parameters are the same as those for the SIR model discussed earlier. This model has an additional parameter, the contact rate, which is defined as follows.

The contact rate parameter defines the number of contacts made by an individual per time unit. Instead of having the same contact rate for every cell in the lattice, for simulation purposes this parameter has a Poisson distribution over the cells. The simulation of the spread of disease proceeds as follows.

1. Every cell establishes contact with k other cells chosen at random, where k is the contact rate defined for that cell. Thus the cell has now established contacts with k cells.
2. Once a contact has been established between cell 'a' and cell 'b', depending on the virulence of the disease defined by the infectivity parameter, cell 'a' can pass the infection to cell 'b' if cell 'b' is in the susceptible state S. If cell 'a' is not currently infected and cell 'b' is infected, then cell 'a' can acquire the infection from 'b'. Thus the infection can pass in both directions.
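The contact process can be sketched in Python as follows. This is a simplified illustration under several assumptions: cells are only infected or not infected (the latency and recovery time-line is left out), contacts are drawn uniformly at random among the other n-1 cells, and the helper names `poisson` and `contact_step` are hypothetical. The Poisson draw uses Knuth's multiplication method because the Python standard library has no Poisson sampler.

```python
import math
import random


def poisson(mean, rng):
    """Draw a Poisson-distributed integer (Knuth's multiplication method)."""
    limit = math.exp(-mean)
    k, product = 0, rng.random()
    while product > limit:
        k += 1
        product *= rng.random()
    return k


def contact_step(infected, contact_rates, psi, rng):
    """One update of the global-contact model on n cells.

    infected: list of booleans, one per cell.
    contact_rates: number of contacts per time unit for each cell.
    Assumption: the latency/recovery time-line is omitted for brevity.
    """
    n = len(infected)
    new_infected = infected[:]
    for a in range(n):
        k = min(contact_rates[a], n - 1)
        # Sample k distinct partners among the other n-1 cells.
        for idx in rng.sample(range(n - 1), k):
            b = idx if idx < a else idx + 1  # skip cell a itself
            # Exactly one of the pair is infected: the infection may
            # pass in either direction with probability psi.
            if infected[a] != infected[b] and rng.random() < psi:
                new_infected[a] = new_infected[b] = True
    return new_infected
```

Per-cell contact rates could be drawn once at start-up, for example `rates = [poisson(4, rng) for _ in range(n)]`, where the mean of 4 is an arbitrary illustrative value.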

The Geo-spatial model differs from the SIR type model in terms of the neighborhood. The neighborhood saturation problem posed by the SIR type model is overcome by this model. However, this model is restricted in modeling population demographics and travel patterns. The choice of cells for contact is random and is not based on the distance from the cell or any other parameter.

To study the effect of the position of the index case on the spread of a disease, the simulation was run with different initial positions. After a certain number of time units, the locations of newly infected cases are not very different for the two initial positions. To analyze the contact rate, the experiment was run with three different contact rates for the cells. The results show that as the average contact rate increases, the number of infected individuals also grows. For this model the number of infected individuals is directly proportional to the contact rate. It is important to note that the contacts made by cells are random. Figure 8(a) shows the comparison.

As seen before in the other model, as the infectivity parameter ψ increases, the number of infected individuals increases. The average contact rate was fixed for this experiment. Figure 8(b) shows the comparison.

The models described above may be used for simulating diseases over small regions with local interaction and global interaction, respectively. As mentioned before, these models do not take into account the demographics of the region and may not be accurate for simulating disease spread over large geographic regions because of the neighborhood restrictions they impose. The global stochastic cellular automata with demographics will therefore help in understanding the effects of different demographics, population density, and the socio-economics and culture of a region. It can also be used effectively for investigating different vaccination strategies and understanding the effects of travel.

In the following section we discuss the design of a global outbreak simulator with global interaction and demography. Even for this model the basic unit of the CA is a cell, which represents an individual or a small sub-population. The neighborhood as defined for this model is global: in a region of n cells, every cell has n-1 neighbors. The neighborhood for a global stochastic cellular automata (SCA) is defined using a fuzzy set neighborhood, which is defined as follows.

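The set $F \subset S$, where $S$ is the set of all the cells, is defined as

$$F : \{ \langle s, p \rangle \mid s \in S,\ 0 \le p \le 1 \}$$

$\langle s, 1 \rangle$: total/complete membership
$\langle s, 0 \rangle$: no membership

The variable $p$ maintains the state of infection, 1 if infected else 0.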

The state of infection δ is defined as any number between 0 and 1, indicating the level of infection present in the cell: 0 indicates not infected, and 1 indicates fully infected.

The interaction coefficient i for a particular cell is defined as the interaction between that cell and every other cell in the lattice space. It is calculated as the reciprocal of the Euclidean distance between the cells. The Euclidean distance is used as derived from the GIS gravity model.

The global interaction coefficient Γ of cell $C_{i,j}$ is the summation of all the individual (n-1) interaction coefficients of the cell. Every cell has one global interaction coefficient and n-1 interaction coefficients.

The infection factor I is calculated as the ratio of the interaction coefficient to the global interaction coefficient Γ, for every cell to cell interaction. It is also based on the virulence of the disease and the state of infection of the infecting agent.

$$I_{C_{i,j}} = \sum_{\forall C_{k,l} \neq C_{i,j}} \frac{i_{C_{i,j},C_{k,l}}}{\Gamma_{C_{i,j}}} \times \delta_{C_{k,l}} \times \psi$$

The global interaction coefficient and the interaction coefficients are calculated based on the distance. As the distance between the cells decreases, the interaction coefficients increase, which indicates a higher chance of interaction between them.

$$\Gamma_{C_{i,j}} = \sum_{\forall C_{k,l} \neq C_{i,j}} \frac{1}{\sqrt{(i-k)^2 + (j-l)^2}}$$
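The distance-based coefficients can be illustrated numerically with a short Python sketch. Cells are identified by their lattice coordinates (i, j). The function names are hypothetical, and summing the per-pair contributions into a single infection factor per cell is an assumption about how the definitions above combine.

```python
import math


def interaction(a, b):
    """Interaction coefficient: reciprocal of the Euclidean distance."""
    return 1.0 / math.hypot(a[0] - b[0], a[1] - b[1])


def global_interaction(cell, cells):
    """Gamma: sum of the interaction coefficients with all other cells."""
    return sum(interaction(cell, other) for other in cells if other != cell)


def infection_factor(cell, cells, delta, psi):
    """Sum over other cells of (i / Gamma) * delta * psi.

    delta maps each cell to its state of infection in [0, 1].
    Assumption: per-pair contributions are added up for the cell.
    """
    gamma = global_interaction(cell, cells)
    return sum(
        interaction(cell, other) / gamma * delta[other] * psi
        for other in cells
        if other != cell
    )
```

For three cells at (0, 0), (0, 1) and (0, 2), the global interaction coefficient of (0, 0) is 1 + 1/2 = 1.5. If only (0, 1) is fully infected and ψ = 0.3, the infection factor of (0, 0) is (1 / 1.5) × 1 × 0.3 = 0.2.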

The global interaction coefficient and the interaction coefficients can also be calculated based on both distance and population. The distance between the cells and the populations of the cells are considered. For better understanding, the cells are considered to be small regions having certain populations. The product of the populations of the two cells acts as a factor for the interaction coefficients. The interaction coefficient is directly proportional to the population factor and inversely proportional to the distance between the cells. Thus two cells with high populations are assumed to interact more than two cells with low populations when the distance between them is the same.

$$\Gamma_{C_{i,j}} = \sum_{\forall C_{k,l} \neq C_{i,j}} \frac{1}{\sqrt{(i-k)^2 + (j-l)^2}} \times P_{C_{i,j}} \times P_{C_{k,l}}$$
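A small Python sketch shows how the population factor changes the coefficients. The function names and the `population` dictionary (mapping a cell's lattice coordinates to its population) are assumptions made for illustration, and the population values used in the worked example are arbitrary.

```python
import math


def weighted_interaction(a, b, population):
    """Product of the two populations divided by the Euclidean distance."""
    distance = math.hypot(a[0] - b[0], a[1] - b[1])
    return population[a] * population[b] / distance


def weighted_global_interaction(cell, population):
    """Gamma with demographics: sum over all other cells in `population`."""
    return sum(
        weighted_interaction(cell, other, population)
        for other in population
        if other != cell
    )
```

With hypothetical populations of 10 at (0, 0), 20 at (0, 1) and 5 at (3, 4), the cell (0, 0) interacts with (0, 1) with coefficient 10 × 20 / 1 = 200 and with (3, 4) with coefficient 10 × 5 / 5 = 10, giving a global interaction coefficient of 210.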

This paper describes a disease outbreak simulator based on the cellular automata paradigm. The results show the variation in the spread of the disease for different values of the infectivity parameter ψ. The simulator has also facilitated the study of different vaccination strategies. The Geo-spatial model helps in simulating disease spread in an environment with global interaction, including travel and migration. Similarly, the global model can be used to simulate disease spread over large geographic regions. It deals with global interaction and the demographics of the region. While work continues on the development of computational tools to facilitate surveillance, monitoring, prevention and control of different diseases, the current simulators prove to be valuable tools for studying the dynamics of different diseases. Global stochastic versions of the CA are currently being developed.

On Modeling epidemics. Including latency, incubation and variable susceptibility. Physica A

On some applications of cellular automata. Physica A

A comparison of simulation models applied to epidemics

Multiagent Coordination by Stochastic Cellular Automata. Presented at the International Joint Conference on Artificial Intelligence

Critical behavior of a probabilistic automata network SIS model for the spread of an infectious disease in a population of moving individuals

A probabilistic automata network epidemic model with births and deaths exhibiting cyclic behavior

Object-oriented implementation of CA/LGCA modelling applied to the spread of epidemics

Individual-based lattice model for spatial spread of epidemics. Discrete Dynamics in

Epidemiology through Cellular Automata. Case of Study: Avian Influenza Indonesia. Working Paper WPF2004

Epidemiologic Methods for the Study of Infectious Diseases

Mean-field-type equations for spread of epidemics: The 'small world' model

Deterministic site exchange cellular automata models for the spread of diseases in human settlements

Epidemic Modelling Using Cellular Automata

Complex transmission dynamics of clonally related virulent Mycobacterium tuberculosis associated with barhopping by predominantly human immunodeficiency virus-positive gay men

Educational uses of virtual reality technology