key: cord-0853275-92ft1hd7
authors: Monteiro, L.H.A.; Gandini, D.M.; Schimit, P.H.T.
title: The influence of immune individuals in disease spread evaluated by cellular automaton and genetic algorithm
date: 2020-08-18
journal: Comput Methods Programs Biomed
DOI: 10.1016/j.cmpb.2020.105707
sha: 284bcb2f182c6b74b39dbcce654135a5ed15a63c
doc_id: 853275
cord_uid: 92ft1hd7

Background and objective: One of the main goals of epidemiological studies is to build models capable of forecasting the prevalence of a contagious disease, in order to propose public health policies for combating its propagation. Here, the aim is to evaluate the influence of immune individuals in the processes of contagion and recovery from varicella. This influence is usually neglected. Methods: An epidemic model based on a probabilistic cellular automaton is introduced. By using a genetic algorithm, the values of three parameters of this model are determined from data on the prevalence of varicella in Belgium and Italy, in a pre-vaccination period. Results: This methodology can predict the varicella prevalence (with average relative error of 2%-4%) in these two European countries. Belgium data can be explained by ignoring the role of immune individuals in the infection propagation; however, Italy data can be explained by considering contagion exclusively mediated by immune individuals. Conclusions: The role of immune individuals should be accurately delineated in investigations on the dynamics of disease propagation. In addition, the proposed methodology can be adapted for evaluating, for instance, the role of asymptomatic carriers in the novel coronavirus spread.

In epidemiological studies based on mathematical models, an accurate estimation of the model parameters is crucial for designing effective control strategies. The role of immune individuals is often neglected in these studies; that is, immune individuals are often disregarded in the transmission and healing processes [1] . The main goal of this work is to examine the validity of this assumption.

Suppose that recovery from an infection confers lifelong immunity. A typical example is varicella (chickenpox), a contagious disease transmitted through social contacts [2]. This disease primarily infects children. Usually, immune adults take care of sick children, without the risk of getting infected again. Also, the meeting of susceptible and infected children is partially promoted by these same immune adults, because children usually go to schools, parks, and clubs in their company.

In this scenario, immune individuals can increase the contagion rate of susceptible individuals and can decrease the convalescence period of infected individuals. Therefore, it is reasonable to conjecture that immune individuals play antagonistic roles in the spread of this infection.

The propagation of contagious diseases in which immune individuals influence the contagion and recovery rates has already been investigated analytically, using a model written as a set of differential equations [3]. Here, an equivalent model formulated in terms of a cellular automaton (CA) [4] is proposed to analyze this issue.

A genetic algorithm (GA) is an optimization metaheuristic in which chromosomes, the candidate solutions for the optimization problem to be solved, evolve by applying operators of crossover, elitism, mutation, and selection [5]. It is expected that some of these chromosomes represent near-optimal solutions after some generations.

GA has been used for estimating parameters of models based on CA in epidemiological studies [6] and in other contexts [7] [8] [9] .

The parameter identification of epidemic models can be considered an inverse problem, which has been solved by employing distinct computational techniques [10] [11] [12]. Here, GA is employed to determine the values of three parameters of the proposed epidemic model based on CA. This parameter identification is performed on data on the prevalence of varicella in Belgium and Italy, around the year 2000. The values found for these parameters support a discussion of the mentioned conjecture.

This manuscript is organized as follows. In Section 2, the epidemic model based on CA and the GA employed in its parameter identification are introduced. In Section 3, the numerical results are presented. In Section 4, the possible relevance of this study is stressed.

2 Methods: CA and GA

CA has been used in works on computational epidemiology [13] [14] [15] [16] [17] [18] [19] [20] [21] [22]. Let a CA be represented by a two-dimensional lattice with n × n cells. To avoid edge effects, the boundary conditions are taken as periodic (that is, the top and bottom edges are connected and the left and right edges are also connected; thus, the lattice has a toroidal shape). Here, each cell of the CA lattice corresponds to an individual of the host population, which may be infected by a pathogen. At each time step t, m undirected links start from each cell and end in other cells inside a Moore neighborhood of radius r (that is, inside the square block of side 2r + 1 centered on that cell). Let q_{i,r} = 2(r+1−i)/[r(r+1)] be the probability of a cell contacting any other in the i-th layer, with i = 1, 2, ..., r (the i-th layer is formed by the cells at Moore radius equal to i). For instance, for r = 2, q_{1,2} = 2/3 is the probability of a cell contacting any of the 8 cells in the layer i = 1 and q_{2,2} = 1/3 is the probability of contacting any of the 16 cells in the layer i = 2. All links are rewired at each time step t.
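The contact rule above can be illustrated with a minimal Python sketch. It assumes that a contact is drawn by first choosing a layer i with probability q_{i,r} and then choosing one cell uniformly within that layer; the text does not spell out this second step, so it is an assumption, and the function names are hypothetical.

```python
import random

def layer_probabilities(r):
    """Return [q_1, ..., q_r] with q_i = 2(r + 1 - i) / (r (r + 1))."""
    return [2 * (r + 1 - i) / (r * (r + 1)) for i in range(1, r + 1)]

def layer_offsets(i):
    """Offsets (dx, dy) of the 8*i cells at Moore (Chebyshev) distance i."""
    return [(dx, dy)
            for dx in range(-i, i + 1)
            for dy in range(-i, i + 1)
            if max(abs(dx), abs(dy)) == i]

def sample_contact(x, y, n, r, rng=random):
    """Pick one contact for cell (x, y) on an n x n lattice with periodic edges.

    Assumption: the layer is drawn with probability q_{i,r}, then a cell is
    drawn uniformly inside that layer.
    """
    probs = layer_probabilities(r)
    layer = rng.choices(range(1, r + 1), weights=probs, k=1)[0]
    dx, dy = rng.choice(layer_offsets(layer))
    # The modulo wraps coordinates around, giving the toroidal lattice.
    return (x + dx) % n, (y + dy) % n
```

Calling `sample_contact` m times per cell at every time step would give the m links per cell, redrawn at each step, under the stated assumption.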

Such a time-varying connectivity is used to emulate migratory movements inside the geographical area represented by the two-dimensional lattice. This dynamic random network was called a mainly locally connected graph [23]. In the computer simulations shown in the next section, the topological features of this graph (that is, the values of n, m, and r) remain fixed and the chromosomes of the GA are composed only of genes related to the infection properties. Certainly, other types of complex networks could be employed [24, 25]. They could be used to take into account, for instance, non-isotropic features of the simulated environment [26, 27].

[...] proposed [23, 28-30]; however, this is the first one that takes into consideration the role of immune individuals in the spreading of a contagious disease.

In a computer simulation, after a transient period, the percentages of S, I and R-individuals tend to fluctuate around constant values; that is, the system tends to a stationary solution, as illustrated in Fig. 1 . The percentages of S, I and R-individuals in this steady state are denoted by S * , I * , and R * = 1 − (S * + I * ).

If I * = 0, the steady state is known as disease-free; if I * > 0, as endemic. In a simulation, S * and I * are calculated as the average amounts of S and I-individuals divided by N , obtained in the last w time steps from a total of T time steps.

Thus, fluctuations due to the random nature of the CA model are smoothed by this averaging process. On average, the values of S * and I * do not change from one simulation to another; that is, the endemic attractor reached in each simulation does not change. Obviously, w and T must be conveniently chosen; that is, they must be chosen so that S * and I * are computed after the system has reached its steady state. In Fig. 1, T = 100. Observe that the steady state is attained for t ≳ 30; thus, in this case, w = 70 would be a suitable choice.
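The averaging over the last w time steps can be sketched in Python as follows. The function name and the list-based inputs are hypothetical; the per-step counts of S and I-individuals are assumed to have been recorded during a simulation.

```python
def steady_state(s_counts, i_counts, population, w):
    """Estimate (S*, I*, R*) from the last w time steps of one simulation.

    s_counts, i_counts: numbers of S and I-individuals at each time step.
    population: total number of cells N.
    """
    if len(s_counts) != len(i_counts):
        raise ValueError("s_counts and i_counts must have the same length")
    if not 0 < w <= len(s_counts):
        raise ValueError("w must satisfy 0 < w <= T")
    s_star = sum(s_counts[-w:]) / (w * population)
    i_star = sum(i_counts[-w:]) / (w * population)
    r_star = 1 - (s_star + i_star)  # R* is not an independent variable
    return s_star, i_star, r_star
```

With T = 100 recorded steps, `steady_state(s_counts, i_counts, N, 70)` would discard the first 30 steps as transient.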

The CA parameters to be fitted by the GA are k, q, and p. The values of the other parameters (which are the constant probabilities P 3 , P 5 , and P 6 ) are obtained from the literature. Thus, each GA chromosome is composed of three genes (the values of k, q, and p). The optimal chromosome contains values of k, q, and p such that a simulation with the CA model yields S * = S tar and I * = I tar , in which S tar and I tar are the targets; that is, the average percentages of S and I-individuals found in the European countries considered in this work.

Three fitness functions, denoted by F 1 , F 2 , and F 3 , are used to evaluate how good a chromosome (a candidate-solution) is. These functions are:

[Formula: see text], and 0 < ε ≪ 1. If S * = S tar and I * = I tar , then L S , L I , E S , and E I are equal to ε. In this case, the maximum value is [Formula: see text] for F 1 , 1/(2ε) for F 2 , and 1/ε for F 3 . Observe that the better the chromosome, the higher its fitness. In fact, the optimal chromosome maximizes these three functions.

Recall that R * is not explicitly considered in these fitness functions because it is not an independent variable.

Each generation of the GA has η chromosomes. In the first generation, these chromosomes are randomly picked from a uniform distribution in the logarithmic scale. [...]

Observe that only the long-term behavior of the CA model is taken into account to fit its parameters. The rationale behind this assumption is that varicella was (in fact, still remains, even after vaccination) endemic in many countries. Thus, only data related to S * and I * (and, consequently, R * = 1 − (S * + I * )) are usually available and these are the data used here to fit the CA model. If the approach proposed in this work were employed to study the spread of a new pathogen, such as the novel coronavirus, then features of the transient behavior of the amounts of S, I and R-individuals should be taken into consideration.

The results obtained from numerical simulations by combining CA and GA are presented in the next section.

For Belgium, the targets are S tar BEL = 0.035 and I tar BEL = 0.00022 [3, 31]; for Italy, S tar ITA = 0.1 and I tar ITA = 0.000033 [3, 32]. These numbers represent the normalized amounts of susceptible and infected individuals found in each country. The time evolution of the fitness functions F 1 , F 2 , and F 3 for the two countries is shown in Figs. 2, 3, and 4, respectively. Also, from the best chromosome found for each country and each fitness function, the difference between the target and the steady state reached in the CA model is calculated. Table 1 presents these results.

Observe that F 2 gives the best result for Italy data; and F 3 , for Belgium data. For Belgium, the average relative error (considering S, I and R-individuals) is about 4% for F 3 ; for Italy, the average relative error is about 2% for F 2 . The three best chromosomes for the three fitness functions for both countries are shown in Table 2 .
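The relative errors reported here follow the definition given in the caption of Table 1, for instance |S * − S tar|/S tar. The Python sketch below assumes that the target for R-individuals is R tar = 1 − (S tar + I tar) and that the average is the arithmetic mean of the three relative errors; neither detail is stated explicitly, and the function name is hypothetical.

```python
def relative_errors(s_star, i_star, s_tar, i_tar):
    """Relative errors of S*, I*, R* against their targets, plus the mean."""
    r_star = 1 - (s_star + i_star)
    r_tar = 1 - (s_tar + i_tar)  # assumed target for R-individuals
    errors = {
        "S": abs(s_star - s_tar) / s_tar,
        "I": abs(i_star - i_tar) / i_tar,
        "R": abs(r_star - r_tar) / r_tar,
    }
    # Assumption: "average relative error" is the plain mean of the three.
    errors["average"] = (errors["S"] + errors["I"] + errors["R"]) / 3
    return errors

# Belgium targets from the text; the steady-state values are made up.
example = relative_errors(s_star=0.036, i_star=0.00021,
                          s_tar=0.035, i_tar=0.00022)
```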

The best chromosome found for Belgium is k BEL = 0.339118, q BEL = 0, p BEL = 6.312226; for Italy, k ITA = 0, q ITA = 0.064927, p ITA = 0.299330. These results are discussed in the next section.

Despite the unknown degree of underreporting, the CA model proposed here can predict (with average relative error of 2%-4%) the varicella prevalence found in two European countries around the year 2000. For Belgium, k ≈ 0.3, q = 0, and p ≈ 6. Thus, Belgium data can be explained by ignoring the role of R-individuals in the infection propagation (because q = 0). For Italy, the best chromosome corresponds to k = 0, q ≈ 0.06, and p ≈ 0.3. Therefore, Italy data can be explained by considering contagion exclusively mediated by R-individuals (because k = 0). Table 2 confirms that q = 0 for Belgium and k = 0 for Italy compose the best chromosomes for the three fitness functions. Thus, in this work, real-world data can be predicted (with precision of 2%-4%) by supposing that disease spread is only due to the direct contact between S and I-individuals (that is, k > 0 and q = 0) or by supposing that disease spread is only due to the contact between S and I-individuals mediated by R-individuals (that is, k = 0 and q > 0). Certainly, both these ways of disease propagation must occur simultaneously in both countries. However, the epidemic data can be explained by considering either one way or the other, which is a surprising result. The way with k > 0 and q = 0 is commonly found in theoretical studies; the way with k = 0 and q > 0 is the novelty of the proposed CA model. In addition, for Belgium data, the parameter related to recovery mediated by R-individuals is greater than the one for Italy data. In fact, for the first country, p ≈ 6; for the second country, p ≈ 0.3.

In short, the main conclusion is: since R-individuals can take part in the processes of contagion and recovery, their roles should be accurately delineated in studies on the dynamics of disease spread. Usually, their presence is assumed to be only beneficial from an epidemiological point of view, because, due to their acquired immunity, they cannot directly propagate the infection. However, they can catalyze the meeting between S and I-individuals. This fact should not be neglected in mathematical approaches.

Two particular features of the implemented GA were crucial in the parameter identification: a decreasing mutation rate and an initial generation of chromosomes randomly picked from a uniform distribution in the logarithmic scale. From an expert system perspective, the methodology presented here can be applied to forecast the prevalence of other contagious diseases, such as the one caused by the novel coronavirus. In future work, vaccination can also be taken into consideration.
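These two GA features can be illustrated with a short Python sketch. The search bounds, the linear decay schedule, and the start and end mutation rates below are hypothetical values chosen only for illustration; the text states only that the mutation rate decreases and that the initial genes are drawn uniformly in the logarithmic scale.

```python
import math
import random

def log_uniform(low, high, rng=random):
    """Draw a value whose logarithm is uniform between log(low) and log(high)."""
    return math.exp(rng.uniform(math.log(low), math.log(high)))

def initial_population(eta, bounds, rng=random):
    """Build eta chromosomes (k, q, p), one log-uniform gene per bound pair."""
    return [tuple(log_uniform(lo, hi, rng) for lo, hi in bounds)
            for _ in range(eta)]

def mutation_rate(generation, n_generations, start=0.2, end=0.01):
    """Hypothetical linear decay of the mutation rate from start to end."""
    frac = generation / max(1, n_generations - 1)
    return start + (end - start) * frac

# Hypothetical search bounds for the genes k, q, and p.
bounds = [(1e-3, 10.0), (1e-3, 10.0), (1e-3, 10.0)]
population = initial_population(eta=20, bounds=bounds, rng=random.Random(1))
```

Sampling in the logarithmic scale spreads the initial genes evenly across orders of magnitude, which may help when the plausible range of a parameter is wide.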

This article does not contain any studies with human participants or animals performed by any of the authors.

The authors declare that there is no conflict of interest.

5 Tables and table captions

Table 1: Relative errors for the best chromosome found for each fitness function and each country. For instance, for S tar BEL , the relative error e F 1 is given by e F 1 = |S * − 0.035|/0.035, with S * obtained from a CA simulation with the best chromosome found by the GA with the fitness function F 1 . The average relative errors for each fitness function and each country are also shown.

Infectious Diseases of Humans: Dynamics and Control

On considering the influence of recovered individuals in disease propagations

Cellular Automata and Complexity: Collected Papers

An Introduction to Genetic Algorithms

System identification and prediction of dengue fever incidence in Rio de Janeiro

Identification of probabilistic cellular automata

Calibration of an urban cellular automaton model by using statistical techniques and a genetic algorithm

Revisiting the edge of chaos: Evolving cellular automata to perform computations

Novel parameter estimation techniques for a multi-term fractional dynamical epidemic model of dengue fever

Parameter estimation of influenza epidemic model

Solving the inverse problem of an SIS epidemic reaction-diffusion model by optimal control methods

On modeling hepatitis B transmission using cellular automata

Cellular automata and epidemiological models with spatial dependence

Simulating SARS: Small-world epidemiological modeling and public health policy assessments

Simulating the spatial dynamics of foot and mouth disease outbreaks in feral pigs and livestock in Queensland, Australia, using a susceptible-infected-recovered cellular automata model

Modeling infectious diseases using global stochastic cellular automata

Modeling epidemics using cellular automata

Phase transition in spatial epidemics using cellular automata with noise

Epidemiological modeling with a population density map-based cellular automata simulation system

A cellular automaton model for the transmission of Chagas disease in heterogeneous landscape and host community

Impact of time delay on the dynamics of SEIR epidemic model using cellular automata

On the basic reproduction number and the topological properties of the contact network: An epidemiological study in mainly locally connected cellular automata

A new surveillance and spatio-temporal visualization tool SIMID: SIMulation of Infectious Diseases using random networks and GIS

Leveraging hospital big data to monitor flu epidemics

Modelling the impact of transit media on information spreading in an urban space using cellular automata

Optimize the spatial distribution of crop water consumption based on a cellular automata model: A case study of the middle Heihe River basin

Disease spreading in complex networks: A numerical study with Principal Component Analysis

The impact of imported cases on the persistence of contagious diseases

Self-sustained oscillations in epidemic models with infective immigrants

The seroepidemiology of primary varicella-zoster virus infection in Flanders (Belgium)

The seroepidemiology of varicella in Italy