key: cord-0005319-2a7vaez2
authors: Ortega, Neli R. S.; Santos, Fabiano S.; Zanetta, Dirce M. T.; Massad, Eduardo
title: A Fuzzy Reed–Frost Model for Epidemic Spreading
date: 2008-07-29
journal: Bull Math Biol
DOI: 10.1007/s11538-008-9332-3
sha: 12850eaf6ee547718ba4c07b4a87bce0ecd24e44
doc_id: 5319
cord_uid: 2a7vaez2

In this paper, we present a fuzzy approach to the Reed–Frost model for epidemic spreading that takes into account uncertainties in the diagnosis of the infection. The heterogeneities in the infected group are based on the clinical signals of the individuals (symptoms, laboratory exams, medical findings, etc.), which are incorporated into the dynamics of the epidemic. The infectivity level is time-varying and the classification of the individuals is performed through fuzzy relations. Simulations of a real problem, using data from a viral epidemic in a children's daycare, are performed and the results are compared with a stochastic Reed–Frost generalization.

Fuzzy dynamical systems remain a challenging area, particularly for the modeling of non-linear systems. Fuzzy models based on differential equations have been proposed by Pearson (1997), Seikkala (1987), and Nikravesh et al. (2004). However, this approach is difficult to apply in epidemiology because epidemic models generally have strong non-linearities. To incorporate heterogeneities into ecological and epidemiological models, Barros et al. considered fuzzy parameters in differential equations (Barros et al., 2000; Jafelice et al., 2004), whose solution can be found by calculating the fuzzy expected value whenever the variables have a probability distribution. Although these models are a significant contribution to the field, applying them is not an easy task. An alternative approach to fuzzy epidemic models, based on dynamical linguistic models, has been proposed (Ortega et al., 2000).
However, these gradual-rule systems have important limitations, such as the explosion in the number of rules and the difficulty experts have in modeling the consequents when many input variables are considered. In this context, any new approach to fuzzy dynamic modeling applied to epidemiology can be an important contribution to both fuzzy logic and epidemiology. Epidemic systems are generally described through macroscopic models, in which the dynamics are based on population parameters such as the force of infection, the mortality rate and the recovery rate (Coutinho et al., 2005; Massad et al., 2003, 2005a). In contrast, there are few microscopic epidemic models available (Jafelice et al., 2004; Menezes et al., 2004), that is, models in which information about individuals affects the population dynamics. Many models of epidemic spreading have been proposed to help in the understanding of infectious diseases, on the assumption that this knowledge could help to control them (Massad et al., 1995, 1999).

The simplest epidemic model available in the literature is the so-called Reed-Frost model, proposed by Lowell Reed and Wade Frost (Abbey, 1952; Maia, 1952). This model is based on just one population parameter and is a particular case of a chain-binomial model. In this system, each individual is classified as susceptible or infected, and all individuals are assumed to have the same contact rate with each other. In addition, each infected individual can independently infect susceptible individuals, depending only on the probability p of an infectious contact between them. Thus, the probability $C_t$ that a susceptible individual will become infected at time t is equal to the probability of at least one infectious contact, i.e.,

$$C_t = 1 - (1 - p)^{I_t}, \qquad (1)$$

where $I_t$ is the number of infected individuals at time t.
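As an illustration of the classical dynamics just described, the following minimal Python sketch simulates a Reed-Frost chain binomial, assuming the standard form C_t = 1 - (1 - p)^(I_t) for the probability of at least one infectious contact. The function names, the seed handling and the rule that infected individuals are removed after one time step (the usual classical assumption, not stated in the text above) are choices made for this sketch only.

```python
import random


def infection_probability(p, infected):
    """Probability of at least one infectious contact: 1 - (1 - p)**I_t."""
    return 1.0 - (1.0 - p) ** infected


def reed_frost(n, initial_infected, p, steps, seed=0):
    """Simulate a classical Reed-Frost chain-binomial epidemic.

    Returns the infected counts I_0, I_1, ..., I_steps.
    Assumption of this sketch: an infected individual is removed
    (no longer susceptible or infectious) after one time step.
    """
    rng = random.Random(seed)
    susceptible = n - initial_infected
    infected = initial_infected
    history = [infected]
    for _ in range(steps):
        c_t = infection_probability(p, infected)
        # Each susceptible is infected independently with probability C_t.
        new_cases = sum(1 for _ in range(susceptible) if rng.random() < c_t)
        susceptible -= new_cases
        infected = new_cases
        history.append(infected)
    return history


if __name__ == "__main__":
    print(reed_frost(n=20, initial_infected=1, p=0.05, steps=10, seed=1))
```

With p = 0.1 and two infected individuals, for instance, `infection_probability` gives 1 - 0.9^2 = 0.19.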
Time t is assumed to be a discrete variable, and an individual's classification may be updated only when time changes from t to t + 1. In the Reed-Frost model, the probability of an infectious contact is assumed to be fixed throughout the epidemic course and to be the same for all individuals. Therefore, neither heterogeneities in the susceptible and infected groups nor errors in the classification process are considered. Because of these assumptions, the Reed-Frost model is adequate to describe infectious diseases that spread in closed and homogeneously mixed groups whose size N is constant and small. However, the homogeneity assumption does not hold in most real epidemics, since each individual may present different susceptibility and infectivity levels, depending on environmental, physiological and psychological factors. In certain cases, the assumption of time-invariant susceptibility and infectivity levels does not hold either. In addition, errors in the diagnostic process are likely for a great number of infectious diseases, especially when the diagnostic test is neither readily nor easily available, as in the case of dengue (Coutinho et al., 2006), influenza, yellow fever (Massad et al., 2005b), and several other viral and bacterial infections (Zanetta et al., 2003). In those cases, the diagnostic process involves uncertainties and is usually based upon a set of clinical characteristics, often subjective and vague, which we call signals. Indeed, the infectivity level of an infected individual may depend upon the set of signals developed. In this paper, we propose a generalization of the classical Reed-Frost model that includes the clinical signals presented by each individual of the group in the classification process, through a fuzzy decision process.
By doing this, we intend to incorporate individual heterogeneities in the classification process: signals are used to define whether an individual is infected or susceptible, and also to define how the epidemic will spread. To account for individual heterogeneities in infectivity or susceptibility in the Reed-Frost model, several generalizations have been proposed (Lefévre and Picard, 1990; Picard and Lefévre, 1991). These are macroscopic epidemic models, in which heterogeneity is introduced by stratifying the population into subgroups and assuming homogeneous mixing within each subgroup. The subgroups are closed, and the levels of infectivity and susceptibility are assumed constant throughout the epidemic course. Recently, Menezes et al. (2004) proposed a Reed-Frost generalization from a microscopic point of view, in which the uncertainty involved in the diagnostic classification is modeled through a stochastic process and based on information about individuals. Menezes et al. considered studies involving small groups, within which both homogeneous mixing and homogeneous susceptibility were maintained. The epidemic course depends on the clinical signals, which are used both in the classification process and to define the probability of an infectious contact. These clinical signals may include symptoms and the results of laboratory and physical exams. It is assumed that no resistance is gained, so that an individual becomes susceptible again after being infected. They consider the model in the context of both retrospective studies, in which the patients' health statuses are observable and modeled as random variables, and prospective studies, in which the true health statuses are not known. In order to explore the role of the classification process in this epidemic model, we present in this paper a fuzzy approach for the Menezes et al.
Reed-Frost generalized model, taking into account the vagueness of the diagnostic process instead of stochastic uncertainty. We developed a microscopic epidemic model based on the clinical signals and use a fuzzy relation to evaluate each individual's infectiousness, performing a fuzzy decision process in which the degree of infectiousness is applied directly to the epidemic dynamics.

Modeling the Reed-Frost dynamics from the signals rests on the idea that there is an association between the intensity of the signals present in an infected individual and the probability p of an infectious contact with this individual. Thus, it is assumed that the higher the signal values, the higher the probability that a contact between an infected and a susceptible individual is infectious. The model allows the inclusion of signals linked to increased infectiousness as well as signals linked to decreased infectiousness. Furthermore, distinct signals can affect the probability p with different intensities and in different ways, also affecting the classification process. The calculation of the probability p therefore plays an important role in this approach, since it is through p that individual information is transmitted to the population dynamics.

In this formulation, each individual i has a health status, susceptible or infected, represented by $G_{i,t}$. The binary variable $G_{i,t}$ takes the value 1 if individual i is infected at time t, and 0 if the individual is susceptible. The number of individuals infected at time t, in a group of size N, is therefore given by

$$I_t = \sum_{i=1}^{N} G_{i,t}. \qquad (2)$$

In general, the diagnostic process is based upon the set of signals present in the individual under analysis. This set of signals can be summarized by one variable $ID_{i,t}$, taking normalized values in the interval [0, 1], since it must be a fuzzy measure. In addition, the severity of these clinical signals usually varies with both the disease and individual variability.
Furthermore, for reasons other than the infection considered, these signals can also be present in susceptible individuals. In this case, their expression is expected to be less intense than in the presence of the infection. Since the expression of clinical signals differs between infected and susceptible individuals, we assumed two probability distributions, with parameters specific to either the susceptible or the infected population. We represent by $X_I$ the signal for any infected individual, and by $X_S$ the signal for any susceptible individual. Given an individual's health status $G_{i,t}$, $X_I$ and $X_S$ are random variables intrinsically linked to the pathogen; therefore, their distributions remain unaffected by the epidemic course. Since they take values within the interval [0, 1], and considering the diversity of shapes that the Beta probability distribution provides, we assume the following probability distributions:

$$X_I \sim \mathrm{Beta}(\alpha_I, \beta_I), \qquad X_S \sim \mathrm{Beta}(\alpha_S, \beta_S), \qquad (3)$$

where $(\alpha_I, \beta_I)$ and $(\alpha_S, \beta_S)$ denote the shape parameters for the infected and the susceptible populations, respectively.

At time t, the possibility $P_{jl,t}$ that a contact between a susceptible individual j and an infected individual l results in a new case is a function only of the signals of the infected individual, $ID_{l,t}$, as a consequence of the assumption of homogeneous susceptibility; we therefore write it as $P_{l,t}$. We assume in particular that this function is

$$P_{l,t} = \varphi \, (ID_{l,t})^{\omega}, \qquad (4)$$

where ϕ and ω are parameters of the model and should be chosen so as to guarantee that $P_{l,t} \in [0, 1]$ for all l, t. The epidemic dynamics in this fuzzy Reed-Frost model are then given by

$$C_t = 1 - \prod_{l=1}^{N} \left(1 - P_{l,t}\, G_{l,t}\right), \qquad (5)$$

where $C_t$ can be interpreted as the possibility that an individual is infected at time t + 1, similarly to the classic Reed-Frost model; it is used to generate the health status of the individuals at time t + 1. The main difference between the proposal of Menezes et al. (2004) and this fuzzy approach lies in the structure of the summary of signals, which is given by a random variable in the former and by a possibility measure in the latter.
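To make the step from individual signals to the population-level quantity concrete, the sketch below computes the contact possibility of each infected individual and the resulting C_t. It assumes a power form P = ϕ · ID^ω (suggested by the later description of ϕ as the polynomial's coefficient and ω as its power) and a product over infected individuals for C_t, by analogy with the classical model. The function names and the example values are hypothetical.

```python
def contact_possibility(signal_summary, phi, omega):
    """P_l = phi * ID_l ** omega (assumed form), checked to lie in [0, 1]."""
    p = phi * signal_summary ** omega
    if not 0.0 <= p <= 1.0:
        raise ValueError("phi and omega must keep the possibility in [0, 1]")
    return p


def infection_possibility(signal_summaries, infected_flags, phi, omega):
    """C_t = 1 - product over infected individuals of (1 - P_l).

    signal_summaries: summary signal ID_l in [0, 1] for each individual.
    infected_flags:   G_l (1 or True if infected, 0 or False otherwise).
    """
    escape = 1.0  # possibility of avoiding every infectious contact
    for summary, infected in zip(signal_summaries, infected_flags):
        if infected:
            escape *= 1.0 - contact_possibility(summary, phi, omega)
    return 1.0 - escape


if __name__ == "__main__":
    # Three individuals, the first two infected (illustrative values only).
    print(infection_possibility([0.8, 0.5, 0.1], [1, 1, 0], phi=0.1, omega=1.0))
```

In this example the two infected individuals have possibilities 0.08 and 0.05, so C_t = 1 - 0.92 × 0.95 = 0.126; the susceptible individual's signals do not enter the product.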
In the fuzzy case, the individual's infectiousness is calculated as a membership degree based on the max-min composition (Reis et al., 2004; Pereira et al., 2004). Consider a set of signals S and the matrix representation of a fuzzy relation. Let $S_l = [s]_{1 \times k}$ be the array of k signals of individual l, $I_l = [i]_{k \times q}$ the matrix that associates each signal with the infective statement, and $DI_l = [di]_{1 \times q}$ the membership degree of individual l in the fuzzy set Infected, interpreted here as the degree of infectiousness and found by the fuzzy composition

$$DI_l = S_l \circ I_l, \qquad (6)$$

where the fuzzy composition $\circ$ is the max-min composition defined by

$$di_j = \max_{1 \le m \le k} \left[ \min\left(s_m, i_{mj}\right) \right], \qquad j = 1, \dots, q. \qquad (7)$$

As an example, consider the set of signals S = [fever, cough], i.e., $s_1$ is fever and $s_2$ is cough, and an individual who presents a fever degree of $s_1 = 0.7$ and a cough degree of $s_2 = 0.4$. The matrix I that relates signals and infectiousness is $I = [i_{fever}, i_{cough}]$, where $i_{fever}$ is the relationship degree between the symptom fever and the infectiousness status and $i_{cough}$ is the relationship degree between the symptom cough and the infectiousness status. So, an individual with a degree of fever $s_{fever}$ and a degree of cough $s_{cough}$ belongs to the infectiousness fuzzy set with the degree given by

$$DI = \max\left[\min(s_{fever}, i_{fever}),\ \min(s_{cough}, i_{cough})\right].$$

We assume that each individual i has k signals, with levels represented by membership degrees in each fuzzy subset of clinical signals (such as fever or cough), $s_{i1}, s_{i2}, \dots, s_{ik}$. These levels are numbers between 0 and 1, with $s_{i1} = 0$ indicating that clinical signal 1 is absent in patient i, and $s_{i1} = 1$ indicating that patient i presents clinical signal 1 at its maximum level (or severity).
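The max-min composition used in the fever and cough example can be sketched in a few lines of Python. The function name is hypothetical, and the relation values of 0.85 for both fever and cough are the expert values reported later in the paper, used here only to make the example numeric.

```python
def max_min_composition(signals, relation):
    """Compose a 1 x k signal vector with a k x q relation matrix.

    For each column j the result is max over m of min(signals[m], relation[m][j]).
    """
    if len(signals) != len(relation):
        raise ValueError("relation must have one row per signal")
    q = len(relation[0])
    return [
        max(min(s, row[j]) for s, row in zip(signals, relation))
        for j in range(q)
    ]


if __name__ == "__main__":
    signals = [0.7, 0.4]        # fever degree, cough degree
    relation = [[0.85], [0.85]]  # signal -> infectiousness (k = 2, q = 1)
    print(max_min_composition(signals, relation))  # [0.7]
```

Here min(0.7, 0.85) = 0.7 and min(0.4, 0.85) = 0.4, so the degree of infectiousness is 0.7: under max-min the strongest signal, capped by its relation value, determines the result.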
The infectiousness degree is computed for all individuals, and heterogeneity enters the epidemic dynamics through the influence of the signals on the possibility p (the possibility of an infective contact between a susceptible and an infected individual) and, consequently, on $C_t$ (the risk of a susceptible individual becoming infected). The new set of individual signals at the next time step is found from $C_t$, by Eq. (5), and the epidemic spreading is built from the number of infected individuals generated over time. It is important to highlight that the use of the max-min composition of fuzzy relations is an arbitrary choice and, as is common in fuzzy models, other possibilities could be explored, since these compositions are generated through fuzzy operators for disjunction and conjunction (Pedrycz and Gomide, 1998; Sanchez, 1977). In addition, the relational matrix that links signals and infectiousness can be elicited from expert opinion, which makes it possible to introduce into the model information that is not commonly available in any other way.

Although the fuzzy approach has no explicit probability structure, all the calculations performed in the stochastic proposal of Menezes et al. can also be developed for it. However, the calculations in the fuzzy approach demand greater care, since they are more complex. The calculations in the stochastic model are based on mathematical manipulations of conditional probabilities and, therefore, on random variables. In the fuzzy model, however, the measure generated through the max-min composition is not a pure random variable, as in the probabilistic context, but a possibility measure, for which the σ-additive property does not always hold (Pedrycz and Gomide, 1998). Nevertheless, these calculations would allow an analysis of a fuzzy $R_0$ and of the possibility that this $R_0$ is larger than 1.
This analysis could provide results different from those obtained through classic structures, such as stochastic ones, as has already been shown in several works (Barros et al., 2000).

In order to analyze the performance of the fuzzy model and to compare it with the stochastic approach proposed by Menezes et al., we simulated a viral infection scenario. In addition, the performance of each model was evaluated both quantitatively and qualitatively through comparison with real data. During 2003 and 2004, all the children of a daycare in São José do Rio Preto, a city in São Paulo State, Brazil, were followed up; the daycare had roughly 120 children, aged from 1 month to 6 years. One objective of this work was to study the circulation of respiratory infections. All daycare children with cold symptoms had nasopharyngeal aspirates collected and analyzed with a multiplex technique (Bellau-Pujol et al., 2005). It was therefore possible to determine the true health status of each child. Epidemiological data were also collected for all children in the study, regardless of symptomatic status. All children stayed in the daycare for the whole day, so the group can be considered quasi-closed. Although the children are usually gathered in small groups, there are periods during the day when they interact with each other, such as meal times and play times. In the same way, there is also interaction among the teachers during the workday. These characteristics, together with the fact that respiratory infections can be regarded as long-reach infections, allow us to consider that the data and the study conditions agree with the model's assumptions. In order to find the fuzzy relations between signals and infectiousness, four experts in childhood diseases supplied relational matrices covering the most important clinical signals for viral infections overall.
The matrix of fuzzy relations between signals and degree of infectiousness was obtained as the median of the values given by the four experts. The signals considered for overall infections and their respective fuzzy relation values were: fever (0.85), cough (0.85), coryza (0.85), sneeze (0.70) and wheeze (0.60). In these simulations, homogeneous susceptibility was assumed, and an infected individual was considered immune to new infection for 3 weeks, which was the minimum period for re-infection observed. Thus, in the simulated model re-infection is possible, provided the protection period is respected. The model has basically three parameters, which appear in the equations of the dynamic structure: ϕ, the polynomial's coefficient; ω, the polynomial's power; and θ, the prior probability of infected status. In addition, the size of the population N was kept constant, since small variations in its value do not affect the results of the model. We assumed N = 120, which is around the monthly average number of children in the daycare.

To generate the signals of susceptible and infected individuals, we constructed Beta distributions based on the prevalence of the symptoms of viral infections in the population. Signal prevalence was classified into five categories: very low, when the most probable prevalence is roughly 10%; low, when this prevalence is roughly 30%; medium, when the prevalence is about 50%; high, when it is around 75%; and very high, when the expected prevalence is around 90%. Each category was described by a Beta distribution as follows: Very low ∼ Beta(3, 20), Low ∼ Beta(5, 10), Medium ∼ Beta(5, 5), High ∼ Beta(13, 5) and Very high ∼ Beta(25, 3). Depending on the signal considered, a non-infected individual may present signals with some intensity. However, this is not expected to happen with great frequency in the population.
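A quick way to see how the five Beta distributions match the stated "most probable" prevalences is to compute their modes, (a - 1) / (a + b - 2) for a, b > 1. The sketch below does this; the dictionary layout and function names are hypothetical, and reading "most probable prevalence" as the mode of the distribution is an interpretation made for this sketch, not a statement from the text.

```python
# Category -> (a, b, approximate target prevalence quoted in the text).
CATEGORIES = {
    "very low": (3, 20, 0.10),
    "low": (5, 10, 0.30),
    "medium": (5, 5, 0.50),
    "high": (13, 5, 0.75),
    "very high": (25, 3, 0.90),
}


def beta_mode(a, b):
    """Mode of Beta(a, b), valid for a > 1 and b > 1."""
    return (a - 1) / (a + b - 2)


def beta_mean(a, b):
    """Mean of Beta(a, b)."""
    return a / (a + b)


if __name__ == "__main__":
    for name, (a, b, target) in CATEGORIES.items():
        print(f"{name:>9}: target={target:.2f} "
              f"mode={beta_mode(a, b):.3f} mean={beta_mean(a, b):.3f}")
```

The modes come out at roughly 0.10, 0.31, 0.50, 0.75 and 0.92, close to the five stated prevalence levels.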
In other words, the majority of susceptible individuals are expected not to be symptomatic. It was therefore assumed that all signals of susceptible individuals have very low prevalence. To determine the prevalences of the signals of infected individuals in the simulated viral infection scenario, we considered the prevalences observed in the daycare over the study period. Based on these observations, the following signal prevalences were assumed: fever, sneeze, and wheeze are very low, cough is high and coryza is very high. Note that the Beta distributions defined above can be used to describe the prevalences of several signals in different contexts. Since the simulation of both models involves random processes, each simulated condition was repeated 150 times so that the results could be obtained through statistical analysis.

As expected, the simulations showed that the models present a great diversity of dynamical behavior, depending on the parameter values. In some areas of the parameter space, the fuzzy and stochastic models are equivalent (for example, for small values of ϕ and ω, with fixed θ). However, there are areas of the parameter space where the models behave quite differently. In order to analyze the differences and similarities between the fuzzy and the stochastic models in more detail, a diagram was constructed by varying all parameters of the model and considering the dynamical equilibrium provided by both models. As can be seen in Fig. 1, the diagram presents areas in which the epidemic responses of the models completely agree and areas where their behavior is not similar. In fact, there are no abrupt transitions between the regions, and the frontiers between the regions in this diagram can be regarded as fuzzy boundaries.
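The simulation setup described above can be put together in a compact sketch. It is not the authors' implementation: it assumes the power form P = ϕ · DI^ω for the contact possibility, a product over infected individuals for C_t, θ used only as the initial infection probability, one infectious time step per infection, and a protection window of `immune_steps` time steps standing in for the 3-week immunity (the length of a time step is not stated in the text). All function and variable names are hypothetical.

```python
import random

# Median expert relation values quoted in the text (signal -> infectiousness).
RELATION = {"fever": 0.85, "cough": 0.85, "coryza": 0.85,
            "sneeze": 0.70, "wheeze": 0.60}
# Beta parameters of the prevalence categories used for infected individuals.
BETA = {"very low": (3, 20), "high": (13, 5), "very high": (25, 3)}
# Prevalence category assumed in the text for each signal of an infected child.
INFECTED_CATEGORY = {"fever": "very low", "sneeze": "very low",
                     "wheeze": "very low", "cough": "high",
                     "coryza": "very high"}


def infectiousness(rng):
    """Draw one infected individual's signals and summarize them by max-min."""
    degrees = []
    for signal, relation in RELATION.items():
        a, b = BETA[INFECTED_CATEGORY[signal]]
        degrees.append(min(rng.betavariate(a, b), relation))
    return max(degrees)


def simulate(n=120, theta=0.01, phi=0.03, omega=1.0,
             steps=50, immune_steps=3, seed=0):
    """Return the number of infected individuals at each time step."""
    rng = random.Random(seed)
    # Assumption: theta is the probability of being infected at time 0.
    infected = [rng.random() < theta for _ in range(n)]
    protected = [0] * n  # remaining protected steps per individual
    counts = [sum(infected)]
    for _ in range(steps):
        escape = 1.0
        for is_infected in infected:
            if is_infected:
                escape *= 1.0 - phi * infectiousness(rng) ** omega
        c_t = 1.0 - escape
        next_infected = []
        for i in range(n):
            if infected[i]:
                protected[i] = immune_steps  # start the protection window
            if protected[i] > 0:
                protected[i] -= 1
                next_infected.append(False)
            else:
                next_infected.append(rng.random() < c_t)
        infected = next_infected
        counts.append(sum(infected))
    return counts


def mean_final_count(runs=150, **kwargs):
    """Average the final infected count over independent runs (seeds 0..runs-1)."""
    finals = [simulate(seed=s, **kwargs)[-1] for s in range(runs)]
    return sum(finals) / runs
```

Calling `mean_final_count(theta=0.01, phi=0.03, omega=1.0)` averages 150 repetitions, mirroring the repetition count mentioned above. Because of the simplifying assumptions, the numbers it produces should not be expected to reproduce the figures reported in the paper.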
However, it is possible to define two crisp states: a so-called concordant area, where the systems behave very similarly at the majority of the points; and a so-called discordant area, where the systems behave quite differently at the majority of the points. The concordant area is characterized by the presence of few types of epidemic response, that is, the epidemic either does not take hold or is endemic in both models. In contrast, in the discordant area there are several concomitant epidemic behaviors (endemic, strong epidemic, etc.). Figure 1a shows that for small values of the initial proportion of infected individuals (parameter θ ≤ 0.04) there are only three regions in the diagram: (1) a concordant area for small values of the ϕ parameter (ϕ < 0.01), where the epidemic does not take hold in either model; (2) a discordant area where the fuzzy model always presents an endemic response and the stochastic model presents both no-epidemic and endemic responses; and (3) a concordant area, where both models present endemic behavior (this concordant area is maintained for values of ϕ ≥ 0.07). In these regions of the parameter space, no strong epidemic is observed. This is due to the small values of the θ parameter, which is responsible for starting the infection process in the population. As θ varies, the regions in the diagram change: a fourth region appears in the map, corresponding to a discordant area. Figure 1b shows that for θ = 0.05 this new discordant region starts at ϕ ≥ 0.8 and small ω values. Moreover, this region grows as θ increases. This is expected, since high levels of θ imply high virus circulation and, by the models' assumptions, the signals are more intense.
In addition, owing to the properties of the max-min composition of the fuzzy relations and the summary of the signals, the epidemic course tends to be stronger in the fuzzy model than in the stochastic approach. This occurs because the differences between the possibility and the probability of an infectious contact are more pronounced in this situation. Therefore, in this region, both models can result in no epidemic (particularly for high θ values, because the number of susceptible individuals is very low) or in weak, moderate, or strong epidemic behaviors, but they do not agree for the majority of parameter sets.

To evaluate the performance of the models against real data, we explored the parameter space looking for epidemic behaviors comparable to the daycare infections, considering the dynamic equilibrium of the system. For a more accurate quantitative analysis, we study the steady state of the models and compare it with the average number of infected children in the daycare. Figure 2 shows four examples that illustrate the results found for the steady state of the models, comparing them with the average real data. Fig. 2a considers the total viral infection in both the first and second semesters, for which the annual average number of infected children was 15.50 cases, and the models' performance with the parameters fixed at θ = 0.01, ϕ = 0.03 and ω = 1. For this parameter set, the fuzzy model performed better than the stochastic approach: the average numbers of infected individuals in the dynamic equilibrium of the fuzzy and stochastic models are 15.52 and 10.66, respectively. Fig. 2b considers the real data for picornavirus infection, for which the annual average was 13.15 cases, and the models' performance with the parameters fixed at θ = 0.01, ϕ = 0.02 and ω = 1.5.
Although the stochastic and fuzzy behaviors were more similar in this case than in the previous one, the fuzzy model again performed best. The fuzzy model provided an average number of infected individuals of 12.84, in contrast with the average of 14.94 given by the stochastic model. Considering the same situation pictured in Fig. 2a and exploring the parameter space for the parameter set that gives the best result for the stochastic model, we find the case shown in Fig. 2c. To illustrate how different the models' dynamic behavior can be, we present Fig. 2d. This figure compares the average number of children infected with picornavirus (13.15) with the fuzzy and stochastic dynamics. Note that while the fuzzy system reaches a non-trivial steady state, with 14.93 infected individuals, in the stochastic approach the epidemic does not take hold (for parameters θ = 0.02, ϕ = 0.02 and ω = 1.9 in both models).

From the theoretical point of view, as in the stochastic approach, the fuzzy model proposed here allows several variations in order to consider different epidemic scenarios (Menezes et al., 2004). These variations can be made in the possibility function (4) and/or in the fuzzy operators of the fuzzy relation composition (7). Also, this fuzzy model is closer to the prospective version of the stochastic Reed-Frost model. However, it is also possible to apply this fuzzy structure to retrospective studies. Furthermore, if data are available, they can be used to improve the fuzzy relational matrix provided by the experts. Some differences between the fuzzy and the stochastic structures can be pointed out. In the former, all signal information related to the possibility function can be handled through the fuzzy relational matrix, while in the latter it must be handled through the probabilistic function.
Clearly, from the interdisciplinary point of view, the fuzzy relational approach is easier to understand than the mathematical formalism of polynomial functions. In the same way, the heterogeneity of susceptible individuals can be incorporated more easily in the fuzzy structure. This can be done simply by considering a fuzzy relational matrix that relates information about immunological characteristics (such as the child's history, family and personal antecedents, breastfeeding, re-infections, etc.) to the degree of susceptibility to the infection. In this sense, the fuzzy relational matrix can supply a fuzzy measure of the individual's protection against certain infections, taking into account the identification uncertainties commonly present in a real epidemic process. In addition, the fuzzy measures for both the susceptibility and the infectiousness degree can be built from expert opinion.

However, from the simulation point of view, it is interesting to highlight that both the fuzzy and the stochastic Reed-Frost models can provide a diversity of epidemic behaviors, depending on the parameter set. As observed in the parameter space diagrams, there are regions in which they completely agree and regions where they provide quite different results. During the year in which the daycare children were followed, a total of 255 exams were carried out, and respiratory infections were diagnosed in 186 of them. However, the number of infected children per month is too small to evaluate the performance of the dynamic model more deeply. Moreover, the data are insufficient for simulating a retrospective version of the model. In other words, it is not possible to introduce into the model, via likelihood analysis for instance, the information contained in the real data and then to evaluate its predictive ability.
Analyzing the dynamic behavior of the steady state, we can see that the models were able to describe the average number of infected individuals for all virus types, considering both annual and half-yearly data. In some cases, the fuzzy approach provides better results. In addition, the situations presented in Fig. 2 agree with those shown by the parameter space analysis, in the sense that when the parameter set corresponds to a concordant area (Figs. 2a and 2c) the fuzzy and stochastic dynamic equilibria were similar. On the other hand, for the parameter sets corresponding to the discordant area (Figs. 2b and 2d) they present different results. In Fig. 2b, both present an endemic situation, but with different average values. In Fig. 2d, however, the fuzzy model yields an endemic state, while in the stochastic approach the epidemic does not take hold. This also illustrates the fact that the transitions in the parameter space occur in a fuzzy rather than a crisp way.

In addition, it is important to point out that neither model presented here is a simple generalization of the classical Reed-Frost model. In fact, these approaches allow the natural incorporation of existing differences among individuals into the models, by including signal heterogeneities. Clearly, both approaches can relax their hypotheses, such as the homogeneous mixing assumption, becoming more powerful models. In that case, however, they would lose their identification with the classical Reed-Frost model. Finally, we would like to highlight the importance of this work for the area of epidemic modeling, where the scarcity of information usually makes it unfeasible to build models that involve individual aspects (micro) in the epidemic process (macro).
Models of this type are rare in epidemiology and their analysis allows a better understanding of the factors that may contribute to the force of the infection during an epidemic.

An examination of the Reed-Frost theory of epidemics
Fuzzy modeling in population dynamics
The SI epidemiological models with a fuzzy transmission parameter
Development of three multiplex RT-PCR assays for the detection of 12 respiratory RNA viruses
An approximate threshold condition for non-autonomous system: An application to a vector-borne infection
Threshold conditions for a nonautonomous epidemic system describing the population dynamics of dengue
Fuzzy modeling in symptomatic HIV virus infected population
A non-standard family of polynomials and the final size distribution of Reed-Frost epidemic processes
A schematic age-structured compartment model of the impact of antiretroviral therapy on HIV incidence and prevalence
Some mathematical developments on the epidemic theory formulated by Reed and Frost
Assessing the efficacy of a mixed vaccination strategy against rubella in São Paulo
Fuzzy logic and measles vaccination: designing a control strategy
Fuzzy epidemiology
Forecasting versus projection models in epidemiology: The case of the SARS epidemics
Yellow fever vaccination: How much is enough?
The 1918 influenza A epidemic in the city of São Paulo
A Reed-Frost model taking into account uncertainties in the diagnostic of the infection
Fuzzy Partial Differential Equations and Relational Equations
Fuzzy dynamical systems in epidemic modelling
Fuzzy gradual rules in epidemiology
A property of linear fuzzy differential equations
An Introduction of Fuzzy Sets: Analysis and Design
The dimension of Reed-Frost epidemic models with randomized susceptibility levels
Fuzzy expert system in the prediction of neonatal resuscitation
Solutions in composite fuzzy relation equations: application to medical diagnosis in Brouwerian logic
On the fuzzy initial value problem
Seroprevalence of rubella antibodies in the State of São Paulo, Brazil, 8 years after the introduction of vaccine

This work was supported by the CNPq and FAPESP grants.