key: cord-0766158-a95hh4yk authors: Small, Michael; Tse, C.K. title: Clustering model for transmission of the SARS virus: application to epidemic control and risk assessment date: 2005-06-15 journal: Physica A DOI: 10.1016/j.physa.2005.01.009 sha: d42a95cf1d2da6101741a4a4e6d3deb571e13d93 doc_id: 766158 cord_uid: a95hh4yk

We propose a new four-state model for disease transmission and illustrate the model with data from the 2003 SARS epidemic in Hong Kong. The critical feature of this model is that the community is modelled as a small-world network of interconnected nodes. Each node is linked to a fixed number of immediate neighbors and a random number of geographically remote nodes. Transmission can only propagate between linked nodes. This model exhibits two features typical of SARS transmission: geographically localized outbreaks and "super-spreaders". Neither of these features is evident in standard susceptible-infected-removed models of disease transmission. Our analysis indicates that "super-spreaders" may occur even if the infectiousness of all infected individuals is constant. Moreover, we find that nosocomial transmission in Hong Kong directly contributed to the severity of the outbreak and that, had individual exposure time been limited to 3-5 days, the extent of the SARS epidemic would have been minimal.

The transmission of SARS in Hong Kong [1] (see Fig. 1) may be characterized by two apparently unusual features [2]: so-called super-spread events (SSE), in which a single individual initiates a large number of cases; and persistent transmission within the community. Two notable SSE were observed early in the epidemic and have been widely reported: at the Amoy Gardens housing estate and at the Prince of Wales hospital.
Epidemiological studies [3, 4] have found that in Hong Kong: the fatality rate was approximately 17% (compared to 11% globally); the mean incubation period was 6.4 days (range 2-10) [5]; the duration between onset of symptoms and hospitalization was 3-5 days; and the mean number of individuals infected by each case during the initial phase of the epidemic was 2.7 [2]. As in any epidemic, an initial exponential outbreak is evident in the SARS infection data for Hong Kong [5], but this initial explosion was soon tempered. In this paper, we present an alternative explanation for the occurrence of SSE. We show that SSE may occur, as a result of social structure, even if the etiological agent is equally infectious in all patients.

[Fig. 1 caption, partial: These are the "revised" data in the lower panel. As we are unsure of the revision process used, and we believe it to include strong assumptions about the etiological agent involved, we analyze only the reported data in this paper.]

Standard mathematical models of the spread of infectious diseases [6] are well known and widely applied. According to such models, individuals in the community can be classified as either susceptible to the disease (S), infected with the disease (I) or removed (and therefore immune to the disease) (R). Classically, transition between these three groups is governed by deterministic difference (or differential) equations, and the transmission dynamics may exhibit only exponential growth or decay subject to population limitations. For a disease such as SARS, changes in transmission dynamics are usually modelled by non-stationarity in the model parameters [7]. In this paper, we apply small world (SW) models to the simulation of the spread of SARS in Hong Kong: transmission is only allowed to occur along a limited number of direct links between individuals.
By doing this, we avoid one of the most flawed assumptions of standard susceptible-infected-removed (SIR) models: a homogeneous, fully connected population. For social networks, SW models are characterized by the property that every individual is connected to almost every other individual through a short chain of mutual acquaintances: the so-called "six degrees of separation" [8]. This effect has been observed in a large number of social networks: joint authorship of academic publications, professional acquaintances, email exchanges, spread of computer viruses [9] and even co-starring in Hollywood movies [10]. For infectious diseases that are spread through close personal contact, a similar model of propagation is likely.

Our aim is to accurately mimic the qualitative features of the SARS epidemic with the simplest (fewest parameters) model. We propose four distinct states. Individuals can be susceptible (S), prone (P), infected (I), or removed (R). These four categories are similar to, but distinct from, those of the SEIR type model of infection widely used to model diseases with a significant incubation period [11]. As is usual, the class S represents individuals that are susceptible to the infectious agent, but are not yet infected. Category R represents individuals that are permanently incapable of either transmitting or acquiring the disease: typically, this represents those that have recovered from the infection (and are assumed to be immune), those that have died, and those that are either isolated or quarantined. The I class represents individuals that are infected and infectious, whereas P represents those that are infected but not yet infectious (similar to E in the standard formulation). The transmission path (state transition graph) is depicted in Fig. 2. Infected individuals can cause susceptible individuals to whom they are linked to become prone with some probability ($p_1$ or $p_2$).
By infection we mean the transition from the susceptible to the prone state. Infected individuals can cause their immediate neighbors to become prone with probability $p_1$; long range links cause infection with probability $p_2$. Prone individuals become infected (and infectious) with probability $r_0$ and, finally, infected individuals become removed with probability $r_1$. Just as in the SIR model, we do not distinguish fatalities from recoveries: in either case the individuals are assumed to have acquired immunity. We fix the population $N$ and assume that there are no births or deaths from any other cause.

The population of $N$ nodes is arranged in a regular grid of side length $L$ ($L^2 = N$), and each node is connected directly to $n_1$ immediate neighbors. An infected individual will infect each of its $n_1$ neighbors (provided they are still susceptible) with probability $p_1$. Furthermore, each node has $n_2$ long distance links (see Fig. 2). These are links between nodes that are geographically remote from one another; infection occurs along these pathways with probability $p_2$. For each node $i$ the number $n_2^{(i)}$ is fixed, and so are the links to its $n_2^{(i)}$ remote neighbors. The number $n_2^{(i)}$ is chosen to follow a discrete exponentially decaying distribution $f_X(x) = (1/C)\,e^{-x/m}$, with parameter $m$ proportional to the expected (average) number of links to remote nodes; the parameter $C = 1/(1 - e^{-1/m})$ ensures that $f_X$ is a probability distribution function.

It is the inclusion of long distance links, with a random number of links per node, that gives rise to the network's SW structure. Moreover, the same long distance links can also cause the network to exhibit scale free (SF) properties. By SF we mean that the logarithm of the probability of a node having $n$ links decreases linearly with $\log n$ (so that very large numbers of links occur relatively frequently). Whether the induced network of actual infections follows a SF distribution remains to be tested [7] and is not the focus of this paper.
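The long-range degree distribution described above can be sampled in a few lines. The sketch below is illustrative only: it assumes the support of $f_X$ is $x = 0, 1, 2, \dots$ (which is what the stated normalization $C = 1/(1 - e^{-1/m})$ implies), and the function names `n2_pmf` and `sample_n2` are hypothetical, not taken from the paper. Under that assumption the distribution is geometric, so it can be drawn by flooring a continuous exponential variate with mean $m$.

```python
import math
import random


def n2_pmf(x: int, m: float) -> float:
    """f_X(x) = (1/C) * exp(-x/m), x = 0, 1, 2, ..., with C = 1/(1 - exp(-1/m))."""
    C = 1.0 / (1.0 - math.exp(-1.0 / m))
    return math.exp(-x / m) / C


def sample_n2(m: float, rng: random.Random) -> int:
    """Draw the number of long-range links for one node.

    Flooring an exponential variate with mean m gives P(X >= k) = exp(-k/m),
    which is exactly the tail of the discrete distribution above.
    """
    u = rng.random()  # uniform on [0, 1), so 1 - u is in (0, 1]
    return int(-m * math.log(1.0 - u))


def mean_n2(m: float) -> float:
    """Exact mean of the discrete distribution (assumed support starting at 0)."""
    return 1.0 / (math.exp(1.0 / m) - 1.0)


if __name__ == "__main__":
    rng = random.Random(0)
    draws = [sample_n2(7.0, rng) for _ in range(100_000)]
    print("empirical mean:", sum(draws) / len(draws))
    print("exact mean    :", mean_n2(7.0))
    print("largest degree:", max(draws))
```

With this support the exact mean is $1/(e^{1/m} - 1)$, which is close to but slightly below $m$; this is consistent with the text describing $m$ as proportional to, rather than equal to, the expected number of remote links.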
Finally, for each simulation we seed the model with one initial infection. We expect that computational simulations of this network will show that infection spreads locally, just as SARS spread within particular geographical regions of Hong Kong. Moreover, the system may also exhibit long-range infection, as a single individual may infect individuals in distant communities. Occasionally, individuals will infect a large number of other individuals, exactly as was observed at the start of the SARS epidemic in Hong Kong (a SSE).

As stated, our model has seven parameters: $L$, $n_1$, $m$, $p_1$, $p_2$, $r_0$, and $r_1$. To simulate the population of Hong Kong we set $L = 2700$ ($N = L^2 = 7{,}290{,}000$) (footnote 2). We arbitrarily choose $n_1 = 4$ (this choice will be partially vindicated later, but this number also slightly exceeds the average household size in Hong Kong (footnote 3)). From the available data we know that the average incubation period between infection and becoming symptomatic is 6.4 days [5]. With transition probability $r_0$, the number of days in the prone state is the result of a series of independent Bernoulli trials with mean $1/r_0$ and therefore follows a geometric distribution $f_X(x) = (1 - p)^{x-1} p$ with $p = r_0$ (see Fig. 3). Similarly, the time before hospitalization (and presumably quarantining and therefore removal) is 3-5 days. Suppose that the average is 4. In our model, the number of days prior to hospitalization also follows a geometric distribution, with mean $1/r_1$. Fig. 3 illustrates the distribution of the model parameter $n_2$ and the probability distributions of the number of days in states P and I.

Hence the only free parameters are $m$, $p_1$ and $p_2$. Without active control we also know that the average number of new infections per case is 2.7 [2]. In this state, each infectious individual will infect, on average, $n_1 p_1 + E(n_2^{(i)})\,p_2$ new individuals every day.
Since $E(n_2^{(i)}) = m$, and we suppose that the time before hospitalization is 5 days (at the upper end of the documented range), we have

$$5\,(n_1 p_1 + m p_2) \approx 2.7. \qquad (1)$$

Hence $p_1 \approx (1/n_1)(0.54 - m p_2) = 0.135 - 0.25\,m p_2$. We can, therefore, parameterize the model behavior in terms of only the SW network parameters $m$ and $p_2$, subject to $m p_2 < 0.54$.

Footnote 2: The actual permanent population of Hong Kong in 2004 was approximately 6,855,125 (see http://www.cia.gov/). The over-estimation of Hong Kong's population implied by setting $L = 2700$ is intended to account for a significant transient population. However, we have repeated the calculations reported here with smaller total populations and found no discernible distinction in the results.
Footnote 3: See http://www.ypmap.com/.

Now, $n_1$ and $n_2$ represent the number of interactions an individual has each day. Therefore, the sum $n_1 + n_2$ is a lower bound on the number of active acquaintances. The reason $n_1 + n_2$ is only a lower bound is that it counts only the links that are sufficiently intimate to support transmission of the virus. In reality, some links would be closer than others and would be far more likely to lead to transmission. We make the simplification that transmission will occur with probability $p_1$ on all the $n_1$-links and with probability $p_2$ along all the $n_2$-links. Informally, $p_1$ is the probability of transmission between members of the same household in a 24 h period, and $p_2$ is the probability of transmission between acquaintances over one day. These two probabilities are not necessarily the same. Therefore $n_2$ may be defined as the number of individuals with this transmission probability. Experimental evidence suggests that any two individuals in continental North America (in 1967) are connected by (on average) no more than six links [12]. Ignoring mutual acquaintances, and assuming a population of 200,000,000, one can deduce that each individual has approximately $\log(2 \times 10^8)/\log 6 \approx 10.667$ unique acquaintances.
The problem of mutual acquaintances complicates things slightly, but this will not have a significant effect on the result. Moreover, if we set $n_1 = 4$ (the horizontal and vertical neighbors only) then the probability of two nodes having mutual acquaintances is only $n_2/N$. Hence, in our simulations, we take $n_1 = 4$ and $E(n_2) = m = 7$. Note that the choice of $n_1$ and $n_2$ is not critical; what is more important is the infection probability $p_1 n_1 + p_2 n_2$ (footnote 4). In fact, the variation in control (such as quarantining of suspected infected individuals) can be equally modelled as a reduction in $n_1$ and $n_2$. In this model we aim to keep the parameter changes as simple as possible and achieve the same result by changing only $p_2$.

We summarize the model parameters as follows: $L = 2700$; $r_0 = 1/7.4 \approx 0.135135$; $m = 7$; and $p_1 = 0.135 - \frac{7}{4} p_2$. Note that because we allow the possibility of a P to I transition after zero days, $r_0 = 1/7.4$ rather than $1/6.4$. This does not have a significant effect on our results; it is merely a computational convenience. Finally, we have only one free parameter, $p_2$. In what follows, we will investigate the effect of time-dependent changes in $p_2$ (i.e., $p_2(t)$). Therefore we constrain the constant $p_1 = 0.135 - 0.25\,m\,p_2(0)$.

Our reasoning in the preceding section has left us with a single unknown parameter, $p_2$. We start by computing multiple simulations for various values of $p_2$ in the range $[0, 0.08]$ with $p_1$ determined by Eq. (1). Our aim is to identify parameter values $p_{1,2}$ that exhibit dynamics most like those observed in Hong Kong. We start by simulating only the uncontrolled case (i.e., both $p_1$ and $p_2$ are constant).
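The parameter arithmetic above is easy to check numerically. The following sketch only restates numbers quoted in the text (2.7 secondary cases, 5 infectious days, $n_1 = 4$, $m = 7$, $r_0 = 1/7.4$); the helper name `p1_from_p2` is hypothetical, and the zero-based reading of the geometric waiting time is an interpretation of the remark about P to I transitions after zero days.

```python
import math

n1, m = 4, 7                      # grid neighbours and long-range scale from the text
cases_per_infective = 2.7         # mean secondary infections without control [2]
days_infectious = 5               # assumed time before hospitalization

# Eq. (1): daily expected infections n1*p1 + m*p2
daily_rate = cases_per_infective / days_infectious   # 0.54


def p1_from_p2(p2: float) -> float:
    """Local transmission probability implied by Eq. (1) for a given p2."""
    if m * p2 >= daily_rate:
        raise ValueError("need m * p2 < 0.54 so that p1 stays positive")
    return (daily_rate - m * p2) / n1


# If a P -> I transition can happen after zero days, the waiting time counts
# failures before the first success, whose mean is (1 - r0) / r0.
r0 = 1 / 7.4
mean_days_prone = (1 - r0) / r0

# Acquaintance estimate quoted in the text
acquaintances = math.log(2e8) / math.log(6)

if __name__ == "__main__":
    for p2 in (0.0, 0.02, 0.04, 0.06):
        print(f"p2 = {p2:.2f}  ->  p1 = {p1_from_p2(p2):.3f}")
    print("mean days in state P:", mean_days_prone)
    print("log(2e8)/log(6):", acquaintances)
```

Evaluating `p1_from_p2(0.04)` reproduces the pair $p_2 = 0.04$, $p_1 = 0.065$, and the zero-based mean of the prone period comes out at 6.4 days, matching the reported incubation period.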
For each set of parameter values we simulate 1000 random realizations starting from a single infected individual until either: (i) the disease dies out (no I or P individuals remain); (ii) the disease becomes uncontrollable (more than 3000 I and P individuals); or (iii) the disease persists for 50 days. Fig. 4 illustrates estimates of the number of casualties and the likelihood of outbreak for $r_1 = 0.25$ and $p_2 \in [0, 0.08]$. From Fig. 4 we see that, consistent with our expectations, the scale of the outbreak increases for large $p_2$. Moreover, outbreaks are most likely when there is clustering (i.e., $p_2 > 0$). However, we also note that the total number of infections, even in the worst case, is significantly lower than in the true data. In comparison, the number of suspected cases in Hong Kong 50 days after the first reported case was 1589. From the revised data for Hong Kong, the number of identified SARS cases 50 days after the index patient was hospitalized was 1161. Hence, even with external control measures, the true case counts are significantly larger than those of these model simulations.

Footnote 4: In future work we plan to explore further the effect of variation in $n_1$ and $n_2$.

Because we assume that the time between onset of symptoms (entering state I) and isolation (entering state R) is 3-5 days, Fig. 4 represents simulations from our model under the assumption that hospitalization implies complete isolation (hospitalization occurred in Hong Kong after 3-5 days). However, Fig. 4 clearly underestimates the extent of transmission, and we are therefore forced to abandon this assumption. For the extent of transmission observed in Hong Kong, significant transmission must have occurred after hospitalization. In other words, the least reliable assumption of our model was that hospitalization represented isolation and that isolation would therefore occur after 3-5 days. Hence, the data depicted in Fig. 4 exclude the possibility of nosocomial infection.
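The simulation protocol with its three stopping rules can be sketched as follows. This is a minimal illustration under several assumptions that the text does not specify: the grid is wrapped into a torus, long-range links are directed and drawn lazily (which is equivalent to fixing them in advance for directed links), updates happen once per day in the order transmission, P to I, I to R, and the names `run_once` and `outbreak_summary` are hypothetical. Only non-susceptible nodes are stored, so the full $L = 2700$ grid costs almost no memory.

```python
import math
import random

S, P, I, R = range(4)  # susceptible, prone, infected, removed


def run_once(L, m, p1, p2, r0, r1, rng, max_days=50, max_active=3000):
    """One realization; returns (days simulated, total ever infected, outcome)."""
    N = L * L
    state = {}    # sparse map node -> state; absent nodes are susceptible
    remote = {}   # long-range links, drawn once per node and then fixed

    def grid_neighbours(i):
        x, y = divmod(i, L)
        return [((x + 1) % L) * L + y, ((x - 1) % L) * L + y,
                x * L + (y + 1) % L, x * L + (y - 1) % L]

    def remote_neighbours(i):
        if i not in remote:
            n2 = int(-m * math.log(1.0 - rng.random()))  # discrete exponential
            remote[i] = [rng.randrange(N) for _ in range(n2)]
        return remote[i]

    state[rng.randrange(N)] = I   # single initial infection
    ever_infected = 1
    for day in range(1, max_days + 1):
        infected = [i for i, s in state.items() if s == I]
        # Transmission along links: S -> P
        for i in infected:
            for links, prob in ((grid_neighbours(i), p1),
                                (remote_neighbours(i), p2)):
                for j in links:
                    if state.get(j, S) == S and rng.random() < prob:
                        state[j] = P
                        ever_infected += 1
        # P -> I, allowing a transition zero days after becoming prone
        for i in [i for i, s in state.items() if s == P]:
            if rng.random() < r0:
                state[i] = I
        # I -> R for nodes that were infectious at the start of the day
        for i in infected:
            if rng.random() < r1:
                state[i] = R
        active = sum(1 for s in state.values() if s in (P, I))
        if active == 0:
            return day, ever_infected, "died out"        # rule (i)
        if active > max_active:
            return day, ever_infected, "uncontrolled"    # rule (ii)
    return max_days, ever_infected, "persisted"          # rule (iii)


def outbreak_summary(p2, r1, runs=200, seed=0):
    """Mean total infections and outcome counts over repeated realizations."""
    rng = random.Random(seed)
    p1 = (0.54 - 7 * p2) / 4
    results = [run_once(2700, 7, p1, p2, 1 / 7.4, r1, rng) for _ in range(runs)]
    counts = {}
    for _, _, outcome in results:
        counts[outcome] = counts.get(outcome, 0) + 1
    return sum(n for _, n, _ in results) / runs, counts


if __name__ == "__main__":
    for r1 in (0.25, 0.16):
        print("r1 =", r1, outbreak_summary(p2=0.04, r1=r1))
```

Because the update order and boundary treatment are guesses, the numbers this sketch produces should not be expected to match Fig. 4 or Fig. 5 quantitatively; it is intended only to make the stopping criteria and the role of $r_1$ concrete.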
We can modify our model to include this possibility by decreasing $r_1$. In Fig. 5 we show equivalent results for $r_1 = 0.16$ (corresponding to removal after a mean of about 6 days). Fig. 5 illustrates that, with a much lower removal rate, infection levels similar to those seen in Hong Kong during 2003 are observed. Moreover, from Fig. 5 it is clear that large scale transmission occurs when models exhibit a large number of clusters. Hence quarantine, travel restrictions and closure of schools and work places are all effective control strategies for this epidemic. However, such measures need to be implemented early.

We now explore more closely the variation in $r_1$ for fixed values of $p_{1,2}$. We fix $p_2 = 0.04$ and $p_1 = 0.065$; the estimated casualty rate for various removal rates $r_1 < 0.33$ is given in Fig. 6. As should be expected, the model exhibits a sensitive dependence on $r_1$. But more significantly, widespread outbreak is likely only if individuals cannot be isolated within a few days. Furthermore, we see that the rate of transmission of the disease is inversely proportional to $r_1$ (and therefore directly proportional to the expected number of days prior to removal of an individual).

Finally, we now provide simulations of the Hong Kong epidemic by varying the control parameter $r_1$. We initiate the model with a single infected individual and a relatively low removal rate $r_1$. As the disease progresses we increase $r_1$ (after 20 and 40 days) to reflect improved governmental control measures. After 20 days the first cases of SARS were isolated in Hong Kong. After 40 days it had become clear that a serious health problem was evolving, and improved isolation measures (quarantine and school closure) were introduced. We also decrease $p_2$ to reflect the changing health practices of the community. For the sake of simplicity we introduce only a single step change in $p_2$ at the peak of the epidemic.
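The stepwise control scenario, and the inverse dependence on $r_1$ noted above, can be illustrated with a simple calculation. In the sketch below the change times for $r_1$ (days 20 and 40) come from the text, but every numerical level of $r_1$ and $p_2$, the day of the $p_2$ step, and the function names `control` and `expected_secondary` are hypothetical placeholders rather than the values used for Fig. 7. The quantity computed is a crude per-case estimate, $(n_1 p_1 + m p_2)/r_1$, which ignores the local depletion of susceptibles that the network model captures.

```python
n1, m = 4, 7
p1 = 0.065            # held constant, fixed by p2(0) = 0.04 through Eq. (1)


def control(t: int) -> tuple[float, float]:
    """Piecewise-constant (r1, p2) on day t. All levels are illustrative guesses."""
    if t < 20:
        r1 = 0.12     # hypothetical: slow removal before the first isolations
    elif t < 40:
        r1 = 0.20     # hypothetical: removal after about 5 days on average
    else:
        r1 = 0.33     # hypothetical: removal after about 3 days on average
    p2 = 0.04 if t < 45 else 0.005   # hypothetical single step at the peak
    return r1, p2


def expected_secondary(t: int) -> float:
    """Daily infections per case times the mean infectious period 1/r1."""
    r1, p2 = control(t)
    return (n1 * p1 + m * p2) / r1


if __name__ == "__main__":
    for t in (0, 30, 42, 60):
        r1, p2 = control(t)
        print(f"day {t:3d}: r1 = {r1:.2f}, p2 = {p2:.3f}, "
              f"secondary cases per case = {expected_secondary(t):.2f}")
```

With $r_1 = 0.20$ and $p_2 = 0.04$ the estimate returns 2.7, the uncontrolled figure from which Eq. (1) was derived; halving the infectious period halves the estimate, which is the proportionality described in the text.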
This corresponds to increased public awareness of appropriate hygiene measures and a corresponding drop in the transmission rate. Fig. 7 presents our results. It is apparent from Fig. 7 that the simulations and the data share many common features. The approximately exponential growth prior to 45 days and the exponential decay subsequent to this are a direct result of the selected parameters $(r_1, p_2)$. More significantly, all simulations exhibit "bursty" data consistent with localized outbreaks resulting from SSEs: apart from peaks at day 45 (or shortly thereafter), all extrema result from the stochastic nature of the model. The parameter values selected in Fig. 7 are arbitrary and rather simplistic. Obviously the true situation would include a more irregular and gradual change in parameters. However, we have found that this simple situation is sufficient to produce a very large variation in the number of casualties. Nonetheless, in all cases the parameter values of Fig. 7 provided effective control of SARS transmission after approximately 150 days. It is therefore clear that with effective control measures in place the likelihood of a significant outbreak is low.

The standard susceptible-infected-removed (SIR) model [6] assumes that at any time all susceptible (S) individuals have an equal small probability of becoming infected (I), and that any infected individual may become removed (R), whether through death, isolation, or recovery and subsequent immunity. However, by imposing a SW-SF structure on the system we simulated the effect of propagation by acquaintance: one can only become infected after contact with an infected individual [13, 14]. Furthermore, the explicit modelling of associations allows more effective simulation of control measures such as quarantining (or school closure). Finally, localized outbreaks (SSE) can be more realistically modelled with the inhomogeneous SW-SF model structure.
The implication of this is that SSE do not necessarily require highly contagious individuals, only highly connected individuals, and highly connected individuals occur naturally in any community. In other words, high rates of infection for certain individuals are not necessarily related to the action of the disease, only to their degree of social connectedness. Therefore we conclude that a highly variable infection rate is not necessarily a significant factor in the transmission of SARS.

Importantly, our simulations indicate that nosocomial transmission of SARS was a significant factor during the early part of the epidemic. Without hospital-based transmission the rate of disease transmission is substantially lower than the observed behavior. Hence, isolation of patients as soon as symptoms are identified is key to containing future outbreaks of SARS. However, our analysis is far from complete. A more systematic analysis of the effect of hospitalization (possibly including separate categories in the compartmental state model) is required [15]. Such work has already been done for the SARS outbreak in Toronto [15], and this needs to be extended to the larger population in Hong Kong.

Our simulations indicate that exponential growth of the epidemic was only prevented by improved isolation (higher $r_1$ and lower $p_2$). It is not clear whether this improved behavior was a result of changing public practice (higher $r_1$) or simply changing weather conditions (lower $p_2$). It is usual to model changing control of a disease by changing the removal rate $r_1$. In addition to this, our models suggest that improved quarantine and public health procedures could be modelled by decreasing $n_2$ (or $m$). It is clear from our investigation that $n_2 > 0$ is necessary for widespread contamination.
The physical meaning of $n_2$ is clear, and further analysis with this model may provide a more detailed indication of the efficacy of control measures (as belatedly implemented in Hong Kong during 2003) such as closure of schools and travel restriction. In a companion paper we derive analytic expressions for the expected behavior of the model [16].

Report of the severe acute respiratory syndrome expert committee
Transmission dynamics of the etiological agent of SARS in Hong Kong: impact of public health interventions
Severe acute respiratory syndrome (SARS): epidemiology diagnosis and management
Epidemiological determinants of spread of causal agent of severe acute respiratory syndrome in Hong Kong
Plausible models for propagation of the SARS virus
Six Degrees: The Science of a Connected Age
Epidemic spreading in scale-free networks
It's a small world
The small world problem
Modelling of contact tracing in social networks
Contact tracing and epidemic control in social networks
Critical role of nosocomial transmission in the Toronto SARS outbreak
Small world and scale free model of transmission of SARS