key: cord-0682137-hd079dox authors: McBryde, E. S.; Gibson, G.; Pettitt, A. N.; Zhang, Y.; Zhao, B.; McElwain, D. L. S. title: Bayesian modelling of an epidemic of severe acute respiratory syndrome date: 2006-04-08 journal: Bull Math Biol DOI: 10.1007/s11538-005-9005-4 sha: 1c25b153fbe8c449d7adcde9c039db9edbb437be doc_id: 682137 cord_uid: hd079dox This paper analyses data arising from a SARS epidemic in Shanxi province of China involving a total of 354 people infected with SARS-CoV between late February and late May 2003. Using Bayesian inference, we have estimated critical epidemiological determinants. The estimated mean incubation period was 5.3 days (95% CI 4.2–6.8 days), mean time to hospitalisation was 3.5 days (95% CI 2.8–3.6 days), mean time from symptom onset to recovery was 26 days (95% CI 25–27 days) and mean time from symptom onset to death was 21 days (95% CI 16–26 days). The reproduction ratio was estimated to be 4.8 (95% CI 2.2–8.8) in the early part of the epidemic (February and March 2003) reducing to 0.75 (95% CI 0.65–0.85) in the later part of the epidemic (April and May 2003). The infectivity of symptomatic SARS cases in hospital and in the community was estimated. Community SARS cases caused transmission to others at an estimated rate of 0.4 per infective per day during the early part of the epidemic, reducing to 0.2 in the later part of the epidemic. For hospitalised patients, the daily infectivity was approximately 0.15 early in the epidemic, but fell to 0.0006 in the later part of the epidemic. Despite the lower daily infectivity level for hospitalised patients, the long duration of the hospitalisation led to a greater number of transmissions within hospitals compared with the community in the early part of the epidemic, as estimated by this study. This study investigated the individual infectivity profile during the symptomatic period, with an estimated peak infectivity on the ninth symptomatic day. 
Keywords SARS · Bayesian · Modelling · Infectious disease · Viral transmission

Severe acute respiratory syndrome (SARS) caused a perplexing epidemic with a propensity for hospital transmission, rapid worldwide spread and markedly different epidemic curves in different countries (Wallinga and Teunis, 2004). Beginning in November 2002 in the Guangdong province of China, the SARS epidemic spread to Hong Kong, Viet Nam and Singapore by March 2003 and eventually to 29 countries around the world (Poon et al., 2004). The World Health Organisation (WHO) issued a global alert on 12 March 2003 regarding a cluster of cases of severe atypical pneumonia and 3 days later gave a case definition and name to the condition (WHO, 2003c). A novel coronavirus, named SARS-CoV, was identified as the infectious agent responsible for SARS in April 2003 (Drosten et al., 2003; Ksiazek et al., 2003; Peiris et al., 2003b). In total, 8098 SARS infections and 774 deaths were reported in the 2002/2003 epidemic of SARS (Gumel et al., 2004). The largest outbreaks occurred in mainland China, where 5327 infections and 349 deaths were reported (WHO, 2003a). Despite the initial worldwide spread and early predictions of high case numbers, the 2003 SARS epidemic was contained relatively rapidly, with no further spread reported after July 2003. SARS-CoV is likely to have an animal reservoir, possibly the palm civet cat, Paguma larvata (Guan et al., 2003; Webster, 2004), and further epidemics are anticipated. Laboratory-associated infections in Singapore, Taiwan (Orellana, 2004) and China (WHO, 2004), the latter involving onward transmission (Normille, 2004), remind us that further outbreaks of SARS could occur. To help contain future epidemics of SARS, it is essential to have an understanding of the infectivity, incubation period and likely course of the illness.
Nosocomial transmission was a prominent feature of SARS epidemiology. Early in the SARS pandemic, a majority of cases arose from hospital transmission in many places, including Toronto (Booth et al., 2003), Hong Kong (Wong et al., 2004) and Singapore. Later in the course of the epidemic, hospitals were effective sites of containment of SARS. Factors believed to be important in reducing nosocomial transmission of SARS include handwashing and wearing of masks, while contact with respiratory secretions is highly correlated with SARS transmission (Teleman et al., 2004). Thorough contact tracing and quarantine of exposed cases led to reduced transmission in Singapore. In this study, we compare the estimated infectivity of SARS cases in the community and in hospitals. We also examine how this changes over time.

Mathematical models of the SARS epidemic have the potential to give insights into the disease process, to estimate critical epidemiological determinants and ultimately to predict outcomes of public health interventions. Models of SARS transmission published to date have already been useful tools for designing control strategies and for estimating the incubation period, the infectivity (Lipsitch et al., 2003; Riley et al., 2003; Wallinga and Teunis, 2004) and the potential impact of interventions. Models have been used to predict the effect of public health measures on the SARS epidemic in many countries including Canada (Choi and Pak, 2003; Chowell et al., 2003), Hong Kong (Chowell et al., 2003; Lee et al., 2003; Riley et al., 2003), Singapore (Chowell et al., 2003; Lipsitch et al., 2003), Taiwan (Hsieh et al., 2004) and mainland China (Wang and Ruan, 2004). For transmission models to be realistic and predictive, accurate measures of the various transition times are required, including the incubation period and the time from symptom onset to removal (isolation, recovery or death).
Estimates of infectivity, particularly those based on the early behaviour of an epidemic, are sensitive to the estimate of the incubation period. Models are also sensitive to the full distribution of the transition periods (Lloyd, 2001) , such that summary measures (mean, median) alone are often inadequate in modelling the behaviour of the epidemic. In order to design effective and safe interventions, public health practitioners also need an accurate estimate of the incubation period. Decisions regarding quarantine time require estimates of the mean incubation period and the probability of outliers. Therefore, the full probability distributions of the incubation and symptomatic periods are required. The general aims of this study are to estimate accurately the full distribution of the transition times; the incubation period, time from symptom onset to hospitalisation, recovery and death, to determine the infectivity of SARS including the relative infectivity of symptomatic SARS cases in and out of hospital and early and late in the epidemic, and to estimate the individual infectivity profile over the course of SARS infection. This study makes some unique contributions to the study of SARS transmission. Firstly, it uses a Bayesian framework to infer transmission times and calculate the incubation period. In doing so, it investigates three different models of viral transmission. Secondly, it compares the infectiousness of SARS cases in the community and in hospital and during different times of the epidemic. Thirdly, this study considers three different models for individual infectivity profiles over time, using model selection criteria to determine the optimal model. The current study investigates a database from mainland China which has not been published previously. The model used in this study is an extension of the stochastic version of the compartmental Susceptible-Exposed-Infectious-Removed (SEIR) model (see Fig. 
1) used extensively in the infectious disease modelling literature (see, for example, Kermack and McKendrick, 1927). In the SEIR model, individuals in a population begin as susceptible (S) and move to the exposed (E) state following transmission of a contagion. This occurs at a rate that is proportional to the number of infectious individuals (I) and the proportion of susceptible people in the community, S/N (the mass action effect), so that in a small time interval, dt, the probability of a transmission occurring is given by

$$P(S \to E \text{ in } dt) = \beta I \frac{S}{N}\, dt,$$

where β is a constant.

Fig. 1 The schematic of the SEIR model.

In the simplest version of the SEIR model, transition between subsequent model compartments occurs at a constant rate, with individuals becoming infectious as they move into the I compartment and being neither infectious nor susceptible after being Removed (see Fig. 1). This leads to

$$P(E \to I \text{ in } dt) = \delta E\, dt, \qquad P(I \to R \text{ in } dt) = \gamma I\, dt,$$

where δ and γ are constants. The assumption of a constant transition rate in the basic SEIR model, adopted for ease of calculation, leads to an exponentially distributed time to transition. In the case of SARS, the incubation period, time to hospitalisation and time from hospital admission to discharge have been shown not to be exponentially distributed. Assuming an exponentially distributed incubation period, with a mode of zero (when, in fact, the mode of the incubation period is considerably greater than zero), leads to underestimation of infectivity inferred from the early epidemic growth curve. In the current study, we implemented an alternative parameterisation of the transition times. Following Donnelly et al. (2003), the Gamma distribution was used. Other distributions could also be utilised to approximate the incubation period, such as the Weibull distribution, used by Lipsitch et al. (2003).
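The basic constant-rate SEIR model described above can be sketched as a small stochastic simulation. This is an illustrative sketch only: it uses the Gillespie algorithm, which is our choice and not a method named in the paper, and the function name `simulate_seir` and all parameter values are hypothetical.

```python
import random


def simulate_seir(beta, delta, gamma, n_pop, i0=1, t_max=60.0, seed=1):
    """Stochastic SEIR with constant rates (Gillespie algorithm; assumed here).

    beta  : transmission constant (S -> E occurs at rate beta * I * S / N)
    delta : rate of leaving the exposed state (E -> I), per exposed person
    gamma : rate of removal (I -> R), per infectious person
    Returns a list of (time, S, E, I, R) tuples, one per event.
    """
    rng = random.Random(seed)
    s, e, i, r = n_pop - i0, 0, i0, 0
    t = 0.0
    history = [(t, s, e, i, r)]
    while t < t_max:
        rate_inf = beta * i * s / n_pop   # S -> E (mass action)
        rate_prog = delta * e             # E -> I
        rate_rem = gamma * i              # I -> R
        total = rate_inf + rate_prog + rate_rem
        if total == 0.0:                  # nobody exposed or infectious
            break
        t += rng.expovariate(total)       # waiting time to the next event
        if t > t_max:
            break
        pick = rng.random() * total       # choose which event happened
        if pick < rate_inf:
            s, e = s - 1, e + 1
        elif pick < rate_inf + rate_prog:
            e, i = e - 1, i + 1
        else:
            i, r = i - 1, r + 1
        history.append((t, s, e, i, r))
    return history


# Illustrative values only, not estimates from the study.
history = simulate_seir(beta=0.5, delta=0.2, gamma=0.1, n_pop=500)
```

Because every waiting time here is exponential, the sketch reproduces exactly the limitation discussed in the text: the time spent in E and I has a mode of zero, which is why the study replaces the exponential with Gamma-distributed transition times.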
In this study, we use Γ(α, β) notation, where α is the shape parameter and β is the reciprocal of the scale parameter, such that

$$f(x; \alpha, \beta) = \frac{\beta^{\alpha} x^{\alpha - 1} e^{-\beta x}}{\Gamma(\alpha)}, \quad x > 0,$$

and

$$E[x] = \frac{\alpha}{\beta}.$$

The current study extends the SEIR model by considering two infectious groups and two removed groups. As shown in Fig. 2, in this model the patients can either be infectious and in the community, I, or infectious and hospitalised, H. Removal can represent either recovery, R, or death, D. This model, similar to that used by Riley et al. (2003) and Lipsitch et al. (2003), will be referred to as the SEIHRD model. In addition to dividing the infectious compartments into two groups, community and hospitalised, the study also examines infectivity early and late in the epidemic. Hence, there are four infectious groups to consider: (a) early community, (b) early hospitalised, (c) late community and (d) late hospitalised.

Fig. 2 The schematic of the extended SEIHRD model used in this study. The heavy arrows represent the transitions that were observed or inferred in the current study. The thin arrows represent events that probably occur, but with a low frequency relative to other transitions and therefore are not considered in the current study.

In this study, three different models of individual infectivity profiles are considered:

Uniform transmission model: Constant infectivity within each of the four groups of patients (a)-(d), but different between groups.

Model with transmission proportional to viral load: Infectivity is modelled as a triangular distribution, with zero infectivity on days 0 and 20 and a peak at day 10, following the viral load as described by Peiris et al. (2003a). This is also influenced by the group (a)-(d) into which the patient falls.

Model with transmission given by a Gamma distribution: The infectivity profile is given by the Gamma distribution, the shape and scale parameters of which are inferred. Again this is modified by the coefficient of infectivity based on the group (a)-(d) into which the patient falls.
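To make the shape/rate parameterisation and the triangular viral-load profile concrete, the following sketch computes summary statistics for a Gamma distribution and evaluates the triangular profile. The function names are hypothetical, and the triangular profile is scaled to a peak height of 1, which is an assumption for illustration (in the study the height is set by the coefficient of infectivity of the patient's group).

```python
import math


def gamma_mean_sd(alpha, beta):
    """Mean and standard deviation of Gamma(shape=alpha, rate=beta)."""
    return alpha / beta, math.sqrt(alpha) / beta


def gamma_mode(alpha, beta):
    """Mode of Gamma(shape=alpha, rate=beta); zero when alpha <= 1."""
    return max(alpha - 1.0, 0.0) / beta


def triangular_infectivity(day, peak_day=10.0, end_day=20.0):
    """Relative infectivity: 0 at day 0 and end_day, 1 at peak_day (assumed scale)."""
    if day <= 0.0 or day >= end_day:
        return 0.0
    if day <= peak_day:
        return day / peak_day
    return (end_day - day) / (end_day - peak_day)


# The incubation period reported later in the paper is Gamma(1.4, 0.26).
mean, sd = gamma_mean_sd(1.4, 0.26)
print(f"mean = {mean:.2f} days, sd = {sd:.2f} days")
```

With shape 1.4 and rate 0.26 this gives a mean of about 5.4 days and a standard deviation of about 4.6 days, in line with the 5.3 and 4.5 days reported for the incubation period; the small differences come from rounding of the published parameters.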
This study assumes that the proportion of the population that is susceptible, S/N, remains at unity throughout the epidemic. The authors justify this by the large number of people in the region investigated in this study, with the largest city in Shanxi Province having a population of around 3 million, compared with the small number (354) of SARS cases observed in the epidemic in the region. The full description of the database used in this paper is given in the next section. Other assumptions implicit in the current model are that there is homogeneous mixing of the population and that SARS cases were only infectious during the symptomatic period. Early contact tracing studies suggest that infectivity is indeed low during the incubation period . The current study also assumes that sub-clinical SARS cases (not recorded in the database) did not contribute significantly to the epidemic. This assumption is supported by the finding of a very low seropositivity (0.2%) of SARS antibodies in people who did not have symptomatic SARS but who had close contact with SARS cases (Leung et al., 2004) . The data used in this study come from Shanxi province in China. On 23 April, the WHO travel warning to China was extended to include Beijing and Shanxi province (WHO, 2003d) . The travel warning was removed on 13 June 2003, after Figure 3 shows the daily number of hospital admissions of SARS cases in Shanxi province. It can be seen that the peak incidence of SARS cases admitted to hospital in Shanxi province was in mid-to late April 2003. Data recording the duration of exposure to another person with SARS were available in 85 cases. Exposure time, recorded by calendar day, ranged from zero to a maximum of 26 days as shown in Fig. 4 . The mean time from the day of first known exposure to the day of symptom onset (inclusive) was 8.5 days using the discrete data set. The time from the end of exposure to the symptomatic period had a mean of 2.9 days. 
This places an upper and lower limit on estimates of the mean incubation period. The time from symptom onset to hospitalisation was recorded in 351 of the 354 cases. In two cases, the recorded hospital admission day preceded the recorded time of symptom onset. This was due to quarantining of exposed individuals during the incubation period. These patients were excluded from the analysis of time from symptom onset to hospitalisation, leaving 349 available patient records. Figure 5 shows a histogram of the time from symptom onset to hospitalisation. It is an approximately exponential distribution and the majority of SARS cases reached hospital within 4 days. It is widely dispersed, however, with some people taking more than 10 days to reach hospital. There is a clear outlier among these data with one SARS case reporting 44 days of symptoms prior to hospitalisation. This is also evident on the Gantt chart, shown in Fig. A.1. It seems most likely that the date of onset of symptoms is erroneous and this case has been excluded from the remainder of the analysis. Of the 354 cases in the epidemic, 344 had a recorded outcome (recovery or death), of whom 20 died and the remainder were discharged from hospital following recovery. The time from symptom onset to recovery was available in all 324 cases and the time from symptom onset to death was available in 18 of the 20 cases. The distributions of symptom onset to recovery and symptom onset to death are shown in Figs. 6 and 7, respectively. A major challenge of the study was to estimate the distribution of the incubation period of SARS. The time of transmission of SARS is unobservable, such that estimates of the incubation period are necessarily based on inference. A Bayesian inference framework was used in this study as described in Section 5. Only a limited number of cases have recorded known symptomatic SARS contacts and these are used to infer transmission times and thereby estimate the incubation period. 
The cases with the shortest contact periods are most informative. Section 6 describes the methodology used to parameterise the distributions of time to hospitalisation, recovery and death. This is more straightforward, as the times are observed and recorded. In Section 7 we estimate the infectivity of the two compartments assumed to be infectious, the symptomatic patients in community and in hospital. This requires inference regarding missing data and transmission times. Extending the SEIR model to include the two infectious compartments allows us to estimate the relative impact of hospitalised and community SARS cases on the epidemiology. Additionally, we can compare how infectivity changed over time in each group, reflecting the effects of interventions. In this section, we also estimate the change point; the date that marked the transition from high to relatively low infectivity. Finally, this study explores individual infectivity profiles over the course of SARS illness, see Section 8. The incubation period was estimated only from those cases that had known contact with another SARS case, and when there was a single contact of known duration. In the Shanxi database, this included 85 cases. It was assumed that transmission occurred from the known contact during the contact period and that the rate of transmission, given the contact was independent of the state of the epidemic. The required times of exposure for transmission to occur for the 85 cases under consideration was assumed to be a set of independent random variables. Incubation periods of the SARS cases are also assumed to be independent. The model assumes that during periods of exposure to symptomatic SARS cases, susceptible individuals acquire the disease at a fixed daily hazard rate, λ. This constant hazard model is compared with two other models, a model assuming immediate transmission and a model in which the probability of transmission is uniform across the contact period. 
Following transmission, there is an incubation period that occurs before patients become symptomatic. This period is assumed to be drawn from a Γ(α_L, β_L) distribution. A Bayesian approach was used to estimate the incubation period:

$$\pi(\lambda, \alpha_L, \beta_L; \text{data}) \propto \pi(\lambda, \alpha_L, \beta_L)\, L(\text{data}; \lambda, \alpha_L, \beta_L),$$

where π(λ, α_L, β_L) is the prior probability of the parameters, L(data; λ, α_L, β_L) the likelihood of the data given the parameters, and π(λ, α_L, β_L; data) the posterior probability distribution of the parameters. Explanation of the choice of prior probability distributions for the parameters, use of augmented data and determination of likelihood of the data are given in this section. Details of computations are given in Appendix B.

Gamma priors were chosen for the three parameters. Vague prior distributions, Γ(0.001, 0.001), were chosen for λ, α_L and β_L because little is known about the transmission rate.

The data used for the estimation of the incubation period are the durations of exposure to another SARS case, denoted by v_i for each individual i, and the time from the first exposure to the onset of symptoms, denoted by s_i for each individual i. If N is the total number of cases, the vector of the N exposure times is denoted by v and the vector of N times to symptom onset is denoted by s. The time that each individual in the data set acquired SARS-CoV is not known. It is assumed to be during the period of exposure to another symptomatic SARS case. The time to transmission, denoted by u_i, was estimated and included in the model as an auxiliary variable. The remaining time to onset of symptoms (s_i − u_i) is the incubation period. In the data set available, all patients developed SARS, so we are considering the probability density of u_i conditional on transmission having occurred (therefore u_i < v_i).
Assuming a constant hazard of transmission throughout the contact period, the conditional probability density of u_i is a truncated exponential distribution given by

$$f(u_i; \lambda, v_i) = \frac{\lambda e^{-\lambda u_i}}{1 - e^{-\lambda v_i}}, \quad 0 < u_i < v_i.$$

The likelihood of u is also dependent on the probability density of the incubation period, s_i − u_i. The distribution, g, of the incubation period, given u_i, is determined by the Γ(α_L, β_L) distribution, so that

$$g(s_i - u_i; \alpha_L, \beta_L) = \frac{\beta_L^{\alpha_L} (s_i - u_i)^{\alpha_L - 1} e^{-\beta_L (s_i - u_i)}}{\Gamma(\alpha_L)}.$$

Assuming the observations are independent, the likelihood of the augmented data (observations plus auxiliary variables, u) is given by

$$L(s, u; \lambda, \alpha_L, \beta_L) = \prod_{i=1}^{N} f(u_i; \lambda, v_i)\, g(s_i - u_i; \alpha_L, \beta_L).$$

The likelihood of the full set of N observations is given by

$$L(s; \lambda, \alpha_L, \beta_L) = \prod_{i=1}^{N} \int_0^{v_i} f(u_i; \lambda, v_i)\, g(s_i - u_i; \alpha_L, \beta_L)\, du_i. \tag{10}$$

Because integral (10) is not straightforward to compute, a Markov chain Monte Carlo (MCMC) algorithm, given in Appendix B, was used to determine the posterior probability distributions of the parameters.

The posterior distribution of the hazard of transmission, λ, had a maximum density close to zero and a mean of 0.18 per day, see Fig. 8. The inferred mean time from exposure to transmission was 2.5 days (95% CI 0.19-4.4). The estimated incubation period is shown in Fig. 9. It follows a Γ(1.4, 0.26) distribution. The standard deviation for the incubation period was 4.5 days (95% CI 3.4-5.9 days) and the mean was 5.3 days (95% CI 4.2-6.8 days). The median is 4.2 days, shorter than that reported by Lee et al. (2003), 6 days, but similar to that reported by Donnelly et al. (2003), 3.8 days, and Meltzer (2004), 4 days. Appendix C.1 compares the sensitivity of the results for the incubation period to the value of λ and to model choice, showing that the conclusions regarding the incubation period are robust to these.

Estimation of the incubation period for SARS-CoV has proven to be a considerable challenge. Numerous studies have attempted to make estimates (see Donnelly et al., 2004 for a review).
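The augmented-data likelihood described in this section (a truncated exponential for the time to transmission and a Gamma density for the remaining incubation period) can be sketched as follows. The function names and the toy data are hypothetical; the inverse-CDF sampler is one simple way to draw the auxiliary variable u_i and is not claimed to be the scheme used in Appendix B. The sketch assumes a strictly positive exposure duration v_i.

```python
import math
import random


def log_trunc_exp_density(u, lam, v):
    """Log density of the time to transmission u, given transmission within (0, v)."""
    if not 0.0 < u < v:
        return -math.inf
    return math.log(lam) - lam * u - math.log1p(-math.exp(-lam * v))


def log_gamma_density(x, alpha, beta):
    """Log density of Gamma(shape=alpha, rate=beta) at x."""
    if x <= 0.0:
        return -math.inf
    return (alpha * math.log(beta) + (alpha - 1.0) * math.log(x)
            - beta * x - math.lgamma(alpha))


def augmented_log_likelihood(u, s, v, lam, alpha_l, beta_l):
    """Sum over cases of log f(u_i) + log g(s_i - u_i)."""
    return sum(log_trunc_exp_density(ui, lam, vi)
               + log_gamma_density(si - ui, alpha_l, beta_l)
               for ui, si, vi in zip(u, s, v))


def sample_transmission_time(lam, v, rng):
    """Draw u from the truncated exponential on [0, v) by inverting its CDF."""
    p = rng.random()
    return -math.log1p(-p * (1.0 - math.exp(-lam * v))) / lam


# Toy data (made up): exposure durations v, times from first exposure to onset s.
rng = random.Random(0)
v = [3.0, 1.0, 8.0]
s = [6.0, 4.0, 9.5]
u = [sample_transmission_time(0.18, vi, rng) for vi in v]
print(augmented_log_likelihood(u, s, v, lam=0.18, alpha_l=1.4, beta_l=0.26))
```

Working with log densities avoids underflow when the product runs over many cases, and returning negative infinity outside the support lets an MCMC step reject impossible proposals (for example u_i ≥ s_i) automatically.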
In papers in which the interval censoring methodology is outlined, a common strategy for dealing with censored data is to assume a uniform probability of transmission across the exposure period (see, for example, Donnelly et al., 2003; Meltzer, 2004). An alternative is to assume immediate transmission upon exposure to a known symptomatic SARS case (see, for example, Lee et al., 2003). The methodology used to estimate the incubation period in the current study was to assume a constant hazard of transmission within the contact period. The estimated incubation period, for a given data set, using this model would be expected to be longer than the estimates using the uniform probability model, but shorter than the estimates based on the assumption of immediate transmission. The constant hazard model has the advantage that it has a biologically plausible basis. However, because the estimated value of the hazard of transmission, λ, had a large probability mass near zero in this study, it would be reasonable to use a uniform probability density function for time to transmission as an approximation. Figure C.2 illustrates the estimated incubation period based on the two different models. There is little difference between the estimated incubation period assuming a constant hazard and that assuming a uniform probability of infection during the exposure period, and the subsequent conclusions of the model are robust to the estimates of λ. Figure C.2 also gives the expected value of the incubation period assuming instantaneous transmission at the time of contact, which is considerably longer than the estimated incubation period in the constant hazard or uniform transmission models.

Determining the incubation period following point exposure avoids the assumptions required to infer transmission times. Olsen et al. (2003) investigated cases following a 3 h in-flight exposure to a symptomatic SARS case and found an incubation period of 4 (2-8) days.
The numbers in that study were small (22 cases), and the rapid transmission may reflect a large inoculum which could impact on the incubation period. Studies using larger data sets of fully observed exposure times would be useful. A deficiency in this study is that there is only weak information on the hazard of transmission, λ, since only those known to be infected with SARS-CoV are included in the data set. This leads to the posterior probability density for λ taking on values similar to the prior probability. In future studies, more informative estimates of λ could be obtained by incorporating knowledge about those who had exposure to a SARS case but did not become infected. Alternatively, the number of contacts per infectious patient per day could be incorporated into the model. This would provide a direct relationship between the daily hazard of transmission for a single contact and the infectivity per patient per day, which is estimated from the large-scale behaviour of the epidemic (see Section 7).

A Bayesian framework was also used to estimate the other transition periods in the SEIHRD model: time from symptom onset to hospitalisation, time from hospital admission to recovery, and time from hospital admission to death. The transition periods were assumed to be drawn from Γ(α, β) distributions. The parameters of the Gamma distributions were given vague prior probability densities (π(α, β) ∼ Γ(0.001, 0.001)). All observations for transition periods were assumed to be independent. The posterior probability densities of the Gamma distribution parameters (α, β) were determined for each of the transition periods using π(α, β; z) ∝ π(α, β)L(z; α, β), where z is the vector of observations for each transition period and L(z; α, β) is the likelihood given by

$$L(z; \alpha, \beta) = \prod_{i=1}^{N} \frac{\beta^{\alpha} z_i^{\alpha - 1} e^{-\beta z_i}}{\Gamma(\alpha)},$$

where N is the number of observations. The calculations were performed using Metropolis-Hastings steps in a manner similar to that described in Appendix B.
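As a rough illustration of how Metropolis-Hastings steps can be used to sample the posterior of Gamma parameters (α, β) under vague Γ(0.001, 0.001) priors, consider the sketch below. The paper's actual sampler is described in its Appendix B and is not reproduced here; the log-scale random-walk proposal, the step size, the starting values and all function names are assumptions made for this example.

```python
import math
import random


def log_gamma_pdf(x, shape, rate):
    """Log density of Gamma(shape, rate) at x > 0."""
    return (shape * math.log(rate) + (shape - 1.0) * math.log(x)
            - rate * x - math.lgamma(shape))


def log_posterior(alpha, beta, data):
    """Unnormalised log posterior: vague Gamma priors plus Gamma likelihood."""
    if alpha <= 0.0 or beta <= 0.0:
        return -math.inf
    log_prior = log_gamma_pdf(alpha, 0.001, 0.001) + log_gamma_pdf(beta, 0.001, 0.001)
    log_lik = sum(log_gamma_pdf(z, alpha, beta) for z in data)
    return log_prior + log_lik


def metropolis_hastings(data, n_iter=4000, step=0.1, seed=0):
    """Random-walk Metropolis-Hastings on log(alpha), log(beta) (assumed proposal)."""
    rng = random.Random(seed)
    alpha, beta = 1.0, 1.0
    current = log_posterior(alpha, beta, data)
    samples = []
    for _ in range(n_iter):
        a_new = alpha * math.exp(step * rng.gauss(0.0, 1.0))
        b_new = beta * math.exp(step * rng.gauss(0.0, 1.0))
        proposed = log_posterior(a_new, b_new, data)
        # Hastings correction for the multiplicative (log-normal) proposal.
        log_accept = proposed - current + math.log(a_new * b_new / (alpha * beta))
        if rng.random() < math.exp(min(0.0, log_accept)):
            alpha, beta, current = a_new, b_new, proposed
        samples.append((alpha, beta))
    return samples
```

In use, one would pass the observed transition times (which must be strictly positive), discard an initial burn-in portion of the chain, and summarise α/β across the remaining samples to obtain a posterior for the mean duration. Proposing on the log scale keeps both parameters positive without needing explicit rejection at the boundary.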
Figure 10 gives the parameterised posterior probability distribution of the time interval from symptom onset to hospitalisation, with the recorded discrete data in the background. The distribution is approximately exponential, with a mean of 3.5 days and a median of 2.9 days. Figure 11 shows the parameterised distribution of the time from symptom onset to recovery. The mean time from symptom onset to recovery was 26 days, with a standard deviation of 11 days. Figure 12 shows the parameterised distribution of the time from symptom onset to death. The distribution is widely dispersed, with a mean of 21 days and a standard deviation of 9.4 days. Table 1 gives the means and standard deviations for the duration of each of the stages of infection. Appendix A gives the estimated values of the shape and scale parameters of the inferred Gamma distributions.

The extended SEIHRD model was used to estimate the infectivity of SARS cases. Coefficients of infectivity were defined in this study as the expected number of new transmissions per infectious case per day. The infectious group was divided into community, I, and hospitalised, H, symptomatic SARS cases. The epidemic was assumed to begin on 28 February 2003 when the first introduced SARS case became symptomatic. Following the SEIHRD model outlined in Section 2, the rate of new transmissions was assumed to be proportional to the number of infectious patients at that time and their infectivity. Two different states, community and hospitalised, and two different time periods, early and late in the epidemic, were investigated. The time of change from high to low infectivity was also estimated. The change point was considered an additional parameter, and its posterior probability was investigated.
The parameters of interest in this part of the model are the coefficients of infectivity of symptomatic community SARS cases (prior to hospitalisation) early and late in the epidemic, denoted by x_1 and x_2, respectively, and the coefficients of infectivity of hospitalised patients early and late in the epidemic, denoted by y_1 and y_2, respectively. Also of interest is the change point, denoted by C. A vague Γ(0.001, 0.001) prior was used for the four coefficients of infectivity. A discrete, uniform U[1, n] distribution was used as the prior for the change point, where n is the number of days of the epidemic.

Following the SEIHRD model and assuming constant infectivity within each of the four groups of symptomatic SARS cases, the transmission pressure, ρ_j, on day j is given by

$$\rho_j = x_i I(j) + y_i H(j),$$

where i = 1 (j < C) and i = 2 (j ≥ C), I(j) is the number of symptomatic community patients and H(j) is the number of symptomatic hospitalised patients. The likelihood of T_j transmissions occurring on day j is assumed to be drawn from the Poisson distribution:

$$P(T_j; \rho_j) = \frac{\rho_j^{T_j} e^{-\rho_j}}{T_j!}.$$

In a small-scale epidemic, if the number of susceptibles were known, the Binomial probability distribution could be used. In this epidemic, in which there are approximately 3 million susceptibles, the Poisson approximation is reasonable, although it may underestimate the dispersion of the offspring distribution, particularly if there is marked heterogeneity of spreading (for example, super-spreaders). With all data included, it is straightforward to find the full likelihood of the data given the parameters:

$$L(T, H, I; x_1, x_2, y_1, y_2, C) = \prod_{j=1}^{n} \frac{\rho_j^{T_j} e^{-\rho_j}}{T_j!},$$

where n is the number of days of the epidemic, and T, H and I represent the vectors of n values of daily transmissions, hospitalised case numbers and community case numbers, respectively. Because the times of transmission are unknown, and there are some missing values in the hospitalisation, recovery and death times, missing data and unobserved data need to be inferred.
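The Poisson likelihood with a change point described above can be sketched in a few lines. The function names and the toy daily counts are hypothetical; days are indexed from 1, with the early coefficients applying for days before the change point C and the late coefficients from day C onwards, as stated in the text.

```python
import math


def transmission_pressure(day, community, hospital, x, y, change_point):
    """rho_j = x_i * I(j) + y_i * H(j), with i = 1 before C and i = 2 from C on."""
    idx = 0 if day < change_point else 1
    return x[idx] * community + y[idx] * hospital


def poisson_log_likelihood(transmissions, community, hospital, x, y, change_point):
    """Log of the product over days of Poisson(T_j; rho_j)."""
    total = 0.0
    days = zip(transmissions, community, hospital)
    for day, (t_j, i_j, h_j) in enumerate(days, start=1):
        rho = transmission_pressure(day, i_j, h_j, x, y, change_point)
        if rho <= 0.0:
            if t_j > 0:
                return -math.inf  # transmissions with no infectious cases
            continue
        total += t_j * math.log(rho) - rho - math.lgamma(t_j + 1)
    return total


# Toy data (made up): daily transmissions, community cases, hospitalised cases.
T = [1, 0, 2, 1, 0]
I = [1, 2, 3, 2, 1]
H = [0, 1, 2, 4, 5]
x = (0.4, 0.2)        # community coefficients, early and late (rounded estimates)
y = (0.15, 0.0006)    # hospital coefficients, early and late (rounded estimates)
print(poisson_log_likelihood(T, I, H, x, y, change_point=4))
```

Evaluating this function for each candidate value of C between 1 and n, with the other parameters held at their current values, gives the unnormalised conditional posterior over the change point under the uniform prior, which is one way the change point could be updated inside an MCMC scheme.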
The simulated data are drawn from the distributions of the incubation period, time to hospitalisation and time to recovery and discharge estimated in the first part of the study. The techniques used for data augmentation and computation are given in Appendix E.

The epidemic was measured from the day of symptom onset of patient 1, which was 28 February 2003. Figure 13 shows the posterior distribution for the estimated time of the change in infectivity (change point). The maximum density is taken to be the end of day 29 of the epidemic, corresponding to the beginning of 29 March 2003. Following this, the estimates of the coefficients of infectivity were performed assuming a change point at midnight 28/29 March. Figure 13 demonstrates that there is considerable uncertainty with this estimate, with the posterior probability also giving some support to an earlier change point time. The posterior weight rapidly declines for times after 29 March, suggesting later times are unlikely.

The estimated coefficients of infectivity are given in Table 2. The relative infectivity of community compared to hospitalised SARS cases increases markedly after the change point, with x_1/y_1 = 5.1 (95% CI 0.8-17) and x_2/y_2 = 350 (95% CI 95-1400), where x_i refers to symptomatic community SARS cases (prior to hospitalisation) and y_i refers to hospitalised patients.

The basic reproduction ratio, R_0, is defined as the expected number of secondary cases per primary case in a fully susceptible population (Anderson and May, 1991; Diekmann and Heesterbeek, 2000). As the epidemic progresses, the reproduction ratio could be modified either by a decrease in the number of susceptible individuals or by a change in infectivity (for example, due to infection control interventions). In this study, we estimated the effective reproduction ratio before and after the change point. The effective reproduction ratio can be deduced from the inferred coefficients and the known data.
The mean time from symptom onset to hospital admission is 3.5 days and the mean time from hospital admission to either recovery or death is 22.2 days. The posterior probability distribution of the effective reproduction ratio can be calculated using

$$R_a = x_1 X + y_1 Y,$$

where R_a is the reproduction ratio prior to the change point, X the mean duration of symptoms prior to hospitalisation, and Y the mean duration of symptoms in hospital. Similarly, R_b, the reproduction ratio after the change point, can be calculated using

$$R_b = x_2 X + y_2 Y.$$

R_a is estimated to be 4.8 (95% CI 2.2-8.8) and R_b is estimated to be 0.75 (95% CI 0.65-0.85). The distributions for R_a and R_b are displayed in Fig. 14. The greatest impact on the reproduction ratio was the change in infectivity of the hospitalised group. During the first part of the epidemic prior to 29 March, the expected number of transmissions resulting from each symptomatic SARS case is 1.4 during the community period and 3.4 during the hospitalised period. For the SARS cases from 29 March onwards, the expected number of transmissions resulting from each symptomatic SARS case is 0.73 during the community period and 0.013 during the hospitalised symptomatic period. The ratio of infectivity in the community to infectivity following hospitalisation is 5.1, similar to estimate of 5. After 29 March, however, this figure was much higher, owing to a markedly reduced estimated infectivity in hospitalised patients.

A preliminary analysis compares three models of individual infectivity over the course of SARS-CoV infection. The first is the uniform transmission model, in which infectiousness within the four groups (community early, hospitalised early, community late, hospitalised late) is uniform over the course of illness. In the second model, transmission is proportional to viral load. In this model, infectivity takes on a triangular distribution peaking on day 10, following the results for viral load described by Peiris et al. (2003a).
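The reproduction ratio arithmetic can be checked with a short calculation. The sketch below uses the rounded daily infectivities quoted in the abstract (0.4 and 0.15 early, 0.2 and 0.0006 late) together with the mean durations of 3.5 and 22.2 days, so it only approximately reproduces the posterior means of 4.8 and 0.75 reported in the text. The function and variable names are hypothetical.

```python
def reproduction_ratio(x, y, community_days, hospital_days):
    """Expected secondary cases: community phase plus hospitalised phase."""
    return x * community_days + y * hospital_days


X_DAYS = 3.5    # mean symptomatic days before hospital admission
Y_DAYS = 22.2   # mean symptomatic days in hospital

# Rounded point estimates from the abstract, not full posterior samples.
r_early = reproduction_ratio(0.4, 0.15, X_DAYS, Y_DAYS)
r_late = reproduction_ratio(0.2, 0.0006, X_DAYS, Y_DAYS)

print(f"early: community {0.4 * X_DAYS:.2f} + hospital {0.15 * Y_DAYS:.2f} = {r_early:.2f}")
print(f"late:  community {0.2 * X_DAYS:.2f} + hospital {0.0006 * Y_DAYS:.3f} = {r_late:.2f}")
```

The early-period split (about 1.4 transmissions in the community phase and about 3.3 in hospital) shows why hospitalised patients dominated transmission before the change point despite their lower daily infectivity: the hospital phase lasts roughly six times longer.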
In the third model, in which transmission is given by a Gamma distribution, the shape and scale parameters were inferred. Using the Akaike (1974) information criterion (AIC), the model with transmission given by a Gamma distribution is superior (AIC = 320) to the uniform transmission model (AIC = 328). The model in which transmission is proportional to viral load performs the worst (AIC = 356). For the model with transmission given by a Gamma distribution, the inferred infectivity profile is shown in Fig. 15. The peak infectivity is estimated to be on the ninth day following symptom onset. This finding is based on an initial exploration of the data set, and the analysis can be extended. In particular, the infectivity profiles could inform the transmission times. In this study, as a simplification, the unobserved transmission times were inferred using the uniform transmission model only. Important conclusions regarding the infectivity of SARS-CoV can be drawn from this analysis. The estimated daily infectivity of the hospitalised patients was lower than that of community patients. Despite this, it was estimated that early in the epidemic, a larger number of secondary cases resulted from hospitalised patients because people remained in this stage for a longer time (an average of 22.2 days symptomatic in hospital compared with 3.5 days prior to hospitalisation). Later in the epidemic, the transmission rate of symptomatic community SARS cases decreased to around 50% of previous levels, whereas the transmission rate for SARS patients in hospital fell more dramatically, to around 0.4% of previous levels. These results support the conclusion that interventions were effective in controlling the SARS epidemic in Shanxi province, particularly the interventions directed at hospital isolation. However, other possible causes for these results need to be considered.
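Before turning to those alternatives, the stage-wise figures quoted above can be checked with simple arithmetic. The sketch below assumes that the expected number of transmissions in each stage is the daily infectivity multiplied by the mean time spent in that stage, and it uses only the rounded point estimates reported in this paper (0.4, 0.15, 0.2 and 0.0006 per day; 3.5 and 22.2 days). The dictionary and function names are illustrative and not part of the original analysis.

```python
# Rounded point estimates reported in this paper (per infective per day).
RATES = {
    "early": {"community": 0.4, "hospital": 0.15},
    "late": {"community": 0.2, "hospital": 0.0006},
}
# Mean days spent symptomatic in each stage.
MEAN_DAYS = {"community": 3.5, "hospital": 22.2}


def expected_transmissions(period):
    """Expected secondary cases per stage: daily rate x mean days in stage."""
    return {stage: RATES[period][stage] * MEAN_DAYS[stage] for stage in MEAN_DAYS}


def reproduction_ratio(period):
    """Sum of the community and hospital contributions for one period."""
    return sum(expected_transmissions(period).values())


def relative_change(stage):
    """Late-period daily rate as a fraction of the early-period rate."""
    return RATES["late"][stage] / RATES["early"][stage]


if __name__ == "__main__":
    for period in ("early", "late"):
        parts = expected_transmissions(period)
        print(period, {k: round(v, 3) for k, v in parts.items()},
              round(reproduction_ratio(period), 2))
    for stage in MEAN_DAYS:
        print(stage, f"{relative_change(stage):.1%} of early-period rate")
```

With these rounded inputs the early-period total is about 4.7 and the late-period total about 0.71, close to but not identical with the posterior estimates of 4.8 and 0.75, presumably because the published values were derived from the posterior distributions rather than from rounded means. The late-to-early rate ratios reproduce the 50% and 0.4% figures.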
The relatively high infectivity of the community SARS cases could be due to their earlier stage in the course of SARS-CoV infection. Much of the time spent in hospital is associated with the convalescent stages of the illness, and it could be argued that SARS patients would be less infectious during this period. On the other hand, Peiris et al. (2003a) showed that viral shedding peaks around day 10, suggesting that for many people the most infectious stage of the illness occurs following hospitalisation. The reduction of infectivity over time could be partly explained by the fact that the proportion of contacts that is susceptible decreases as an epidemic proceeds. While depletion of susceptibles undoubtedly occurs in widespread viral epidemics, the authors believe that the transmission of SARS-CoV to 354 people in a population of over 3 million would not account for a significant drop in the proportion of contacts who are susceptible, assuming homogeneous population mixing. In the hospital setting and in families, in which contacts tend to cluster, depletion of susceptibles may account for some of the change in the reproduction ratio. This could be further explored using a network or household model. Another reason for the difference in infectivity before and after 29 March could be seasonal. It is possible that SARS-CoV, like many other respiratory viruses, is transmitted more efficiently in winter. However, this would result in a general decline in infectivity, which does not explain the much greater reduction in infectivity of hospitalised SARS cases compared with community SARS cases, observed in this study. The estimated date on which the infectivity of SARS declined (the change point) predated the peak incidence of admission of SARS cases to hospital. Both the incubation period and the delay between symptom onset and hospitalisation contributed to this lag. 
It is a lesson for future epidemics that even after appropriate interventions are successful in reducing transmission, we can expect a further increase in infection notifications. The reproduction ratio late in the Shanxi epidemic is very similar to the values estimated by Wallinga and Teunis (2004) for Singapore and Hong Kong, both of which were 0.7. Wallinga and Teunis (2004) studied four countries (Singapore, Viet Nam, Hong Kong and Canada) and found that although the epidemic curves were initially markedly different, following interventions the estimated reproduction ratio was very similar in three of the four countries examined in that study. Although it is reassuring that in most cases (all except Canada) a reproduction ratio of less than 1 was achieved, it was only following the implementation of stringent control measures. It could be predicted that if complacency occurs in future epidemics, it may be difficult to achieve a reproduction ratio of less than 1 for SARS. Three different models of infectivity profiles over the course of SARS-CoV infection were considered in this study. The model considering a Gamma shape for infectivity appeared statistically slightly superior to the model assuming uniform infectivity. Of interest is that the estimated peak infectivity occurs on the ninth day following symptom onset. This is consistent with specimen positivity in the lower and upper respiratory tract and gut reported by Cheng et al. (2004). Additionally, Peiris et al. (2003a) measured nasopharyngeal aspirate viral loads of 14 SARS cases on days 5, 10 and 15 following symptom onset and found that day 10 was consistently the highest of these measurements. The concordance between viral load data and infectivity inferred in this study warrants further investigation. A larger data set in which contact times are fully observed would be useful in elucidating the infectivity profile. There are several ways in which the current model can be extended.
This study assumed Gamma distributions for transition times. Other distributions, such as the Weibull, and non-parametric approaches could also be considered. A mixture model may be particularly useful for estimating susceptibility, infectiousness and duration of infectivity. The possibility of more than one change point or a gradual transition could also be explored. Reversible jump MCMC would be a useful tool in determining this. SARS models to date, including the current study, have assumed zero infectivity during the incubation period. Infectivity of SARS cases during the incubation period could be estimated by extending the Bayesian inference model. While there were no clearly identified super-spreaders in the Shanxi epidemic, heterogeneity of infectivity was a major feature of the epidemiology of SARS in Singapore and Hong Kong. This could be further investigated using the current data set; however, a data set containing detailed information on transmission trees would be more informative.

Fig. A.1 Gantt chart of epidemic. The time of exposure to another SARS case is mid-blue, the time that a patient is asymptomatic following exposure is light blue, the time of symptoms prior to hospitalisation is yellow, the time of hospitalisation is orange and the time of discharge or death is maroon. Patients are ordered according to hospital admission date.

Computations were performed using an MCMC algorithm.
(1) Initialise the parameters $\lambda$, $\alpha_L$ and $\beta_L$.
(2) For each patient $i$, propose a new $u_i'$ by drawing $u_i'$ randomly from the distribution described in expression (7).
(3) Accept $u_i'$ using the acceptance probability.
(4) Propose $\lambda'$ using a simple random walk step such that $\lambda' = \lambda + \epsilon$, where $\epsilon$ is drawn from the N(0, 100) distribution. In this paper, we follow the Bayesian notation where N(0, 100) is used for a normal distribution with a precision of 100 and a variance of 0.01. The precision of the proposal distribution was chosen to balance the need for rapid mixing against the desire to improve the acceptance probability.
(5) Accept $\lambda'$ with a probability $P_{acc}$ given by expression (B.2).
(6) Update $\alpha_L$, proposing a new value $\alpha_L'$ using a simple random walk in which each step is drawn from a normal distribution N(0, 100). Accept $\alpha_L'$ with probability $P_{acc}$ given by expression (B.3).
(7) Update $\beta_L$ using a Gibbs step. A conjugate prior, $\pi(\beta_L) \sim \Gamma(l, m)$, is assigned to $\beta_L$, making the full conditional posterior for $\beta_L$ a Gamma distribution, which enables a Gibbs update of $\beta_L$ by drawing a value randomly from this distribution.

The "burn-in" period was 10,000 iterations. The posterior probability distributions of $u$, $\lambda$, $\alpha_L$ and $\beta_L$ were determined by taking the next 90,000 updates. Visual inspection of the trace plots showed that the chains for all parameters appeared to converge within 1000 iterations. A number of different initial values were considered for the parameters and the results were essentially unchanged. Figure C.1, for example, shows values of $\alpha_L$ plotted against iteration number for six different initial values. The plots show that the estimates of $\alpha_L$ settle down well before the end of the 10,000 iteration burn-in.

Sensitivity analysis was performed on the choice of model used to estimate the incubation period. The current study assumed that during a contact the hazard of transmission remained constant, leading to an exponential probability density function for time to transmission. Two alternative approaches would be (1) to assume that the probability of transmission was constant throughout the contact period, giving a uniform probability density for time to transmission, effectively putting $\lambda = 0$; and (2) to assume that transmission coincides with the onset of infection challenge, effectively putting $\lambda = \infty$.
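The three assumptions about the timing of transmission within a contact period can be compared with a short simulation. This is a minimal sketch, not the authors' implementation: it assumes a contact window of known length, a constant hazard `hazard` (playing the role of λ), and conditioning on transmission having occurred within the window, which gives a truncated exponential density. Expression (7) is not reproduced here, so the exact form used in the study may differ. The function names are hypothetical.

```python
import math
import random


def sample_transmission_time(contact_days, hazard, rng=random):
    """Draw the time from start of contact to transmission, given that
    transmission happened somewhere inside the contact window.

    Uses the inverse CDF of an exponential truncated to [0, contact_days].
    """
    p = rng.random()
    if hazard < 1e-12:
        # Limiting case of zero hazard: uniform over the contact window.
        return p * contact_days
    # mass = 1 - exp(-hazard * contact_days), computed stably.
    mass = -math.expm1(-hazard * contact_days)
    return -math.log1p(-p * mass) / hazard


def mean_time(contact_days, hazard, n=20000, seed=1):
    """Monte Carlo mean of the transmission time for a given hazard."""
    rng = random.Random(seed)
    total = sum(sample_transmission_time(contact_days, hazard, rng)
                for _ in range(n))
    return total / n


if __name__ == "__main__":
    # Zero hazard, a moderate hazard, and a very large hazard standing in
    # for the "transmission at first contact" extreme.
    for hazard in (0.0, 0.5, 50.0):
        print(hazard, round(mean_time(10.0, hazard), 2))
```

As the hazard approaches zero the draws spread evenly over the contact window, and as it grows large they concentrate at the start of contact, which are the two limiting models described above.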
The posterior probability density of the incubation period was estimated using these models and compared with the estimate from the current study, as summarised in Table C.1 and illustrated in Fig. C.2. In the model used in the current study (the assumption of constant hazard), the maximum posterior density for the daily hazard of transmission, $\lambda$, was close to zero. Little information was available in the data set regarding $\lambda$; therefore, $\lambda$ took on a distribution similar to its prior probability, with a large probability mass near zero and a long tail. This effectively makes the model in which a constant hazard is assumed equivalent to the model of uniform probability, in which infection is equally likely at any stage during the exposure period. Even at the extreme values of $\lambda$, the effects of the estimate of $\lambda$ on the incubation period shown in Fig. C.2 are relatively small. Therefore, the conclusions of the subsequent components of the model are robust to the choice of model for transmission and the value of $\lambda$.

There were missing values for the time of symptom onset, hospitalisation times and time to recovery (1, 2 and 10 missing values, respectively, out of the 354 SARS cases in the database). Missing data were simulated using the inferred distributions of transition times. The likelihood of the data given the parameters is $L(d \mid \theta) = \int L(d, s \mid \theta)\, ds$, where $d$ is the known data and $s$ is the simulated data. Because this integral is not straightforward, $L(d \mid \theta)$ was inferred by drawing $s$ using the known times and the parameterised distributions estimated in Section 6. For example, where recovery times were missing, these were inferred from the hospitalisation date and the parameterised time to recovery distribution. The date of each individual's acquisition of SARS-CoV also became an auxiliary variable in the model.
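The imputation of missing recovery times described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the record layout, the function name and the Gamma shape and scale values are hypothetical (the scale is chosen only so that the mean matches the 22.2 days reported in the text), and a fresh draw would be made at each MCMC iteration so that the uncertainty in the missing values is carried through to the posterior.

```python
import random


def impute_recovery(records, shape, scale, rng):
    """Return copies of the records with missing recovery days filled in.

    A missing recovery day is simulated as the admission day plus a draw
    from a Gamma(shape, scale) time-to-recovery distribution.
    """
    completed = []
    for rec in records:
        filled = dict(rec)  # leave the observed data untouched
        if filled["recovery_day"] is None:
            delay = rng.gammavariate(shape, scale)  # mean = shape * scale
            filled["recovery_day"] = filled["admission_day"] + delay
        completed.append(filled)
    return completed


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic records for illustration only (days since epidemic start).
    observed = [
        {"admission_day": 10.0, "recovery_day": None},
        {"admission_day": 12.0, "recovery_day": 30.0},
    ]
    # Hypothetical parameters: shape 4.0, scale 5.55 (mean 22.2 days).
    for iteration in range(3):
        augmented = impute_recovery(observed, 4.0, 5.55, rng)
        print(iteration, [round(r["recovery_day"], 1) for r in augmented])
```

Observed values are passed through unchanged, so only the simulated component of the augmented data varies between iterations.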
The acquisition times were inferred from (1) the known date of onset of symptoms (taken directly from the database) and (2) the parameterised incubation period, so that acquisition date = date of symptom onset − incubation period, where the incubation period was drawn randomly from the $\Gamma(\alpha_L, \beta_L)$ distribution. If the time of exposure to another SARS case was known, the proposed transmission time ($t_i$) was drawn from a distribution based on the joint probability of (a) the time to transmission, calculated using expression (7), and (b) the incubation period, with $\Gamma(\alpha_L, \beta_L)$ distribution. For each iteration of the model, the auxiliary variables were first determined using Gibbs sampling of the parameterised distributions. The likelihood of the augmented data was calculated using expression (15). Coefficients of infectivity were proposed and accepted according to:
$$P_{acc} = \min\left\{1, \frac{\left[\prod_{j=1}^{C} k(T_j, H_j, I_j, x_1', y_1)\right] p(x_1')\, \mathrm{prop}(x_1' \to x_1)}{\left[\prod_{j=1}^{C} k(T_j, H_j, I_j, x_1, y_1)\right] p(x_1)\, \mathrm{prop}(x_1 \to x_1')}\right\}, \qquad \text{(E.2)}$$
where $C$ is the date of the change point and $\mathrm{prop}(x_1 \to x_1')$ is the proposal probability of $x_1'$ from $x_1$. Similarly, $x_2$ is updated by:
$$P_{acc} = \min\left\{1, \frac{\left[\prod_{j=C+1}^{n} k(T_j, H_j, I_j, x_2', y_2)\right] p(x_2')\, \mathrm{prop}(x_2' \to x_2)}{\left[\prod_{j=C+1}^{n} k(T_j, H_j, I_j, x_2, y_2)\right] p(x_2)\, \mathrm{prop}(x_2 \to x_2')}\right\}, \qquad \text{(E.3)}$$
where $n$ is the number of days of the epidemic. Acceptance equations were similarly constructed for $y_1$ and $y_2$. The change-point day was updated as follows: (1) For each iteration, a new change-point day $C'$ was proposed, drawn as an integer from the U[1, n] distribution, where the epidemic begins on day 1 and ends on day $n$. (2) The change-point day was updated using a Metropolis step based on the full likelihood, given by:
$$P_{acc} = \min\left\{1, \frac{p(C') \prod_{j=1}^{C'} k(T_j, x_1, y_1) \prod_{j=C'+1}^{n} k(T_j, x_2, y_2)}{p(C) \prod_{j=1}^{C} k(T_j, x_1, y_1) \prod_{j=C+1}^{n} k(T_j, x_2, y_2)}\right\}.$$
The process was iterated 100,000 times and the first 10,000 iterations were used as a burn-in period.
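The change-point update can be illustrated with a small, self-contained sketch. It is not the authors' implementation: the daily likelihood contribution `log_k` is supplied by the caller, a flat prior on the change-point day is assumed so that the prior terms cancel, and the demonstration uses synthetic daily counts with a Poisson likelihood purely as a stand-in for the paper's k function. All function names are hypothetical.

```python
import math
import random
from collections import Counter


def log_likelihood(change, log_k, n_days, early, late):
    """Sum of daily log-contributions; coefficients switch after `change`."""
    total = 0.0
    for day in range(1, n_days + 1):
        x, y = early if day <= change else late
        total += log_k(day, x, y)
    return total


def update_change_point(current, log_k, n_days, early, late, rng):
    """One Metropolis update with an independent integer U[1, n] proposal.

    The proposal does not depend on the current value, so it is symmetric
    and only the likelihood ratio is needed (flat prior assumed).
    """
    proposal = rng.randint(1, n_days)
    log_ratio = (log_likelihood(proposal, log_k, n_days, early, late)
                 - log_likelihood(current, log_k, n_days, early, late))
    accept_prob = math.exp(min(0.0, log_ratio))
    return proposal if rng.random() < accept_prob else current


def demo(n_iter=5000, burn_in=500, seed=0):
    """Run the sampler on synthetic counts that drop after day 10."""
    counts = [5] * 10 + [1] * 10  # synthetic data, not from the study

    def log_k(day, x, y):
        # Stand-in likelihood: Poisson count with mean x + y.
        mean = x + y
        c = counts[day - 1]
        return c * math.log(mean) - mean - math.lgamma(c + 1)

    rng = random.Random(seed)
    change, samples = 1, []
    for it in range(n_iter):
        change = update_change_point(change, log_k, len(counts),
                                     (4.0, 1.0), (0.5, 0.5), rng)
        if it >= burn_in:
            samples.append(change)
    return Counter(samples)


if __name__ == "__main__":
    print(demo().most_common(3))
```

In the full algorithm the infectivity coefficients would be updated within the same iteration rather than held fixed as they are in this demonstration.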
The following 90,000 updates of $x_1$, $x_2$, $y_1$, $y_2$ and $C$ were used to determine the posterior distribution.

A new look at the statistical model identification
Infectious Diseases of Humans: Dynamics and Control
Clinical features and short-term outcomes of 144 patients with SARS in the greater Toronto area
Viral shedding patterns of coronavirus in patients with probable severe acute respiratory syndrome
A simple approximate mathematical model to predict the number of severe acute respiratory syndrome cases and deaths
SARS outbreaks in Ontario, Hong Kong and Singapore: The role of diagnosis and isolation as a control mechanism
Mathematical Epidemiology of Infectious Diseases: Model Building, Analysis and Interpretation
Epidemiological and genetic analysis of severe acute respiratory syndrome
Epidemiological determinants of spread of causal agent of severe acute respiratory syndrome in Hong Kong
Identification of a novel coronavirus in patients with severe acute respiratory syndrome
Isolation and characterization of viruses related to the SARS coronavirus from animals in southern China
Modelling strategies for controlling SARS outbreaks
Contributions to the mathematical theory of epidemics: part 1
A novel coronavirus associated with severe acute respiratory syndrome
A major outbreak of severe acute respiratory syndrome in Hong Kong
SARS-CoV antibody prevalence in all Hong Kong patient contacts
Predicting super spreading events during the 2003 severe acute respiratory syndrome epidemics in Hong Kong and Singapore
Laboratory-acquired severe acute respiratory syndrome
Transmission dynamics and control of severe acute respiratory syndrome
Destabilization of epidemic models with the inclusion of realistic distributions of infectious periods
Multiple contact dates and SARS incubation periods
Mounting lab accidents raise SARS fears
Transmission of the severe acute respiratory syndrome on aircraft
Laboratory-acquired SARS raises worries on biosafety
Clinical progression and viral load in a community outbreak of coronavirus-associated SARS pneumonia: A prospective study
Coronavirus as a possible cause of severe acute respiratory syndrome
The aetiology, origins, and diagnosis of severe acute respiratory syndrome
Identification of severe acute respiratory syndrome in Canada
Transmission dynamics of the etiological agent of SARS in Hong Kong: Impact of public health interventions
Factors associated with transmission of severe acute respiratory syndrome among health-care workers in Singapore
Different epidemic curves for severe acute respiratory syndrome reveal similar impacts of control measures
Simulating the SARS outbreak in Beijing with limited data
Wet markets: A continuous source of severe acute respiratory syndrome and influenza?
Summary of probable SARS cases with onset of illness from 1
Update 80-Change in travel recommendations for parts of China, situation in Toronto
Update 95-SARS: Chronology of a serial killer
WHO extends its SARS-related travel advice to Beijing and Shanxi province in China and to Toronto
Investigation into China's recent SARS outbreak yields important lessons for global public health
Cluster of SARS among medical students exposed to single patient