key: cord-0959372-2uudklob authors: Cauchemez, Simon; Boëlle, Pierre-Yves; Donnelly, Christl A.; Ferguson, Neil M; Thomas, Guy; Leung, Gabriel M.; Hedley, Anthony J; Anderson, Roy M.; Valleron, Alain-Jacques title: Real-time Estimates in Early Detection of SARS date: 2006-01-03 journal: Emerg Infect Dis DOI: 10.3201/eid1201.050593 sha: 12c156dc659a9b4c3008ce4d6d10f464d60b5b65 doc_id: 959372 cord_uid: 2uudklob
We propose a Bayesian statistical framework for estimating the reproduction number R early in an epidemic. This method allows for the yet-unrecorded secondary cases if the estimate is obtained before the epidemic has ended. We applied our approach to the severe acute respiratory syndrome (SARS) epidemic that started in February 2003 in Hong Kong. Temporal patterns of R estimated after 5, 10, and 20 days were similar. Ninety-five percent credible intervals narrowed when more data were available but stabilized after 10 days. Using simulation studies of SARS-like outbreaks, we have shown that the method may be used for early monitoring of the effect of control measures.
The reproduction number R of an epidemic (the mean number of secondary cases infected by a single infectious case) is a key parameter for the analysis of infectious diseases because it summarizes the potential transmissibility of the disease and indicates whether an epidemic is under control (R<1). Up to now, this parameter has only been estimated retrospectively for periods from which all secondary cases had been detected. In terms of policy development and evaluation during the epidemic, obtaining estimates of the temporal trends in the reproduction number relating to as recent a time as possible would be critical. If all incident cases could be traced to their index cases, estimating the reproduction number would simply be a matter of counting secondary cases. However, if tracing information is incomplete or ambiguous, modeling or statistical approaches are required.
For example, a mathematical model for disease transmission fitted to available data can provide estimates of R (1). An approach requiring fewer assumptions has been proposed by Wallinga and Teunis (2), in which the distribution of the generation interval of the disease and the epidemic curve are directly analyzed and suffice to provide estimates. For an ongoing epidemic, this method could be used to estimate the number of secondary cases infected by a primary case-patient, but only for periods from which all secondary cases would have been detected. For severe acute respiratory syndrome (SARS), the required lag would be on the order of 15 days (95th percentile of the distribution of the generation interval described by Lipsitch et al.) (3). In this report, we show how to estimate the reproduction number in an ongoing epidemic in a way that accounts for as yet unobserved secondary cases. The method is applied to data from the 2003 SARS outbreak in Hong Kong (4). Using simulated data, we demonstrate how the method may be used for early detection of the effect of control measures.
We propose a Bayesian statistical framework for real-time inference on the temporal pattern of the reproduction number of an epidemic. Here, the reproduction number R_t for day t will be defined as the mean number of secondary cases infected by a case with symptom onset at day t. Denoting n_t as the number of cases with symptom onset at day t and X_t as the number of secondary cases they infected, the reproduction number R_t is the ratio X_t/n_t, defined for n_t>0. Assume that we would like to compute the daily values R_t from day 0 to present day T, before the epidemic has ended. Although daily incident case counts can be known up to day T, provided no delay in reporting occurs, the corresponding counts of secondary cases X_t cannot. Secondary case-patients infected before day T, whose illness had a long incubation time, may have clinical onset only after day T.
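The ratio R_t = X_t/n_t and the censoring problem just described can be illustrated with a minimal sketch. It assumes a hypothetical discretized generation interval `w` (where `w[d]` is the probability that the interval is d days) and attributes each observed case to earlier onset days in proportion to `w`, in the spirit of the Wallinga and Teunis approach (2). The function names and the toy values are illustrative assumptions, and the sketch counts only secondary cases already observed by day T; it does not include the Bayesian prediction of late secondary cases.

```python
import numpy as np

def early_secondary_cases(counts, w):
    """Expected number of already-observed secondary cases, per onset day.

    counts[t] : number of cases with symptom onset on day t (t = 0..T)
    w[d]      : assumed probability that the generation interval is d days
                (hypothetical input; w[0] should be 0)
    """
    counts = np.asarray(counts, dtype=float)
    n_days = len(counts)
    x_early = np.zeros(n_days)
    for s in range(1, n_days):
        if counts[s] == 0:
            continue
        # Lag from each earlier onset day t = 0..s-1 to day s.
        lags = s - np.arange(s)
        w_lag = np.array([w[d] if d < len(w) else 0.0 for d in lags])
        # Relative likelihood that onset day t is the source of a case on day s.
        weights = counts[:s] * w_lag
        total = weights.sum()
        if total > 0:
            x_early[:s] += counts[s] * weights / total
    return x_early

def reproduction_number(counts, w):
    """Ratio X_t / n_t using observed secondary cases only (NaN where n_t = 0)."""
    counts = np.asarray(counts, dtype=float)
    x = early_secondary_cases(counts, w)
    return np.divide(x, counts, out=np.full_like(x, np.nan), where=counts > 0)

# Toy data (assumption): generation interval of 1 or 2 days, equally likely.
w = [0.0, 0.5, 0.5]
counts = [1, 0, 2, 0, 0]
r_full = reproduction_number(counts, w)       # data up to day 4
r_early = reproduction_number(counts[:2], w)  # same epidemic seen at day 1
```

Comparing `r_full[0]` with `r_early[0]` shows the underestimate that arises when secondary cases have not yet had symptom onset by day T.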
Furthermore, since the exact chain of transmission is seldom observed in practice, attributing secondary cases to previous cases is difficult. Focusing on these 2 issues, we show that the daily counts of symptom onset available until day T are sufficient to estimate R_t. A 3-step construct is necessary. We first predict the eventual number of late secondary cases (as yet unobserved), for cases reported at day t, assuming the number of early secondary cases (reported before day T) is known. The method described by Wallinga and Teunis (2) is then used to estimate the number of early secondary cases from the daily counts of symptom onsets. These 2 steps are finally combined and yield an estimate of the predictive distribution of R_t. Technical details are given in the online Appendix (available from http://www.cdc.gov/ncidod/EID/vol12no01/05-0593_app.htm). The estimation procedure depends on 3 assumptions: 1) ascertainment of patients whose symptoms appear before day T is complete, 2) transmission events are independent, and 3) the generation interval, the time from symptom onset in a primary case to symptom onset in a secondary case, has a known frequency distribution.
The method was retrospectively used to analyze the SARS outbreak in Hong Kong. The data consisted of the dates of symptom onset of the 1,755 case-patients who were detected in Hong Kong in 2003 (4).
Using simulations, we explored the ability of the method to quickly detect the effect of control measures. Five hundred epidemics were simulated with the following characteristics. During the first 20 days of the epidemics, the theoretical reproduction number was 3. Control measures were implemented at day 20. In a first scenario, control measures were completely effective (no transmission occurred after day 20). In a second scenario, the theoretical reproduction number after control measures were implemented was 0.7. Details on the simulations are available from the corresponding author.
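Because the simulation details are not given here, the following is only a rough sketch of how the 2 scenarios could be set up, not the authors' simulation code. It assumes a discrete-time renewal process in which the number of onsets on day s is Poisson distributed, with mean equal to the reproduction number in force on day s times the generation-interval-weighted count of earlier onsets. The function name, the generation interval `w`, the 60-day horizon, and the single seed case are all hypothetical choices.

```python
import numpy as np

def simulate_epidemic(r_before, r_after, w, control_day=20, n_days=60,
                      seed_cases=1, rng=None):
    """Simulate daily symptom-onset counts (assumed renewal process).

    r_before, r_after : reproduction number before / from control_day onward
    w[d]              : assumed probability that the generation interval
                        is d days (hypothetical; w[0] should be 0)
    """
    rng = np.random.default_rng() if rng is None else rng
    counts = np.zeros(n_days, dtype=int)
    counts[0] = seed_cases
    max_lag = len(w) - 1
    for s in range(1, n_days):
        r = r_before if s < control_day else r_after
        # Infectious pressure on day s from cases with earlier onset.
        pressure = sum(counts[s - d] * w[d]
                       for d in range(1, min(s, max_lag) + 1))
        counts[s] = rng.poisson(r * pressure)
    return counts

# Hypothetical generation interval: uniform over 5 to 10 days.
w = [0.0] * 5 + [1.0 / 6.0] * 6
rng = np.random.default_rng(2003)

# Scenario 1: completely effective control (no transmission from day 20).
scenario_1 = [simulate_epidemic(3.0, 0.0, w, rng=rng) for _ in range(500)]
# Scenario 2: reproduction number of 0.7 after control.
scenario_2 = [simulate_epidemic(3.0, 0.7, w, rng=rng) for _ in range(500)]
```

Each simulated series of daily onset counts can then be truncated at a chosen day T and passed to an estimator of R_t to see how quickly the change at day 20 becomes visible.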
In a simulation study, the bias and precision of the real-time estimator were investigated in situations in which the theoretical reproduction number remained constant with time. We also evaluated the effect of the length of the generation interval on the results. Detailed information can be obtained from the corresponding author.
Figure 1A shows the dates of symptom onset of the 1,755 SARS patients detected in Hong Kong in 2003. Figure 1B-F shows the expectation and 95% credible intervals of the predictive distribution of R_t based on data available at the end of the epidemic and after a lag of 2, 5, 10, and 20 days. After a lag of 2 days, the 95% credible intervals were wide and displayed an undesirable feature: they sharply decreased to 0 as soon as no cases had been observed for 2 consecutive days (Figure 1C; note especially days 1-4 and 13). After a 5-day lag, this undesirable feature had vanished (Figure 1D). With lags >5 days, the trends of expected values were relatively similar, with a peak around day 20, a decreasing trend after this date, and the expectation of R_t decreasing to <1 around day 40. These observations suggest that after a lag of only 5 days, the temporal trends in the expectation of R_t are well captured. For a lag of 5 days, the credible interval of R_t was wide when <20 cases were detected (periods 063), but was relatively narrow when more cases were detected (period 21