key: cord-0312269-kwhshiok authors: Kim, J.-H.; Lee, H.; Won, Y. S.; Son, W.-S.; Im, J. title: Rapid transmission of coronavirus disease 2019 within a religious sect in South Korea: a mathematical modeling study date: 2021-08-07 journal: nan DOI: 10.1101/2021.08.05.21261683 sha: c7f43be9313945f39d4113f88ee90d90efb15c5b doc_id: 312269 cord_uid: kwhshiok Rapid transmission of coronavirus disease 2019 (COVID-19) was observed in the Shincheonji Church of Jesus, a religious sect in South Korea. The index case was confirmed on February 18, 2020 in Daegu City, and within two weeks, 3,081 connected cases were identified. Doubling times during the initial stages of the outbreak were less than 2 days. A stochastic model fitted to the time series of confirmed cases suggests that the basic reproduction number (R0) of 18 COVID-19 was 8.5 [95% credible interval (CrI): 6.3, 10.9] among the church members, whereas R0 = 1.9 [95% CrI: 0.4, 4.4] in the rest of the population of Daegu City. The model also suggests that there were 4 [95% CrI: 2, 11] undetected cases when the first case reported symptoms on February 7. The Shincheonji Church cluster is likely to be emblematic of other outbreak-prone populations where R0 of COVID-19 is higher. Understanding and subsequently limiting the risk of transmission in such high-risk places is key to effective control. Coronavirus disease 2019 has become a global pandemic since it was first reported in 26 Wuhan, China in December 2019 with the name of novel coronavirus disease [1] . The causative agent, 27 severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), transmits mainly through human-28 to-human contact [2] , which can happen even during the infector is asymptomatic [3, 4] . Infection 29 with the virus causes diseases with varying degree of symptoms including death [5, 6] . Infection 30 mortality ratio is lowest among children aged between 5 and 9 years and increases loglinearly with 31 age [7] . 32 One key characteristic of COVID-19 pandemic is that transmission events in high-risk settings such 34 as super-spreading events (SSEs) contribute to most transmissions [8, 9] . The risk of transmission is believed to high in places with high occupancy and poor ventilation [10]. One extreme 36 example is the outbreak in the Diamond Princess cruise ship, where 17% (619/3711) of the passengers 37 were infected from January 25 to February 20, 2020 [11] . Other examples include transmission events 38 in bars and wedding [8] in Hong Kong, nursing homes in U.S. [12] , telemarketers working in group 39 in closed places [13] and fitness classes [14] in South Korea, and also religious gatherings, which we 40 describe below. 41 existing data before we fit the model. First, in the beginning of the outbreak, KDCA provided both 78 daily and cumulative numbers of cases confirmed for SARS-CoV 2, which did not agree always. If 79 there is a discrepancy between these numbers, we prioritized cumulative numbers as this figure was 80 reported continuously throughout the outbreak. Second, data were missing for some days for the 81 number cases for Shincheonji members. We imputed missing values using the cubic spline method 82 ( Figure S1 in the Supplementary Material). 83 84 The epidemic doubling time ( ) represents the duration in which the cumulative incidence doubles. 86 Assuming exponential growth with a constant epidemic growth rate ( ), the epidemic doubling time 87 can be calculated by the following equation [18, 25, 26] ln (2) . (1) Epidemic growth rate ( ) may be estimated based on the data. where ( ) indicates the cumulative number of infected people at time and is the duration over 94 which ( ) is assumed to be constant. ( ) can be calculated over the fixed time interval (e.g., 1 day 95 or 1 week) [27, 28] or variable time intervals (e.g., days on which the number of cases doubles, 96 quadruples, etc.) [19, 25] . We calculated doubling based on prior 7 days or 1 day from 18 February 97 . CC-BY-NC-ND 4.0 International license It is made available under a perpetuity. is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint The copyright holder for this this version posted August 7, 2021. ; https://doi.org/10.1101 https://doi.org/10. /2021 to 5 March 2020, when the epidemic peaked and no further doubling of cumulative number of cases 98 occurred onward. 99 100 The basic reproduction number ( 0 ) is defined as the average number of secondary cases caused by 101 a single infected case in an entirely susceptible population and it provides sufficient information to 102 produce doubling times in the beginning of an outbreak. However, estimating 0 requires additional 103 information such as generation time or developing a mechanistic model, and its estimates come with 104 higher degree of uncertainty [26] . Calculating doubling times requires fewer assumptions and also 105 allows us to compare our results with estimates from different settings where doubling times, but not 106 reproduction numbers, are available. 107 We developed a stochastic model of COVID-19 transmission within the Shincheonji community and 109 the overall population of Daegu City. The model includes six disease states: susceptible ( ), exposed 110 but not infectious ( ), pre-symptomatic but infectious ( ), symptomatic and infectious ( ), 111 asymptomatic but infectious ( ), confirmed and isolated ( ), and recovered ( ). The model includes 112 two patches to model Shincheonji and non-Shincheonji people, separately. Transmission rates may 113 differ for each patch and person from one patch may infect people from the other patch. (Figure 2 ). 114 115 This modeling framework of mixing between two distinct sub-populations has been adopted in 116 previous works, ranging from sexually transmitted diseases [29] to vector-borne diseases such as 117 dengue [30] , where formulations for mixing between patches vary. We adopted the formulation used 118 in the work on modeling transmission of cholera between hotspot and non-hotspot areas [31] . Mixing 119 between two sub-populations are defined by the 2 × 2 contact matrix, 120 = ( 12 12 21 22 ), where indicates the fraction of time that individuals from patch i spends in patch j. Next, we 122 impose two conditions on the matrix : 123 (i) Individuals must reside in either of the two patches, i.e., 1 + 2 = 1 for =1 124 (Shincheonji) and 2 (non-Shincheonji). 125 (ii) The population in each patch remains constant, i.e., 12 1 = 21 2 , where 1 and 126 2 represent population size for patch 1 and 2, respectively. 127 The above conditions may transform the contact matrix to the following form: The force of infection for individuals from patch i at time , ( ), is defined as follows: 134 is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint The copyright holder for this this version posted August 7, 2021. ; https://doi.org/10.1101 https://doi.org/10. /2021 The transitions between states are modeled using an explicit tau-leap algorithm [32] to account for 138 stochasticity of the infection transmission process. The number of susceptible people in patch i at 139 time + Δ , ( + Δ ), is written as follows: 140 ( , + Δ ) represents the number of people who transit from state to state from to + Δ 142 in patch and is a random variable with binomial distribution: 143 That is, it is represented as an integer varying between 0 and ( ). For states from which more than 144 one potential transition exist (e.g., to either or ), multinomial distributions were applied. For 145 instance, the number of people transit from to either or are given as follows: 146 where is a vector given as 147 148 149 The first element of indicates a probability of transition from to and the second element 150 indicates the probability of transition from to . The number of people in other states (i.e., 151 , , , , ) at time t can be described similarly. The model was implemented in a combination of R 152 Bin( ( ), Δ ( )). (5) Multi( ( ), Δ ), ( 1 − 1⁄ − 1⁄ , 1⁄ − 1⁄ ) . . CC-BY-NC-ND 4.0 International license It is made available under a perpetuity. is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint The copyright holder for this this version posted August 7, 2021. ; https://doi.org/10.1101 https://doi.org/10. /2021 and C++ languages, in which the core transmission model part is expensive and was written in C++. 153 All the computer codes that generate the results in this paper are available on GitHub [33] . 154 155 To account for intensification of the intervention such as case isolation and contact tracing with 157 subsequent testing during the outbreak, we assumed case isolation rate (1 / mean time between 158 symptom onset and case isolation) and transmission rate of the infectious people per unit time change 159 over time. Specifically, we assumed that the case isolation rate, α( ), starts increasing on February 160 20 from the initial value of init when 4,474 out of 9,334 Shincheonji members were identified and 161 were asked to self-isolate. During model fitting, we let data suggest the duration of intervention, 162 in day, which is the time required for the case isolation rate to reach its minimum, ( ) = final for 163 > February 20 + . We assumed that the mean time between symptom onset and case isolation 164 linearly decreases over the intervention period . In other words, α( ) is formulated as follows: 165 166 167 where final is assumed to be 1 day based on the experiences in Busan City in Korea and α( ) is 168 assumed to be zero before February . CC-BY-NC-ND 4.0 International license It is made available under a perpetuity. is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint The copyright holder for this this version posted August 7, 2021. ; https://doi.org/10.1101 https://doi.org/10. /2021 Here, β init and β final indicate the transmission rate per unit time before the intervention and after the 174 intervention measures fully take effect, respectively. They can be derived once 0, and final are 175 given as: 176 Our model of COVID-19 transmission requires 15 parameters (Table 1) . We divided the model 178 parameters into three classes depending on our belief on their relative certainty. The first class 179 includes parameters related to the natural history of infection and population size and we deemed that 180 available parameter estimates are reliable. For these parameters, we used their point estimates based 181 on analyses of data on COVID-19 transmissions in Korea or China. For the second class, which 182 includes parameters related to intervention programs, we used our best guesses based on supporting 183 evidence but still acknowledged their uncertainty. Therefore, we analyzed the models under various 184 assumptions on their values within some pre-specified ranges. Finally, we defined six parameters that 185 are critical for characterizing dynamics of COVID-19 transmission in Shincheonji members and non-186 Shincheonji people. We estimated these parameters by fitting the model to daily confirmed COVID-187 19 cases of Shincheonji members and non-Shincheonji people. 188 and . CC-BY-NC-ND 4.0 International license It is made available under a perpetuity. is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint The copyright holder for this this version posted August 7, 2021. ; https://doi.org/10.1101 https://doi.org/10. /2021 Estimation of parameters = ( 0,1 , 0,2 , 0 , 12 , , final ) was based on Approximate Bayesian 190 Computation Sequential Monte Carlo (ABC-SMC) [34] . The ABC is a method for approximating 191 posterior distributions given data , ( | ), by accepting proposed parameter values when the 192 difference between simulated data * and , ( , * ), is smaller than tolerance ϵ: 193 For our model, ( , * ) is defined as the sum of the squared differences in daily confirmed cases 195 over the outbreak of duration T days, that is, 196 , 197 for Shincheonji ( = 1) and non-Shincheonji ( = 2). Here, and * represent observed daily 198 confirmed cases and model predicted values at time day , respectively. ABC-SMC was designed to 199 increase efficiency of the ABC method and ABC is applied in a sequential manner by constructing 200 intermediate distributions, which converge to the posterior distribution. Tolerance ϵ is gradually 201 decreased and each intermediate distribution is obtained as a sample that is drawn with weights from 202 the previous distribution and then perturbed through a kernel ( | * ). The kernel helps keep the 203 algorithm from being stuck in local optimum while maintaining the efficiency of the ABC-SMC 204 method. Minimally informative uniform distributions were used as prior distributions and estimation 205 procedure was repeated for ten different random seeds. The resulting distribution was summarized as 206 median, 50% credible intervals (CrI; interval between 25% and 75% percentiles) and 95% CrI 207 (interval between 2.5% and 97.5% percentiles). More details of the algorithm such as prior 208 distribution for each parameter, the number of steps, the tolerance values for each step, perturbation 209 kernel appear in the Supplementary Material. 210 . CC-BY-NC-ND 4.0 International license It is made available under a perpetuity. is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint The copyright holder for this this version posted August 7, 2021. ; https://doi.org/10.1101/2021.08.05.21261683 doi: medRxiv preprint Over the period of February 18 -March 5, during which doubling of confirmed cases occurred 12 213 times, doubling times were <1 day in the beginning and increased subsequently with daily doubling 214 time presenting higher variability for both Shincheonji and non-Shincheonji values ( Table 2) . 215 Doubling times calculated over sliding one-week intervals remained shorter than 3 days for the most 216 part for both Sincheonji and non-Shincheonji population. 217 is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint The copyright holder for this this version posted August 7, 2021. ; https://doi.org/10.1101/2021.08.05.21261683 doi: medRxiv preprint Rapid transmission of COVID-19 within the Shincheonji community is likely to have been facilitated 238 by high intensity contact between individuals gathering during services and in residential areas. Our 239 mathematical modeling analyses quantify the rapid spread of COVID-19 in Daegu City driven by a 240 community of Shincheonji members. The median 0 among Shincheonji members ( 0,1 ) was 8.5, 241 which is over 4-fold higher than what was estimated for the rest of the population in Daegu City 242 ( 0,2 =1.9). While the 0 in the Shincheonji community is higher than estimates from most 243 transmission hotspots (e.g., in China [35] [36] [37] [38] [39] and Korea [40] [41] [42] [43] ), such high 0 is not unusual in 244 particular considering that 0 can be different depending on the local settings with varying contact 245 rates [44] . Studies do report that 0 estimates of COVID-19 that are comparable or even higher than 246 our estimates for the Shincheonji community. During periods of intensive social contacts near the 247 Chinese New Year in China, 0 was estimated to be 6 [45, 46] . Also, 0 estimates were around 5 248 among those traveled from Wuhan and were subsequently confirmed in other countries [47] , and 249 around 7 during the initial growth phase in the UK [48] . In an extreme setting such as the Diamond 250 Princess ship, much higher estimates ( 0 = 14.8) were reported [49] . Although the previous studies 251 that included data on the outbreak in Shincheonji community report smaller 0 estimates [40, 42] 252 than our estimates, the difference might stem from that prior studies did not model the Shincheonji 253 community separately from the rest of the population and therefore measured the 0 averaged across 254 sub-population that are highly heterogeneous. is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint The copyright holder for this this version posted August 7, 2021. ; https://doi.org/10.1101 https://doi.org/10. /2021 Although estimated daily doubling times show some variability (e.g., 14 days on February 28 is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint The copyright holder for this this version posted August 7, 2021. ; https://doi.org/10.1101 https://doi.org/10. /2021 The relationship between the doubling time and the 0 provides two insights on our inferences on 277 0 . For an model, there exists an algebraic formula that describes the inverse relationship 278 between initial epidemic growth rate and 0 [51, 52] . This inverse relationship suggests that short 279 doubling times during the early phase of the epidemic we calculated using the growth rates are 280 consistent with high 0 for Shincheonji we estimated using the stochastic dynamic transmission 281 model ( Table S2 in there are some quantitative differences in our parameter estimates in response to the change in our 296 assumptions on fixed parameters, 0,1 was always over 4-fold higher than 0,2 . 297 . CC-BY-NC-ND 4.0 International license It is made available under a perpetuity. is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint The copyright holder for this this version posted August 7, 2021. ; https://doi.org/10.1101/2021.08.05.21261683 doi: medRxiv preprint large clusters. Such social settings as bars, restaurants, weddings, and religious sites appeared at 318 increased risk of large outbreaks [8] . 319 320 One limitation of our analyses is that the model was fit to date of case confirmation because the date 321 of symptom onset, which is more closely related with the date of infection, was not available. The 322 daily number of confirmed cases can abruptly change depending on the intensity of intervention 323 measures, of which the dynamics may not be consistent with disease transmission process. This means 324 using the data on case confirmation under dynamics intervention measures is challenging. We tried 325 to mitigate this difficulty by incorporating the dynamics of intervention programs by assuming that 326 the start date and duration of enhanced case detection vary while the case detection rate increases 327 over time and let the data suggest the values for those parameters. 328 329 The potential for large variations in 0 for COVID-19 has important implications for the design and 331 effectiveness of control strategies. The efficacy of components of intervention programs, such as 332 contact tracing and physical distancing, is dependent on various environmental and societal factors 333 (e.g., large gatherings, physical proximity, high risk behaviors such as singing, etc.) that influence the 334 transmissibility of disease. Our analyses provide important insights that in order to minimize the risk 335 of sudden outbreaks, efforts to identify and preempt high transmission scenarios will be key to 336 controlling the spread of the COVID-19. Understanding and subsequently limiting the risk of 337 . CC-BY-NC-ND 4.0 International license It is made available under a perpetuity. is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint The copyright holder for this this version posted August 7, 2021. ; https://doi.org/10.1101 https://doi.org/10. /2021 is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint The copyright holder for this this version posted August 7, 2021. ; https://doi.org/10.1101 https://doi.org/10. /2021 is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint The copyright holder for this this version posted August 7, 2021. ; https://doi.org/10.1101 https://doi.org/10. /2021 is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint The copyright holder for this this version posted August 7, 2021. ; https://doi.org/10.1101 https://doi.org/10. /2021 is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint The copyright holder for this this version posted August 7, 2021. ; https://doi.org/10.1101 https://doi.org/10. /2021 is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint The copyright holder for this this version posted August 7, 2021. ; . CC-BY-NC-ND 4.0 International license It is made available under a perpetuity. is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint The copyright holder for this this version posted August 7, 2021 is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint The copyright holder for this this version posted August 7, 2021 Figure S1 shows the cumulative cases before and after imputation. We used the cubic spline method provided in the imputeTS package of R. The model comprises two sets of above equations that describe two patches (i.e., a community of Shincheonji members and the non-Shincheonji people in Daegu City) and these equations are linked through the force of infection function, ( ), which is defined in the main text. For the differential equation-based model, the initial (i.e., the entire population is susceptible) epidemic growth rate * can be given as − , (Ma, 2020) where and represent transmission rate and recovery rate, respectively, as we defined in our model. Similarly, for a differential equation-based model * is given as below (Ma, 2020; Ma et al., 2014) : where represent the rate at which the exposed individuals become infectious (i.e., 1 = mean latent period) as we defined in the main text. The above equation gives and therefore 0 for given * , , . Assuming * = , which is the growth rate we calculated in the main text, we can see to what value of 0 the doubling times we calculated in the main text are translated and qualitatively see if 0 estimates from the current study are reasonable. Table S2 . Basic reproduction number, 0 , calculated by assuming the empirical daily or weekly growth rate is the same as the * calculated for the SEIR model. 1.9 6.3 23.9 2020-02-26 3.9 14.7 13.0 14.7 2020-02-27 5.9 14.7 9.6 11.6 2020-02-28 6.6 1.5 7.5 8.8 2020-02-29 6.6 17.0 6.3 10.5 2020-03-01 5.9 1.1 5.6 6.6 2020-03-02 2.7 5.1 4.5 7.5 2020-03-03 1.9 5.6 4.5 7.5 Figure S3 . Sensitivity of our parameter estimates to simplifying assumption of three selected parameters 1) Fraction of asymptomatic infection 2) Relative rate ρ of isolation of asymptomatic patients compared to the symptomatic patients before the intervention. A familial cluster of pneumonia associated with the 2019 novel coronavirus indicating person Doubling Time of the COVID-19 Epidemic by Province, China Shadowy Church Is at Center of Coronavirus Outbreak in South Korea Reproduction number (R) and growth rate (r) of the COVID-19 epidemic in the UK Approximate accelerated stochastic simulation of chemically reacting systems Approximate Bayesian Computation for infectious disease modelling The Estimate of the Basic Reproduction Number for Novel China, from 2019 to 2020: A data-driven analysis in the early phase of the outbreak Pattern of early human-to-human transmission of Wuhan The Novel Coronavirus, 2019-nCoV, is Highly Contagious and More Infectious Than Initially 47 Analysis of the epidemic growth of the early 2019-nCoV outbreak using 4 If > 1, sample * from the previous generation { −1 } with weights { −1 } and perturb the particle to obtain Generate data sets * * from the model using * * and calculate Estimating epidemic exponential growth rate and basic reproduction number Estimating initial epidemic growth rates 2 The model is implemented using a tau-leap algorithm. The number of people who transit from state x to state y over the time interval from t to t +Δt in patch i, ( , + Δ ) , is defined as follows:where Bin(n, p) represents binomial and with parameters n and p, respectively.( , + Δ ) = Bin( ( ), Δ ),where Multi(n, π ={ π1, π2}) indicates multinomial distribution with parameters n and π. π is given as follows:The first element of indicates a probability for x = I (i.e., transition from P to I) and the second element indicates the probability of transition from P to A (x = A).The first element of indicates a probability for x = C (i.e., transition from I to C) and the second element indicates the probability of transition from I to R (x = R).( , + Δ ) = Mult( ( ), ),The first element of indicates a probability for x = C (i.e., transition from A to C) and the second element indicates the probability of transition from A to R (x = R).(1)The number of people in each state at time t + Δt can be described using the terms defined above: and dividing 20 equidistance pieces. Finally, final two value were manually adjusted to provide good fit between data and the model predictions through trial and error. Below are actual values used but were rounded for presentation.1.. {2140, 2035, 1928, 1823, 1717, 1611, 1505, 1399, 1293, 1187,1081, 975, 869, 764, 658, 552, 446,340, 290, 250}, 1.. 2 = {1325, 1260, 1194, 1129, 1063, 998, 932, 867, 801, 736, 670, 604, 539, 473, 408, 342, 277, 211, 190, 180}.