key: cord-307322-h7vqmlq9 authors: Gongalsky, Maxim B title: Early detection of superspreaders by mass group pool testing can mitigate COVID-19 pandemic date: 2020-04-27 journal: nan DOI: 10.1101/2020.04.22.20076166 sha: doc_id: 307322 cord_uid: h7vqmlq9

Background: Most epidemiological models applied to COVID-19 do not consider heterogeneity in infectiousness or the impact of superspreaders, despite the broad distribution of viral load among COVID-19 positive people (1-1 000 000 per mL). Mass group testing is also not used, despite the existing shortage of tests. I propose a new strategy for early detection of superspreaders with a reasonable number of RT-PCR tests, which can dramatically mitigate the development of the COVID-19 pandemic and even turn it endemic.

Methods: I used a stochastic social-epidemiological SEIAR model, where S - susceptible, E - exposed, I - infectious, A - admitted (confirmed COVID-19 positive people who are admitted to hospital or completely isolated), R - recovered. The model was applied to the real COVID-19 dynamics in London, Moscow and New York City.

Findings: Viral load data measured by RT-PCR were fitted by a broad log-normal distribution, which implies the high importance of superspreaders. The proposed full-scale model of a metropolis shows that the top 10% of spreaders (viral load at least 100 times higher than that of the median infector) transmit 45% of new cases. Rapid isolation of superspreaders leads to a 4-8 fold mitigation of the pandemic, depending on the applied quarantine strength and the number of currently infected people. High viral load allows efficient group matrix pool testing of the population focused on detection of the superspreaders, which requires a remarkably small number of tests.

Interpretation: The model and the new testing strategy may prevent thousands or millions of COVID-19 deaths while requiring only about 5000 daily RT-PCR tests for a large city of 12 million people such as Moscow. Although applied here to the COVID-19 pandemic, the results are universal and can be used for other heterogeneous infectious epidemics.
Funding: No funding.

Computer simulations based on various mathematical models are widely used to predict the evolution of pandemics and to help choose appropriate governmental interventions (Ref. 20). The models can be stochastic or deterministic, and most of them are based on the SEIR approach (S - susceptible, E - exposed, I - infectious, R - recovered) or its modifications. Simulations can give estimates of $R_0$ (Ref. 21), predict the probability of the pandemic spreading to new geographical areas (Ref. 22), or calculate the influence of the delay in isolating infectious people on the spreading rate (Ref. 23). However, most of the proposed models do not take into account the heterogeneity of infectiousness and the existence of superspreaders, with rare exceptions (Ref. 24). Thus, the aim of the present article is to describe a possible group pool testing strategy that can detect superspreaders at early stages with a reasonable number of RT-PCR tests, and to demonstrate the efficiency of the strategy by Monte Carlo simulations of an SEIR-derived model applied to London, Moscow and New York City as examples.

Results and discussion

1. Compartments of the model

I used a stochastic SEIAR compartment model (A stands for admitted, see below) solved by Monte Carlo simulations. The model emulates the behavior of n people in a city. The structure of the model was inspired by Moscow social life; however, the results are also applicable to other cities, counties and societies. The program code was written in Python and is provided in the Suppl. Info. In the SEIAR model each citizen belongs to one of five groups (see Figure 1A):
• S - susceptible. A healthy person.
• E - exposed. A person who is already infected but does not have symptoms yet. Such a person can transmit the infection, although with a smaller probability than an infected person with symptoms.
• I - infectious. A person with symptoms and the full probability of transmitting the infection.
• A - admitted. A person with symptoms who sought medical help.
The model assumes that a person who tested positive for COVID-19 was admitted to hospital or quarantined at home and stopped transmitting the infection.
• R - recovered. Includes both recovered and deceased. Recovered people are assumed to be immune. None of them influence the other compartments of the model.

The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC 4.0 International license. This version posted April 27, 2020.

Figure 1. A) Sequence of model compartments. S - susceptible (healthy), E - exposed (partially infectious, before onset of symptoms), I - infectious (fully infectious, after onset), A - admitted (admission to hospital or complete isolation), R - recovered (including deceased). Average periods are shown. B) Fraction of cases with symptom onset as a function of time after infection (duration of the incubation period). Blue dots: data from Ref. 25. Red curve: best fit by the cumulative gamma distribution function. C) Distribution of periods between onset and admission. Blue bars: data from Ref. 26. Red curve: best-fit Gaussian. D) Recovery time distribution. Blue bars: data from Ref. 27. Red curve: best-fit Gaussian.

Please note that about 17% of patients are asymptomatic according to the investigation on the Diamond Princess liner (Ref. 28). The model does not take them into account, for simplicity and because the infectiousness of asymptomatic people is unknown and may be negligible. However, the existence of asymptomatic infectors would only support the main conclusions of the article. Each citizen has his own values of the incubation period (from infection to onset of symptoms), the period between onset and admission, and the recovery period (see Figure 1; all curve fitting was done in MagicPlot software). All values follow standard distributions fitted to the respective experimental data:
• Incubation period.
From exposure to onset of symptoms (E → I), Figure 1B. Experimental data (blue circles) are from Ref. 25. The red curve is the best-fit cumulative gamma distribution with shape parameter k = 4 and scale parameter t = 1.3. The average incubation period is 5 days.
• Period from onset to admission (I → A), Figure 1C. Experimental data (blue bars) are from Ref. 26. The red curve is the best-fit Gaussian distribution with mean m = 2.12 and standard deviation s = 3.42.
• Recovery time (E → R), Figure 1D. Experimental data (blue bars) are from Ref. 27. The red curve is the best-fit Gaussian distribution with mean m = 19 and standard deviation s = 9. Note that the recovery time was counted from the start of the infection, not from admission, so recovery may occur before onset or admission. The average recovery time is 19 days; therefore the average period between admission and recovery is 12 days, as shown in Figure 1A.

The model is stochastic; therefore it does not use the concept of serial interval (Ref. 29), which was used in many deterministic models. The serial interval is the average period between the infection of a person and his first transmission. It is difficult to measure from common clinical data.

The whole SEIAR population changes in discrete daily steps following the sequence presented in Figure 1A until the infected population (IP = exposed + infectious + admitted) becomes 0. The initially infected people are chosen randomly so that there are 10-50 infected people in a city.

Figure 2. Pattern of daily contacts. Social groups: retired (men with a cane), able-bodied (featureless men), children and students (girls with pigtails).
Locations: house (red), trains (2 each way, yellow), office (yellow), grocery (dark red), school (green). Pictures under each location show possible attendants; numbers show the sizes of groups in the location. Arrows show possible daily movements. Blue test tubes mark locations where mass testing can be applied. Percentages in the corner show the relative abundance of social groups in the city population.

A schematic view of daily contacts is shown in Figure 2. The whole population of the city is split into 3 groups:
• Retired - 24%. Shown as men with a cane.
• Adult able-bodied - 56%. Shown as plain men.
• Children and students - 20%. Shown as girls with pigtails.

The division was made in accordance with Moscow demographic statistics. Each picture represents a location of contacts, the number of contacts used in the model and the participating social groups. The model assumes that exactly 3 people live in each house (shown in red in the top left corner); members of all 3 groups can live together, and the composition is random. There are 3 routes available every day:
• To the office (shown in yellow on the right of Figure 2), connected by two subway trains each way, which is typical for Moscow. The 5 people in each train represent the citizens who are located relatively close to each other and may transmit the infection; usually there are more people in a single coach, but they are scattered along the coach.
• To the grocery (shown in dark red in the bottom left corner of Figure 2). Attendance by children is assumed to be negligible. Groceries may also represent pharmacies or other shops.
• To school or university (shown in green below the house). Teachers are not taken into account.
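The population structure described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the program from the Suppl. Info: the function name `build_population`, the group labels and the seed are hypothetical, and each citizen's group is drawn independently with the quoted shares.

```python
import random

# Group shares quoted in the text (Moscow demographic statistics).
GROUP_SHARES = {"retired": 0.24, "adult": 0.56, "student": 0.20}
HOUSE_SIZE = 3  # the model assumes exactly 3 people per house


def build_population(n_people, seed=0):
    """Assign each citizen a social group and a fixed household.

    Hypothetical helper, not the supplementary code of the paper.
    n_people should be a multiple of HOUSE_SIZE so every house is full.
    """
    rng = random.Random(seed)
    names = list(GROUP_SHARES)
    weights = [GROUP_SHARES[g] for g in names]
    # Social group of every citizen, drawn independently with the quoted shares.
    groups = rng.choices(names, weights=weights, k=n_people)
    # Shuffle citizen ids so that the composition of each house is random.
    ids = list(range(n_people))
    rng.shuffle(ids)
    houses = [ids[i:i + HOUSE_SIZE] for i in range(0, n_people, HOUSE_SIZE)]
    return groups, houses


groups, houses = build_population(3000)
adult_share = groups.count("adult") / len(groups)
print(len(houses), round(adult_share, 2))
```

Households are built once and reused every simulated day, which matches the fixed housemate contacts of the model; the random daily groups in trains and groceries would need a fresh shuffle on each day.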
People in the same houses (housemates), offices (co-workers) and schools or universities (classmates) are assigned randomly once and do not change from one day to another. In contrast, the people in each train and grocery are completely random each time. This represents the real situation, in which people have both regular and accidental contacts. The color of the contact places shows their relative susceptibility to quarantine (red: not affected by quarantine; dark red and yellow: moderate susceptibility; green: high susceptibility). Mass public events are out of the scope of the model, because they are assumed to be banned already.

The relative probabilities of infection per day, $P_{pl}$, were proportional to the duration of presence in a particular place and tuned to balance the model so that all routes of transmission are used. The values of $P_{pl}$ were: 3 for house, 1 for office, 0.1 for each coach in the subway, 0.5 for school and 0.05 for grocery. The absolute probability of a contagion in a place, $P_i$, depended on $P_{pl}$, the total contagiousness of all people in the place, $\sum P_j$, and the relative virulence of SARS-CoV-2, $P_{vir}$. The model assumes that the contagiousness of a person, $P_j$, is proportional to the amount of virus exhaled by him per minute, which is in turn proportional to the concentration of virus in sputum or pharyngeal mucus, $C_{vir}$. The latter can be estimated by RT-PCR tests of sputum, throat or nasal swabs, i.e.

$$C_{vir} = C_{et} \cdot 2^{Dct},$$

where Dct is the differential cycle threshold between the specimen and the etalon (reference) sample and $C_{et}$ is the concentration of RNA in the etalon sample.

Figure 3. A) Distribution of the RT-PCR relative cycle threshold (Dct). Blue bars: data from Ref. 12. Red curve: best-fit Gaussian.
B) Randomly generated log-normal viral load distribution (blue bars). The red area corresponds to superspreaders. Different superspreader thresholds are shown as black bars with the corresponding prevalence.

Figure 3A shows the RT-PCR cycle threshold distribution for COVID-19 positive patients as blue bars (Ref. 12). A difference in Dct of 24 means that the viral load of COVID-19 patients may vary by a factor of $2^{24} \approx 16$ million. The error of Dct is about 3 (Ref. 11). The red curve shows the Gaussian fit of Dct, which corresponds to a log-normal distribution of viral load. The best fit gave m = 2 and s = 3.6. These values were used to generate the distribution of simulated viral load, proportional to the contagiousness of citizens, presented in Figure 3B. The value of $C_0$ corresponds to the median viral load. The distribution gives the fraction of superspreaders for a chosen superspreader threshold, St: 10% for the x100 threshold (superspreaders are defined as people who are at least 100 times more contagious than the median infector), 5.6% for x300 and 2.8% for x1000, as shown by the red area in Figure 3B. The data from Ref. 12 contain only 75 tests, but similar log-normal dependences were obtained for SARS (Ref. 15), for 778 tests from the pandemic H1N1 influenza outbreak (Ref. 30) and even for virus concentration on fomites (Ref. 16) (see Figures S1-S3 in Suppl. Info for details).

The model takes quarantine into account. The efficiency of quarantine for different locations and social groups is shown in Figure 2 as a color legend from red to green. Contagions in houses are not affected by quarantine. Schools and universities are closed in any quarantine. Offices, trains and groceries are partially affected by quarantine. The model assumes that the number of workers or customers is reduced by the quarantine factors $Q_{of}$ and $Q_{gr}$, respectively, once the quarantine begins. Office quarantine affects a certain subgroup of able-bodied citizens, who start to work remotely and do not commute to the office. Other workers (e.g. policemen)
are not affected by quarantine. Grocery quarantine affects all able-bodied and retired people: they shop less often and sometimes prefer delivery services, so their average grocery attendance is reduced, but there is no division into stable subgroups. The Q factors can be changed several times during the evolution of the pandemic. I used the following relationship:

Mass testing is another option of the program; it can be applied to all users of public transport and groceries, as marked by the blue test tubes in Figure 2. The screening detects only superspreaders above the given threshold and puts them in complete isolation (programmed as admission) once the test results are obtained (2 days). Mass testing can be switched on on a particular day.

Typical curves for all 5 SEIAR compartments are shown in Figure 4A. The sum of E, I and A was used as the infected population (IP). The simulation of pandemic dynamics in London, Moscow and New York City (NYC) is shown in Figure 4B. Arrows point to quarantine interventions with the corresponding Q factors. IP doubling times are shown above the simulated curves for all cities.

Figure 5. Simulated COVID-19 pandemic curves for London, Moscow and New York City without mass testing (blue) and with mass testing at different superspreader thresholds St: 100 (green), 300 (yellow) and 1000 (orange). Official data are shown as black circles. Pie charts show the prevalence of superspreaders in different St ranges and their impact on transmission.
Bar charts show total and peak infected population values for all cities and testing options.

Figure 5 shows the development of COVID-19 in the three cities without testing (blue curves) and with testing and isolation of superspreaders at St = 100 (green), 300 (yellow) and 1000 (orange). Public data on confirmed cases are shown as black dots. Hospital bed and intensive care unit (ICU) capacities are shown as horizontal dashed lines (see Suppl. Info for details). Rapid isolation of superspreaders leads to substantial mitigation of the pandemic in terms of both total and peak IP (shown in the bar charts in Figure 5; see the numerical data for Figure 5 in Suppl. Info). For cities with strong quarantine (London and NYC), this testing strategy with St = 100 strongly reduces the total IP (a 4.5-4.8 fold decrease is predicted), while for Moscow, with weak quarantine, the reduction of the peak IP is more prominent (8-fold).

The choice of St is very important for the practical implementation of the mass testing strategy, because it is a trade-off between difficulty and efficiency. A higher St corresponds to a lower number of daily required RT-PCR tests, but it fails to detect superspreaders below St. The prevalence of superspreaders between different St values and their corresponding contribution to COVID-19 transmission are shown in the pie charts in Figure 5. Note that, regardless of the St value, throat and/or nasal swabs must be taken from all people attending offices and groceries, which is also a challenge for metropolises. A low St is less important for London, which has both a relatively low initial IP and strong quarantine: the difference between St = 100 and St = 1000 is 10% for peak IP and 50% for total IP. The difference in total IP grows with the initial IP and reaches 160% for NYC, while the low Q in Moscow gives a 300% difference in peak IP.
This last reduction is crucially important, because the predicted 2 million COVID-19 patients in Moscow far exceed its healthcare capacity, which can result in about 0.5 million excess deaths. Nevertheless, the prognosis shows a catastrophic scenario for Moscow; therefore strong quarantine such as the one used in London or NYC is highly recommended regardless of whether the new testing strategy is applied.

Mitigation based on isolation of superspreaders requires a mass testing strategy to detect them. The simplest strategy is one test per person, but it requires about 2 million tests daily, which is clearly impossible. However, the daily test requirement can be reduced by at least a factor of 500 if a smart matrix group pool testing scheme is used (see Figure 6). Note that this scheme is very efficient for detection of superspreaders, but it misses all other infected people. The exact number of tests depends on the superspreader threshold St. If the viral load of a median spreader is assumed to be 10 times higher than the sensitivity of the RT-PCR test (which is true for the most widely used test systems), then the test will give a positive result for a superspreader mixed into a pool of 10·St = 1000 specimens for St = 100, and into even larger pools for higher thresholds. Matrix testing places each specimen in 2 pooled tests, i.e. one row pool and one column pool. With pools of 1000 specimens, 2 million specimens therefore need about 4000 pooled tests, which is 500 times fewer than individual testing. All specimens at the intersections of positive rows and columns (highlighted in blue) can then be tested separately without mixing. This provides an extremely low false positive ratio, because the viral load is huge in positive tests. In the case of low prevalence, the additional number of separate tests will be much lower than 500. Thus, this strategy allows superspreaders to be detected with about 5000 tests daily for megacities such as Moscow, New York City and London.

Thus, the proposed stochastic SEIAR model of the COVID-19 pandemic demonstrated the crucial importance of superspreaders, i.e. people whose SARS-CoV-2 viral load exceeds the median value at least 100-fold.
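The matrix pooling arithmetic described above can be sketched in Python as follows. This is a toy illustration under stated assumptions rather than the author's procedure: equal volumes of every specimen are assumed to be mixed, so a pool shows the mean viral load; a test is modelled as a simple threshold `detection_limit`; and the function names `pooled_test_count` and `matrix_pool_screen` as well as the example loads are hypothetical.

```python
import math


def pooled_test_count(n_specimens, pool_size):
    """Row pools plus column pools when each specimen enters two pools."""
    return 2 * math.ceil(n_specimens / pool_size)


def matrix_pool_screen(loads, side, detection_limit):
    """Screen side*side specimens arranged in a square matrix.

    loads: viral loads in arbitrary units, listed row by row.
    Returns (indices of specimens confirmed positive, tests used).
    """
    if len(loads) != side * side:
        raise ValueError("loads must contain side*side values")
    grid = [loads[r * side:(r + 1) * side] for r in range(side)]
    # Assumption: equal volumes are mixed, so a pool shows the mean load.
    row_pos = [r for r in range(side)
               if sum(grid[r]) / side >= detection_limit]
    col_pos = [c for c in range(side)
               if sum(grid[r][c] for r in range(side)) / side >= detection_limit]
    tests = 2 * side  # one test per row pool and one per column pool
    confirmed = []
    # Specimens at intersections of positive rows and columns are retested alone.
    for r in row_pos:
        for c in col_pos:
            tests += 1
            if grid[r][c] >= detection_limit:
                confirmed.append(r * side + c)
    return confirmed, tests


# Toy example: 100 specimens, one ordinary case (load 1)
# and one superspreader (load 500).
loads = [0] * 100
loads[7] = 1
loads[42] = 500
confirmed, tests = matrix_pool_screen(loads, side=10, detection_limit=10)
print(confirmed, tests)
print(pooled_test_count(2_000_000, 1000))
```

In the toy example the dilution hides the ordinary case while the superspreader still makes its row and column pools positive, so only one extra individual test is needed. The second call reproduces the estimate of pooled tests for 2 million daily specimens in pools of 1000.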
Superspreaders, with 10% prevalence among infected people, transmit 45% of COVID-19 cases; therefore their rapid isolation can significantly mitigate the pandemic and save thousands of people, as shown for London, Moscow and New York City. The isolation can be performed via the mass matrix group pool testing strategy applied to all people attending offices and groceries. This strategy requires a reasonable number of RT-PCR tests, about 5000 per day. The obtained results can also be applied to other cities and countries, and used not only for the COVID-19 pandemic but also for other infectious diseases with high heterogeneity of spreaders.

Estimating clinical severity of COVID-19 from the transmission dynamics in Wuhan, China
Effectiveness of Reverse Transcription-PCR, Virus Isolation, and Enzyme-Linked Immunosorbent Assay for Diagnosis of Influenza A Virus Infection in Different Age Groups
Double etched porous silicon nanowire arrays for impedance sensing of influenza viruses
COVID-19 pandemic. The charts of our World in Data
Evaluation of COVID-19 RT-qPCR test in multi-sample pools. Infectious Diseases (except HIV/AIDS)
Evaluation of Group Testing for SARS-CoV-2 RNA. Infectious Diseases (except HIV/AIDS)
Superspreading and the effect of individual variation on disease emergence
South Korea infects nearly 40 people with coronavirus
Dimensions of superspreading
Autonomous Targeting of Infectious Superspreaders Using Engineered Transmissible Therapies
Quantitative Detection and Viral Load Analysis of SARS-CoV-2 in Infected Patients
Viral dynamics in mild and severe cases of COVID-19
Detection of SARS-CoV-2 in Different Types of Clinical Specimens
Respiratory virus shedding in exhaled breath and efficacy of face masks
Viral Load Distribution in SARS Outbreak
Transmission of Influenza A in a Student Office Based on Realistic Person-to-Person Contact and Surface Touch Behaviour
The logarithm in biology
Lognormal Distribution of Epiphytic Bacterial Populations on Leaf Surfaces
Estimating the number of infections and the impact of non-pharmaceutical interventions on COVID-19 in 11 European countries
The reproductive number of COVID-19 is higher compared to SARS coronavirus
Early dynamics of transmission and control of COVID-19: a mathematical modelling study
Feasibility of controlling COVID-19 outbreaks by isolation of cases and contacts
Effects of superspreaders in spread of epidemic
The Incubation Period of Coronavirus Disease 2019 (COVID-19) From Publicly Reported Confirmed Cases: Estimation and Application
Epidemiological determinants of spread of causal agent of severe acute respiratory syndrome in Hong Kong
Estimates of the severity of COVID-19 disease
Estimating the asymptomatic proportion of coronavirus disease 2019 (COVID-19) cases on board the Diamond Princess cruise ship
Serial interval of novel coronavirus (COVID-19) infections
Pandemic H1N1 and seasonal H3N2 influenza infection in the human population show different distributions of viral loads, which substantially affect the performance of rapid influenza tests