key: cord-0731275-tr6hgu41
authors: Srinivasa Rao, Arni S.R.; Krantz, Steven G.; Bonsall, Michael B.; Kurien, Thomas; Byrareddy, Siddappa N.; Swanson, David A.; Bhat, Ramesh; Sudhakar, Kurapati
title: How relevant is the basic reproductive number computed during the coronavirus disease 2019 (COVID-19) pandemic, especially during lockdowns?
date: 2020-12-14
journal: Infection control and hospital epidemiology
DOI: 10.1017/ice.2020.1376
sha: 49f6ca07a7a12433359dae0741668505c1fcb2e6
doc_id: 731275
cord_uid: tr6hgu41

Basic reproductive numbers during COVID-19, estimated either through standard approaches or through time-varying approaches, are key to understanding pandemic growth. However, improper usage and interpretation, together with computational difficulties during lockdowns, could mislead planning and mitigation.

To the Editor: The basic reproductive number $R_0$ in epidemiology is defined as the average number of secondary infections likely to be produced by a primary infected person in a predominantly susceptible population. Mathematically, it is an accurate measure of disease spread. [1] However, the value of $R_0$ is difficult to estimate from epidemiological data, for example, during the ongoing coronavirus disease 2019 (COVID-19) pandemic. In recent studies on COVID-19, [2-4] for example, a time-varying $R_0$ has been computed, which researchers called $R_t$. They ascertained that the decline in $R_t$ is due to continued lockdowns and nonpharmaceutical interventions. Although the conclusions in those studies are supported by the data, estimates of $R_t$ raise methodological issues that require further consideration. Here, we convey the essential and technical difficulties in estimating either $R_0$ or $R_t$ from the data, and we discuss how a model-based $R_0$ may not adequately capture the actual spread of the disease.
Although these limitations are generally unavoidable (even after defining appropriate error structures and statistical modeling), the inappropriate use of this metric, especially in the ongoing COVID-19 pandemic, has important implications for infectious disease mitigation planning.

Suppose that $Y_0$ is the number of infected people at time $t_0$ who could generate secondary infections between $t_0$ and $t_1$, say, $Y_1$. However, the testing of all the potentially infected individuals during this period need not be complete. $Y_1$ could generate further secondary infections between $t_1$ and $t_2$, say, $Y_2$, and so on. Again, the testing of the samples through contact tracing need not be complete (Fig. 1). That is, $Y_{i+1}$ at $t_{i+1}$ could be generated by $Y_i$ at $t_i$ for $i = 0, 1, \ldots$. In reality, during most epidemics, and especially during the COVID-19 pandemic, only a fraction of $Y_i$, say, $Y'_i$, is ever reported (and diagnosed, given incomplete testing), such that $Y'_i < Y_i$. [5, 6] This partial reporting (including partial diagnosis and partial testing) could also be due to lockdowns and lack of proper knowledge regarding COVID-19 (forced or natural behavior changes in the community, e.g., lockdowns and use of masks).

The average number of secondary infections generated by $Y_i$ individuals is $Y_{i+1}/Y_i$. If there is variation among the infected people or a rapid aggregation of infected people, then the geometric mean is more appropriate than the arithmetic mean for determining expected reproductive numbers. Not only is the former far better suited than the latter to deal both with fluctuations and with numbers that are not independent of one another, it is also the only correct mean when the results are presented as ratios. [7-9]

Suppose that $Y_{i+k}$ is the number of infected people at time $t_{i+k}$ when lockdowns are introduced at $k$ for $k = 0, 1, 2, \ldots$.
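As a minimal sketch of why the choice of mean matters for ratios such as $Y_{i+1}/Y_i$, the Python snippet below compares the arithmetic and geometric means of successive growth ratios and shows how a changing reported fraction can shift the estimate. All counts, reporting fractions, and function names (`growth_ratios`, `arithmetic_mean`, `geometric_mean`) are hypothetical illustrations; none of them come from the letter or its data.

```python
import math


def growth_ratios(counts):
    """Successive ratios counts[j+1] / counts[j] for a sequence of case counts."""
    return [later / earlier for earlier, later in zip(counts, counts[1:])]


def arithmetic_mean(values):
    return sum(values) / len(values)


def geometric_mean(values):
    # exp(mean of logs) is the n-th root of the product of the values
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Hypothetical true counts over five time points (assumption, for illustration only)
true_counts = [100, 300, 450, 540, 594]

# Hypothetical reported fractions: one constant over time, one rising over time
constant_fraction = [0.5, 0.5, 0.5, 0.5, 0.5]
rising_fraction = [0.2, 0.3, 0.4, 0.5, 0.6]

reported_constant = [y * p for y, p in zip(true_counts, constant_fraction)]
reported_rising = [y * p for y, p in zip(true_counts, rising_fraction)]

true_ratios = growth_ratios(true_counts)
true_am = arithmetic_mean(true_ratios)
true_gm = geometric_mean(true_ratios)
gm_constant = geometric_mean(growth_ratios(reported_constant))
gm_rising = geometric_mean(growth_ratios(reported_rising))

print(f"arithmetic mean of true ratios: {true_am:.3f}")
print(f"geometric mean of true ratios:  {true_gm:.3f}")
print(f"geometric mean, constant reporting fraction: {gm_constant:.3f}")
print(f"geometric mean, rising reporting fraction:   {gm_rising:.3f}")
```

In this illustration the geometric mean of the four ratios equals the fourth root of the last count divided by the first, because the product of successive ratios telescopes. A constant reported fraction cancels in every ratio and leaves the geometric mean unchanged, whereas a fraction that rises over time inflates the apparent growth.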
Assume that [equation not recovered]. The percentage of growth in the number of infected people during the 4 time intervals $(t_{i+k}, t_{i+k+1})$ for $k = 0, 1, 2, 3, 4$, are, say, [symbol not recovered]$_{i+k}$% for $k = 0, 1, 2, 3, 4$, respectively. These growth percentages are computed as [equation not recovered].

The secondary infections caused by an infected individual (Fig. 1) are the people who were not traced by the system. This step assumes that all of the infected people who were identified by the system were either quarantined or otherwise prevented from spreading the virus further. Only a proportion of infected people who were tested and identified during lockdowns was reported, and others were either not diagnosed or not reported. Asymptomatic individuals could be anywhere in the process; that is, they were part of the identified and reported group or were among those who had not been contact traced or diagnosed. The geometric mean number of secondary infections is appropriate because we are considering proportionate secondary infections. Hence, the mean number of secondary infections during $(t_i, t_{i+4})$ is given by [equation not recovered]. Similarly, if the trend in eq. (1) continues for $k = 0, 1, \ldots, n$, then the mean number of secondary infections during the lockdown period $(t_i, t_{i+n})$ is given by [equation not recovered].

This point applies to several studies in which reporting is not constant over the study period. Even if the testing numbers and testing patterns are constant over a period, the proportion of underreported cases may not be constant. Thus, the estimation of $R_0$ is likely to be highly variable in any given situation. For the practical purposes of computing $R_0$ or $R_t$, we usually have data on $Y'_i$, the number tested. When the ratios $Y_{i+k+1}/Y_{i+k}$ for $k = 0, 1, \ldots, n$ are considered, then the geometric mean of these growth rates would be [equation not recovered]. However, $\hat{R}_0$ or $\hat{R}_t$ (the estimated basic and time-varying reproductive numbers at the start of or ongoing through an epidemic, respectively) may not be at all close to $R_0$ or $R_t$, even if the $Y_i$ values are generated from a mathematical model for a period $i > 0$ that uses data on susceptible, exposed, infected, and recovered individuals and in which the underlying epidemiological processes are time varying. This factor will introduce bias into estimates of model-based basic reproductive rates and time-varying reproductive rates.

Some other limitations in various studies arise from computing $R_t$ after lockdowns were relaxed. Possibly, heterogeneity exists in the data that could have masked $R_t$ measures due to the computation of subnational and regional parameters in several COVID-19-affected countries.

The lesson here is that mathematical models must be used with care. They must be fitted to the data, and their accuracy must be carefully monitored and quantified. [10] Any alternative course of action could lead to wrong interpretation and mismanagement of the disease, with disastrous consequences.

Fig. 1. [...] in (a) was $(5+2)/2 = 3.5$, but the true average by them was $(7+4)/2 = 5.5$. In (b), the third secondary infection in (a), say, $y_{13}$, becomes a primary infected that generates 4 secondary infections, all of which were traced and diagnosed. In (b), the second secondary infection in (a), say, $y_{22}$, becomes a primary infected that generates 7 secondary infections, of which only 5 were traced and diagnosed. Finally, in (b), the fourth secondary infection in (a) by primary infected $y_2$, say, $y_{24}$, becomes a primary infected that generates 3 secondary infections, of which only 2 were traced and diagnosed. The observed arithmetic average of secondary infections by $\{y_{13}, y_{22}, y_{24}\}$ was $(4+5+2)/3 = 3.67$, but if every COVID-19 patient had been diagnosed, then the true average of secondary infections by them would have been $(4+7+3)/3 = 4.67$.
Note that the total number traced and tested could be many fold more than the actual positive cases found. Suppose 22 secondary infections are generated during the third generation; then the geometric mean number of secondary infections obtained during three generations of spread is 3.61.

References.
$R_0$: How scientists quantify the intensity of an outbreak like coronavirus and its pandemic potential. School of Public Health
Estimation of time-varying reproduction numbers underlying epidemiological processes: A new statistical tool for the COVID-19 pandemic
Epidemiology and transmission dynamics of COVID-19 in two Indian states
Estimating the time-varying reproduction number of SARS-CoV-2 using national and subnational case counts
Measuring underreporting and under-ascertainment in infectious disease datasets: a comparison of methods
True epidemic growth construction through harmonic analysis
Geometric mean definition. Investopedia
How not to lie with statistics: the correct way to summarize benchmark results
Time varying basic reproductive number computed during COVID-19
Level of underreporting including underdiagnosis before the first peak of COVID-19 in various countries: preliminary retrospective results based on wavelets and deterministic modeling

Acknowledgments. We thank Dr Natasha Martin, University of California San Diego, and Dr Chris T. Bauch, University of Waterloo, for providing useful comments on our original draft and pointing us to critical literature.

Authors' contributions. All authors contributed to the writing. ASRSR and SGK designed the study, and ASRSR wrote the first draft and conceptualized the study. MBB, TK, SB, DS, RB and SK contributed to writing, editing and discussions. All authors approved the manuscript.

Financial support. No financial support was provided relevant to this article.

Conflicts of interest. All authors report no conflicts of interest relevant to this article.