key: cord-0854117-uqxtj0rh authors: HERNANDEZ-SUAREZ, C. M.; Murillo-Zamora, E. title: Using COVID-19 deaths as a surrogate to measure the progression of the pandemics. date: 2020-09-29 journal: nan DOI: 10.1101/2020.09.27.20202564 sha: 18b712d304e3ecb5a15c68c06cf1f64c092123c9 doc_id: 854117 cord_uid: uqxtj0rh

The IFR (Infection Fatality Risk) is one of the most important parameters of an infectious disease. If properly estimated, the observed number of deaths divided by the IFR can be used to estimate the current number of infections and, if immunity is permanent, the fraction of susceptible individuals, which can be used to plan the reopening of activities and the deployment of vaccines when these become available. Here we suggest how to use the observed COVID-19 deaths in an arbitrary population as a surrogate for the progression of the epidemic, for the purpose of decision making. We compare several estimates of the IFR for SARS-CoV-2 with our own estimate, which uses the number of additional deaths in households in a database of 159,150 laboratory-confirmed (RT-qPCR) cases of COVID-19 in Mexico. The main result is that if the number of deaths in a region is close to 2 per thousand individuals, the fraction of remaining susceptible individuals may be too small for a vaccine to make a difference in the total number of infected.

The COVID-19 pandemic has shown a low lethality; nevertheless, the burden of the disease so far is huge. Economic activities are suspended or reduced, and there is pressure to reopen schools, which requires more information, available not only to policymakers but also to the general public, to ameliorate civic unrest.
The competition to develop a vaccine is keen, with about 321 vaccine candidates, 32 of them in clinical trials in progress (Le et al., 2020), raising safety concerns (Harrison and Wu, 2020; Peeples, 2020), and the additional economic cost of purchasing and deploying vaccines has not yet been added to the burden of the disease.

At the moment, without vaccines or effective pharmaceutical treatments available, the decision on whether or not to reopen activities depends mostly on the number of additional infections and deaths that a given policy will cause, for example opening schools, allowing public gatherings, opening tourist sites, or increasing the density currently allowed in places such as cinemas, restaurants, buses, and planes. For a virus as highly infectious as SARS-CoV-2, the decision must be rooted in the number of remaining susceptible individuals in the region affected by the change in policy.

Using the number of individuals that access private or public hospitals, or even the number of confirmed cases, as a surrogate for the current number of infections is not accurate, because those quantities depend strongly on the availability of medical services, which are not always accessible to the bulk of the population in many countries, on the lack of resources for test deployment, or on policies that disregard testing. The number of deaths is less dubious. If we manage to calculate the average number of infections that result in one death, then we can use this number to estimate the number of infections required to produce the observed number of deaths and, from there, the share of susceptible individuals remaining in a well-defined population. Even if deaths were not reported as COVID-19 related, records may exist that shed light on the likelihood that a death was or was not caused by
SARS-CoV-2 infection. Using such records is part of the policy adopted by some countries, such as Belgium, which is, without doubt, one of the reasons why that country has one of the highest numbers of deaths per capita in Europe.

(medRxiv preprint, not certified by peer review; made available under a CC-BY-NC-ND 4.0 International license. The author/funder has granted medRxiv a license to display the preprint in perpetuity.)

Before proceeding, we need to establish two facts. The first is that, with few exceptions, it is almost impossible to stop the pandemic by mitigation and control measures; the most we can do is reduce the infection rate (at a huge economic and social cost), which is known as flattening the curve. The second is that we are now in a position to obtain reliable lower-bound estimates of the IFR, the Infection Fatality Risk of SARS-CoV-2. It is the confluence of these two facts that allows us to establish a surrogate for the progression of the epidemic in a region and thus to calculate the share of susceptible individuals remaining, which is the basis for decisions on reopening activities and vaccine deployment.

Fact 1: SARS-CoV-2 is highly contagious and will infect most of the population

Several studies have shown that the basic reproductive number $R_0$ is high (Sanche et al., 2020; D'Arienzo and Coniglio, 2020; Najafimehr et al., 2020; Liu et al., 2020; Alimohamadi et al., 2020), with estimates as high as 5.7. It is important to remember that $R_0$ is the potential for disease transmission in the absence of any control or mitigation measures, and it is this potential that comes into play as soon as those measures are suspended. We calculated $R_0$ for 40 countries using only the first two weeks of data, analyzing the progression in the number of deaths. These 40 countries were the first affected by the pandemic, excluding China. The calculated $R_0$ is shown in Figure 1.
The estimate of the true potential of the disease is more evident for the countries that were affected first (e.g. Italy and Spain), where our estimate of $R_0$ is about the same as that reported by Sanche et al. (2020). The learning curve is evident in the reduction of $R_0$ as the pandemic advances. Estimates of $R_0$ in countries where the pandemic arrived at later stages will not reflect the true potential of the virus alone, but also the preparedness of the population.

[Figure 1: $R_0$ calculated for the 40 countries analyzed. See Table 1 for expanded data.]

Table 1 in the Appendix contains the main statistics derived from the analysis of $\lambda$ and $R_0$ for the countries analyzed. From the estimated values of $\lambda$, we can see that the first countries to be affected had an $R_0$ larger than 5, and that this decreased to values just above 2.5 in countries affected later. This implies that $\lambda$ ranged from one effective contact every three days at the beginning to one effective contact every five days, on average.

There are some facts we need to consider when a virus spreads with that strength. First, forecasting the size of the epidemic (the total number of infected) is simpler, since social contact structures are bypassed and become irrelevant, and thus the contact network resembles a random mixing pattern. Under random mixing, the fraction of infected, $f$, can be approximated
(Kermack and McKendrick, 1927) with:

$$f = 1 - e^{-R_0 f} \tag{1}$$

(for a probabilistic derivation, see Hernandez-Suarez and Mendoza-Cano, 2009). For instance, if $R_0 = 5.7$, the fraction of infected is $f = 0.99$. However, if we manage to reduce $R_0$ by half, $f = 0.93$, which is not a big difference in the size of the epidemic. Moreover, reducing $R_0$ by half is an enormous task that relies mainly on lockdowns and face masks, and lockdowns have a huge economic cost that cannot be sustained for long except by rich countries or by countries whose political or cultural organization allows their implementation. Even if a country manages to reduce $R_0$ to a value smaller than one, and can support citizens economically to maintain the lockdown for long periods, a handful of infected individuals entering the country from abroad can restart the infection process if lockdowns are lifted. Hence the observed waves (see Figure 2 for examples in Japan, Cuba, South Korea and New Zealand). This is particularly true for a disease in which asymptomatic individuals far outnumber symptomatic ones, which makes it difficult to control. As a consequence, equation (1) can be seen as a quota that must be fulfilled, as long as the fraction of remaining susceptible is larger than $1 - f$.

the progress of the epidemic in an arbitrarily defined region or population, but the main drawback is that while the number of deaths from SARS-CoV-2
can be approximated, the denominator, the total number of individuals infected by the disease, is elusive, especially considering that a large fraction of infected individuals are asymptomatic. Besides, the efficacy of immunity tests in detecting who has been infected and recovered has been challenged by findings that the ability to detect antibodies is reduced in a few days, especially in those with mild or no symptoms (Long et al., 2020; Ibarrondo et al.). Recently, Eyre et al. (2020) reported that a high fraction of individuals with no or mild symptoms may go undiagnosed, mainly because of the calibration strategies of some standardized tests, and concluded that samples from individuals with mild or asymptomatic infection should be included in SARS-CoV-2 immunoassay evaluations.

Some studies estimating the IFR have recently been released for Iceland (Gudbjartsson et al.), India (Mukhopadhyay and Chakraborty, 2020), Germany (Streeck et al., 2020) and Denmark (Erikstrup et al., 2020), among others. In a review of 23 studies in which the IFR was estimated, Ioannidis (2020) reports a median of 0.25%, which is about 2.5 deaths per 1,000 infected individuals. The study in Iceland (Gudbjartsson et al.), the most comprehensive to date, proposes an IFR of 0.3% (95% CI, 0.2 to 0.6). Special care must be taken with this latter estimate, since Iceland has a population of 321,641, with only 29 deaths so far.

Estimation of the IFR using an alternative method

Hernandez-Suarez et al.
(2020) proposed an alternative method to estimate the IFR, based on the assumption that all those living with an infected individual will be infected. We use the same data set, which has grown from 3,193 households to 131,240 households. The essence of the method is as follows: let $n_j$ be the number of individuals living in household $j$, in which an individual was confirmed infected with SARS-CoV-2 with symptoms starting on day $d_j$. Let $x_j$ be the number of deaths among the remaining $n_j - 1$ individuals whose symptoms started after day $d_j$. Then an estimate of the IFR is:

$$\widehat{\mathrm{IFR}} = \frac{\sum_j x_j}{\sum_j (n_j - 1)} \tag{2}$$

In this example we build an approximation to (2).

Every individual has an associated NSS (Social Security Number) that is shared among all members of a family. As an approximation, we assumed that all individuals sharing the same NSS live in the same household, and grouped the cases into households accordingly. If there was more than one case in a household, we only considered the household if all cases were already resolved as deaths or recoveries. In every household with more than one case, we took the index case to be the individual with the earliest symptom onset and counted the number of deaths (see Table 1). From Table 1 we have $\sum_j x_j = 1{,}108$. We do not have $\sum_j n_j$ and we build

If the disease will affect most of the population, as suggested previously, then the most effective way to estimate the overall IFR is to measure the fraction of deaths in a population in which the epidemic has evolved for a long
time and is over or near its end. It is evident that we need to focus on regions with: a) a high observed IFR* and b) a lack of new waves. Such is the case of New Jersey and New York states (see Figure 3). The current IFR* calculated for these states is 1.826 and 1.707 deaths per thousand, respectively. The lack of new waves in these two states indicates that the virus has infected and killed a fraction of individuals close to the IFR, and thus that the epidemic swept through those states and stopped due to what is commonly called herd immunity, which actually conveys no immunity at all, but rather the lack of a sufficient mass of susceptible individuals for $R_0$ to be effective. Thus, the progression of the epidemic in a region can be estimated with IFR*/IFR.

2. Consequences from the previous facts for vaccines and the return to normality

As a consequence of the two previous facts, we can conclude that it is possible to monitor the development of the epidemic in a region by following the number of deaths. One can estimate the fraction of total infected at time $t$, asymptomatic or not, with:

$$\frac{x_t}{N \cdot \mathrm{IFR}}$$

where $x_t$ is the number of observed deaths at time $t$ and $N$ is the population size. The estimated fraction of susceptible as a function of the IFR in New Jersey and New York is shown in Figure 4.
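As a rough numerical illustration of the death-based estimate above, the following Python sketch computes the infected and susceptible fractions from the observed deaths per thousand and an assumed IFR per thousand, which is the same ratio as $x_t / (N \cdot \mathrm{IFR})$. The function names are hypothetical and do not come from the source; the death figures are the IFR* values quoted in the text for New Jersey and New York, and the IFR values of 2 and 2.3 per thousand are the scenarios the text considers. Capping the infected fraction at 1 is an added assumption, made only to keep the result a valid proportion.

```python
def infected_fraction(deaths_per_thousand: float, ifr_per_thousand: float) -> float:
    """Estimated fraction ever infected: observed deaths per capita over the IFR.

    Both arguments use the same unit (deaths per 1,000), so their ratio
    equals x_t / (N * IFR). The cap at 1.0 is an assumption of this sketch.
    """
    if ifr_per_thousand <= 0:
        raise ValueError("the IFR must be positive")
    if deaths_per_thousand < 0:
        raise ValueError("deaths per thousand cannot be negative")
    return min(deaths_per_thousand / ifr_per_thousand, 1.0)


def susceptible_fraction(deaths_per_thousand: float, ifr_per_thousand: float) -> float:
    """Estimated remaining susceptible fraction, assuming permanent immunity."""
    return 1.0 - infected_fraction(deaths_per_thousand, ifr_per_thousand)


# Observed deaths per thousand (IFR*) as quoted in the text.
observed = {"New Jersey": 1.826, "New York": 1.707}

for ifr in (2.0, 2.3):  # assumed true IFR, in deaths per 1,000 infections
    for state, deaths in observed.items():
        share = susceptible_fraction(deaths, ifr)
        print(f"IFR {ifr}/1000, {state}: susceptible share {share:.1%}")
```

The printed shares should be close to, but need not match exactly, the values the text reads off Figure 4, since the figure may rest on slightly different death counts.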
We can see that if the IFR is 2/1000, the fraction of remaining susceptible in New Jersey and New York is 9.5% and 15% respectively; if it is as high as 2.3/1000, the fraction of susceptible rises to 20% and 25% for those states.

[Figure 4: estimated fraction of susceptible in New Jersey and New York as a function of the true lethality of SARS-CoV-2 (deaths per 1,000).]

COVID-19 related deaths and every effort has to be made to improve current estimates of the IFR. Hernandez-Suarez et al. (2020) suggested that, in light of the high contagiousness of SARS-CoV-2, the secondary cases resulting in fatalities in households with at least one confirmed infection may be useful to estimate the IFR, an approach that would provide a large amount of data with minimal testing requirements.

Antibody testing in New York shows that, to date, 576,769 out of 2,298,248 tests were positive for antibodies (NYC-Health), which implies that 25% of the population may be immune, contradicting our finding that for an IFR as high as 2.3 per thousand the percentage of infected should already be close to 75%. Nevertheless, the fraction of people with antibodies is not a good surrogate for the fraction infected with SARS-CoV-2, because it underestimates the number of infected for two reasons. First, antibody testing is voluntary, and it is natural to expect that individuals with no symptoms are less compelled to be tested than symptomatic ones, so a large fraction of the infected is not tested. Second, as already mentioned, some testing failures have been reported, apparently due to calibration procedures that tend to fail in individuals with no or mild symptoms (Eyre et al., 2020).
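The household-based estimate attributed above to Hernandez-Suarez et al. (2020) can be illustrated with a minimal Python sketch. It assumes the estimator is the total number of secondary deaths divided by the total number of co-residents of the index cases, that is, $\sum_j x_j / \sum_j (n_j - 1)$. The function name, the input format and the toy households are hypothetical and are not data from the study.

```python
from typing import Iterable, Tuple


def household_ifr(households: Iterable[Tuple[int, int]]) -> float:
    """Pooled household estimate of the IFR.

    Each household is a pair (n_j, x_j): n_j is the household size including
    the index case, and x_j is the number of deaths among the other n_j - 1
    members. Every co-resident is assumed to be infected.
    """
    total_deaths = 0
    total_exposed = 0
    for n_j, x_j in households:
        if n_j < 2:
            continue  # a single-person household has no secondary members
        if not 0 <= x_j <= n_j - 1:
            raise ValueError("x_j must lie between 0 and n_j - 1")
        total_deaths += x_j
        total_exposed += n_j - 1
    if total_exposed == 0:
        raise ValueError("no household with secondary members")
    return total_deaths / total_exposed


# Hypothetical toy data: (household size, secondary deaths).
toy_households = [(4, 0), (3, 1), (5, 0), (2, 0), (1, 0)]
print(household_ifr(toy_households))  # 1 death among 10 co-residents
```

Pooling the sums before dividing, rather than averaging per-household ratios, weights every exposed individual equally, which seems the natural reading of the estimator; it is a design choice of this sketch.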
The idea that the share of susceptible in New Jersey is relatively small is supported by the fact that the number of active cases reported to date is close to 20,000 and, unless they are fully quarantined, the infectious pressure from these individuals and from those unreported must be huge; nevertheless, no new peak is observed. The same happens in New York, although the share of susceptible is higher. The progression of the pandemic in San Marino (see Figure 5) also supports this idea: despite being the country with the highest mortality per capita (1.237 per thousand) and having had only two deaths since the end of April, it appears that the saturation required for the epidemic to halt has not been reached, and a second wave has started in the last two weeks.

Everything leads to the following question: what will be the purpose of a

tion is especially valid from the comprehensive study in Iceland that strongly suggests the existence of immunity due to infection (Gudbjartsson et al.). Unless it is proven that immunity wanes below a protective level after some time, everything seems to indicate that regions where the estimated share of susceptible is already low should have lower priority in the distribution of vaccines. If facts 1 and 2 are not taken into account, a vaccination campaign in a region where the proportion of deaths among the population is close to the IFR will only give the impression of being effective but will not play a role, regardless of the efficacy of the vaccine.
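To put rough numbers on this argument, the Python sketch below solves the final-size relation $f = 1 - e^{-R_0 f}$ of Kermack and McKendrick (1927) by fixed-point iteration and compares it with the death-based estimate of the fraction already infected. Treating the difference between the two as the "remaining quota" of infections is a simplification made here, loosely following the text's reading of the final size as a quota; the function names, the tolerance and the choice of solver are assumptions of this sketch, not the authors' procedure.

```python
import math


def final_size(r0: float, tol: float = 1e-12, max_iter: int = 10_000) -> float:
    """Positive root of f = 1 - exp(-r0 * f), found by fixed-point iteration."""
    if r0 <= 1.0:
        return 0.0  # no major epidemic is expected when R0 does not exceed 1
    f = 1.0  # starting at 1 steers the iteration to the positive root
    for _ in range(max_iter):
        f_next = 1.0 - math.exp(-r0 * f)
        if abs(f_next - f) < tol:
            return f_next
        f = f_next
    return f


def remaining_quota(r0: float, deaths_per_thousand: float,
                    ifr_per_thousand: float) -> float:
    """Final size minus the death-based infected fraction, floored at zero."""
    infected = min(deaths_per_thousand / ifr_per_thousand, 1.0)
    return max(final_size(r0) - infected, 0.0)


for r0 in (5.7, 2.85):  # the value cited in the text, and that value halved
    print(f"R0 = {r0}: final size f = {final_size(r0):.3f}")

# New Jersey's 1.826 deaths per thousand under an assumed IFR of 2 per thousand.
print(f"remaining quota: {remaining_quota(2.85, 1.826, 2.0):.3f}")
```

Under these assumptions the remaining quota for a region such as New Jersey comes out at a few percent of the population at most, which is the sense in which a vaccine would have little left to prevent there.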
Humoral immune response to SARS-CoV-2 in Iceland
Applications of occupancy urn models to epidemiology
Estimation of the infection fatality rate and the total number of SARS-CoV-2 infections
Rapid decay of anti-SARS-CoV-2 antibodies in persons with mild COVID-19
The infection fatality rate of COVID-19 inferred from seroprevalence data
A contribution to the mathematical theory of epidemics
Evolution of the COVID-19 vaccine development landscape
The reproductive number of COVID-19 is higher compared to SARS coronavirus
Clinical and immunological assessment of asymptomatic SARS-CoV-2 infections
Estimation of undetected COVID-19 infections in India
Estimation of basic reproduction number for COVID-19 and the reasons for its differences
COVID-19: Data
News feature: Avoiding pitfalls in the pursuit of a COVID-19 vaccine. Proceedings of the National Academy of Sciences
High contagiousness and rapid spread of severe acute respiratory syndrome coronavirus 2
Infection fatality rate of SARS-CoV-2 infection in a German community with a super-spreading event