key: cord-0850647-h0mcpxs8
authors: Manski, Charles F.; Molinari, Francesca
title: Estimating the COVID-19 infection rate: Anatomy of an inference problem
date: 2020-05-06
journal: J Econom
DOI: 10.1016/j.jeconom.2020.04.041
sha: e6dd11bf76e2c2bcb955c99f5b5a5bdd113ecb69
doc_id: 850647
cord_uid: h0mcpxs8

Abstract

As a consequence of missing data on tests for infection and imperfect accuracy of tests, reported rates of cumulative population infection by the SARS CoV-2 virus are lower than actual rates of infection. Hence, reported rates of severe illness conditional on infection are higher than actual rates. Understanding the time path of the COVID-19 pandemic has been hampered by the absence of bounds on infection rates that are credible and informative. This paper explains the logical problem of bounding these rates and reports illustrative findings, using data from Illinois, New York, and Italy. We combine the data with assumptions on the infection rate in the untested population and on the accuracy of the tests that appear credible in the current context. We find that the infection rate might be substantially higher than reported. We also find that the infection fatality rate in Illinois, New York, and Italy is substantially lower than reported.

$P(C_d = 1)$ is not directly observable. However, population surveillance systems provide daily data on two quantities related to $P(C_d = 1)$. These are the rate of testing for infection and the rate of positive results among those tested. To simplify analysis, we assume that a person is tested at most once by date $d$. This assumption may not be completely accurate, for reasons that will be explained later. In Section 2.2, we discuss the implications of its possible failure for our analysis.

Let $T_d = 1$ if a person has been tested by date $d$ and $T_d = 0$ otherwise. Let $R_d = 1$ if a person has received a positive test result by date $d$ and $R_d = 0$ otherwise. Observe that $T_d = 0 \Rightarrow R_d = 0$ and $R_d = 1 \Rightarrow T_d = 1$.
By the Law of Total Probability, the infection rate may be written as follows:

$$P(C_d = 1) = P(C_d = 1|R_d = 1)P(R_d = 1) + P(C_d = 1|T_d = 1, R_d = 0)P(R_d = 0|T_d = 1)P(T_d = 1) + P(C_d = 1|T_d = 0)P(T_d = 0), \tag{1}$$

where

$$P(R_d = 1) = P(R_d = 1|T_d = 1)P(T_d = 1). \tag{2}$$

Now consider each of the component quantities that together determine the infection rate. Assuming that reporting of testing is accurate, daily surveillance reveals the testing rate and the rate of positive results among those tested. Thus, the quantities $P(T_d = 0)$, $P(T_d = 1)$, $P(R_d = 0|T_d = 1)$, and $P(R_d = 1|T_d = 1)$ are directly observable. The remaining quantities are not directly observable.

The quantities $P(C_d = 1|R_d = 1)$ and $P(C_d = 1|T_d = 1, R_d = 0)$ are determined by the accuracy of testing. The former is the positive predictive value (PPV) and the latter is one minus the negative predictive value (NPV). We note that medical researchers and clinicians often measure test accuracy in a different way, through test sensitivity and specificity. The sensitivity and specificity of tests for COVID-19 on the tested sub-population are $P(R_d = 1|T_d = 1, C_d = 1)$ and $P(R_d = 0|T_d = 1, C_d = 0)$ respectively. Sensitivity and specificity are related to PPV and NPV through Bayes Theorem, whose application generally requires knowledge of $P(C_d = 1|T_d = 1)$, the infection rate in the tested sub-population. An exception to this generalization is that PPV equals one if specificity equals one, whenever sensitivity and $P(C_d = 1|T_d = 1)$ are positive.

Medical experts believe that the PPV of the prevalent tests for COVID-19 is close to one, but that NPV may be considerably less than one. We have obtained this information in part from personal communication with an infectious disease specialist at Northwestern Memorial Hospital and in part from the public literature. For example, USA Today has reported as follows⁵:

''Dwayne Breining, executive director for Northwell Labs in New Hyde Park, New York, said the test is extremely accurate and can detect even low levels of the virus.
False positives are highly unlikely, he said, though false negatives may result from poor-quality swabs or if the instrument is blocked by mucus. Those factors might have been at play in a number of false negatives initially reported. Patients who continue to have symptoms after a negative test are advised to get retested''.

We therefore find it credible to assume that $P(C_d = 1|R_d = 1) = 1$. It can be shown that this is equivalent to assuming that test specificity $P(R_d = 0|T_d = 1, C_d = 0) = 1$. The final sentence of the Breining quote explains part of why it may not be completely accurate to assume that persons are tested at most once. Another reason is that hospitalized patients are tested to verify recovery before they are released from the hospital. Nevertheless, we maintain this assumption for simplicity.

There does not presently appear to be a firm basis to determine the precise NPV of the prevalent nasal-swab tests [...]

It is not clear whether NPV has been constant over the short time period we study or, contrariwise, has varied as testing methods and the subpopulation of tested persons change over time.⁷ The NPV may also vary over longer periods if the virus mutates significantly. The illustrative results that we report later assume that NPV is in the range [0.6, 0.9], implying that $P(C_d = 1|T_d = 1, R_d = 0) \in [0.1, 0.4]$.

It remains to consider $P(C_d = 1|T_d = 0)$, the rate of infection among those who have not been tested. This quantity has been the subject of much discussion, with substantial uncertainty expressed about its value. It may be that the value changes over time as criteria for testing people evolve and testing becomes more common. The illustrative results that we report later show numerically how the conclusions one can draw about $P(C_d = 1)$ depend on the available knowledge of $P(C_d = 1|T_d = 0)$.
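The relation between sensitivity and specificity on the one hand and PPV and NPV on the other, discussed above, can be illustrated numerically. The following is a minimal sketch and not the authors' code: the function name `predictive_values` and the input values (sensitivity 0.7, infection rate 0.2 among the tested) are hypothetical choices made only for this example.

```python
def predictive_values(sensitivity, specificity, prevalence_tested):
    """Return (PPV, NPV) among tested persons via Bayes Theorem.

    sensitivity       = P(R=1 | T=1, C=1)
    specificity       = P(R=0 | T=1, C=0)
    prevalence_tested = P(C=1 | T=1)
    """
    p = prevalence_tested
    # P(R=1 | T=1): true positives plus false positives.
    p_positive = sensitivity * p + (1.0 - specificity) * (1.0 - p)
    # P(R=0 | T=1): the complementary test outcome.
    p_negative = 1.0 - p_positive
    if p_positive <= 0.0 or p_negative <= 0.0:
        raise ValueError("PPV or NPV is undefined: a test outcome has zero probability")
    ppv = sensitivity * p / p_positive          # P(C=1 | T=1, R=1)
    npv = specificity * (1.0 - p) / p_negative  # P(C=0 | T=1, R=0)
    return ppv, npv


# Hypothetical inputs (assumptions, not data from the paper):
# perfect specificity, 70% sensitivity, 20% infection rate among the tested.
ppv, npv = predictive_values(sensitivity=0.7, specificity=1.0, prevalence_tested=0.2)
print(f"PPV = {ppv:.3f}, NPV = {npv:.3f}, 1 - NPV = {1.0 - npv:.3f}")
```

With specificity equal to one, the computed PPV is one for any positive sensitivity and infection rate among the tested, whereas the NPV still moves with that infection rate. This is why a bound on sensitivity translates into a bound on NPV only in combination with knowledge of $P(C_d = 1|T_d = 1)$.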
To finalize the logical derivation of a bound on $P(C_d = 1)$, let $[L_{d0}, U_{d0}]$ and $[L_{d10}, U_{d10}]$ denote credible lower and upper bounds on $P(C_d = 1|T_d = 0)$ and $P(C_d = 1|T_d = 1, R_d = 0)$ respectively. Now combine these bounds with the assumption that $P(C_d = 1|R_d = 1) = 1$ and with empirical knowledge of the testing rate and the rate of positive test results. Then Eqs. (1)-(4) imply this bound on the population infection rate:

$$L_{d0}P(T_d = 0) + L_{d10}P(R_d = 0|T_d = 1)P(T_d = 1) + P(R_d = 1|T_d = 1)P(T_d = 1) \le P(C_d = 1) \le U_{d0}P(T_d = 0) + U_{d10}P(R_d = 0|T_d = 1)P(T_d = 1) + P(R_d = 1|T_d = 1)P(T_d = 1). \tag{5}$$

The width of bound (5) is

$$(U_{d0} - L_{d0})P(T_d = 0) + (U_{d10} - L_{d10})P(R_d = 0|T_d = 1)P(T_d = 1). \tag{6}$$

Inspection of (6) shows that uncertainty about test accuracy and about the infection rate in the untested sub-population, measured by $U_{d10} - L_{d10}$ and $U_{d0} - L_{d0}$, combine linearly to yield uncertainty about the population infection rate. The fractions $P(T_d = 1)$ and $P(T_d = 0)$ of the population who have and have not been tested linearly determine the relative contributions of the two sources of uncertainty.

As of April 2020, the fraction of the population who have been tested is very small in most locations. For example, the fraction who had been tested by April 24, 2020 was about 0.015 in Illinois, 0.04 in New York, and 0.027 in Italy; see Section 3 for details on the data sources. Hence, the present dominant concern is uncertainty about the infection rate in untested sub-populations. We now consider the problem of obtaining a credible bound on this quantity.

⁷ Failure of a test to detect that a person has been infected may occur for multiple reasons. In the current context, where eligibility for testing requires a person to exhibit symptoms or to have been in recent close contact with a confirmed case, we expect that imperfect administration of tests is the dominant reason for inaccuracy in results. In settings with widespread eligibility for testing, inaccuracy also occurs if a person is tested after recovery from COVID-19. Then a nasal swab test would show no presence of the virus and yield a negative result, missing the fact that the person was infected previously.
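The bound on the population infection rate and its width can be computed directly from the two observable rates and the two assumed intervals. The sketch below is illustrative only and is not the authors' code. The function name `infection_rate_bound`, the positive-test rate of 0.2, and the uninformative interval [0, 1] for the untested are hypothetical; the testing rate 0.015 is the Illinois figure quoted above, and [0.1, 0.4] corresponds to an NPV in [0.6, 0.9].

```python
def infection_rate_bound(p_tested, p_pos_given_tested, l_d0, u_d0, l_d10, u_d10):
    """Bound P(C=1) by the Law of Total Probability, assuming PPV = 1.

    p_tested           = P(T=1), observed
    p_pos_given_tested = P(R=1 | T=1), observed
    [l_d0, u_d0]       = assumed bound on P(C=1 | T=0)
    [l_d10, u_d10]     = assumed bound on P(C=1 | T=1, R=0), i.e. on 1 - NPV
    """
    for value in (p_tested, p_pos_given_tested, l_d0, u_d0, l_d10, u_d10):
        if not 0.0 <= value <= 1.0:
            raise ValueError("all inputs must be probabilities in [0, 1]")
    if l_d0 > u_d0 or l_d10 > u_d10:
        raise ValueError("each lower bound must not exceed its upper bound")

    p_untested = 1.0 - p_tested                        # P(T=0)
    p_tested_negative = (1.0 - p_pos_given_tested) * p_tested  # P(R=0 | T=1) P(T=1)
    p_confirmed = p_pos_given_tested * p_tested        # P(R=1); all infected when PPV = 1

    lower = l_d0 * p_untested + l_d10 * p_tested_negative + p_confirmed
    upper = u_d0 * p_untested + u_d10 * p_tested_negative + p_confirmed
    return lower, upper


# Illustrative call; 0.2 and the interval [0, 1] are assumptions, not reported data.
lower, upper = infection_rate_bound(
    p_tested=0.015, p_pos_given_tested=0.2,
    l_d0=0.0, u_d0=1.0, l_d10=0.1, u_d10=0.4,
)
print(f"bound: [{lower:.4f}, {upper:.4f}], width: {upper - lower:.4f}")
```

In this hypothetical call nearly all of the width comes from the term for the untested sub-population, because the tested fraction is so small. Narrowing the interval assumed for the untested is therefore what matters most for the final bound.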
⁸ Rather than assume a bound on NPV directly, a medical researcher or clinician could assume a bound on sensitivity. It can be shown that a bound on sensitivity combined with the assumption that specificity equals one implies a bound on the NPV.

[...] in the rate of testing would be required to substantially narrow the width of the bound.

The best feasible case would occur with random testing of a large enough sample of persons to make statistical imprecision a negligible concern. Then $P(C_d = 1) = P(C_d = 1|T_d = 1)$ and uncertainty stems only from incomplete knowledge of the NPV of testing. The Law of Total Probability, the maintained assumption that positive test results are always accurate, and the specified bounds on $P(C_d = 1|T_d = 1, R_d = 0)$ imply

$$L_{d10}P(R_d = 0|T_d = 1) + P(R_d = 1|T_d = 1) \le P(C_d = 1|T_d = 1) \le U_{d10}P(R_d = 0|T_d = 1) + P(R_d = 1|T_d = 1). \tag{7}$$

This bound has width $(U_{d10} - L_{d10})P(R_d = 0|T_d = 1)$.

We judge the current situation to be intermediate between the worst and best case scenarios. We thus far have [...]

Observe that if testing for infection were random rather than determined by the current criteria, it would be credible to impose a much stronger assumption, namely $P(C_d = 1|T_d = 0) = P(C_d = 1|T_d = 1)$. However, testing clearly has not been random. Hence, we only impose assumption (8).

Combining assumption (8) with the upper bound in (7) yields an upper bound on $P(C_d = 1|T_d = 0)$:

$$P(C_d = 1|T_d = 0) \le U_{d0} = U_{d10}P(R_d = 0|T_d = 1) + P(R_d = 1|T_d = 1). \tag{9}$$

Bound (9) is methodologically interesting because $U_{d0}$ is now a function of $U_{d10}$ rather than a separate quantity. It thus enhances the importance of securing an informative upper bound on $P(C_d = 1|T_d = 1, R_d = 0)$. In particular, (9) implies that $U_{d0} \ge U_{d10}$, whatever the rate $P(R_d = 1|T_d = 1)$ of positive test outcomes may be.

The monotonicity assumption does not affect the lower bound $L_{d0}$, which is zero in the absence of other information. Hence, inserting $L_{d0} = 0$ and (9) into the bound (5) on $P(C_d = 1)$ yields

$$[L_{d10}P(R_d = 0|T_d = 1) + P(R_d = 1|T_d = 1)]P(T_d = 1) \le P(C_d = 1) \le U_{d10}P(R_d = 0|T_d = 1) + P(R_d = 1|T_d = 1). \tag{10}$$

The width of bound (10) is

$$[U_{d10} - L_{d10}P(T_d = 1)]P(R_d = 0|T_d = 1) + P(R_d = 1|T_d = 1)P(T_d = 0). \tag{11}$$

In the present context where $P(T_d = 1)$ is very small, the width of the bound approximately equals the sum of the rate [...]

Bound (10) assumes that a person is tested at most once by date $d$.
If some persons are tested multiple times, publicly reported count data for ''total tested'' overestimate $P(T_d = 1)$. In that case, the bound is too narrow. To see this, rewrite expression (10) as

$$L_{d10}[P(T_d = 1) - P(R_d = 1)] + P(R_d = 1) \le P(C_d = 1) \le U_{d10} + (1 - U_{d10})\frac{P(R_d = 1)}{P(T_d = 1)}.$$

Observe that, if the event $T_d = 1$ sometimes occurs because the same person is tested multiple times, the lower bound is too high and the upper bound is too low.

⁹ To obtain the upper bound, use the Law of Total Probability to write $P(C_d = 1) = P(C_d = 1|T_d = 1)P(T_d = 1) + P(C_d = 1|T_d = 0)P(T_d = 0)$. Then combine the upper bound in expression (7) with the one in expression (9).

Temporal Monotonicity

A second form of monotonicity holds logically rather than by assumption. Our analysis thus far has only considered the infection rate by a specified date. A person who has been infected by an early date necessarily has been infected by every later date. Hence, for two dates $d$ and $d'$, we have the temporal monotonicity condition

$$d' \ge d \Rightarrow P(C_{d'} = 1) \ge P(C_d = 1). \tag{12}$$

Inequality (12) makes date a monotone instrumental variable as defined in Manski and Pepper (2000). Proposition 1 of that article shows that, given a set of date-specific lower and upper bounds on the infection rate for various dates, condition (12) implies that $P(C_d = 1)$ must be greater than or equal to the maximum of the date-specific lower bounds for all $d' \le d$. Moreover, $P(C_d = 1)$ must be less than or equal to the minimum of the date-specific upper bounds for all $d' \ge d$.¹⁰ Applying this result to the date-specific bounds (10) yields this result:

$$\max_{d' \le d}\,[L_{d'10}P(R_{d'} = 0|T_{d'} = 1) + P(R_{d'} = 1|T_{d'} = 1)]P(T_{d'} = 1) \le P(C_d = 1) \le \min_{d' \ge d}\,[U_{d'10}P(R_{d'} = 0|T_{d'} = 1) + P(R_{d'} = 1|T_{d'} = 1)]. \tag{13}$$

Bound (13) improves on bound (10) if its lower bound is greater than the one in (10) or its upper bound is less than the one in (10). Thus, the temporal monotonicity condition may or may not have identifying power, depending on the testing data. We find that it modestly improves lower bounds with the data we use.

2.3. Bounding the fraction of asymptomatic infections

We are presently unaware of other assumptions or logical conditions that enjoy credibility comparable to the above monotonicity assumptions and that have identifying power.
One may, however, perhaps feel comfortable bringing to bear assumptions whose credibility stems from limited evidence or from the judgment of respected medical and epidemiological experts. We provide an example here to illustrate how this may be done and the identifying power studied. We do not endorse the specific assumptions made here.

Consider the decomposition of COVID-19 episodes into those where the patient does and does not manifest discernible symptoms. Dr. Anthony Fauci, the director of the National Institute of Allergy and Infectious Diseases, has been quoted as saying that the fraction of cases in which the patient is infected but shows no symptoms is ''somewhere between 25 and 50 percent''. Fauci went on to say ''And trust me, that is an estimate. I don't have any scientific data yet''.¹¹

Supposing it to be correct, Fauci's bound has identifying power when combined with a further assumption. Let $A_d = 1$ or $S_d = 1$ if a person has respectively had an asymptomatic or symptomatic case of COVID-19 by date $d$. Let each quantity equal zero otherwise. The two categories of illness are mutually exclusive, so

$$P(C_d = 1) = P(A_d = 1) + P(S_d = 1). \tag{14}$$

Fauci imposes the assumption

$$0.25\,P(C_d = 1) \le P(A_d = 1) \le 0.5\,P(C_d = 1). \tag{15}$$

Combining (14) and (15) yields

$$\tfrac{4}{3}P(S_d = 1) \le P(C_d = 1) \le 2P(S_d = 1). \tag{16}$$

The value of $P(S_d = 1)$ is unknown. However, the existing criteria for testing require the presence of symptoms or [...] Then the fraction who are tested and infected is known to be at least [...]

¹⁰ Proposition 1 of Manski and Pepper (2000) shows that this bound is sharp. That is, it is the tightest bound achievable with the available information. Molinari (2020, Section 2.1) shows that it is a more complex matter to obtain sharp bounds for functions of the infection rate that vary with time.

[...] (17')

We also find it noteworthy that the rate of confirmed cases in the hospital was 33/215 = 0.153. With universal testing, [...] (17'')

Surveillance systems may report several rates of severe illness (V), including hospitalization (H), ICU usage (U), and death (D).
The present discussion considers these reports to be accurate. Thus, one may have empirical knowledge of the rates $P(V_d = 1)$ for $V \in \{H, U, D\}$. Surveillance systems do not report rates of severe illness conditional on infection. These have the form

$$P(V_d = 1|C_d = 1) = \frac{P(V_d = 1, C_d = 1)}{P(C_d = 1)}.$$

The numerator $P(V_d = 1, C_d = 1)$ may logically differ from the reported rate $P(V_d = 1)$. This may occur for H and U [...] inaccurate. For simplicity, we assume here that such errors do not occur. However, we caution that the assumption may not be realistic.¹⁴

In the absence of reporting errors, $P(V_d = 1, C_d = 1) = P(V_d = 1)$, so a bound on $P(C_d = 1)$ implies a bound on $P(V_d = 1|C_d = 1) = P(V_d = 1)/P(C_d = 1)$. [...]

¹² In addition to this subpopulation being female, pregnancy typically occurs during a limited age range, and we have no information on whether age and gender systematically affect the presence of symptoms conditional on infection.

¹³ Hospitalization includes ICU usage.

¹⁴ See https://www.washingtonpost.com/investigations/coronavirus-death-toll-americans-are-almost-certainly-dying-of-covid-19-but-being-leftout-of-the-official-count/2020/04/05/71d67982-747e-11ea-87da-77a8136c1a6d_story.html

There are both clinical and public health reasons why one would like to know $P(C_d = 1|X)$ and $P(V_d = 1|X, C_d = 1)$ for persons with specified personal characteristics $X$. For example, it has been thought important to know these rates conditional on the demographic characteristics $X$ = (age, gender, race). Whatever $X$ may be, the bound on $P(C_d = 1|X)$ is the $X$-specific version of (5), in which every probability is conditioned on $X$. When computing the bound, one should bring to bear credible $X$-specific bounds on $P(C_d = 1|X, T_d = 0)$ and $P(C_d = 1|X, T_d = 1, R_d = 0)$. If one imposes a monotonicity restriction conditional on $X$ as in Section 2.2, the bound in (10) is updated in the same way as we did in (14) for the bound in (5). The bound on $P(V_d = 1|X, C_d = 1)$ is computable if surveillance additionally reports $P(V_d = 1|X)$.

[...]

Table 2 reports the bounds in (13) for Illinois, New York, and Italy, under the monotonicity assumptions presented in Section 2.2.
The temporal monotonicity condition is reflected in the fact that, for each state and country, both the lower and upper bounds on $P(C_d = 1)$ weakly increase over time. The widths of the bounds varied (non-monotonically) from 0.455 to 0.521 for Illinois, with a peak of 0.523; 0.48 to 0.601 for New York, with a peak of 0.613; and remained about 0.47 for Italy throughout the period.

The substantial width of the bounds reflects the fact, previously discussed, that the fraction of tested individuals was very small throughout the period. Nonetheless, the bounds have substantial informational content relative to the [...]

We next bring to bear information on the rate of asymptomatic infections as discussed in Section 2.3. This information does not lower the upper bound on the probability of infection, but it raises the lower bounds. Using the expert opinion [...]

¹⁵ In all locations, a person is classified as having a confirmed case of the disease by date $d$ if the person has obtained a positive test result by that date. The New York documentation of testing indicates that non-positive results are sub-classified into those that are negative and inconclusive. This sub-classification is not made in the documentation for Illinois and Italy. Receipt of test results may take several days or longer. We are not certain how agencies classify persons while results are still pending. Our analysis interprets the reported data on confirmed cases to exclude cases where test results are pending. Whereas statistics on cumulative confirmed cases count the number of persons with positive test results to date, statistics on cumulative testing count the number of tests performed. The New York documentation states this: ''Test counts reflect those completed on an individual each day.
A person may have multiple specimens tested on one day, these would be counted one time, i.e., if two specimens are collected from an individual at the same time and then evaluated, the outcome of the evaluation of those two samples to diagnose the individual is counted as a single test of one person, even though the specimens may be tested separately. Conversely, if an individual is tested on more than one day, the data will show two tests of an individual, one for each date the person was tested''. Thus, to the extent that persons are retested on different days, the cumulative test counts overstate the cumulative number of persons who have been tested.

¹⁶ Our data source for deaths in New York is COVID Tracking Project (2020).

[Table caption: Probability of being tested, of receiving a positive test result if tested, and of death. Column: Date.]

This paper has used standard methods of partial identification analysis to study two key aspects of the uncertainty that has frustrated attempts to learn the COVID-19 cumulative infection rate and rates of severe illness conditional on [...] about the NPV of the tests in use. The simple analysis of Section 2 shows how available data and maintained assumptions combine to determine the inferences that can logically be drawn. We have used monotonicity assumptions that have strong credibility in the current context. We also have used suggestive information on the rate of asymptomatic infection to illustrate how further assumptions having a less firm foundation may be brought to bear, should one find them credible.

We have used data for two American states and for Italy to illustrate application of the analysis. Given that the tested fraction of the population has been very low, one can barely draw any conclusion about the population infection rate without making assumptions that bound the rate of infection in the untested sub-population.
Imposing the monotonicity assumptions restricts the population infection rate to bounds of width about 0.5 in the current COVID-19 context.

One naturally may prefer bounds of narrower width. Given the available data, this is logically possible to achieve only if one imposes stronger assumptions with considerable identifying power. We have not reported narrower bounds because we do not immediately see a credible basis to add assumptions that would justify them. Readers who feel that they can motivate stronger assumptions may adapt our analysis to determine their implications.

Among the possibilities for narrowing the bounds that we plan to investigate, it has often been suggested that we can [...] bounds shows how to proceed formally to tighten inference. See Manski (2020) and Molinari (2020).

We also plan to explore imposition of assumptions on the dynamics of the epidemic that have been used in epidemiological modeling and that may have some credibility. For example, a shape restriction commonly maintained [...]

[...] Section 4 as ''estimates'' and we do not provide measures of statistical precision. Instead, we view states and nations as the units of interest rather than as realizations from some sampling process. Measurement of statistical precision requires specification of a sampling process that generates the available data. Yet we are unsure what type of sampling process would be reasonable to assume in this work.

[...] of new infection.
A difficulty in studying the former is that the duration of the disease is apparently quite heterogeneous [...]

References

State population totals
COVID Tracking Project
Covid 19 statistics
Resident population as of
Suppression of COVID-19 outbreak in the municipality of Vo
Anatomy of the selection problem
Identification Problems in the Social Sciences
Partial Identification of Probability Distributions
Identification for Prediction and Decision
Toward credible patient-centered meta-analysis
Monotone instrumental variables: With an application to the returns to schooling
How do right-to-Carry laws affect crime rates? Coping with ambiguity using bounded-variation assumptions
Microeconometrics with Partial Identification
New York state statewide COVID-19 testing
Performance of rapid influenza diagnostic testing in outbreak settings
Emergenza coronavirus: la risposta nazionale
Universal screening for SARS-CoV-2 in women admitted for delivery