key: cord-1035719-frivrm3s authors: Korolev, Ivan title: Identification and estimation of the SEIRD epidemic model for COVID-19 date: 2020-07-30 journal: J Econom DOI: 10.1016/j.jeconom.2020.07.038 sha: e105ed2c219286f7612fd3098775285673500313 doc_id: 1035719 cord_uid: frivrm3s This paper studies the SEIRD epidemic model for COVID-19. First, I show that the model is poorly identified from the observed number of deaths and confirmed cases. There are many sets of parameters that are observationally equivalent in the short run but lead to markedly different long run forecasts. Second, I show that the basic reproduction number [Formula: see text] can be identified from the data, conditional on epidemiologic parameters, and propose several nonlinear SUR approaches to estimate [Formula: see text]. I study the performance of these methods using Monte Carlo studies and demonstrate that they yield fairly accurate estimates of [Formula: see text]. Next, I apply these methods to estimate [Formula: see text] for the US, California, and Japan, and document heterogeneity in the value of [Formula: see text] across regions. My estimation approach accounts for possible underreporting of the number of cases. I demonstrate that if one fails to take underreporting into account and estimates [Formula: see text] from the reported cases data, the resulting estimate of [Formula: see text] may be biased downward and the resulting forecasts may exaggerate the long run number of deaths. Finally, I discuss how auxiliary information from random tests can be used to calibrate the initial parameters of the model and narrow down the range of possible forecasts of the future number of deaths. The SIR (Susceptible, Infectious, Recovered) model and its variations is widely used in epidemiology to model the spread of epidemics. Since the outbreak of COVID-19, it has seen increased popularity among economists who are trying to asses the economic consequences of the coronavirus and various mitigation policies, such as Acemoglu et al. (2020) , Atkeson (2020b,c) , Avery et al. (2020) , Berger et al. (2020) , Eichenbaum et al. (2020) , Ellison (2020), Fernandez-Villaverde and Jones (2020), Piguillem and Shi (2020), Toda (2020), and others. In this paper, I study identification and estimation of the modification of the SIR model called SEIRD (Susceptible, Exposed, Infectious, Recovered, and Dead) and present several findings. First, I show that the SEIRD model has too many degrees of freedom and is poorly identified from the short run data on the number of deaths and confirmed cases. Conditional on the values of epidemiologic parameters, i.e. parameters that reflect the clinical progression of the disease, the only model parameter that is identified is the basic reproduction number R 0 . While R 0 governs the speed of spread of the virus, the key driver of the long run number of deaths in the model is the infection fatality rate (IFR), which is not identified separately from initial values. As a result, models that are observationally equivalent in the short run can produce markedly different long run forecasts of the number of deaths. Second, I propose several nonlinear seemingly unrelated equations (SUR) approaches to estimate R 0 based on the deaths and confirmed cases data. The approaches I consider differ in whether they use cumulative or daily data and how they introduce errors in the model. I study the performance of different approaches in simulations and find that the methods based on cumulative data typically outperform those based on daily data in terms of the mean squared error (MSE) of the estimate of R 0 . While there is no clear ranking of the different values of epidemiologic parameters. I show that there is substantial heterogeneity in the values of R 0 : for the same values of epidemiologic parameters, the estimates of R 0 for the US and California are about 2-4 times higher than for Japan. Moreover, the estimates of R 0 are highly sensitive to the values of epidemiologic parameters. There is no agreement in the medical literature on the length of the incubation and infectious period for COVID-19, and different values of these parameters result in the estimates of R 0 for the US that range from under 5 to around 17. Despite these large differences in the estimates of R 0 , the resulting models lead to virtually identical fit of the observed data. These findings highlight that there is no single value of R 0 that is consistent with the data, at least in the short run. The appropriate value of R 0 depends both on the region and on the model. My model and estimation strategy take into account possible underreporting of the number of COVID-19 cases. Even though the fraction of all cases that is reported is not identified, I show that it is important to allow it to differ from one. I demonstrate that if one does not take underreporting into account and estimates R 0 from the confirmed cases data, assuming that all cases are reported, the estimate of R 0 may be biased downward and the long run number of deaths may be overestimated. Finally, I use the example of Iceland to show how auxiliary data can be used to narrow down the range of possible forecasts of the long run number of deaths from the epidemic. presents the empirical results. Section 8 concludes. Appendix A presents additional results. In this paper I study a version of the SEIR model that includes dead among its compartments. Similar models have been used in epidemiology by Chowell et al. (2007) , Lin et al. (2020) , Wang et al. (2020) , and others. More advanced versions of the model with more compartments are considered in Chowell et al. (2003) and Chowell et al. (2006) . I consider a model with five groups of people: susceptible (S), exposed (E), infectious (I), recovered (R), and dead (D). Susceptible are those who have not gotten the virus yet and can become infected. Exposed are those who have gotten the virus but cannot transmit it to others yet. This corresponds to the so called incubation period. Infectious are those who have the virus and are contagious. Recovered are those who were sick in the past but have recovered from the virus. Dead are those who have died because of the virus. Including the exposed compartment in the model is important because, according to the CDC, COVID-19 involves an incubation period of up to 14 days. 1 As a result, the SEIRD model should reflect the progression of the epidemic more accurately than a simpler SIRD model that does not include an incubation period. The number of people in different groups evolves over time as follows: N is the population size of a given country or region. I assume that it is fixed and does not vary over time. I could model the dynamics of the population size to account for the fact that some people die from the disease, but then I would also need to model births and deaths due to other causes. In order to avoid these complications, I simply fix N , as is commonly done in the literature. C(t) is the cumulative number of cases confirmed. It does not affect the model dynamics but is used to match the model to the confirmed cases data. In my main analysis, I assume that it is the people who are infectious, rather than exposed, who are tested for the virus. In my robustness checks, I replace I(t) in equation (2.6) with E(t) and find that the results remain virtually unchanged. The evolution of the SEIRD model depends on several parameters. I will refer to γ and σ as epidemiologic parameters. The parameter γ reflects the estimated duration of illness. Its estimates in the literature vary from 1 /18 (e.g. Wang et al. (2020) The parameter α is the infection fatality rate (IFR). As discussed in Korolev (2020), the IFR has serious limitations and heavily depends on the composition of people who get sick. The initial conditions for the number of recovered and dead are R(0) = 0 and D(0) = 0. I need to pick the initial number of infectious I(0) and exposed E(0). I discuss their choice later in the paper. Finally, the initial number of susceptible people is In my estimation, I use the deaths and confirmed cases data for COVID-19. Atkeson (2020a), written concurrently and independently of this paper, attempts to answer the question similar to mine in the context of the usual SIR model. Manski and Molinari (2020) discuss identification problems in estimating the COVID-19 infection rate, but they do not consider SIR type models studied in this paper. In the econometrics literature, identification studies whether the parameters of the model would be known if the researcher knew the population that data is drawn from (see, e.g., Lewbel (2019)). In the context of this paper, the question is somewhat different: if we ob-cases paths from models with different parameter values and investigate whether these paths are identical instead of studying identification theoretically. I present a more rigorous treatment of identification in the simplified SIRD, rather than SEIRD, model in Supplementary Appendix A.1 and obtain similar results. First, I assume that epidemiologic parameters γ and σ are known constants and study identification of the remaining parameters. The parameters of the model then include the basic reproduction number R 0 , the infection fatality rate α, the fraction λ of all cases that is reported, and the initial conditions: the number of infectious people I(0) = I 0 , the number of exposed people E(0) = E 0 , and the time T 0 that has passed since the epidemic started. For instance, T 0 = 1 means that the epidemic just started and the initial values (E 0 , I 0 ) correspond to the first period we observe. T 0 = 2 means that the epidemic started last period and the current period corresponds to (E(1), I(1)). T 0 = 10 means that the epidemic started 9 periods ago from values (E 0 , I 0 ) and the current values are (E(9), I(9)). I denote the vector of parameters θ = (R 0 , α, λ, E 0 , I 0 , T 0 ) and study whether these parameters can be identified based on the short run (say, 60 days) data. The upper panel of Figure 1 plots the simulated paths of deaths and confirmed cases for three sets of parameters: θ 1 = (5, 0.01, 0.2, 2, 2, 2), θ 2 = (5, 0.005, 0.1, 4, 4, 2), and θ 3 = (5, 0.004, 0.08, 2, 2, 10). The first two sets of parameters share the same start date and reproduction number, but differ in the initial values and the values of α and λ. Essentially, the epidemic that corresponds to the second set of parameters just scales the first epidemic up by a factor of two, but cuts the fatality rate and the reported fraction of cases in half. As a result, these two epidemics are indistinguishable in the short run. In other words, we cannot tell from the short run data whether we observe a large epidemic with a low fatality rate and large number of unreported cases, or a small epidemic with a high fatality rate and small number of unreported cases. The third epidemic starts from the same values of (E 0 , I 0 ) as the first one, but nine periods ago instead of last period. At the same time it reduces the fatality rate and the 8 J o u r n a l P r e -p r o o f observable fraction by a factor of 2.5. It produces more cases in the current period than the first epidemic, but a smaller fraction of them is reported and a smaller fraction leads to death. As a result, the third epidemic is indistinguishable from the first one. While the three sets of parameters are indistinguishable in the short run, the middle panel of Figure 1 shows that the resulting epidemics lead to very different long run deaths forecasts. The epidemic with the highest fatality rate α will result in about twice as many deaths as any of the other two epidemics. Next, I study identification of R 0 . The bottom panel of Figure 1 shows that R 0 affects the curvature of the deaths and reported cases curves, while other parameters only tilt it around the origin. As a result, R 0 can be uniquely identified from the curvature of deaths and confirmed cases. while changes in the reproduction number R 0 change the slope of the lines. Thus, R 0 can be identified from the slope of the log series, but the remaining parameters cannot be separately identified. Because one cannot separately identify α, λ, E 0 , I 0 , and T 0 , I set T 0 = 1 and I 0 = 0. In my empirical analysis, I will generally set E 0 = 1, unless it leads to computational issues. J o u r n a l P r e -p r o o f and reported cases respectively, but they have no effect on the model dynamics. Finally, the initial values E 0 and I 0 affect the timing of the model, but to a much smaller extent than the value of R 0 . Thus, if we are interested in modeling the evolution of the epidemic and its burden in terms of the number of deaths, the primary parameters of interest are R 0 and α, while the remaining model parameters can be viewed as nuisance parameters. In this section, I turn to estimation of the SEIRD model. In order to rationalize the model with the observed data, one needs to introduce errors in the model. There are several possible ways to introduce errors, and they can lead to different estimation approaches. First, one could introduce errors directly in the cumulative numbers. They would be given by where E[(ε D (t), ε C (t))] = 0. D(t, R 0 , α) and C(t, R 0 , λ) denote the cumulative number of deaths in the model, 10 while D obs (t) and C obs (t) denote the cumulative number of deaths and reported cases observed in the data. Because the errors ε D (t) and ε C (t) enter the cumulative equations, they would likely exhibit strong autocorrelation. One could introduce errors that are possibly independent over time by modeling the daily numbers of deaths and reported cases. Denote by ∆A(t) the first differences in the series Then ∆D obs (t) and ∆C obs (t) would correspond to the observed daily number of deaths and reported cases. One could model them as In fact, the model in equations (5.1) and (5.2) can be viewed as the model in equations (5.3) and (5.4) if ε D (t) = t s=1 η D (t) and ε C (t) = t s=1 η C (t). Alternatively, one could assume that additive errors enter the equations for the logarithms of the daily numbers of deaths and reported cases rather than their levels: i.e. the errors in the daily numbers are multiplicative. Yet another estimation approach introduces errors in the logarithms of the cumulative number of deaths and reported cases: log(D obs (t)) = log(D(t, R 0 , α)) + ζ D (t), (5.9) log(C obs (t)) = log(C(t, R 0 , λ)) + ζ C (t), There is no agreement on the choice of the model and estimation method in the literature. Chowell et al. (2007) J o u r n a l P r e -p r o o f All these papers use data either on the number of reported cases or on the number of deaths, but not both. In contrast, I estimate the model using both series simultaneously. All four models (5.1)-(5.2), (5.3)-(5.4), (5.5)-(5.6), and (5.9)-(5.10) can be viewed as versions of the nonlinear SUR model (see, e.g., Gallant (1975) and Chapter 9 in Davidson and MacKinnon (1993) ). This model has the following form: . Then the objective function is given by where W is a 2 × 2 weighing matrix. Several choices of W are possible. The simplest possible choice is W = I, a 2 × 2 identity matrix, which assigns the same weight to both equations. Another possibility is Yet another choice is given by whereŨ (t) = (ũ 1 (t),ũ 2 (t)) , andũ j (t) are the same as above. This choice of the weighting matrix accounts not only for possibly different variances of the error terms, but also for the correlation between them. Finally, if one suspects heteroskedasticity, one could use the objective function where W (t) is an estimate of the variance of U (t) that is allowed to vary over t. Because estimating W (t) may be somewhat tricky, I do not consider this class of estimators in this paper. While inference is also beyond the scope of this paper, I note that there are several interesting directions for future research. First, one could study how to conduct inference on the model parameters, e.g. the basic reproduction number R 0 . While there are certain methods that would seem reasonable, such as the asymptotic SUR standard errors or the bootstrap (as in Chowell et al. (2007) ), one would need to carefully account for partial identification of model parameters, possible heteroskedasticity and autocorrelation in the errors, and for the fact that the estimates of α and λ may be close to the boundary of the parameter space (0 or 1). Second, one could be able to develop specification tests based on the difference between different estimates, e.g. with different weighting matrices, along the lines of Hausman (1978) and White (1981). In this section I investigate the performance of different estimation methods described above in a number of Monte Carlo studies. I consider several data generating processes In other words, where the star superscript denotes simulated values. Thus, and E[∆C * (t, R 0 , λ)] = ∆C(t, R 0 , λ), so the errors in the simulated daily deaths and reported cases have mean zero by construction. Finally, I reconstruct the cumulative series D * (t, R 0 , α) and C * (t, R 0 , λ) and estimate the parameters using all of the methods discussed above. In the DGP, the initial values are E 0 = 2, I 0 = 0. In the estimation, I try several fixed choices of E 0 as well as attempt to estimate E 0 and I 0 . When I estimate E 0 and I 0 , I restrict my attention to the approaches based on cumulative numbers, because the approaches based on daily numbers turn out to be prone to numerical issues. 11 Tables 1-4 report the mean, bias, standard deviation, and MSE of the estimates of R 0 across 500 simulation draws. I consider two estimation approaches when logarithms are used: one trims all observations with D obs (t) ≤ 25 or C obs (t) ≤ 75, another one uses all observations with D obs (t) > 0 and C obs (t) > 0. I refer to the former approach as "Trim" and to the latter as "No Trim" in the tables. 11 Suppleementary Appendix A.2 presents some additional computational details. 14 J o u r n a l P r e -p r o o f As we can see, the estimation methods based on taking the logarithms without trimming have the largest bias and standard deviation, and as a result the worst MSE. Among the remaining four approaches, the ones based on cumulative numbers (or their logarithms) tend to outperform the ones based on the daily numbers (or their logarithms). Overall, it appears that estimation based on the cumulative numbers, done in levels rather than logarithms, leads to the lowest MSE of the estimates of R 0 , and the estimation results are pretty insensitive to the choice of the weighting matrix W . Using incorrect initial conditions (e.g. E 0 = 16 instead of E 0 = 2) has almost no effect on the quality of resulting estimates. It introduces small bias but does not affect the standard deviation of the estimates. Because in most cases the variance term dominates the squared bias term, the MSE remains almost unchanged. Moreover, in terms of the MSE, it is actually better to use an incorrect fixed initial condition rather than to estimate it from the data. Estimating E 0 and I 0 may reduce bias, but increases variance and hence leads to slightly worse MSE. Table 5 presents the mean of the estimates of the fatality rate α for different initial conditions. In line with the identification results, when E 0 increases (decreases), the estimates of α decrease (increase) proportionally. For instance, the mean of the estimates of α for E 0 = 2 is twice as large as for E 0 = 4, but is smaller by a factor of two than for E 0 = 1. Next, I study the robustness of the estimate of R 0 to a particular form of model misspecification. In the true DGP, I replace equation (2.6) with dC(t) dt = λσE(t), so that the reported cases are based on the number of exposed rather than infectious. However, I estimate the model as if it was generated by equations (2.1)-(2.6). Table 6 reports the results. As we can see, this form of misspecification has no noticeable effect on the estimates of R 0 . Another DGP I consider is based on the model in equations (2.1)-(2.6) but involves a different way of introducing errors. I focus on the model in logarithms of the daily numbers and simulate the data as in equations (5.5)-(5.6), where ν D (t) and ν C (t) are both N (0, 0.025 2 ). I then compute ∆D * (t, R 0 , α) and ∆C * (t, R 0 , λ) by taking exponents and construct the cu-15 J o u r n a l P r e -p r o o f Journal Pre-proof mulative series accordingly. Unlike the previous DGPs, which produced integer values as a result, this DGP will generally produce the number of deaths and reported cases that are not integers. In addition to this DGP I also consider its modified version that rounds the number of daily deaths and reported cases to the nearest integer. The true initial values are E 0 = 2, I 0 = 0. In estimation, I use these fixed values as well as attempt to estimate the initial values. Tables 7-10 report the mean, bias, standard deviation, and mean squared error of the estimates of R 0 across 500 simulation draws. The upper two panels of each Overall, based on the simulation results, methods based on the cumulative numbers seem to outperform methods based on daily values. Relative performance of different estimation approaches (e.g. based on levels versus logarithms) depends on the true DGP. One advantage of the approach based on levels is that it does not involve trimming. The use of the SUR approach to estimation with the efficient weighting matrix may be motivated by the desire to obtain more efficient parameter estimates. Interestingly, this is This section presents the empirical results. Before I move on to the main results, I discuss computational issues associated with estimation of the model. Based on the arguments from the previous sections, when initial parameters change, the estimate of R 0 should remain virtually unchanged, while the estimates of α and λ should change proportionally to the changes in initial parameters. In practice, however, this is not always the case. Both α and λ are constrained to lie between 0 and 1, and when these constraints are binding or close to binding, changes in the initial values can have a substantial effect on the estimate of R 0 . Tables 11 and 12 compare the estimation results for USA and California for the values of epidemiologic parameters σ = 1 /4 and γ = 1 /10 when the initial conditions change from The parameter estimates for California behave as expected: the estimate of R 0 remains virtually unchanged, while the estimates of α and λ are cut in half. However, for the US as a whole the picture is different. The estimates of R 0 change substantially as the initial conditions change, while the changes in the estimates of α and λ do not follow the expected pattern. For instance, when the model is estimated in levels, the estimates of α and λ remain virtually unchanged as the initial condition changes. This example illustrates that when the parameter estimates are close to the boundary, changes in initial conditions may lead to somewhat unexpected estimation results. J o u r n a l P r e -p r o o f epidemiologic parameters γ and σ, I consider several scenarios in the table: "fast" with σ = 1 /3, γ = 1 /5, "medium" with σ = 1 /4, γ = 1 /10, and "slow" with σ = 1 /5, γ = 1 /18. I also consider a modified version of the SEIRD model that replaces β S(t) N I(t) in equations (2.1) and (2.2) with β S(t) N (I(t) + qE(t)) for q = 0.5. This version of the model assumes that people in the exposed compartment can be contagious, but to a lesser extent that people in the infectious compartment. While these exact choices are somewhat arbitrary, considering several values of epidemiologic parameters instead of just one allows me to better understand their effect on the estimation results and forecasts. The initial conditions are E 0 = 1, I 0 = 0 for the US and California and E 0 = 10, I 0 = 0 for Japan. The choice of the initial conditions for Japan is tricky because the estimate of λ is at the upper bound of 1, and the choice of initial conditions affects the estimate of R 0 . Given that Bommer and Vollmer (2020) estimate that the detection rate in Japan in March was around 20-25%, even the initial condition E 0 = 10 may be too low. As we can see from the table, the estimates of R 0 for a given country or region change a lot as epidemiologic parameters change. For example, the estimate of R 0 for the US ranges from under 5 in the "fast" scenario to around 15-18 in the "slow" one. Moreover, for the fixed values of epidemiologic parameters γ and σ, the estimates of R 0 differ a lot between regions. For instance, in the "medium" scenario with σ = 1 /4 and γ = 1 /10, the estimates of R 0 vary from 2.6 for Japan to around 9.5 for the US. While the magnitude of these differences might be surprising, heterogeneity itself is not. If various mitigation or suppression policies can reduce R 0 , then one could expect that different countries have different values of R 0 due to the differences in their approaches to dealing with COVID-19, as well as differences in social norms, population density, etc. demonstrates that the predicted total number of deaths from the COVID-19 epidemic in the four models ranges from around 70 thousand to more than 300 thousand. Next, Figure 5 fixes the values of epidemiologic parameters σ = 1 /4 and γ = 1 /10 and considers the pessimistic and optimistic scenarios, given by different initial conditions, for which the resulting models are observationally equivalent. The pessimistic scenario corresponds to the initial condition E 0 = 1, I 0 = 0, while the optimistic scenario corresponds to E 0 = 30, I 0 = 0. Intuitively, the lower the initial values, the lower the cumulative number of people who have had the virus, the higher the estimated fatality rate, and the higher the forecasted death toll. We can see that different initial conditions lead to observationally equivalent models in the short run. However, there are large differences in estimates of unobserved variables and in long run forecasts. For instance, the model with E 0 = 1, I 0 = 0 estimates that the number of people with COVID-19 (here I count people both in the exposed and infectious compartments) in California on March 22 was around 50 thousand and predicts over 150 thousand deaths in the long run. In contrast, the model with E 0 = 30, I 0 = 0 estimates that there were around 1.6 million people with COVID-19 and predicts less than 5 thousand deaths in the long run, a 32-fold difference. Next, I demonstrate empirically why it is important to allow the fraction of all cases that is reported to differ from one. Figure 6 presents the results for different estimates of R 0 in the "medium" scenario with σ = 1 /4 and γ = 1 /10. One model corresponds to the pessimistic case scenario from Figure 5 withR 0 = 5.06 and E 0 = 1. The forecasted long run number of 19 J o u r n a l P r e -p r o o f Journal Pre-proof deaths for that model is around 150 thousand. The remaining two models estimate R 0 from the confirmed cases data assuming that all cases are reported, i.e. λ = 1, and then recover α from the deaths data. The resulting estimates of R 0 are lower, 4.18 when the initial condition is I 0 = 0, E 0 = 1, and 2.90 when the initial condition is I 0 = 10, E 0 = 10. The estimates of R 0 , α, and λ for these models are reported in Table 14 . As we can see from the upper panel of the figure, the resulting models, especially the latter one, provide a poorer fit of the observed data: they cannot generate enough curvature because of the low R 0 . The bottom panel of Figure 6 shows that the forecasted long run number of deaths from both these models is over 600 thousand, a lot higher than in the pessimistic model with higher R 0 that fits the data well. Thus, estimating R 0 based on the confirmed cases data under the assumption that all cases are reported leads to the downward bias in the estimate of R 0 , poor fit of the observed data, and severe overestimation of the long run number of deaths. Finally, I study whether additional information can help calibrate the initial conditions using Iceland as an example. Iceland is an interesting country to study because it was among the first countries to launch wide-scale random, or nearly random, testing of its population. 12 While it is debatable whether testing in Iceland is completely random, my goal here is to demonstrate how information from these tests could in principle be used to calibrate the initial values. I fix the epidemiologic parameter values at σ = 1 /4 and γ = 1 /10. Thus, the use of additional information leads to a more than 4-fold reduction in the range of forecasted deaths for observationally equivalent models. This result demonstrates the value of auxiliary information that becomes available due to random testing. In this paper, I show that the SEIRD model for COVID-19 is poorly identified from the short run data on deaths and reported cases. There can be many different models that are indistinguishable in the short run but result in markedly different long run forecasts. For instance, the forecasted number of deaths in California in observationally equivalent models ranges from under 5 thousand to over 150 thousand. Thus, this paper highlights that long run forecasts for COVID-19 heavily depend on arbitrary choices made by the researcher. Available data cannot be used to determine which model is correct because there are many models that look identical in the short run. Finally, I demonstrate that auxiliary information from random tests for COVID-19 can help calibrate the initial values of the model and reduce the range of possible forecasts that are consistent with the observed data. Random, or nearly random, tests were conducted in Iceland, and utilizing the information from these tests leads to a more than 4-fold reduction in the range of the forecasted number of deaths. The model I consider is fairly simplistic and does not take into account important factors such as possible overloading of the health care system, mitigation efforts, behavioral responses to the epidemic, etc. There are more sophisticated and realistic epidemic models that may be able to predict the spread of COVID-19 and the long run number of deaths better than the model studied here. However, those models usually have even more parameters, so one may worry that their identification would be even more troublesome. The table presents the mean of the estimates of R 0 across 500 simulation draws for different estimation methods. The true value is R 0 = 5.755. The estimates in the first column are based on the deaths data only. The estimates in the second column are based on the reported cases data only. The estimates in the remaining three columns combine the deaths and reported cases data and use different weighting matrices. Journal Pre-proof The table presents the standard deviation of the estimates of R 0 across 500 simulation draws for different estimation methods. The estimates in the first column are based on the deaths data only. The estimates in the second column are based on the reported cases data only. The estimates in the remaining three columns combine the deaths and reported cases data and use different weighting matrices. J o u r n a l P r e -p r o o f J o u r n a l P r e -p r o o f The table presents the mean of the estimates of α 0 across 500 simulation draws for different estimation methods. The true value is α = 0.0067. The estimates in different columns combine the deaths and reported cases data and use different weighting matrices. J o u r n a l P r e -p r o o f Journal Pre-proof The table presents the mean, bias, standard deviation, and MSE of the estimates of R 0 across 500 simulation draws when λγI(t) in equation (2.6) is replaced with λσE(t). The estimates in the first column are based on the deaths data only. The estimates in the second column are based on the reported cases data only. The estimates in the remaining three columns combine the deaths and reported cases data and use different weighting matrices. The table presents the average bias of the estimates of R 0 across 500 simulation draws for different estimation methods. The estimates in the first column are based on the deaths data only. The estimates in the second column are based on the reported cases data only. The estimates in the remaining three columns combine the deaths and reported cases data and use different weighting matrices. Journal Pre-proof The table presents the estimates of R 0 , α, and λ for γ = 1 /10, σ = 1 /4, and different initial values. The table presents the estimates of R 0 , α, and λ for γ = 1 /10, σ = 1 /4, and different initial values. The upper panel shows the short run number of deaths and reported cases for three sets of parameters θ 1 = (5, 0.01, 0.2, 2, 2, 2), θ 2 = (5, 0.005, 0.1, 4, 4, 2), and θ 3 = (5, 0.004, 0.08, 2, 2, 10), where θ = (R 0 , α, λ, E 0 , I 0 , T 0 ). The middle panel shows the long run forecasts from these models. The lower panel fixes the initial conditions and shows the short run number of deaths and reported cases for (R 0 , α, λ) = The upper panel shows the logarithms of the short run number of deaths and reported cases for three sets of parameters θ 1 = (5, 0.01, 0.2, 2, 2, 2), θ 2 = (5, 0.005, 0.1, 4, 4, 2), and θ 3 = (5, 0.004, 0.08, 2, 2, 10), where θ = (R 0 , α, λ, E 0 , I 0 , T 0 ). The lower panel fixes the initial conditions and shows the logarithms of the short run number of deaths and reported cases for (R 0 , α, λ) = (5, 0.01, 0.2), (3, 0.01, 0.2), and (3, 0.05, 0.8). J o u r n a l P r e -p r o o f Journal Pre-proof The The upper panel shows the fit of the actual cumulative deaths and reported cases by models with four different values of epidemiologic parameters σ and γ. The middle panel shows the fit of the logarithms of the actual cumulative deaths and reported cases for the same four models. The lower panel shows the forecasts from the same four models. J o u r n a l P r e -p r o o f Journal Pre-proof The upper panel shows the fit of the actual cumulative deaths and reported cases by models with different initial conditions. The middle panel shows the fit of the logarithms of the actual cumulative deaths and reported cases by these models. The lower panel shows the forecasts from these models. The values of epidemiologic parameters are σ = 1 /4, γ = 1 /10. J o u r n a l P r e -p r o o f Journal Pre-proof The upper panel shows the fit of the actual cumulative deaths and reported cases by models with and without underreporting and different initial conditions. The middle panel shows the fit of the logarithms of the actual cumulative deaths and reported cases by these models. The lower panel shows the forecasts from these models. J o u r n a l P r e -p r o o f Journal Pre-proof The figure presents the results for Iceland. The left panel does not use any additional information. The right panel matches the number of active COVID-19 cases on March 21 to the one estimated based on testing a random sample of population. The upper panel shows the cumulative deaths fit by models with different initial values E 0 . The middle panel shows the cumulative reported cases fit by these models. The lower panel shows the deaths forecasts from these models. In this section, I study an approximate solution of the simplified SIRD model to illustrate which model parameters can be identified from the data. The model is: During the early stages of the epidemic, S(t)/N ≈ 1, so that the equation for the evolution I(t) is, approximately, dI(t) dt ≈ (β − γI(t)) = γ(R 0 − 1)I(t) The solution is given by I(t) = I(0) exp(γ(R 0 − 1)t). It is possible to show then that the approximate solutions for D(t) and C(t) are given by Because D(t) and C(t) depend on α, λ, and I(0) through the products αI(0) and λI(0), α and λ cannot be identified separately from I(0). R 0 can be identified for a fixed value of γ, but it cannot be identified separately from γ in general: one can increase R 0 and decrease γ, so that γ(R 0 − 1) remains unchanged, and adjust α and λ accordingly so that α R 0 −1 I(0) and λ R 0 −1 I(0) remain unchanged. J o u r n a l P r e -p r o o f Finally, note that for a large enough t, exp(γ(R 0 − 1)t) >> 1, so that log(exp(γ(R 0 − 1)t) − 1) ≈ log exp(γ(R 0 − 1)t) = γ(R 0 − 1)t. Thus, for a large enough t, log D(t) ≈ γ(R 0 − 1)t + log α + log I(0) − log(R 0 − 1) (A.8) log C(t) ≈ γ(R 0 − 1)t + log λ + log I(0) − log(R 0 − 1) (A.9) These equations demonstrate that the log series are approximately linear in t and that R 0 and γ affect the slope of the log series, while α, λ, and I(0) only affect the level. In this section, I describe some of the computational challenges associated with estimating the SEIRD model. Estimation consists of two steps. First, I simulate the model for a given choice of R 0 , α, and λ using the DifferentialEquations package in Julia, developed by Rackauckas and Nie (2017) . More specifically, the routine takes the model (2.1)-(2.6) and simulates the paths of S(t), E(t), I(t), R(t), D(t), and C(t) for the given values of R 0 , α, and λ. Next, I compute the difference between the modeled and observed quantities (cumulative or daily, in levels or logarithms) and minimize the appropriate nonlinear objective function using the Optim package. 14 I use the Nelder-Mead algorithm to find the solution, as I found that it outperforms other algorithms available in Optim, such as simulated annealing or particle swarm. The Nelder-Mead algorithm in Optim does not allow for bounds on the parameters. Because R 0 is constrained to be nonnegative and α and λ are constrained to lie between 0 and 1, I use a reparametrization to ensure that all parameter estimates satisfy the constraints. Namely, I write R 0 = exp(x 1 ), α = exp(x 2 )/(1 + exp(x 2 )), λ = exp(x 3 )/(1 + exp(x 3 )), where (x 1 , x 2 , x 3 ) are the parameters of the routine. The choice of the parameters R 0 = 5.75, α = 0.0067, λ = 0.12, used in simulations, corresponds to (x 1 , x 2 , x 3 ) = (−5, 1.75, −2). Optimal Targeted Lockdowns in a Multi-Group SIR Model What Will Be the Economic Impact of COVID-19 in the US? Rough Estimates of Disease Scenarios Policy Implications of Models of the Spread of Coronavirus: Perspectives and Opportunities for Economists An SEIR Infectious Disease Model with Testing and Conditional Quarantine Average detection rate of SARS-CoV-2 infections has improved since our last estimates but is still as low as nine percent on March 30th Transmission dynamics of the great influenza pandemic of