key: cord-0814036-y6aa1btx authors: Zhai, Z.-M.; Long, Y.-S.; Kang, J.; Li, Y.-L.; Zeng, L.; Han, L.-L.; Lin, Z.-H.; Zeng, Y.-Q.; Wu, D.-Y.; Tang, M.; Xu, D.; Liu, Z.; Lai, Y.-C. title: State-by-State prediction of likely COVID-19 scenarios in the United States and assessment of the role of testing and control measures date: 2020-04-29 journal: nan DOI: 10.1101/2020.04.24.20078774 sha: 4ca005a460e7c8749e22be23cfb4b480e4314adb doc_id: 814036 cord_uid: y6aa1btx Due to the heterogeneity among the States in the US, predicting COVID-19 trends and quantitatively assessing the effects of government testing capability and control measures need to be done via a State-by-State approach. We develop a comprehensive model for COVID-19 incorporating time delays and population movements. With key parameter values determined by empirical data, the model enables the most likely epidemic scenarios to be predicted for each State, which are indicative of whether testing services and control measures are vigorous enough to contain the disease. We find that government control measures play a more important role than testing in suppressing the epidemic. The vast disparities in the epidemic trends among the States imply the need for long-term placement of control measures to fully contain COVID-19. On about April 3, 2020, the United States registered the largest number of confirmed patients with the 2019 novel coronavirus (COVID- 19) among all countries in the world: over 260,000 with the number of deaths exceeding 6,600. As of April 16, there had been over 641,000 cases and over 31,000 deaths. This development is rather astonishing, considering that there were only a few confirmed cases two months ago, when a massive outbreak in China had occurred. In fact, from the end of January, for various reasons the number of reported cases in the US increased quite slowly for a stretched period of time. The onset of an exponential increase in the US occurred in the middle of March, where it became apparent that COVID-19 began the phase of community spreading. In response, the White House issued a nationwide social-distancing order on March 16. Statewide stay-at-home or shelter-in-place orders were given by the governors of various States at different time (at the time of writing, there are still seven States that have not issued such an order). To quantitatively predict the effectiveness of the federal and State government measures to control COVID-19 spreading in the United States is utterly urgent. (Here we use "State" to denote a State in the US to distinguish it from a "state" as in an epidemic state.) The United States differs from other countries in that the circumstances under which COVID-19 spreads vary dramatically among different States: not only are the levels of travel restriction orders dissimilar, but other factors affecting the disease spreading such as the population, medical resources, and social/political attitudes are also distinct among the States. A quantitative assessment of the effects of the control measures taken by the government to contain the COVID-19 pandemic thus needs to be carried out on a State-by-State basis. A complication is that each individual State is not a closed system: people move into and out of the State on a daily basis. ison with the available data after March 29 enables us to pin down the specific scenario(s) for each State, attesting to the predictive power of the model. The predicted scenarios vary among the States and they are indicative of whether testing services and control measures are vigorous enough to fully contain the spreading of COVID-19. Systematic simulations reveal that, while sufficient testing can be beneficial, the control measures play a more important role in suppressing the epidemic, where strict government actions can reduce the epidemic duration and size in an exponential manner. An alarming result is that the length of the epidemic duration can differ by as much as one year among the States, posing a grave challenge to the government efforts in suppressing the pandemic and implying the continuous need for long-term and vigorous enforcement of the current control measures. RESULTS We present prediction results from our generalized, dual coupled system, five-state model (described in Methods) for ten States: New York (the current epicenter), Washington (the previous epicenter), New Jersey, California, Michigan, Florida, Illinois, Massachusetts, Louisiana, and Arizona. (Source of data -Center for Systems Science and Engineering at Johns Hopkins University [25] ). State-wise, up to March 29, the first nine States in this list had the top nine largest numbers of confirmed cases among the 50 States. The last State in the list, Arizona, is the home State of one of the co-authors. We use the daily number of confirmed cases up to March 29 to estimate the model parameters. For each State, we generate a number of epidemic scenarios and use the empirical data after March 29 to determine the most likely one(s). Detailed results for Arizona, Washington, and New York are described below, while the results for the remaining seven States are presented as SNs. In all the cases, each State is an open system with the influences of the rest of the country treated as a perturbation. Predictions of the epidemic trend of the entire country are also included as an SN. (Because the US is effectively a closed system, in this case the generalized coupling model is not necessary; instead, the single system, five-state model that we have developed recently [24] suffices.) For quantitatively assessing the effects of interstate travel on the epidemic through comparison, we also include results from the closed system model for two States: New York and Arizona. In addition to predicting the most likely epidemic scenario(s) for each State, we seek to quantitatively assess the impacts of the government testing capability and control measures such as the nationwide social-distancing and the statewide stay-at-home or shelter-in-place orders that result in an exponential decrease in the human social and movement activities. Quantitatively, for any State, the collective effects of these measures can be described by two parameters: the exponential decay rates of the activities associated with interstate and intrastate travel, denoted by l inter and l intra , respectively, where a larger rate corresponds to more stringent control measures. The values of the two parameters characterizing the control measures can be estimated based on the available epidemic data [24] . Another key parameter is the fraction of undocumented infections, denoted as h, the role of which in the COVID-19 epidemic has been studied recently in the framework of the closed system, five-state model [24] . The value of h is determined by the testing and surveillance capability of the State. For clarity, we organize our results in terms of different combinations of l inter , l intra , and h, leading to distinct epidemic scenarios under government imposed control measures. The human movement data for the populations into and out of a given State suggest that the interstate and intrastate activity decay rates are approximately equal (Methods): l inter ⇡ l intra ⌘ l. For choosing reasonable values of l, we use data from China to gain insights. In particular, the control measures imposed by the Chinese government were extremely stringent, where the human social activities are reduced to about 20% of the normal level after seven days of lockdown [24] , giving rise to l ⇡ 0.24. In the US, the government measures are not as stringent as in China, so we set l = 0.18, corresponding to a seven day reduction of about 30% in the social activities from the beginning of the implementation of the control measures. For comparison, we also simulate the hypothetical case of more strict control measures: l = 0.24. To set the possible values of h for New York and other States in the US, we note that in China, due to the widespread government actions with extensive and sufficient testing and surveillance capabilities, most cases of infection were claimed to have been detected and reported, leading to an extremely low value of h: h = 0.012 as reported by the Chinese government (a recent work [26] reporting a substantial fraction of undocumented infections based on data from the early phase of the epidemic in China notwithstanding). In the US, due to lack of sufficient testing, the value of h can be much larger. While it is not possible to obtain an accurate estimate of the value of h for the US, it is not unreasonable that it can be at least 50%. We thus simulate two situations: h = 0.5 and 0.8. Taken together, for each of the ten States and the US as a whole, we carry out systematic simulations for four combinations of (h, l): (0.5, 0.18), (0.5, 0.24), (0.8, 0.18), and (0.8, 0.24). We focus on the time evolution of three quantities that are the key dynamical variables of our coupled five-state model (Methods): (i) the population of the individuals in the "hidden" state who have been infected but who are asymptomatic or show only mild symptoms, denoted as H(t), (ii) the population of infected individuals who show clear symptoms, denoted as I(t), and (iii) the population of confirmed individuals, denoted as J(t). In general, H(t) and I(t) exhibit a "humped" structure: with time they increase from zero, reach a maximum, and then decay to zero. The inflection point occurs when I(t) reaches maximum, and the epidemic is deemed over when I(t) has approached zero. The number of confirmed cases, J(t), is a non-decreasing function of time and reaches a constant value as I(t) approaches zero. The epidemic duration is measured by the date on which the H-state and I-state populations reach zero. The epidemic size can be characterized by the constant value of the final number of confirmed cases. To make quantitative predictions, it is necessary to have the values of the initial number of individuals in the hidden state and the infection rate, i.e., the probability for an individual in the susceptible state to switch to the H state, which can be estimated through an optimization procedure [24] . The values of the two parameters so obtained are denoted as H ⇤ (0) and b ⇤ . The starting date of COVID-19 epidemic in Arizona was March 5 (excluding a few sporadic cases before this date). The State population is about 7.28 millions. Figure 1 (a) shows that, for h = 0.5 and l = 0.18, the inflection point should occur on April 5 with the peak I value of about 2,600 and the epidemic will end around September 20 with 5,200 final confirmed cases. For the same value of h but l = 0.24, the inflection point would occur three days earlier with the peak I value reduced by about 600, as shown in Fig. 1(b) . In this case, the epidemic will be over three months earlier: on June 20, with 2,800 final confirmed cases -a reduction of about 46%. The results for h = 0.8 and l = 0.18 are shown in Fig. 1 about 8,800 final confirmed cases (one fifth of the actual infections). For h = 0.8 but l = 0.24, the inflection point would occur around April 3 with the peak I value of 2,100, as shown in Fig. 1(d) . Comparing with the case in Fig. 1(b) where tests are more extensive (h = 0.5) but the restriction measures are more strict (l = 0.24), we see that, even when the test scale is significantly reduced (h = 0.8), if more stringent control measures are imposed, the epidemic would only be slightly worse: delay of inflection by one day, increase in the peak I value by 100, one month longer in duration, and an approximate 17% increase in the number of final confirmed cases. Comparing with the real data of daily number of confirmed cases, we see that scenario (c) is the most likely current scenario for Arizona. Overall, the results for Arizona indicate the need to impose the 5 All rights reserved. No reuse allowed without permission. (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted April 29, 2020. . strictest possible control measures, especially when extensive testing is not available. If Arizona was closed to traffic into and out of the State, simulation results (SN 2) reveal similar effects of the population movements on the epidemic quantities as those for the State of New York: small changes in the inflection occurrence time and peak I value but non-negligible impacts on the epidemic duration and the final infection size. For example, for the most likely current scenario (i.e., h = 0.8 and l = 0.18), interstate travel will make the epidemic about half month longer and increase the final number of confirmed cases by about 1,200. This means that, even for Arizona where the epidemic is relatively mild, interstate travel poses a not-so-insignificant burden in State's efforts to control the epidemic. The State of Washington is unique because the first case of COVID-19 in the US occurred there and it was the epicenter in the early phase of the epidemic. However, the spreading subsided and stabilized quickly while outbreaks in other parts of the country occurred. Our model correctly predicts this rather unusual behavior, as follows. In January and through most of February, there were a few cases in Washington State, a "nonepidemic" situation to which our model is not applicable. We use the daily number of confirmed cases from February 26 to March 29 to estimate the model parameters. The State population is approximately N = 7.6 ⇥ 10 6 . Figure 2 shows four distinct epidemic scenarios. In particular, Fig. 2 (a) shows that, for h = 0.5 and l = 0.18, the inflection point will occur on about April 3 with the peaked infected population of about 8,000. The epidemic will last through August 2 and the final number of confirmed cases will be about 13,000. For the same value of h (0.5), if the government measures are more stringent (l = 0.24), as shown in Fig. 2(b) , the occurrence of the inflection point will be two days earlier, the peak value of I(t) would be about 7,500, the epidemic would be approximately two months shorter (ending date around June 10), and about 11,000 cases will be finally confirmed -a 15% decrease as compared with the case of l = 0.18 in Fig. 2 (a). Now assume much reduced government testing and surveillance, resulting in h = 0.8. Figure 2 (c) shows that, for l = 0.18, the inflection point will occur on April 4 with the peak I(t) value of 9,000, the ending date of the epidemic will be about December 4, and the final size of the confirmed population will be 17,000 (so altogether 85,000 people would be infected). Now suppose a more restrictive set of measures: l = 0.24. In this case, the inflection point will occur on April 2. Comparing with the case in Fig. 2 (b), we see that the peak I value of 8,500 represents only a small increase, and the epidemic duration will be about one month longer (ending date on July 3) with 12,500 confirmed cases. Comparing the projected J(t) with the current data, we find that Figs. 2(a) and 2(d) represent the most likely scenarios for State of Washington at the present. This is intriguing because the two cases differ markedly in terms of the State testing capability and strength of control measures. Another result is that the epidemic scale of this early epicenter is projected to be at least one order of magnitude smaller than that of the current epicenter New York. Figure 3 presents the predicted outcomes of four scenarios. In Fig. 3(a) , the fraction of undocumented infection is 50% (h = 0.5) and the control measures imposed by the State government are relatively less stringent (l = 0.18). In this case, the inflection point will occur around April 8 when the infected population I(t) is peaked at the value of approximately 300,000. Because the mean time delay from I state to J state is set as seven days, the population of newly confirmed patients will be peaked around April 15, the epidemic will be over on about November 21, and the final number of confirmed cases would be about 550,000. In Fig. 3(b) , h is still 50% but more stringent control measures are assumed: l = 0.24. In this case, the inflection point will occur on about April 5 with the peaked value of I(t) about 240,000. Under the stringent measures, the epidemic will be over around August 1 with the final number of confirmed cases about 330,000. Figure 3 (c) shows that, for h = 0.8 and l = 0.18, the inflection point will occur around April 9 with the peaked infected population 350,000. The final number of confirmed cases would be 750,000 and the epidemic would last into 2021 to end around May 30. Because only 20% of 8 All rights reserved. No reuse allowed without permission. (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted April 29, 2020. . the infections are documented, the actual infected population would be about 3.75 millions. Figure 3(d) shows that, for h = 0.8 and l = 0.24, the inflection point will occur around April 6 with the peaked infected population 270,000. In this case, the epidemic will last until September 6 with the total number of confirmed cases about 400,000. Comparing Fig. 3 (c) with Fig. 3(a) , we see that, if the control measures are not as stringent (l = 0.18), insufficient tests and surveillance leading to an increase in the value of h from 50% to 80% will have devastating consequences: the epidemic duration will be significantly longer (six more months) and the number of confirmed cases will be 36% higher. However, if the government measures are stringent (l = 0.24), as can be seen by comparing Figs. 3(d) with Fig. 3(b) , the same increase in the value of h would result in only one extra day in the arrival of the inflection point, a month delay in the duration, and about 20% increase in the number of confirmed cases. These results indicate that, relative to testing and surveillance, imposing more stringent control measures would be more effective at containing the disease spreading. Checking against the available data to date, we find that none of the four scenarios in Fig. 3 matches the real COVID-19 trend in the State of New York. This is due to a highly localized or "singular" behavior: most infections in the State have occurred in New York City. Indeed, if we treat New York City as an independent "State" system, a good match between some predicted scenario(s) and the empirical data arises. Detailed results are presented in SN 3. Because of the severity of the epidemic in New York, it is of interest to assess the impact of control measures in a systematic manner. Because these measures are imposed by the State government, we consider the State of New York and calculate the epidemic duration and the final sizes of the three key populations versus l. Figure 4 (a) shows, for h = 0.5, the epidemic ending date versus l. It can be seen that strengthening the measures can lead to an exponential shortening in the duration of the epidemic. The corresponding results for h = 0.8 are shown in Fig. 4(b) . Figures 4(c) and 4(d) show, for h = 0.5 and h = 0.8, respectively, the peak H and I populations as well as the final size J of the confirmed cases versus l, where more stringent government measures can lead to an exponential decrease in the final size of the confirmed population. In terms of specific numbers, a loose set of control measures, e.g., l = 0.16, will result in 800,000 (h = 0.5) or 1.1 million (h = 0.8) confirmed cases for the State of New York. However, a stringent set of measures, say l = 0.28, would reduce the corresponding number to about 300,000 for both h = 0.5 and h = 0.8. That is, if the government measures are sufficiently strict, the size of the tested population becomes an insignificant factor determining the epidemic trend. How do the daily population movements into and out of the State affect the epidemic trend? To address this question, we simulate the closed five-state model for the State of New York and compare the results (SN 4) with those in Fig. 3 . We find that the population movements have relatively small effects on the occurrence time of the inflection point and the peak value of the I-state population, but can have a dramatic effect on the epidemic duration and the final size. For example, for h = 0.8 and l = 0.18, the population movements can make the duration four months longer but can reduce the final size of the confirmed population by about 15,000. This means that, without a complete lockdown of the entire State of New York, the travelers out of the State will spread the virus to the rest of the country, leading to a prolonged epidemic duration for the US and the State itself. This conclusion is also supported by simulation results of the closed model but for the entire country (SN 5), which demonstrate a similar epidemic trend as that of New York for every parameter setting tested. The implication is that, while COVID-19 has already begun to spread in other States throughout the country, a complete lockdown of the epicenter would still be beneficial or even necessary. Merely practicing social distancing and imposing stay-at-home order are not sufficient! The results obtained so far are under the assumption that interstate and intrastate travel restrictions generate the same rate of reduction in the social activities: l inter = l intra . In reality the two rates can be different. To systematically assess the effect of this difference, we fix one rate and calculate the epidemic duration and size as a function of the other rate for the States of New York and Arizona. These results justify and demonstrate the benefits for the State government to issue strict stay-at-home or shelter-in-place orders to limit intrastate activities. Predicted scenarios for seven other States: New Jersey, California, Michigan, Florida, Illinois, Massachusetts, and Louisiana are presented in SNs 7-13, respectively. The results for the ten 11 All rights reserved. No reuse allowed without permission. (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted April 29, 2020. . https://doi.org/10.1101/2020.04. 24.20078774 doi: medRxiv preprint States and the US are summarized in Supplementary Table 1 . The number of confirmed cases in the United States has exceeded a quarter of global cases and the epidemic is evolving rapidly in some States. Unlike countries such as China where the orders of the central government are uniformly and strictly followed in the entire country, there is vast heterogeneity among the States in the US in terms of population movements and the government restriction orders, leading to drastically varying epidemic dynamics in different States. As a result, prediction of the COVID-19 epidemics in the US needs to be done in a State-by-State manner. A complication is that, because of the daily interstate population movements, each individual State is effectively an open system, rendering inapplicable the recently developed non-Markovian model for a closed system [24] . We have developed a generalized, non-Markovian epidemic model for COVID -19 in an open setting, taking into account population movements into and out of the system. The main goal is to predict, for each of the ten States selected for this study (the nine States with the most severe COVID-19 epidemics in the early stage, plus the State of Arizona), the most likely epidemic scenario(s) based on the currently available data. In particular, for any given State, we have used the daily number of confirmed cases up to March 29 to estimate the two key model parameters: the initial population of hidden state and the infection rate, enabling the model to generate possible epidemic trajectories into the future. Comparing the trajectories with the available data up to the time of writing (April 12) allows us to identify the most likely epidemic scenario(s) for the State. For all ten States except the State of New York, at least one such epidemic scenario can be identified. The difficulty with New York State has been resolved by noting the existence of a singular and highly localized spot: New York City, into which vast majority of the cases in the State fall, generating an exceptionally high degree of heterogeneity in the State and rendering inaccurate model prediction of the entire State. The situation is similar to attempting to predict the possible epidemic scenarios for the entire country of USA, which is not feasible due to the highly heterogeneous pattern of outbreaks. Indeed, simulation of New York City alone as an independent system has generated two most likely scenarios that match with the data. The difficulty with New York State and our successful resolution have thus provided further justification for the State-by-State prediction approach. All these results have not only validated our model, but also demonstrated its predictive power. The most likely epidemic scenario(s) for any given State is (are) indicative of whether testing services and control measures are vigorous enough to fully contain the spreading of COVID-19, providing guidance for improvement. Another goal is to evaluate the effects of government imposed control measures on the epidemic trends through various scenarios. We have reported results from model predictions of the ten States that differ in their testing and surveillance capabilities. The measures imposed by the States to control COVID-19 also vary, which can be characterized by the exponential rates of reduction in the interstate and intrastate human activities. Our finding is that, with insufficient testing capability leading to a larger fraction of undocumented cases (e.g., 80% versus 50%) and with strict control measures, the occurrence of the inflection point will be delayed for a few days, the peak infected and documented population will increase by 20%, the epidemic duration will be prolonged for as long as half year, and the final infected population can increase by 70%. However, relaxing the control measures can have more devastating consequences. For example, a loose restriction measure (e.g., social-distancing only without stay-at-home order) can lead to a delay in the occurrence of the inflection point by ten days, a 40% increase in the peak infection, epidemic duration of over one year, and a more than five-fold increase in the size of the final infections (documented or undocumented). Simulation results incorporating uncertainties in the prediction still support the above conclusions (SN 13). We also find that imposing intrastate movement restriction can suppress exponentially the epidemic in both its duration and size, regardless of the current epidemic trend of the State. A comparison among the predicted scenarios for different States reveals a difference in the occurrence of the inflection point of about ten days but a stretch in the epidemic duration for as long as one year. Especially, simulations treating the State of New York as an open or a closed system reveal that, if the system is open, the final confirmed cases would be reduced by 15,000 but the duration can be longer by four months. This is because, a fraction of the reduced infected population in New York would diffuse into other States, making the epidemic of the whole country longer. Interstate travel thus poses a significant threat to States such as Arizona, where the epidemic is much less severe. This suggests the necessity of a complete lockdown of New York, the current epicenter, to significantly shorten the epidemic in the US as a whole. Our findings suggest that the duration of such strict lockdown should be no less than one year. To model the COVID-19 epidemic in an open system, three unique features must be considered: (1) undocumented population with no or mild symptoms, (2) non-Markovian state transitions, and (3) time dependence due to human movements in and out of the system. Models developed in the past few months [2] [3] [4] [5] [6] [7] [8] [9] 27] provide insights but do not take into account the three features. If the system is closed, it is only necessary to consider the first two features, resulting in a five-state model [24] . Here we generalize the closed-system model to include time dependence. Figure 6 illustrates the generalized model. An individual can be in one of the five states at each time step: susceptible (S), hidden (H), infected (I), confirmed and isolated (J), and removed (R). The states S, I, and R have the same meanings as in the classical SIR model for infectious disease, but states H and J are unique for COVID-19. In particular, an individual in H has had the virus and is infectious but is asymptomatic or only mildly symptomatic, in contrast to the I state in which individuals show symptoms. The J state contains individuals who are confirmed with COVID-19. Note that, individuals in the I or J state are quarantined or hospitalized, so the probability for them to infect others can be neglected. Individuals in the R state, by definition, are not infectious. Thus, only the H individuals are capable of infecting others [24] . The spreading dynamical process can be described, as follows. Individuals in the S state are infected by H individuals at the rate b and switch to the H state. A fraction h of H-state individuals recover spontaneously or die, a process that requires t 3 days. The parameter h thus represents the fraction of undocumented infections, and its value is determined by the testing capability of the country or State. The remaining (1 h) fraction of H individuals go through a transition to the I state after an average incubation period of t 1 days -a typical non-Markovian process. With medical treatment, individuals in the I state recover or die after t 2 days. Finally, a time delay exists for the transition from I to J: on average the I individuals will need t 4 days to be confirmed. 13 All rights reserved. No reuse allowed without permission. (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted April 29, 2020. We treat each State in the US as an open system, regarding the influences from all the other States as perturbations, mathematically represented by the populations moving into and out of the S and H states, denoted as S in (t), S out (t), H in (t), and H out (t), respectively, as shown in Fig. 6 . These functions are determined by the travel intensity as a function of time. Two types of travel need to be distinguished: interstate and intrastate, with the corresponding intensity functions l inter (t) and l intra (t). Due to government imposed travel restrictions, these functions decay exponentially from an initial value to a final smaller constant value. For instance, a recent estimate [15] gives that, for several major US cities, the travel restrictions would reduce the outbound human movements by 50%. In general, we have and l intra (t) = 8 > > < > > : The starting dates differ from State to State. For the ten States studied, the dates t i c are listed in Supplementary Table 2 . The values of t s and t i s are set as t c + 7 and t i c + 7, respectively. Our generalized five-state model for COVID-19 epidemic for any given target State in the US (an open system) can be described by the following set of delayed integro-differential equations: where the quantity H n (t) in Eqs. (3) and (4) is the rate of increase in the H-state population: H n (t) = bS(t)l intra H(t)/N(t) with l intra (t)H(t) representing the active H-state population that has not been isolated, f 1 (t), f 2 (t), f 3 (t), and f 4 (t) are the normal probability distribution functions [24] of the delay time t 1 , t 2 , t 3 , and t 4 , respectively, and N(t) is the population of the State as a function of time. The quantity F(t) is the increment of the H-state population. A term-by-term explanation of the integro-differential equations can be found in Ref. [24] . The input and output functions given by S out (t) = F out l inter (t) S(t) H out (t) = F out l inter (t) where the quantities S total (t) and H total (t) are the total S and H-state populations of the US excluding the target State, F in (t) and F out (t) are the fluxes into and out of the State, which can be extrapolated from empirical data. To numerically solve the whole set of equations, the values of the initial H-state population, H(0), and of the infection rate b are needed, which can be estimated through a mathematical optimization procedure [24] . To determine the daily fluxes F in (t) and F out (t), we use the commuting data from the US Census Bureau (https://www.census.gov/topics/employment/commuting.html), which were obtained from sampling the home and work addresses of the working population in the five-year period (2011) (2012) (2013) (2014) (2015) . Our estimation method is described, as follows. Assume that the commuting population is distributed uniformly among the States. On average, each working individual commutes 0.11 time per day. Multiplying this number by the population of the State gives the daily average number of people who commute. Denoting this number by Q and letting the commuting populations from the Census Bureau's data base be P in/out , we obtain the ratio C in/out = Q/P in/out . Let D in/out be populations in and out of the State from the data base. The daily fluxes can be obtained as F in/out = C in/out · D in/out . For the ten States studied, the flux values are listed in Supplementary Table 3 . All relevant data are available from the authors upon request. All relevant computer codes are available from the authors upon request. 16 All rights reserved. No reuse allowed without permission. (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted April 29, 2020. . https://doi.org/10.1101/2020.04.24.20078774 doi: medRxiv preprint There have been focused recent efforts in predicting the epidemic trends of COVID-19 and the effects of government control measures through mathematical modeling . For example, two control strategies, one focusing on mitigation and another on suppression, were studied and compared [11] with the conclusion that, in the United Kingdom, even some best and optimal mitigation measures can lead to a two-thirds reduction in the demand for medical diagnosis and treatment and a 50% decrease in the number of deaths, the epidemic would still result in hundred thousands deaths and overwhelm the intensive care system. The suggestion is that, even for countries whose health care systems are still functional at the present, the strategy of suppression should be given a higher priority. Another work studied the epidemic risks and trends in different cities in Spain [13] through traffic data-driven, stochastic simulations of virus propagation and diffusion in spatial regions and inference of the starting time of the epidemic. The results suggest that finding and isolating the infected individuals at the earliest possible time as well as taking intervention measures to reduce the infection rate are effective strategies. Simulations of traditional mathematical models in epidemiology suggest the benefits of local prevention measures such as forbidding mass social gatherings, pinning down the exposure time as an important factor in disease spreading [14] . For example, if the basic reproduction number is two and the infection period is 14 days, if an infected individual stays at a gathering for more than nine hours, the likelihood of infecting others is high. If the exposure time exceeds 18 hours, the participants in the gathering should take strict measures to protect themselves. The effects of travel restrictions on COVID-19 spreading were studied [15] , with the finding that the lockdown of Wuhan city in China provided a golden time between three and five days for other regions of China to get prepared and, by the middle of February, might have suppressed the scale of global pandemic by as much as 80%. It was suggested that travel restriction in combination with individual prevention measures such as wearing facial masks leading to a reduction in the infection rate by more than 50% can suppress the epidemic. The exponential growth rate in different regions of China in the early stage of COVID-19 epidemic was estimated [16] through a significant level analysis, revealing an approximately uniform value for China: 2.1 ± 0.3 with the finding that effective control strategies leading to dramatic changes in the collective behaviors of the vulnerable population have a direct impact on the growth rate. A network model based on dynamic compartments and the SEIR (susceptible-exposed-infected-recovered) spreading dynamics was articulated [18] to analyze the effects of travel restriction, business closure, and social distancing on the epidemic, with the finding that school closure would shield the youth population from the disease, bar closures would be beneficial to young people and adults alike, and canceling religious services would protect the adults and the elders. Models for the COVID-19 epidemic in the United States have also been developed. A metapopulation model was proposed to simulate the SARS-CoV2 spread and the effects of social distancing and travel restriction on suppressing the epidemic in the continental US [20] with the prediction of large scale outbreak in the country within 180 days after March 13. A network-driven epidemic dynamic model was constructed to simulate the COVID-19 progression timeline and the effectiveness of interventions across the US [21] with the prediction that, without control measures and travel 1 All rights reserved. No reuse allowed without permission. (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted April 29, 2020. . restriction, about 7% of the population of the country would be infected when the epidemic peaks at the beginning of June. If the infection rate of COVID-19 is weakened by 25%, the peak would be delayed for about 34 days and its height would be reduced by 39%. Simulations of a model adopted from influenza management suggested [11] the necessity to maintain the control measures until a vaccine becomes available in about 18 months, but socially and economically this would represent a great challenge. Intermittent implementation of social distancing according to epidemic development can allow relaxation of control measures in relatively short time windows: in case a second wave of epidemic begins to emerge, social distancing and family isolation of suspected infections can be put in place again with an increasing intensity. The growth rate and reproductive number of COVID-19 were estimated [22] between March 14 and 19 for different cities in the US, revealing a power law relationship between these quantities and the city population. The implication is that the spreading speed in larger cities will be higher and, without control and prevention measures, a larger fraction of the population would be infected in larger cities. A dynamic model for COVID-19 was proposed [23] for comparing containment strategies in a pandemic scenario that the government can choose: one is a persistent control measure and another is based on adaptive implementation of suppression strategies with intensity depending on time, where the latter can shorten the epidemic duration to a greater extent as compared with the former. In fact, more lives can be saved if strict control measures are implemented earlier. 2 All rights reserved. No reuse allowed without permission. (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted April 29, 2020. . https://doi.org/10.1101/2020.04. 24.20078774 doi: medRxiv preprint In the main text, the predicted epidemic scenarios for the State of New York obtained using the realistic, dual system, coupled five-state model. To assess the effect of interstate traffic on the epidemic, here we study the hypothetical setting where the State is treated as a closed system. Let (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted April 29, 2020. . We treat the US as a closed system, neglecting the influence of input cases from foreign countries after January 30. The total population is N = 3.3⇥10 8 Fig. S4(d) From the insets in different panels of Fig. S4 , we note a delay in the disappearance of the H-state population in comparison with the I-state -about seven days for cases (a,b,d) but one month for case (c). This means that, if testing and surveillance are insufficient, the number of confirmed cases reaching zero does not mean the end of the epidemic: there are still a non-negligible number of hidden individuals who carry the virus, are asymptomatic, and can infect others, thereby requiring long term placement of control measures to minimize the rise of a second outbreak. In fact, in Wuhan city (the epicenter in China), the number of confirmed cases has been zero for a few weeks, but the lockdown order was in place until around April 8 and social distancing and isolation orders are still being vigorously enforced. In the main text, results on the impacts of interstate and intrastate traffic restriction on the epidemic duration in New York and Arizona are presented (Fig. 5) . Here we present results on the impacts of such travel restrictions on the final epidemic size in the two states, as shown in Fig. S5 . When a reasonable level of testing is assumed (⌘ = 0.5), the effects of interstate travel restriction on the epidemic sizes are insignificant [Figs. S5(a) and S5(c)]. However, intrastate travel restriction can have an exponential effect on the epidemic sizes [Figs. S5(b) and S5(d)], similar to its effect on the epidemic duration as described in the main text. 6 All rights reserved. No reuse allowed without permission. (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted April 29, 2020. . We simulate four distinct epidemic scenarios for the State of New Jersey treated as an open system with different combinations of testing capability and strength of control measures. The starting date of the epidemic is March 5. The population of the State is N = 8.8 ⇥ 10 6 . The results are shown in Figs. S6(a-d) Fig. S7(d) Comparing the model predicted J(t) with the real data, we find that Fig. S7(c) represents a likely scenario for the State of California, which indicates that both government testing services and the control measures are not vigorous enough. Also, comparing Fig. S7(b) with Fig. S7(d) reveals that, when the government testing and surveillance services are significantly reduced, insofar as a stringent set of control measures is in place, the resulting epidemic would be worse but to a relatively small extent, unequivocally pointing at the extremely important role of placing vigorous government control measures in suppressing the epidemic. 7 All rights reserved. No reuse allowed without permission. (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted April 29, 2020. Comparing the model predicted J(t) with the real data, we find that Figs. S9(b) and S9(d) represent two likely scenarios for the State of Florida. 8 All rights reserved. No reuse allowed without permission. (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted April 29, 2020. Comparing the model predicted J(t) with the real data, we find that Fig. S11 (d) represents a likely scenario for the State of Massachusetts, suggesting that the current government control measures are stringent. 9 All rights reserved. No reuse allowed without permission. (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted April 29, 2020. Comparing the model predicted J(t) with the real data, we find that Figs. S12(b) and S12(d) represent two likely scenarios for the State of Louisiana. 10 All rights reserved. No reuse allowed without permission. (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. Due to the uncertainties in the estimated model parameters, for each scenario, the predicted outcomes have a range. To determine the daily ranges for all cases is computationally prohibitive. For the purpose of illustration, we present the range of predicted quantities for three examples: New York and Arizona as an open system and the US as a closed system. A difficulty in assessing the uncertainties in the estimated parameter values is the lack of sufficient samples or data. Our solution is to resort to the non-parametric bootstrap method [24] and apply it to J(t). The basic assumption is that the data point J(t i ) obeys a normal distribution. The parameter uncertainties can be obtained through multiple sampling of the optimal model. In particular, from the data of confirmed infections J(t 1 ), J(t 2 ), . . . , J(t n ), we first use a least squares fitting to obtain the optimal estimates of the parameters:⇥ ⌘ (✓ 1 ,✓ 2 ), where✓ 1 ⌘ and✓ 2 ⌘ H(0). Let J(t i ,⇥) denote the model predicted data set of confirmed infections from the estimated parameter values⇥. For the normal distribution of J(t i ,⇥) at time t i , its mean is taken to be the actual data point J(t i ) and its standard deviation is approximated by that of the increments in the number of confirmed cases within five days. We then generate S = 10 3 realizations of the data set J(t i ,⇥), denoted as J ⇤ k (t i ,⇥) for t = 1, . . . , n and k = 1, . . . , S. For each data realization, the optimal parameter values can be estimated, generating the corresponding prediction results. Repeating this procedure for each of the S data sets gives the prediction ranges for the epidemic quantities of interest. The results with the prediction ranges for the three examples are shown in Figs. S13, S14, and S15. We observe that the uncertainty ranges of the predicted quantities increase with time, as expected. The open system model generates predictions with relatively less uncertainty as compared to those from the closed system model. All rights reserved. No reuse allowed without permission. (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. 12 All rights reserved. No reuse allowed without permission. (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted April 29, 2020. 14 All rights reserved. No reuse allowed without permission. (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted April 29, 2020. 15 All rights reserved. No reuse allowed without permission. (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted April 29, 2020. 18 All rights reserved. No reuse allowed without permission. (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted April 29, 2020. . 20 All rights reserved. No reuse allowed without permission. (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted April 29, 2020. (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted April 29, 2020. (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted April 29, 2020. 23 All rights reserved. No reuse allowed without permission. (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted April 29, 2020. 24 All rights reserved. No reuse allowed without permission. (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted April 29, 2020. 25 All rights reserved. No reuse allowed without permission. (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted April 29, 2020. 26 All rights reserved. No reuse allowed without permission. (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted April 29, 2020. 27 All rights reserved. No reuse allowed without permission. (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted April 29, 2020. 28 All rights reserved. No reuse allowed without permission. (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted April 29, 2020. . https://doi.org/10.1101/2020.04. 24.20078774 doi: medRxiv preprint Estimating clinical severity of COVID-19 from the transmission dynamics in Wuhan, China Early transmission dynamics in Wuhan, China, of novel coronavirus-infected pneumonia Clinical Characteristics of Coronavirus Disease 2019 in China The epidemiological characteristics of an outbreak of 2019 novel coronavirus diseases (COVID-19) in China. Novel, Coronavirus Pneumonia Emergency Response Epidemiology: Zhong Hua Liu Xing Bing Xue Za Zhi A novel coronavirus outbreak of global health concern Nowcasting and forecasting the potential domestic and international spread of the 2019-nCoV outbreak originating in Wuhan, China: a modelling study Preliminary prediction of the basic reproduction number of the Wuhan novel coronavirus 2019-nCoV Modified seir and ai prediction of the epidemics trend of covid-19 in china under public health interventions Modeling and prediction for the trend of outbreak of ncp based on a time-delay dynamic system The serial interval of COVID-19 from publicly reported confirmed cases. medRxiv Impact of non-pharmaceutical interventions (NPIs) to reduce COVID-19 mortality and healthcare demand (2020) Estimating the number of infections and the impact of non-pharmaceutical interventions on COVID-19 in 11 European countries (2020) Evaluation of the incidence of COVID-19 and of the efficacy of contention measures in Spain: a data-driven approach Insights from early mathematical models of 2019-nCoV acute respiratory disease (COVID-19) dynamics The effect of travel restrictions on the spread of the 2019 novel coronavirus (COVID-19) outbreak Effective containment explains sub-exponential growth in confirmed cases of recent COVID-19 outbreak in Mainland China Assessing changes in commuting and individual mobility in major metropolitan areas in the United States during the COVID-19 outbreak Management strategies in a SEIR model of COVID 19 community spread An interactive web-based dashboard to track COVID-19 in real time Initial Simulation of SARS-CoV2 Spread and Intervention Effects in the Continental US. medRxiv COVID-19 progression timeline and effectiveness of responseto-spread interventions across the United States COVID-19 attack rate increases with city size Dynamic modeling COVID-19 for comparing containment strategies in a pandemic scenario Quantitative assessment of the role of undocumented infection in the 2019 novel coronavirus (COVID-19) pandemic Coronavirus COVID-19 Global Cases Substantial undocumented infection facilitates the rapid dissemination of novel coronavirus (SARS-CoV2) The geographical destination distribution and effect of outflow population of wuhan when the outbreak of the 2019-ncov pneumonia Supplementary References Estimating clinical severity of COVID-19 from the transmission dynamics in Wuhan, China Early transmission dynamics in Wuhan, China, of novel coronavirus-infected pneumonia Clinical Characteristics of Coronavirus Disease 2019 in China The epidemiological characteristics of an outbreak of 2019 novel coronavirus diseases (COVID-19) in China. Novel, Coronavirus Pneumonia Emergency Response Epidemiology: Zhong Hua Liu Xing Bing Xue Za Zhi A novel coronavirus outbreak of global health concern Nowcasting and forecasting the potential domestic and international spread of the 2019-nCoV outbreak originating in Wuhan, China: a modelling study Preliminary prediction of the basic reproduction number of the Wuhan novel coronavirus 2019-nCoV Modified seir and ai prediction of the epidemics trend of covid-19 in china under public health interventions Modeling and prediction for the trend of outbreak of ncp based on a time-delay dynamic system The serial interval of COVID-19 from publicly reported confirmed cases. medRxiv Impact of non-pharmaceutical interventions (NPIs) to reduce COVID-19 mortality and healthcare demand (2020) Estimating the number of infections and the impact of non-pharmaceutical interventions on COVID-19 in 11 European countries (2020) Evaluation of the incidence of COVID-19 and of the efficacy of contention measures in Spain: a data-driven approach No reuse allowed without permission. (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity Insights from early mathematical models of 2019-nCoV acute respiratory disease (COVID-19) dynamics The effect of travel restrictions on the spread of the 2019 novel coronavirus (COVID-19) outbreak Effective containment explains sub-exponential growth in confirmed cases of recent COVID-19 outbreak in Mainland China Assessing changes in commuting and individual mobility in major metropolitan areas in the United States during the COVID-19 outbreak Management strategies in a SEIR model of COVID 19 community spread An interactive web-based dashboard to track COVID-19 in real time Initial Simulation of SARS-CoV2 Spread and Intervention Effects in the Continental US COVID-19 progression timeline and effectiveness of response-to-spread interventions across the United States COVID-19 attack rate increases with city size Dynamic modeling COVID-19 for comparing containment strategies in a pandemic scenario Fitting dynamic models to epidemic outbreaks with quantified uncertainty: A primer for parameter uncertainty, identifiability, and forecasts No reuse allowed without permission. (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity The authors declare no competing interests. To whom correspondence should be addressed. E-mail: tangminghan007@gmail.com; Ying-Cheng.Lai@asu.edu