key: cord-352824-sbsg39ix authors: Zhan, Choujun; Tse, Chi K.; Lai, Zhikang; Chen, Xiaoyun; Mo, Mingshen title: General Model for COVID-19 Spreading with Consideration of Intercity Migration, Insufficient Testing and Active Intervention: Application to Study of Pandemic Progression in Japan and USA date: 2020-03-30 journal: nan DOI: 10.1101/2020.03.25.20043380 sha: doc_id: 352824 cord_uid: sbsg39ix A new Susceptible-Exposed-Infected-Confirmed-Removed (SEICR) model with consideration of intercity travel and active intervention is proposed for predicting the spreading progression of the 2019 New Coronavirus Disease (COVID-19). The model takes into account the known or reported number of infected cases being fewer than the actual number of infected individuals due to insufficient testing. The model integrates intercity travel data to track the movement of exposed and infected individuals among cities, and allows different levels of active intervention to be considered so that realistic prediction of the number of infected individuals can be performed. The data of the COVID-19 infection cases and the intercity travel data for Japan (January 15 to March 20, 2020) and the USA (February 20 to March 20, 2020) are used to illustrate the prediction of the pandemic progression in 47 regions of Japan and 50 states (plus a federal district) in the USA. By fitting the model with the data, we reveal that, as of March 19, 2020, the number of infected individuals in Japan and the USA could be twenty-fold and five-fold as many as the number of confirmed cases, respectively. Moreover, the model generates future progression profiles for different levels of intervention by setting the parameters relative to the values found from the data fitting. 
Results show that without tightening the implementation of active intervention, Japan and the USA will see about 6.55% and 18.2% of the population eventually infected, and with drastic ten-fold elevated active intervention, the number of people eventually infected can be reduced by up to 95% in Japan and 70% in the USA. Finally, an assessment of the relative effectiveness of active intervention and personal protective measures is discussed. With a highly vigilant public maintaining personal hygiene and exercising strict protective measures, the percentage of population infected can be further reduced to 0.23% in Japan and 2.7% in the USA. The global spread of the 2019 New Coronavirus Disease (COVID-19) has shown no sign of subsiding since its emergence in Wuhan, China, in December 2019 [1] . As of March 21, 2020, a total of 276,472 cases of COVID-19 infection have been confirmed in over 185 countries, with a death toll of 11,417 [2] . Different control strategies at different levels of stringency have been applied to slow the spread of the virus in different countries [3] . While some countries have seen peaks of infected cases and observed significant reduction in the number of new infections in the local communities [2, 4] , the spreading has continued in many countries, and surges in infected cases have been observed in Europe, USA and Australia. Intercity travel has been found to be a contributing factor to the rapid spread of the virus [5, 6] . Thus, effective models for describing the pandemic progression in different cities should take into consideration the volume of intercity travels. Furthermore, the rapid spread of the virus in a population has often been a result of delayed information or unawareness of the real situation in that population, despite the wide dissemination of information related to COVID-19 outbreaks in other parts of the world. 
The most notable information latency lies in the number of confirmed cases reported, which depends on the ability of the particular country or city to perform tests as well as the possible bureaucracy in the local system of reporting. Thus, the number of confirmed cases is almost certainly not the true number of infected individuals at any given time [7] , and an improved model for predicting the spreading progression should incorporate the latency associated with the reporting system as well as the possible missing cases leading to delay and loss of information. The traditional Susceptible-Exposed-Infectious-Recovered (SEIR) model [8, 9] thus has obvious shortfalls in describing the spreading dynamics of COVID-19 pandemic. In this work, we attempt to fill the main gap between the number of confirmed cases and the actual number of infected cases. Specifically, in the proposed model, an infected individual may become a confirmed case and then recovered/removed. Moreover, an infected individual may also be recovered/removed without being confirmed as infected. In other words, the basic model proposed here is a Susceptible-Exposed-Infectious-Confirmed-Recovered (SEICR) model, which has an additional state corresponding to an individual having been confirmed by the authority as being infected. On the basis of an SEICR model, we develop a model incorporating intercity travel data which accounts for any increase or decrease in the number of exposed and infected individuals in a city due to intercity migration. Furthermore, the level of intervention in the form of travel restriction, regional lockdown or other active control measures would profoundly influence the rapidity of the spread of the virus and the eventual number of infected cases. The model should therefore allow the level of active intervention to be included as a control parameter and produce the appropriate progression profile. 
A specific parameter is used to adjust the level of active intervention in the simulation of future progression profiles; it corresponds quantitatively to the increase in the number of individuals eventually infected due to an additional infected individual at any given time. In this work, we apply the model to study the COVID-19 spreading progression in Japan and the USA. Data of confirmed and recovered cases in 47 Japanese prefectures or regions (January 15 to March 20, 2020) and in 50 states and a federal district (Washington DC) in the USA (February 20 to March 20, 2020) are used for fitting with the model and retrieval of parameter values. The parameters found are then adjusted to produce future progression trajectories corresponding to the implementation of different levels of active intervention. From the set of best-fit parameters, we reveal that the actual number of infected individuals could be up to 20-fold and 5-fold as many as the confirmed numbers in Japan and the USA, respectively, as of March 19, 2020. Furthermore, if the level of active intervention is kept unchanged, the percentage of population eventually infected in Osaka-fu and Tokyo-to will reach around 12% and 4.2% of the population (around 2,300,000 and 600,000 people), respectively, and in total, Japan will have about 6.55% of its population eventually infected. However, implementing a four-fold elevated active intervention will improve the situation substantially, with the percentage of population infected in Osaka-fu and Tokyo-to reduced to 4.2% and 2.3%, respectively, and over 75% reduction in the overall number of infected cases in Japan. Our results for the USA also show a rather grave situation if the level of active intervention remains at the status quo in the coming months, with California and New York state having around 15% and 37.5% of their population (5,800,000 and 7,300,000 people) eventually infected, respectively, and the overall infected population reaching 18.2%.
However, by implementing four-fold elevated active intervention, the percentage of infected population in the USA can be reduced to 9.3%. Furthermore, the effectiveness of the public in exercising protective measures can be assessed by varying the infection rates in the model, and it is found that active government intervention is more effective for Japan, while exercising protective measures by the public is more important for the USA. With the public raising its level of vigilance in exercising strict protective measures and the government drastically elevating its active intervention, the percentage of population getting infected can be reduced to 0.23% in Japan and 2.7% in the USA. The World Health Organization currently sets the alert level of COVID-19 to the highest, and has made data related to the pandemic available to the public in a series of situation reports as well as other formats [10] . Our data include the number of confirmed infected cases, the cumulative number of confirmed infected cases, the number of recovered cases, and death tolls, for 47 individual prefectures and regions in Japan, from January 15 to March 20, 2020, and for 50 states and a federal district (Washington DC) in the USA, from February 20 to March 20, 2020. Data organized in convenient formats are also available elsewhere [2, 11, 12] . Moreover, the monthly intercity migration data for February 2020 are available from official statistics provided by the Japanese government [13] , and are used as indicative migration strengths between prefectures or regions in Japan. For the USA, annual data for the volume of inter-state travellers are available from the Census Bureau [14] and the Bureau of Transportation Statistics [15] . In the proposed Susceptible-Exposed-Infectious-Confirmed-Recovered (SEICR) model, each individual would assume one of five possible states at any time, namely, susceptible (S), exposed (E), infected (I), confirmed (C), and recovered/removed (R). 
Compared to the traditional SEIR model [8, 9], the new SEICR model has an additional C state, corresponding to an individual having been confirmed by the authority as infected. Not all infected individuals will become confirmed, however, and some infected individuals will transit to the recovered state without going through the confirmed state. For city or region $j$, the numbers of individuals in the five states at time $t$ are $S_j(t)$, $E_j(t)$, $I_j(t)$, $C_j(t)$ and $R_j(t)$. The transitions of the five states are illustrated in Figure 1. In addition, $P_j(t)$ stands for the population of region $j$. Furthermore, to account for intercity movement, we introduce a migration strength, $m_{ij}(t)$, which represents an indicative volume of people moving from region $i$ to region $j$ at time $t$ [4]. Then, the augmented SEICR model is given by (1), with $\Delta P_j(t) = P_j(t+1) - P_j(t)$. The meaning of each parameter is given in Table 1. Also, we assume that the recovered and confirmed individuals would stay in region $j$.

[medRxiv preprint notice: The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license. This version posted March 30, 2020.]

In this SEICR model, the number of individuals eventually infected is set initially at $N_j^s(t_0) = \delta_j P_j$ ($\delta_j$ being constant), implying that some effective measures have been taken by the authorities to limit the upper bound of the susceptible population. Moreover, in the case of inactive or less effective intervention, or even unchecked spread, the growth of the number of infected cases will add to the eventual infected number. Hence, the number of eventually infected individuals should increase for each additional infected or exposed individual at time $t$. This is equivalent to adding an extra term (the boxed term in (2)) to $\Delta S_j(t)$ and $\Delta N_j^s(t)$.
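To make the five-state bookkeeping concrete, the following is a minimal Python sketch of one daily update for a single region, without migration and without the intervention term. It is not equation (1) of this paper: the functional form (mass-action infection normalised by the population `P`), the names `seicr_step` and `simulate`, and every numeric value are assumptions chosen only for illustration. The parameter names follow Table 1, with confirmed cases taken as non-infectious.

```python
from dataclasses import dataclass


@dataclass
class State:
    S: float  # susceptible
    E: float  # exposed
    I: float  # infected but not confirmed
    C: float  # confirmed
    R: float  # recovered/removed


def seicr_step(s, P, alpha, beta, kappa, lam, gamma_i, gamma_q):
    """Advance one day. Assumed form for illustration, not the paper's (1)."""
    # New exposures caused by exposed and infected individuals, capped by S.
    new_exposed = min(s.S, (alpha * s.E + beta * s.I) * s.S / P)
    e_to_i = kappa * s.E    # exposed become infected
    i_to_c = lam * s.I      # infected become confirmed
    i_to_r = gamma_i * s.I  # infected recover without ever being confirmed
    c_to_r = gamma_q * s.C  # confirmed recover
    return State(
        S=s.S - new_exposed,
        E=s.E + new_exposed - e_to_i,
        I=s.I + e_to_i - i_to_c - i_to_r,
        C=s.C + i_to_c - c_to_r,
        R=s.R + i_to_r + c_to_r,
    )


def simulate(s0, days, **params):
    """Return the list of states for day 0 through day `days`."""
    trajectory = [s0]
    for _ in range(days):
        trajectory.append(seicr_step(trajectory[-1], **params))
    return trajectory


# Hypothetical parameter values, not fitted to any data.
params = dict(P=1_000_000.0, alpha=0.2, beta=0.3, kappa=0.2,
              lam=0.05, gamma_i=0.1, gamma_q=0.05)
start = State(S=999_990.0, E=5.0, I=5.0, C=0.0, R=0.0)
trajectory = simulate(start, days=60, **params)
last = trajectory[-1]
```

Because every outflow of one state is an inflow of another, the five states always sum to the population, which is a convenient sanity check on any variant of the update.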
Furthermore, as the number of infected cases increases and approaches a saturating percentage $k_h$ (such as a herd-immunity condition), the spreading is expected to slow down significantly, i.e., $\alpha_j$ and $\beta_j$ will drop as $N_j^s$ approaches $k_h P_j$, where $0 < k_h < 1$. Thus, we have the modified model (2), where $k_j^{(c)}$ is an inverse indicator of the level of active intervention implemented, and corresponds quantitatively to an increase in the number of eventual infected individuals for each additional infected or exposed individual in region $j$, and the added term in $\Delta S_j$ and $\Delta N_j^s$ will approach zero as $N_j^s \to k_h P_j$. The meanings of other parameters are given in Table 1. Again, the recovered and confirmed individuals are assumed to stay in region $j$. The model given in (1) and (2) is very general in the sense that it applies to populations with varied levels of effectiveness of active intervention during the outbreak. To further facilitate the assessment of control measures implemented in region $j$, we define the level of active intervention as $\psi_j = 1/k_j^{(c)}$. Thus, if $\psi_j > 1$, the control measures are effective and the progression is limited such that $k_j^{(c)} < 1$. The total number of eventually infected individuals is equal to $\sum_{j=1}^{N} P_j(t_0)\,\delta_j$.
Table 1:
  $\alpha_j$ : rate of infecting a susceptible individual by an exposed individual in region $j$
  $\beta_j$ : rate of infecting a susceptible individual by an infected individual in region $j$
  $\rho_j$ : rate of infecting a susceptible individual by a confirmed individual in region $j$
  $\lambda_j$ : confirmation rate of infected individuals in region $j$
  $\kappa$ : rate of an exposed individual becoming infected
  $\gamma_j^{(I)}$ : recovery rate of an infected but not confirmed individual in region $j$
  $\gamma_j^{(Q)}$ : recovery rate of a confirmed individual in region $j$
  $k_I$ : possibility of an infected individual moving from one region to another
  $k_j^{(c)}$ : increase in the number of individuals eventually infected for each additional infected or exposed individual in region $j$
  $\psi_j$ : level of active intervention, $\psi_j = 1/k_j^{(c)}$
  $k_h$ : proportion of the population infected at which no further spreading occurs, i.e., absolute upper bound for $N_j^s$ for all $j$
  $\delta_j$ : initial percentage of eventual infected individuals in region $j$
  $I_{j,0}$ : initial number of infected individuals in region $j$
  $E_{j,0}$ : initial number of exposed individuals in region $j$
  $C_{j,0}$ : initial number of confirmed infected individuals in region $j$

However, in the case of less effective or ineffective control, i.e., $\psi_j < 1$, infected and exposed individuals continue to spread the disease, and for each additional infected individual, there will be $k_j^{(c)}$ more eventual infected individuals, and the pandemic progresses until the number of infected cases reaches $k_h P_j$. The model represented by (1) and (2) describes the dynamics of the pandemic propagation with consideration of human migration dynamics and the reality of insufficient testing that leads to confirmed infected cases being fewer than the actual infected cases.

The parameters in (1) and (2) are unknown and are to be estimated from historical data of $C$ and $R$. We solve this parameter identification problem via constrained nonlinear programming (CNLP), with the objective of finding an estimated growth trajectory that fits the data.
An estimated number of infected cases of each city can be generated from (1) and (2) once the unknown set $\theta_j$ is specified, where $I_{j,0} = I_j(t_0)$ and $E_{j,0} = E_j(t_0)$ are the initial numbers of infected and exposed individuals in region $j$, and $\{\alpha_j, \beta_j, \gamma_j^{(I)}, \delta_j, \lambda_j, \gamma_j^{(Q)}\}$ are the model parameters of region $j$. Here, we assume that all confirmed cases are either quarantined or hospitalized, and hence not infectious, i.e., $\rho_j = 0$. Then, the full unknown set is $\Theta = \{\theta_1, \theta_2, \cdots, \theta_K, \kappa, k_I, k_h\}$, which has $8K+3$ unknowns, where $K$ is the number of regions in the entire population under study. The identification of unknown parameters would require a considerable effort of computation. Specifically, the parameter estimation problem can be formulated as the constrained nonlinear optimization problem (5), where $F(\cdot)$ represents the model given by (1) and (2), $w$ is the set of estimated variables, and the unknown set $\Theta$ is bounded between $\Theta_L$ and $\Theta_U$. In this work, an inverse approach is taken to find the unknown parameters and states by solving (5).

The model parameters characterize the spreading dynamics, and once the set of parameters has been identified using the abovementioned optimization procedure, we may generate future progression profiles by using the same set of parameters. Moreover, we may also adjust some of the parameters to examine different possible scenarios, corresponding to varying levels of active intervention $\psi_j = 1/k_j^{(c)}$, including travel restriction, mandatory quarantine and other control measures.
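The idea of fitting a bounded parameter set to confirmed-case data can be illustrated with a deliberately small Python sketch. It does not implement the CNLP procedure used in this work: an exhaustive grid search over a box stands in for the solver, the forward model `simulate_confirmed` is a two-parameter toy rather than (1) and (2), and the "observed" series is synthetic. All names, bounds and values are hypothetical.

```python
from itertools import product


def simulate_confirmed(beta, lam, i0, days):
    """Toy forward model (an assumption, far simpler than (1)-(2)):
    unconfirmed infections grow at rate beta and are confirmed at rate lam."""
    infected, confirmed = float(i0), 0.0
    series = []
    for _ in range(days):
        newly_confirmed = lam * infected
        infected += beta * infected - newly_confirmed
        confirmed += newly_confirmed
        series.append(confirmed)
    return series


def sse(model, data):
    """Sum of squared errors between simulated and observed series."""
    return sum((m - d) ** 2 for m, d in zip(model, data))


def linspace(lo, hi, n):
    """n evenly spaced values from lo to hi inclusive."""
    return [lo + (hi - lo) * k / (n - 1) for k in range(n)]


def fit_box(data, i0, lower, upper, points):
    """Search the box lower <= theta <= upper for the smallest squared error."""
    axes = [linspace(lo, hi, n) for lo, hi, n in zip(lower, upper, points)]
    best_theta, best_err = None, float("inf")
    for beta, lam in product(*axes):
        err = sse(simulate_confirmed(beta, lam, i0, len(data)), data)
        if err < best_err:
            best_theta, best_err = (beta, lam), err
    return best_theta, best_err


# Synthetic "observations" generated from known parameters.
observed = simulate_confirmed(beta=0.30, lam=0.05, i0=10, days=30)
theta_hat, err = fit_box(observed, i0=10,
                         lower=(0.10, 0.01), upper=(0.50, 0.20),
                         points=(41, 20))
```

With $8K+3$ unknowns a grid is hopeless, which is one reason a dedicated nonlinear programming solver is needed in practice; the sketch only shows the roles of the forward model, the objective and the bounds.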
If the level of active intervention stays with the status quo, we will use the same value of $k_j^{(c)}$ for generating future progression profiles. Future paths under more active intervention can be predicted by reducing the value of $k_j^{(c)}$. In our study, we will consider three levels of active intervention, namely,
1. staying with the status quo, corresponding to the same value of $\psi_j$ or $k_j^{(c)}$;
2. two-fold step-up of active intervention, corresponding to $2\psi_j$ or $0.5k_j^{(c)}$;
3. four-fold step-up of active intervention, corresponding to $4\psi_j$ or $0.25k_j^{(c)}$.
We will examine the results in the next section.

The pandemic progression profiles of 47 Japanese prefectures or regions are examined. We perform data fitting of the model, described by (1) and (2), using historical daily data of confirmed and recovered cases [11, 12]. A typical candidate set of parameter values that fits well with the data from January 15, 2020 to March 20, 2020 reflects an inadequate level of control to slow the spread of the disease, as indicated by the value of $k_j^{(c)}$ being larger than 1. Specifically, for each additional infected or exposed individual, the number of eventual infected individuals would increase by around 1.5 on average. The number of individuals eventually infected will approach the saturating percentage $k_h$. We have identified 100 candidate sets of parameters which satisfy the fitting criteria, and for each set of parameters, we perform a separate simulation run. Figure 2 shows one particular simulation run of a well-fitted candidate set of parameters for 8 selected prefectures in Japan.
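The three intervention levels listed above amount to a simple rescaling of $k_j^{(c)}$. The Python sketch below shows the arithmetic, using the value of about 1.5 quoted for the typical fitted set as an illustrative input; the function name `intervention_scenarios` is hypothetical.

```python
def intervention_scenarios(k_c, factors=(1, 2, 4)):
    """For each step-up factor f, return the scaled k_c and psi = 1 / k_c."""
    scenarios = {}
    for f in factors:
        scaled_k = k_c / f  # f-fold intervention divides k_c by f
        scenarios[f] = {"k_c": scaled_k, "psi": 1.0 / scaled_k}
    return scenarios


# Illustrative input: k_c of about 1.5, as quoted for the typical fitted set.
scenarios = intervention_scenarios(1.5)
for factor, values in scenarios.items():
    print(f"{factor}-fold: k_c = {values['k_c']:.3f}, psi = {values['psi']:.3f}")
```

With this input the status quo has $\psi_j < 1$ (ineffective control), while both the two-fold and four-fold levels push $\psi_j$ above 1.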
The averaged results of all simulation runs are consolidated in the charts shown in Figure 3. Based on the data up to March 20, 2020, our model estimates that less than 3% of the infected cases are confirmed, with Hokkaido having the highest percentage (6.9%) and Hyogo-ken the lowest (1.5%), as shown in Figure 3(a). In other words, the actual number of infected individuals could be 20 times as many as the official confirmed number. Statistics of percentages of the population confirmed and infected with the disease up to March 20, 2020 are shown in Figure 3(b).

By extending each simulation run to the forthcoming 200 days, we obtain a set of predicted progression profiles for each region in Japan. Moreover, different levels of active intervention can be assessed by adjusting parameter $k_j^{(c)}$ relative to the values found in each candidate set. For instance, by reducing $k_j^{(c)}$ and re-running the simulation, we may assess the effect of tightening the control measures. Specifically, as described in Section 3.3, we examine three cases, corresponding to the level of active intervention being unchanged, two-fold elevated and four-fold elevated. The key results are summarized as follows:
• Staying with the status quo ($k_j^{(c)}$ unchanged): If there is no further tightening of control aiming to slow the spread, all parameters of the candidate sets will remain unchanged. The total number of individuals eventually infected until September 23, 2020 in each region is shown in Figure 3(c). In this case, the number of infected individuals in Osaka-fu and Tokyo-to will reach about 2,300,000 and 600,000 (12% and 4.2% of population), respectively, while most other regions will have around 5% of the population eventually infected by September 23, 2020, as shown in Figure 3(d). In total, about 6.55% of the population in Japan will be infected.
• Two-fold elevated active intervention ($0.5k_j^{(c)}$): If active intervention is stepped up to twice the current level, i.e., the value of $k_j^{(c)}$ set to half of the original value in each simulation run, we observe a drop in the number of individuals eventually infected, as given in Figure 3(c).
Specifically, the percentage of population eventually infected by September 23, 2020 in Osaka-fu and Tokyo-to would drop to about 6.8% and 2.3%, respectively, while most other regions would drop to less than 2%, as shown in Figure 3(d). In total, about 4.14% of the population in Japan will be infected.
• Four-fold elevated active intervention ($0.25k_j^{(c)}$): If active intervention is stepped up to four times the current level, i.e., the value of $k_j^{(c)}$ set to a quarter of the original value in each simulation run, we observe a very drastic drop in the number of individuals eventually infected, as given in Figure 3(c). Specifically, the percentage of population eventually infected by September 23, 2020 in Osaka-fu and Tokyo-to would drop to about 4.1% and 2.3%, respectively, while most other regions would drop to less than 1%, as shown in Figure 3(d). In total, about 1.54% of the population in Japan will be infected.

[Figure 2: One simulation run of a well-fitted candidate set of parameters for eight selected prefectures in Japan, January 15 to March 19, 2020. Panels: (a) Aichi-ken, (b) Chiba-ken, (c) Hokkaido, (d) Hyogo-ken, (e) Kanagawa-ken, (f) Osaka-fu, (g) Saitama-ken, (h) Tokyo-to.]

In conclusion, the current level of control by the Japanese government seems to be inadequate, and a significant step-up in the level of active intervention is necessary in order to curb the aggressive progression trend. In addition, our model estimates that the number of infected individuals could be 20 times as many as the currently confirmed number due to various reasons such as insufficient testing. Based on the data collected so far and assuming no further tightening of control, our model estimates that about 6.55% of the population will eventually be infected, and a four-fold elevation in control efforts may bring it down to 1.54% (a reduction of over 75%) and end the pandemic sooner.
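The relative reductions quoted for Japan can be checked directly from the reported national percentages. The short Python sketch below does only that arithmetic; the helper name `relative_reduction` is hypothetical, and the inputs are the figures stated in the text.

```python
def relative_reduction(baseline_pct, scenario_pct):
    """Percentage reduction of scenario_pct relative to baseline_pct."""
    return 100.0 * (baseline_pct - scenario_pct) / baseline_pct


# Percentage of Japan's population eventually infected, as reported in the text.
japan = {"status quo": 6.55, "two-fold": 4.14, "four-fold": 1.54}

reductions = {
    name: relative_reduction(japan["status quo"], pct)
    for name, pct in japan.items()
}
for name, value in reductions.items():
    print(f"{name}: {value:.1f}% fewer eventual infections")
```

The four-fold case works out to roughly 76.5%, consistent with the "over 75%" reduction stated for Japan.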
As will be shown in Section 4.3, a drastic 10-fold elevated active control may bring it further down to 0.24%.

The pandemic progression profiles of 50 states and a federal district in the USA are examined. We again perform data fitting of the model, described by (1) and (2), using historical daily data of confirmed and recovered cases from February 20 to March 20, 2020 [11, 12], and obtain 100 candidate sets of parameters that satisfy the fitting criteria. For brevity of presentation, we show here the results for eight selected states having significant numbers of infected individuals as of March 20, 2020. Figure 4 shows one typical simulation run, showing the number of confirmed cases, the estimated number of infected individuals (not confirmed), and the estimated number of exposed individuals. As of March 19, 2020, our model shows that less than 20% of the infected cases are confirmed, with Washington DC having the highest percentage (36%) and Michigan state the lowest (0.7%), as shown in Figure 5(a). In other words, the actual number of infected individuals in the USA could be 5 times as many as the confirmed number. Statistics of percentages of population confirmed and infected with the disease up to March 19, 2020 are shown in Figure 5(b).

Again, by extending each simulation run to the next 200 days, we obtain a set of predicted progression profiles for each state and federal district in the USA. We also examine three cases corresponding to three different levels of active intervention, as described in Section 3.3. The key results are summarized as follows:
• Staying with the status quo ($k_j^{(c)}$ unchanged): If there is no further tightening of control aiming to slow the spread, all parameters of the candidate sets will remain unchanged. The total number of individuals eventually infected until September 23, 2020 in each state is shown in Figure 5(c). In this case, the number of infected individuals in California and New York state will reach about 5,800,000 and 7,300,000 (15% and 37.5% of population), respectively, while most other states will have less than 20% of the population eventually infected by September 23, 2020, as shown in Figure 5(d). In total, about 18.2% of the population in the USA will be infected.
• Two-fold elevated active intervention ($0.5k_j^{(c)}$): If active intervention is stepped up to twice the current level, i.e., the value of $k_j^{(c)}$ set to half of the original value in each simulation run, we observe a significant drop in the number of individuals eventually infected, as given in Figure 5(c). Specifically, the percentage of population eventually infected by September 23, 2020 in California and New York state would drop to about 4.5% and 29.5%, respectively, while most other states would drop to less than 10%, as shown in Figure 5(d). In total, about 14% of the population in the USA will be infected.
• Four-fold elevated active intervention ($0.25k_j^{(c)}$): If active intervention is stepped up to four times the current level, i.e., the value of $k_j^{(c)}$ set to a quarter of the original value in each simulation run, we observe further reduction in the number of individuals eventually infected, as given in Figure 5(c). Specifically, the percentage of population eventually infected by September 23, 2020 in California and New York state would drop to about 2.5% and 23%, respectively, while most other states would drop to less than 3%, as shown in Figure 5(d). In total, about 9.32% of the population in the USA will be infected.

[Figure 4: One typical simulation run for eight selected US states, February 16 to March 20, 2020, showing confirmed cases, estimated infected (not confirmed) individuals and estimated exposed individuals. Panels: (a) California, (b) Illinois, (c) Louisiana, (d) Massachusetts, (e) Michigan, (f) New Jersey, (g) New York, (h) [title missing].]

In summary, a significant step-up in the level of active intervention is necessary for the US government to slow the spread of the virus. In addition, our model estimates that the number of infected individuals could be five times as many as the currently confirmed number due to insufficient testing and other reasons. Based on the data collected so far and assuming no further tightening of the government's control, our model estimates that about 18.2% of the population would eventually be infected, and a four-fold elevation in control efforts may bring it down to 9.32%. As will be shown in Section 4.3, a drastic 10-fold elevated active control may bring it further down to 5.24%.

The results presented above have highlighted the ability of the model to assess the impact of active intervention through adjusting one of the parameters, namely, $\psi_j = 1/k_j^{(c)}$. Moreover, it has been widely disseminated that maintaining personal hygiene is equally important in curbing the spread of the virus.
The World Health Organization recommends several specific protective measures to be practiced by the public, including frequent hand washing, maintaining social distancing, avoiding touching one's eyes, nose and mouth, and practicing respiratory hygiene [16]. Recent studies also show that wearing surgical masks helps in some cases [17, 18]. The level of vigilance of the public in exercising personal protective measures can also be incorporated in our model through adjusting infection rates $\alpha_j$ and $\beta_j$. We can therefore assess the combined effectiveness of active intervention and practicing protective measures in controlling the pandemic. Here, we vary $\alpha_j$, $\beta_j$ and $k_j^{(c)}$ from 10% to 100% of the originally identified values in 10 steps, corresponding to 10 different levels of vigilance of the public and of active intervention by the authorities. In particular, we treat $\alpha_j$ and $\beta_j$ as one property and $k_j^{(c)}$ as another, i.e., $\alpha_j$ and $\beta_j$ are varied in synchrony. Specifically, for each candidate parameter set, we perform 100 simulation runs for each combination of $\alpha_j$, $\beta_j$ and $k_j^{(c)}$.

Our results have highlighted an interesting difference between the effectiveness of the government's active intervention and that of maintaining personal hygiene by the public for Japan and the USA. For Japan, we observe a 27-fold reduction (from 6.55% to 0.24%) in the percentage of individuals eventually infected upon a drastic 10-fold step-up of active intervention, as given in the first column of Figure 6(a), whereas only about a 3-fold reduction (from 6.55% to 2.16%) is observed in the percentage of individuals eventually infected upon the same 10-fold improvement in personal hygiene, as shown in the first row of Figure 6(a). Thus, the government's active intervention seems to be more important for Japan.
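The parameter sweep and the fold reductions for Japan can be sketched in a few lines of Python. The grid below only enumerates the scaling factors (10% to 100% in ten levels for the infection rates and for $k_j^{(c)}$); it does not run the model, and the names `levels`, `grid` and `fold_reduction` are hypothetical. The fold reductions use the percentages reported in the text.

```python
from itertools import product

# Ten scaling levels: 0.1, 0.2, ..., 1.0 of the originally identified values.
levels = [k / 10 for k in range(1, 11)]

# Each grid point is (scale applied to alpha_j and beta_j, scale applied to k_c).
# alpha_j and beta_j share one scale because they are varied in synchrony.
grid = list(product(levels, levels))


def fold_reduction(before_pct, after_pct):
    """How many times smaller after_pct is than before_pct."""
    return before_pct / after_pct


# Japan, percentages reported in the text.
japan_intervention = fold_reduction(6.55, 0.24)  # 10-fold step-up of intervention
japan_hygiene = fold_reduction(6.55, 2.16)       # 10-fold improvement in hygiene

print(len(grid), "combinations of scaling factors")
print(f"intervention: {japan_intervention:.1f}-fold, hygiene: {japan_hygiene:.1f}-fold")
```

The two ratios come out at roughly 27 and 3, which is the asymmetry behind the observation that active intervention matters more for Japan.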
For the USA, we see the opposite. Specifically, only about a 4-fold reduction in the percentage of individuals eventually infected is observed upon a drastic 10-fold step-up of active intervention, as shown in the first column of Figure 6 (b), whereas a 6-fold reduction is observed upon a 10-fold improvement in maintaining personal hygiene by the public. Thus, raising the level of vigilance of the public in exercising personal protective measures is comparatively more important for the USA. A plausible reason for the difference between Japan and the USA is that the model has captured higher infection rates for the USA than for Japan. Reducing k_j for the USA is thus less effective at such high infection rates, i.e., a smaller eventual number of infected individuals per additional infected individual would not help much. In contrast, the parameter sets for Japan already have relatively low infection rates, so further improvement from reducing the infection rates would be limited. As a final remark, we observe from the charts in Figure 6 that, by combining a very high level of public vigilance in exercising strict protective measures with a drastic step-up of government intervention, the percentage of the population getting infected can be reduced to 0.23% in Japan and 2.7% in the USA.

One of the key challenges in data-driven modeling and analysis is delayed and missing information, which makes the fitting of models difficult or unreliable and results in inconsistent or even erroneous dynamical profiles generated by a poorly parameterized model. The traditional Susceptible-Exposed-Infectious-Recovered (SEIR) model provides a general dynamical description of the spreading of a disease in a population, and involves a series of transitional processes that describe how a healthy individual becomes exposed, infected and eventually recovered or removed from the population. The model thus generally has four dynamical states. From the modeling point of view, the model parameters can be extracted by fitting the model to historical data consisting of the numbers of individuals in the infected and recovered states, which are the data usually observable in practice.

Outbreaks of the 2019 New Coronavirus Disease (COVID-19) have occurred in over 185 countries since the virus began to spread from China in January 2020 via an active global transportation network. The data of infected and recovered cases reported by different cities and regions have been found to be unreliable or incomplete, as they are subject to the availability of test facilities as well as other factors related to the bureaucracy of reporting and the mode of operation of the medical systems. Nonetheless, figures of "confirmed" cases can still be honest figures, though not necessarily the true figures of infected cases. In this work, we propose a new disease spreading model with consideration of the delayed and missing data of infected cases, intercity travel, and the level of active intervention. Instead of matching the number of confirmed cases obtained from official sources directly with the number of infected cases in the model, we create a new state which is a delayed and contracted version of the original infected state of the SEIR model, leading to a new SEICR model. The model, which estimates the actual number of infected cases after identifying the best parameter sets, is applied to study the COVID-19 pandemic progression in Japan and the USA. Results reveal that the actual number of infected individuals could be up to 20-fold and 10-fold as many as the confirmed numbers in Japan and the USA, respectively, as of March 19, 2020.
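For readers who want to see the four-state structure concretely, the following is a minimal sketch of the textbook SEIR equations integrated with a forward-Euler step. It is not the SEICR model of this paper: there is no intercity travel and no intervention term, and every number is made up rather than fitted. The helper `observed_confirmed` is a hypothetical illustration of what a "delayed and contracted" view of the infected series could look like, assuming a fixed reporting delay and a fixed reported fraction; the confirmed state of the actual model is defined by the model equations, not by this shortcut.

```python
def simulate_seir(beta, sigma, gamma, population, exposed0, days, dt=0.1):
    """Forward-Euler integration of the textbook SEIR equations.

    dS/dt = -beta*S*I/N          dE/dt = beta*S*I/N - sigma*E
    dI/dt = sigma*E - gamma*I    dR/dt = gamma*I
    Returns one (S, E, I, R) tuple per day, including day 0.
    """
    s, e, i, r = population - exposed0, float(exposed0), 0.0, 0.0
    steps_per_day = round(1 / dt)
    daily = [(s, e, i, r)]
    for _ in range(days):
        for _ in range(steps_per_day):
            new_exposed = beta * s * i / population * dt
            new_infected = sigma * e * dt
            new_removed = gamma * i * dt
            s -= new_exposed
            e += new_exposed - new_infected
            i += new_infected - new_removed
            r += new_removed
        daily.append((s, e, i, r))
    return daily


def observed_confirmed(infected, delay_days, fraction):
    """Illustrative 'delayed and contracted' copy of the infected series.

    Assumption for illustration only: a constant delay and a constant
    reported fraction.
    """
    return [
        0.0 if t < delay_days else fraction * infected[t - delay_days]
        for t in range(len(infected))
    ]


# All numbers below are invented for illustration, not fitted values.
daily = simulate_seir(beta=0.5, sigma=0.2, gamma=0.1,
                      population=1_000_000, exposed0=10, days=120)
infected = [state[2] for state in daily]
confirmed = observed_confirmed(infected, delay_days=7, fraction=0.05)
print(max(infected), max(confirmed))
```

In this toy setting the confirmed curve is always a fixed fraction of an earlier infected count, which is the kind of gap between reported and actual cases that fitting a separate confirmed state is meant to capture.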
Our model also allows assessment of varying levels of active intervention implemented by the government. Results show that the current level of control by the Japanese and US governments may be inadequate, and that a significant step-up in the level of active intervention is necessary in order to slow the aggressive progression trend in both countries. For Japan, based on the data collected so far and assuming no further tightening of control, our model estimates that about 6.55% of the population will eventually be infected, and a four-fold elevation in control efforts may bring this down to 1.54%. For the USA, our model estimates that about 18.2% of the population will eventually be infected if the government does not step up its control, and a four-fold elevation in active intervention may bring this down to 9.32%. Finally, adjusting the infection rates permits assessment of the effectiveness of practicing protective measures and maintaining personal hygiene. Simulations of various levels of combined active intervention and protective measures show that stepping up the government's active intervention would be more effective for Japan, while raising the level of vigilance of the public in maintaining personal hygiene and social distancing is comparatively more important for the USA. With the public raising its level of vigilance and the government drastically elevating its active intervention, the percentage of the population getting infected can be reduced to 0.23% in Japan and 2.7% in the USA.

This preprint, which was not certified by peer review, was posted on March 30, 2020. The copyright holder is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.
medRxiv preprint doi: https://doi.org/10.1101/2020.03.25.20043380

Early transmission dynamics in Wuhan, China, of novel coronavirus-infected pneumonia
Coronavirus cases by country, territory, or conveyance
Coronavirus: the hammer and the dance
Modeling and prediction of the 2019 coronavirus disease spreading in China incorporating human migration data
Risk of transportation of 2019 novel coronavirus disease from Wuhan to other cities in China
The effect of travel restrictions on the spread of the 2019 novel coronavirus (COVID-19) outbreak
Coronavirus testing: criteria and numbers by country
Mathematical Tools for Understanding Infectious Disease Dynamics
Plausible models for propagation of the SARS virus
World Health Organization. Coronavirus disease (COVID-19) outbreak
Coronavirus pandemic in Japan
New coronavirus epidemic real-time tracking
State-to-state migration
Bureau of Transportation Statistics
World Health Organization. Coronavirus disease (COVID-19) advice for the public
Influenza virus aerosols in human exhaled breath: particle size, culturability, and effect of surgical masks
Does wearing a mask protect you from the flu and other viruses?

This work was supported by National Science Foundation of China Project 61703355, Guangdong Youth University Innovative Talents Project 2016KQNCX223, and City University of Hong Kong under Special Fund 9380114.