key: cord-0258024-2q2q4zfy
authors: Tan, Y.; Zhang, Y.; Cheng, X.; Zhou, X.-H.
title: Statistical Inference Using GLEaM Model with Spatial Heterogeneity and Correlation between Regions
date: 2022-01-03
journal: nan
DOI: 10.1101/2022.01.01.21268139
sha: 65c7f002434391c1c699a5ab6e4022b379d98bbf
doc_id: 258024
cord_uid: 2q2q4zfy

A better understanding of the various patterns in the coronavirus disease 2019 (COVID-19) spread in different parts of the world is crucial to its prevention and control. Motivated by the celebrated GLEaM model (Balcan et al., 2010 [1]), this paper proposes a stochastic dynamic model to depict the evolution of COVID-19. The model allows spatial and temporal heterogeneity of transmission parameters and involves transportation between regions. Based on the proposed model, this paper also designs a two-step procedure for parameter inference, which utilizes the correlation between regions through a prior distribution that imposes graph Laplacian regularization on transmission parameters. Experiments on simulated data and real-world data in China and Europe indicate that the proposed model achieves higher accuracy in predicting the newly confirmed cases than baseline models.

The outbreak of coronavirus disease 2019 (COVID- 19) has impacted all aspects of the world significantly for a long. As of 26 Oct 2021, there have been over 243 million confirmed cases of COVID-19, including over 4 million deaths ( [2] ). Therefore, it is essential to study the spread of COVID-19 for better prediction and prevention of the disease. This paper first proposes a stochastic dynamical model that describes the spread of COVID-19 in multiple regions and then develops an algorithm to estimate the transmission parameters and their posterior distributions.

The model proposed in this paper is inspired by The Global Epidemic and Mobility (GLEaM) model proposed in [1] . GLEaM is a stochastic dynamic model to depict the spread of epidemics, integrating multiple data layers that couple metapopulations at the global level. The model involves 3362 subpopulations in 220 countries obtained from Voronoi tessellation, centered around major airports. These subpopulations are connected by a multi-layer mobility network composed of processes from short-range communication in geographically close subpopulations to international flights. In each subpopulation, the transmission of epidemics is modeled by a Susceptible-Exposed-Infected-Removed (SEIR) compartmental model that integrates short-range communication. After specifying the compartmental model and initial conditions, GLEaM can be run repeatedly to build a statistical ensemble for each point in the space of parameters, which can be further utilized for Monte Carlo analysis of epidemic parameters.

In the vast majority of GLEaM's applications ( [3, 4, 5, 6, 7, 8, 9, 10] ), the parameters are estimated based on [11] . [11] performed the maximal likelihood analysis of the reproduction number R 0 in the seed region Mexico. For each value of R 0 , the method generated the distribution of arrival time of the influenza A(H1N1) in 12 countries produced by 2 × 10 3 GLEaM simulations. Then, the optimal R 0 was chosen by maximizing the likelihood function of arrival time. [11] and subsequent works following its settings ( [3, 4, 5, 6, 7, 8, 9, 10] ) assumed that the epidemic was seeded from one region and the transmission parameters or the other parameters (like the introduction date and location) were estimated through the maximal likelihood analysis of arrival time or other events. In particular, the method in [11] was adopted in [5] to estimate the posterior distribution of R 0 of COVID-19, which was assumed to be uniform for all subpopulations at all times. However, this setting is not suitable for the current scenario of the COVID-19 pandemic, since COVID-19 has lasted for a long time, and the community transmission has been widespread in most countries in the world [12] . To model the spread of COVID-19, both spatial and temporal heterogeneity of the transmission parameters are needed instead of directly modeling R 0 as a periodic function of time as in [11] , since the social behaviors, containment measures, medical conditions, and other elements that might have effects on the spread of COVID-19 may vary among different countries and over time.

Recently, [13] improved the model described above by estimating the initially infected individuals in each subpopulation through microblogging data from Twitter and also estimating the reproduction number R 0 for USA, Italy, and Spain separately. Therefore, the method used in [13] involved spatial heterogeneity. However, in this study, international travel was not considered, and GLEaM was applied to each of the aforementioned countries (as an isolated system) independently. The transmission rates in all subpopulations of these countries were again presumed to be homogeneous. Furthermore, [13] also assumed that the initially infected individuals for each subpopulation were proportional to the total number of Twitter users in that subpopulation. Thus it still assumed that the severity of the pandemic at the initial outbreak of COVID-19 was uniform over the country, which is not the case for COVID- 19 .

In addition to the issues mentioned above, other potential concerns exist when using GLEaM to model the spread of COVID-19. As mentioned in the last section of [14] , GLEaM can be used to simulate the spread of the epidemic under normal conditions since it uses the "steady-state" mobility data around the world. However, since the outbreak of COVID-19, the social order has been disrupted, and travel has been restricted in most countries. Thus GLEaM might not work well with its multi-layer mobility networks. Furthermore, the estimation of parameters using GLEaM is based on a large number of simulations to explore the space of parameters, which usually takes much computational time ( [1] ). In addition, although the social behavior, medical conditions, and other factors that affect the spread of COVID-19 may vary among different regions, these factors for regions that are geographically close or have similarities in other aspects still bear some resemblance. Hence, the transmission rates for COVID-19 should not only own heterogeneity but also correlate with each other. However, neither of the features is reflected in GLEaM or most of its applications.

As the consequences of the possible constraints of GLEaM described above, most of the papers using GLEaM to model the epidemics focus on estimating the transmission parameter in the "seed region" and mainly use the epidemic data at the beginning of the outbreak. However, since factors like government policies and social behavior result in spatial and temporal heterogeneity of the long-lasting spread of COVID-19 around the world, GLEaM and inference methods based on a large number of simulations using GLEaM may not be directly applied to estimate the transmission parameters of COVID-19 in different regions and periods.

In this paper, we propose a model inspired by the GLEaM framework that further enables both spatial and temporal heterogeneity of parameters, making it possible to model the ongoing spread of COVID-19 instead of only the initial outbreak as considered in most applications of GLEaM. Motivated by GLEaM, we include inter-district mobility in the model when such mobility data are available. For the inference of the parameters, we introduce an optimization algorithm that utilizes the correlation between districts. Furthermore, the posterior distribution of parameters is estimated by Markov chain Monte Carlo (MCMC) sampling starting from an optimal choice of the parameters obtained from the optimization process. This takes a much shorter time than the methods based on GLEaM simulations.

The main contributions of our paper are:

• We propose a stochastic model to describe the epidemic's lasting spread, allowing spatial and temporal heterogeneity of transmission parameters and transportation between districts.

• Based on the model, we also design an algorithm that first makes inference for the parameters through a two-step procedure combining the information of correlation between districts, which is equivalent to imposing graph Laplacian regularization on the transmission parameters. Then, the algorithm estimates the posterior distribution efficiently by MCMC sampling with the estimated parameters as the initial points.

• We compare the performance of the proposed model with the baseline models on both simulated data and real-world data.

-For the simulated data, the results show that combining heterogeneity and transportation into the model helps improve the performance of trajectory prediction and parameter estimation. Moreover, our inference algorithm that integrates the correlation of districts leads to further improvement in predicting the future trajectories.

-For the real-world data in China and Europe, the proposed model still outperforms the baselines in trajectory prediction.

Data sets used in this paper are publicly available. Our work focuses on proposing a new method for parameter inference and, consequently, answering questions regarding the long-lasting spread of COVID-19 in multiple regions.

A strength of the proposed model resides in introducing spatial and temporal heterogeneity of transmission parameters. We remark that there exist recent works which also involved different levels of heterogeneity in their models in various ways. [15, 16] utilized randomness in reproduction number to reflect heterogeneity of the population, using plate model with Bayesian method and heterogeneous wellmixed theory [17] with age-of-infection method [18] , respectively. In addition, [19, 20, 21, 22, 23] adapted SEIR / Susceptible-Exposed-Infected (SEI) / Susceptible-Infected (SI) compartmental models similar to this paper. Among these works, [19, 20, 21] considered heterogeneity in the aspects of age groups, social links, and vaccination status separately. [22, 23] , including spatial heterogeneity in one county and among states respectively, bore more similarity with this paper since they also allowed transmission parameters to be spatially heterogeneous and involved transportation between different regions. However, [22] only considered intracounty data, and the transportation was used to compute the effective size of compartments and did not affect the dynamic model. Furthermore, the transmission rates in [22] were determined by an SDE whose parameters were to be fitted. Therefore, [22] focused on a different scope from this paper. The settings of compartments in [23] are more realistic than the one considered in this paper by considering reporting rates. Nevertheless, compared with the model and inference algorithm described in Section 2, transmission rates in [23] were kept constant over time and were estimated using independent non-informative prior. Both [22] and [23] used Ensemble Kalman Filter, which samples particles in the state space according to the prior distribution and obtains the posterior distribution in the process of moving particles at each time step. This may be computationally less efficient than directly applying MCMC according to the posterior distribution with the initial point maximizing the likelihood function, as implemented in this paper.

The rest of the paper is structured as below. In Section 2, we introduce the stochastic dynamic model and corresponding inference algorithm. In Section 3, we compare the performance of trajectory prediction and parameter estimation of models with or without mobility, heterogeneity, or using correlation information in the inference part for the simulated data. Section 4 describes the real-world data used in this paper, presents the results and findings of applying our model to the COVID-19 data in China and Europe. We discuss our limitations and possible extensions in Section 5.

Motivated by the stochastic dynamic process constructed in [24] , we propose a stochastic model that considers mobility.

Suppose there are n regions (region 1, ..., n), and we consider the evolution of the epidemic within a period of T days (day 1, ..., T ). For each region, we consider the following epidemiology compartments:

• E k (t): Exposed and infectious.

• H k (t): Hospitalized.

• R k (t): Recovered or dead.

The subscript k of the states denotes that they belong to the k-th region, and dependence on the continuous time t is addressed through expressing the states as functions of t ∈ [0, T ].

We also denote N k (t) = S k (t) + E k (t) + H k (t) + R k (t) as the total population in the k-th region at time t. N k (t) is allowed to be time-varying since the inter-region mobility is taken into consideration, especially for the days before the implementation of travel restrictions. In this report, we assume that N = n k=1 N k (t) keeps constant over time. We also denote (N a ) k (t) = S k (t) + E k (t) + R k (t) as the total population that are permitted to move in the k-th region, excluding the hospitalized ones.

Furthermore, denote W t i,j as the traveling volume from region i to j on the t-th day (t = 1, . . . , T ). Then the traveling matrix W t on the t-th day can be written as

(2.1)

• Recovery or death in the k-th region: Each individual in H k (t) will transfer into R k (t) with Poisson rate γ k . The recovery rate γ k owns spatial heterogeneity due to the uneven distribution of medical resources.

• Transportation between regions: At the end of the t-th day, all the individuals in region i except the ones in H i (t) have the same probability to travel from the i region to the j region, and the total traveling volume from the i region to the j region is w t i,j . Thus, if we denote {ξ

} follows a multinomial distribution. Specifically,

For clarification, we list all the notations and the parameters in Table 1 below. Now we consider the mean-field differential equation system in (2.3) ( [25, 26] ), which is a deterministic version of the stochastic model above:

whereS k (t) is the deterministic counterpart of S k (t) and so on, and (C a ) k (t) and (R a ) k (t) are the accumulated confirmed cases and the accumulated removed cases determined by the ordinary differential equation (ODE) system (2.3) in the k-th region at time t, respectively. Among the compartments in

can actually be observed, whose observed counterparts are denoted as

, respectively. For inference of parameters, we further denote (∆C a ) k (i) = (C a ) k (i) − (C a ) k (i − 1) as the newly confirmed cases on the i-th day (k = 1, . . . , n, i = 2, . . . , T ), and ( ∆C a ) k (i) as the observable counterparts of (∆C a ) k (i). The same convention holds for the definitions of (∆R a ) k (i) and ( ∆R a ) k (i).

For the remaining latent variables, R k (0) is assumed to be 0, while E k (0) and H k (0) are left to be inferred for each k = 1, . . . , n.

Note that in contrast to the work based on GLEaM [11, 3, 4, 5, 14] , we allow parameters to possess both spatial and temporal heterogeneity. Moreover, the transportation between regions is also considered in the model when the traveling matrix W t is available, which is similar to GLEaM.

We finally emphasize that the model and the inference algorithm described below can be applied when the newly confirmed and recovered cases, the transportation matrix in each day, and the timeline of significant policy responses to the pandemic are available.

. CC-BY-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review)

The copyright holder for this preprint this version posted January 3, 2022. ; 

The traveling volume matrix on the t-th day, t ∈ {1, . . . , T } S k (t) The deterministic counterpart of S k (t) Other compartments follow the same convention of notations.

The accumulated confirmed cases determined by (2.3)

The accumulated recovered cases determined by (2.

3)

The newly confirmed cases determined by (2. 3) on the i-th day ( ∆C a ) k (i) Observable counterpart of (∆C a ) k (i) on the i-th day λ k Infection rate in the k-th region {λ k } n k=1 need to be inferred, and are allowed to be time-varying.

δ

Inverse of average incubation period δ is prefixed as 0.14. 6 . CC-BY-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review)

The copyright holder for this preprint this version posted January 3, 2022. ; https://doi.org/10.1101/2022.01.01.21268139 doi: medRxiv preprint

A two-step procedure of estimating parameters is described below.

First, note that δ, the inverse of the average time for a person from being exposed to hospitalized, is prefixed to be δ = 0.14 for all regions and all time. According to [27] , the mean duration of incubation period is 5.2 days. Furthermore, we assume that the average time for an individual from showing symptoms to being hospitalized is 2 days for simplicity. Thus, the mean duration for an individual from being exposed to being hospitalized is 7.2 days, whose inverse value is approximately 0.14.

Thus, the model parameters that need to be estimated are Θ = {E k (0), H k (0), λ k , γ k } n k=1 . A two-step analysis is adapted for the estimation. We first make inference for {γ k } n k=1 utilizing the newly recovered cases and the accumulated hospitalized cases. After the estimation of {γ k } n k=1 , Θ := {E k (0), H k (0), λ k } n k=1 are then estimated by maximizing the posterior distribution, where we introduce a prior distribution combining the information of correlation between regions. In addition, the marginal posterior distributions of Θ can also be obtained by running MCMC sampling starting from the optimizer Θ * of the posterior distribution. Then we complete the inference of Θ = {{γ k } n k=1 , Θ}. Mathematically, given parameters Θ, we can run the ODE system (2.3) on [0, T ] and denote

We first estimate {γ k } n k=1 by maximizing

{γ k } n k=1 , y i−1 over γ k for each k. By assuming that the newly recovered cases in one day follow a Poisson distribution whose mean equals to the product of γ k and the accumulated hospitalized cases (which is the difference between the accumulated confirmed cases and the accumulated recovered cases, and thus is observable) the day before, we have that

.

Step 2: Estimate Θ = {E k (0), H k (0), λ k } n k=1 and the marginal posterior distributions of Θ. Next, we estimate the remaining parameters Θ = {E k (0), H k (0), λ k } n k=1 as follows. First, note that λ k is the transmission rate, which reflects the infectivity of the virus, the prevention and control measures, and the rates of close contacts in the k-th region. In this report, λ k is allowed to be time-varying. Specifically, the total T days are split into two periods with time thresholds T k chosen for each k = 1, . . . , n, then λ k = λ k,1 if t < T k and λ k = λ k,2 otherwise.

Notice that the ODE system (2.3) is the mean-field version of our stochastic model, and is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

The copyright holder for this preprint this version posted January 3, 2022. ; https://doi.org/10.1101/2022.01.01.21268139 doi: medRxiv preprint are all independent for k = 1, . . . , n, i = 1, . . . , T conditioned on the parameters Θ. Furthermore we suppose that ( ∆C a ) k (i) ∼ Pois((∆C a ) k (i)). Thus, the likelihood of Θ can be written as

Now, we denote the posterior distribution of Θ as π( Θ). Then by Baysian formula,

where p( Θ) is the prior distribution of Θ to be determined and Z = p

To fit the realistic evolution of the epidemic more precisely, Θ is estimated as Θ * = arg max π( Θ) = arg min V ( Θ) with reasonable prior distribution p( Θ). Then, MCMC sampling scheme starting from Θ * is applied to get the posterior distribution for Θ. This process might possess higher computational efficiency than choosing the initial point for MCMC randomly or empirically.

The remaining problem is to choose the prior distribution p( Θ). Presuming that the transmission rates in the regions owning more similarities are closer, p( Θ) is designed to combine the information of correlations between regions. In particular, given a matrix A which characterizes the pairwise similarities between the regions, and if we denote

is a constant depending on σ and A. Here, a small σ is chosen for p( Θ) to be a probability measure without imposing much restriction on λ (1) or λ (2) . Then,

Recalling that

8 . CC-BY-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

The copyright holder for this preprint this version posted January 3, 2022. ; https://doi.org/10.1101/2022.01.01.21268139 doi: medRxiv preprint it can be seen that by choosing the prior distribution as in (2.7), a l 2 regularization term 1

A is constructed from another affinity matrix W by further addressing the correlation between regions with more similarities.

For simulated data and real-world data in China, in which cases the transportation data are available,

. Nevertheless, for real-world data in Europe, where we are not aware of traveling data publicly available that are sufficient for the proposed model, W is just the all 1 adjacency matrix multiplied by a scalar. We remark that the affinity matrix W may also be obtained by ways other than using the transportation data, as long as it reflects the similarities between districts.

Once W is obtained, the next step of attaining A is to divide the n regions into d groups (

. . , n}, and ∀i = j, D i ∩ D j = ∅ ) so that the regions in the same groups have more similarities. Then a ij is constructed as follows. For any i, j ∈ {1, . . . , n}, if i, j ∈ D m for some m ∈ {1, . . . , d}, a ij = µ m w ij for some µ m > 0 relatively large; otherwise, if i, j / ∈ D m for any m ∈ {1, . . . , d}, a ij = µ 0 w ij for some µ 0 > 0 relatively small. By constructing A in the aforementioned way, correlations for regions in the same groups are further strengthened.

Finally, after choosing p( Θ) by determining A and σ, the optimization process Θ * = arg min V ( Θ) is accomplished through Matlab function fmincon. In addition, the posterior distribution for Θ could be obtained by classical MCMC sampling scheme starting from Θ * as mentioned above.

We finally summarize the procedure described in Section 2.2 in Algorithm 1 below.

Input: Data of total confirmed cases and total recovered and dead cases

, graph Laplacian prior parameters µ 0 , µ 1 , . . . , µ d , σ and affinity matrix W , division of n provinces D 1 , . . . , D d , number of MCMC iterations N ite Output: Inferred parameters Θ * , estimated marginal posterior distributions of Θ, predicted deterministic trajectoryŷ 1:T 1: Compute newly confirmed cases and newly recovered and dead cases,

Construct another affinity matrix A by 4: for i, j = 1, . . . , n do 5: if i, j ∈ D m for some m ∈ {1, . . . , d} then 6: a ij ← µ m w ij 7:

should be changed accordingly if λ is allowed to be time-varying

. CC-BY-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

The copyright holder for this preprint this version posted January 3, 2022. 

k=1 are the bins and {c l } n bin k=1 are the numbers of elements in the bins in N ite iterations

Once the estimated parameters Θ

)γ k and Θ * = arg min V ( Θ) as described in Section 2.2, the trajectories of newly confirmed cases could be simulated according to the stochastic dynamic process with Θ * . Furthermore, trajectories could also be sampled from the posterior distribution of Θ instead of using Θ * alone, which also takes the randomness from Θ into account. Particularly, this could be achieved by sampling Θ from MCMC and then simulating trajectories with the sampled {{γ * k } n k=1 , Θ}. Additionally, deterministic trajectories determined by (2.3) could also be computed by explicit Euler's method as in Algorithm 1.

Two specific cases are considered for simulated data. The first one considers only two provinces with transportation between each other. In contrast, the other case considers four provinces separated into two groups (the provinces in the same group are assumed to have more similarities) and with traffic between each pair of the provinces.

For each case, we first compare the predicted trajectories of our dynamic model with four baseline models detailed in Section 3.1.2. Comparison between the baseline models shows that heterogeneity of transmission parameters and transportation in the model improves the prediction. Then, the comparison between the baseline models and the proposed model, which further involves the correlation between regions besides heterogeneity and transportation, indicates that the graph Laplacian regularization improves generalization performance. Furthermore, the estimation of transmission parameters using different models shows the benefits of involving heterogeneity of transmission parameters. This also shows that the proposed model can estimate the transmission parameters, validating the capability of the model proposed in this paper in predicting the trajectories. Finally, we show quantitative evaluations of the proposed model and the baseline models by comparing training errors and testing errors defined in Appendix A, which further verifies our findings above.

We remark that the results below are for 100 replicas. In each replica, two random trajectories are sampled according to the stochastic model with prefixed parameters, part of which are treated as the ground truth training and testing trajectory, respectively (see more details in Appendix A). For each model, we first fit the parameters using the training trajectory and then predict the testing trajectory using the estimated parameters. The comparison of trajectory prediction is from one typical realization, for which we compare the ground truth training and testing trajectories with the fitted training and predicted testing trajectories for all the models. Additionally, parameter estimation and quantitative evaluations are compared with mean and standard deviation over all 100 replicas. The detailed computations of training and testing errors for simulated data can be found in Appendix A.

. CC-BY-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

The copyright holder for this preprint this version posted January 3, 2022. ; https://doi.org/10.1101/2022.01.01.21268139 doi: medRxiv preprint

In this case, n = 2. The data span over a period of T = 30 days, and the threshold T th separating training and testing data is set to be 10 (more detailed can be seen in Appendix A). The other prefixed parameters are as follows:

• For k ∈ {1, 2}, N k (0) = 10 6 , E k (0) = 30, H k (0) = 10.

• For i ∈ {1, . . . , T }, W i 12 = W i 21 = w 12 = 5 × 10 3 .

• λ 1 = 0.4, λ 2 = 0.35, δ = γ 1 = γ 2 = 0.14. {λ k } are not time-varying.

We first specify the models to be compared below. The last one is the proposed model, and the first four models serve as baselines with different settings.

1. Models with uniform prior distribution, without heterogeneity or migration.

2. Models with uniform prior distribution, without heterogeneity but with migration.

3. Models with uniform prior distribution, with heterogeneity but without migration.

4. Models with uniform prior distribution, with both heterogeneity and migration.

5. Models with prior distribution based on graph Laplacian, with both heterogeneity and migration.

First, the models with uniform prior distributions themselves (Models 1-4) are compared according to whether two key assumptions exist in the model:

(1) Whether the transmission rates {λ k } are allowed to vary over regions.

(2) Whether there exists transportation between regions. Then, the model with prior distribution based on graph Laplacian (Model 5) is compared with those using uniform distributions as prior distributions (Models 1-4) . The former one utilizes the correlation between subpopulations by adding a l 2 regularization term for the model. In contrast, only lower and upper bounds are imposed on parameters without other prior information being used in the latter ones.

Note that for Model 5, that utilizes graph Laplacian based prior distribution as defined in (2.7), the parameter σ is taken to be 10. Moreover, since there are only two provinces in the current experiment, the intra-group parameter {µ m , m = 1, 2} does not appear in the final penalty, and only the inter-group parameter determines the graph Laplacian penalty, which is denoted as µ. Specifically, by (2.9), Θ = {E k (0), H k (0), λ k } n k=1 are estimated by the optimization problem in (3.1):

We remark that in Models 1-4, the estimation of Θ = {{γ k } n k=1 , Θ} still follows a similar two-step procedure as in Model 5, and the first step of obtaining γ 1) )γ k remains formally the same. The difference lies in the optimization object of Θ. First, the l 2 regularization term becomes prior knowledge of the parameters' upper and lower bounds. Second, for models without heterogeneity of parameters, {λ k } n k=1 are forced to be the same in (2.3). For models 11 . CC-BY-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

The copyright holder for this preprint this version posted January 3, 2022. ; https://doi.org/10.1101/2022.01.01.21268139 doi: medRxiv preprint without transportation between regions, terms involving W t disappear in (2.3). Additionally, for the other data sets considered in the following sections, the parameter estimation methods for the baseline models are similar and thus will not be repeated.

Finally, note that the model in [11] is similar to Models 1 and 2, since they all assume a spatially homogeneous transmission parameter. However, [11] assumed that the epidemic was seeded from one seed region while Models 1 and 2 do not make such assumption. Moreover, [11] focused more on the spread of the epidemic from the seed region at the early stage of the pandemic, and only the introduction dates in the other regions were utilized for the estimation of transmission parameters. In comparison, the estimation of transmission rate in Models 1 and 2 exploits the data in all regions in the whole process.

The true trajectories from a typical realization and the predicted trajectories of Models 1-5 are plotted in Figure 1 , and the differences of the true trajectories and the fitted trajectories are plotted in Figure 2 for better comparison of the five models.

As can be seen in Figure 1 and 2, Model 4 which allows both heterogeneity and transportation fits the training data better and also predicts the testing data with relatively smaller absolute errors compared with Models 1-3. In addition, Model 5, which further combines the correlation between provinces into prior distribution, has better performance of generalization than Model 4.

.

CC-BY-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

The copyright holder for this preprint this version posted January 3, 2022. ; https://doi.org/10.1101/2022.01.01.21268139 doi: medRxiv preprint

The mean and standard deviation of {λ i } estimated by the five models are reported in Table 2 . It can be seen from Table 2 that models allowing heterogeneity estimate parameters more accurately, and Model 5 that integrates the correlation leads to slightly better estimation for λ 2 .

Model M H GL prior

0.3907 ± 0.0390 0.3500 ± 0.0339 Table 2 : Estimated λ i of five models. Recall that the ground truth is λ 1 = 0.4, λ 2 = 0.35. The first three columns indicate whether the model permits migration between provinces, the heterogeneity of parameters and the Graph Laplacian prior, respectively. {λ i } are inferred for each of the 100 replicas, then the mean and standard deviation are presented in the table above. In addition, µ = 10 −2 in Model 5 for inference of parameters as defined in (3.1).

The training and testing errors computed according to Appendix A are listed in Table 3 below. From Table 3 , it can be seen that even without the prior based on graph Laplacian, heterogeneity of parameters and transportation between two provinces help reduce the testing errors. The concurrence of both the two assumptions makes the training and testing errors further decrease.

Model M H GL prior TrainRelErr (1) TestRelErr ( Table 3 : Training and testing error with standard deviation for simulated data (two provinces) of five models. The first three columns indicate whether the model permits migration between provinces, the heterogeneity of parameters and the Graph Laplacian prior, respectively. TrainRelErr (1) , TestRelErr (1) , TrainRelMSE (1) and TestRelMSE (1) are computed for each of the 100 replicas, then the mean and standard deviation are presented in the table above. In addition, µ = 10 −2 in Model 5 for inference of parameters as defined in (3.1).

The plots of the mean of TestRelErr (1) and TestRelErr (2) against varying µ are shown in Figure 3 . First, note that TestRelErr (1) of Model 4 is nearly the same as that of Model 5 with small µ, since Model 4 is the same as Model 5 with µ = 0. Then, it can be observed from the trend of TestRelErr (1) that as µ decreases, the mean of TestRelErr (1) first decreases and then slightly increases. This indicates that as µ decreases, Model 5 first underfits when µ is large since it forces {λ k } to be close to each other, and 13 . CC-BY-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

The copyright holder for this preprint this version posted January 3, 2022. ; https://doi.org/10.1101/2022.01.01.21268139 doi: medRxiv preprint Figure 3 : Mean of TestRelErr (1) and TestRelErr (2) against µ for simulated data (two provinces). The horizontal green lines show the values of the mean of the testing errors when µ = 0. then overfits when µ is small since nearly no restriction is imposed on {λ k }. TestRelErr (1) achieves its minimum when µ is close to 10 −2 . Figure 17 in Appendix C shows relative mean square testing errors against varying µ, in which the same trends described above can be seen.

However, the testing errors are close when µ is relatively small, and thus the benefit of introducing graph Laplacian based prior is less significant in this case. Such a trend is more apparent in the four provinces case in the following subsection.

In this simulated study, we let n = 4, T = 30, T th = 10. The other prefixed parameters are listed below:

• For k ∈ {1, 2, 3, 4}, N k (0) = 10 6 , E k (0) = 30, H k (0) = 10.

• For i ∈ {1, . . . , T }, k, l ∈ {1, 2, 3, 4} and k = l, W i kl = 5 × 10 3 .

• λ 1 = 0.5, λ 2 = 0.47, λ 3 = 0.4, λ 4 = 0.37, δ = γ 1 = · · · = γ 4 = 0.14. {λ k } are not time-varying.

The four provinces are divided into two groups, with the first group consisting of province 1 and 2 and the second group consisting of province 3 and 4. The similarities within groups are reflected in the settings that the values of {λ k } are closer for provinces in the same group.

The same five models (Models 1-5) as described in Section 3.1.2 are compared for the four provinces case.

Notice that for Model 5, there are three parameters µ 0 , µ 1 , µ 2 in the graph Laplacian prior, among which µ 0 controls the inter-group correlation, µ 1 and µ 2 control the correlations within group 1 and group 2, respectively. For simplicity of model comparison, we let µ 0 = 0 (no restrains on λ i 's in different groups), and µ 1 = µ 2 . The parameter σ in (2.7) is still chosen to be 10. Then for Model 5, by (2.9), Θ = {E k (0), H k (0), λ k } n k=1 are estimated by the optimization problem in (3.2):

14 . CC-BY-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

The copyright holder for this preprint this version posted January 3, 2022. ; 

The trajectories of a typical realization are plotted in Figure 4 and the differences between the true and fitted trajectories are shown in Figure 5 . As shown in these two figures, heterogeneity helps improve the prediction of testing data more than transportation, while introducing migration without heterogeneity of parameters worsens the estimation as can also be noticed from Table 5 . More explanations can be found in Section 3.2.5.

Additionally, Model 5 with prior distrbution based on graph Laplacian lowers the absolute errors of predicted trajectories compared with Model 4.

The mean and standard deviation of {λ i } estimated by the five models for four provinces case are reported in Table 4 . The same trend as in the two provinces case is observed in the current settings.

. CC-BY-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

The copyright holder for this preprint this version posted January 3, 2022. Table 4 : Estimated λ i of five models. Recall that the ground truth is λ 1 = 0.5, λ 2 = 0.47, λ 3 = 0.4, λ 4 = 0.37. The first three columns indicate whether the model permits migration between provinces, the heterogeneity of parameters and the Graph Laplacian prior, respectively. λ i are inferred for each of the 100 replicas, and then the mean and standard deviation are presented in the table above. In addition, µ = 10 −1.55 in Model 5 for inference of parameters as defined in (3.2).

The training and testing errors are listed below in Table 5 . Different from the two provinces case, it can be seen from Table 5 that the errors increase greatly after transportation is included while heterogeneity remains absent. A possible explanation for this might be that without heterogeneity of parameters and transportation between provinces, the estimated λ k 's are lower than the true λ k 's for group 1, which leads to that the estimated newly confirmed cases are fewer than the true ones in group 1. For the same reason, the estimated newly confirmed cases are higher than the true ones in group 2. When the transportation is considered, more confirmed cases in group 1 are transferred to group 2 than the cases transported in the opposite direction. As a result, when the transmission parameters do not have heterogeneity, migration between provinces will worsen the prediction performance compared to the case without migration. Table 5 : Training and testing error with standard deviation for simulated data (four provinces) of five models. As in the two provinces case, The first three columns indicate whether the model permits migration between provinces, the heterogeneity of parameters and the Graph Laplacian prior, respectively. TrainRelErr (1) , TestRelErr (1) , TrainRelMSE (1) and TestRelMSE (1) are still computed for each of the 100 replicas, then the mean and standard deviation are presented in the table above. In addition, µ 1 = 10 −1.55 in Model 5 for inference of parameters as defined in (3.2).

In addition, Table 5 still shows that the presence of both heterogeneity and transportation helps reduce the training and testing errors by comparing the first four models. By comparing Model 4 and Model 5, it can be seen that using the graph Laplacian prior could further reduce the testing errors.

As in the previous subsection, the mean of TestRelErr (1) and TestRelErr (2) against varying µ 1 is shown below in Figure 6 , and the mean of TestRelMSE (1) and TestRelMSE (2) is shown in Figure 18 in Appendix C. From Figure 6 and Figure 18 , the phenomenon that Model 5 first underfits and then overfits as µ 1 decreases can be seen more recognizably compared with the two provinces case.

. CC-BY-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

The copyright holder for this preprint this version posted January 3, 2022. ; Figure 6 : Mean of TestRelErr (1) and TestRelErr (2) against µ 1 for simulated data (four provinces). The horizontal green lines show the values of the mean of the testing errors when µ 1 = 0.

Real-world COVID-19 data in China and Europe are analyzed in this paper, respectively. Similar to the simulated data cases in Section 3, we first show the predicted trajectories of the proposed model (both deterministic and stochastic ones) and the baseline models (only deterministic ones) for some selected provinces in China or countries in Europe. This shows that the proposed model can capture and further predict the trend of the true trajectories and behaves better than the baseline models, mainly due to the introduction of heterogeneity of transmission parameters and the employment of correlation between provinces or countries. After that, we show the estimation of the posterior distribution of transmission parameters in a selected province or country using the proposed model. Finally, we evaluate the models quantitatively by computing the training and testing errors as described in Appendix A, which further demonstrates the benefits of the proposed model compared to the baselines.

Real-world data studied in this section involve the data in China and Europe.

Data in China include the number of confirmed cases, recovered cases, and fatalities in China's major provinces and municipalities from January 21st to March 28th. These publicly available data are downloaded from websites of health commissions of the provinces. The corresponding total population in each province or municipality is from [28] .

Transportation data in China are also utilized, lasting from January 10th to February 10th. The data used to reflect the number of people that move out from the cities are collected from Baidu Qianxi [29] (provided by Baidu). This data are collected by keeping track of the locations of smartphones that use the applications from Baidu and can be seen as reflecting the trend of the number of people moving out since Baidu is set as the default search engine for most people in China. Another data set used in this paper reflects the percentage of people moving to the corresponding destinations from their starting points and is also available on [29] . The traveling volume is estimated by combining these two data sets.

Data in Europe include the number of confirmed cases, recovered cases, and fatalities in the following 11 countries from May 1st to August 31st: Denmark, Finland, Norway, Austria, Germany, Switzerland, Italy, Spain, Belgium, France, and Ireland. The data are retrieved from [30, 12] . The corresponding total population in each country is from [31] . However, to our best knowledge, the transportation data between these countries in Europe needed in this model are not publicly available. Thus, the models are simplified when analyzing the European data, and the details can be seen in Section 4.3.

. CC-BY-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

The copyright holder for this preprint this version posted January 3, 2022. ; https://doi.org/10.1101/2022.01.01.21268139 doi: medRxiv preprint

After the filtration process described in Appendix B, n = 21 provinces or municipalities are taken into consideration, which are Anhui, Beijing, Fujian, Gansu, Guangdong, Guangxi, Hebei, Henan, Hubei, Hunan, Jiangsu, Jiangxi, Liaoning, Ningxia, Shandong, Shanxi, Shanxi, Shanghai, Sichuan, Zhejiang, and Chongqing.

We set T = 31 (from January 10th to February 10th) and T th = 27 (February 5st). The data from T th,1 = 11 (January 21st) to the T th -th day (February 5st) are treated as the training set, and the data from T th + 1 (February 6th) to the t-th day (February 10th) are treated as the testing set. In addition, from observation, we choose T k = 24 (February 2nd) to be the day when λ k changes in Anhui, Beijing, Henan and Hubei, and T k = 20 (January 29th) for the other provinces. More details of the data set can be found in Appendix B.

In addition, the data from Baidu Qianxi might not be the exact traveling volumes between municipalities and provinces. We assume that the actual traveling volume from one starting point to one destination is proportional to the product of numbers of people departed from the stating point and the percentage of population travelling from this origin to the destination. The corresponding scaling parameter α also needs to be inferred from the data for all the models.

As in Section 3.1.2, the same Models 1-5 are compared for COVID-19 data in China. Recall that Model 5 is the model proposed in this paper and the other four are the baseline models for comparison.

Note that for Model 5 with prior distribution based on graph Laplacian, the provinces are chosen to be divided into two groups, which are Hubei and other provinces except Hubei, due to the overwhelmed medical resources in Wuhan at the early stage of the pandemic ( [32] ). For simplicity of comparison, we choose 2µ 0 = µ 1 = µ 2 . The parameter σ in (2.7) is chosen to be 1 in this case. Then for Model 5, by (2.9), Θ = {E k (0), H k (0), λ k,1 , λ k,2 } n k=1 and scaling parameter α are estimated by the optimization problem in (4.1): Θ * , α * = arg min(− log p(y 1:T th | Θ, α) + µ 0 i∈D1,j∈D2 Figure 7 shows the true and predicted trajectories for newly confirmed cases in Hubei. Note that in Figure  7 ,

• The red line shows the true trajectory.

• The blue line shows the predicted deterministic trajectories obtained by running (2.3) with Θ * (inferred using (4.1) with µ 0 = 10 −3.5 ).

• The blue scatter plots show 100 stochastic trajectories with Θ * .

• The red scatter plots show 100 stochastic trajectories sampled from the posterior distribution of Θ (which is estimated by 5 × 10 5 MCMC iterations staring from Θ * ).

. CC-BY-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

The copyright holder for this preprint this version posted January 3, 2022. ; Figure 7 : True and fitted trajectories in Hubei. The red line shows the true trajectory, the blue line shows the predicted deterministic trajectory with Θ * ( Θ * inferred using (4.1) with µ 0 = 10 −3.5 ), blue scatter plots show 100 stochastic trajectories with Θ * , and red scatter plots show 100 stochastic trajectories sampled from the posterior distribution of Θ (which is estimated by 5 × 10 5 MCMC iterations staring from Θ * ). The vertical lines show the threshold T th = 27 for the training data and testing data. The deterministic trajectories obtained by the other four models and the corresponding absolute errors are also plotted for better comparison. It can be seen that predicted deterministic trajectories generally capture the trend of the true trajectory. Moreover, sampling trajectories from the posterior distribution of Θ generates more randomness than sampling trajectories with Θ * . Besides, it can be seen from Figure  7 that the performances of all the five models are similar for Hubei. This may be explained by the fact that the cases in Hubei outnumber those in other provinces or municipalities, which makes all the models tend to fit the trajectory of Hubei best.

Similarly, Figure 8 shows the predicted trajectories in Henan. As can be seen in Figure 8 , unlike Hubei, heterogeneity of parameters helps the model fit and predict the trajectory of Henan in a more significant manner. When the transmission parameters in the two periods are forced to be the same in all provinces, the increment of the newly confirmed cases of the predicted trajectory is faster than the trend shown in the true trajectory in the first period. Meanwhile, the decrement of the newly confirmed cases of the fitted deterministic trajectory in the second period is also slower than the trend reflected in the true trajectory. Thus, heterogeneity helps improve the performance of fitting and predicting.

Moreover, from Figure 9 which shows the fitted trajectories in Anhui, it can still be seen that hetero-

. CC-BY-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

The copyright holder for this preprint this version posted January 3, 2022. ; Figure 9 : True and fitted trajectories in Anhui. The red line shows the true trajectory, the blue line shows the predicted deterministic trajectory with Θ * ( Θ * inferred using (4.1) with µ 0 = 10 −3.5 ), blue scatter plots show 100 stochastic trajectories with Θ * , and red scatter plots show 100 stochastic trajectories sampled from the posterior distribution of Θ (which is estimated by 5 × 10 5 MCMC iterations staring from Θ * ). The vertical lines show the threshold T th = 27 for the training data and testing data. geneity helps the models predict the testing trajectory better. Furthermore, introducing correlation helps Model 5 predict the trajectory marginally better than the other models.

To illustrate the estimation of the posterior of parameters, Figure 10 shows the estimated posterior distributions of λ k,1 and λ k,2 in Hubei using Model 5, in which the red lines show the inferred λ k,1 and λ k,2 with µ 1 = 10 −3.5 , and Table 6 shows the inferred λ k,1 and λ k,2 with the mean and standard deviation of their estimated posterior distributions from MCMC.

By comparing the transmission rates in Hubei in the two periods, it can be seen that the transmission rate decreases greatly after February 2nd, which indicates that the containment measures adopted in Hubei against the COVID-19 outbreak are effective. Table 6 : Estimated paramters in Hubei. θ * is inferred using (4.1) with µ 0 = 10 −3.5 . The posterior distribution is estimated by 5 × 10 5 MCMC iterations.

. CC-BY-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

The copyright holder for this preprint this version posted January 3, 2022. ; Figure 11 : TestRelErr (1) and TestRelErr (2) against µ 0 for real-world data in China. The horizontal green lines show the values of the mean of the testing errors when µ 0 = 0. Table 7 shows the training errors and testing errors of the five models computed in the way described in Appendix A. As can be seen from Table 7 , both heterogeneity of parameters and transportation between provinces help reduce the training errors and the testing errors as the simulated data case. The utilization of graph Laplacian prior helps further reduce the testing errors. Note that the testing errors of Model 4 are close to those of Model 3, which might be explained as after the traveling restrictions are imposed on January 23rd ( [33] ), transportation no longer has much influence on the spread of the epidemic.

Model M H GL prior TrainRelErr (1) Table 7 : Training and testing error for real-world data in China of five models. The first three columns indicate whether the model permits migration between provinces, the heterogeneity of parameters and the Graph Laplacian prior, respectively. µ 0 = 10 −3.5 in Model 5 for inference of parameters as defined in (4.1). (1) and TestRelErr (2) against varying µ 0 , and Figure 19 in Appendix C plots TestRelMSE (1) and TestRelMSE (2) against varying µ 0 . From Figure 11 and Figure 19 , it can be observed that Model 5 first underfits and then overfits as µ 0 decreases.

T k = 50 is chosen to be the day when λ k changes for all countries except Norway, Italy and France. In addition, T N owway = 75, T Italy = 85 and T F rance = 60. Furthermore, we set T th = 95.

For COVID-19 data in Europe, only three models detailed below are compared, which allow heterogeneity of parameters but do not include transportation. The transportation data needed in this model are not available to the best of our knowledge. However, the impact of transportation is expected to be less significant due to travel restrictions [33] .

1'. Model with uniform prior distribution, without heterogeneity or migration.

2'. Model with uniform prior distribution, with heterogeneity but without migration.

3'. Model with prior distribution based on graph Laplacian, with heterogeneity but no migration.

For Model 3', 11 countries are partitioned into d = 4 groups:

• Northern Europe: Denmark, Finland, Norway;

• Central Europe: Austria, Germany, Switzerland;

• Southern Europe: Italy, Spain;

• Western Europe: Belgium, France, Ireland, Besides, in Model 3', we assume that µ 1 = · · · = µ d = µ within , µ 0 = µ between , and µ within = 10µ between for simplicity. The parameter σ in the prior distribution (2.7) is still taken to be 1. Then by (2.9), for Model 3', Θ = {E k (0), H k (0), λ k,1 , λ k,2 } n k=1 are estimated by the optimization problem in (4.2):

where w = 10 5 is a constant scaling. 

Same as before, Figure 15 shows the estimated posterior distributions of λ k,1 and λ k,2 in Italy using Model 3', in which the red lines show the inferred λ k,1 and λ k,2 with µ between = 1. Table 8 shows the inferred λ k,1 and λ k,2 and the mean and standard deviation of their estimated posterior distribution from MCMC.

. CC-BY-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

The copyright holder for this preprint this version posted January 3, 2022. ; Figure 12 : True and fitted trajectories in Austria. The red line shows the true trajectory, the blue line shows the predicted deterministic trajectory with Θ * ( Θ * inferred using (4.2) with µ between = 10 −1 ), blue scatter plots show 100 stochastic trajectories with Θ * , and red scatter plots show 100 stochastic trajectories sampled from the posterior distribution of Θ (which is estimated by 5 × 10 5 MCMC iterations staring from Θ * ). The vertical lines show the threshold T th = 95 for the training data and testing data. The red line shows the true trajectory, the blue line shows the predicted deterministic trajectory with Θ * ( Θ * inferred using (4.2) with µ between = 10 −1 ), blue scatter plots show 100 stochastic trajectories with Θ * , and red scatter plots show 100 stochastic trajectories sampled from the posterior distribution of Θ (which is estimated by 5 × 10 5 MCMC iterations staring from Θ * ). The vertical lines show the threshold T th = 95 for the training data and testing data.

. CC-BY-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

The copyright holder for this preprint this version posted January 3, 2022. ; Table 8 : Estimated paramters in Italy. θ * is inferred using (4.2) with µ between = 10 −1 . The posterior distribution is estimated by 5 × 10 5 MCMC iterations. Table 9 shows training and testing errors of the three models. It can be seen from Table 9 that heterogeneity of transmission parameters helps reduce both training and testing errors. We can also see that adding Graph Laplacian prior properly further reduces the testing errors greatly without increasing training errors much.

Model M H GL prior TrainRelErr (1) Table 9 : Training and testing error for real-world data in Europe of five models. The first three columns indicate whether the model permits migration between provinces, the heterogeneity of parameters and the Graph Laplacian prior, respectively. µ 0 = 10 −1 in Model 3' for inference of parameters as defined in (4.1). Figure 20 in Appendix C plot TestRelErr (1) , TestRelErr (2) , TestRelMSE (1) and TestRelMSE (2) against varying µ between , from which one may see the same trend of prediction performance as µ between varies.

In this paper, we propose a stochastic dynamic model inspired by [1] , considering the inter-district transportation and both spatially and temporally heterogeneous transmission parameters, which can model the ongoing and lasting spread of the epidemic in multiple districts. Based on this model, we also introduce a two-step procedure for estimating the parameters, which maximizes over posterior distribution with a 24 . CC-BY-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

The copyright holder for this preprint this version posted January 3, 2022. ; https://doi.org/10.1101/2022.01.01.21268139 doi: medRxiv preprint Figure 16 : TestRelErr (1) and TestRelErr (2) against µ between for real-world data in Europe. The horizontal green lines show the values of the mean of the testing errors when µ between = 0. prior distribution that utilizes graph Laplacian regularization to address the correlation between districts. Experiments on both simulated and real-world data show that compared with the baseline models, this new model with transportation, heterogeneity of parameters, and usage of the correlation in parameter inference improves the performance of fitting and generalization.

We acknowledge that there are limitations in this work. First, the stochastic dynamic model proposed in this paper might not be comprehensive enough to characterize the spread of COVID-19 in reality. For example, the model does not account for the asymptomatic virus carriers and the infectious incubation period. Second, the proposed model does not consider the heterogeneity of risk of being infected in different populations, for example, populations with different ages, jobs, or health conditions. Third, The proposed model needs further modification considering the large-scale COVID-19 vaccination. At last, the current method does not provide the guidance to determine how intensively we would like to stress the correlation between districts.

The current methods have possible extensions in the following several directions in the future study. First, the dynamic model is to be refined to be closer to reality, for example, taking the asymptomatic virus carriers and medical tracking into account and allowing the change of the transmission parameters as time increases to be more realistic. Second, the model can be modified to accommodate the heterogeneity of various populations, including different age groups and vaccination statuses. For example, the compartments S, E, H, R are to be further divided into subgroups, and the parameters such as transmission rates and mortality rates are allowed to be different; meanwhile, the dynamic model is also to be adjusted accordingly. At last, the method could be further used to assess the influence of containing measures taken by different countries, for example, through adjusting the traveling volume to make it different from the actual transportation data and then analyzing the corresponding changes in newly confirmed cases.

. CC-BY-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review)

The copyright holder for this preprint this version posted January 3, 2022. ; https://doi.org/10.1101/2022.01.01.21268139 doi: medRxiv preprint . CC-BY-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review)

The copyright holder for this preprint this version posted January 3, 2022. ; https://doi.org/10.1101/2022.01.01.21268139 doi: medRxiv preprint . CC-BY-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review)

The copyright holder for this preprint this version posted January 3, 2022. ; where j ∈ {1, 2}. After defining the errors for each region, the total errors are defined as the summation 29 . CC-BY-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review)

The copyright holder for this preprint this version posted January 3, 2022. In addition, the testing errors and weights are defined in the same way as above, which are listed below,

• Absolute error and relative error of the k-th region on the i-th day: 

k,test (i) = 1 T − T th .

• Weighted absolute error and relative error of the k-th region:

k,test (i)TestAbsErr k (i), TestRelErr

k,test (i)TestRelErr k (i).

• Weighted absolute and relative mean square error of the k-th region: • Total weighted absolute and relative mean square error:

In the above definitions, i ∈ {1, . . . , T }, k ∈ {1, . . . , n}, j ∈ {1, 2}.

. CC-BY-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review)

The copyright holder for this preprint this version posted January 3, 2022. ; https://doi.org/10.1101/2022.01.01.21268139 doi: medRxiv preprint Figure 17 : Mean of TestRelMSE (1) and TestRelMSE (2) against µ for simulated data (two provinces). The horizontal green lines show the values of the mean of the testing errors when µ = 0. Figure 18 : Mean of TestRelMSE (1) and TestRelMSE (2) against µ 1 for simulated data (four provinces). The horizontal green lines show the values of the mean of the testing errors when µ 1 = 0.

For stability of inference for parameters, only the provinces or municipalities whose accumulated confirmed cases are above 50, the estimated δ k are not less than 10 −10 and the change of λ k occurs no later than February 2nd are considered.

The COVID-19 data including the number of confirmed cases, recovered cases and fatalities in major provinces and cities of China are from January 21st to March 28th. Moreover, the transportation data are from January 10th to February 10th. Thus, we consider the spread of COVID-19 in China from January 10th to February 10th, and T = 31. However, only the COVID-19 after January 21st are available, hence the training set is the data from T th,1 = 11 (January 21st) to the T th -th day (February 5st), and the testing set is the data from T th + 1 (February 6th) to the t-th day (February 10th).

CC-BY-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review)

The copyright holder for this preprint this version posted January 3, 2022. ; https://doi.org/10.1101/2022.01.01.21268139 doi: medRxiv preprint Figure 19 : TestRelMSE (1) and TestRelMSE (2) against µ 0 for real-world data in China. The horizontal green lines show the values of the mean of the testing errors when µ 0 = 0. (1) and TestRelMSE (2) against µ between for real-world data in Europe. The horizontal green lines show the values of the mean of the testing errors when µ between = 0. 32 . CC-BY-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review)

The copyright holder for this preprint this version posted January 3, 2022. ; https://doi.org/10.1101/2022.01.01.21268139 doi: medRxiv preprint

Modeling the spatial spread of infectious diseases: The global epidemic and mobility computational model

World Health Organization (WHO)

Assessing the international spreading risk associated with the 2014 west african ebola outbreak

Real-time assessment of the international spreading risk associated with the 2014 west african ebola outbreak

The effect of travel restrictions on the spread of the 2019 novel coronavirus (covid-19) outbreak

Estimate of novel influenza a/h1n1 cases in mexico at the early stage of the pandemic with a spatially structured epidemic model

Modeling the critical care demand and antibiotics resources needed during the fall 2009 wave of influenza a(h1n1) pandemic

Modeling vaccination campaigns and the fall/winter 2009 activity of the new a(h1n1) influenza in the northern hemisphere

Real-time numerical forecast of global epidemic spreading: Case study of 2009 a/h1n1pdm

Assessment of the middle east respiratory syndrome coronavirus (mers-cov) epidemic in the middle east and risk of international spread using a novel maximum likelihood analysis approach

Seasonal transmission potential and activity peaks of the new influenza a (h1n1): a monte carlo likelihood analysis based on human mobility

An interactive web-based dashboard to track covid-19 in real time

Forecasting seasonal influenza fusing digital indicators and a mechanistic disease model

Predicting the behavior of techno-social systems

Modeling the heterogeneity in covid-19's reproductive number and its impact on predictive scenarios

Time-dependent heterogeneity leads to transient suppression of the covid-19 epidemic, not herd immunity

Epidemic outbreaks in complex heterogeneous networks

A contribution to the mathematical theory of epidemics

Estimation of country-level basic reproductive ratios for novel coronavirus (sars-cov-2/covid-19) using synthetic contact matrices

Heterogeneity in sir epidemics modeling: superspreaders and herd immunity

Epidemic progression and vaccination in a heterogeneous population. application to the covid-19 epidemic

Intracounty modeling of covid-19 infection with human mobility: Assessing spatial heterogeneity with business traffic, age, and race

State-specific projection of covid-19 infection in the united states and evaluation of three major control measures

Prediction of the covid-19 outbreak based on a realistic stochastic model

Solutions of ordinary differential equations as limits of pure jump markov processes

Limit theorems for sequences of jump markov processes approximating ordinary differential processes

Temporal dynamics in viral shedding and transmissibility of covid-19

Annual data by province

Covid-19 data repository by the center for systems science and engineering (csse) at johns hopkins university

List of european countries by population

Information Office of Hubei Provincial People's Government. Prevention and control of pneumonia outbreak of new coronary virus infection

Covid-19 government response tracker

The authors thank Dr. Huaiyu Tian for helpful discussion. 

Measurements are proposed in this section to evaluate the performance of fitting and generalization of the aforementioned models in the experiments section.To be more specific, the performance of fitting is measured on the training data {( ∆C a )[T r] k (i), k = 1, . . . , n} T th i=1 up to some threshold T th , and the performance of generalization is evaluated on the testing data {( ∆C a ). Note that T th should not be earlier than the maximum of T k , which is the day that λ k changes in the k-th region, ∀k = 1, . . . , n.For both simulated and real-world data, the fitting performance is assessed by comparing the train-are inferred by the corresponding model. Similarly, the generalization performance is assessed by comparing the testing data {( ∆C a )3) with parameters Θ. Although the deterministic trajectory is still computed by ODE system (2.3), its initialization is the number of individuals in compartments on the T th -th day of the testing data, i.e.,

For simulated data, two random trajectories are sampled according to the stochastic model with prefixed parameters in one replica. Data from the first T th days of one trajectory are treated as the training data, and data from the last T − T th days of the other trajectory are treated as the testing data. This guarantees that the training and testing data are independent.Besides, the prestored {S k (T th ), E k (T th ), H k (T th ), R k (T th )} of the testing trajectory are used as the initialization of (2.3) to obtain the deterministic trajectory {(∆C a )[T e,Θ] k (i), k = 1, . . . , n} T i=T th +1 as described above.

For real-world data, only one observable trajectory is available, thus the training data and the testing data are the two parts of the true trajectory split by the T th -th day.Based on the explanations above, the training errors and testing errors can be defined. Details are listed below.Remind that the ground truth training data and testing data are {( ∆C a )[T e] k (i), k = 1, . . . , n} T i=T th +1 , respectively. In addition, the fitted deterministic training trajectories and predicted deterministic testing trajectories with parameters Θ are denoted as {(∆C a )[T e,Θ] k (i), k = 1, . . . , n} T i=T th +1 , respectively. The details of obtaining training data, testing data along with fitted trajectories for both simulated data and real-world data have been explained above.Based on the ground truth trajectories and the fitted ones, definewhich denote the absolute error and relative error for the training data in the k-th region on the i-th day, respectively.With the errors defined for each day, the total errors for the k-th region are the weighted sums ofThere exist two types of reasonable weights for weighting the errors. The first type is defined aswhich share with same trend of increasing or decreasing as the newly confirmed cases in the k-th region. Moreover, the second type of weights is the simple average of errors for all the T days, that is to say, Wk,train (i) =