key: cord-0577382-5oyeq06k authors: Li, Shuang; Wang, Lu; Chen, Xinyun; Fang, Yixiang; Song, Yan title: Understanding the Spread of COVID-19 Epidemic: A Spatio-Temporal Point Process View date: 2021-06-24 journal: nan DOI: nan sha: a48e1c4fc515049f715074e9e8501c5cf792b0a1 doc_id: 577382 cord_uid: 5oyeq06k

Since the first coronavirus case was identified in the U.S. on Jan. 21, more than 1 million people in the U.S. have had confirmed cases of COVID-19. This infectious respiratory disease has spread rapidly across more than 3000 counties and 50 states in the U.S. and has exhibited evolutionary clustering and complex triggering patterns. It is essential to understand the complex, spatio-temporally intertwined propagation of this disease so that accurate prediction or smart external intervention can be carried out. In this paper, we model the propagation of COVID-19 as a spatio-temporal point process and propose a generative, intensity-free model to track the spread of the disease. We further adopt a generative adversarial imitation learning framework to learn the model parameters. In comparison with traditional likelihood-based learning methods, this imitation learning framework does not need a prespecified intensity function, which alleviates model misspecification. Moreover, the adversarial learning procedure bypasses the difficult-to-evaluate integral involved in the likelihood evaluation, which makes model inference more scalable in both the amount of data and the number of variables. We showcase the dynamic learning performance on the COVID-19 confirmed cases in the U.S. and evaluate the social distancing policy based on the learned generative model.

Since the first coronavirus case was confirmed in Washington state on Jan. 21, more than 1.5 million people in the U.S. had confirmed cases of COVID-19 and more than 93,000 people had died from the disease as of May 21.
This infectious respiratory disease has spread rapidly across more than 3000 counties and 50 states, with exponential growth in the confirmed case count in March and with all 50 states reporting cases by March 17. In April, the U.S. became the nation with the most confirmed cases and the most deaths globally. On March 15, the Centers for Disease Control and Prevention advised against gatherings of 50 or more people for the next two months, and two of the first U.S. hot spots, Washington state and Illinois, closed all bars and restaurants. On the next day, many cities and states shut down social life and many schools began to close. The temporal growth patterns differ significantly from county to county and are influenced by features such as population and location. Three states alone, New York, Connecticut, and New Jersey, have accounted for about 50% of all U.S. confirmed cases since March 20.

This paper is motivated by modeling and predicting the spread of COVID-19, which exhibits clustering and triggering patterns in time and space. We propose a generative model that tracks the spread of the disease and directly captures how infections are transmitted. Our model can help understand how one state's outbreak compares with another's, and it provides a simulator for evaluating policy, such as the best time to start and to ease social distancing restrictions.

We treat confirmed COVID-19 cases as discrete events and directly model the transmission of the events by spatio-temporal point processes (STPPs). STPPs model the generative process of discrete events in continuous time and space through an intensity function, without the need to divide space and time into cells [13, 2, 16]. The occurrence intensity of events is a function of space, time, and history, and it explicitly characterizes how the events are allocated over time and space.
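As a minimal sketch of the event view described above, each confirmed case can be stored as a time-location pair and the history queried directly, with no grid over space or time. The names `Event` and `history`, the time unit (days), the coordinate convention (longitude, latitude), and the sample values are all assumptions made for illustration; they are not taken from the paper or its data.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Event:
    """One confirmed case, recorded as a (time, location) pair."""
    t: float                 # occurrence time (assumed unit: days since first case)
    u: Tuple[float, float]   # occurrence location (assumed: longitude, latitude)


def history(events: List[Event], t: float) -> List[Event]:
    """Return all events that occurred strictly before time t, ordered by time."""
    return sorted((e for e in events if e.t < t), key=lambda e: e.t)


# Made-up events for illustration only; these are not real case records.
cases = [
    Event(t=3.5, u=(-74.0, 40.7)),
    Event(t=0.0, u=(-122.3, 47.6)),
    Event(t=1.2, u=(-87.6, 41.9)),
]

past = history(cases, t=2.0)
print([e.t for e in past])
```

Because times and locations stay continuous, any intensity that depends on space, time, and history can be evaluated at an arbitrary point by passing it the output of `history`.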
The propagation of contagious diseases such as COVID-19 often exhibits self-exciting patterns [16], in which the occurrence of a previous event boosts the occurrence of new events [11, 8, 15] within a region centered around the current location. Existing spatio-temporal self-exciting models require handcrafting the triggering kernel to capture the propagation patterns. The log-Gaussian Cox process, where the log intensity function is a random realization drawn from a Gaussian process [12, 3], is flexible but requires a prespecified mean and covariance function to encode an accurate prior belief about the correlation interleaved across space and time. Moreover, this model scales poorly to voluminous data such as the U.S. COVID-19 case records and is therefore ill-suited to this setting.

To alleviate model misspecification, we propose a customized imitation learning framework for spatio-temporal point processes. Our policy-like generative models are intensity-free, with the output events (i.e., confirmed cases in space and time) directly produced by nonlinear transformations of the history embedding. This generative process mimics the self-exciting mechanism, but the neural-based nonlinear transformations add flexibility to the triggering kernel, which can be learned in a data-driven fashion. Furthermore, by incorporating features related to population, lockdown time, and other spatial and temporal covariates into the intensity function, we add flexibility and interpretability to the model. The learned model can be used to evaluate how population and lockdown time impact the spread of the virus.

We adopt an imitation learning framework [1, 14, 18, 7] to learn the generative model (i.e., policy) by minimizing the discrepancy between the generated events and the observed events. The learning method extends [10] to the spatio-temporal setting, yet our point process generator is intensity-free.
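To make the self-exciting mechanism and the notion of a handcrafted triggering kernel concrete, the following sketch simulates a classical spatio-temporal self-exciting process through its branching structure: background events arrive at a constant rate, and each event spawns a Poisson number of offspring nearby in time and space. This is the kind of fixed-kernel baseline the text contrasts with, not the intensity-free generator proposed in the paper. The function name `simulate_self_exciting`, the unit-square spatial window, the exponential temporal kernel, the isotropic Gaussian spatial kernel, and all parameter values are assumptions chosen for illustration.

```python
import numpy as np


def simulate_self_exciting(T, mu, alpha, beta, sigma, seed=0):
    """Simulate events (t, x, y) on [0, T) by the branching construction.

    mu    : background events per unit time over the unit square
    alpha : mean number of offspring per event (should be < 1 for stability)
    beta  : decay rate of the exponential temporal kernel
    sigma : standard deviation of the Gaussian spatial kernel
    """
    rng = np.random.default_rng(seed)

    # Background (exogenous) events: uniform in time and in the unit square.
    n0 = rng.poisson(mu * T)
    times = rng.uniform(0.0, T, size=n0)
    locs = rng.uniform(0.0, 1.0, size=(n0, 2))
    events = [(t, x, y) for t, (x, y) in zip(times, locs)]

    # Triggered (endogenous) events: every event may excite offspring.
    queue = list(events)
    while queue:
        t, x, y = queue.pop()
        for _ in range(rng.poisson(alpha)):
            child_t = t + rng.exponential(1.0 / beta)  # delay after the parent
            if child_t >= T:
                continue  # offspring falls outside the observation window
            # Offspring location is scattered around the parent; it is not
            # clipped, so it may fall slightly outside the unit square.
            cx, cy = rng.normal([x, y], sigma)
            child = (child_t, cx, cy)
            events.append(child)
            queue.append(child)

    events.sort(key=lambda e: e[0])  # order events by occurrence time
    return events


events = simulate_self_exciting(T=10.0, mu=5.0, alpha=0.5, beta=1.0, sigma=0.05)
print(len(events), "events simulated")
```

Here the shape of the triggering effect is fixed in advance by `beta` and `sigma`. A data-driven generator replaces these fixed choices with learned nonlinear transformations of the history.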
We empirically demonstrate the sound performance of our method in generating and forecasting the confirmed COVID-19 cases in the U.S.

We are interested in learning the generating dynamics of events localized randomly in time and space. Each event is recorded as a tuple $e := (t, u)$, where $t \in \mathbb{R}_+$ is the occurrence time and $u \in \mathcal{S}$ is the occurrence location of the event. A spatio-temporal point process (STPP) is a random process whose realization consists of an ordered sequence of events, i.e., $\mathcal{H}_t := \{e_1 = (t_1, u_1), \dots, e_i = (t_i, u_i) \mid t_i < t\}$, where $\mathcal{H}_t$ denotes the history up to time $t$ (formally, the σ-algebra generated by these events).

Conditional Intensity Function. Denote by $N(A)$ the number of events $e = (t, u)$ falling in the set $A \subset \mathbb{R}_+ \times \mathcal{S}$. The dynamics of an STPP can be characterized by a conditional intensity function, denoted as

$$\lambda(t, u \mid \mathcal{H}_t) := \frac{\mathbb{E}\left[N(dt \times du) \mid \mathcal{H}_t\right]}{dt \times du}, \tag{1}$$

which specifies the mean number of events in a region (i.e., an infinitesimal interval and region around $t$ and $u$) conditional on the past. The propagation of contagious diseases often exhibits self-exciting patterns that can be characterized in terms of the conditional intensity function of the form where $\beta_0(s)$ is the exogenous event intensity that models the drive from outside the region, and the endogenous event intensity i:ti