key: cord-0167434-dsmblunh authors: Castro, Lauren; Fairchild, Geoffrey; Michaud, Isaac; Osthus, Dave title: COFFEE: COVID-19 Forecasts using Fast Evaluations and Estimation date: 2021-10-04 journal: nan DOI: nan sha: 9cada1ba387c8493f6c56a58c6c86448f21a036a doc_id: 167434 cord_uid: dsmblunh This document details the methodology of the Los Alamos National Laboratory COVID-19 forecasting model, COFFEE (COVID-19 Forecasts using Fast Evaluations and Estimation). •ÿ d,t = t j=1 y d,j is the cumulative number of reported deaths of COVID-19 through day t as reported by CSSE at JHU (observable) • δc,t is the underlying number of reported cases on day t (unobservable) •δc,t = t j=1 δc,j is the underlying number of cumulative reported cases through day t (unobservable) • δ d,t is the underlying number of reported deaths on day t (unobservable) •δ d,t = t j=1 δ d,j is the underlying number of cumulative reported deaths through day t (unobservable) • δs,0 is the underlying number of susceptible individuals at the start of the pandemic (unobservable) • δs,t = δs,0 −δc,t is the underlying number of susceptible individuals on day t (unobservable) We use the convention that bolded quantities are vectors and unbolded quantities are scalars. For concreteness, yc,t is a scalar while yc,1:t = (yc,1, yc,2, . . . , yc,t) is a t × 1 vector. Let yc,t|δc,t, α ∼ NB δc,t, δc,t α (1) where NB(a,b) is a Negative-Binomial model with mean parameter a > 0 and size parameter b > 0 where E(yc,t|δc,t, α) = δc,t Var(yc,t|δc,t, α) = δc,t(1 + α). (3) Figure 1 shows the daily reported cases for New Mexico, the United States (US), and France. All three regions have gone through rising and declining periods of cases with various levels of noise in the reported cases. In what follows, we outline the steps COFFEE takes to produce forecasts of reported cases. Step 1: Identify and Adjust Outliers COFFEE automatically identifies and adjusts outliers [2] . It runs five different outlier detection algorithms on the reported data, taking into account possible day-of-week (DOW) effects. A datum is declared an outlier if three or more of the five detection algorithms identify that datum as an outlier. The outliers are not removed, but rather adjusted to ensure all values are non-negative. Figure 2 shows the result of this process on daily cases for New Mexico, the US, and France. All subsequent modeling steps are conducted with outlier adjusted data. Step 2: Compute the Empirical Growth Rate,κ t The model for the underlying number of reported daily cases, δc,t, is a dynamic susceptibleinfectious (SI) model [1] , where δc,t =δc,t−1 + δc,t, and δc,t = κt δs,t−1 δs,0δ The quantityδc,t−1 is the cumulative number of underlying cases on day t − 1, is the proportion of the population still susceptible at time t − 1, and κt is the growth rate on day t. When δ s,t−1 δ s,0 ≈ 1 (when most of the susceptible population is still susceptible), we can rearrange Equations 4, 5, and 6 to identify a crude estimator of κt: Estimates for κt are shown in Figure 3 . It is clear thatκt is dynamic and changes over time. This is what makes forecasting COVID-19 so challenging; parameters of epidemiologicallymotivated models are dynamic and forecasting with them requires anticipating how these dynamic parameters will change in the future, not just tracking where they have been in the past. In what follows, we describe how we forecastκt. Step 3: Computeκ * t = logit(κ t ) After the initial portion of the outbreak,κt is almost always between 0 and 1. COFFEE logit transformsκt, where logit(p) = log(p/(1 − p)) for p ∈ (0, 1). For all days with no reported cases,κt = 0, an incompatible value with the logit transform. Thus, we computeκ * t = logit(κt) as follows: where τc = 0.95 * min({κt|κt > 0}), ensuring that ifκt ≥κ t , thenκ * t ≥κ * t . The logit transformedκt are shown in Figure 4 . Step 4: Split Data into Training and Testing Sets Let T be the last observed day. We only consider the last 42 days of data when fitting a model forκ * t and split those days into training and testing data [3] . Days T − 41 through T − 14 constitute the training data, while T − 13 through T constitutes the testing data. We will denote the last day of the training data by T train = T − 14. The splits are shown for New Mexico, the US, and France in Figure 5 . Step 5: Computeκ trend t We fit a weighted regression to the training data, where we downweight influential points that could have an outsized influence on the regression using the inverse of Cook's distance. The regression has a linear trend over time and a DOW effect: . We refer to the fits and predictions from this linear model asκ trend where, if a variable was removed during the variable selection phase, then the correspondinĝ β is set equal to 0. Step 6: Computeκ constant t There is aκt trajectory corresponding to a constant new number of cases day over day. Let be the average number of daily reported cases over the last week of the training window. where we set δs,0 = 0.55N , where N is the population of the forecasted region and 0.55 is a nominal attack rate for COVID-19. Step 7: Compute a Joint Probability Distribution over Tuning Parameters The form of the forecasting model forκ forecast for t = T train + k and k ∈ 1, 2, . . . , 14. The objective is to produce forecasts forκt. The way COFFEE does this is be creating a blended combination ofκ trend behind this modeling choice is that, as cases are going up, people will take action to curb the growth of the pandemic, either through independent choices of personal responsibility or governmental policies. As cases are going down, however, we assume policies will be relaxed or people will become more comfortable engaging in activities that will increase transmission pathways. The tuning parameter ω ≥ 1 determines how quickly the forecast transitions from The closer ω is to 0, the quicker the transition occurs. where k is a positive integer. Figure 8 shows weight trajectories wt for various choices of ω. When w T train +k = 1, all weight is onκ trend t ; when w T train +k = 0, all weight is on The third tuning parameter is φ. The trajectory λt is defined as a linear trend starting at 1 when k = 0. The tuning parameter φ > 0 determines whether λ T train +k trends up (φ > 1) or down (φ < 1). Examples of λ T train +k are shown in Figure 9 . For a combination of η, ω, and φ, we computeκ forecast t (η, ω, φ) and compute the inversedistance between the inverse-logit ofκ forecast t (η, ω, φ) andκt over the test period: Finally we compute a joint probability distribution over η, ω, and φ as the normalized inverse-distance. The probability distributions for New Mexico, the US, and France are shown in Figure 10 . Step 8: Produce the Reported Cases Forecast The final step is to simulate reported cases. The purpose of the previous steps was to get a joint probability distribution over the tuning parameters that can be used to sample from. If more than 14 of the last 28 days had zero reported cases, we take independent and identically distributed (iid) samples of future reported cases from the empirical distribution of outlier adjusted reported cases over the last 28 days. If no cases were reported over the Figure 12 shows the daily deaths for New Mexico, the US, and France. Figure 12 : Daily reported deaths for New Mexico, the US, and France. The COFFEE deaths model is where γt is the case fatality ratio and f (δc,1:t, ν) is a moving average of δc,1:t with window size equal to ν: The deaths model proceeds with the following steps. Step 1: Identify and Adjust Outliers Step 3: Computeγ * t = logit(γ t ) Step 4: Split Data into Training and Testing Sets Splitγ * t in a training and testing data set, same as with the cases model. Step 5: Computeγ trend t Fit a regression model with a linear date term and a DOW effect toγ * t , analogous to Equation 9. Variable selection is then performed. The fitted regression and predictions (γ trend t ) are shown in Figure 16 . Step 6: Compute a Joint Probability Distribution over Tuning Parameters The form of the forecasting model forγ forecast Step 7: Produce the Reported Deaths Forecast The final step is to simulate reported deaths. The purpose of the previous steps was to get a joint probability distribution over the tuning parameters that can be used to sample from. If more than 14 of the last 28 days had zero reported deaths, we take iid samples of future reported deaths from the empirical distribution of outlier adjusted reported deaths over the Figure 17 : Joint probability distributions over tuning parameters ν (columns), θ upper , and θ lower for New Mexico, the US, and France (rows). last 28 days. If no deaths were reported over the last 28 days, we sample future reported deaths as iid Bernoulli draws with success probability equal to 1/29. If 14 or more of the last 28 days observed at least 1 reported death, we simulateγ forecast T +k for k ∈ 1, 2, ..., K by doing the following: 1. Fit the regression outlined in Step 5 to days T to T − 27. Use this to computeγ trend T +k . 2. Draw a vector of (ν, θ lower , θupper) from the joint distribution computed in Step 6. if k − ν > 0. Figure 18 shows the daily deaths forecasts for New Mexico, the US, and France. The stochastic SI model with recruitment and deaths I. Comparison with the closed SIS model tsoutliers: Detection of Outliers in Time Series Data splitting Figure 18 : The median (black line) and 50% and 80% prediction intervals (ribbons) for New Mexico, the US, and France for daily reported deaths.