key: cord-0442337-thf7immb authors: Albani, Vinicius V. L.; Velho, Roberto M.; Zubelli, Jorge P. title: Estimating, Monitoring, and Forecasting the Covid-19 Epidemics: A Spatio-Temporal Approach Applied to NYC Data date: 2020-11-17 journal: nan DOI: nan sha: b383d2067a232c01c35417b1417878e6a23281f6 doc_id: 442337 cord_uid: thf7immb We propose an SEIR-type meta-population model to simulate and monitor the Covid-19 epidemic evolution. The basic model consists of seven compartments, namely susceptible (S), exposed (E), three infective classes, recovered (R), and deceased (D). We define these compartments for n age and gender groups in m different spatial locations. So, the resulting model has, for each age group, gender, and place, all epidemiological classes. The mixing between them is accomplished by means of time-dependent infection rate matrices. The model is calibrated with the curve of daily new infections in New York City and its boroughs, including census data, and the proportions of infections, hospitalizations, and deaths for each age range. We end up with a model that matches the reported curves and predicts accurately infection information for different places and age classes. During December 2019, in Wuhan, China, many cases of severe acute respiratory syndrome, caused by an unknown virus, were registered. Since then, the virus was named Sars-Cov-2 and the corresponding disease was called by the acronym Covid-19, which means Coronavirus Disease 2019. On 11-Mar-2020, a pandemic was declared by the World Health Organization (WHO). Then, by 08-Oct-2020 Sars-Cov-2 has infected more than 36 million individuals and has caused more than one million deaths around the world. 
One clear message from the above history of the pandemic so far is that the study and management of the crisis call for the use of spatio-temporal epidemiological models and for their appropriate calibration by means of mathematical tools from numerical analysis and regularization theory [15, 29]. This is the first goal of the present article. It is now well documented that the chance of developing the more severe form of the disease increases dramatically with age [32, 36, 38, 35, 8]. In addition, the infection appears to be more frequent in older people [8]. This indicates that age plays an important role in possible containment and mitigation measures. Gender also seems to play a major role in Covid-19 outcomes, as documented in [5, 22]. Incorporating age and gender structure in the model is the second goal of this article. The absence of a universally available vaccine and of an effective treatment, at the time of writing, considerably limits the number of possible actions to control the spread of the disease and the subsequent volume of hospitalizations and deaths. Thus, only containment or mitigation policies, like quarantine or lockdown, may be applied. However, such measures have been causing an unprecedented impact on the economy and labor market, leading to massive unemployment and recession [9]. The International Monetary Fund revised its forecast in April 2020, predicting a 4.9% drop in global output in 2020 [33]. Quantifying and tracking the disease spread in different places and age ranges, as well as its impact on the health system, is useful to decide if lockdown measures can be relaxed, thus allowing a gradual return of economic activities. This is the third goal of this work. Moreover, accurate forecasting of the number of regular hospital and intensive care unit (ICU) beds allows a better use of public resources, bringing economic relief.
Susceptible-Exposed-Infected-Recovered (SEIR) and Susceptible-Infected-Recovered (SIR) models have been used to describe different disease outbreaks dynamics since the seminal work [24] . See also [23] in a textbook format. Recently, several authors [2, 6, 7, 13, 17, 36, 37] have applied SIR-and SEIR-type models to describe the Covid-19 epidemic, including different features, like geographical information and time-dependent transmission parameters. In this article, we also propose a versatile SEIR-type model applied to Covid-19 epidemic dynamics. Our model takes into consideration different levels of disease severity, its impact on age ranges, and the distribution of the population in different locations. Following [32] , individuals in a severe state of this disease are accounted in our model as hospitalized, while those critically ill are considered in an intensive care unit (ICU). The interaction between classes of infected and susceptible individuals from different age-ranges, genders, and places is defined by time-dependent transmission matrices. Such matrices, if appropriately calibrated with up-to-date data on daily new infections, can be used to reconstruct the status of the disease spread and they allow us to verify the impact of containment measures. Concerning vaccines, the flexible general form of our model can be used to design vaccination strategies that account for age, gender, and spatial distribution of susceptible population. From the sanitary point of view, such designed strategies may break the transmission chain of the disease, while optimizing the costs of immunization of the population on the financial side. The dependency of disease severity on age-range and gender is translated into the model through transmission rates, as well as the rates of recovery, hospitalization, ICU admission, and death. 
The values of those rates are based on publicly available datasets and recent studies that analysed, among other characteristics, the relationship between Covid-19 severity and age [37, 8, 32, 11] or gender [5, 22] . Other parameters, like mean incubation time and case fatality rate outside ICU, were obtained from [26, 19, 20, 21, 38, 35] . As mentioned above, our proposed model also accounts for spatial information. Policy makers can then identify clusters of uncontained disease spread in real time, isolate them, and later verify if the chosen imposed restrictions measures were effective. Moreover, the model can be used to detect which regions should be reopened first, thus reducing the impact a lockdown creates on economy. Once an effective vaccine is available, the model could be used to target regions where there are clusters of infected individuals. This approach would then be used to create immunization belts around such regions. Moreover, the proposed model is able to forecast future spatial-and age-distributed clusters of infected individuals and bring information to design contention or immunization measures. We end up with a model that is useful to track and forecast epidemics caused by new emerging pathogens, including Sars-Cov-2, in different geographical scales, gender, times, and age classes. The model is easy to implement, since the set of ordinary differential equations (ODE) can be solved by general-purpose ODE solvers. Furthermore, the model is simple to calibrate, with, for example, gradient-based optimizers or Bayesian inference algorithms. We obtain the geographical distribution of the disease dynamics considering the five NYC boroughs (Manhattan, Bronx, Brooklyn, Queens, and Staten Island) using the census data [12] , the curve of daily new infections [11] , and the corresponding proportions of hospitalizations and deaths depending on age classes, by borough. 
The estimation of the parameters is performed by minimizing the negative log-posterior density with a gradient-descent technique. Bootstrap sampling is used to test parameter sensitivity as well as to provide 90% confidence intervals [10].

Main Findings

After smoothing out the daily curves through 7-day moving averages, we estimate the model parameters. The curve of daily new infections predicted by the model adheres well to the averaged data curve. Furthermore, the predictions of hospitalizations and deaths match well the reported values based on data for NYC and its boroughs. We observe a dramatic change in the pattern of disease transmission on 19-Mar-2020, which indicates the effectiveness of the containment measures imposed a week earlier, when a state of emergency was declared and people were asked to stay home. We can also observe a considerable drop in the time-dependent transmission coefficient and in the time-dependent basic reproduction rate (obtained via the next-generation matrix technique [14]). We also noticed this phenomenon in the dynamics of the time-dependent transmission coefficients associated with the NYC boroughs. In the datasets, the patterns of daily hospitalizations and deaths changed consistently between the end of February and the end of August, especially the rate of hospitalization, which has decreased systematically since the end of March. To account for this feature, we allow the model rates of hospitalization and death to be time-dependent. This produces model predictions that adhere to the datasets. Short-term forecasts, with calibrated parameters, were also tested in two different situations, namely, during the transmission regime change and after the spread containment. In both cases, the model accurately predicted the observed scenarios. Moreover, different reopening scenarios were simulated, considering the impacts of reopening all of NYC, Staten Island only, or schools only.
In all such cases, unless strict social distance measures were kept, model predictions indicate new infection waves that affect the population of the entire city (Figs. 5-7). Such findings are corroborated by recent news, with new infection waves identified in Europe, New Zealand, and China [3, 28, 16], as well as the reports of Covid-19 spread amongst the youth population at an overnight camp in Georgia (United States) and at schools in Israel [31, 25].

This section presents the epidemiological model and the procedure to estimate the model parameters from the reported data. The SEIR-type model considered here accounts for disease severity, age ranges, gender, and geographic distribution of some pre-defined group or population. For simplicity, we postpone the inclusion of gender and (spatial) location dependence to the end of the present section. A number $n$ of age ranges is assumed, each one represented by the superscript $i = 1, \ldots, n$ and distributed in seven epidemic compartments: susceptible ($S^i$), exposed but not yet infective ($E^i$), infective in mild condition ($I^i_M$), infective in severe condition or hospitalized ($I^i_H$), infective in critical condition or in an intensive care unit (ICU), denoted by $I^i_I$, recovered ($R^i$), and deceased ($D^i$). Following [32], we treat the terms "in severe condition" and "hospitalized" as synonyms. The same applies to the terms "in critical condition" and "in ICU". Each individual in the first two infective compartments, mildly infective ($M$) and hospitalized ($H$), can recover, die, or develop a more severe disease outcome. Those in ICU can only recover or die. To describe the model, we introduce the following notation. Define the vector $S = [S^1, S^2, \ldots, S^n]^T$, where the superscript $T$ denotes the transposed vector, and similarly for $E$, $I_M$, $I_H$, $I_I$, $R$, and $D$. Define also the element-wise product: $x : y = [x_1 y_1, x_2 y_2, \ldots, x_n y_n]^T$.
Then, the epidemiological model can be written as the system of ordinary differential equations in Eqs. (1)-(7). The schematic representation of the model in Eqs. (1)-(7) can be seen in Fig. 1. The matrices $\beta_M$, $\beta_H$, and $\beta_I$ contain the transmission parameters for the infective individuals in the mild, hospitalized, and ICU classes, respectively. Such parameters are time-dependent and, if well calibrated, may be used to assess the effectiveness of contention measures, to verify changes in the transmission pattern, or to track the impact of suspending lockdown. It is worth mentioning that, depending on the information available, simplifying assumptions on the structure of such matrices must be made. In our study, $\beta_M$, $\beta_H$, and $\beta_I$ assume the following form: $\beta_M = \beta(t) B$, $\beta_H = a\beta_M$, and $\beta_I = b\beta_M$, where $B$ is the $n \times n$ matrix in Eq. (9), $\beta(t)$ is a time-dependent scalar parameter that controls the epidemic dynamics, and the scalar parameters $a$ and $b$ are scale factors between 0 and 1. The matrix $B$ depends on $2n - 1$ parameters, namely, the vectors $\mathbf{a} = [a_1, \ldots, a_n]^T$, which contains the observed rates of infection in the $n$ age ranges and brings the observed heterogeneity of the infectiousness of Covid-19 into the model, and $\mathbf{b} = [b_1, \ldots, b_{n-1}]^T$, which addresses the mixing between different age ranges and shall be estimated from the available data. The $n$-dimensional vectors $\nu_M$, $\nu_H$, and $\nu_I$ contain the recovery rate for each age range in the infective classes $M$, $H$, and $I$, respectively. Similarly, $\mu_M$, $\mu_H$, and $\mu_I$ contain the mortality rates for mild, hospitalized, and ICU infective individuals. The mean time of incubation is $1/\sigma$. The rates $\gamma_M$ and $\gamma_H$ stand for the rates of hospitalization and ICU admission, respectively. The rates of recovery, mortality, hospitalization, and ICU admission are inversely proportional to the corresponding mean times of disease evolution and directly proportional to the probabilities of moving on to other compartments.
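To make the compartment flows described above concrete, the following is a minimal Python sketch of a seven-compartment right-hand side for n age ranges, advanced with a forward Euler step. It is an illustration under stated assumptions, not the authors' implementation: the exact form of the equations is inferred from the verbal description of the flows (S to E to the mild class, then to hospital and ICU, with recovery and death from each infective class), the state is assumed to hold population fractions, and all numerical values in `params` are placeholders rather than calibrated rates.

```python
# Minimal sketch of the seven-compartment model for n age ranges.
# ASSUMPTION: the right-hand side is inferred from the verbal description of
# the flows; every numerical value below is an illustrative placeholder.

def matvec(A, x):
    """Matrix-vector product for a list of rows and a list."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def seir_rhs(state, p):
    """Time derivatives of (S, E, I_M, I_H, I_I, R, D); each is a list of length n."""
    S, E, IM, IH, II, R, D = state
    n = len(S)
    # Infective pressure beta_M I_M + beta_H I_H + beta_I I_I,
    # using beta_H = a * beta_M and beta_I = b * beta_M.
    mixed = [IM[k] + p["a"] * IH[k] + p["b"] * II[k] for k in range(n)]
    force = matvec(p["beta_M"], mixed)
    infect = [S[k] * force[k] for k in range(n)]  # element-wise product
    dS = [-infect[k] for k in range(n)]
    dE = [infect[k] - p["sigma"] * E[k] for k in range(n)]
    dIM = [p["sigma"] * E[k]
           - (p["nu_M"][k] + p["mu_M"][k] + p["gamma_M"][k]) * IM[k] for k in range(n)]
    dIH = [p["gamma_M"][k] * IM[k]
           - (p["nu_H"][k] + p["mu_H"][k] + p["gamma_H"][k]) * IH[k] for k in range(n)]
    dII = [p["gamma_H"][k] * IH[k]
           - (p["nu_I"][k] + p["mu_I"][k]) * II[k] for k in range(n)]
    dR = [p["nu_M"][k] * IM[k] + p["nu_H"][k] * IH[k] + p["nu_I"][k] * II[k]
          for k in range(n)]
    dD = [p["mu_M"][k] * IM[k] + p["mu_H"][k] * IH[k] + p["mu_I"][k] * II[k]
          for k in range(n)]
    return [dS, dE, dIM, dIH, dII, dR, dD]

def euler_step(state, p, dt):
    """One forward Euler step; a production code would use an ODE solver."""
    rhs = seir_rhs(state, p)
    return [[x + dt * dx for x, dx in zip(c, dc)] for c, dc in zip(state, rhs)]

# Placeholder parameters for n = 2 age ranges (not calibrated values).
params = {
    "beta_M": [[0.5, 0.2], [0.2, 0.4]], "a": 0.1, "b": 0.01, "sigma": 0.2,
    "nu_M": [0.1, 0.1], "mu_M": [0.001, 0.002], "gamma_M": [0.01, 0.03],
    "nu_H": [0.08, 0.06], "mu_H": [0.01, 0.02], "gamma_H": [0.02, 0.05],
    "nu_I": [0.05, 0.04], "mu_I": [0.02, 0.05],
}
state0 = [[0.6, 0.399], [0.0, 0.0], [0.001, 0.0],
          [0.0, 0.0], [0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]

state = state0
for _ in range(100):  # 10 time units with dt = 0.1
    state = euler_step(state, params, 0.1)
total = sum(sum(c) for c in state)  # total population fraction
```

Because every outflow of one compartment is an inflow of another, the derivatives sum to zero and `total` stays at its initial value up to rounding, which is a quick sanity check for any implementation of this kind of model.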
All the quantities defining such rates are based on the results of references [8, 19, 20, 21, 11, 26, 32, 35, 38]. The time-dependent transmission parameters $\beta_M$, $\beta_H$, and $\beta_I$, as well as the initial number of infective cases, are unknowns and shall be estimated from the recorded data of daily new infections. Available census data is used to determine the population size and the proportions of susceptible population in each age range. Moreover, whenever the data from daily reports of new cases (infections, hospitalizations, and deaths) include different age ranges, the model can be generalized to incorporate such information. In this case, the entries of $\beta_M$ are built from time-dependent scalar coefficients $\beta_j(t)$, $j = 1, \ldots, n$, and the entries $B_{ij}$ of the matrix $B$ defined above. The other transmission parameters are still of the form $\beta_H = a\beta_M$ and $\beta_I = b\beta_M$. Since for the NYC datasets only the accumulated numbers of infections, hospitalizations, and deaths are age structured, we assume that the daily reported cases are not age structured. To introduce more realistic death and hospitalization rates, we adjust $\mu_I$ and $\gamma_M$ by appropriate delayed ratios from daily reports. More precisely, if $\gamma_M$ and $\mu_I$ represent the mean rates of hospitalization and death, respectively, for each age range $i = 1, \ldots, n$, the constant rates $\gamma^i_M$ and $\mu^i_I$ are replaced by time-dependent counterparts built from these ratios, where $\tilde{I}_M$, $\tilde{I}_H$, and $\tilde{D}$ represent the time series from daily reports of new infections, hospitalizations, and deaths, respectively. In addition, $\tau_M$ is the mean time from onset to hospitalization and $\tau_D$ is the mean time from hospitalization to death. We set $\tau_M = 1$, approximating the median value found in [26], and the parameter $\tau_D$ is set to $\tau_D = 1$, obtained empirically in the numerical tests. Notice that we are not considering the curves of daily reports of ICU admissions, since these are not available in the NYC dataset. Covid-19 affects male and female individuals differently.
Depending on the age range, the case fatality ratio is much larger for male individuals [5, 22]. To account for gender variance in the model, the transmission parameters ($\beta_M$, $\beta_H$, and $\beta_I$) are generalized. The transmission matrix for the mild class assumes a block form built from $\beta^F_M$ and $\beta^M_M$, the transmission matrices for age ranges, defined above, for female and male individuals, respectively. The transmission matrices $\beta_H$ and $\beta_I$ have the form $\beta_H = a\beta_M$ and $\beta_I = b\beta_M$. Notice that the transmission between genders is accounted for by the mean value of the transmission inside genders. Intuitively, this means that a female individual who keeps social distance from other females will also keep such containment measures with male individuals. The same happens with male individuals.

Monitoring and forecasting the disease spread and the effectiveness of containment measures in large regions, like metropolitan areas, states, and countries, are difficult tasks. Heterogeneous distribution of population and differences in the implementation of social restrictions may lead to quite different disease dynamics from place to place. Moreover, people moving between regions can also cause new infection waves. Thus, to account for these aspects, an epidemiological model must include the population's geographical distribution. A number of approaches have been proposed and a review on this subject can be found in [23]. Among them, SEIR-type models have been recurrently used to describe the dynamics of human infectious diseases with geographical information. For example, [17, 7] describe the Covid-19 dynamics in United States counties and in Italy, respectively. In order to include the geographical distribution of the population in the model described by Eqs. (1)-(7), we enlarge the transmission matrices. By indexing each site under consideration by $l = 1, \ldots, m$, let $\beta^l_M$, $\beta^l_H$, and $\beta^l_I$ be the corresponding transmission matrices.
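Returning to the gender-structured transmission matrix described above, the following Python sketch shows one way to assemble it from the female and male age-range matrices. The block layout (female block first, male block last, and the entry-wise mean of the two in the off-diagonal blocks) is an assumption made for illustration, based only on the statement that transmission between genders is the mean of the transmission inside genders; the function name `gender_block_matrix` is hypothetical.

```python
# Sketch: assemble a 2n x 2n transmission matrix from two n x n matrices.
# ASSUMPTION: block layout [[beta_F, mean], [mean, beta_M]] is illustrative.

def gender_block_matrix(beta_F, beta_M):
    """Return the 2n x 2n block matrix as a list of rows."""
    n = len(beta_F)
    # Cross-gender transmission: entry-wise mean of the within-gender matrices.
    cross = [[0.5 * (beta_F[i][j] + beta_M[i][j]) for j in range(n)]
             for i in range(n)]
    top = [beta_F[i] + cross[i] for i in range(n)]      # female rows
    bottom = [cross[i] + beta_M[i] for i in range(n)]   # male rows
    return top + bottom

# Toy 2 x 2 inputs (placeholder values, not estimated from data).
beta_female = [[1.0, 2.0], [3.0, 4.0]]
beta_male = [[3.0, 4.0], [5.0, 6.0]]
full = gender_block_matrix(beta_female, beta_male)
```

With this construction, lowering the within-gender matrices automatically lowers the cross-gender blocks, which matches the intuition that distancing is practiced regardless of the gender of the contact.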
Then, the transmission matrix for mildly infective individuals in the model becomes a block matrix, in which $\mathbf{1}$ is the $m \times n$-matrix with all entries equal to one and $c_l = \min_{i,j} [B^l]_{i,j}$, where $B^l$ is the matrix defining the transmission matrix $\beta^l_M$ for the $l$-th location. The transmission matrices for the other infective classes are similar. This choice for the matrix that represents the mixture of infective populations from different locations helps to simplify the model, reducing considerably the number of unknowns, thus making calibration easier. In addition, the model structure is data driven in the sense that it depends on the current information, thus reflecting more precisely population behavior under different contention measures. When dealing with large places, like states and countries, it is worthwhile to add to the model the distance between places by using exponential, Gaussian, or power law functions [23]. Due to the interconnectedness of NYC and its boroughs, we prefer to estimate the transmission-matrix components from the reported data.

For simplicity, we start by presenting the estimation procedure for the model without gender or geographical dependence. Moreover, the data on new infections, new hospitalizations, and new deaths released by the NYC authorities do not include gender or age ranges. Thus, we shall use this simpler version of the model, where $\beta_M = \beta(t)B$. In addition, to simplify the estimation, the constants $a$ and $b$, related to the transmission matrices of hospitalized and ICU individuals, are empirically set as $a = 0.1$ and $b = 0.01$, respectively. In order to estimate the model parameters from publicly available curves of daily new infections, we build the so-called posterior distribution relating parameters to data. We assume the number of daily new infective cases, denoted by $I$, is Poisson-distributed with parameter $\sigma \sum_{i=1}^{n} E^i(t)$.
Thus, denoting the vector of model parameters by $\theta$, the logarithm of the likelihood function is a sum over the $N$ samples in the data, in which $\log(I!)$ is approximated by Stirling's formula. We also assume that the vector of parameters $\theta$ is Gaussian-distributed with the mean given by some vector of suitably chosen a priori parameters, denoted by $\theta_0$, and identity covariance matrix. Thus, the negative of the logarithm of the posterior distribution, denoted by $lP(\theta|I, \theta_0)$, combines the negative log-likelihood with a penalty on the distance between $\theta$ and $\theta_0$ weighted by a constant $\alpha$. The constant $\alpha$ is the so-called regularization parameter in the context of Tikhonov-type regularization methods [15]. The estimated parameters are obtained by minimizing $lP(\theta|I, \theta_0)$. We estimate the initial proportion of mildly infective individuals in each age range in terms of a scalar $I_{M,0}$ and the population fraction $p_i$ of infective individuals in the $i$-th age range. The latter is estimated from census data. Thus, the vector of parameters $\theta$ to be estimated includes the scalar $I_{M,0}$ and the values of $b_j$, $j = 1, \ldots, n - 1$, which correspond to the entries of the matrix $B$ in Eq. (9). The time-dependent transmission coefficient $\beta(t)$ also appears in the definition of $\beta_M$. The estimation of $\theta$ proceeds as follows:

1. Assume that $\beta(t)$ is constant, and estimate $\theta$ from the set of daily reports of new infections;
2. Estimate $\beta(t)$ for each $t_j$ in the dataset by minimizing a functional of $\beta(t_{j+1})$ that depends on $\beta(t_j)$, $\theta$, and $I(t_{j+1})$.

When several sites are considered, the functionals $lP(\theta^l|I^l, \theta^l_0)$ and $F(\beta^l(t_{j+1})|\beta^l(t_j), \theta^l, I^l(t_{j+1}))$ for the $l$-th site are given by (11) and (13), respectively. The computational cost of estimating this model varies with the number $m$ of sites considered. Depending on the available computational resources, for large $m$, it may be worthwhile to simplify the model by reducing the number of age ranges or merging sites into a larger one, thus reducing the model's dimension. We implemented the model's solution and the estimation procedures in MATLAB ® . The code is available upon request.
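The objective described above can be sketched in a few lines of Python. This is an illustration, not the authors' MATLAB code: the Stirling approximation is written in one common form, the quadratic penalty `alpha * ||theta - theta0||^2` is an assumed form consistent with a Gaussian prior with identity covariance, and `model` is a hypothetical callable that maps a parameter vector to the expected daily new infections.

```python
import math

# Sketch of a Poisson negative log-posterior with a Tikhonov-type penalty.
# ASSUMPTIONS: the penalty form and the `model` callable are illustrative.

def log_factorial_stirling(k):
    """Stirling approximation of log(k!) for k > 0; returns 0 for k <= 0."""
    if k <= 0:
        return 0.0
    return k * math.log(k) - k + 0.5 * math.log(2.0 * math.pi * k)

def poisson_log_likelihood(observed, expected):
    """Sum over samples of log P(observed_j | Poisson mean expected_j)."""
    total = 0.0
    for k, lam in zip(observed, expected):
        total += k * math.log(lam) - lam - log_factorial_stirling(k)
    return total

def neg_log_posterior(theta, theta0, observed, model, alpha):
    """Negative log-likelihood plus alpha times squared distance to theta0."""
    expected = model(theta)  # e.g. sigma * sum_i E^i(t_j) from an ODE solve
    penalty = sum((t - t0) ** 2 for t, t0 in zip(theta, theta0))
    return -poisson_log_likelihood(observed, expected) + alpha * penalty

# Toy usage: a constant-mean "model" standing in for the epidemic model.
toy_model = lambda theta: [theta[0]] * 3
value = neg_log_posterior([5.0], [4.0], [4, 5, 6], toy_model, alpha=1.0)
```

In practice the objective would be passed to a gradient-based optimizer, and larger values of `alpha` pull the estimate toward the a priori vector `theta0`.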
The optimization of the posterior density was performed by the general-purpose gradient-based algorithm LSQNONLIN from MATLAB ® 's Optimization Toolbox.

We start by evaluating the accuracy of the proposed methodology in fitting real data. Then, we back-test the estimated results with out-of-sample data for periods of 7 and 20 days. Finally, we perform a forecast analysis under different scenarios. Our results are based on New York City reports of new infections, hospitalizations, and deaths. This dataset is updated daily and it contains information about the disease distribution in the five NYC boroughs with age-structured accumulated numbers [11]. As the number of daily Covid-19 tests has been increasing in NYC and more effective treatments have been tested [34], the rates of new hospitalizations and new deaths have been decreasing in comparison to the rates of new infections. In order to account for all these features, the rates used in the model were updated.

Our initial example does not consider geographical dependence yet. Such information shall be incorporated in Subsection 3.2. The disease dynamics is estimated from the number of daily new infections in the entire NYC area. The data series goes from 29-Feb-2020 to 21-Aug-2020 and comes from publicly available data in [11]. This source provides Covid-19 case reports and statistics for NYC and each of its five boroughs. The population of NYC and of its boroughs is distributed in the 5 age ranges present in the datasets. The population distribution by age range and borough is based on the census data publicly available at [12]. The curves of daily reports of new infections were smoothed by a moving average of seven consecutive days. The rates per 100,000 inhabitants were used to define the various model parameters. They include the hospitalization, recovery, and death rates, as well as the vector $\mathbf{a}$ in the definition of the transmission matrices.
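The seven-day smoothing step mentioned above can be sketched as follows. The text does not state how the window is aligned, so this sketch assumes a trailing window that is shortened at the start of the series; a centered window would be an equally plausible choice.

```python
# Sketch of a trailing moving average over daily counts.
# ASSUMPTION: trailing window, shortened at the start of the series.

def moving_average(values, window=7):
    """Return the trailing mean of the last `window` values at each day."""
    out = []
    running = 0.0
    for i, v in enumerate(values):
        running += v
        if i >= window:
            running -= values[i - window]  # drop the value leaving the window
        out.append(running / min(i + 1, window))
    return out

# A weekly reporting spike is spread evenly over the following seven days.
daily = [0, 0, 0, 0, 0, 0, 7, 0]
smoothed = moving_average(daily)
```

Smoothing of this kind removes the weekly reporting pattern in daily counts before the parameters are estimated.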
After preliminary calibration tests, two disease evolution regimes were clearly identified. In the first one, the number of infective individuals increased exponentially and, in the second one, the spread was considerably reduced due to the contention measures imposed by the state of emergency declared on 12-Mar-2020. The effect of such intervention was clearly observed in the evolution of the time-dependent transmission coefficient β(t) on 19-Mar-2020 (Fig. 2 ). In order to capture possible regime changes, like the different age-range mixing, we divided the time-series into two parts, namely data before and after 19-Mar-2020. For these two time series subsets, we estimate the vector of parameters θ and the time-dependent β. Note that, for the second dataset, after 19-Mar-2020, we do not estimate the initial infective population. The estimated parameters and the corresponding 90% confidence intervals (CIs) can be found in Tab. 1. Such intervals were obtained by excluding the 5% largest and smallest values generated by 200 bootstrap samples [10] . Figure 2 presents the comparison between reported and model predicted curves for daily new infections. The time evolution of the basic reproduction rate can be found in Fig. 2 . Table 2 depicts the predicted and reported number of infections, hospitalizations, and deaths for 21-Aug-2020 with 90% CIs. In Tab. 1, the estimated values of β were far from each other in both periods, which indicates the aforementioned change in the transmission regime. Concerning the vector b, after contention, the estimated values decrease consistently, indicating that the interaction between age ranges was also significantly reduced. All these results show that, after 19-Mar-2020, disease spread was contained. However, as we shall see later on, new infection waves can occur if contention measures are relaxed. Figure 2 shows the adherence of the calibrated model predictions to the 7-day moving average of the reported number of new infections. 
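The percentile construction of the confidence intervals described above (200 bootstrap samples, discarding the 5% largest and 5% smallest values) can be sketched in Python as follows. The resampling scheme is an assumption: the sketch resamples data points with replacement and applies a generic `estimator` callable, which stands in for the full model calibration and is hypothetical.

```python
import random

# Sketch of a percentile bootstrap interval.
# ASSUMPTION: resampling with replacement; `estimator` stands in for calibration.

def bootstrap_estimates(data, estimator, n_boot=200, seed=0):
    """Re-estimate a quantity on n_boot resampled copies of the data."""
    rng = random.Random(seed)
    n = len(data)
    return [estimator([data[rng.randrange(n)] for _ in range(n)])
            for _ in range(n_boot)]

def percentile_interval(samples, level=0.90):
    """Drop the smallest and largest (1 - level) / 2 fraction of the samples."""
    ordered = sorted(samples)
    n = len(ordered)
    cut = int(round(n * (1.0 - level) / 2.0))  # 10 values per tail for n = 200
    return ordered[cut], ordered[n - 1 - cut]

# Toy usage with the sample mean as the estimator.
mean = lambda xs: sum(xs) / len(xs)
estimates = bootstrap_estimates([1.0, 2.0, 3.0, 4.0], mean)
low, high = percentile_interval(estimates)
```

With 200 samples and a 90% level, ten values are discarded from each tail, so the interval endpoints are the 11th smallest and 11th largest estimates.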
The hospitalization and death rates were evaluated using the ratios of reported data defined previously. The proportions of ICU admissions by age were obtained from [8] and they were adjusted according to the proportions of deaths available in [11]. The model accuracy is also illustrated in the tables, where the predicted values, including deaths, for each age range and gender, are close to the reported ones. As Fig. 2 clearly shows, the time-dependent basic reproduction rate $R(t)$ has two different levels. Before 19-Mar-2020, it presents large values, indicating that the disease was spreading without control. After 19-Mar-2020, its value drops to around one, which indicates control of transmission by the contention measures imposed from 12-Mar-2020 onwards. Notice that the large values of the basic reproduction rate in the first part of the series, i.e., before 19-Mar-2020, might be caused by an accumulation of reports in the beginning of the outbreak. Such accumulation can be related, for example, to difficulties faced by the health authorities in setting up an appropriate diagnosis protocol. The adherence of the calibrated model predictions to reported data, the accuracy in the number of hospitalizations and deaths, and the behavior of the calibrated parameters, which reflects the effect of disease contention measures in the NYC data, show that our proposed model captures well the Covid-19 dynamics in NYC. Therefore, it is useful to track the spread dynamics, allowing one to assess the effects of, for example, travel quarantine, social distancing, and reopening strategies. If infection curves for different age ranges are available, it is possible to use the present model to track aspects like the effects of reopening schools, universities, or parks and public gardens, since these spaces are usually frequented by people of well-defined age ranges.
Thus, it becomes easier to track the disease spread dynamics accurately and to decide whether additional reopening or further restrictions should be adopted.

Let us consider the epidemiological dynamics of the five boroughs of NYC, namely, Queens, Manhattan, Staten Island, Brooklyn, and Bronx. The data was downloaded from [11] on 02-July-2020. The calibrated model predictions of daily new infections and the corresponding time-dependent transmission coefficients can be found in Fig. 3. Figure 3, among other things, shows the adherence of the model predictions to the curves of reported daily new infections. Such accuracy is also attested by the comparison between reports and predictions of the accumulated number of total infections, hospitalizations, and deaths for 01-July-2020 in Tab. 3. The behavior of the time-dependent transmission coefficient for each borough is similar to that of the basic reproduction rate $R(t)$ in the previous example. This behavior is expected, since transmission contention measures have been in place since 12-Mar-2020 in the entire NYC. This example shows the ability of the present model to detect disease transmission patterns in different locations at the same moment. The model also accounts for the interaction between individuals from different age ranges, genders, and locations. With such features, it is possible to track the implications of reopening, to assess the necessity of additional contention measures, or to aid the design of vaccination strategies. Such broad applications could not be achieved via simpler models.

In order to test the short-term forecast capabilities of the model, we consider two different periods of the Covid-19 outbreak in NYC. We calibrate the parameters with data from reports on new infections in the period from 29-Feb-2020 to 19-Mar-2020 and we produce a seven-day forecast starting on 20-Mar-2020.
This forecasted period is of particular interest since on 19-Mar-2020 the disease spread pattern changed considerably due to the contention measures undertaken 7 days earlier, as a consequence of the state of emergency declared on 12-Mar-2020. To generate the predictions, we assume that $\beta(t)$ is constant for dates $t$ after 19-Mar-2020 and that it takes the same value as the one estimated on 19-Mar-2020. We now create a forecast for the period 24-July-2020 to 12-Aug-2020. For dates $t$ after 23-July-2020, the time-dependent values of the transmission coefficient are given by the mean of the values for the period from 13 to 23-July-2020. Figure 4 presents a comparison between the predicted and the reported data for the forecasted period. Table 4 presents the predicted and the reported values of accumulated infections, hospitalizations, and deaths from 24-July-2020 to 12-Aug-2020.

Table 4: Model predictions and reported accumulated infections, hospitalizations, and deaths for the periods 19 to 26-Mar-2020 and 24-July to 12-Aug-2020. Top rows represent predictions while bottom rows are the reported cases. Between parentheses are the 90% CIs.

As Fig. 4 and Tab. 4 show, the model predictions of infections, hospitalizations, and deaths are once again satisfactorily accurate. These results are explained by the model's ability to incorporate the disease dynamics through the time-dependent parameters. Notice that, in these examples, the rates of hospitalizations and deaths are evaluated using appropriate ratios of reported data, defined previously, until the last day of estimation. In the forecasted period, we repeat the rates for the days 19-Mar-2020 and 23-July-2020, respectively, for the corresponding data ranges studied.

We now apply the calibrated models from the previous sections to a number of plausible scenarios, such as the reopening of the entire NYC region, of schools, and of just one single borough, which we chose to be Staten Island.
The predictions for those scenarios will give us an idea of the predictive capability of our model. The aim of this example is to present possible scenarios for the Covid-19 epidemic for long periods without an effective vaccine or appropriate treatment. We consider two different situations. In the first one, the transmission parameters stay at the level of strict containment, as observed in the period from 04 to 14-June-2020. Thus, for any date $t$ after 14-June-2020, the transmission coefficient $\beta(t)$ is set as the mean of the estimated values of $\beta(t)$ in the period from 04 to 14-June-2020, i.e., $\beta(t) = 1.77$ with 90% CI 0.27-4.69. In the second case, we simulate a controlled reopening, by allowing the coefficient $\beta(t)$ to reach twice the value obtained in the previous case, i.e., $\beta(t) = 3.54$ (0.53-9.38). However, we impose that, whenever the number of daily new infections reaches 1000 cases, containment measures are undertaken, forcing $\beta(t)$ to return to lower levels until it reaches the value 1.77 (0.27-4.69). On the other hand, after undertaking containment measures, if the number of daily new infections is below 200 cases, we permit reopening, and $\beta(t)$ may grow again until it reaches the value 3.54 (0.53-9.38).

To simulate a controlled reopening of schools from 15-June-2020, we use the set of parameters estimated after 19-Mar-2020. In order to artificially increase the interaction between school-age individuals, the entries of the transmission matrix $\beta_M$ associated with the mildly infective individuals in the age range 0 to 17 years old are multiplied by 2.5. In addition, the time-dependent transmission coefficient is set to 1.25 times the mean of the estimated values for the period 06 to 15-June-2020. For this specific age range, the transmission parameter values are similar to the ones obtained for the period before 19-Mar-2020. This is expected, since controlling the mixing in the youth population at schools is difficult.
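The threshold rule of the controlled reopening scenario described above can be sketched as a two-state switch with hysteresis. The thresholds (1000 and 200 daily new infections) and the two levels of the transmission coefficient (1.77 and 3.54) come from the text; the function names and the daily step `max_change`, which controls how quickly the coefficient moves toward its target, are hypothetical choices made for illustration.

```python
# Sketch of the controlled-reopening rule as a switch with hysteresis.
# ASSUMPTION: `max_change` (daily step toward the target) is illustrative.

def next_regime(reopened, daily_new, close_at=1000, reopen_at=200):
    """Return True if the city is reopened after seeing today's count."""
    if reopened and daily_new >= close_at:
        return False  # containment measures are undertaken
    if not reopened and daily_new < reopen_at:
        return True   # reopening is permitted again
    return reopened

def target_beta(reopened, beta_low=1.77, beta_high=3.54):
    """Level that the transmission coefficient moves toward in each regime."""
    return beta_high if reopened else beta_low

def step_beta(beta, target, max_change=0.1):
    """Move beta toward the target by at most max_change per day."""
    delta = target - beta
    if abs(delta) <= max_change:
        return target
    return beta + max_change if delta > 0 else beta - max_change

# Toy trajectory of regimes for a sequence of daily counts.
regime = True
history = []
for count in [300, 1000, 500, 199, 300]:
    regime = next_regime(regime, count)
    history.append(regime)
```

Using two different thresholds avoids rapid switching between regimes when the daily count hovers around a single cutoff.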
Indeed, recent news indicates that the Covid-19 transmission rate amongst people under 19 years old is similar to the transmission rate in other age ranges [25, 31]. Therefore, even under an idealized situation, reopening schools may cause new infection waves among the entire population. Thus, monitoring transmission dynamics is of fundamental importance to set the right time for relaxing or tightening containment measures. In Israel, the recent reopening of schools caused a secondary wave of new infections that forced the adoption of new containment measures [25]. Two different scenarios are considered in this example: reopening Staten Island with and without restrictive measures. In the first one, containment measures are slightly relaxed, without allowing people from different age ranges and boroughs to interact. Quantitatively, we keep the same parameter values estimated in the period from 19-Mar-2020 onwards. The only change is in the transmission coefficient for Staten Island, which, for dates t after 01-July-2020, is set to double the mean of the corresponding estimated values of β(t) for the period 22-June-2020 to 01-July-2020. During the forecast period (02-Jul-2020 to 29-Oct-2020), the transmission coefficients for the other boroughs are kept equal to the mean of the estimated values for the period 22-June-2020 to 01-July-2020. In the second scenario, people from different boroughs and age ranges are allowed to interact, keeping some social distance and simple containment measures. Only in Staten Island, simple containment measures are undertaken. In other words, the time-independent transmission parameters assume the same values estimated in the period 29-Feb-2020 to 19-Mar-2020. In addition, for dates t after 01-July-2020, we allow the time-dependent transmission coefficient for Staten Island to reach double the mean of the estimated values for the period 22-June-2020 to 01-July-2020.
Again, after 01-July-2020, the transmission coefficients for the other boroughs are kept equal to the mean of the estimated values for the period 22-June-2020 to 01-July-2020. Whenever the number of daily new infections reaches 1000 in NYC, containment measures are undertaken again. So, the time-independent transmission parameters are brought back to the same values of the period after 19-Mar-2020 and the transmission coefficient for Staten Island is reset to the mean of the corresponding estimated values for the period 22-June-2020 to 01-July-2020. Reopening reoccurs whenever the number of daily new infections in NYC is below 100 reported cases. Figures 6-7 present the curves of daily new infections for the first and second scenarios, respectively. They show the evolution for each borough and for the entire NYC. In the first case, doubling the mean of the transmission coefficient alone is not sufficient to cause secondary waves of infection. In other words, if restrictions of movement between boroughs are kept and the interaction between age ranges stays contained, through strict social distancing measures, infection will not return and the disease outbreak will die out. On the other hand, reopening a borough and allowing people of different ages and from different boroughs to interact, although keeping some light containment and social distancing measures, can cause new waves of infection for long periods. We use a 7-day moving average to smooth the data. After calibration, the model predictions adhered to the data of daily new infections and predicted well the number of daily new hospitalizations and deaths. The adjustment of the rates of hospitalization and death using appropriate ratios of reported data contributed to improving model predictions. While backtesting, forecasts for a few days ahead, under different contexts, turned out to be accurate. The model identified well the effects of the lockdown undertaken in NYC after 12-Mar-2020.
A considerable change in the values of the transmission rates was noticed. This flattened the curve and kept the number of daily new infections low, as shown in [11]. The rates of hospitalizations and deaths were smaller by the end of August than in earlier periods of the Covid-19 outbreak in NYC. This phenomenon may be caused by changes in disease virulence or in the protocols used to address Covid-19, like a shorter mean time from onset to hospitalization, more precise Covid-19 diagnosis, or the introduction of more effective treatments [34]. Moreover, in NYC, the number of daily Covid-19 tests has been increasing consistently since the beginning of the outbreak at the end of February 2020. To account for such changes in the aforementioned rates, the corresponding model parameters were adjusted using ratios of reported data. This considerably increased the accuracy of the model predictions of hospitalizations and deaths, as the results show (Fig. 2 and Tab. 1). Concerning reopening strategies, some simulated scenarios generated with calibrated parameters indicate that there is no completely safe way of reopening schools, boroughs, or the entire city unless people respect strict social distancing protocols, avoiding direct personal contact. As model predictions show, even when only a borough or only schools are reopened, new infection waves may occur, forcing public authorities to establish containment measures again. While infective individuals are present in a population, there is the risk of new infection waves in reopening strategies, since it is not possible to guarantee that everyone will respect the containment protocols. Such a phenomenon was recently reported in the news [25, 16]. Thus, reopening must be undertaken with strict control of disease transmission, while applying massive testing and enforcing social distancing measures.
Although the role of children and teenagers in Covid-19 spread is still unknown, some recent events of disease spread amongst the youth population in an overnight camp in Georgia [31] and in schools in Israel [25] indicate that, although most individuals present mild symptoms, they can infect other people. So, reopening schools also represents a risk of new infection waves. Even New Zealand and China, which successfully contained Covid-19 spread and had long periods without registering community transmission, are now facing new cases [3, 28]. All these reports corroborate our model predictions, suggesting that, without strict control of Covid-19 through social distancing or massive and effective vaccination, a completely safe reopening may be impossible. It is worth mentioning that the present model is also suited to simulate and analyze vaccination strategies, since it addresses dependency on age and spatial distribution. This possibility shall be addressed in a forthcoming article. In May 2020, the State of New York initiated a four-phase reopening program. NYC joined the fourth phase on 20-July-2020. By 22-Aug-2020, schools and shopping malls were still closed, yet public transportation and a number of economic activities were already operational [18, 30]. Strict social distancing measures were still enforced and public authorities were closely monitoring their observance. This kept the number of daily new infections stable. Figure 2 shows the Covid-19 situation in NYC until 21-Aug-2020. The panel presents the comparison of model predictions and reports of daily new infections, as well as the time-dependent basic reproduction rate R(t). Observe that, since 19-Mar-2020, R(t) has stayed around one, meaning that the disease transmission is under control, but not eradicated, and that new infection waves may still occur.
Even after reaching the fourth phase of controlled reopening, NYC authorities managed to keep transmission under control through social distancing measures, by limiting the operational capacity of numerous services, disinfecting public transportation, and many other practices. Notice that, even if Covid-19 is eradicated in NYC, while no effective vaccine is ready to be used across the entire globe, containment measures inside NYC must be kept to avoid new outbreaks caused by the reintroduction of the disease from abroad. SEIR-type models have been proposed by a number of authors to predict qualitative aspects of the dynamics of infectious diseases in general, and of the Covid-19 pandemic in particular. See [4] for a recent account of SIR models and their connections to other models. Yet, to address the elusive aspects of the complex human interactions within the terrain, we feel that one has to forego parsimonious models. Functional and high-dimensional models have been used in a number of areas ranging from Financial Mathematics to Population Dynamics [1, 27]. They are directly connected with the mathematical theory of Inverse Problems [29, 7, 6, 15]. The present article explores several aspects of infectious diseases, including Covid-19, that have been receiving little attention in the recent literature. We consider time-dependent rates of transmission, hospitalization, and death, as well as the age- and gender-dependent severity and transmission of the disease, while taking into account the spatial distribution of the population. The model was extensively tested with real data from NYC and its boroughs. After calibration, it matched the curves of daily new infections and provided accurate predictions for the number of daily hospitalizations and deaths. It also properly detected the change in the transmission pattern on 19-Mar-2020 caused by containment measures taken on 12-Mar-2020.
Moreover, it illustrated the stabilization of the time-dependent basic reproduction rate around the value 1 in NYC. Concerning the prediction of new infections, the model was also evaluated using real data and calibrated parameters. It generated accurate results under controlled and uncontrolled transmission contexts. Moreover, different scenarios, such as the reopening of schools and of an entire borough of NYC, were illustrated. In both cases, transmission rates increased considerably, demanding new containment measures. In other words, without reaching herd immunity or complete disease eradication, we always face the risk of new infection waves. Our proposed model is sufficiently general to track transmission dynamics with dependence on age range, gender, and spatial distribution, while evaluating the disease impact on the population. Thus, it is a powerful tool to evaluate scenarios and to build proper vaccination strategies.

A Splitting Strategy for the Calibration of Jump-Diffusion Models
Athanasios Tsakris, and Constantinos Siettos, Data-based analysis, modelling and forecasting of the COVID-19 outbreak
coronavirus: 14 new Covid-19 cases reported
The challenges of modeling and forecasting the spread of COVID-19
Sex differential in COVID-19 mortality varies markedly by age
Bayesian dynamical estimation of the parameters of an SE(A)IR COVID-19 spread model
Metapopulation Network Models for Understanding, Predicting, and Managing the Coronavirus Disease COVID-19
Severe outcomes among patients with coronavirus disease 2019 (COVID-19)-United States
Tracking the Economic Impact of COVID-19 and Mitigation Policies in Europe and the United States
Fitting dynamic models to epidemic outbreaks with quantified uncertainty: a primer for parameter uncertainty, identifiability, and forecasts
COVID-19: Data
Inferring change points in the spread of COVID-19 reveals the effectiveness of interventions
On the definition and the computation of the basic reproduction ratio R_0 in models for infectious-diseases in heterogeneous populations
Regularization of Inverse Problems, Mathematics and its Applications
Europe's second wave? Country-by-country breakdown of resurging COVID-19 cases
Renato Casagrandi, and Andrea Rinaldo, Spread and dynamics of the COVID-19 epidemic in Italy: Effects of emergency containment measures
What Restrictions on Reopening Remain in New York?
Baseline characteristics and outcomes of 1591 patients infected with SARS-CoV-2 admitted to ICUs of the Lombardy region
Clinical characteristics of coronavirus disease 2019 in China
Clinical features of patients infected with 2019 novel coronavirus in
Gender Differences in Patients with COVID-19: Focus on Severity and Mortality
Modeling infectious diseases in humans and animals
A contribution to the mathematical theory of epidemics
When Covid subsided, Israel reopened its schools. It didn't go well
The Incubation Period of Coronavirus Disease 2019 (COVID-19) From Publicly Reported Confirmed Cases: Estimation and Application
China records biggest one-day rise in coronavirus cases since March, Financial Times
Statistical and Computational Inverse Problems
SARS-CoV-2 Transmission and Infection Among Attendees of an Overnight Camp - Georgia
Estimates of the severity of coronavirus disease 2019: a model-based analysis
New predictions suggest a deeper recession and a slower recovery
Corticosteroids for COVID-19
Report of the WHO-China joint mission on coronavirus disease 2019 (COVID-19)
Estimating clinical severity of COVID-19 from the transmission dynamics in Wuhan
Nowcasting and forecasting the potential domestic and international spread of the 2019-nCoV outbreak originating in Wuhan, China: a modelling study
Characteristics of and important lessons from the coronavirus disease 2019 (COVID-19) outbreak in China: summary of a report of 72 314 cases from the Chinese Center for Disease Control and Prevention

JPZ thanks Khalifa University, the Government of Abu Dhabi, and Fundação Carlos Chagas Filho de Amparo à Pesquisa do Estado do Rio de Janeiro (FAPERJ) through the program Cientistas do Nosso Estado for the support during the course of this research. JPZ would like to acknowledge very fruitful discussions with Profs. Dimitris Goussis and Leontios Hadjileontiadis (Khalifa University).