key: cord-0967229-kdwetmdq
authors: Fokas, A. S.; Dikaios, N.; Kastis, G. A.
title: Mathematical models and deep learning for predicting the number of individuals reported to be infected with SARS-CoV-2
date: 2020-08-05
journal: J R Soc Interface
DOI: 10.1098/rsif.2020.0494
sha: 860d5d4e9858fdc02b444762dbbaefb967d74c12
doc_id: 967229
cord_uid: kdwetmdq

We introduce a novel methodology for predicting the time evolution of the number of individuals in a given country reported to be infected with SARS-CoV-2. This methodology, which is based on the synergy of explicit mathematical formulae and deep learning networks, yields algorithms whose input is only the existing data in the given country of the accumulative number of individuals who are reported to be infected. The analytical formulae involve several constant parameters that were determined from the available data using an error-minimizing algorithm. The same data were also used for the training of a bidirectional long short-term memory network. We applied the above methodology to the epidemics in Italy, Spain, France, Germany, USA and Sweden. The significance of these results for evaluating the impact of easing the lockdown measures is discussed.

The novel coronavirus SARS-CoV-2 is the third coronavirus to appear in the human population in the past two decades, following the severe acute respiratory syndrome coronavirus SARS-CoV outbreak in 2002 and the Middle East syndrome coronavirus MERS-CoV outbreak in 2012. SARS-CoV-2 initially emerged in Wuhan, China, at the end of 2019; after Chinese scientists identified the sequence of the causative virus [1] , this information was immediately shared with the international community. Furthermore, China took effective measures for the containment of the spread of this outbreak. This new coronavirus is less pathogenic than the earlier two coronaviruses [2] . For example, in the first case of pneumonia caused by this virus reported in USA, a 35-year-old, healthy, individual (who had travelled in Wuhan) presented in a hospital four days after he experienced dry cough and low grade fever; he proceeded to develop pneumonia 5 days later, but quickly recovered [3] . This is the typical disease course for young persons. However, COVID-19 has a significant mortality rate for elderly persons and for those with a variety of underlying medical conditions, including respiratory and cardiovascular diseases, as well as diabetes mellitus. For example, after the identification of an individual in a skilled nursing facility in USA infected with SARS-CoV-2, extensive testing was carried out and 23 days later it was established that 57 of 89 residents (67%) were infected; as of 3 April, 11 of the infected persons had been hospitalized (three in the intensive care unit) and 15 had died (mortality 27%) [4] . A crucial factor for the high transmissibility of this virus is the high level of SARS-CoV-2 shedding in the upper respiratory tract even among asymptomatic individuals [5] . As a result of the above facts, and the lack of appropriate early international measures for the suppression of its spread, it has now caused a pandemic.

This pandemic represents the most serious global public health threat since the devastating 1918 H1N1 influenza pandemic. Justifiably, several countries have adopted draconian measures to combat this threat. The scientific community, in addition to its accelerated efforts to develop an effective treatment and a vaccination, is also playing an important role in advising policymakers of possible non-pharmacological approaches to limit the catastrophic impact of the pandemic. For example, two possible strategies, called mitigation and suppression, are thoroughly discussed in the important paper [6] ; in the early stages of the pandemic, the UK was following mitigation, but after the publication of this report, is now following suppression.

In this paper, we present a novel methodology for predicting the time evolution of the cumulative number, N(t), of the individuals reported to be infected in a given country, by SARS-CoV-2. This methodology can be used for predicting several features of the epidemic, such as the time that a plateau will be reached, as well as the total number of individuals reported to be infected at that time. Here, the plateau is defined as the time when the rate of change of the people reported to be infected is 5% of the maximum rate of infection.

Our methodology is based on two different tools: on the use of appropriate mathematical models and on the employment of deep learning networks. Regarding the first tool, our mathematical models have led to two analytical formulae, called rational and birational. The advantage of these formulae is that they provide more accurate predictions for the characteristics of the plateau than the classical logistic formula often used in epidemiology. Also, importantly, the birational model may provide an upper bound of N(t), and hence it is preferable to the rational one. However, the rational model can be constructed sooner than the birational one: the input needed for the rational model is data until around the time when the maximum rate of infection occurs, which will be denoted by T (the corresponding point on the curve describing N(t) is known as the inflection point). On the other hand, the birational model requires data for several more additional days. The effectiveness of the above two analytical formulae is supported by showing that their predictions are as accurate as those obtained via a deep learning algorithm; in particular, we employed a bidirectional long short-term memory (BiLSTM) network, which provides a powerful generalization of recurrent neural networks.

A prerequisite for the development of any accurate model is the existence of appropriate data. For the pandemic of SARS-CoV-2 such data are certainly available. For example, there exist a long series of data from Italy, Spain, France and Germany, where their SARS-CoV-2 epidemics are approaching or have passed the plateau; USA and Sweden passed the inflection point several weeks ago but appear to have a slow approach towards a plateau. The total reported cases of the above countries as a function of the number of days after the day that 500 cases were reported are shown in figure 1. Estimated rates of change (new reported cases) for Italy, Spain, France and Germany are plotted in figure 2 .

These graphs show that in all countries, except Sweden, the growth of the epidemics is similar for the first approximately 10 days after the day that the number of infected persons reached 500. However, following this period, the behaviour of the epidemics is different, presumably reflecting the type of measures and the time that these measures were implemented, in each country.

The mathematical modelling of epidemics has a long and illustrious history; it began with the Kermack-McKendrick model, introduced in 1927 [7] . In this pioneering paper, the population is divided into susceptible, infectious and recovered (removed) sub-populations. Then, specific ordinary differential equations are formulated characterizing the time evolution of the functions representing these populations. The above work was certainly ahead of its time. It was rediscovered in the 1980s, and since then it has provided the basis for a variety of deterministic models, known as SIR models (rigorous mathematical results for such models are derived in [8] ). The generalization of SIR to models involving partial differential equations is presented in [9] . A simple extension of the standard SIR model capable of modelling SARS-CoV-2 is presented in [10] . This model involves six ordinary differential equations (ODEs) specified by nine parameters. There is an underlying belief in the mathematical epidemiology community that these parameters, in principle, can be determined from the epidemiological data of the number of infected and diseased individuals. However, the analysis of the above six ODEs presented in [10] shows that this is not possible (on the other hand, it is possible to determine those combinations of the nine model parameters that specify a fourth-order ODE characterizing the time evolution of the number of deaths [10] ).

In addition to the above formidable obstacle rigorously derived in [10] concerning the determination of the SIR model parameters, the number of the individuals infected by SARS-CoV-2 is obviously different than the number of individuals royalsocietypublishing.org/journal/rsif J. R. Soc. Interface 17: 20200494 reported to be infected. These considerations make it necessary to seek a direct approach to modelling the accumulative number N(t) of individuals reported at time t to be infected by a viral epidemic. In this work, we assume that the function N(t) satisfies the ordinary differential equation

This is a Riccati equation that is specified by the timedependent function α(t) and the constant parameter N f . This constant as well as the function α(t) depend on the basic characteristics of the particular virus and on the cumulative effect of the variety of different measures taken by the given country for the prevention of the spread of the viral infection. The dependence of α(t) on time reflects various timedependent factors, including the fact that the effect of the different measures taken by the government depends on t. The case of α(t)=constant can be considered as an 'ideal' case.

Remarkably, although (1.1) is a nonlinear equation depending on time-dependent coefficients, it can be solved in closed form. Its solution depends on α(t), the constant parameter N f , and the constant of integration β:

In the particular case that α(t) is a constant denoted by κ, equations (1.2) yield the classical logistic formula

Interestingly, this simple formula is adequate for capturing the evolution N(t) of typical viral epidemics. For example, determining the three constant parameters κ, β and N f of equation (1.3) with data from the Ebola virus epidemic of 2014 in Guinea, we find the excellent fit depicted in figure 3 . Importantly, the above parameters remain essentially unchanged if we use a smaller set of data for their determination, which shows that the logistic model could also have been used for royalsocietypublishing.org/journal/rsif J. R. Soc. Interface 17: 20200494

predictive purposes (throughout this paper the unknown parameters are determined by employing an error-minimizing algorithm described in §2.1). The cases of Liberia and Sierra Leone are very similar to the case of Guinea (the relevant data were obtained from the official site of the Centers for Disease Control and Prevention (CDC); they are official World Health Organization (WHO) data) 1 .

As it will be shown in §3, the simple formula (1.3) also provides a good fit of the SARS-CoV-2 pandemic. However, the long series of existing data of the epidemics of Italy, Spain, France and Germany show that the logistic model does not provide accurate predictions. For example, figure 4a shows that if we use a subset of the existing data of the epidemic in Italy for the determination of the parameters of the logistic model, and then compare the resulting graph of N(t) with the remaining available data, we find that the logistic model underestimates N(t); in other words, the logistic model provides a lower bound of the actual N(t). This raises the following natural question: is it possible to find a formula yielding more accurate predictions than the logistic one? After experimenting with more than 50 different forms of α(t), we have obtained an affirmative answer to the above question, by introducing two novel formulae which will be referred to as rational and birational. In the former case, the exponential function appearing in equation (1.3) is replaced by an algebraic function; in the birational formula, the values of the parameters specifying this algebraic function change, depending on whether t is larger or smaller than a parameter denoted by X.

In this work, we implemented the following tasks associated with the epidemics in Italy, Spain, France and Germany: (i) we determined the parameters of the logistic, rational and birational formulae, by training the above formulae using only a subset of the data, namely data up to T + 25 (the inflection point T was determined by fitting the logistic model over the whole dataset of each country, namely up to 24 May 2020). By plotting the prediction curves of these formulae against the actual data up to 24 May 2020, it becomes evident that first, the logistic formula provides a lower bound, and second, that the rational and birational formulae generate more accurate predictions. Furthermore, for the cases of Italy, the birational formula generates a curve which is above the curve of the data, suggesting that the birational model may provide an upper bound. (ii) We established that the rational and birational formulae have similar predicting performance with a far more complicated deep learning network, which does not depend on the assumption of the validity of equation (1.1). This finding highlights the suitability of the proposed mathematical models for predictive purposes. (iii) We computed the time of the plateau as well as the value of N at the plateau using the logistic, the rational and the birational formulae, as well as the BiLSTM network.

In particular, the birational model yielded the following estimates for the dates that the plateau will be reached, as well as for the number of individuals reported to be infected with SARS-CoV-2 when this occurs: Italy, plateau on 11 June 2020, with 242 315 individuals reported to be infected (as noted above, the estimate for Italy may be an upper According to the distinguished philosopher of science Karl Popper, a necessary requirement for the validity of a theory is that it is falsifiable. This notion suggests that supportive evidence for the validity of our approach can be provided by presenting cases where our models fail. For this purpose, we have investigated the epidemics of Sweden and of USA. In the case of Sweden, only partial restrictions were applied instead of strict lockdown measures, whereas the different measures adopted by different states makes USA very difficult to model. Indeed, for the epidemics in these two countries, after using a subset of the available data to train the three models and the neural network, we could not obtain curves that were close to the curves of the remaining data. Thus, for USA and Sweden, we have presented the parameters determined with data up to 24 May 2020, with the understanding that we do not expect neither the associated analytical formulae nor the BiLSTM network to provide accurate predictions (for these two countries the following numbers provide rather inaccurate lower bounds: USA, plateau on 28 August 2020, with 2 264 338 reported individuals; Sweden, plateau on 9 October 2020, with 57 540 reported individuals).

Let F denote the relative infectivity of the epidemic, defined by

We assume that F is a linear, time-dependent function of N. Let the constant N f denote the final cumulative number of individuals reported to be infected. Taking into consideration that F vanishes when N = N f , we have

Inserting equation (2.2) in the definition (2.1) we find the basic equation modelling this situation, namely the Riccati equation (1.1).

The particular case of a Riccati equation with constant coefficients (corresponding to the case that α(t) is constant) has appeared in a plethora of dynamic processes, including the modelling of epidemics. Indeed, in the classical SIR model mentioned in the introduction, if one assumes that R = 0, then after replacing in the first-order differential equation satisfied by I, S with I-T, where the constant T denotes the total population, one finds a Riccati equation of the form (1.3), where α(t) is replaced by a constant. Another notable example of the appearance of a constant coefficients Riccati equation in the mathematical modelling of infectious processes can be found in the paper of Anderson and May [11] ; this work describes the dynamic interaction of parasites with their host environment. In this paper, whose impact in the field of mathematical biology was far reaching [12] , a Riccati equation is formulated that involves a single constant parameter 2 .

In order to solve the Riccati equation (1.1), we first use the independent change of variables specified by the second of equations (1.2). This gives rise to an equation similar to equation (1.1), where α is replaced by 1 and t by τ. This constant coefficients Riccati equation can be linearized via the change of the dependent variable specified by

Indeed, substituting equation (2.3) into the above constant coefficients Riccati equation and simplifying, we find

Solving this equation, substituting the resulting expression in equation (2.3), and simplifying, we find the first expression in equation (1.2). We expect that the validity of the above model improves as t increases. Thus, we avoid evaluating the first of equations (1.2) at τ =0 to express β in terms of N f and N at τ = 0. Instead we determine β by matching the expression obtained from the first of equations (1.2) with the actual data. Important information provided by the above model is the time T when the maximum rate of infection is achieved: computing the second derivative of the right-hand side of the first of equations (1.2) and requiring that the resulting expression vanishes, we find that T satisfies the equation

where throughout this paper prime denoted differentiation with respect to time. Using the above expression in the exponential occurring in the expressions for N and N 0 evaluated at t = T, we find.

For the logistic model we have α(t) = κ. Thus, equations (2.4) and (2.5) yield

Taking into consideration that the logistic formula is a good approximation of the relevant dynamic process, the above value of T provides an approximate value for the time that the inflection point is reached. The rational and birational models are defined, respectively, as follows:

where X is a constant parameter in the vicinity of T.

royalsocietypublishing.org/journal/rsif J. R. Soc. Interface 17: 20200494

Letting in equation (2.8) t ! 1, we find

By comparing equations (2.7) and (2.8) with the first of equations (1.2), it is straightforward to determine α(t) for both the rational and the birational models: for the rational model

which justifies the terminology 'rational'. For the birational model

If b, c, d, k are close to b 1 , c 1, d 1 , k 1 , then N f is close to c 1, and hence the value of α(t) for t > X is close to the value of α(t) for t < X.

Computing the second derivative of the right-hand side of equation (2.7) and equating the resulting expression to zero, we find that the value of T for the rational model is characterized by the equation

Similarly, for the birational model where the parameters b, d, k are replaced with b 1 , d 1 , k 1, respectively. Replacing the rational function in the expressions for N and of its derivative with the rational function of equation (2.10), we find that for the rational model

Similar expressions are valid for the birational model. The birational model is based on the natural assumption that the parameters of the rational function specifying the function N(t) are different before and after T. It is quite satisfying that this very simple model yields the best fits among more than 50 models that were tested. Included in these models were several 'fractal' models; in the simplest such model, the exponent kt in the logistic formula was replaced with k × t μ . Among other models investigated was the generalized logistic formula

where N l and N u are the upper and lower values of the relevant curve. This expression provides the general solution of the time-independent ODE

It turns out that this equation is actually equivalent with the time-dependent ODE (1.1), where

In other words, the generalized logistic formula is the general solution of equation (1.1) with the above choices of N f and α(t).

We obtained the time-series data for the coronavirus disease (COVID-19) for the counties studies here from the official site of the European Centre for Disease Prevention and Control. 3 We arranged the data in the form of individuals N reported to be infected over time measured in days, after the day that the number of cases reached 500.

All evaluated formulae were fitted using the simplex algorithm, which is an iterative procedure that does not need information regarding the derivative of the function under consideration. The algorithm creates a 'random' simplex of n + 1 points, where n is the number of the model parameters that need to be estimated. The simplex changes iteratively by reflection, expansion and contraction steps until it finds the model parameters that minimize the given likelihood function. The constrained variation of the simplex algorithm [13, 14] available in MATLAB ® was used for all tested formulae; an L 1 -norm was employed in the likelihood function to improve robustness [15] . The simplex algorithm is particularly effective for cases where the gradient of the likelihood functions is not easy to calculate. Random parameter initializations were used to avoid local minima. The simplex algorithm was chosen because it performed better than certain nonlinear leastsquares curve fitting algorithms evaluated in this work, namely the Levenberg-Marquardt [16] and the trust-regionreflective [17] algorithms.

The stability of the fitting procedure was established using the following simple criterion: different fitting attempts based on the use of a fixed number of data points must yield curves which have the same form beyond the above fixed number of points. The fitting accuracy of each model was evaluated by fitting the associated formula on all the available data in a specified set. The relevant parameters specifying the logistic, rational and birational formula are given in table 1. For computing the inflection point T, we require the time that the derivative of N becomes maximum. For this purpose, we used the model with the best fit in the neighbourhood of the inflection point, which in most cases turned out to be the birational model. For all countries presented for the fitting of the birational model we used X = T.

Machine learning and in particular deep learning have had a transformative impact in many areas of science and technology; furthermore, they have begun to have a significant impact in medicine [18] . The first important relevant applications of neural networks were related to the employment of artificial neural networks (ANN) with one hidden layer and a finite number of neurons. According to the universal approximation theorem [19] , by estimating the weights of these neurons, ANN can learn any nonlinear function; but this may require a large number of neurons. In order to fit complex multivariate functions, such as those required to model the number of individuals reported to be infected by SARS-CoV-2, deeper neural networks (many hidden layers) may provide a more efficient alternative. Each hidden layer of an ANN has its own weights and acts independently, hence ANN cannot capture sequential information of time series, such as the cumulative number of individuals reported to be infected. On the other hand, this can be easily achieved using recurrent neural networks (RNN), where the current royalsocietypublishing.org/journal/rsif J. R. Soc. Interface 17: 20200494 state of the hidden layers h t at each time step t is estimated via the process of combining the current input x t with the previous state of the hidden layers h t−1 :

In this equation, σ is the activation function, w hh are the weights of the recurrent neurons, and w xh are the weights of the input neurons. The final state of the hidden layers is estimated after the network goes, sequentially, through all the inputs of the time series. The prediction of the RNN is given by y t ¼ w hy h t , where w hy are the weights of the output neurons.

A common problem with both ANN and RNN, particularly when many hidden layers are used, is the vanishing or the exploding of the gradient in the backpropagation algorithm occurring during the process of updating the weights. Unlike ANN, RNN can capture sequential information, but cannot learn from long-term dependencies.

Hochreiter & Schmidhuber introduced in [20] the long Table 1 . Model parameters and plateau characteristics for the logistic, rational, birational models as well as for the BiLSTM network, for the SARS-CoV-2 epidemics of Italy, Spain, France and Germany. For completeness, the epidemics of USA and Sweden are also included, although for these cases all three analytical formulae and the deep learning network are not able to provide accurate predictions. short-term memory (LSTM) networks that can capture longterm dependencies and at the same time avoid the problem of vanishing/exploding gradients. LSTM are a type of RNN, where a memory cell maintains its state over time; they use gates to decide whether to flow information in (keep) or out (forget) of the memory cell. The first step of LSTM decides which information to 'forget' from a memory cell state; this step and can be expressed in the form

where σ is an activation function (usually sigmoid) to output a value f t [ [0,1] . For f t = 1, it flows all information in the memory cell C t−1 , whereas for f t = 0 it forgets all information. The second step involves the input gates, denoted by in t and Ct, respectively, which synergistically decide which information will be added in the memory cell. First, the input is restricted between −1 and 1 using the tanh layer,

Then, Ct is multiplied by the input gate in t to decide the values to be added in the current state of the memory cell to replace the ones the network forgot:

The current state C t aggregates the old memory state via the forget gate,

In the final step, the output gate decides which information of the cell state to output:

The final output of the cell memory is given by

Graves & Schmidhuber introduced in [21] the BiLSTM networks; this development was motivated by the bidirectional RNN networks, introduced earlier by Schuster & Paliwal in [22] , where the training runs forwards and backwards using two separate RNN. Consequently, the main difference is the training sequence: in the LSTM, the sequence runs backwards, preserving information from the future, whereas in the bidirectional LSTM training, the sequence runs backward and forward preserving information from both the past and the future. The resulting forward and backward hidden values of the bidirectional LSTM, namely h f and h b respectively, are concatenated giving the final output h t = (h f ,h b ). The bidirectional LSTM are well suited for time-series prediction and can potentially completely capture the contextual information of the time series.

Several machine algorithms have been validated for the purposes of this study such as ANN, RNN, LSTM and BiLSTM. The BiLSTM network was chosen based on fitting performance and prediction accuracy. The relevant model was implemented in Matlab; it consisted of two Bi-LSTM layers and a fully connected layer, using the ReLu activation function. The model was optimized by the algorithm of adaptive learning rate [23] , using as a loss function the root mean square error ( plus an L 2 regularization term to avoid overfitting) at learning rate 0.01. The batch size used was 1. Dropout was included in the bidirectional LSTM layer, and the proportion of disconnection was 0.1. Each model was optimized by training for 2000 epochs. Hyperparameters such as the number of hidden units and the regularization factor of the loss function were optimized using, as part of the training, a grid search approach for each network. Table 1 presents the parameters for the three different models as applied to Italy, Spain, France, Germany, USA and Sweden. This table also presents the fitting accuracy and plateau characteristics for all the aforementioned countries for all three mathematical models, as well as for the BiLSTM prediction. For Italy, Spain, France and Germany, only a subset of the available data were trained, namely data up to T + 25. In this case, T corresponds to the inflection point of the complete dataset, namely data up to 24 May 2020. The constant T, where the inflection point occurs, was determined by fitting the logistic model in the complete dataset (the other models yield similar results). For Italy, Spain, France and Germany, the inflection point occurred at t = 37, t = 28, t = 32 and t = 30, respectively. The inflection point for USA and Sweden, which was also determined from the complete set of data (up to 24 May 2020), occurred at t = 45 for both countries; for USA and Sweden this corresponds to 22 and 26 April 2020, respectively.

Incidentally, if one uses data only up to T + 25, then the inflection point is slightly different than the one computed from all the data (up to 24 May 2020): for Italy, Spain, France and Germany, it is found to occur at t = 33, t = 26, t = 30 and t = 28, respectively (table 1) . This corresponds to 31 March, 31 March, 5 April and 3 April 2020, respectively. Figure 4 presents the predictions made by the analytical formulae and the deep learning network versus the actual data for the cumulative number of reported cases due to SARS-CoV-2, as a function of days after 500 cases were reported, for the epidemics in Italy, Spain, France and Germany. The parameters of the analytical formulae and for the deep learning network were obtained using a smaller set of the available data for each country, namely up to T + 25, which for Italy, Spain, France and Germany, corresponds to t = 62, t = 53, t = 57 and t = 55, respectively. It is important to emphasize that each of the three mathematical models, namely equations (1.3), (2.7), and (2.8), as well as the deep learning model, can fit the trained data quite well (see R 2 in  table 1 ). However, the predictive capacity of these formulae is not the same. This is best illustrated by comparing the predictive curves of each model with the remaining available data up to 24 May 2020 (in red). For the epidemic of Italy For completeness, we also present the time of the plateau and the corresponding value of N for the epidemics of USA and Sweden, although, as stated in the introduction, neither the explicit formulae nor the deep learning network provide accurate predictions. As stated earlier, in these cases, the three explicit formulae and the BiLSTM network were trained with data up to 24 May 2020. For the epidemic of USA the logistic model predicts a plateau on 11 June 2020 (day 95 after the day that 500 cases were reported) with 1 579 077 reported cases; the rational model predicts a plateau on 12 August 2020 (day 157) with 2,056,370 reported cases; the birational model predicts a plateau on 28 August 2020 (day 173) with 2 264 338 reported cases and the BiLSTM network predicts a plateau on 11 July 2020 (day 125) with 2 072 246 reported cases.

For the epidemic of Sweden, the logistic model predicts a plateau on 27 June 2020 (day 107 after the day that 500 cases were reported) with 35 868 reported cases; the rational model predicts a plateau on 8 October 2020 (day 210) with 53 724 reported cases; the birational model predicts a plateau on 9 October 2020 (day 211) with 57 540 reported cases and the BiLSTM network predicts a plateau on 25 June 2020 (day 105) with 46 111 reported cases.

Several useful models elucidating aspect of the COVID-19 pandemic have already appeared in the literature; they include the following: (i) a model for simulating the transmissibility of SARS-CoV-2 from bats to humans is presented in [24] . (ii) The calculations of exponential growth and maximum likelihood are used in [25] to determine the reproductive number of SARS-CoV-2 and SARS in China. (iii) The formulation of a susceptible-infected-recovered-dead (SIDR) model, together with the knowledge of data from China in the period 11 January to 10 February 2020, is used in [26] to estimate the associated per day infection mortality and recovery rates. (iv) In [27] , by combining a stochastic model for the SARS-CoV-2 infection with the knowledge of data from China during January and February 2020, the probability that newly introduced cases might generate new outbreaks is calculated. (v) In [28] , an SIDR model supplemented with mean-field kinetics is used to calculate the time and peak of confirmed infected individuals in China, Italy and France. (vi) In [29] , the effect of social distancing was studied by using a model where the population was divided into those who are asymptomatic or have mild symptoms (95.6%), those who are hospitalized but do not require critical care (3.08%), and individuals who require critical care (1.32%); seasonal variations were incorporated by allowing the basic reproduction number to be a time-dependent function following a cosine curve that peaks in early December.

The above references, as well as the references [30] [31] [32] [33] [34] [35] [36] [37] [38] [39] that also contain interesting results, represent only a tiny fraction of more than 3000 publications that have appeared in the last four months in arXiv, medRxiv and bioRxiv.

Here, we have modelled the cumulative number N(t) of persons reported to be infected by SARS-CoV-2 in a given country as a function of time, in terms of the Riccati equation (1.1). In addition, we have introduced a particular deep learning network which is capable of predicting accurately the time evolution of N.

Regarding equation (1.1), it is noted that although it is a nonlinear ODE containing time-dependent coefficients, it was solved in closed form, yielding (1.2). For appropriately chosen functions α(t), the first of equations (1.2) provides a flexible generalization of the classical logistic formula that has been employed in a great variety of applications, including the modelling of infectious processes. The fact that α is now a function of t has important implications. In particular, it made it possible to construct the rational and birational formulae which provide more accurate predictions than the logistic formula.

The construction of the exact solution of equation (1.1), given by equations (1.2), has the following consequences: (i) for the case that an infection that has been stabilized, any of the three analytical formulae presented here can be used for the characterisation of the evolution of the cumulative number of persons reported to be infected. These expressions can be used for a variety of purposes. (ii) More importantly, the rational and birational models can be used for predictive purposes, providing accurate estimates for the characteristics of the plateau. (iii) Our approach has the capacity to provide increasingly accurate predictions: as soon as the epidemic in a given country passes the time T, the rational model can be used; furthermore, when the sigmoidal part of the curve is approached, the rational model can be supplemented with the birational model (a simple criterion of checking whether the birational model can be used is given in §2.1). Also, as more data become available, the parameters of the rational and of the birational models can be re-evaluated; this will yield better predictions. (iv) The Riccati equation (1.1) together with the flexibility of the arbitrariness of α(t), offer the possibility of deciphering basic physiological mechanisms royalsocietypublishing.org/journal/rsif J. R. Soc. Interface 17: 20200494 dictating the evolution of N(t). In particular, following the transient stage of the epidemic, it is envisioned that α(t) becomes a function of N instead of a function of t. By plotting α in terms of N it is possible to scrutinize a posteriori such a relationship: we find that after t approximately equal to T − 7 the relation between α and N/N f is, remarkably, linear, and almost identical for all studied countries, figure 5.

How can the success of the simple mathematical model expressed by (1.1) be explained? Apparently, the constant N f defining equation (1.1), the constant of integration β entering the associated solution, and the constant parameters specifying the function α(t), capture the essence of the underlying time evolution process. This suggests that the cumulative effects of a variety of different mechanisms express themselves via the few parameters entering in the explicit solution formulae (1.3), (2.7) and (2.8) . In this connection, it is worth recalling that the single parameter characterizing the Riccati equation of the celebrated Anderson-May model mentioned earlier represents the cumulative effect of different biological mechanisms 4 .

An additional partial explanation of the success of equation (1.1) is that the implementation of the formulae obtained via (1.1) for predicting the number of individuals infected by SARS-CoV-2 shares the same philosophy employed by the powerful technique of machine learning. Indeed, the explicit formulae (1.3), (2.7) and (2.8) used in this work can be thought of as 'algorithms', where given t, they predict N; these algorithms are characterized by several parameters, which are fixed by the 'knowledge' of the data. Thus, the more data are available, the better this algorithm 'learns' how to make accurate predictions. Hence, choosing these parameters by requiring that the analytical solution matches the data curve is consistent with the approach of machine learning.

An important part of this work is the presentation of detailed comparisons between the predictions of the analytical formulae and those obtained via a BiLSTM network. As discussed in the results section, the rational and birational formulae yield similar predictions with those of the above network.

As noted in the introduction, it is not possible using the epidemiological data of infected and deceased individuals to determine the parameters of SIR type models that specify the time evolution of the number of individuals infected by a given viral infection. On the other hand, assuming that the number of individuals reported to be infected is a time-invariant percentage of the actual number of infected persons, the analytical formulae as well as the deep learning algorithm discussed here can be used to predict the time evolution of the number of infected individuals. This information can be useful for a variety of purposes. In particular, following the expected decline of the 'first wave' of infections, several countries have begun easing the lockdown measures. This was vital, not only for economical but also for health considerations. Indeed, the psychological impact on the population at large of the current situation is substantial [40] ; furthermore, this is expected to worsen, especially due to the effect of the post-traumatic disorder. Under the assumption that the characteristics of the virus remains unchanged, SIR type models predict that in the post-lockdown period the number of reported infected individuals as well as the number of deaths will begin to grow [10, 41, 42] . At this stage, the predictions made here and in [43] will cease to be accurate. However, these works are still very valuable: they can be used to compute the additional number of reported infected individuals and deaths caused by easing the lockdown measures. If these numbers are very small, it would mean that the assumption made in the SIR type models regarding the time invariance of the virus characteristic is no longer valid, and that the virus has mutated to one which is less contagious and less virulent. In this connection, we note that France begun easing the lockdown restriction on 11 May thus, since this effect will not be evident for approximately two weeks, it would not have been detected by 24 May 2020, which was the last day of our analysis. However, Germany begun easing the measures on 15 April (t = 40) and accelerated further this process on 30 April (t = 55) when, for example, playgrounds and churches were reopened; furthermore, all state-wide curfews were lifted on 9 May (t = 64). The fact that for the epidemic in Germany, our model predicts the situation quite well until at least 24 May, implies that the partial easing of the measures did not have any impact on the number of reported infected. This is most encouraging, pointing towards the possibility of the emergence of a less contagious virus.

We conclude with some general remarks: (i) taking into consideration the ubiquitous use of the logistic formula, and the fact that equations (2.7) and (2.8) provide variations of this formula, the rational and birational formulae may be useful for the modelling of a variety of phenomena. (ii) The fact that two different viral infections, namely SARS-CoV-2 and Ebola, are modelled by the same ODE, suggests that the Riccati equation (1.1) analysed here may play a generic role in the modelling of aspects of viral epidemics. (iii) The celebrated Burgers' equation, which is an evolution partial royalsocietypublishing.org/journal/rsif J. R. Soc. Interface 17: 20200494 differential equation combining the generic effects of diffusion and nonlinear convection, admits a travelling wave solution that satisfies the Riccati equation (1.1) with α(t) a constant (this constant specifies the speed of propagation of the travelling wave, whereas N f is a free parameter appearing in Burgers' equation). Hence, the mathematical analysis presented in this work may also be relevant for some of the phenomena modelled by appropriate generalizations of the Burgers' equation.

Data accessibility. The time-series data for the coronavirus disease for the counties studies here were obtained from the official site of the European Centre for Disease Prevention and Control (https://www.ecdc.europa.eu/en/publications-data/downloadtodays-data-geographic-distribution-covid-19-cases-worldwide). Analytic spread sheets of our fitting results are available as part of the electronic supplementary material.

A pneumonia outbreak associated with a new coronavirus of probable bat origin

A novel coronavirus emerging in China -key questions for impact assessment

First case of 2019 novel coronavirus in the United States

Presymptomatic SARS-CoV-2 infections and transmission in a skilled nursing facility

Virological assessment of hospitalized patients with COVID-2019

Impact of non-pharmaceutical interventions (NPIs) to reduce COVID-19 mortality and healthcare demand. London, UK: Imperial College COVID-19 Response Team

A contribution to the mathematical theory of epidemics

Optimal control for a SIR epidemic model with nonlinear incidence rate

A mathematical modeling approach to infectious diseases, cross diffusion PDE models for epidemiology

Two alternative scenarios for easing COVID-19 lockdown measures: one reasonable and one catastrophic. medRxiv

The population dynamics of microparasites and their invertebrate hosts

The population dynamics of microparasites and their invertebrate hosts

Bound constrained optimization, computer program. MATLAB and Simulink Centre

Convergence properties of the Nelder-Mead simplex method in low dimensions

Numerical recipes in C

The Levenberg-Marquardt algorithm: implementation and theory. Numerical analysis

An interior, trust region approach for nonlinear minimization subject to bounds

Deep medicine: how artificial intelligence can make healthcare human again

Multilayer feedforward networks are universal approximators

Long short-term memory

Framewise phoneme classification with bidirectional LSTM and other neural network architectures

Bidirectional recurrent neural networks

ADAM: a method for stochastic optimization. ICLR

A mathematical model for simulating the phasebased transmissibility of a novel coronavirus

Time-varying transmission dynamics of Novel Coronavirus Pneumonia in China

2020 Data-based analysis, modelling and forecasting of the Covid-19 outbreak

Early dynamics of transmission and control of COVID-19: a mathematical modelling study

Analysis and forecast of COVID-19 spreading in China

Social distancing strategies for curbing the COVID-19 epidemic

2020 Modeling and forecasting the early evolution of the Covid-19 pandemic in Brazil

2020 Model studies on the Covid-19 pandemic in Sweden

Modeling Covid-19 dynamics for real-time estimates and projections: an application to Albanian data. medRxiv

The first 100 days: modeling the evolution of the Covid-19 pandemic

A mathematical model for the novel coronavirus epidemic in Wuhan

Projected development of Covid-19 in Louisiana

2020 A mathematical model for the spatiotemporal epidemic spreading of Covid-19. medRxiv

A spatial model of Covid-19 transmission in England and Wales: early spread and peak timing. medRxiv

2020 Tracing DAY-ZERO and forecasting the fade out of the Covid-19 outbreak in Lombardy, Italy: a compartmental modelling and numerical optimization approach. medRxiv

Atmospheric conditions influencing the origin, transport and spread of major influenza and coronavirus pandemics: a research rubric based on COVID-19 outbreak

2020 The psychological impact of quarantine and how to reduce it: rapid review of the evidence

In press. A quantitative methodology for designing an optimal exit strategy from the COVID-19 Lockdown

2020 Age, death risk, and the design of an exit strategy: a guide for policymakers and for citizens who want to stay alive

Covid-19: predictive mathematical models for the number of deaths

Authors' contributions. A.S.F. conceived, designed and supervised the study. N.D. and G.A.K. performed the fitting experiments and analysis of the results. A.S.F. wrote the initial draft of the manuscript. All authors critically revised and improved the manuscript in various ways. All authors reviewed the manuscript and gave final approval for publication.Competing interests. We declare we have no competing interests. Funding. A.S.F. acknowledges support from EPSRC via a senior fellowship. N.D. acknowledges support from the Royal Society in the form of a fellowship.

A.S.F. is grateful to V. Marmarelis, whose deep insight regarding the remarkable range of the applicability of the Riccati equation provided the impetus for this work. He also thanks K. Kalimeris and Y. C. Yortsos for useful suggestions. Endnotes 1 https://www.cdc.gov/vhf/ebola/history/2014-2016-outbreak/ case-counts.html. 2 It should be noted that certain optimal features of the dynamics of the SIR model are also characterized by a Riccati equation that involves three time-dependent functions [12] . 3 https://www.ecdc.europa.eu/en/geographical-distribution-2019ncov-cases. 4 It is shown in [11] that this constant can be expressed in terms of the population size parameter H, the mortality rate of uninfected hosts parameter b, the mortality rate of infected hosts parameter α + b, and the rate at which infected hosts recover and become susceptible parameter γ.Note added in proof Figure 1 of the electronic supplementary material presents the updated curves of figures 4 for the period 24 May to 17 June. The curves associated with the epidemics of Italy and Spain, where most of the lockdown measures remain in effect, provide strong evidence for the capacity of our models to make accurate predictions. The curves associated with the epidemics of France and Germany provide strong evidence that the effect of the elimination of the lockdown measures has only a slight effect on the numbers of reported infected individuals as compared with the numbers predicted under the lockdown conditions. Remarkably, the large bump on the actual data of France occurred on 29 May (t = 84), which is indeed approximately two weeks after the easing of the lockdown measures (that occurred on 11 May). Furthermore, for the epidemic in Germany, the predicted curve began to deviate from the curve of the actual data on 25 May (t = 80), i.e. approximately two weeks after all state-wise curfews were eliminated.A similar analysis presented in [43] shows that the number of deaths caused by COVID-19 has not been affected by the easing of the lockdown measures in the above countries. Taken together, these results suggest that the virus responsible for the current pandemic, at least in the above four countries, has muted to one which is both less contagious and less virulent. An alternative explanation is that the older and the vulnerable individuals continue to observe the lockdown measures.For completeness, in figure 2 in the electronic supplementary material, we present a comparison of the actual data with predictions for the epidemics in USA and Sweden. These curves illustrate clearly the failure of both the analytical formulae and of the deep learning network to make accurate predictions.