key: cord-0457272-15jmdv1f authors: Shaier, Sagi; Raissi, Maziar; Seshaiyer, Padmanabhan title: Data-driven approaches for predicting spread of infectious diseases through DINNs: Disease Informed Neural Networks date: 2021-10-11 journal: nan DOI: nan sha: ef764551bbe7d7d8fc69a5c89e8c4038dd2c2bbe doc_id: 457272 cord_uid: 15jmdv1f

In this work, we present an approach called Disease Informed Neural Networks (DINNs) that can be employed to effectively predict the spread of infectious diseases. This approach builds on successful physics informed neural network (PINN) approaches that have been applied to a variety of applications that can be modeled by linear and non-linear ordinary and partial differential equations. Specifically, we build on the application of PINNs to SIR compartmental models and expand it to a scaffolded family of mathematical models describing various infectious diseases. We show how the neural networks are capable of learning how diseases spread, forecasting their progression, and finding their unique parameters (e.g. death rate). To demonstrate the robustness and efficacy of DINNs, we apply the approach to eleven highly infectious diseases that have been modeled in increasing levels of complexity. Our computational experiments suggest that DINNs is a reliable candidate for effectively learning the dynamics of disease spread and forecasting its progression into the future from available real-world data.

Understanding the early transmission dynamics of infectious diseases has never been more important in history than it is today. The outbreak of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), which led to several thousand confirmed cases across the globe as of March 2020, has challenged us to re-envision how we model, analyze, and simulate infectious diseases and how we evaluate the effectiveness of non-pharmaceutical control measures as important mechanisms for assessing the potential for sustained transmission to occur in new areas.

There are several examples throughout history that have led humans to learn more about the nature of infectious diseases, including the Plague in 542 CE that claimed millions of lives and the Black Death in the 14th century, which was one of the first known recorded pandemics [1]. Another in this list was Smallpox, which killed people in numbers that exceeded those of any who have fought in wars in history. To this date, however, Smallpox is the only disease that human beings have been able to eradicate completely. Cholera, which erupted in the nineteenth century, remains a concern and still does not have a complete cure. While Plague, Black Death, Smallpox, and Cholera impacted several million people, it was not until the 1918 influenza pandemic that people experienced one of the greatest "natural disasters" in terms of a twentieth century infectious disease, with a death count estimated to be more than 50 million.

Within a decade after this pandemic, the Kermack-McKendrick epidemic model of 1927 was introduced as an age of infection model, that is, a model in which the infectivity of an individual depends on the time since the individual became infective [2]. This was considered one of the earliest attempts to formulate a simple mathematical model to predict the spread of an infectious disease, where the population being studied is divided into compartments, namely a susceptible class S, an infective class I, and a removed class R. This simple SIR epidemic model can be illustrated in compartments as in Figure 1.
Not only was it capable of generating realistic single-epidemic outbreaks, but it also provided important theoretical epidemiological insights. In Figure 1, it is assumed that each individual resides within exactly one compartment and can move from one compartment to another. The dynamics of the three sub-populations S(t), I(t) and R(t) may be described by the following SIR model given by first order coupled ordinary differential equations (ODE) [3, 4, 5, 6]:

$$\frac{dS}{dt} = -\frac{\beta S I}{N}, \qquad \frac{dI}{dt} = \frac{\beta S I}{N} - \alpha I, \qquad \frac{dR}{dt} = \alpha I \tag{1}$$

Note that this closed system does not allow any births/deaths. The SIR model in system (1) is fully specified by prescribing the transmission rate β and recovery rate α along with a set of initial conditions S(0), I(0) and R(0). The total population N at time t = 0 is given by N = S(0) + I(0) + R(0). Adding all the equations in system (1), we notice that N'(t) = 0, and therefore N(t) is a constant equal to its initial value. One can further assume R(0) = 0 since no one has yet had a chance to recover or die. Thus a choice of I(0) = I_0 is enough to define the system at t = 0, since then S(0) = N − I_0.
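To make this concrete, system (1) can be integrated forward in time in a few lines. The sketch below uses SciPy's LSODA method, the same solver family used later in this paper to generate training data; the rate values and initial conditions are illustrative assumptions rather than values from the literature.

```python
import numpy as np
from scipy.integrate import solve_ivp

beta, alpha = 0.5, 0.1           # illustrative transmission and recovery rates
S0, I0, R0 = 990.0, 10.0, 0.0    # illustrative initial conditions
N = S0 + I0 + R0                 # total population, constant in time

def sir_rhs(t, y):
    S, I, R = y
    dS = -beta * S * I / N            # susceptibles becoming infected
    dI = beta * S * I / N - alpha * I
    dR = alpha * I                    # infectives recovering at rate alpha
    return [dS, dI, dR]

t_eval = np.linspace(0.0, 100.0, 100)    # on the order of 100 data points
sol = solve_ivp(sir_rhs, (0.0, 100.0), [S0, I0, R0],
                method="LSODA", t_eval=t_eval)
S, I, R = sol.y
print(float(np.abs(S + I + R - N).max()))  # ~0: N(t) is conserved, i.e. N'(t) = 0
```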
Following the influenza pandemic, several countries and leading organizations increased funding and attention toward finding cures for infectious diseases in the form of vaccines and medicines. Along with these policy implementations, newer modified SIR models for mathematical epidemiology continued to evolve, particularly for those diseases that are categorized as re-emerging infections [7], those that are spread through sexual transmission such as HIV [8, 9], those that are spread through vectors such as mosquitoes, as with Malaria or Dengue [10, 11], those that can spread through both sexual and vector transmission such as Zika [12, 13], and those that can be spread by viruses, including SARS and MERS [14, 15]. Diseases were also categorized according to the rate at which they spread, for example, super-spreader diseases. This point is especially relevant to COVID-19 [16, 17], categorized as a super-spreader disease based on the disproportionately fast rate and large (and growing) number of infected persons.

Along with the development of mathematical modeling, a variety of approaches have been introduced to estimate parameters such as the transmission, infection, quarantine and recovery rates using real data. These have included nonparametric estimation [18], optimal control [19], Bayesian frameworks [20], inverse methods, least-squares approaches, agent-based modeling, and final size calculations [21, 22, 23, 5]. Researchers have also employed a variety of statistical approaches, including maximum-likelihood, Bayesian inference and Poisson regression methods [24, 25, 26, 27, 28]. Some of this work also showed that the precision of the estimate increased with the number of outbreaks used for estimation [27]. To determine the relative importance of model parameters to disease transmission and prevalence, there has also been work on sensitivity analysis of the parameters using techniques such as Latin Hypercube Sampling and Partial Rank Correlation Coefficients analysis with the associated mathematical models [29, 30, 31]. While there have been significant advances in estimating parameters, there is still a great need to develop efficient, reliable and fast computational techniques.

With the advent of artificial intelligence, ranging from computer vision [32, 33, 34, 35] to natural language processing [36, 37], the dominant algorithm associated with these advancements has been neural networks (NN). A main reason for this is their behavior as universal function approximators [38]. However, this field largely relies on huge amounts of data and computational resources. Recent approaches [39] have been shown to be successful in combining the best of both fields: using neural networks to model nonlinear systems while reducing the required data by constraining the model's search space with known knowledge, such as a system of differential equations. Along with this, there have also been several recent works showing how differential equations can be learned from data. For example, [40] used a deep neural network to model the Reynolds stress anisotropy tensor, [41] solved parabolic PDEs and backward stochastic differential equations using reinforcement learning, and [42] solved ODEs using a recurrent neural network. Additionally, [43, 39] developed physics informed models and used neural networks to estimate the solutions of such equations. Building on this line of work, such physics informed neural network approaches were recently applied for the first time to accurately estimating the parameters of an SIR model in a benchmark application [44].

Building on this, a unified approach called DINNs: Disease Informed Neural Networks is introduced in this work and systematically applied to increasingly complex governing systems of differential equations describing various prominent infectious diseases of the last hundred years. These systems vary in their complexity, ranging from three to nine coupled equations and from a few parameters to over a dozen. To illustrate the breadth of the approach, we apply DINNs to COVID, Anthrax, HIV, Zika, Smallpox, Tuberculosis, Pneumonia, Ebola, Dengue, Polio, and Measles. Our contribution in this work is threefold. First, we extend the recent physics informed neural networks (PINNs) approach to a large family of infectious diseases. Second, we perform an extensive analysis of the capabilities and shortcomings of PINNs on disease models. Lastly, we show the ease with which one can use DINNs to effectively learn the dynamics of a disease and forecast its progression a month into the future from real-life data. The paper is structured as follows. In Section 2 we review the necessary background information. Section 3 presents our technical approach and experiments. Lastly, we conclude with a summary in Section 4.

A grand challenge in mathematical biology and epidemiology, with great opportunities for researchers working on infectious disease modeling, is to develop a coherent deep learning framework that enables them to blend differential equations such as system (1) with the vast data sets now available. One of the tools that makes these deep learning methods successful is the neural network, a system of decisions modeled after the human brain [45]. Consider the illustration shown in Figure 2.

Figure 2: An illustration of a neural network

The first layer of perceptrons weighs and biases the input, which can be observed values of infected data. Each following layer then makes more complex decisions based on those inputs, until the final decision layer is reached, which generates the outputs; these can correspond to the values of parameters such as β and α. In this research, we implement a physics informed neural network based approach which makes decisions based on appropriate activation functions depending on the computed biases (b) and weights (w).
The network then seeks to minimize the mean squared error of the regression with respect to the weights and biases by utilizing gradient descent type methods, used in conjunction with software such as TensorFlow. While there is currently a lot of enthusiasm about "big data", useful data in infectious diseases is usually "small" and expensive to acquire. In this work, we describe how one can apply such physics informed neural network based deep learning approaches specifically to infectious diseases using DINNs, and apply it to a real-world example to estimate optimal parameters, namely the transmission and recovery rates, in the SIR model.

In this section, we present the DINNs methodology (a sample architecture can be seen in Figure 3). Subsection 3.1 briefly discusses background information on neural networks. Subsection 3.2 provides an overview of the DINNs approach and outlines the algorithm, associated loss functions, and training information. Lastly, subsection 3.3 reviews our experiments and analyses.

Briefly speaking, a neural network is an attempt to mimic the way the human brain operates. The general fully connected model is organized into layers of nodes (i.e. neurons), where each node in a single layer is connected to every node in the following layer (except for the output layer), and each connection has a particular weight. The idea is that deeper layers capture richer structures [46]. A neuron takes the sum of weighted inputs from each incoming connection (plus a bias term), applies an activation function (i.e. a nonlinearity), and passes the output to all the neurons in the next layer. Mathematically, each neuron's output is

$$y = \sigma\left(\sum_{i=1}^{n} w_i x_i + b\right),$$

where n represents the number of incoming connections, x_i the value of each incoming neuron, w_i the weight on each connection, b is a bias term, and σ is referred to as the activation function.

A schematic representation of the resulting disease informed neural networks is given in Figure 3. Note that for simplicity of illustration, Figure 3 depicts a network that comprises 2 hidden layers, with 6 neurons in the first hidden layer and 3 in the second. Networks with this kind of many-layer structure (two or more hidden layers) are called deep neural networks. The neurons in the network may be thought of as holding numbers that are calculated by a special activation function that depends on suitable weights and biases corresponding to each connection between neurons in each layer. With prior knowledge of such an activation function, the problem boils down to identifying the weights and biases for which the computed values of infected data are close to the observed values. The three sub-populations are approximated by the deep neural network, trained with calculus on computation graphs using the backpropagation algorithm [47, 48, 49].

Inspired by recent developments in physics-informed deep learning [43, 39], we propose to leverage the hidden physics of infectious diseases (1) and infer the latent quantities of interest (i.e., S, I, and R) by approximating them using deep neural networks. This choice is motivated by modern techniques for solving forward and inverse problems associated with differential equations, where the unknown solution is approximated either by a neural network or a Gaussian process.
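For concreteness, a network of the kind depicted in Figure 3 can be written in PyTorch (the framework used in our experiments) as follows. This is a minimal sketch: the layer widths are taken from the figure purely for illustration, and the class name SIRNet is ours.

```python
import torch
import torch.nn as nn

class SIRNet(nn.Module):
    """Fully connected network mapping time t to the compartments (S, I, R)."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(1, 6),   # input: a single time value t
            nn.ReLU(),
            nn.Linear(6, 3),   # second hidden layer, as in Figure 3
            nn.ReLU(),
            nn.Linear(3, 3),   # output: S(t), I(t), R(t)
        )

    def forward(self, t):
        return self.net(t)

net = SIRNet()
t = torch.linspace(0.0, 100.0, 100).reshape(-1, 1)  # the batch is the entire time array
S, I, R = net(t).unbind(dim=1)                      # untrained outputs, one per compartment
```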
Following these approaches, we approximate the latent function t → (S, I, R) by a deep neural network and obtain the following DINNs corresponding to equation (1) and the total population N = S + I + R, i.e.,

$$E_1 := \frac{dS}{dt} + \frac{\beta S I}{N}, \qquad E_2 := \frac{dI}{dt} - \frac{\beta S I}{N} + \alpha I, \qquad E_3 := \frac{dR}{dt} - \alpha I, \qquad E_4 := (S + I + R) - N \tag{2}$$

We acquire the required derivatives to compute the residual networks E_1, E_2, E_3, and E_4 by applying the chain rule for differentiating compositions of functions using automatic differentiation [50]. In our formal computations, we employed a densely connected (physics uninformed) neural network with 1 hidden layer and 32 neurons per hidden layer, which takes the input variable t and outputs S, I, and R. We employ automatic differentiation to obtain the required derivatives to compute the residual (physics informed) networks E_1, E_2, E_3, and E_4. It is worth highlighting that the parameters α and β of the differential equations turn into parameters of the resulting physics informed neural networks E_1, ..., E_4. The total loss function is composed of the regression loss corresponding to the observed data (e.g. I) and the loss imposed by the differential equation system (2), in which the differential operator d/dt is computed using automatic differentiation and can be thought of as an "activation operator". Moreover, the gradients of the loss function are back-propagated through the entire network to train the parameters using a gradient-based optimization algorithm.

To evaluate the predictive capability of any algorithm, it must be robust enough to work against data for which the parameters are known as well as data for which the parameters are not known. A data set with known parameters can be simulated by solving a system of equations in a forward fashion and potentially adding some noise. If that is provided to any parameter estimation algorithm, the efficacy of the algorithm can be determined by how well it is able to recover the true values for a wide range of starting guesses. For simplicity, we generated data by solving the systems of disease ODEs using the LSODA algorithm [51], with the initial conditions and the true parameters corresponding to each disease (e.g. death rate) taken from the literature. This small amount of data (50-100 points) is of the form of the SIR compartments above.

To make our neural networks disease informed, once the data was gathered we introduced it to our neural network without the disease parameters (not to be confused with the NN's parameters). It is worth noting that in this formulation there are no separate training, validation, and test data sets as in most common neural network training; rather, the data describe how the disease spreads over time. The model then learned the systems and predicted the parameters that generated them. Since in many of these systems there exists a large set of parameters that can generate them, we restricted our parameters to be in a certain range around the true value, that is, to show that our model can in fact identify the systems and one set of parameters that matches the literature they came from. However, our method is incredibly flexible in the sense that adding, modifying, or removing such restrictions can be done with one simple line of code. Additionally, we used nearly a year's worth of real data aggregated over every US state and accurately predicted a month into the future of COVID transmission. Next, we employ a literate programming style, which is intended to facilitate presenting parts of written code in the form of a narrative [52].
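In that spirit, the following sketch shows one way to form the residuals E_1, E_2, and E_3 of system (2) with automatic differentiation. It assumes the SIRNet sketch above; the helper name `residuals` is ours, and β and α are treated here as given tensors (possibly learnable).

```python
import torch

def residuals(net, t, beta, alpha, N):
    t = t.clone().requires_grad_(True)   # track gradients with respect to time
    S, I, R = net(t).unbind(dim=1)
    ones = torch.ones_like(S)
    # dX/dt via the chain rule on the computation graph (automatic differentiation)
    dS = torch.autograd.grad(S, t, grad_outputs=ones, create_graph=True)[0].squeeze(-1)
    dI = torch.autograd.grad(I, t, grad_outputs=ones, create_graph=True)[0].squeeze(-1)
    dR = torch.autograd.grad(R, t, grad_outputs=ones, create_graph=True)[0].squeeze(-1)
    E1 = dS + beta * S * I / N               # residual of the S equation
    E2 = dI - beta * S * I / N + alpha * I   # residual of the I equation
    E3 = dR - alpha * I                      # residual of the R equation
    return E1, E2, E3
```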
DINN takes the form

$$net_{sir}: t \longmapsto \big(S(t), I(t), R(t)\big), \qquad net_f := (f_1, f_2, f_3).$$

Here, net_f bounds the NN by forcing it to match the environment's conditions (e.g. f_1, f_2, f_3). These f_i correspond to the E_i described earlier; also note that the E_4 equation is not included here. The parameters of the neural network net_sir and the network net_f can be learned by minimizing the mean squared error loss, that is, minimizing

$$MSE = \frac{1}{T}\sum_{j=1}^{T} \big\| u_{actual}(t_j) - u_{predict}(t_j) \big\|^2 + \frac{1}{T}\sum_{j=1}^{T} \sum_{i=1}^{3} f_i(t_j)^2,$$

where u collects the compartments for which data is provided and T is the number of data points. Here, "actual" and "predict" refer to the actual data that the model was provided with and the prediction the model computed, respectively. As seen above, we leveraged the automatic differentiation that neural networks are trained with to get the derivatives of each of S, I, R with respect to time.

The neural networks themselves are fairly simple, consisting of 8 fully connected layers with either 20 or 64 neurons each, depending on the complexity of the system, and rectified linear unit (ReLU) activations in between. Since the data is relatively small, our batch size contained the entire time array. The networks were trained on an Intel(R) Xeon(R) CPU @ 2.30GHz, and depending on the complexity of the system the training time to learn both a system and its unknown parameters ranged from 30 minutes to 58 hours, which could be accelerated on GPUs and TPUs. If the parameters are known, however, the training time to solely learn the system can be as short as 3 minutes. We used the Adam optimizer [53] and PyTorch's CyclicLR as our learning rate scheduler, with mode="exp_range", min_lr ranging from 1 × 10−6 to 1 × 10−9 depending on the complexity of the system, max_lr = 1 × 10−3, gamma = 0.85, and step_size_up = 1000. In the next sections we will refer to "min_lr" simply as "learning rate". It is important to note that some diseases' systems were much more difficult for DINN to learn (e.g. Anthrax), and further training exploration, such as a larger/smaller learning rate or longer training, may be needed to achieve better performance.

Most mathematical models describing the spread of disease employ classical compartments, such as the Susceptible-Infected-Recovered (SIR) or Susceptible-Exposed-Infected-Recovered (SEIR) structure, described as an ordinary differential equation system [54]. Over the past several months there have been a variety of compartmental models introduced as modified SEIR models to study various aspects of COVID-19, including containment strategies [55], social distancing [56], and the impact of non-pharmaceutical interventions and the social behavior of the population [16, 17]. Along with these there has been a lot of work on modified SIR models as well, including the SIRD model [57, 58, 59, 60]. In this work, for simplicity of illustration, we apply DINNs to a simple SIRD model describing COVID-19 [58], given by the following differential equation system:

$$\frac{dS}{dt} = -\frac{\alpha}{N} S I \tag{3}$$

$$\frac{dI}{dt} = \frac{\alpha}{N} S I - \beta I - \gamma I \tag{4}$$

$$\frac{dR}{dt} = \beta I \tag{5}$$

$$\frac{dD}{dt} = \gamma I \tag{6}$$

Here α is the transmission rate, β is the recovery rate, γ is the death rate of infected individuals, and N represents the total population. Given that most models may include a large set of parameters, it is important to consider ranges for each of them. Hence, we restricted our parameters to be in a certain range to show that our model can learn the set that was used in the literature. First, we experimented with various parameter ranges to identify the influence they had on the model. In the following, we used a 4 layer neural network with 20 neurons each, a 1 × 10−6 learning rate, and 100 data points, and the models were trained for 700,000 iterations (taking roughly 30 minutes).
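Putting the pieces together, a training loop consistent with this setup might look as follows. It is a sketch, not our exact implementation: the range restriction on each learnable parameter is implemented here with a tanh squashing (one illustrative form of the "one simple line of code" mentioned earlier), S_data, I_data, R_data stand for torch tensors of the generated training data, and for brevity the SIR system (1)-(2) is used; the SIRD setup is analogous, with the extra compartment D and parameter γ.

```python
import torch

# Unconstrained raw parameters, squashed into an assumed search range (lo, hi),
# e.g. (-2, 2), the 1000% range around a true value of 0.1.
beta_raw = torch.nn.Parameter(torch.zeros(1))
alpha_raw = torch.nn.Parameter(torch.zeros(1))
lo, hi = -2.0, 2.0

def bounded(raw):
    # maps an unconstrained tensor smoothly into the interval (lo, hi)
    return lo + (hi - lo) * (torch.tanh(raw) + 1.0) / 2.0

net = SIRNet()
optimizer = torch.optim.Adam(list(net.parameters()) + [beta_raw, alpha_raw], lr=1e-3)
scheduler = torch.optim.lr_scheduler.CyclicLR(
    optimizer, base_lr=1e-6, max_lr=1e-3, mode="exp_range",
    gamma=0.85, step_size_up=1000, cycle_momentum=False)  # Adam has no momentum buffer
mse = torch.nn.MSELoss()

for iteration in range(700_000):
    optimizer.zero_grad()
    beta, alpha = bounded(beta_raw), bounded(alpha_raw)
    S, I, R = net(t).unbind(dim=1)                  # data-fit predictions
    E1, E2, E3 = residuals(net, t, beta, alpha, N)  # physics residuals
    loss = (mse(S, S_data) + mse(I, I_data) + mse(R, R_data)
            + E1.pow(2).mean() + E2.pow(2).mean() + E3.pow(2).mean())
    loss.backward()
    optimizer.step()
    scheduler.step()
```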
In our experiments we report two kinds of relative MSE loss errors. The first, "Error NN", is the error on the neural network's predicted system. The second, "Error learnable parameters", is the error on the system that was generated from the learnable parameters, that is, using the LSODA algorithm to generate the system given the neural network's learned parameters (e.g. α). As an example, if the actual parameter's value was 0.1, a 0% search range would simply be (0.1, 0.1), and a 100% range would be (−0.2, 0.2). Further ranges are multiplications of those: 1000% = (−2, 2), 10000% = (−20, 20), and so on.

Table 1 (left) shows the parameters, their actual values, the range DINNs searched in, and the parameter values that were found by DINNs. The right part of the table shows the error of the neural network and of the LSODA generation of the system from the parameters; that is, it shows the effect that the search range had on how well the neural networks learned the parameters. As seen from Table 1 and Figure 4 (for the remaining tables and figures see appendix 6.1), at least in the case of the COVID-19 system (3)-(6), DINNs managed to find an extremely close set of parameters in every range we tested. Specifically, in Figure 4, the panel on the left shows the effect that the parameter search range had on the neural networks' outputs, and the right panel shows the effect that the search ranges had on how well the neural networks learned the parameters. Additionally, the systems were learned almost perfectly, though there was some variation in the relative error between experiments.

Next, to show the robustness of DINNs, we added various amounts of uncorrelated Gaussian noise to the data. The models were trained for 1.4 million iterations (roughly 1 hour), using parameter ranges of 1000% variation and learning settings (e.g. learning rate) similar to those in the previous section. We used a 4 layer neural network with 20 neurons each, and 100 data points. The experiments showed that even with a very high amount of noise, such as 20%, DINN achieves surprisingly accurate results, with a maximum relative error of 0.143 on learning the system. That being said, the exact parameters were harder to learn with that amount of noise. It appears that the models may need further training to stabilize the parameters, as there were some variations in the amount of noise versus the accuracy. Figure 5 shows DINN's predictions with 20% uncorrelated Gaussian noise. For the remaining figures and tables on various uncorrelated Gaussian noise, see appendix 6.2.

In another exercise to show robustness, we trained our models with various amounts of data: 10, 20, 50, 100, and 1000 points. The models were trained for 700,000 iterations, consisting of 4 layers with 20 neurons each and a 1 × 10−6 learning rate. Our analysis shows that there was a big increase in parameter accuracy from 10 points to 20 points. The model that was trained on 1000 data points performed best, followed by 100 points, 20 points, 50 points, and lastly 10 points. It may be the case that further training would stabilize the results and the 50 data point model would outperform the 20 point one. Even so, with just 20 data points the model learns the system incredibly well (Table 2). The left-hand side of the table shows the parameters and values found after training.
The right-hand side, as before, shows the two errors: "Error NN" is the relative MSE loss error of the system that the neural network output (what DINN believes the system's dynamics look like), and "Error Learnable Parameters" is the relative MSE loss error of the LSODA-generated system using the found parameter values. See appendix 6.3 for the remaining tables and figures.

Here we examined the effect that a wider or deeper neural network architecture has on DINNs. The models were trained on 100 data points, using parameter ranges of 1000%, a 1 × 10−6 learning rate, and 700,000 iterations. Tables 3 and 4 show a clear decrease in error as one increases the number of neurons per layer. Specifically, Table 3 itemizes the (S, I, D, R) errors of the neural network's output: for the architecture variations (depth and width), relative MSE errors were reported on the predicted NN system. Table 4 itemizes similar findings for the LSODA generation from the learned parameters. There also seems to be a clear decrease in error as the number of layers increases. However, the error seems to stabilize around 8 layers, with a very minor performance increase at 12 layers.

We found that quickly increasing the learning rate and then quickly decreasing it to a steady value allows the network to learn well. One such learning rate schedule is PyTorch's CyclicLR learning rate scheduler. To show the importance of the learning rate in the amount of needed training time, we trained DINNs with several values, 1 × 10−5, 1 × 10−6, and 1 × 10−8, as well as different step sizes for each one: 100, 1000, and 10000. We used 4 layers with 20 neurons each, and 100 data points. The time was measured from the moment the network started training until the loss was smaller than 4 × 10−4, which usually corresponds to learning the system almost perfectly. As can be seen from the results (Table 5), both the minimum learning rate and the step size play an important role in learning the system. Reducing the learning rate to a small value too quickly may result in hours of training time instead of minutes. As an afterthought, this might be the reason why most of the systems were taking so long to train (>10 hrs), while the COVID system took <25 minutes.

We also tested DINN with partial data on three systems: COVID with missing data on I, tuberculosis with missing data on L and I, and Ebola with missing data on H. The models were trained on 100 data points, were given the known parameters from the literature, and were only given the initial conditions for the missing data. The COVID model was trained with a 1 × 10−6 learning rate for 1 million iterations (roughly 1 hour). The tuberculosis model was trained with a 1 × 10−5 learning rate for 100k iterations. The Ebola model was trained with a 1 × 10−6 learning rate for 800,000 iterations. Our results show that DINN can in fact learn systems even when given partial data. However, it is important to note that a missing data compartment should appear in at least one other compartment's equation in order to get good results. For example, when we tried to remove the COVID recovered compartment (i.e. R), DINN learned S, I, and D nearly perfectly, but it did not do very well on R. That is because R does not appear in any of the other equations. We report the neural networks' system outputs and their losses; see Figure 14 in the appendix.
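Concretely, the only change to the loss in these partial-data experiments is that the data term for an unobserved compartment reduces to its initial condition; the physics residuals are unchanged and are what propagate information about the missing compartment through the coupled equations. A sketch for an SIR-style system with I unobserved, reusing names from the earlier sketches (I0 being the known initial condition), is:

```python
# Data loss when the infected compartment I is unobserved: only the single
# point I(0) enters the data term; the residuals still constrain I everywhere.
S, I, R = net(t).unbind(dim=1)
E1, E2, E3 = residuals(net, t, beta, alpha, N)
data_loss = (mse(S, S_data) + mse(R, R_data)
             + (I[0] - I0) ** 2)           # one observed point for I
physics_loss = E1.pow(2).mean() + E2.pow(2).mean() + E3.pow(2).mean()
loss = data_loss + physics_loss
```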
The systems used for these experiments are the COVID system from [58], the tuberculosis system from [61], and the Ebola system from [62].

Expanding on the relatively simple COVID model used for experimentation so far, we then applied DINN to 10 other highly infectious diseases, namely Anthrax, HIV, Zika, Smallpox, Tuberculosis, Pneumonia, Ebola, Dengue, Polio, and Measles. These diseases vary in their complexity, ranging from three- to nine-dimensional ODE systems and from a few parameters to over a dozen. Table 9 provides a summary of our analysis; specifically, it itemizes, for each disease, the best, worst, and median parameter estimation errors. See subsection 6.5 in the appendix for the remaining disease analyses.

Lastly, to verify that DINN is in fact as reliable as it appears, we used 310 days (04-12-2020 to 02-16-2021) of real US data from [63]. We trained one network on the cumulative cases of susceptible, infected, dead, and recovered individuals, and another on the daily cases, and predicted the cases for a future month. Specifically, out of those 310 days we gave the network 280 days worth of data and asked it to predict each compartment's progression a month (30 days) into the future. The network received 31 data points (1 per 10 days), was trained for 100k epochs (roughly 5 minutes), and had 8 layers with 20 neurons each, a 1000% parameter variation, and a 1 × 10−5 learning rate. Our results suggest that the learnable parameters found by both networks were quite different from the parameters in the literature (for the cumulative and daily networks respectively: α = 0.0176 and α = 0.0031 instead of 0.191; β = 0.0046 and β = 0.0021 instead of 0.05; and γ = 0.0001 and γ = 9.6230 × 10−5 instead of 0.0294). This may imply either that the data was different from the initial data distribution used in the literature [58], or, as other authors have mentioned, that these are time-varying parameters rather than constant ones. As seen in Figure 7, the cumulative cases had less data variation and were fairly easy to learn, and it appears that DINN managed to accurately predict the future month for each compartment. The daily cases had much more data variation and were more difficult; that being said, DINN managed to learn the relative number of cases each day.

In this work, we have introduced Disease Informed Neural Networks (DINNs), a neural network approach capable of learning a number of diseases, how they spread, forecasting their progression, and finding the unique parameters that are used in models to describe the disease dynamics. Building on a simple SIRD model for COVID-19, we used DINNs to model eleven deadly infectious diseases and showed the simplicity, efficacy, and generalization of DINNs in their respective applications. These diseases were modeled by various differential equation systems with various numbers of learnable parameters. We found that DINN can quite easily learn systems with a low number of parameters and dimensions (e.g. COVID), and that when the learnable parameters are known the training time can change from 50 hours to a few minutes. From the anthrax model results we see that it is far more difficult for DINN to learn systems which have numerous quick and sharp oscillations. That being said, looking at the polio and zika model results, we can see that it is not impossible, but rather more time consuming (both in training and in hyperparameter search). Also, based on the measles, tuberculosis, and smallpox model results, we can see that a low number of sharp oscillations is relatively easy to learn.
Several interesting systems were anthrax, which DINN appeared to struggle with; zika, with the highest number of dimensions, parameters, and oscillations; and COVID, which could be predicted nearly perfectly in roughly 3 minutes. Our results from this work suggest that DINNs is a robust and reliable candidate that can be used as an inverse approach to characterize and learn the parameters used in compartmental models for understanding the dynamics of infectious diseases.

This work has been supported in part by the National Science Foundation DMS 2031027 and DMS 2031029.

The following subsections provide additional information, mainly in the form of figures and tables. This subsection shows the remaining figures (8, 9, 10, 11) and table (10) for the various parameter search ranges we trained DINN on. Table 10: The left-hand side shows the parameters, their ranges, the values found after training, and the relative error percentage. The right-hand side shows 2 errors: "Error NN" is the relative MSE loss error from the system that the neural network outputted (what DINN believes the system's dynamics look like), and "Error Learnable Parameters" is the relative MSE loss error from the LSODA-generated system using the found parameter values.

This subsection shows the remaining figures (12) for the various uncorrelated Gaussian noise we trained DINN on. Table 11: Various Gaussian Noise. The left-hand side shows the parameters, their actual values, the values found after training, and the relative error percentage. The right-hand side shows the same 2 errors as above.

This subsection shows the remaining figure (13) and table (12) for the various data point settings we trained DINN on. Table 12: Various Data Points. The left-hand side shows the parameters, their actual values, the values found after training, and the relative error percentage. The right-hand side shows the same 2 errors as above.

Figure 18 and tables 19, 20 show our results.

[1] Plagues and peoples
[2] A contribution to the mathematical theory of epidemics
[3] The basic epidemiology models: models, expressions for R0, parameter estimation, and applications
[4] Mathematical models in population biology and epidemiology
[5] An introduction to mathematical epidemiology
[6] Mathematical epidemiology: Past, present, and future
[7] Mathematical approaches for emerging and reemerging infectious diseases: models, methods, and theory
[8] Mathematical and statistical approaches to AIDS epidemiology
[9] Bifurcations of a mathematical model for HIV dynamics
[10] Malaria model with stage-structured mosquitoes
[11] Estimation of the reproduction number of dengue fever from spatial epidemic data
[12] Mathematical modeling, analysis and simulation of the spread of Zika with influence of sexual transmission and preventive measures
[13] Computational and mathematical methods to estimate the basic reproduction number and final size for single-stage and multistage progression disease models for zika with preventative measures
[14] Modeling the SARS epidemic
[15] Mathematical modeling and control of MERS-CoV epidemics
[16] Mathematical modeling, analysis, and simulation of the COVID-19 pandemic with explicit and implicit behavioral changes
[17] Mathematical Modeling, Analysis, and Simulation of the COVID-19 Pandemic with Behavioral Patterns and Group Mixing
[18] Forecasting epidemics through nonparametric estimation of time-dependent transmission rates using the SEIR model
[19] An Introduction to Optimal Control with an Application in Disease Modeling
[20] A Bayesian framework for parameter estimation in dynamical models
[21] Mathematical biology
[22] Extracting the time-dependent transmission rate from infection data via solution of an inverse ODE problem
[23] Agent-based mathematical modeling as a tool for estimating Trypanosoma cruzi vector-host contact rates
[24] Hierarchical Bayesian methods for estimation of parameters in a longitudinal HIV dynamic system
[25] Statistical inference for infectious diseases: risk-specific household and community transmission parameters
[26] Parameter identification in epidemic models
[27] Fitting outbreak models to data from many small norovirus outbreaks
[28] Parameter estimation and uncertainty quantification for an epidemic model
[29] Sensitivity and uncertainty analysis of complex models of disease transmission: an HIV model, as an example
[30] A comparison of three methods for selecting values of input variables in the analysis of output from a computer code
[31] Determining important parameters in the spread of malaria through the sensitivity analysis of a mathematical model
[32] ImageNet Classification with Deep Convolutional Neural Networks
[33] You Only Look Once: Unified, Real-Time Object Detection
[34] Scalable and Efficient Object Detection (2020)
[36] Pre-training of Deep Bidirectional Transformers for Language Understanding
[37] Attention Is All You Need
[38] Multilayer feedforward networks are universal approximators
[39] Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations
[40] Reynolds averaged turbulence modelling using deep neural networks with embedded invariance
[41] Deep Learning-Based Numerical Methods for High-Dimensional Parabolic Partial Differential Equations and Backward Stochastic Differential Equations
[42] Solving differential equations with unknown constitutive relations as recurrent neural networks
[43] Hidden physics models: Machine learning of nonlinear partial differential equations
[44] On parameter estimation approaches for predicting disease transmission through optimization, deep learning and statistical inference methods
[45] Deep learning. In: Nature 521
[46] The Power of Depth for Feedforward Neural Networks
[47] Theory of the backpropagation neural network. In: Neural networks for perception
[48] Deep learning in neural networks: An overview
[49] Deep learning
[50] Automatic differentiation in machine learning: a survey
[51] Differential Equation Solver for Stiff or Non-Stiff System
[52] Literate programming
[53] Adam: A method for stochastic optimization
[54] Basic ideas of mathematical epidemiology
[55] Effective containment explains subexponential growth in recent confirmed COVID-19 cases in China
[56] Evaluating the effectiveness of social distancing interventions to delay or flatten the epidemic curve of coronavirus disease
[57] Estimating and simulating a SIRD model of COVID-19 for many countries, states, and cities
[58] Data-based analysis, modelling and forecasting of the COVID-19 outbreak
[59] Use of a modified SIRD model to analyze COVID-19 data
[60] Studying the progress of COVID-19 outbreak in India using SIRD model
[61] To treat or not to treat: The case of tuberculosis
[62] Understanding the dynamics of Ebola epidemics
[63] An interactive web-based dashboard to track COVID-19 in real time