Physics-Guided Deep Learning for Dynamical Systems: A Survey
Rui Wang, Rose Yu
July 2, 2021

Modeling complex physical dynamics is a fundamental task in science and engineering. Traditional physics-based models are sample-efficient and interpretable, but often rely on rigid assumptions. Furthermore, direct numerical approximation is usually computationally intensive, requiring significant computational resources and expertise. While deep learning (DL) provides novel alternatives for efficiently recognizing complex patterns and emulating nonlinear dynamics, its predictions do not necessarily obey the governing laws of physical systems, nor do they generalize well across different systems. Thus, the study of physics-guided DL has emerged and made rapid progress. Physics-guided DL aims to take the best from both physics-based modeling and state-of-the-art DL models to better solve scientific problems. In this paper, we provide a structured overview of existing methodologies for integrating prior physical knowledge or physics-based modeling into DL, with a special emphasis on learning dynamical systems. We also discuss the fundamental challenges and emerging opportunities in the area.

Modeling complex physical dynamics over a wide range of spatial and temporal scales is a fundamental task in many fields including, for example, fluid dynamics [134], cosmology [140], economics [32], and neuroscience [62]. Dynamical systems are mathematical objects used to describe the evolution of phenomena over time and space occurring in nature. Dynamical systems are commonly described with differential equations, which relate one or more unknown functions and their derivatives. An equation of the form

F(D^k u(x), D^{k-1} u(x), ..., D u(x), u(x), x) = 0, x ∈ U (1)

is called a k-th-order partial differential equation (or an ordinary differential equation when n = 1), where F : R^{n^k} × R^{n^{k-1}} × ... × R^n × R × U → R. F models the dynamics of an n-dimensional state x ∈ R^n and can be either a linear or a non-linear operator. Since most dynamics evolve over time, one of the variables of u is usually the time dimension. In general, one must specify appropriate boundary and initial conditions for Equ. 1 to ensure the existence and uniqueness of a solution. Learning dynamical systems means searching for a good model F that accurately describes the behavior of the physical process insofar as we are interested.

Physics as a discipline has a long tradition of using first principles to describe spatiotemporal dynamics. The laws of physics have greatly improved our understanding of the physical world. Many physical laws are described by systems of highly nonlinear differential equations that have direct implications for understanding and predicting physical dynamics. However, these equations are usually too complicated to solve analytically. The current paradigm of numerical methods for solution approximation is purely physics-based: known physical laws encoded in systems of coupled differential equations are solved over space and time via numerical differentiation and integration schemes [59; 60; 92; 63; 103; 120]. However, these methods are tremendously computationally intensive, requiring significant computational resources and expertise.
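As a minimal illustration of this classical paradigm, the sketch below takes one explicit finite-difference step for the 1D heat equation u_t = α u_xx; the grid sizes and the Dirichlet boundary handling are illustrative choices, not a production scheme.

```python
import numpy as np

# Forward-Euler / central-difference step for the 1D heat equation u_t = alpha * u_xx.
# Stability requires dt <= dx**2 / (2 * alpha), which is why explicit schemes
# are forced to take very small time steps.
def heat_step(u, alpha, dx, dt):
    u_new = u.copy()
    u_new[1:-1] = u[1:-1] + alpha * dt / dx**2 * (u[2:] - 2 * u[1:-1] + u[:-2])
    return u_new  # boundary values held fixed (Dirichlet conditions)

u = np.sin(np.linspace(0.0, np.pi, 101))   # initial condition on a uniform grid
for _ in range(1000):                      # many tiny steps to advance the solution
    u = heat_step(u, alpha=0.1, dx=0.01, dt=4e-4)
```

Even this toy example already shows the cost structure: the stability-limited step size forces long rollouts for any appreciable simulation horizon.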
An alternative is to seek simplified models that are based on certain assumptions and can roughly describe the dynamics, such as the Reynolds-averaged Navier-Stokes equations for turbulent flows and the Euler equations for gas dynamics [22; 79; 138]. But it is highly nontrivial to obtain a simplified model that describes a phenomenon to satisfactory accuracy. More importantly, for many complex real-world phenomena, only partial knowledge of the dynamics is available, and the equations may not fully represent the true system states.

Deep Learning (DL) provides efficient alternatives for learning high-dimensional spatiotemporal dynamics from massive datasets. It does so by directly predicting the input-output mapping and bypassing numerical integration. Recent works have shown that DL can generate realistic predictions and significantly accelerate the simulation of physical dynamics relative to numerical solvers, from turbulence modeling to weather prediction [144; 72; 69; 68; 70]. This opens up new opportunities at the intersection of DL and the physical sciences, such as molecular dynamics [129; 131], epidemiology [156], cardiology [91; 143], and material science [93; 19].

Despite the tremendous progress, DL is purely data-driven by nature, which has many limitations. DL models still adhere to the fundamental rules of statistical inference. Without explicit constraints, DL models are prone to making physically implausible forecasts that violate the governing laws of physical systems. Additionally, DL models often struggle with generalization: models trained on one dataset cannot adapt properly to unseen scenarios with different distributions, known as distribution (covariate) shift. For dynamics learning, distribution shift occurs not only because the dynamics are non-stationary and nonlinear, but also because of changes in system parameters, such as initial and boundary conditions [146]. Neither DL alone nor purely physics-based approaches can be considered sufficient for learning complex dynamical systems in scientific domains. Therefore, there is a growing need to integrate traditional physics-based approaches with DL models so that we can make the best of both. There is already a vast amount of work on physics-guided DL [154; 40; 16; 77; 68; 116; 17], but the focus on deep learning for dynamical systems is still nascent.

Physics-guided DL offers a set of tools to blend physical concepts such as differential equations and symmetry with deep neural networks. On one hand, these DL models offer great computational benefits over traditional numerical solvers. On the other hand, the physical constraints impose appropriate inductive biases on the DL models, leading to accurate simulation, scientifically valid predictions, reduced sample complexity, and improved generalization to unknown environments. This survey aims to provide a structured overview of existing methodologies for incorporating prior physical knowledge into DL models for learning dynamical systems. The paper is organized as follows.
• Section 2 formulates the three main learning problems of physics-guided DL: solving differential equations, dynamics forecasting, and learning dynamics residuals.
• Section 3 describes the five objectives of physics-guided DL.
• Sections 4∼8 categorize existing physics-guided DL approaches into four groups based on how physics and DL are combined.
Each group's section leads with a detailed review of our recent work as a case study and is further categorized by application, model architecture, or learning problem.
- Section 4: Physics-guided loss functions: prior physics knowledge is imposed as additional soft constraints in the loss function.
- Section 5: Physics-guided architecture design: physics knowledge is incorporated into the design of neural network modules.
- Section 6: Hybrid physics-DL models: complete physics-based approaches are directly integrated into DL models.
- Section 8: Invariant and equivariant DL models: DL models are designed to respect the symmetries of a given physical system.
• Section 9 summarizes the challenges in this field and discusses emerging opportunities for future research.

In this paper, we mainly focus on the following three learning problems.

When F in Equ. 1 is known but Equ. 1 is too complicated to solve, researchers tend to solve the differential equations directly by approximating the solution u(x) with a deep neural network while enforcing the governing equations as soft constraints on the output of the neural nets during training [114; 113; 76]. This approach can be formulated as the following optimization problem:

min_θ L(u) + λ_F L_F(u) (2)

L(u) denotes the misfit between the neural net predictions and the training data points, and θ denotes the neural net parameters. L_F(u) is a constraint on the residual of the differential equation system under consideration, and λ_F is a regularization parameter that controls the emphasis on this residual. The goal is then to train the neural nets to minimize the loss function in Equ. 2.

When F in Equ. 1 is unknown or numerically solving Equ. 1 requires too much computation, many works study learning high-dimensional spatiotemporal dynamics by directly predicting the input-output system state mapping, bypassing numerical discretization and integration [34; 68; 129; 147]. If we assume the first dimension x_1 of u in Equ. 1 is the time dimension t, then dynamics forecasting can be defined as learning a map f : R^{n×k} → R^{n×q} that maps a sequence of historic states to future states of the dynamical system,

f(u(t − k + 1, ·), ..., u(t, ·)) = u(t + 1, ·), ..., u(t + q, ·)

where k is the input length and q is the output length. f is commonly approximated with purely data-driven or physics-guided neural nets, and the nets are optimized by minimizing the prediction error of the state, L(u).

When F in Equ. 1 is partially known, we can use neural nets to learn the errors or residuals made by physics-based models [33; 160; 66]. The key is to learn the bias of the physics-based model and correct it with the help of deep learning. The final prediction of the state is composed of the simulation from the physics-based model and the residual prediction from the neural nets,

û = û_F + û_NN

where û_F is the prediction obtained by numerically solving F, û_NN is the prediction from the neural networks, and û is the final prediction made by the hybrid physics-DL model. This learning problem generally involves one of two training strategies: 1) joint training: optimizing the parameters of the differential equations and the neural networks at the same time by minimizing the prediction errors of the system states; 2) two-stage training: first fit the differential equations on the training data and obtain the residuals, then optimize the neural nets to predict those residuals, as sketched below.
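The following is a hedged PyTorch sketch of the two-stage strategy; the state dimension, architecture, and optimizer settings are illustrative assumptions rather than any particular paper's configuration.

```python
import torch
import torch.nn as nn

# Stage 1 (not shown) fits/runs the physics-based model to produce u_hat_F.
# Stage 2 trains a network on the residuals the physics model leaves behind.
residual_net = nn.Sequential(nn.Linear(64, 128), nn.ReLU(), nn.Linear(128, 64))
opt = torch.optim.Adam(residual_net.parameters(), lr=1e-3)

def stage_two_step(u_hat_F, u_true):
    target = u_true - u_hat_F                     # residual left by the physics model
    loss = torch.mean((residual_net(u_hat_F) - target) ** 2)
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()

def predict(u_hat_F):                             # final prediction: u_hat = u_hat_F + u_hat_NN
    return u_hat_F + residual_net(u_hat_F)
```

This subsection provides an overview of the objectives of physics-guided DL for learning dynamical systems.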
By incorporating physical principles, governing laws, and domain knowledge into DL models, the rapidly growing field of physics-guided DL seeks to (1) accelerate data simulation, (2) build physically consistent and scientifically valid models, (3) improve the generalizability of DL models, (4) solve complicated partial differential equations, and (5) discover governing equations from data. We discuss each of these objectives in detail.

Simulation is an important method of analyzing, optimizing, and designing real-world processes, one that is easily verified, communicated, and understood. It serves as a surrogate model and digital twin and provides valuable insights into complex physical systems. Traditional physics-based simulations often rely on numerical methods: known physical laws encoded in systems of coupled differential equations are solved over space and time via numerical differentiation and integration schemes [59; 92; 63; 103; 120]. These methods require significant computational resources because the discretization step size is usually confined to be very small due to stability constraints when the dynamics are complex. Moreover, the performance of numerical methods can depend heavily on the initial guesses of unknown parameters [61].

Recently, DL has demonstrated great success in the automation, acceleration, and streamlining of highly compute-intensive workflows for science [116; 138; 72]. Deep dynamics models can approximate high-dimensional spatiotemporal dynamics by directly forecasting the future states and bypassing numerical integration [147; 34; 123; 109; 148]. These models are trained to make forward predictions given the historic frames as input, with one or more steps of supervision, and can roll out up to hundreds of steps during inference. DL models are usually faster than classic numerical solvers by one to two orders of magnitude, because DL is able to take much larger space or time steps than classical solvers [109]. Furthermore, deep generative models, such as the generative adversarial network (GAN), can generate instantaneous high-resolution flow fields that are statistically similar to those of direct numerical simulation [70; 158]. The computer graphics community has also investigated using DL to speed up numerical simulations for generating realistic animations of fluids such as water and smoke [69; 138; 153], though that community focuses more on the visual realism of the simulation than on its physical characteristics.

Despite the tremendous progress of DL for science, e.g., atmospheric science [116], computational biology [2], material science [19], and quantum chemistry [128], it remains a grand challenge to incorporate physical principles in a systematic manner into the design, training, and inference of such models. Purely data-driven DL models still adhere to the fundamental rules of statistical inference. Without explicit constraints, DL models are prone to making physically implausible predictions that violate the governing laws of physical systems. Thus, to build trustworthy predictive models for science and engineering, we need to leverage known physical principles to guide DL models toward learning the correct underlying dynamics instead of simply fitting the observed data. For instance, [67; 64; 157; 12] improve the physical and statistical consistency of DL models by explicitly regularizing the loss function with physical constraints. Hybrid DL models, e.g., [95; 6; 23], integrate differential equations in DL for temporal dynamics forecasting and achieve promising performance.
[90] and [44] studied tensor-invariant neural networks that learn the Reynolds stress tensor while preserving Galilean invariance. [144] proposed a hybrid model marrying the RANS-LES coupling method with a custom-designed U-net. [52] and [30] designed Hamiltonian and Lagrangian neural networks that respect conservation laws.

DL models often struggle with generalization: models trained on one dataset cannot adapt properly to unseen scenarios with distributional shifts that naturally occur in dynamical systems [75; 3; 146]. In addition, most current approaches are still trained to model a specific system, making it challenging to meet the needs of scientific domains with heterogeneous environments. Thus, it is imperative to develop generalizable DL models that can learn and generalize well across systems with various parameter domains and initial and boundary conditions. Prior physical knowledge can be considered an inductive bias that places a prior distribution on the model class and shrinks the model parameter search space. Guided by this inductive bias, DL models can better capture underlying dynamics that are consistent with physical laws. Across different data domains and systems, the laws of physics stay constant; hence, integrating physical laws into DL enables models to generalize outside of the training domain and even to different systems. Embedding symmetries into DL models is one way to improve generalization, which we will discuss in detail in Section 8. For example, [147] demonstrated that encoding rotation, scaling, and uniform-motion symmetries into DL models greatly improves generalization in forecasting turbulence. There are many other ways to improve generalization by incorporating other physical knowledge. [148] proposed a model-based meta-learning approach that can generalize across heterogeneous domains; it contains a prediction network that learns the shared dynamics of the entire domain and an encoder that infers the parameters of the task. [43] encoded Lyapunov stability into an autoencoder model for predicting fluid flow and sea surface temperature, showing improved generalizability and reduced prediction uncertainty for neural nets that preserve Lyapunov stability. [130] showed that adding spectral normalization to a DNN to regularize its Lipschitz continuity can greatly improve generalization to new input domains in drone landing control.
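As a concrete illustration of the last point, the sketch below wraps layers with PyTorch's built-in spectral normalization to bound each layer's Lipschitz constant; the layer sizes are illustrative assumptions, and this is only the generic mechanism, not the exact model of [130].

```python
import torch.nn as nn
from torch.nn.utils import spectral_norm

# Spectral normalization rescales each weight matrix by its largest singular
# value, bounding the layer's Lipschitz constant and thereby regularizing how
# fast the network's output can change under input-domain shift.
model = nn.Sequential(
    spectral_norm(nn.Linear(64, 128)), nn.ReLU(),
    spectral_norm(nn.Linear(128, 64)),
)
```

Although the governing equations of many physical systems are known, finding approximate solutions using numerical algorithms and computers is still prohibitively expensive. DL models may greatly reduce the computation since some models, such as graph neural nets [4; 88], can be independent of resolution and can take much larger space and time steps during inference than classical solvers. For example, [96; 97] designed PDE-Net models that can accurately predict the dynamical behavior of data and have the potential to reveal the underlying PDE model that drives the observed data. The most common approach is to directly approximate the solution of complex coupled differential equations with deep neural networks via gradient-based optimization, the so-called physics-informed neural network (PINN) model. This approach has shown success in approximating a variety of PDEs [113; 112; 21; 54]. However, poor generalization to unseen domains and slow convergence in training may have limited its applicability to complex physics problems [76].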
One of the main themes of science is the search for the fundamental laws behind practical problems [40]. When the governing equations of dynamical systems are known, they allow for accurate mathematical modeling, robust forecasting, and increased interpretability. However, dynamical systems in many fields, such as epidemiology, finance, and neuroscience, have no formal analytical description. Therefore, physics-guided DL can also assist the search for unknown physical laws. One common paradigm for data-driven discovery of governing equations is to first define a large set of possible mathematical basis functions and then learn their coefficients. The seminal work of [18] proposed to find ordinary differential equations by creating a dictionary of possible basis functions and applying sparse regression to select the relevant ones. [80; 117] extended this method by using neural networks to construct the dictionary of basis functions. [20] contributed to this trend by introducing an efficient first-order conditional gradient algorithm for finding the best sparse fit to observational data in a large library of potential nonlinear models. However, scalability, overfitting, and over-reliance on high-quality measurement data remain critical concerns [115].

Complex physical dynamics occur over a wide range of spatial and temporal scales. Standard DL models may simply fit the observed data while failing to learn the correct underlying dynamics, leading to low physical consistency and poor generalizability. One of the simplest and most widely used approaches to incorporating physical constraints is designing loss functions (regularization). Physics-guided loss functions can help DL models capture correct and generalizable dynamic patterns that are consistent with physical laws. Furthermore, loss functions constrained by physical laws can reduce the search space of parameters. This approach is sometimes referred to as imposing differentiable "soft" constraints, in contrast to imposing "hard" constraints (physics-guided architectures), covered in the next section. In this chapter, we start with a simple example of a physics-guided loss from our previous work, and then categorize this type of method by objective: improving prediction, accelerating data generation, and solving differential equations.

One of our earlier works [145] studied the task of forecasting two-dimensional raw velocity fields of an incompressible turbulent flow, as shown in Figure 1. Incompressible fluid flows have zero divergence everywhere. The DL models are trained to make forward predictions given the historical velocity fields, analogous to video prediction. In the proposed physics-guided DL model, Turbulent Flow Net (TF-Net), apart from the regular mean square error (MSE) loss between the target and the prediction, an additional divergence-free regularizer reduces the divergence of the turbulent flow predictions during training:

L = Σ_t ||ŵ(x, t) − w(x, t)||² + λ ||∇ · ŵ(x, t)||² (5)

where w(x, t) is the velocity field over space x and time t, ŵ is the prediction, ∇ is the vector differential operator, and λ is a hyper-parameter that controls the strength of the divergence-free regularizer. By penalizing the divergence of the predictions, the proposed model is able to obey the mass conservation law and generate predictions with almost zero divergence. A minimal sketch of such a regularizer follows.
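Below is a hedged PyTorch sketch of a divergence penalty in the spirit of Eqn. (5); the finite-difference discretization and shape conventions are assumptions and may differ from TF-Net's actual implementation.

```python
import torch
import torch.nn.functional as F

# vel: (batch, 2, H, W) velocity field with channels (v_x, v_y).
def divergence_loss(vel):
    dvx_dx = vel[:, 0, :, 1:] - vel[:, 0, :, :-1]   # finite difference along x
    dvy_dy = vel[:, 1, 1:, :] - vel[:, 1, :-1, :]   # finite difference along y
    div = dvx_dx[:, :-1, :] + dvy_dy[:, :, :-1]     # align shapes on interior points
    return torch.mean(div ** 2)

def total_loss(pred, target, lam):
    # data misfit plus soft physics constraint, as in Eqn. (5)
    return F.mse_loss(pred, target) + lam * divergence_loss(pred)
```

The hyper-parameter `lam` plays the role of λ: larger values push predictions toward exact mass conservation at the risk of over-smoothing, a trade-off discussed below.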
Thus, generally speaking, a physics-guided loss function includes an additional physics-based regularizer that ensures consistency with physical laws and is controlled by a hyper-parameter.

Figure 1: A snapshot of the velocity norm (√(v_x² + v_y²)) field of the 2D Rayleigh-Bénard convection flow [26]. The spatial resolution is 1792 × 256 pixels.

Similar to the divergence-free loss in Eqn. (5), physics-guided loss functions or regularization have shown great success in improving prediction performance, especially the physical consistency of DL models. [67] used neural nets to model lake temperature at different times and depths; they ensure physically meaningful predictions by regularizing so that denser water is predicted at lower depths than less dense water. [64] further introduced a loss term that ensures thermal energy conservation between incoming and outgoing heat fluxes for modeling lake temperature. [13] designed conservation layers to strictly enforce conservation laws in their NN emulator of atmospheric convection. [11] introduced a more systematic way of enforcing nonlinear analytic constraints in neural networks via the loss function. [163] incorporated losses on atomic forces and atomic energy into neural nets for improved accuracy in simulating molecular dynamics. [93] proposed a novel multi-fidelity physics-constrained neural network for material modeling, in which the neural net is constrained by losses measuring violations of the model, initial conditions, and boundary conditions. [38] proposed a novel paradigm for spatiotemporal dynamics forecasting that performs spatiotemporal disentanglement using functional variable separation; specially designed time-invariance and regression loss functions ensure the separation of spatial and temporal information.

The Hamiltonian of a system is the sum of the kinetic energies of all particles plus the potential energy of the particles associated with the system. [52] proposed Hamiltonian Neural Networks (HNN), which parameterize a Hamiltonian with a neural network and learn it directly from data. The conservation of desired quantities is constrained in the loss function during training. HNN has shown success in predicting mass-spring and pendulum systems. Lagrangian mechanics models the energies in a system rather than the forces. [30] proposed Lagrangian Neural Networks (LNN), which use a neural network to parameterize the Lagrangian function, i.e., the kinetic energy (energy of motion) minus the potential energy. They trained the neural network with Euler-Lagrange constraint loss functions so that it learns to approximately conserve the total energy of the system. [47] further simplified HNN and LNN via explicit constraints. [83] introduced a meta-learning approach in HNN to find a structure of the Hamiltonian that can be adapted quickly to a new instance of a physical system. [166] benchmarked recent energy-conserving neural network models based on Lagrangian/Hamiltonian dynamics on four different physical systems.
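To illustrate the core idea, below is a minimal sketch of one common HNN formulation in the spirit of [52]: a scalar network H(q, p) whose symplectic gradient gives the time derivatives; the architecture and the single-degree-of-freedom setup are illustrative assumptions.

```python
import torch
import torch.nn as nn

# Scalar Hamiltonian H(q, p) parameterized by a small network.
H_net = nn.Sequential(nn.Linear(2, 64), nn.Tanh(), nn.Linear(64, 1))

def time_derivatives(state):                 # state: (batch, 2) = (q, p)
    state = state.requires_grad_(True)
    H = H_net(state).sum()
    dH = torch.autograd.grad(H, state, create_graph=True)[0]
    dq_dt, dp_dt = dH[:, 1:2], -dH[:, 0:1]   # Hamilton's equations: q' = dH/dp, p' = -dH/dq
    return torch.cat([dq_dt, dp_dt], dim=1)

# Training matches these derivatives to observed (dq/dt, dp/dt) with an MSE loss,
# so trajectories of the learned system approximately conserve the learned H.
```

Simulation is an important method of analyzing, optimizing, and designing real-world processes. Current numerical methods require significant computational resources when solving chaotic and complex differential equations, because the numerical discretization step size is confined to be very small due to stability constraints [61].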
Also, estimating unknown parameters by fitting equations to observed data requires much manual engineering in each application, since the optimization of the unknown parameters highly depends on the initial guesses. Thus, there is increasing interest in utilizing deep generative models to simulate complex physical dynamics, and many works also impose physical constraints in the loss function for better physical consistency. For instance, [157] enforced covariance constraints in standard Generative Adversarial Networks (GAN) via statistical regularization, which leads to faster training and better physical consistency compared with a standard GAN. [158] proposed tempoGAN for super-resolution fluid flow, in which an advection difference loss enforces the temporal coherence of the fluid simulation. [152] modified ESRGAN, a conditional GAN designed for super-resolution, by replacing the adversarial loss with a loss that penalizes errors in the energy spectrum between the generated images and the ground truth. A conditional GAN is applied to emulating numerical hydroclimate models in [99]; simulation performance is further improved by penalizing the snow water equivalent in the loss function. [69] proposed a generative model to simulate fluid flows, in which a novel stream-function-based loss function ensures divergence-free motion for incompressible flows. [50] proposed a physics-informed convolutional model for flow super-resolution, in which the physical consistency of the generated high-resolution flow fields is improved by minimizing the residuals of the Navier-Stokes equations.

The partial differential equations (PDEs) for dynamical systems are often derived from governing first principles, and it is very difficult to find analytical solutions for many complex PDEs. Neural networks have been widely used to solve complex differential equations, with PDE constraints enforced through the loss function [168; 111; 113]. [112; 114; 1; 132] directly approximate the solutions of differential equations with fully connected neural networks given space coordinates and time stamps as input, the so-called physics-informed neural network (PINN) model. They use automatic differentiation to differentiate the neural networks and calculate the first- or second-order derivatives with respect to the input coordinates and time; the governing equation can then be enforced in the loss function as formulated in Section 2.1. PINNs have demonstrated efficiency and accuracy in learning simple dynamics. However, [76] pointed out that PINNs often fail to learn complex physical phenomena because the PDE regularization makes the optimization problem much more difficult. The authors also proposed two ways to alleviate this optimization problem: start training the PINN with a small constraint coefficient and gradually increase it, rather than using a large coefficient from the start, or train the PINN to predict the solution one time step at a time instead of the entire space-time at once.
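Below is a hedged PINN sketch for the 1D Burgers' equation u_t + u·u_x = ν·u_xx, following the general recipe in [112; 114]; the architecture, viscosity value, and collocation-point handling are assumptions for illustration.

```python
import torch
import torch.nn as nn

# u(x, t) is approximated by a fully connected network over (x, t).
net = nn.Sequential(nn.Linear(2, 64), nn.Tanh(),
                    nn.Linear(64, 64), nn.Tanh(),
                    nn.Linear(64, 1))
nu = 0.01  # viscosity (illustrative)

def pde_residual(x, t):          # x, t: (N, 1) collocation points
    x.requires_grad_(True); t.requires_grad_(True)
    u = net(torch.cat([x, t], dim=1))
    grad = lambda out, inp: torch.autograd.grad(
        out, inp, torch.ones_like(out), create_graph=True)[0]
    u_t, u_x = grad(u, t), grad(u, x)
    u_xx = grad(u_x, x)
    return u_t + u * u_x - nu * u_xx   # should vanish where the PDE holds

# Total loss, as in Equ. 2: data misfit on initial/boundary points plus
# lambda_F * mean(pde_residual(x_col, t_col) ** 2) on collocation points.
```

As we can see, physics-guided loss functions are easy to design and use, and can improve prediction accuracy and physical consistency. But they are usually considered soft constraints because the physical constraints incorporated in the loss functions are not strictly enforced. In other words, the desired physical properties are not guaranteed by a physics-guided loss function.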
Furthermore, PDE regularization may make loss landscapes more complex and cause optimization issues that are hard to address [76]. Additionally, there can be a trade-off between prediction error and physics-guided regularizers. For instance, in the case study, although constraining the model with the divergence-free regularizer reduces the divergence of the predictions, too much regularization has the side effect of smoothing out the small eddies in the turbulence, which may result in a larger prediction error [144].

While incorporating physical constraints as regularizers in the loss function can improve performance, DL is still used as a black-box model in most cases. The modularity of neural networks offers opportunities to design novel neurons, layers, or blocks that encode specific physical properties. The advantage of physics-guided NN architectures is that they can impose "hard" constraints that are strictly enforced, in contrast to the "soft" constraints described in the previous section. The "soft" constraints are much easier to design than hard constraints, yet are not required to be strictly satisfied. DL models with physics-guided architectures have theoretically guaranteed properties, and hence are more interpretable and generalizable. In this chapter, we start with a case study of Turbulent-Flow Net (TF-Net), which unifies a popular Computational Fluid Dynamics (CFD) technique with a custom-designed U-net, and then categorize other related methods based on their architecture design.

Figure 2: Turbulent Flow Net: three identical encoders learn the transformations of the three components of different scales, and one shared decoder learns the interactions among these three components to generate the predicted 2D velocity field at the next instant. Each encoder-decoder pair can be viewed as a U-net, and the aggregation is a weighted summation.

Turbulent-Flow Net (TF-Net) [144] is a physics-guided DL model for turbulent flow prediction. As shown in Figure 2, it applies scale separation to model the different ranges of scales of the turbulent flow individually. Computational fluid dynamics (CFD) techniques are at the core of present-day turbulence simulation. Direct Numerical Simulation (DNS) is accurate but not computationally feasible for practical applications, so great emphasis has been placed on alternative approaches including Large Eddy Simulation (LES) and Reynolds-Averaged Navier-Stokes (RANS). Both resort to resolving large scales while modeling small scales, using various averaging techniques and/or low-pass filtering of the governing equations [103; 120]. One widely used CFD technique, the RANS-LES coupling approach [42], combines both RANS and LES in order to take advantage of both methods. Inspired by RANS-LES coupling, TF-Net replaces a priori spectral filters with trainable convolutional layers. The turbulent flow is decomposed into three components, each of which is approximated by a specialized U-net to preserve the multiscale properties of the flow, and a shared decoder learns the interactions among these three components to generate the final prediction. A simplified sketch of this decomposition appears below.
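The following is a hedged sketch of TF-Net-style trainable scale separation; the kernel sizes, channel counts, and downstream U-net encoders are placeholders rather than the paper's exact configuration.

```python
import torch
import torch.nn as nn

# Learned stand-ins for the spectral low-pass filters of RANS/LES.
filter1 = nn.Conv2d(2, 2, kernel_size=5, padding=2, bias=False)  # large scales
filter2 = nn.Conv2d(2, 2, kernel_size=5, padding=2, bias=False)  # medium scales

def decompose(w):                  # w: (batch, 2, H, W) velocity field
    w_bar = filter1(w)             # large-scale component (RANS-like average)
    w_tilde = filter2(w - w_bar)   # resolved medium scales (LES-like)
    w_prime = w - w_bar - w_tilde  # small-scale residual
    return w_bar, w_tilde, w_prime # each component is fed to its own U-net encoder
```

The motivation for this design is to explicitly guide the ML model to learn the nonlinear dynamics of both large-scale and subgrid-scale motions as relevant to the task of spatiotemporal prediction. In other words, we need to force the model to learn not only the large eddies but also the small ones.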
When we train a predictive model directly on the data with an MSE loss, the model may overlook the small eddies and focus only on the large eddies to achieve reasonably good accuracy. Besides RMSE, physically relevant metrics including divergence and energy spectrum are used to evaluate the models' predictions. Figure 3 shows that TF-Net consistently outperforms all baselines on the physically relevant metrics (divergence and energy spectrum) as well as on the average time to produce a single velocity field. Constraining it with the divergence-free regularizer described in the case study of the last chapter further reduces the RMSE and divergence. Figure 4 shows the ground truth and the velocity along the x direction predicted by TF-Net and the three best baselines. The predictions of our TF-Net model are the closest to the target in the shape and frequency of the motions. Thus, TF-Net generates accurate and physically meaningful predictions of the velocity fields that preserve critical quantities of relevance.

Convolutional architectures remain dominant in most computer vision tasks, such as object detection, image classification, and video prediction. Thanks to their efficiency and desirable inductive biases, such as locality and translation equivariance, convolutional neural nets have been widely applied to emulate and predict complex spatiotemporal physical dynamics, and researchers have proposed various ways to bake desired physical properties into their design. For example, [65] proposed to enforce hard linear spatial PDE constraints within CNNs using the Fast Fourier Transform algorithm. [31] modified LSTM units to introduce an intermediate variable that preserves monotonicity in a convolutional auto-encoder model for lake temperature. [104] proposed a physics-guided convolutional model, PhyDNN, which uses physics-guided structural priors and physics-guided aggregate supervision for modeling the drag forces acting on each particle in a computational fluid dynamics–discrete element method simulation. [95] designed HybridNet for dynamics prediction, which combines ConvLSTM for predicting external forces with model-driven computation via a CeNN for system dynamics; HybridNet achieves higher accuracy on forecasting heat convection-diffusion and fluid dynamics. [58] proposed to combine deep learning with a differentiable PDE solver for understanding and controlling complex nonlinear physical systems over a long time horizon. [128] proposed continuous-filter convolutional layers for modeling quantum interactions; the convolutional kernel is parametrized by neural nets that take the relative positions between any two points as input, yielding a joint model of total energy and interatomic forces that follows fundamental quantum-chemical principles.

Standard convolutional neural nets operate only on regular or uniform meshes such as images. Graph neural networks move beyond data on a regular grid towards modeling objects with arbitrary positions. For instance, graph neural networks can model the Lagrangian representation of fluids and Eulerian representations on irregular meshes that CNNs cannot. [123] designed a deep encoder-processor-decoder graph architecture for simulating fluid dynamics under the Lagrangian description: the rich physical states are represented by graphs of interacting particles, and complex interactions are approximated by learned message passing among nodes. [109] utilized the same architecture to learn mesh-based simulation.
The authors directly construct graphs on the irregular meshes used by the numerical simulation methods. In addition, they proposed an adaptive remeshing algorithm that allows the model to accurately predict dynamics at both large and small scales. [15] further proposed two tricks to address the instability and error-accumulation issues of training graph neural nets for solving PDEs: perturbing the input with noise and backpropagating errors only on the last unroll step, and predicting multiple steps simultaneously in time. Both tricks make the model faster and more stable. [4] proposed a Neural Operator approach that learns mappings between function spaces and is invariant to different approximations and grids. More specifically, it uses a message-passing graph network to learn the Green's function from the data; the learned Green's function can then be used to compute the final solution of the PDE. [88] further extended this to the Fourier Neural Operator by replacing the kernel integral operator with a convolution operator defined in Fourier space, which is much more efficient than the Neural Operator. In [122], graph networks were also used to represent, learn, and infer robotic systems, bodies, and joints. [87] proposed to learn compositional Koopman operators, using graph neural networks to encode the state into object-centric embeddings and a block-wise linear transition matrix to regularize the shared structure across objects.

Koopman theory [74] shows that it is possible to represent a nonlinear dynamical system in terms of an infinite-dimensional linear Koopman operator acting on a Hilbert space of measurement functions of the system state. However, it is highly nontrivial to find the measurement functions that map the dynamics to that function space, as well as an approximate, finite-dimensional Koopman operator. An approximation of the Koopman operator can be computed via the Dynamic Mode Decomposition algorithm [126], but the nonlinear observables must be prepared manually according to the underlying dynamics, which is not always possible since we usually have no prior knowledge of them. Thus, neural networks have recently been brought in to learn the Koopman operator. Many machine learning approaches hypothesize that there exists a data transformation, learnable by neural networks, under which an approximate finite-dimensional Koopman operator is available. For instance, [159] and [135] proposed to use fully connected neural nets to directly map the observed dynamics to a dictionary of nonlinear observables that spans a Koopman-invariant subspace. More specifically, this map is represented by an autoencoder network, embedding the observed dynamics into a low-dimensional latent space where the Koopman operator is approximated by a linear layer. [98] further generalized this idea of learning the Koopman operator to systems with continuous spectra. [7] also designed an autoencoder architecture based on Koopman theory to forecast physical processes; in the latent space, the consistency of both the forward and backward systems is ensured, while other models consider only the forward system. The proposed model performs well on long-horizon prediction for systems that have both forward and backward dynamics. A minimal sketch of a Koopman autoencoder follows.
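Below is a hedged sketch of a Koopman autoencoder in the spirit of [159; 98; 7]: an encoder maps states into a latent space where a single linear layer (the approximate Koopman operator K) advances the dynamics; the dimensions, architecture, and loss weighting are assumptions.

```python
import torch
import torch.nn as nn

n, d = 128, 16   # state dimension, latent (observable) dimension (illustrative)
encoder = nn.Sequential(nn.Linear(n, 64), nn.ReLU(), nn.Linear(64, d))
decoder = nn.Sequential(nn.Linear(d, 64), nn.ReLU(), nn.Linear(64, n))
K = nn.Linear(d, d, bias=False)    # linear latent dynamics: the Koopman matrix

def koopman_loss(x_t, x_next):
    z_t = encoder(x_t)
    recon = torch.mean((decoder(z_t) - x_t) ** 2)           # autoencoding fidelity
    pred = torch.mean((decoder(K(z_t)) - x_next) ** 2)      # one-step state prediction
    linear = torch.mean((K(z_t) - encoder(x_next)) ** 2)    # latent dynamics are linear
    return recon + pred + linear
```

Embedding physics into the design of the model architecture enables physical principles to be strictly enforced and theoretically guaranteed, leading to more interpretable and generalizable deep learning models.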
However, it is not trivial to design physics-guided architectures that perform and generalize well without hurting the representation power of the neural nets. Hard inductive biases can greatly improve the sample efficiency of learning, but could become restrictive when the dataset is big enough for models to learn all the necessary inductive biases from the data.

The papers discussed in the previous two sections focus on incorporating the known properties of physical systems into the design of loss functions or neural network modules. In this section, we discuss works that directly combine pure physics-based models, such as numerical methods, with DL models. Perhaps the simplest form of hybrid modeling is residual learning, where DL learns to predict the errors or residuals made by physics-based models. The key is to learn the bias of the physics-based model and correct it with the help of DL models [49; 137]. A representative example is DeepGLEAM [156] for forecasting COVID-19 mortality, as shown in Figure 5. DeepGLEAM combines a mechanistic epidemic simulation model, GLEAM, with deep learning: it uses a Diffusion Convolutional RNN [86] (DCRNN) to learn the correction terms for GLEAM, which leads to improved performance. Figure 6 shows one-week-ahead COVID-19 death count predictions by GLEAM, DCRNN, and DeepGLEAM; DeepGLEAM outperforms both the purely mechanistic model and the purely deep learning model.

Apart from DeepGLEAM, [33] combines graph neural nets with a CFD simulator run on a coarse mesh to generate high-resolution fluid flow predictions. CNNs are used to correct the velocity field from a numerical solver on a coarse grid in [72]. [102] utilized neural networks for subgrid modeling in LES simulations of two-dimensional turbulence. In [121], a neural network model is implemented in a reduced-order modeling framework to compensate for the errors of the model reduction. [66] proposed DR-RNN, which is trained to find the residual minimizer of numerically discretized ODEs or PDEs; they showed that DR-RNN can greatly reduce both the computational cost and the time-discretization error of the reduced-order modeling framework. [160] introduced the APHYNITY framework, which can efficiently augment approximate physical models with deep data-driven networks. A key feature of their method is decomposing the problem in such a way that the data-driven model only models what cannot be captured by the physical model.

DL models can also be used to replace one or more components of physics-based models that are difficult to compute or unknown. For example, [138] replaced the numerical solver for Poisson's equations with convolutional networks in an Eulerian fluid simulation procedure; the results are realistic and show good generalization properties. [107] proposed to use neural nets to reconstruct model corrections in terms of the variables that appear in the closure model. [35] applied a U-net to estimate the velocity field from historical temperature frames, then used the estimated velocity to forecast sea surface temperature based on the closed-form solution of the advection-diffusion equation. [100] combined a high-dimensional model representation, expressed as a sum of mode terms each of which is a sum of component functions, with NNs to build multidimensional potentials, in which the NNs represent the component functions that minimize the error mode term by mode term.
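Returning to the correction-style hybrids above (e.g., [72; 33]), the following is a hedged sketch of a solver-in-the-loop step; `coarse_step` and the CNN corrector are placeholder assumptions, not any specific paper's components.

```python
import torch
import torch.nn as nn

# A cheap coarse solver advances the state; a small CNN nudges the result
# toward the high-accuracy solution.
corrector = nn.Sequential(nn.Conv2d(2, 32, 3, padding=1), nn.ReLU(),
                          nn.Conv2d(32, 2, 3, padding=1))

def hybrid_step(w, coarse_step):
    w_coarse = coarse_step(w)               # one step of the numerical solver
    return w_coarse + corrector(w_coarse)   # learned correction of the coarse state

# Training compares rolled-out hybrid states against high-resolution reference data,
# so the network only has to learn what the coarse solver gets wrong.
```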
[24] developed a continuous-depth NN for solving ordinary differential equations, the Neural ODE. More specifically, they replaced the traditionally discretized neuron layer depths with continuous equivalents, such that the derivative of the hidden state is parameterized by a neural network; the output of the network is computed using a black-box differential equation solver. This allows for increased computational efficiency through the simplification of the backpropagation step of training. Neural ODEs offer tools to build continuous-time time-series models, which can easily handle data arriving at irregular intervals. They also have advantages in building normalizing flows, since it is easy to track the change in density even for unrestricted neural architectures. [39] introduced the Augmented Neural ODE, which is more expressive, empirically more stable, and computationally cheaper than Neural ODEs. More importantly, it can learn functions whose continuous trajectories intersect each other, which Neural ODEs cannot represent. [110] further extended this idea of continuous neural nets to graph convolutions, proposing the Graph Neural ODE. [94] proposed the Neural Stochastic Differential Equation (Neural SDE), which models stochastic noise injection by stochastic differential equations; they demonstrated that incorporating this noise-injection regularization mechanism into the continuous neural network can reduce overfitting and achieve lower generalization error. [91] proposed a Neural ODE-based generative time-series model that uses a known differential equation instead of treating it as hidden-unit dynamics, thereby integrating mechanistic knowledge into the Neural ODE.
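Below is a hedged sketch of the Neural ODE idea: a neural vector field dz/dt = f(z) integrated by a hand-rolled fixed-step RK4 loop standing in for the black-box adaptive solvers (and constant-memory adjoint gradients) used in [24]; the architecture and step count are illustrative assumptions.

```python
import torch
import torch.nn as nn

# The hidden-state derivative dz/dt = f(z) is a learned (here autonomous) vector field.
f = nn.Sequential(nn.Linear(2, 64), nn.Tanh(), nn.Linear(64, 2))

def odeint_rk4(z0, t0, t1, steps=20):
    h, z = (t1 - t0) / steps, z0
    for _ in range(steps):
        k1 = f(z)
        k2 = f(z + 0.5 * h * k1)
        k3 = f(z + 0.5 * h * k2)
        k4 = f(z + h * k3)
        z = z + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
    return z   # end state is differentiable, so any loss on z trains f end-to-end
```

Because the integration time t1 is a free argument, the same model naturally handles observations at irregular intervals.

Combining physics-based and deep learning models makes it possible to leverage both the flexibility of neural nets for modeling unknown parts of the dynamics and the interpretability and generalizability of physics-based models. However, one potential downside of hybrid physics-DL models worth mentioning is that all or most of the dynamics could be captured by the neural nets, with the physics-based model contributing little to the learning; that would hurt the interpretability and the generalizability of the model. We need to ensure an optimal balance between the physics-based model and the deep learning model: the neural nets should only model the information that cannot be represented by the physical prior.

Symmetry has long been implicitly used in DL to design networks with known invariances and equivariances. Convolutional neural networks enabled breakthroughs in computer vision by leveraging translation equivariance [164; 82; 165]. Similarly, recurrent neural networks [118; 57], graph neural networks [125; 71], and capsule networks [119; 56] all impose symmetries. While equivariant DL models have achieved remarkable success on image and text data [29; 150; 27; 25; 85; 73; 8; 155; 28; 46; 151; 37; 51; 133], the study of equivariant nets for learning dynamical systems has become popular only recently. Since symmetries can be integrated into neural nets not only through loss functions but also through the design of the layers themselves, and there is a large volume of work on equivariant and invariant DL models for physical dynamics, we discuss this topic separately in this section. In physics, there is a deep connection between symmetries and conserved quantities: Noether's theorem gives a correspondence between conserved quantities and groups of symmetries.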
For instance, time-translation symmetry corresponds to the conservation of energy and rotation symmetry corresponds to the conservation of angular momentum. By building a neural network that inherently respects a given symmetry, we make conservation of the associated quantity more likely and consequently the model's predictions more physically accurate. Furthermore, by designing a model that is inherently equivariant to transformations of its inputs, we can guarantee that our model generalizes automatically across these transformations, making it robust to distributional shift.

A group of symmetries, or simply a group, consists of a set G together with an associative composition map · : G × G → G. The composition map has an identity 1 ∈ G, and composition with any element of G is required to be invertible. A group G has an action on a set S if there is an action map · : G × S → S that is compatible with the composition law. We say further that ρ : G → GL(V) is a G-representation if the set V is a vector space and each group element g ∈ G is represented by a linear map (matrix) ρ(g) that acts on V. Formally, a function f : X → Y may be described as respecting the symmetry coming from a group G using the notion of equivariance.

Definition 2. Assume a group representation ρ_in of G acts on X and ρ_out acts on Y. We say a function f : X → Y is G-equivariant if f(ρ_in(g) x) = ρ_out(g) f(x) for all x ∈ X and g ∈ G. The function is G-invariant if f(ρ_in(g) x) = f(x) for all x ∈ X and g ∈ G; this is the special case of equivariance in which ρ_out(g) = 1. See Figure 7 for an illustration of a rotation-equivariant function.

An example of such a transformation on a velocity field is scaling: T^sc_λ w(x, t) = λw(λx, λ²t), λ ∈ R_{>0}.

Consider a system of differential operators D acting on a function space F_V, and denote the set of solutions Sol(D) ⊆ F_V. We say G is a symmetry group of D if G preserves Sol(D), that is, if φ is a solution of D, then for all g ∈ G, g(φ) is also a solution. In order to forecast the evolution of a system D, we model the forward prediction function f. Let w ∈ Sol(D). The input to f is a collection of k snapshots at times t − k, ..., t − 1, and f predicts the solution at time t based on the solution in the past. Let G be a symmetry group of D. Then for g ∈ G, g(w) is also a solution of D. Thus f(g w_{t−k}, ..., g w_{t−1}) = g w_t; consequently, f is G-equivariant.

The authors of [147] tailored different methods for incorporating each symmetry into CNNs for spatiotemporal dynamics forecasting. CNNs are time-translation-equivariant when used in an autoregressive manner, and convolutions are naturally space-translation-equivariant. Scale equivariance in dynamics is unique in that the physical law dictates the scaling of magnitude, space, and time simultaneously. To achieve this, they replaced the standard convolution layers with group correlation layers over the group G = (R_{>0}, ·) ⋉ (R², +) of both scalings and translations. The G-correlation upgrades the convolution by both translating and scaling the kernel relative to the input,

v(p, s, µ) = Σ_{λ∈R_{>0}, t∈R, q∈Z²} µ w(p + µq, µ²t, λ) K(q, s, t, λ),

where s and t denote the indices of the output and input channels; an axis is added to the tensors corresponding to the scale factor µ. The rotational symmetry is modeled using SO(2)-equivariant convolutions and activations within the E(2)-CNN framework [150]. A uniform motion transformation adds a constant vector field to the vector field; it is part of Galilean invariance and relevant to all non-relativistic physics modeling. Uniform motion equivariance is enforced by conjugating the model with a shift of the input distribution.
Specifically, for each sliding local block in each convolutional layer, they shift the mean of the input tensor to zero and shift the output back after the convolution and activation function, per sample. In other words, if the input of one sliding local block is P ∈ R^{b×d_in×s×s} and its output is Q ∈ R^{b×d_out} = σ(P · K), where b is the batch size, d is the number of channels, s is the kernel size, and K is the kernel, then the layer instead computes, schematically,

Q̃ = σ((P − P̄) · K) + P̄,

where P̄ is the per-sample mean of the block. This allows the convolution layer to be equivariant with respect to uniform motion. If the input is a vector field, this operation is applied to each element.
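Below is a hedged, simplified version of this mean-shift trick: the paper applies it per sliding local block, whereas this sketch shifts the mean of the whole feature map, which already makes the layer commute with adding a constant vector field; the kernel size and activation are illustrative assumptions.

```python
import torch
import torch.nn as nn

conv = nn.Conv2d(2, 2, kernel_size=3, padding=1, bias=False)

def uniform_motion_conv(P):                    # P: (batch, 2, H, W) velocity field
    mean = P.mean(dim=(2, 3), keepdim=True)    # per-sample, per-channel mean
    return torch.tanh(conv(P - mean)) + mean   # shift out, convolve + activate, shift back

# Equivariance check: adding a constant field c to P shifts the output by exactly c,
# since the mean absorbs c and the convolution sees an unchanged centered input.
```

The DL models used are ResNet and U-Net and their equivariant counterparts. Spatiotemporal prediction is done autoregressively. Standard RMSE and an RMSE computed on the energy spectrum are used to measure performance. The models are tested on Rayleigh-Bénard convection (RBC) and reanalysis ocean current velocity data. For RBC, the test sets have random transformations from the relevant symmetry groups applied to each sample, mimicking real-world data in which each sample has an unknown reference frame. For the ocean data, tests are also performed on different time ranges and different domains from the training set, representing distributional shifts. Figure 8 shows that the equivariant models perform significantly better than their non-equivariant counterparts on both the simulated RBC data and the real-world reanalysis ocean currents, and they also achieve much lower energy spectrum errors.

We can exploit the symmetries in fluid dynamics to design more accurate and generalizable DL models. [148] utilized an encoder capable of extracting the time-invariant and translation-invariant part of a dynamical system, which is then used to guide the main forecaster to generate accurate predictions across heterogeneous domains; time invariance is achieved by using 3D convolutions and a time-shift-invariant loss. [89] designed a tensor basis neural network that embeds the fundamental principle of rotational invariance for improved prediction of the Reynolds stress anisotropy tensor: a final higher-order multiplicative layer ensures that the predicted tensor lies on a rotationally invariant tensor basis. In [101], the weights and biases of neurons are constrained so that the predictions of the NN are guaranteed to preserve even/odd symmetry and energy conservation.

DL has also been used to accelerate computationally expensive molecular simulation [106]. Symmetry is ubiquitous in molecular physics; for instance, molecular conformations or coordinates have roto-translation equivariance. [128] proposed SchNet, a continuous convolution framework that generalizes the CNN approach to model particles at arbitrary positions. The continuous convolution kernels are generated by dense neural networks that operate on the interatomic distances, which ensures rotational and translational invariance of the energy. [5] designed Cormorant, a rotationally covariant neural network architecture for learning the behavior and properties of complex many-body physical systems; Cormorant achieves promising results in learning molecular potential energy surfaces on the MD-17 dataset and the geometric, energetic, electronic, and thermodynamic properties of molecules on the GDB-9 dataset. [124] designed an E(n)-equivariant graph neural network for predicting molecular properties.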
It updates edge features with the Euclidean distance between nodes and updates the coordinates of particles with a weighted sum of the relative differences of all neighbors. [129] proposed a score-based generative model for generating molecular conformations; the authors used equivariant graph neural networks to estimate the score function, the gradient field of the log density of atomic coordinates, because it is roto-translation equivariant. [131] proposed a model for autoregressive generation of 3D molecular structures with reinforcement learning (RL); the method uses equivariant state representations for autoregressive generation, built largely from Cormorant, and integrates these representations into an existing actor-critic RL generation framework. [142] achieves equivariance of the force by directly computing the derivatives of the predicted energy with respect to the coordinates. [162] proposed an end-to-end modeling framework that preserves all natural symmetries of a molecular system, including translation, rotation, and permutation, using an embedding procedure that maps the input to symmetry-preserving components.

Additionally, permutation invariance also arises in molecular dynamics; for instance, quantum mechanical energies are invariant if we exchange the labels of identical atoms. Hence, [10] ensures permutation invariance of the energy by representing the total energy of the system as a sum of atomic contributions, as sketched below.
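A hedged sketch of this sum-of-atomic-contributions construction follows; the descriptor dimension and network architecture are illustrative assumptions.

```python
import torch
import torch.nn as nn

# Each atom contributes an energy computed from a local, order-independent
# descriptor; summing makes the total invariant to relabeling identical atoms.
atom_net = nn.Sequential(nn.Linear(32, 64), nn.Tanh(), nn.Linear(64, 1))

def total_energy(descriptors):          # descriptors: (n_atoms, 32)
    return atom_net(descriptors).sum()  # permutation-invariant by construction
```

[136] argued that enforcing equivariance to all permutations in graph neural nets can be overly restrictive when modeling molecules. They proposed to decompose a graph into a collection of local graphs that are isomorphic to a pre-selected template graph, so that the sub-graphs can always be canonicalized to template graphs before the convolution is applied; the graph neural nets then become not only much more expressive but also locally equivariant. [139] proposed to build equivariant neural networks based on the idea that nonlinear O(d)-equivariant functions can be universally expressed in terms of a lightweight collection of scalars, which are simpler to build. They demonstrated the efficiency and scalability of this approach on two classical physics problems, calculating the total mechanical energy of particles and the total electromagnetic force, which obey all translation, rotation, reflection, boost, and permutation symmetries.

In a traffic forecasting application, [141] proposed a novel model, Equivariant Continuous COnvolution (ECCO), which uses rotationally equivariant continuous convolutions to embed the symmetries of the system for improved trajectory prediction; the rotational equivariance is achieved by a weight-sharing scheme within kernels in polar coordinates. ECCO achieves superior performance over baselines on two real-world trajectory prediction datasets, Argoverse and TrajNet++.

By designing a model that is inherently equivariant or invariant to transformations of its inputs, we can guarantee that our model generalizes automatically across these transformations, making it robust to distributional shift. Both empirically and theoretically, equivariant and invariant neural nets have demonstrated superior data and parameter efficiency compared to data augmentation techniques. More importantly, incorporating symmetries improves the physical consistency of neural nets because of Noether's theorem.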
However, incorporating too many symmetries may overly constrain the representation power of neural nets and slow down both training and inference. In addition, many real-world dynamics do not have perfect symmetries: a perfectly equivariant model that respects a given symmetry may have trouble learning the partial or approximate symmetries in real-world data. Thus, an ideal model for real-world dynamics should be approximately equivariant and automatically learn the correct amount of symmetry from the data. Recently, a few works have explored the idea of building approximately equivariant models. [48] proposed a soft equivariant layer that directly sums a flexible layer with one that has strong equivariance inductive biases. [149] designed approximately equivariant models by relaxing the weight-sharing schemes in the equivariant convolutional layers. Both works demonstrate the benefits of approximately equivariant models for modeling real-world dynamics.

In this paper, we have systematically reviewed recent progress in physics-guided DL for learning dynamical systems. We discussed multiple ways to inject first principles and physical constraints into DL, including (1) physics-guided loss functions, (2) physics-guided architecture design, (3) hybrid physics-DL models, and (4) symmetry. By integrating physical principles, DL models can achieve better physical consistency, higher accuracy, increased data efficiency, improved generalization, and greater interpretability. Despite the great promise and exciting progress in the field, physics-guided AI is still in its infancy. Below we review the emerging challenges and opportunities in learning physical dynamics with deep learning for future studies.

Generalization is a fundamental problem in machine learning. Unfortunately, most DL models for dynamics modeling are trained to model a specific system and still struggle with generalization. For instance, in turbulence modeling, DL models trained with fixed boundary and initial conditions often fail to generalize to fluid flows with different characteristics. Incorporating prior physics knowledge can guide DL models to learn complex patterns that are consistent with physical laws and thus more generalizable to unseen scenarios. Another potentially promising direction is meta-learning. For example, [148] proposed a model-based meta-learning method called DyAd, which can generalize across heterogeneous domains of fluid dynamics; however, while this model generalizes well to dynamics with interpolated physical parameters, it cannot extrapolate beyond the range of physical parameters in the training set. A truly trustworthy and reliable model for learning physical dynamics should be able to extrapolate to systems with various parameters, external forces, or boundary conditions while preserving high accuracy. Thus, further research into generalizable physics-guided DL is much needed.

Long-term forecasting of physical dynamics, up to hundreds of steps ahead, is another big challenge. Error accumulation and instability to perturbations in the input are the two main issues that prevent neural networks from being consistently accurate over a long forecasting horizon. One common training trick is adding noise to the input [15; 109], making models less sensitive to perturbations. [109; 9] utilized online batch normalization, i.e., normalizing the current training sample using the running mean and standard deviation, which also increases the time horizon over which the model can predict.
These models are trained on large amounts of simulation data, but for real-world problems, observational data, such as experimental measurements of jet flows, are very expensive to obtain. How to improve the robustness of predictions given limited training data thus remains a major challenge.

The majority of the reviewed literature focuses on methodological and practical aspects; theoretical analysis of learning non-stationary and chaotic dynamics with DL is lacking. Current DL learning theory rests on the standard assumption that training and test data are independently and identically distributed (i.i.d.) samples from some unknown distribution [161; 84; 105]. However, this assumption does not hold for most dynamical systems, where observations at different times and locations may be highly correlated. [146] empirically showed that DL models fail to generalize under the shifted distributions, in both the data and parameter domains, that naturally arise in dynamical systems. [78] provided the first generalization guarantees for time series forecasting with sequence-to-sequence models; the derived upper bound is expressed in terms of measures of non-stationarity and correlation strength as well as the Rademacher complexity. To better understand the performance of DL on physical dynamics, we need generalization bounds expressed in terms of the characteristics of the dynamics, such as the order and dimensionality of the governing equations. Theoretical studies can in turn inspire model design and algorithm development for learning dynamical systems.

This survey primarily focuses on how DL can be used to model and predict complex physical dynamics given a fixed dataset. A promising next step is, given the dynamics, to design the environment to control them. For instance, automated computational fluid dynamics (CFD) analysis and control theory have been widely applied to aircraft design [41]. CFD can also be used to predict smoke and fire risks in buildings, quantify indoor environmental quality, and design natural ventilation systems [81]. How DL can assist and accelerate these processes still requires in-depth study.

A fundamental pursuit in science is to find causal relationships. For dynamical systems, one may ask which variables influence other variables, either directly or indirectly through intermediates. While causation is traditionally discovered by conducting controlled experiments [108; 14], data-driven approaches for identifying causal relations from observational data have been proposed over the past few decades [55; 53]. Most data-driven approaches do not directly address learning causality with big data, and many questions remain open, such as using causality to improve DL models and disentangling complex and multiple treatments. We are also interested in a system's response under interventions. For instance, when using DL to model climate dynamics, we need accurate predictions under different climate policies, such as carbon pricing and the development of clean energy, so that governments can make well-informed decisions and better mitigate climate change.

Another promising direction is to seek physical laws with the help of DL. The search for fundamental laws is a central theme of science. Once the governing equations of a dynamical system are found, they allow for accurate mathematical modeling, increased interpretability, and robust forecasting.
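A representative recipe for such equation discovery is sparse regression over a library of candidate terms, in the spirit of sparse identification of nonlinear dynamics (SINDy). The following sketch uses an assumed polynomial library and a hand-picked threshold; both choices, and the sequentially thresholded least-squares solver, are illustrative assumptions rather than a definitive implementation.

```python
import numpy as np

def build_library(X):
    """Candidate dictionary of constant, linear, and quadratic terms.

    X has shape (m, n): m snapshots of an n-dimensional state. The
    choice of terms is an assumption; applications may also include
    trigonometric or rational candidates.
    """
    cols = [np.ones((X.shape[0], 1)), X]
    n = X.shape[1]
    cols += [(X[:, i] * X[:, j])[:, None] for i in range(n) for j in range(i, n)]
    return np.hstack(cols)

def sindy(X, X_dot, threshold=0.1, n_iter=10):
    """Sequentially thresholded least squares for x_dot = Theta(x) @ Xi."""
    Theta = build_library(X)
    Xi = np.linalg.lstsq(Theta, X_dot, rcond=None)[0]
    for _ in range(n_iter):
        small = np.abs(Xi) < threshold        # prune negligible coefficients
        Xi[small] = 0.0
        for k in range(X_dot.shape[1]):       # refit active terms per state dim
            big = ~small[:, k]
            if big.any():
                Xi[big, k] = np.linalg.lstsq(Theta[:, big], X_dot[:, k],
                                             rcond=None)[0]
    return Xi  # each column encodes one governing equation
```

Given snapshots `X` and derivative estimates `X_dot` (e.g., from finite differences), each column of the returned coefficient matrix encodes one governing equation as a sparse combination of library terms; the fixed library and noisy derivative estimates are precisely where this recipe becomes fragile, as discussed next.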
However, such methods are limited to selecting from a large, pre-specified dictionary of candidate mathematical terms [115; 127; 18; 80; 117]. The extremely large search space, the scarcity of high-quality experimental data, and overfitting remain the critical concerns. Another line of work discovers symmetries from observed data rather than the full dynamics. For instance, [36] proposed the Lie Algebra Convolutional Network, a novel architecture that can learn a Lie algebra basis and automatically discover symmetries from data. [167] factorized the weight matrix of a fully connected layer into a symmetry matrix and a vector of filter parameters. The two parts are learned separately in the inner and outer loops of the Model-Agnostic Meta-Learning algorithm (MAML) [45], an optimization-based meta-learning method, so that the symmetry matrix captures the weight-sharing pattern present in the data. Still, research on DL-based data-driven methods for discovering physical laws is quite preliminary.

Given the rapid growth in high-performance computing, we also need to improve the automation, acceleration, and streamlining of highly compute-intensive scientific workflows. We should focus on how to efficiently train, test, and deploy complex physics-guided DL models on large datasets and high-performance computing systems, so that these models can be quickly utilized to solve real-world scientific problems. To truly revolutionize the field, these DL tools need to become more scalable and transferable, and to converge into a complete pipeline for the simulation and analysis of dynamical systems. A simple example is integrating machine learning tools into existing numerical simulation platforms, so that data need not be moved between systems and either or both types of methods can readily be used for analysis.

In conclusion, given the availability of abundant data and the rapid growth in computation, we envision that the integration of physics and DL will play an increasingly essential role in advancing scientific discovery and addressing important dynamics modeling problems.
References

Solving nonlinear and high-dimensional partial differential equations via deep learning
Predicting the sequence specificities of DNA- and RNA-binding proteins by deep learning
Concrete problems in AI safety
Neural operator: Graph kernel network for partial differential equations
Cormorant: Covariant molecular neural networks
Learning partially observed PDE dynamics with neural networks
Forecasting sequential data using consistent Koopman autoencoders
Equivariant neural networks and equivarification
MeshGraphNets (USDOE National Nuclear Security Administration)
Generalized neural-network representation of high-dimensional potential-energy surfaces
Enforcing analytic constraints in neural-networks emulating physical systems (arXiv: Computational Physics)
Enforcing analytic constraints in neural-networks emulating physical systems
Achieving conservation of energy in neural network emulators for climate modeling
Introduction to focus issue: Causation inference and information flow in dynamical systems: Theory and applications
Message passing neural PDE solvers
Machine learning for fluid mechanics (ArXiv, abs/1905.11075)
Applying machine learning to study fluid mechanics
Discovering governing equations from data: Sparse identification of nonlinear dynamical systems
Improving direct physical properties prediction of heterogeneous materials from imaging data via convolutional neural network and a morphology-aware generative model
CINDy: Conditional gradient-based identification of non-linear dynamics - noise-robust recovery
Solving the quantum many-body problem with artificial neural networks
The state of the art of hybrid RANS/LES modeling for the simulation of turbulent flows
Neural ordinary differential equations
Neural ordinary differential equations
Rotation equivariance and invariance in convolutional neural networks
Towards lattice Boltzmann models for climate sciences: The GeLB programming language with applications
Group equivariant convolutional networks
Gauge equivariant convolutional networks and the icosahedral CNN
Lagrangian neural networks
Physics-guided architecture (PGA) of neural networks for quantifying uncertainty in lake temperature modeling
An introduction to dynamical systems and market mechanisms
Combining differentiable PDE solvers and graph neural networks for fluid flow prediction
Deep learning for physical processes: Incorporating prior scientific knowledge
Deep learning for physical processes: Incorporating prior scientific knowledge
Automatic symmetry discovery with Lie algebra convolutional network
Exploiting cyclic symmetry in convolutional neural networks
PDE-driven spatiotemporal disentanglement
Augmented neural ODEs
Integrating machine learning with physics-based modeling
Aircraft design analysis, CFD and manufacturing
Reconstruction of turbulent fluctuations using a hybrid RANS-LES approach
Physics-informed autoencoders for Lyapunov-stable fluid flow prediction
Deep learning for turbulent channel flow
Model-agnostic meta-learning for fast adaptation of deep networks
Generalizing convolutional neural networks for equivariance to Lie groups on arbitrary continuous data
Simplifying Hamiltonian and Lagrangian neural networks via explicit constraints
Residual pathway priors for soft equivariance constraints
Combining semi-physical and neural network modeling: An example of its usefulness
Super-resolution and denoising of fluid flow using physics-informed convolutional neural networks without high-resolution labels (arXiv: Fluid Dynamics)
Scale steerable filters for locally scale-invariant convolutional neural networks
Hamiltonian neural networks
A survey of learning causality with data
Solving many-electron Schrödinger equation using deep neural networks
Causal learning and explanation of deep neural networks via autoencoded activations
Transforming auto-encoders
Long short-term memory
Learning to control PDEs with differentiable physics
A tutorial on numerical methods for state and parameter estimation in nonlinear dynamic systems
The finite element method: linear static and dynamic finite element analysis (Courier Corporation)
A first course in the numerical analysis of differential equations
Dynamical systems in neuroscience
Physics guided RNNs for modeling dynamical systems: A case study in simulating lake temperature profiles
Enforcing physical constraints in CNNs through differentiable PDE layer
DR-RNN: A deep residual recurrent neural network for model reduction
Physics-guided neural networks (PGNN): An application in lake temperature modeling
Physics-informed machine learning: case studies for weather and climate modelling
Deep fluids: A generative network for parameterized fluid simulations
Deep unsupervised learning of turbulence for inflow generation at various Reynolds numbers
Semi-supervised classification with graph convolutional networks
Machine learning accelerated computational fluid dynamics
On the generalization of equivariance and convolution in neural networks to the action of compact groups
Hamiltonian systems and transformation in Hilbert space
An introduction to domain adaptation and transfer learning
Characterizing possible failure modes in physics-informed neural networks
Deep learning in fluid dynamics
Foundations of sequence-to-sequence modeling for time series
Advance in RANS-LES coupling, a review and an insight on the NLDE approach
Learning partial differential equations for biological transport models from noisy spatio-temporal data
Building Aerodynamics
Backpropagation applied to handwritten zip code recognition
Identifying physical law of Hamiltonian systems via meta-learning
On generalization and regularization in deep learning
Understanding image representations by measuring their equivariance and equivalence
Diffusion convolutional recurrent neural network: Data-driven traffic forecasting
Learning compositional Koopman operators for model-based control
Fourier neural operator for parametric partial differential equations
Reynolds averaged turbulence modeling using deep neural networks with embedded invariance
Reynolds averaged turbulence modeling using deep neural networks with embedded invariance
Generative ODE modeling with known unknowns
Finite-difference algorithm with local time-space grid refinement for simulation of waves
Multi-fidelity physics-constrained neural network and its application in materials modeling
Neural SDE: Stabilizing neural ODE networks with stochastic noise (ArXiv, abs/1906.02355)
HybridNet: Integrating model-based and data-driven learning to predict evolution of dynamical systems
PDE-Net: Learning PDEs from data
PDE-Net 2.0: Learning PDEs from data with a numeric-symbolic hybrid deep network
Deep learning for universal linear embeddings of nonlinear dynamics
Emulating numeric hydroclimate models with physics-informed cGANs
A random-sampling high dimensional model representation neural network for building potential energy surfaces
Physical symmetries embedded in neural networks
Subgrid modelling for two-dimensional turbulence using neural networks
Introductory Lectures on Turbulence (Mechanical Engineering Textbook Gallery)
PhyNet: Physics guided neural networks for particle drag force prediction in assembly
Exploring generalization in deep learning
Machine learning for molecular simulation (Annual Review of Physical Chemistry)
A paradigm for data-driven predictive modeling using field inversion and machine learning
Causal inference in statistics: An overview
Learning mesh-based simulation with graph networks
Graph neural ordinary differential equations
Universal differential equations for scientific machine learning
Hidden physics models: Machine learning of nonlinear partial differential equations
Data-driven solutions of nonlinear partial differential equations
Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations
Discovering nonlinear PDEs from scarce data with physics-encoded learning
Deep learning and process understanding for data-driven earth system science
Data-driven discovery of partial differential equations
Learning representations by back-propagating errors
Dynamic routing between capsules
Multiscale and Multiresolution Approaches in Turbulence
Neural network closures for nonlinear model order reduction
Graph networks as learnable physics engines for inference and control
Learning to simulate complex physics with graph networks
E(n) equivariant graph neural networks
The graph neural network model
Dynamic mode decomposition of numerical and experimental data
Distilling free-form natural laws from experimental data
SchNet: A continuous-filter convolutional neural network for modeling quantum interactions
Learning gradient fields for molecular conformation generation
Neural lander: Stable drone landing control using learned dynamics
Symmetry-aware actor-critic for 3D molecular design
DGM: A deep learning algorithm for solving partial differential equations
Scale-equivariant steerable networks
Nonlinear dynamics and chaos: with applications to physics, biology, chemistry, and engineering
Learning Koopman invariant subspaces for dynamic mode decomposition
Autobahn: Automorphism-based graph neural nets
Modeling chemical processes using prior knowledge and neural networks
Accelerating Eulerian fluid simulation with convolutional networks
Scalars are universal: Equivariant machine learning, structured like classical physics
Dynamical systems in cosmology
Trajectory prediction using equivariant continuous convolution
Machine learning of coarse-grained molecular dynamics force fields
Aortic pressure forecasting with deep learning
Towards physics-informed deep learning for turbulent flow prediction
Towards physics-informed deep learning for turbulent flow prediction
Bridging physics-based and data-driven modeling for learning dynamical systems
Incorporating symmetry into deep dynamics models for improved generalization
Bridging physics-based and data-driven modeling for learning dynamical systems
Approximately equivariant networks for imperfectly symmetric dynamics
General E(2)-equivariant steerable CNNs
Learning steerable filters for rotation equivariant CNNs
Downscaling numerical weather models with GANs
Latent space physics: Towards learning the temporal evolution of fluid flow
Integrating physics-based modeling with machine learning: A survey (CoRR)
Harmonic networks: Deep translation and rotation equivariance
DeepGLEAM: a hybrid mechanistic and deep learning model for COVID-19 forecasting
Enforcing Statistical Constraints in Generative Adversarial Networks for Modeling Chaotic Dynamical Systems
tempoGAN: A temporally coherent, volumetric GAN for super-resolution fluid flow
Learning deep neural network representations for Koopman operators of nonlinear dynamical systems
Augmenting physical models with deep networks for complex dynamics forecasting
Understanding deep learning requires rethinking generalization
End-to-end symmetry preserving inter-atomic potential energy model for finite and extended systems
Deep potential molecular dynamics: a scalable model with the accuracy of quantum mechanics
Shift-invariant pattern recognition neural network and its optical architecture
Parallel distributed processing model with local space-invariant interconnections and its optical architecture
Benchmarking energy-conserving neural networks for learning dynamics from data
Meta-learning symmetries by reparameterization
Physics-constrained deep learning for high-dimensional surrogate modeling and uncertainty quantification without labeled data