title: Data-Driven Corrections of Partial Lotka–Volterra Models
author: Morrison, Rebecca E.
date: 2020-11-18
journal: Entropy (Basel)
DOI: 10.3390/e22111313

Abstract: In many applications of interacting systems, we are only interested in the dynamic behavior of a subset of all possible active species. For example, this is true in combustion models (many transient chemical species are not of interest in a given reaction) and in epidemiological models (only certain subpopulations are consequential). Thus, it is common to use greatly reduced or partial models in which only the interactions among the species of interest are known. In this work, we explore the use of an embedded, sparse, and data-driven discrepancy operator to augment these partial interaction models. Preliminary results show that the model error caused by severe reductions (e.g., elimination of hundreds of terms) can be captured with sparse operators, built with only a small fraction of that number. The operator is embedded within the differential equations of the model, which allows the action of the operator to be interpretable. Moreover, it is constrained by available physical information and calibrated over many scenarios. These qualities of the discrepancy model (interpretability, physical consistency, and robustness to different scenarios) are intended to support reliable predictions under extrapolative conditions.

In the realm of computational modeling today, scientists, mathematicians, and engineers investigate, design, optimize, and make predictions and decisions about an incredible multitude of real-world systems. In general, a computational model implements a mathematical model; the mathematical model represents the actual system in question using abstraction and simplification. In this paper, we investigate what happens when common simplifications go too far, resulting in an overly reduced or partial model, and how to account for the discrepancy between this model and the true system of interest. At the same time, these partial models still contain significant deterministic information, and they should not be thrown out entirely. Instead, we augment the partial models with a data-driven correction: we use what we know, and learn the rest.

Partial models are especially common in the context of the generalized Lotka–Volterra (GLV) equations. These equations describe the interactive behavior of any number S of different species. The concentration of each species is represented by a variable x_i, i = 1, ..., S; there is one differential equation for each x_i whose right-hand side (RHS) includes a linear growth rate term and nonlinear interaction terms. This framework, also called the quasipolynomial form, is canonical in dynamical systems [1] and is used to describe many types of physical systems, including reaction models for chemical kinetics [2], ecological models [3], and epidemiological models [4]. In these applied fields, modelers commonly build a partial model with only s < S species. For example, there are over 50 chemical species thought to be involved in methane combustion [5]; in practice, often merely five to ten species are included [6]. Surprisingly, Kourdis and Bellan found that the removal of 90–99% [...] specific calibration scenario; otherwise, we could just rely on observations without need for a model.
To this end, we aim to represent the model discrepancy with a discrepancy operator embedded within the model itself, i.e., an embedded discrepancy operator. There are several advantages of an embedded discrepancy operator. First, the operator can be constrained by physical information such as conservation laws, symmetries, fractional concentrations, nonnegativity constraints, and so on. Second, as a function of state variables or other existing model variables, the action of the discrepancy operator is physically interpretable. Third, the operator can be calibrated over many different scenarios, such as initial conditions, boundary conditions, or simulation geometries. Because of these qualities (physical consistency, interpretability, and robustness in different scenarios), an embedded discrepancy model could be valid for extrapolative predictions.

Embedded, or intrusive, approaches have been previously investigated. In [24], Sargsyan et al. allowed for model error by endowing model parameters with random variable expansions. As an approach to model discrepancy, this does not break physical constraints, and the random parameters can be calibrated over many scenarios. However, not all model error can be captured in this way. With complex computational models, missing physics or misspecified physics sometimes causes the discrepancy. This is the situation considered in the current paper. Thus, the discrepancy model becomes part of the model itself, yielding an augmented or enriched model, in which case the specific form of the discrepancy model depends on the modeling context. In [25], the authors investigated this type of inadequacy operator in the context of chemical kinetics for combustion. In this work, we propose and analyze a class of embedded discrepancy operators in the context of the generalized Lotka–Volterra equations, and we show that the error of highly reduced models can be captured by sparse linear operators. For example, a detailed model with 20 species includes 420 parameters, of which 400 correspond to nonlinear terms; a partial model of four species includes 20 of these; and our discrepancy operator introduces only eight new linear terms.

Although the context is different, the work here is perhaps most similar in philosophy and techniques to that which examines the closure problem of reduced-order fluid dynamics models such as RANS (Reynolds-averaged Navier–Stokes) and LES (large-eddy simulation). For example, Portone et al. developed an embedded operator in [26] for porous media flow models of contaminant transport. Pan and Duraisamy constructed general data-driven closure models for both linear and chaotic systems, including canonical ordinary differential equations (ODEs) and the one-dimensional Burgers equation [27]. Other works describe data-driven closure models using proper orthogonal decomposition [28], Mori–Zwanzig techniques [29], and linear approximations to closure models for the Kuramoto–Sivashinsky equation [30]. In a sense, the current paper presents a type of data-driven "closure model" for partial Lotka–Volterra equations.

The paper is organized as follows. A brief review of the generalized Lotka–Volterra (GLV) equations along with a description of the detailed and partial models is given in Section 2. In Section 3, a class of embedded discrepancy operators and a method for enforcing physics-based constraints are proposed. The details of calibration and validation for the enriched models are given in Section 4.
Numerical results are listed in Section 5, and the paper concludes with a discussion in Section 6. Existing techniques to manipulate ordinary differential equations that motivate the proposed discrepancy operators are reviewed in Appendix A.

The generalized Lotka–Volterra equations are coupled ordinary differential equations, used to model the time dynamics of any number of interacting quantities. In particular, the Lotka–Volterra framework allows for linear (growth rate) and quadratic (interaction) terms. The objects in the detailed models will be denoted with a hat symbol. Let the vector x̂ ∈ R^S represent species concentrations. Here, the units of a particular x̂_i refer to the number of specimens per unit area, but specific units are omitted in this paper. The GLV equations of the detailed model D are written succinctly as

    dx̂_i/dt = x̂_i ( r̂_i + Σ_{j=1}^S â_ij x̂_j ),  i = 1, ..., S,

where the vector r̂ ∈ R^S represents the intrinsic growth rates, and the matrix Â ∈ R^{S×S} collects the interaction rates; that is, the ij-th entry of Â, â_ij, indicates how species j affects the concentration of species i. The equilibrium solution is x̂_eq = −Â^{−1} r̂. Since this model is completely determined by the vector r̂ and the matrix Â (modulo initial conditions), we also say that D = {Â, r̂}. The species included in the detailed model are called the detailed set. The term intraspecific refers to interactive behavior within a particular species (the â_ii are intraspecific terms), while the term interspecific refers to the behavior between two different species (the â_ij, i ≠ j, are interspecific terms).

The partial model comprises all terms involving the s species of interest, i.e., it is formed by subsampling the detailed model. For example, suppose S = 3 and s = 2. Then, the detailed model, written out, is

    dx̂_1/dt = x̂_1 ( r̂_1 + â_11 x̂_1 + â_12 x̂_2 + â_13 x̂_3 ),
    dx̂_2/dt = x̂_2 ( r̂_2 + â_21 x̂_1 + â_22 x̂_2 + â_23 x̂_3 ),
    dx̂_3/dt = x̂_3 ( r̂_3 + â_31 x̂_1 + â_32 x̂_2 + â_33 x̂_3 ),

and the partial model is (now without the hat)

    dx_1/dt = x_1 ( r_1 + a_11 x_1 + a_12 x_2 ),
    dx_2/dt = x_2 ( r_2 + a_21 x_1 + a_22 x_2 ).

Likewise, the partial model is referred to as P, with P = {A, r}. Here, the equilibrium solution is x_eq = −A^{−1} r. In this example, the growth rate vectors and interaction matrices are

    r̂ = (r̂_1, r̂_2, r̂_3)^T,  Â = [â_11 â_12 â_13; â_21 â_22 â_23; â_31 â_32 â_33],
    r = (r_1, r_2)^T,        A = [a_11 a_12; a_21 a_22].

The species included in the partial model are called the partial set, and sometimes also the remaining species, that is, remaining after a reduction process.

As the objective of this paper is to understand model discrepancy in the context of partial GLV models, we must define the scope of this context. There are a few considerations to keep in mind. First, note that, as explained above, the partial model considered here follows immediately from the detailed model. Thus, when determining the scope of models under investigation, it suffices to determine the detailed model(s). Then, given a detailed model, we investigate all possible partial models from s = 1 to s = S − 1. Second, the GLV equations encompass an infinite number of specific models, or model realizations, as S can be any integer ≥ 2, and the entries of Â and r̂ can be, in theory, any real numbers. Moreover, any two GLV models, determined by a specific Â and r̂, may behave differently from one another. At the most specific extreme, all model parameters are fixed, yielding a single fixed pair of detailed and partial models, and we could then investigate the model discrepancy therein. At the most general extreme, many models are supplied via highly unconstrained realizations of the model parameters, and we could hope to thus discover highly general results about the model discrepancy. In this paper, by aiming somewhere in between these two extremes, we examine a moderately general random class of LV models. This class is determined by specifying appropriate distributions for the entries of Â and r̂.
Third, since this is an initial exploration into representing model discrepancy in the GLV context, let us narrow the scope in order to examine well-behaved models, i.e., those with stable equilibria. We focus on symmetric interaction matrices with negative entries. This constraint says that all interactions between and within species are competitive, not cooperative. Then, the matrix Â can be stabilized by making its diagonal entries larger in magnitude than the sum of the off-diagonal entries in the same row (or column), ensuring diagonal dominance and thus negative eigenvalues. Here, we consider models whose interaction matrices are symmetric, diagonally dominant, and have negative entries. These restrictions (or similar ones, e.g., that Â be negative definite) are common assumptions in mathematical ecology studies; see, e.g., [3,31,32]. The distributional form characterizing the entries of Â and r̂ is given in the following subsection. Then, in Section 5, specific models are sampled and analyzed through numerical examples.

In this subsection, the information above is summarized and refined algorithmically. Algorithm 1 generates a realization of a detailed model, and Algorithm 2 provides the corresponding partial model (a code sketch of both algorithms is given below). Recall that we use the hat to differentiate between the two and denote a quantity of the detailed model. Without loss of generality, we simply choose the first s species in Algorithm 2, as the detailed set follows no special or implicit ordering. Note that the existence of a stable equilibrium of the partial model follows directly from that of the detailed model.

Previous work shows how a set of S coupled Lotka–Volterra equations can be converted to a set of s equations, s < S, using algebraic substitutions and/or integration [33]. The resulting equations will either need to depend on higher derivatives of the remaining species or on their complete time history; that is, the exact dynamics from the detailed model may be written only in terms of the partial set:

    ẋ = F(x, ẍ, ..., K(x)),

where ẋ is the first derivative of x, ẍ the second derivative, and so on, and K(x) represents some memory kernel. Two such manipulations are reviewed in Appendix A. This motivates an approximation of F with the available partial model P and a discrepancy model ∆ that is a function of either the derivatives or memories of the remaining variables; that is, we seek a model for the partial set of variables as:

    ẋ ≈ P(x) + ∆(x, ẋ, ...).

The above may be reminiscent of Takens's theorem [34], in which a dynamical system is reconstructed from (delayed) observations of the system. However, the two approaches differ fundamentally: here, the LHS derivatives are restricted to those of the partial set, i.e., a subset of the original variables, but that is not true in a delay embedding.

We now propose a particular form of ∆. Recall that the detailed model is

    dx̂_i/dt = x̂_i ( r̂_i + Σ_{j=1}^S â_ij x̂_j ),  i = 1, ..., S,

and the partial model is

    dx_i/dt = x_i ( r_i + Σ_{j=1}^s a_ij x_j ),  i = 1, ..., s.

We initially propose an enriched model E, linear in (x, ẋ), of the form

    dx_i/dt = x_i ( r_i + Σ_{j=1}^s a_ij x_j ) + δ_i0 x_i + δ_i1 ẋ_i,  i = 1, ..., s,   (10)

where δ_0 = (δ_10, δ_20, ..., δ_s0)^T and δ_1 = (δ_11, δ_21, ..., δ_s1)^T. The subscripts on each δ_ij are chosen so that i indicates that this coefficient appears in the RHS of the variable x_i, and j indicates that this coefficient multiplies the jth derivative of x_i. A major advantage of an embedded operator, as opposed to a response discrepancy model, is that the operator can be constrained by any available information about the physical system.
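To make Algorithms 1 and 2 concrete before turning to those constraints, the following minimal Python sketch generates a random detailed model satisfying the requirements above (symmetric, negative entries, diagonally dominant), subsamples the corresponding partial model, and integrates both. The specific sampling distributions, the margin parameter, and all numerical values are illustrative assumptions of this sketch, not the paper's exact specification.

```python
import numpy as np
from scipy.integrate import solve_ivp

rng = np.random.default_rng(0)

def generate_detailed_model(S, margin=1.0):
    """Algorithm 1 (sketch): sample a symmetric interaction matrix with
    negative entries, then make it diagonally dominant so that all
    eigenvalues are negative. The distributions here are assumptions."""
    B = -np.abs(rng.normal(size=(S, S)))       # negative (competitive) entries
    A_hat = (B + B.T) / 2.0                     # symmetrize
    for i in range(S):
        off = np.sum(np.abs(A_hat[i])) - np.abs(A_hat[i, i])
        A_hat[i, i] = -(off + margin)           # strict diagonal dominance
    r_hat = np.abs(rng.normal(loc=1.0, size=S))  # positive growth rates (assumed)
    return A_hat, r_hat

def generate_partial_model(A_hat, r_hat, s):
    """Algorithm 2: keep the first s species (the ordering is arbitrary)."""
    return A_hat[:s, :s].copy(), r_hat[:s].copy()

def glv_rhs(t, x, r, A):
    """GLV right-hand side: dx_i/dt = x_i * (r_i + sum_j A_ij x_j)."""
    return x * (r + A @ x)

A_hat, r_hat = generate_detailed_model(S=10)
A, r = generate_partial_model(A_hat, r_hat, s=4)
assert np.all(np.linalg.eigvalsh(A_hat) < 0)    # stable equilibrium exists

x0 = rng.lognormal(mean=0.0, sigma=1.0, size=10)  # lognormal ICs, as in Section 5
detailed = solve_ivp(glv_rhs, (0.0, 10.0), x0, args=(r_hat, A_hat))
partial = solve_ivp(glv_rhs, (0.0, 10.0), x0[:4], args=(r, A))
print("detailed x_eq (first 4):", -np.linalg.solve(A_hat, r_hat)[:4])
print("partial  x_eq:          ", -np.linalg.solve(A, r))
```

The diagonal-dominance loop is what guarantees the negative eigenvalues, and hence the stable equilibrium, discussed above.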
In this simple GLV example, we do have some information about the system that implies constraints on the introduced discrepancy parameters δ_0 and δ_1. First, we make the modeling ansatz that these discrepancy parameters should not depend explicitly on time. A consequence of this ansatz is that the parameters must be constrained independently of time. We also (assume that we) know that all interspecific interactions are competitive. In particular, note that a_ij x_i x_j ≤ 0 because a_ij < 0 and x_i, x_j ≥ 0. Thus, we enforce that ∆_i(x_i, ẋ_i) ≤ 0. Specific information about the high-fidelity physical system therefore implies the following constraints:

• We know x_i ≥ 0, which implies δ_i0 ≤ 0.
• The constraint on δ_i1 is slightly less clear, since the sign of ẋ_i could be positive or negative. Thus, we could set δ_i1 = δ̃_i1 sgn(ẋ_i), where δ̃_i1 ≤ 0. Equivalently, we can write the discrepancy as

      ∆_i(x_i, ẋ_i) = δ_i0 x_i + δ_i1 |ẋ_i|.

  Then, set δ_i1 ≤ 0 and the constraint is satisfied.

Because of this final constraint, the discrepancy operator is no longer linear in ẋ, but rather in |ẋ|. We still refer to such a formulation as linear (precedence for this use of "linear" is found in [35]). Thus, we amend the enriched model (10) as

    dx_i/dt = x_i ( r_i + Σ_{j=1}^s a_ij x_j ) + δ_i0 x_i + δ_i1 |dx_i/dt|,  i = 1, ..., s.

Finally, the introduced discrepancy parameters δ_0, δ_1 are calibrated, using observations of species concentrations generated by the detailed model. Indeed, the strength of the embedded operator approach stems from two properties: (1) the ability to constrain the formulation by available physical information, and (2) the ability to leverage information from the detailed system by calibrating the model discrepancy parameters. Moreover, we calibrate over a range of initial conditions, denoted φ_1, ..., φ_{n_φc}. Each φ_i specifies the species' initial concentrations, x(0). By calibrating with observations from all n_φc scenarios, the goal is to build a more robust discrepancy model that is valid over several scenarios instead of only calibrated to a very specific dataset. This property of the model discrepancy construction further allows for the possibility, at least, that such an enriched model could be used in extrapolative conditions, such as a prediction in time, or in scenarios given by different initial conditions. Note that the actual observations used to calibrate the parameters are specified in Section 4, along with the particulars of the calibration itself. We have tried to separate what is essential to the formulation of the discrepancy operator from the calibration details, which could reasonably change based on the example at hand.

The equilibrium solution of the enriched model is

    x_eq = −A^{−1} ( r + δ_0 ),

since the |ẋ| term vanishes at equilibrium. Thus, the parameters δ_0, which act linearly on the state x, directly control the equilibrium solution. The stability of the enriched model is less obvious because of the absolute values; here, we conjecture that the models are indeed stable. To see why, first consider an example enriched system of just one variable x:

    ẋ = x ( r + a x ) + δ_0 x + δ_1 |ẋ|.

Depending on the sign of ẋ, this becomes one of the following logistic equations:

    ẋ = x ( r + δ_0 + a x ) / (1 − δ_1),  if ẋ ≥ 0,   (16a)
    ẋ = x ( r + δ_0 + a x ) / (1 + δ_1),  if ẋ < 0.   (16b)

The logistic equations admit solutions of the qualitative nature shown in Figure 1. Importantly, the sign of ẋ never changes over any solution curve. Given the initial condition, we could in fact solve the same system without the absolute value by choosing either (16a) or (16b); both reach a stable equilibrium. Thus, the presence of the absolute value in this example does not affect the stability of the system.
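As an implementation note: because ẋ_i appears inside the absolute value, the amended RHS is implicit in ẋ_i. For −1 < δ_i1 ≤ 0, the same case split as in (16a) and (16b) resolves it in closed form: writing g_i for the RHS without the |ẋ_i| term, ẋ_i = g_i/(1 − δ_i1) when g_i ≥ 0 and ẋ_i = g_i/(1 + δ_i1) when g_i < 0. The sketch below uses this resolution; the parameter values are illustrative, and the closed-form resolution is a convenience of this sketch rather than a detail stated in the paper.

```python
import numpy as np
from scipy.integrate import solve_ivp

def enriched_rhs(t, x, r, A, d0, d1):
    """Enriched GLV RHS with the linear embedded discrepancy
        dx_i/dt = x_i (r_i + sum_j A_ij x_j) + d0_i x_i + d1_i |dx_i/dt|.
    The implicit |dx/dt| term is resolved by cases: for -1 < d1_i <= 0,
    dx_i/dt = g_i / (1 - d1_i) if g_i >= 0, else g_i / (1 + d1_i),
    where g_i is the RHS without the absolute-value term."""
    g = x * (r + A @ x) + d0 * x
    return np.where(g >= 0, g / (1.0 - d1), g / (1.0 + d1))

# Illustrative values only; d0, d1 <= 0 enforce the competitive constraint.
r = np.array([5.0, 4.0])
A = np.array([[-3.0, -1.0], [-1.0, -4.0]])
d0 = np.array([-0.8, -0.5])
d1 = np.array([-0.05, -0.02])
sol = solve_ivp(enriched_rhs, (0.0, 10.0), [1.0, 1.0], args=(r, A, d0, d1))
print(sol.y[:, -1])                       # approaches x_eq = -A^{-1}(r + d0)
print(-np.linalg.solve(A, r + d0))        # the enriched equilibrium above
```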
In general, the signs of the derivatives may change. However, we conjecture that the derivatives of all species do not change sign after a given point in time, say t* (as seen in the numerical examples in Section 5). In this case, the enriched differential equation for x_i(t), t > t*, is

    dx_i/dt = λ_i [ x_i ( r_i + Σ_{j=1}^s a_ij x_j ) + δ_i0 x_i ],

where

    λ_i = 1 / (1 − δ_i1 sgn(ẋ_i))

is constant for t > t*. The above does reach a stable equilibrium, as λ_i simply scales the overall dynamics and the interaction matrix A (still) determines the stability of the system. A more rigorous analysis of stability for these systems will be addressed in future work. For now, we note that differential equations with absolute value terms have been treated in the literature. In particular, Khan and Barton showed that, for ODEs whose RHS are a composition of analytic and absolute value functions of the state variables, the arguments of the absolute values change sign finitely many times in any finite duration [36], while Barton et al. provided a theoretical and computational framework for evaluating nonsmooth derivatives called lexicographic directional derivatives [37]. Finally, Oakley demonstrated that certain second-order differential equations with absolute values admit solutions of sets of related linear differential equations [35].

3.3. Proof of Concept: Linear Embedded Discrepancy Operator, S = 2, s = 1

As an initial proof of concept, consider the S = 2, s = 1 case. The detailed model is a two-species GLV system with â_11 = −3, r̂_1 = 5, and â_12 = −1, and the partial model is simply P = (A, r) = (a_11, r_1) = (−3, 5). In this case, the exact discrepancy is â_12 x_1 x_2 = −x_1 x_2, and we aim to approximate the effect of this term with

    ∆_1(x_1, ẋ_1) = δ_10 x_1 + δ_11 |ẋ_1|.

Calibration yields posterior mean values of δ_10 ≈ −0.837 and δ_11 ≈ −0.0224; further calibration details are deferred to the next section. The three models (detailed, partial, and enriched) are shown in Figure 2a; excellent agreement between the detailed and enriched models is achieved. We also show the phase diagram of the three models in Figure 2b, including the 2D phase diagram from the detailed model projected onto the x_1-axis. The recovered derivatives of the enriched model match this projection quite well. Analogous plots are difficult to visualize in higher dimensions, but in this low-dimensional case, the projection may provide some intuition about why the enriched model behaves like the detailed model.

There are a number of related possible formulations of the model discrepancy. Some options are the following:

1. An affine expression up to the Nth derivative:

       ∆_i = μ_i + Σ_{j=0}^{N} δ_ij x_i^{(j)},

   where x_i^{(j)} denotes the jth derivative of x_i (with x_i^{(0)} = x_i).
2. A quadratic expression up to the Nth derivative, which additionally includes products of the derivatives, x_i^{(j)} x_i^{(k)}, 0 ≤ j ≤ k ≤ N.
3. A memory expression, such as:

       ∆_i(t) = μ_i + β_i ∫_0^t x_i(s) ds,   (23)

   for some β_i ∈ R.

Each of the above formulations includes an affine term μ_i. Whether or not such a constant term would be advantageous when all the missing dynamics terms are state-dependent is not immediately clear. Of course, one could also propose some combination of the above formulations as an embedded discrepancy operator. Investigating the numerical advantages and limitations of many such discrepancy operators is beyond the scope of the current paper. For now, numerical results are presented in Section 5 for the proposed linear embedded discrepancy operator, as described in Section 3.1.
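Before the calibration details, here is a minimal sketch of the Section 3.3 proof of concept, using only the values reported above (a_11 = −3, r_1 = 5, and the posterior means δ_10 ≈ −0.837, δ_11 ≈ −0.0224). The detailed two-species model is not reproduced here, since its remaining parameters are not listed above.

```python
import numpy as np
from scipy.integrate import solve_ivp

a11, r1 = -3.0, 5.0          # partial model P = (a11, r1) from the text
d10, d11 = -0.837, -0.0224   # calibrated posterior means from the text

def partial_rhs(t, x):
    return x * (r1 + a11 * x)

def enriched_rhs(t, x):
    # Same case-split resolution of the implicit |dx/dt| term as above.
    g = x * (r1 + a11 * x) + d10 * x
    return np.where(g >= 0, g / (1 - d11), g / (1 + d11))

x0 = [1.0]
p = solve_ivp(partial_rhs, (0.0, 5.0), x0)
e = solve_ivp(enriched_rhs, (0.0, 5.0), x0)
print("partial  equilibrium:", -r1 / a11)          # 5/3
print("enriched equilibrium:", -(r1 + d10) / a11)  # (5 - 0.837)/3
```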
This section contains all relevant details about the calibration and validation processes. First, for both of these, it is necessary to know what observations are available. The datasets used to calibrate and validate the discrepancy model include observations from the detailed model trajectories of the s species included in the partial model. From each trajectory, T observations are taken, and there is a new trajectory for each initial condition φ, so that the observations can be summarized as

    O = { y_ijk },  i = 1, ..., s;  j = 1, ..., T;  k = 1, ..., n_φ,   (24)

where y_ijk is the observation of x_i(t_j) given the initial condition φ_k. An observed value y* is given by the true value y_t with additive measurement error ε:

    y* = y_t + ε,

where the measurement error is normally distributed: ε ∼ N(0, σ²). Finally, this set of observations is partitioned into two sets, one for calibration and the other for validation:

    Calibration data:  O_c = { y_ijk : k = 1, ..., n_φc },
    Validation data:   O_v = { y_ijk : k = n_φc + 1, ..., n_φ }.

That is, n_φc initial conditions are used for calibration, and the remaining n_φv are designated for validation, where n_φc + n_φv = n_φ. The calibration is done using a Bayesian approach, and the details of the calibration problem are as follows.

• Prior: We set uniform prior distributions on the discrepancy parameters θ = (δ_0, δ_1):

      p(θ) = Π_{i,j} p(δ_ij),  where p(δ_ij) = U(−100, 0),  i = 1, ..., s;  j = 0, 1.

  (One might expect a negative lognormal distribution for these priors, and this was in fact the first choice. However, the uniform priors performed much better during the sampling process, and all of the parameter chains in the Markov chain Monte Carlo simulations were well contained by the uniform bounds. Why the lognormal priors led to poor mixing will be investigated further in future work.)
• Likelihood: The likelihood is determined by the measurement error:

      p(O_c | θ) = Π_{l=1}^{|O_c|} N( y_l ; y_{l,E}(θ), σ² ),

  where the observations have been re-indexed from 1 to |O_c| (to avoid triple subscripts here) and y_{l,E} is the corresponding model output from the enriched model E.
• Posterior: Given the prior and likelihood distributions above, the posterior distribution follows as:

      p(θ | O_c) ∝ p(O_c | θ) p(θ).

Specifically, the calibration is performed according to the Delayed Rejection Adaptive Metropolis (DRAM) method, introduced in [38] and implemented in the statistical library QUESO [39].

Next, we must define an appropriate quantitative validation metric. First, we quantify the consistency between the enriched model output and the corresponding observation: we compute how probable the observation is as a realization of the model output. The probability density of observing some y*, given the data O_c, is

    p(y* | O_c) = ∫ p(y* | θ) p(θ | O_c) dθ.   (32)

We can compare this probability to the rest of the possible model outputs. In particular, we are interested in how much of the distribution corresponds to model outputs less likely than the one above in (32). This amount is exactly given by the γ-value, as defined in [15]:

    γ = ∫_S p(y | O_c) dy,   (33)

where S = { y : p(y | O_c) ≤ p(y* | O_c) }. Note that a low γ-value implies that the observation is less probably an outcome of this model than most possible outcomes. In contrast, values that are not low demonstrate consistency between the model and the observation. In this work, we compute the fraction of γ-values below a given threshold τ. For a more thorough introduction to γ-values and a discussion of their use in model validation, see [15]; for another example of their use in practice as a validation metric, see [25]. An example of the area corresponding to this integral is given in Figure 3. In this work, we compute the integrals with a Monte Carlo approach [40].
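As one plausible realization of this Monte Carlo computation (an assumption of this sketch, not the paper's exact estimator), suppose the posterior predictive for a single scalar observation is approximated by a Gaussian mixture over posterior draws of the enriched-model output; then γ can be estimated as follows.

```python
import numpy as np
from scipy.stats import norm

def gamma_value(y_star, y_model_samples, sigma, n_mc=20000, rng=None):
    """Monte Carlo estimate of the gamma-value of an observation y_star.

    y_model_samples: enriched-model outputs y_E(theta_m) for posterior
    draws theta_m, so the posterior predictive is the Gaussian mixture
        p(y | O_c) ~ (1/M) sum_m N(y; y_E(theta_m), sigma^2).
    gamma is the predictive mass of {y : p(y | O_c) <= p(y_star | O_c)}.
    """
    rng = rng or np.random.default_rng()
    mu = np.asarray(y_model_samples)

    def predictive_pdf(y):
        return norm.pdf(np.atleast_1d(y)[:, None], loc=mu, scale=sigma).mean(axis=1)

    p_star = predictive_pdf(y_star)[0]
    # Draw from the mixture: pick a component, then add Gaussian noise.
    y_draws = rng.choice(mu, size=n_mc) + sigma * rng.normal(size=n_mc)
    return np.mean(predictive_pdf(y_draws) <= p_star)

# An observation in the bulk of the predictive gives gamma near 1;
# an observation far in the tail gives gamma near 0.
samples = np.random.default_rng(1).normal(1.4, 0.05, size=500)
print(gamma_value(1.4, samples, sigma=np.sqrt(0.001)))
print(gamma_value(2.0, samples, sigma=np.sqrt(0.001)))
```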
We now present the numerical performance of the proposed linear embedded discrepancy operator described in Section 3.1. All code (to run forward and inverse problems, generate data, and postprocess) is available at github.com/rebeccaem/enriched-glv [41].

First, let us examine results for a single detailed and partial model. The detailed model is generated according to Algorithm 1; here, S = 10. Then, the partial model is generated according to Algorithm 2 with s = 4. In this example, the observations from the detailed model are taken so that n_φc = 3, n_φv = 3, T = 10, and σ² = 0.001. The entries of each initial condition vector φ_i are generated randomly from a lognormal distribution, logN(0, 1). Note that 90 parameters are omitted during reduction, while only eight are introduced during enrichment.

Figure 4 shows trajectories for calibration scenarios from the three models: detailed, partial, and enriched. The 50% and 95% quantiles are plotted for the enriched model output. There is an obvious discrepancy between the output from the detailed and partial models, and the enriched model is able to capture the bulk of this discrepancy. Nearly all of the observations from the detailed model are contained within the model output bounds from the enriched model. Figure 5 shows the same results, but for validation scenarios. Recall that these observations have not been used to calibrate the discrepancy operator. The output of the enriched model, at least to the eye, appears decent. The enriched model is greatly improved in comparison to the partial model alone and, similarly to the calibration scenarios, captures the bulk behavior of the detailed model in the validation scenarios.

In the above cases, there are a few observations which lie outside the predicted bounds of the enriched model. This must be addressed more carefully with the quantitative validation process described in Section 4. Additionally, these results only show the performance of the discrepancy operator for a particular S and s and a single realization of (D, P). The agreement between trajectories from detailed and partial models for different choices of (D, P) is qualitatively similar, but some interesting differences appear by varying s with respect to S. In the next subsections, these statements are made more precise as we examine the performance of the proposed discrepancy model in the context of random forward models. To this end, three relevant concepts are detailed below.

1. We quantify the average performance of the discrepancy models. In this sense, we compute γ-values for trajectories from n_M realizations of detailed models, where n_M ≫ 1.
2. γ-values are computed with two types of data: calibration and validation data. To refer to these two types of data, we will use the variable p ∈ {c, v}, so that p = c denotes calibration data and p = v denotes validation data. We must check how well the enriched model performs both in terms of the data that has been used to calibrate it, and also in terms of data that has not. Both types are shown in Figures 8 and 9.
3. Finally, let us examine how well the discrepancy operators perform for different pairs (S, s). We fix S, s, p, and n_M, and then compute γ-values for all type-p observations over n_M models, for a particular pair (S, s). Call this set of γ-values Γ(S, s, p, n_M). Now let Q(S, s, p, n_M, τ) = { γ_i : γ_i < τ, γ_i ∈ Γ(S, s, p, n_M) }. Then, the fraction of γ-values below the threshold τ is:

       f_γ(S, s, p, n_M, τ) = |Q(S, s, p, n_M, τ)| / |Γ(S, s, p, n_M)|.

For example, if we want to compute f_γ for all calibration data over n_M model realizations, the denominator above is |Γ| = s T n_φc n_M.
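Given an array of pooled γ-values, this metric is a one-line computation (a trivial sketch for reference):

```python
import numpy as np

def f_gamma(gammas, tau):
    """Fraction of gamma-values below the threshold tau:
    f_gamma = |{gamma_i : gamma_i < tau}| / |Gamma|."""
    return np.mean(np.asarray(gammas) < tau)

# Pooled over calibration data and n_M model realizations,
# len(gammas) would be s * T * n_phi_c * n_M.
```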
The value f_γ is plotted in Figures 8 and 9, with S fixed at 10 and 20, respectively. Along the x-axis, s ranges from 1 to S − 1. The results for two values of τ, 0.05 (shown in Figures 8a and 9a) and 0.01 (shown in Figures 8b and 9b), are shown. Let α = s/S. In the case that the model truly does represent the data-generating process, and in the limit of infinite observations, the fraction of γ-values below the threshold is equal to the threshold itself; that is,

    f_γ(S, s, p, n_M, τ) → τ

when the model is a true match to the data-generating process. Indeed, f_γ(10, s, c, 100, τ) approaches τ as α approaches 1 (Figure 8). This suggests that the enriched model is better able to capture the behavior of the detailed one as more species are included in the partial model, as one might expect. Interestingly, in the S = 20 case, f_γ peaks somewhere in the middle of the plot, when α ≈ 0.5 (Figure 9). In other words, the enriched model is poorest for moderate α, and performs best as α approaches 1. Consider that when α is low, only a few species are included in the partial model relative to the detailed one, but also that the discrepancy model has only those few species to modify. When α is close to one, the partial model already includes much of the detailed model, and the discrepancy model must only fill a small gap between the two. For moderate α, however, there are neither of these advantages: the discrepancy model must account for the behavior of a large number of species, but the partial model is still significantly lacking compared to the detailed model. At the same time, the S = 10 plots do not exhibit the above behavior. Note that the S = 20 cases appear to reach equilibrium more quickly than those with S = 10; the time to equilibrium may influence the shape of the curves in Figures 8 and 9. Future work will include extensive numerical testing to better understand these results.

A good discrepancy model should not overfit the data, and the best discrepancy model would be rich enough to capture the relevant behavior of the detailed model without adding unnecessary complexity. Although there are different ways one might measure complexity, here we count the number of terms introduced in the enriched model (2s) and compare it to the number omitted from the detailed model. These omitted terms include the S² − s² interspecific and intraspecific interaction terms, as well as the S − s growth rate terms. (Note that the number of terms introduced is equal to the number of enriched model parameters.) For the cases S = 10, 20, the absolute numbers are shown in Figure 10.

Figure 10. Comparison of the number of terms added by the enriched model and the number of terms omitted from the detailed model ((a) S = 10; (b) S = 20).

In Figure 11, this information is presented as the ratio of terms added relative to terms omitted, for various values of S. We call this ratio the relative model complexity. The relative model complexity is plotted as s/S varies from 1/S to (S − 1)/S for a few different values of S. These include the two cases presented here (S = 10, 20), as well as two higher values, S = 50 and S = 100, since one might be interested in how this type of model complexity would scale for much larger systems. Moreover, if one knew a priori the true value of S for some system, one could balance the effectiveness of the enriched model (as measured by f_γ) against its relative model complexity. Strikingly, the enriched models introduce many fewer terms than the partial models omit.
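Since the term counts are explicit, this ratio has the closed form 2s / ((S² − s²) + (S − s)); the short sketch below tabulates it for the values of S shown in Figure 11.

```python
def relative_complexity(S, s):
    """Terms added by enrichment (2s) over terms omitted by reduction
    (S^2 - s^2 interaction terms plus S - s growth rate terms)."""
    return 2 * s / ((S**2 - s**2) + (S - s))

for S in (10, 20, 50, 100):
    print(S, [round(relative_complexity(S, s), 3) for s in (1, S // 2, S - 1)])
# For S = 10, s = 4: 8/90 ~ 0.089, i.e., less than 0.1 as quoted below.
```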
For example, in the two specific forward models shown in Figures 4–7, the relative model complexity is less than 0.1, yet the enriched model and observations show surprisingly high consistency.

This study is an initial step toward representing model discrepancy in nonlinear dynamical systems of interacting species. The proposed discrepancy model here is a linear operator embedded within the differential equations. The particular form is motivated by circumstances in which a set of differential equations can be converted to a set of fewer equations; in this decoupling process, more information must be introduced about the remaining set, such as memory or higher derivatives. In this work, the discrepancy model is similarly constructed by introducing more information about the partial set, namely as a linear operator which acts on the remaining variables and (the absolute values of) their first derivatives.

We can examine the performance of the enriched models over two regimes: equilibrium and transient dynamics. The introduced parameters δ_0 act on the state variables, directly affect the equilibrium solution, and seem to be sufficient, as the enriched models typically recover the equilibria of the detailed models. On the other hand, the parameters δ_1 act on the derivatives of the state, provide a type of overall scaling of the dynamics, and give an improvement but not a total correction; the enriched models recover much of the transient dynamics, but certainly not all of the discrepancy for every combination of (S, s). While the performance in the transient regime could be improved, the linear embedded discrepancy operators show promise as discrepancy models, even in scenarios that extrapolate over initial conditions.

The results also bring up many new questions. For example, what is the effective dimension of the missing dynamics of the partial model? In other words, how many (and which) new random variables need to be introduced to effectively (i.e., within some tolerance) capture the error of the partial model? The initial results here suggest that the discrepancy between the partial and detailed models can, under some conditions, be adequately described with a relatively small number of discrepancy variables and parameters. An outstanding question is whether or not some estimate of this effective discrepancy dimension can be found a priori. Certainly, such an estimate would rely heavily on given knowledge of the detailed and partial models. Another avenue to explore is the design and analysis of more elaborate discrepancy representations in the generalized Lotka–Volterra setting, including those with second (or higher) derivatives, memory, nonlinear terms, or some combination of these. Of course, a trade-off exists between the richness of the discrepancy representation and the computational expense of both the forward and inverse problems. Finally, the detailed models (and thus also the partial models) investigated here are quite simple; the interaction matrices are negative definite, diagonally dominant, and symmetric, with off-diagonal entries sampled from identical distributions. An immediate next step in this research is to examine the performance of linear embedded discrepancy operators after relaxing these restrictions on the random interaction matrices.

Funding: This research received no external funding.
References

Reducing nonlinear dynamical systems to canonical forms
The effect of intra- and interspecific competition on coexistence in multispecies communities
Calibration of a SEIR–SEI epidemic model to describe the Zika virus outbreak in Brazil
On reduced mechanisms for methane–air combustion in nonpremixed flames
High-pressure reduced-kinetics mechanism for n-hexadecane autoignition and oxidation at constant pressure
Highly reduced species mechanisms for iso-cetane using the local self-similarity tabulation method
COVID-19 pandemics modeling with SEIR(+CAQH), social distancing, and age stratification: The effect of vertical confinement and release in Brazil
Mosquito and primate ecology predict human risk of yellow fever virus spillover in Brazil
Verification and Validation in Scientific Computing
Practical methods for a posteriori error estimation in engineering applications
Code Verification by the Method of Manufactured Solutions
Practical Statistics for Data Scientists: 50 Essential Concepts
Validating predictions of unobserved quantities
A framework for validation of computer models
Adaptive selection and validation of models of complex systems in the presence of uncertainty
A sequential calibration and validation framework for model uncertainty quantification and reduction
Bayesian calibration of computer models
Uncertainty and decision-support relevance in climate predictions
Understanding predictive uncertainty in hydrologic modeling: The challenge of identifying input and structural errors
Uncertainty and variability in computational and mathematical models of cardiac physiology
An information theoretic approach to use high-fidelity codes to calibrate low-fidelity codes
On the statistical calibration of physical models
Representing model inadequacy: A stochastic operator approach
Operator approach to model inadequacy with applications to contaminant transport (arXiv, 2017)
Data-driven discovery of closure models
Proper orthogonal decomposition closure models for turbulent flows: A numerical comparison
Non-Markovian closure models for large eddy simulations using the Mori–Zwanzig formalism
Data-based stochastic model reduction for the Kuramoto–Sivashinsky equation (Phys. D: Nonlinear Phenomena)
Feasibility and coexistence of large ecological communities
Species competition: Coexistence, exclusion and clustering
Exact model reduction of the generalized Lotka–Volterra equations
Detecting strange attractors in turbulence
Differential equations containing absolute values of derivatives
Switching behavior of solutions of ordinary differential equations with abs-factorable right-hand sides
Computationally relevant generalized derivatives: Theory, evaluation and applications
DRAM: Efficient adaptive MCMC
The parallel C++ statistical library 'QUESO': Quantification of Uncertainty for Estimation, Simulation and Optimization
rebeccaem/enriched-glv: Initial Release
Algebraic decoupling of variables for systems of ODEs of quasipolynomial form
Reduction of dimension for nonlinear dynamical systems
Extracting macroscopic dynamics: Model problems and algorithms

Acknowledgments: I would like to acknowledge Youssef Marzouk, Prakash Mohan, Bob Moser, and Todd Oliver for many helpful discussions about this work.

Conflicts of Interest: The author declares no conflict of interest.

Appendix A

A system of S coupled ordinary differential equations can sometimes be converted (decoupled) to a system of s differential equations, where s < S, without loss of information.
Possible structures of the resultant set, comprising s equations, motivate the functional form of the model discrepancy proposed here. This appendix briefly reviews two methods of model conversion, and what the application of each method yields in the GLV context. For more information about these types of model conversion, or exact model reduction, see [33,42].

In [43], Harrington and Van Gorder present a method to algebraically convert systems of coupled differential equations from one form to another. As an example from that paper, consider the Lorenz system of three ODEs:

    ẋ = σ(y − x),
    ẏ = x(ρ − z) − y,
    ż = xy − βz.

Through algebraic substitutions, this can be converted to a single third-order nonlinear differential equation in only the variable x and its derivatives. Solving the first equation for y gives y = x + ẋ/σ, and the second equation then yields z = ρ − [ẍ + (σ + 1)ẋ + σx]/(σx); substituting both expressions into the third equation leaves a single third-order equation in x alone. In this example, the variables y and z have been exchanged for derivatives of x.

In the GLV setting, we can perform a similar exchange. For example, consider the two-species system for x and y:

    ẋ = x(r_1 + a_11 x + a_12 y),
    ẏ = y(r_2 + a_21 x + a_22 y).

Solving the first equation for y, y = (ẋ/x − r_1 − a_11 x)/a_12, and substituting into the second yields a single second-order differential equation for x, which can be written more compactly with the help of an auxiliary variable z. While this single differential equation is quite messy, it is now written entirely in terms of x and its derivatives.

Similarly, the Mori–Zwanzig approach to model reduction makes an exchange, but here, variables may be exchanged for the time history, or memory, of the remaining variables. Again, a simple example starts with a two-variable system in x and y whose equations are driven by noise processes U and V. This system of two equations can be converted to one by introducing the memory kernel K:

    ẋ(t) = f̃(x(t)) + ∫_0^t K(x(t − s), s) ds + n(x(0), y(0), t),

where f̃(x(t)) represents a Markovian term that depends only on the current state of x, the integral ∫_0^t K(x(t − s), s) ds depends on the entire history of x between 0 and t, and the final term n(x(0), y(0), t) satisfies an auxiliary equation. Further details about this process are provided in [44].

The analogue of a Mori–Zwanzig type process in the GLV setting (S = 2, s = 1) yields a single equation of this form for x, with z defined as in the previous subsection. Note that, like the algebraic reduction, we now have a single differential equation in terms of x. In this case, the variable y is exchanged for the memory of x.
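To make the algebraic exchange concrete, the following small symbolic sketch (using sympy; the setup and variable names are ours) eliminates y from the two-species GLV system above, leaving a single second-order equation in x and its derivatives.

```python
import sympy as sp

t = sp.symbols('t')
x = sp.Function('x')(t)
r1, r2, a11, a12, a21, a22 = sp.symbols('r1 r2 a11 a12 a21 a22')

# From x' = x*(r1 + a11*x + a12*y), exchange y for x and its derivative:
y = (sp.Derivative(x, t)/x - r1 - a11*x)/a12

# Substituting into y' = y*(r2 + a21*x + a22*y) leaves one second-order
# ODE in x alone; 'ode' below is that (messy, unsimplified) equation.
ode = sp.Eq(sp.diff(y, t), y*(r2 + a21*x + a22*y))
print(sp.simplify(ode))
```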