High-Dimensional Dynamic Stochastic Model Representation
Eftekhari, Aryan; Scheidegger, Simon
2022-02-14

Abstract. We propose a scalable method for computing global solutions of nonlinear, high-dimensional dynamic stochastic economic models. First, within a time iteration framework, we approximate economic policy functions using an adaptive, high-dimensional model representation scheme, combined with adaptive sparse grids to address the ubiquitous challenge of the curse of dimensionality. Moreover, the adaptivity within the individual component functions increases sparsity, since grid points are added only where they are most needed, that is, in regions with steep gradients or at nondifferentiabilities. Second, we introduce a performant vectorization scheme for the interpolation compute kernel. Third, the algorithm is hybrid parallelized, leveraging both distributed- and shared-memory architectures. We observe significant speedups over state-of-the-art techniques, and almost ideal strong scaling up to at least $1,000$ compute nodes of a Cray XC$50$ system at the Swiss National Supercomputing Center. Finally, to demonstrate our method's broad applicability, we compute global solutions to two variants of a high-dimensional international real business cycle model with up to $300$ continuous state variables. In addition, we highlight a complementary advantage of the framework, which allows for an a priori analysis of the model complexity.

1. Introduction. Motivated by empirical observations, features such as heterogeneity, interconnectedness, and uncertainty have become vital ingredients for capturing the salient features of contemporary dynamic economic models. In macroeconomics, for instance, heterogeneity between types of agents, such as hand-to-mouth and non-hand-to-mouth consumers (see, e.g., [23]), financial frictions such as collateral constraints (see, e.g., [29]), or distributional channels (see, e.g., [27]) have become widely recognized as essential components of modern macroeconomic models. The need for heterogeneity has become even more evident during the current COVID-19 pandemic, where unprecedented policy actions have to be taken to mitigate this once-in-a-century event (see, e.g., [19]).

To address today's key questions in economics quantitatively, one quickly ends up with an intricate formal structure that relies on considering so-called recursive equilibria [50, 33]. In such equilibria, a potentially high-dimensional state variable $x \in X \subset \mathbb{R}^d$ represents the state of the economy, $d$ is the dimensionality of the state space, and a time-invariant optimal policy function $p : X \to Y \subset \mathbb{R}^m$, the desired unknown, captures the model dynamics and can be characterized as the solution to a functional equation that reads [14]:

$$H(p)(x) = 0 \quad \text{for all } x \in X. \tag{1.1}$$

This abstract description nests various characterizations of recursive equilibria, and in particular the widespread case where the operator $H$ captures discrete-time first-order equilibrium conditions, the focal point of this paper. A standard method for solving such dynamic stochastic economic models is the so-called time iteration algorithm [9], which computes the recursive equilibrium of a dynamic economic model by guessing a policy function and iteratively updating it using the first-order equilibrium conditions of the model.
However, solving for global solutions¹ of models with substantial heterogeneity and highly nonlinear policy functions is very costly for two key reasons. First, no matter which model characterization is used, the curse of dimensionality [2] imposes a roadblock as soon as $X$ is of a higher dimension (say $d > 3$) and a global solution has to be computed, where the equilibrium conditions need to be satisfied throughout the entire computational domain. A grid-based solution technique that relies on a naive tensor-product construction requires $O(M^d)$ points when $M$ points are needed in each dimension. This exponential growth makes tensor-product grids infeasible as soon as $M$ and $d$ reach moderate levels. Second, at each grid point, a system of nonlinear equations must be solved. When solving the system of equations at a given grid point, one needs to frequently interpolate from the policy function computed in the previous iteration step. These interpolation operations can account for up to 99% of the overall compute time for solving the system of equations [47]. These two impediments make it difficult to achieve an acceptable time-to-solution, which can quickly reach the order of days on modern supercomputing facilities. As a result, present methods frequently fall well short of including as much heterogeneity as a reasonable modeling choice would imply.²

To deal with the ever-increasing complexity of state-of-the-art dynamic stochastic economic models, we propose in this work a generic, scalable, and flexible computational framework which can efficiently address the previously noted bottlenecks. Building on [47, 6, 5] and [12], we specifically contribute (i) an adaptive high-dimensional model representation scheme, coupled with an adaptive sparse grid algorithm, applicable to recursively formulated economic models, which significantly reduces both the number of grid points in the approximation and the time needed for each function evaluation; (ii) adaptivity criteria which can be used as an on-the-fly analysis tool elucidating the complexity of the model under consideration; (iii) a vectorized implementation for performant interpolation; and (iv) a hybrid parallelized time iteration solution framework fit for virtually any dynamic stochastic economic model. Finally, (v) we deploy our solution framework at the Swiss National Supercomputing Center (CSCS) to solve highly nonlinear dynamic stochastic economic models of up to 300 dimensions globally.

The grid point reduction is achieved by combining adaptive sparse grids (adaptive SGs; see, e.g., [7, 42, 34]) with a dimensional decomposition (DD) framework that is based on high-dimensional model representation (see, e.g., [32, 35, 53]). Finally, the time-to-solution is substantially accelerated by using a hybrid parallelization scheme (i.e., using both distributed- and shared-memory hardware) combined with a novel vectorization approach for fast interpolation on the policy functions.

¹A global solution adheres to the model equilibrium conditions throughout the entire state space, that is, the computational domain, whereas a local solution is only concerned with the local approximation around a steady state.

²For example, [26] examined the welfare implications of social security reforms using a model in which one period amounted to six years rather than one, thus lowering the number of adult cohorts, and therefore the problem's dimensionality, by a factor of six. Similarly, international real business cycle models often incorporate only a small number of countries. For instance, [3] studied cross-country risk-sharing at the business cycle frequency using a two-country model with one focus country and the rest of the world. By reducing the problem's dimensionality in this way, valuable qualitative insights can be gained. However, in order to obtain reliable quantitative results, or simply to assess the robustness of qualitative findings, it is frequently necessary to consider problems of larger dimensions.
SGs can alleviate the curse of dimensionality to some extent, allowing one to tackle models that incorporate rich economic settings, including international real business cycle (IRBC) models of up to 50 countries, that is, 100 dimensions [6]. Furthermore, adaptive SGs can resolve steep gradients or nondifferentiabilities efficiently, making them useful in the context of solving a broad range of mid-scale economic models (say $d < 20$) with non-smooth policy functions that arise, for example, in the presence of collateral constraints on borrowing (see [4] for a review of the use of adaptive SGs in economics and finance). However, the limitations of adaptive SGs become evident when one considers highly nonlinear economic models with much more heterogeneity, for example, IRBC models that consist of dozens of countries facing irreversible investment constraints. In such situations, adaptive SGs are no longer an applicable solution method, as the number of grid points increases substantially with the approximation quality or, equivalently, with the resolution of the grid. Furthermore, in high-dimensional SGs, access times of the data structures become computationally expensive (see, e.g., [40]).

The noted issues can be overcome by combining DD (see, e.g., [31, 44, 17]) with adaptive SGs, which we refer to as DDSG. The core idea of DD is to approximate a function by decomposing it into a series of lower-dimensional functions. This decomposition gives a way to represent, in the simplest case, a 300-dimensional function as a summation of 300 one-dimensional functions. We will focus on one variant of DD referred to in the literature as High-Dimensional Model Representation (HDMR). The HDMR technique has been applied in multiple fields, including chemistry (see, e.g., [53]), physics (see, e.g., [35]), and machine learning (see, e.g., [39]). To the best of our knowledge, however, HDMR has so far gone unrecognized in the field of economics.

The remainder of this paper is organized as follows. In Section 2, we provide a very brief review of the related literature on global solution techniques for high-dimensional dynamic economic models. In Section 3, we describe the abstract structure of the models we aim to solve with our proposed method and specify a conceptually simple yet computationally demanding economic test case: the IRBC model. The latter has become the de facto workhorse for studying methods for solving high-dimensional economic models, as its dimensionality can be scaled up in a straightforward and meaningful way, since it depends linearly on the number of countries considered. In Section 4, we detail the proposed DDSG method. Next, in Section 5, we embed DDSG in a time iteration algorithm and discuss the hybrid parallelization scheme of the complete method.
In Section 6, we support the performance claims and applicability of the proposed solution framework by providing a series of unit tests and global solution results for two different configurations of high-dimensional IRBC models. Finally, in Section 7, we conclude.

2. A brief literature review on global solution methods. Over the past two decades, there have been significant advancements in the development of algorithms and numerical tools to compute global solutions for high-dimensional dynamic stochastic economic models (see [36, 14] for recent reviews). To overcome the inherent curse of dimensionality, the computational economics community has pursued two main strands of research: (i) SG-based solution algorithms and (ii) grid-free methods.

SG methods [8] are a mathematically well-studied, systematic way to tackle the numerical difficulties that arise in dynamic economic models due to the high-dimensional state spaces. However, they typically fail in real applications if the dimensionality of a highly nonlinear model exceeds about 20 (see [4] for a recent review). Yet such problem sizes are often required from a theoretical point of view, for instance, in annually calibrated multi-country overlapping generations models with borrowing constraints, a 240-dimensional problem in its minimal formulation. In contrast, DDSG can, as we show below, handle problem sizes of at least 300 continuous state variables.

Grid-free approaches have been proposed, for example, by [11] and have lately become more powerful by leveraging the rapid developments in machine learning. In [45, 25] and [46, 28], the authors combine Gaussian processes with reinforcement learning and active subspaces, respectively. However, these methods typically cannot deal in a straightforward manner with frictions such as irreversible investments and the related nonlinearities (as present in the IRBC model in Section 3.1.2), whereas DDSG can. Other research in computational economics has recently applied deep neural networks to compute global solutions to high-dimensional dynamic stochastic models (see, e.g., [1, 37, 13]). While these methods could be considered an alternative to the work presented here, they still suffer from several drawbacks that limit their general applicability. Their convergence properties, for example, with respect to the network architecture, are still poorly understood, thus often requiring a substantial amount of hyper-parameter tuning for a successful model solution. In contrast, DDSGs provide a transparent way to handle high-dimensional models. Finally, nesting the DD approach within the time iteration algorithm is not restricted to SGs. Alternative solution methods for low-dimensional economic models could also be scaled up in a relatively straightforward way if embedded in DD, in turn providing a simple means to extend the boundaries of research.

3. Large-scale dynamic stochastic economic models. This section outlines the types of models and the recursive solution techniques that we use below to demonstrate the versatility, accuracy, and computational scalability of the method introduced in this paper. To do so, we proceed in two main steps. First, we begin in Section 3.1 by briefly describing smooth and non-smooth variants of the IRBC model as concrete test cases, thereby closely following [5].
The latter has recently become a workhorse model for studying methods for solving high-dimensional dynamic stochastic models (see, e.g., [22, 10], and references therein). The IRBC model is straightforward to explain, has a unique solution [50], and its dimensionality can be scaled up in a meaningful way, as it depends linearly on the number of countries considered. This property of the model allows us to concentrate on the computational issues of handling high-dimensional state spaces. To demonstrate that we can also handle non-smooth problems, we additionally consider a version of the IRBC model where investment is irreversible. Second, we present in Section 3.2, based on the example of the IRBC models, how a recursive equilibrium can be computed by applying the time iteration algorithm, and we discuss the computational challenges associated with solving for global solutions of large-scale dynamic stochastic economic models, which motivate the development of our proposed DDSG method.

3.1. A scalable test case: the international real business cycle model. In the IRBC model we consider as a test case, time $t$ is discrete, and there are $N$ countries, $j = 1, \ldots, N$, each using its accumulated capital stock, $k_{j,t} \in \mathbb{R}_+$, to produce an output good, which can be used for investment, $I_{j,t}$, and for consumption, $c_{j,t} \in \mathbb{R}_+$, generating utility $u_j(c_{j,t})$. In the model formulation we follow, complete markets are assumed [24]. Thus, a social planner solves the following (infinite-horizon) optimization problem:

$$\max_{\{c_{j,t},\, k_{j,t+1}\}} \; E_0 \sum_{t=0}^{\infty} \beta^t \sum_{j=1}^{N} \tau_j\, u_j(c_{j,t}), \tag{3.1}$$

subject to

$$\sum_{j=1}^{N} R(a_{j,t}, k_{j,t}, k_{j,t+1}, c_{j,t}) \geq 0, \tag{1}$$

$$k_{j,t+1} - (1-\delta)\, k_{j,t} \geq 0, \quad j = 1, \ldots, N, \tag{2}$$

where $a_{j,t}$ denotes productivity, $\tau_j$ are the welfare weights, $\beta$ is the discount factor, and $E_0$ is the expectation conditional on the information available at $t = 0$; the initial capital stocks $k_0 \in \mathbb{R}^N_+$ and productivity levels $a_0 \in \mathbb{R}^N_+$ are assumed to be given. The parameterization of both the smooth and the non-smooth IRBC model follows [22, 6] and is reported in Table 1. The first constraint (1) is the so-called aggregate resource constraint, while the second constraint (2) enforces irreversible investment. We speak of the smooth IRBC model when only constraint (1) is used. In contrast, the non-smooth IRBC model requires the solution of (3.1) with both constraints (1) and (2). Furthermore, we assume an additively separable per-period utility function given by

$$u_j(c_{j,t}) = \frac{c_{j,t}^{1-1/\gamma_j}}{1 - 1/\gamma_j}.$$

The aggregate resource constraint is composed of the production function $Y$, investment $I_{j,t}$, the capital adjustment costs $\Gamma$, and consumption $c_{j,t}$:

$$R(a_{j,t}, k_{j,t}, k_{j,t+1}, c_{j,t}) = Y(a_{j,t}, k_{j,t}) - \Gamma(k_{j,t}, k_{j,t+1}) - I_{j,t}(k_{j,t}, k_{j,t+1}) - c_{j,t},$$

$$Y(a_{j,t}, k_{j,t}) = A \cdot a_{j,t} \cdot k_{j,t}^{\alpha}, \qquad I_{j,t}(k_{j,t}, k_{j,t+1}) = k_{j,t+1} - (1-\delta) \cdot k_{j,t}. \tag{3.2}$$

The law of motion of productivity is the sole source of stochasticity in the model and is given by

$$\ln a_{j,t} = \rho \cdot \ln a_{j,t-1} + \sigma \cdot (e_{j,t} + e_t), \tag{3.4}$$

where $e_{j,t} \sim \mathcal{N}(0,1)$ and $e_t \sim \mathcal{N}(0,1)$ denote the country-specific and the global shock, respectively. Both are assumed to be independent of each other and across time.

Thus far, we have considered an infinite-horizon problem. However, as indicated previously, economics frequently focuses on recursive equilibria [50, 33], in which the state of the economy is represented by a state variable and the economy's dynamics are given by a time-invariant function of this state (cf. (1.1)). We now briefly describe the recursive structure, that is, the first-order optimality conditions (FOCs) of the two IRBC model formulations. We direct the reader to [6] for the derivations.
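To make the model primitives concrete, the following is a minimal Python sketch of the production function, investment, and productivity process from (3.2) and (3.4). The parameter values are placeholders rather than the calibration of Table 1, and the adjustment cost $\Gamma$ is omitted, since its functional form is not reproduced above.

```python
import numpy as np

# Illustrative parameter values only; the paper's calibration is in Table 1.
A, alpha, delta = 1.0, 0.36, 0.06
rho, sigma = 0.95, 0.01

def production(a, k):
    """Output Y(a, k) = A * a * k^alpha, cf. (3.2)."""
    return A * a * k**alpha

def investment(k, k_next):
    """Investment I(k, k') = k' - (1 - delta) * k, cf. (3.2)."""
    return k_next - (1.0 - delta) * k

def productivity_step(a, rng):
    """One draw of ln a' = rho * ln a + sigma * (e_j + e), cf. (3.4).

    e_j are country-specific shocks; e is a common global shock."""
    N = len(a)
    e_j = rng.standard_normal(N)   # country-specific shocks
    e = rng.standard_normal()      # global shock, shared by all countries
    return np.exp(rho * np.log(a) + sigma * (e_j + e))

rng = np.random.default_rng(0)
a = np.ones(4)                     # N = 4 countries at steady-state productivity
print(productivity_step(a, rng))
```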
3.1.1. Smooth IRBC model: recursive structure. In order to obtain the FOCs of the optimization problem stated in equation (3.1), we differentiate the Lagrangian with respect to $c_{j,t}$ and $k_{j,t+1}$. Denoting by $\lambda_t$ the multiplier on the resource constraint at time $t$, and defining the growth rate of capital by $g_{j,t} = k_{j,t}/k_{j,t-1} - 1$, we obtain a system of $N$ equilibrium conditions that have to hold at each $t$ and for all countries $j$, where $E_t$ denotes the expectation conditional on the information available at $t$. Furthermore, the aggregate resource constraint, which holds with equality due to the strictly increasing per-period utility assumed, must be satisfied, where we use the fact that $c_{j,t} = (\lambda_t/\tau_j)^{-\gamma_j}$ holds at an optimal choice [6].

To explicitly point out the link to the abstract definition of a recursive equilibrium (1.1) (and the time iteration Algorithm 3.1 described in Section 3.2 below), note that the smooth IRBC model presented here has a ($d = 2N$)-dimensional state space. As a reminder, the state variables are given by $x_t = (a_t, k_t) \in \mathbb{R}^{2N}$, where $a_t = (a_{1,t}, \ldots, a_{N,t})$ and $k_t = (k_{1,t}, \ldots, k_{N,t})$ are the productivity levels and capital stocks of the $N$ countries. Furthermore, the optimal, time-invariant policy $p : \mathbb{R}^{2N} \to \mathbb{R}^{N+1}$, the desired unknown, maps the current state into policies as $p(x_t) = (k_{t+1}, \lambda_t)$. Note that the investment choices determine the capital stock of the next period (i.e., $t+1$) in a deterministic way through (3.2). In contrast, the law of motion of productivity, (3.4), is stochastic. Taken together, (3.2) and (3.4) specify the distribution of $x_{t+1}$.⁵

⁵Throughout this paper, we compute the expectations by a simple yet fast monomial quadrature rule (see, e.g., [20], Sec. 7.5).

3.1.2. Non-smooth IRBC model: recursive structure. To demonstrate the strength of our proposed algorithm in dealing with high-dimensional and highly nonlinear models, we next consider a variant of the IRBC model where investment is irreversible, which in turn leads to non-smooth optimal policies. More precisely, we assume that investment cannot be negative (cf. constraint (2) in equation (3.1)). As a direct consequence, we have to solve a system of $2N + 1$ equilibrium conditions. These conditions now include the Karush-Kuhn-Tucker (KKT) multiplier, $\mu_{j,t}$, for the irreversibility constraint. The optimality conditions for investment in capital, the irreversibility condition for investment in each country $j$, and the associated complementarity conditions are collected in (3.7). In addition, the aggregate resource constraint holds again. The state variables of this non-smooth IRBC model are again given by $x_t = (a_t, k_t)$. The optimal, time-invariant policy $p : \mathbb{R}^{2N} \to \mathbb{R}^{2N+1}$ now maps the current state into policies as $p(x_t) = (k_{t+1}, \mu_t, \lambda_t)$, where $\mu_t = (\mu_{1,t}, \ldots, \mu_{N,t})$. As in the previous Section 3.1.1, all the policies have to be determined by iterating on (3.7) and the aggregate resource constraint.
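Although the explicit system (3.7) is not reproduced above, its irreversibility block has the standard complementarity structure: at every state, investment is nonnegative, the multiplier is nonnegative, and their product vanishes. The sketch below is a minimal illustration under exactly that assumption, not the paper's solver; `kkt_residual` is a hypothetical helper.

```python
import numpy as np

delta = 0.06  # illustrative depreciation rate

def kkt_residual(k, k_next, mu):
    """Residuals of the irreversibility block: I >= 0, mu >= 0, mu * I = 0.

    Returns the worst violation over countries; zero means the
    complementarity conditions hold at this state."""
    inv = k_next - (1.0 - delta) * k           # investment I_{j,t}
    primal = np.minimum(inv, 0.0)              # violation of I_{j,t} >= 0
    dual = np.minimum(mu, 0.0)                 # violation of mu_{j,t} >= 0
    slack = mu * inv                           # complementary slackness
    return max(np.abs(primal).max(), np.abs(dual).max(), np.abs(slack).max())

# A state where country 0 is at the constraint (zero investment, positive multiplier)
k = np.array([1.0, 1.0])
k_next = np.array([(1.0 - delta) * 1.0, 1.02])
mu = np.array([0.05, 0.0])
print(kkt_residual(k, k_next, mu))  # -> 0.0
```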
3.2. The time iteration algorithm and its computational challenges. In this section, we briefly introduce the time iteration algorithm [9]. The latter solves for a recursive equilibrium of a dynamic economic model by guessing a policy function and iteratively updating it using the first-order equilibrium conditions of a given model, such as those presented in Sections 3.1.1 and 3.1.2. For the sake of brevity, the discussion that follows only considers the non-smooth IRBC model. However, the algorithmic logic carries over to any model formulated analogously.

Let $x_t \in X \subset \mathbb{R}^d$ denote again the state of the economy at discrete time $t$. Recall that in the model under consideration, the economy is represented by a ($d = 2N$)-dimensional state variable $x_t = (a_t, k_t)$, and that the optimal policy function $p : X \to \mathbb{R}^{2N+1}$ maps the state variable to the capital choices, the KKT multipliers for the irreversible investment constraints, and the Lagrange multiplier for the aggregate resource constraint, that is,

$$p(x_t) = (k_{t+1}, \mu_t, \lambda_t). \tag{3.8}$$

The goal of the time iteration algorithm is to determine an approximate optimal policy function on the entire domain $X$, that is, to find a global solution to the problem stated in expression (3.1) (see, e.g., [21], Section 17.8 for further details). Such a policy function must satisfy the FOCs throughout the state space, that is,

$$H(p)(x_t) = 0 \quad \text{for all } x_t \in X. \tag{3.9}$$

Algorithm 3.1 Time iteration algorithm
1: guess an initial policy function $\tilde{p}$
2: repeat
3: for all grid points $x_t$ of the discretized state space do
4: solve the system of FOCs at $x_t$, interpolating $k_{t+2}$, $\mu_{t+1}$, and $\lambda_{t+1}$ from $\tilde{p}$
5: end for; assemble the updated policy $\tilde{p}'$ and set $\tilde{p} \leftarrow \tilde{p}'$
6: until EulerEquationErrors($\tilde{p}'$) $< \epsilon$
7: return $\tilde{p}'$

The time iteration procedure presented in Algorithm 3.1 iteratively computes the approximate optimal policy function $\tilde{p}'$, a numerical proxy for the true $p$, by using the previous policy iterate (or initial guess) $\tilde{p}$ to interpolate the values $k_{t+2}$, $\mu_{t+1}$, and $\lambda_{t+1}$ (cf. expression (3.8)). The main components of the algorithm can be found in steps (3)-(5). In contrast to equation (3.9), we approximate $\tilde{p}$ here; thus, the FOCs hold exactly (up to numerical precision) only at the discrete grid points $x_t \subset X$. At each grid point where we solve (3.9), we require the solution of a system of $2N + 1$ nonlinear equations, that is, of the FOCs presented in Section 3.1.2. In order to solve this nonlinear set of equations, a large number of interpolated values from $\tilde{p}$ are required, as illustrated in Figure 1. Finally, in step (6), the time iteration algorithm is stopped conditional on the so-called Euler equation errors being smaller than a given accuracy threshold $\epsilon$. In economics, the latter criterion is a commonly used, unit-free measure of how accurately the equilibrium conditions are numerically satisfied (see, e.g., [6], Appendix C, or [14], Section 7.2 for further details). This process is repeated, swapping in the updated policy function, until the threshold is satisfied.

The fundamental operations of Algorithm 3.1, which become computationally restrictive in a high-dimensional setting, are (i) generating the approximate policy function and (ii) costly interpolations from the policy function $\tilde{p}$ when solving the system of nonlinear equations. Notice that the policy function we are interested in is high-dimensional and multivariate. In addition, due to the concavity assumptions on the utility and production functions, the optimal policy $p$ will also be nonlinear (cf. Section 3.1.2). Hence, approximating it only locally might provide misleading results. For such applications, we need a global solution and a performant interpolation method that approximates $p$ over the entire state space $X$. To do this efficiently, we will employ DDSG in combination with a hybrid parallelization scheme, presented in Sections 4 and 5, respectively.
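To make the loop structure of Algorithm 3.1 concrete, here is a minimal Python sketch of generic time iteration. The helpers `solve_focs` (solving the $2N+1$ FOCs at one grid point while interpolating from the previous iterate) and `euler_errors` are hypothetical stand-ins for model-specific routines, not interfaces from the paper's code.

```python
def time_iteration(grid, p_guess, solve_focs, euler_errors, fit_interpolant,
                   eps=1e-4, max_iter=500):
    """Generic time iteration: update the policy until the Euler
    equation errors fall below the accuracy threshold eps."""
    p_tilde = p_guess
    for _ in range(max_iter):
        # Solve the first-order conditions at every grid point, using
        # the previous iterate p_tilde for next-period quantities.
        values = [solve_focs(x, p_tilde) for x in grid]
        # Fit a new interpolant (e.g., a DDSG approximation) to the updates.
        p_new = fit_interpolant(grid, values)
        if euler_errors(p_new) < eps:   # step (6): accuracy check
            return p_new
        p_tilde = p_new                 # swap policies and repeat
    return p_tilde
```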
4. Function approximation. The time iteration algorithm outlined in the previous section poses multiple computational challenges, as it requires repeated approximation of, and interpolation on, a potentially high-dimensional policy function. This section outlines a function approximation method that addresses these concerns by using adaptive SGs and DD, which we discuss in Sections 4.1 and 4.2, respectively. With this background, we then combine the two numerical techniques in Section 4.3, resulting in the DDSG approximation scheme for high-dimensional functions. We note that various forms of SG and DD exist; we will outline each method's key underlying concepts in a condensed fashion. For thorough introductions to SGs and DD, we refer to [8] and [32, 44, 35], respectively.

4.1. Adaptive sparse grids. We consider the approximation of a function $f : [0,1]^d \to \mathbb{R}$, where $d$, in our case, is the number of continuous state variables in the economic model of interest, a potentially large number. For the sake of brevity, we assume that the function value is zero on the domain's boundary. This is not a necessary condition, and it can easily be relaxed by augmenting the basis functions (see, e.g., [47]).

First, consider a one-dimensional domain discretized with grid spacing $h_l = 2^{-l}$. The grid points are located at $x_{l,i} = i \cdot h_l$, where $i \in \{1, \ldots, 2^l\}$ and $l \in \mathbb{N}_+$ are the grid point index and the refinement level, respectively. Using the standard hat function $\phi(x) = \max(1 - |x|, 0)$, we define a family of univariate basis functions as

$$\phi_{l,i}(x) = \phi\!\left(\frac{x - x_{l,i}}{h_l}\right). \tag{4.1}$$

The one-dimensional basis functions can be extended to a $d$-dimensional domain by introducing the multi-indices $\mathbf{i} = (i_1, \ldots, i_d)$ and $\mathbf{l} = (l_1, \ldots, l_d) \in \mathbb{N}^d_+$. The grid points are now denoted by $x_{\mathbf{l},\mathbf{i}} = (x_{l_1,i_1}, \ldots, x_{l_d,i_d})$, and the corresponding $d$-linear hierarchical basis function is constructed by a tensor product, that is,

$$\phi_{\mathbf{l},\mathbf{i}}(x) = \prod_{j=1}^{d} \phi_{l_j,i_j}(x_j). \tag{4.2}$$

Next, we introduce the hierarchical index set $I_{\mathbf{l}}$ and the corresponding hierarchical subspace $W_{\mathbf{l}}$, which are given by

$$I_{\mathbf{l}} = \{\mathbf{i} : 1 \leq i_j \leq 2^{l_j},\ i_j \text{ odd}\}, \qquad W_{\mathbf{l}} = \mathrm{span}\{\phi_{\mathbf{l},\mathbf{i}} : \mathbf{i} \in I_{\mathbf{l}}\}. \tag{4.3}$$

Notice that the odd increments of $i_j$ result in the mutually disjoint supports of the basis functions that cover the entire domain. For the space of piecewise linear functions, we can construct a corresponding equidistant Cartesian grid, also called a full grid,

$$V_\ell = \bigoplus_{|\mathbf{l}|_\infty \leq \ell} W_{\mathbf{l}}, \tag{4.4}$$

with $M = 2^{\ell}$ grid points in each dimension, where $\ell$ denotes the maximum refinement level. Approximations using the full grid have an $L_2$ interpolation error of $O(M^{-2})$, and the number of grid points is of $O(M^d)$ [8]. It is clear that the full grid suffers from the curse of dimensionality and is thus not a scalable approach for high-dimensional function approximation. SGs can alleviate this issue; their underlying construction principle is to systematically eliminate those hierarchical increment spaces which contribute only little to the overall quality of the approximation [8]. It can be shown that the SG space is given by

$$V^{\mathrm{SG}}_\ell = \bigoplus_{|\mathbf{l}|_1 \leq \ell + d - 1} W_{\mathbf{l}}. \tag{4.5}$$

In contrast to the full grid space in (4.4), where the maximum grid refinement level in any dimension is restricted by $|\mathbf{l}|_\infty \leq \ell$, now the sum of the refinement levels is restricted. The SG-based interpolation of a function $f$ at a point $x$, for a maximum refinement level $\ell$, can be uniquely expressed as

$$f(x) \approx \mathcal{I}f(x) = \sum_{|\mathbf{l}|_1 \leq \ell + d - 1}\ \sum_{\mathbf{i} \in I_{\mathbf{l}}} \alpha_{\mathbf{l},\mathbf{i}}\, \phi_{\mathbf{l},\mathbf{i}}(x), \tag{4.6}$$

where the coefficients $\alpha_{\mathbf{l},\mathbf{i}} \in \mathbb{R}$, commonly referred to as the hierarchical surpluses, can be readily computed (see, e.g., [7] for details). Under certain technical assumptions (the function to be approximated needs to exhibit bounded second-order mixed derivatives), the SG approximation error is $O(M^{-2}(\log M)^{d-1})$, whereas the number of grid points grows as $O(M (\log M)^{d-1})$. Thus, the number of grid points in an SG is significantly reduced, whereas the error deteriorates only slightly. Finally, quadrature on SGs can be performed very effectively; that is,

$$Qf = \int_{[0,1]^d} \mathcal{I}f(x)\, dx \tag{4.7}$$

can be readily evaluated by integrating the basis functions in (4.2), a key property of SGs that we will leverage in the sections to come.
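To make the construction tangible, the following self-contained Python sketch implements the univariate hat basis (4.1) and contrasts the level selection of the full grid (4.4) with that of the sparse grid (4.5) by counting grid points. It is a didactic toy, not the optimized SG kernel of [47].

```python
import itertools

def hat(x, l, i):
    """Univariate hierarchical hat basis phi_{l,i}, cf. (4.1)."""
    h = 2.0 ** (-l)
    return max(1.0 - abs(x / h - i), 0.0)

def levels(d, ell, sparse=True):
    """Multi-indices l kept by a sparse grid (|l|_1 <= ell + d - 1)
    versus a full grid (max_j l_j <= ell)."""
    grids = itertools.product(range(1, ell + 1), repeat=d)
    if sparse:
        return [l for l in grids if sum(l) <= ell + d - 1]
    return list(grids)

def count_points(d, ell, sparse=True):
    """Each subspace W_l contributes prod_j 2^(l_j - 1) odd indices, cf. (4.3)."""
    total = 0
    for l in levels(d, ell, sparse):
        pts = 1
        for lj in l:
            pts *= 2 ** (lj - 1)
        total += pts
    return total

# d = 5, ell = 6: the sparse grid keeps far fewer points than the full grid.
print(count_points(5, 6, sparse=True), count_points(5, 6, sparse=False))
```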
In summary, the SG approximation method is computationally efficient for the interpolation and quadrature of sufficiently smooth functions on moderately high-dimensional domains. However, in situations where $f$ is highly nonlinear and shows distinct local features, a high resolution level is required, but only in particular locations of the domain, which renders the ordinary SG inefficient. Adaptive SGs can cope with this issue to some extent. Unlike the ordinary SG, which has an a priori selection of grid points, adaptive SGs utilize an a posteriori refinement which, based on a local error estimator, selects the grid points in the SG structure that are to be refined further [41, 34]. For a predefined threshold $\gamma_\epsilon \in \mathbb{R}_+$ and

$$\gamma = \|\alpha_{\mathbf{l},\mathbf{i}}\|_\infty, \tag{4.8}$$

we deem a grid point to be significant if $\gamma > \gamma_\epsilon$. The refinement choice is governed by (4.8); however, depending on the application, more sophisticated criteria may need to be imposed for an efficient refinement [48]. If a grid point is not accepted, all grid points that fall within its support are excluded at the higher refinement level.

A parallel implementation of adaptive SGs is a computationally efficient technique to tackle nonlinear economic models of moderately high dimensions, say $d \lesssim 100$ in the case of the smooth IRBC models, or $d \lesssim 20$ in the non-smooth IRBC models [6]. However, when working with models of significantly higher complexity, such as those with state-space dimensions $> 100$ or policy functions that exhibit steep gradients, SGs, adaptive or not, are no longer a practical tool for function approximation. In particular, the data structure becomes computationally demanding to operate on (see, e.g., [40]). Furthermore, increasing the refinement level to resolve non-smooth features in high-dimensional settings significantly increases the number of grid points, thus quickly rendering the approximation uncomputable [6].

4.2. Dimensional decomposition. In this section, we outline the DD approach, which directly targets the curse of dimensionality by attempting to model the input-output behavior of a high-dimensional function as a sum of low-dimensional component functions:

$$f(x) = f_\emptyset + \sum_{i} f_i(x_i) + \sum_{i<j} f_{ij}(x_i, x_j) + \cdots + f_{12\ldots d}(x_1, x_2, \ldots, x_d), \tag{4.9}$$

where $f_\emptyset$ is a constant, $f_i(x_i)$ models the independent contribution of $x_i$, $f_{ij}(x_i, x_j)$ the pairwise dependent contribution, and so on, up to the last term, $f_{12\ldots d}(x_1, x_2, \ldots, x_d)$, which accounts for the residual contributions. In its complete form, the summation in (4.9) is exact, as the last term accounts for all contributions of all the input variables. This representation is referred to as High-Dimensional Model Representation (HDMR) and is a general method for function decomposition that captures high-dimensional input-output system behavior [44, 32, 18, 31]. The summation in (4.9) can be compactly written as

$$f(x) = \sum_{u \subseteq \{1,\ldots,d\}} f_u(x_u). \tag{4.10}$$

The terms in the summation in (4.10) are categorized by the expansion order $k = |u|$ or, equivalently, by the dimension of $x_u$. Suppose this summation can be truncated at some maximum expansion order $K \ll d$ without significant degradation of the approximation quality. In that case, one can reduce the overall dimensionality of the function. For example, if we consider a 100-dimensional function, its first-order expansion, $k = 1$, will result in 100 one-dimensional functions. With this said, selecting the appropriate $K$ for the decomposition will undoubtedly depend on $f$.
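The price of each additional expansion order is combinatorial: an order-$K$ truncation of (4.10) keeps $\sum_{k=0}^{K} \binom{d}{k}$ component functions. The short sketch below tabulates this count for a few illustrative cases.

```python
from math import comb

def n_component_functions(d, K):
    """Number of terms in an HDMR expansion of a d-dimensional
    function truncated at maximum expansion order K, cf. (4.10)."""
    return sum(comb(d, k) for k in range(K + 1))

for d in (100, 300):
    for K in (1, 2, 3):
        print(f"d={d:3d} K={K}: {n_component_functions(d, K):,} component functions")
# d=300, K=1 gives 301 low-dimensional terms; K=2 already adds 44,850 pairs.
```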
Two popular versions of HDMR are cut- and ANOVA-HDMR [43, 32]. We will define the component functions for both types of HDMR, but our focus will be on cut-HDMR. As we will see, the cut-HDMR approach is the more fitting one for our application due to its reliance on function evaluations, as opposed to ANOVA-HDMR, which requires high-dimensional numerical integration.

Let $w(x) = \prod_{i=1}^{d} w_i(x_i)$ be a product measure, with each $w_i(x_i)$ having unit volume. By sequentially ascending through the expansion orders, starting from the zeroth order, the optimally and uniquely defined HDMR component function $f_u$ depends only on the lower-order component functions $f_v(x_v)$ for $v \subset u$. This is particularly important, as the contrary would eliminate any reduction in dimensionality. This attribute is a result of the orthogonality condition

$$\int f_u(x_u)\, w_i(x_i)\, dx_i = 0 \quad \text{for all } i \in u \tag{4.11}$$

imposed on the component functions [44, 18]. The cut-HDMR component functions are based on the Dirac measure,

$$w_i(x_i) = \delta(x_i - \bar{x}_i), \tag{4.12}$$

where $\bar{x} = (\bar{x}_1, \ldots, \bar{x}_d)$ is a reference point in space, referred to as the anchor point. Several techniques exist for the selection of the anchor point [15]. A straightforward approach is to simply choose the anchor point randomly in the high-dimensional space. However, this can affect the approximation (i.e., for $K < d$), and a careless selection may result in large errors (see, e.g., [52] for details on HDMR error analysis). As noted in [49], a suitable anchor point should satisfy

$$f(\bar{x}) \approx \int_{[0,1]^d} f(x)\, dx. \tag{4.13}$$

An appropriate anchor point can thus be selected by sampling $\bar{x}$ such that $f(\bar{x})$ approximates the mean of the function over the domain (see, e.g., [35] for further details). It is important to highlight that $\bar{x}$ is not unique, since a given function could attain the mean value in several parts of the domain. The cut-HDMR component functions can be explicitly evaluated from (4.11) as a telescopic summation [30]:

$$f^{\mathrm{cut}}_u(x_u) = f(x)\big|_{x = \bar{x}\backslash x_u} - \sum_{v \subset u} f^{\mathrm{cut}}_v(x_v). \tag{4.14}$$

We use the notation $x = \bar{x}\backslash x_v$ to refer to assigning $x$ the values of $\bar{x}$ but excluding the indices of $v$. For example, given $x = (x_1, x_2, x_3)$, then $\bar{x}\backslash x_{1,2} = (x_1, x_2, \bar{x}_3)$. In Figure 2, we depict an example two-dimensional function in the left panel and its first-order cut-HDMR approximation in the right panel. Using the Lebesgue measure in place of the Dirac measure in (4.12), the ANOVA-HDMR decomposition is recovered [16], with the component function

$$f^{\mathrm{ANOVA}}_u(x_u) = \int f(x) \prod_{i \notin u} dx_i - \sum_{v \subset u} f^{\mathrm{ANOVA}}_v(x_v). \tag{4.15}$$

Notice that at the $k$-th expansion order, (4.15) requires quadrature across $(d-k)$ dimensions. Computationally, high-dimensional quadrature is a prohibitive operation subject to the same computational challenges we wish to address: the curse of dimensionality. In contrast, the cut-HDMR component functions are easily computed, requiring only function evaluations, which in turn, for an arbitrary input $x$, necessitate interpolation of $f(x)|_{x=\bar{x}\backslash x_v}$. As such, we proceed with the cut-HDMR decomposition, whose component functions we will simply refer to as $f_u(x_u)$. In principle, the representation of the cut-HDMR component functions is independent of the underlying numerical approach. In practice, however, we will need a very efficient numerical approach, since the component functions of high DD expansion orders will, to some extent, be exposed to the curse of dimensionality.
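The telescopic construction (4.14) is easy to state in code for $K = 1$: each one-dimensional component is a slice of $f$ through the anchor point, with the anchor value subtracted. The following self-contained sketch (exact function evaluation, no SG machinery) is a toy illustration, not the paper's implementation.

```python
import numpy as np

def cut_hdmr_first_order(f, x_bar):
    """Build the K = 1 cut-HDMR surrogate of f around anchor x_bar,
    cf. (4.14): f(x) ~ f0 + sum_i [ f(x_bar with x_i free) - f0 ]."""
    f0 = f(x_bar)                                # zeroth-order term f_emptyset

    def surrogate(x):
        total = f0
        for i in range(len(x_bar)):
            z = x_bar.copy()
            z[i] = x[i]                          # vary one coordinate, anchor the rest
            total += f(z) - f0                   # first-order component f_i(x_i)
        return total

    return surrogate

# Additively separable test function: the K = 1 expansion is exact for it.
f = lambda x: np.sum(x**2)
x_bar = np.full(5, 0.5)
f_hat = cut_hdmr_first_order(f, x_bar)
x = np.array([0.1, 0.9, 0.3, 0.7, 0.5])
print(f(x), f_hat(x))                            # identical up to round-off
```

As the example shows, a function that is additively separable is reproduced exactly by the first-order expansion, a property that the expansion criterion below exploits.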
4.3. Dimensional decomposition using adaptive sparse grids. To approximate an economic policy function globally, we follow [35] and [53] and embed adaptive SGs within DD to form the DDSG method. Utilizing adaptive SGs as the underlying numerical method has two desirable properties: (i) they can be applied to moderately high-dimensional component functions with non-smooth features, and (ii) quadrature operations can be carried out efficiently on them. Using the combined DDSG numerical approach, we can approximate the function $f$ as

$$f(x) \approx \sum_{v \subseteq \{1,\ldots,d\},\ |v| \leq K} \mathcal{I}f_v(x_v), \tag{4.16}$$

which is a summation of $|v|$-dimensional adaptive SGs, where we truncate the DD expansion at a maximum expansion order $K \ll d$. The accuracy of the DDSG approximation depends on the underlying function, the SG approximation, and the maximum expansion order $K$, and, in turn, on the number of component functions. At a given expansion order, the number of DDSG component functions grows combinatorially. The number of grid points in the approximation is the summation over the grid points of the lower-dimensional SGs, where $|V^{\mathrm{SG}}_{\ell,k}| = O(M(\log M)^{k-1})$ denotes the number of grid points of a $k$-dimensional SG with maximum refinement level $\ell$ (see Section 4.1 for further details). For $K \ll d$, the number of grid points for DDSG is given by $O(M(\log M)^{K-1} d^K)$. In comparison to SG, we have removed the exponential dependence on $d$; instead, the problem is now exponentially dependent on $K$. Even for moderately sized problems, the number of component functions, each corresponding to an adaptive SG, can pose a computational challenge for values of $K > 1$.

To address this shortcoming, we outline two criteria that efficiently truncate the expansion by ignoring its insignificant component functions. Specifically, we will outline two DDSG adaptivity criteria assessing the relevance of (i) the subsequent expansion order and (ii) each component function within an expansion order. With these criteria, we can generate an adaptive variant of (4.16) by only proceeding to higher expansion orders when required and eliminating insignificant component functions within an expansion order. We emphasize that neither the expansion criterion nor the active dimension selection is a measure of convergence of the DD expansion. These criteria provide an assessment of the importance of the current expansion order or component function with respect to the previously computed values. As such, using the criteria with an aggressively high tolerance may lead to excessive truncation or pruning of the component functions. As discussed in more detail in Section 6.2, these criteria can also be used as an analysis tool to understand the impact of the different component functions and expansion orders on the overall approximation.

The expansion criterion is the relative residual between two consecutive expansion orders $k$ and $k-1$. Given the current expansion order $k$, a predefined convergence threshold $\rho_\epsilon \in \mathbb{R}_+$, and the coefficient

$$\rho = \frac{\big|Qf^{(k)} - Qf^{(k-1)}\big|}{\big|Qf^{(k-1)}\big|}, \tag{4.18}$$

where $f^{(k)}$ denotes the DDSG approximation (4.16) truncated at order $k$, we justify the progression to the next expansion order $k+1$ if $\rho > \rho_\epsilon$. The SG quadrature operation $Q$ is given by (4.7). Under the assumption that $K \ll d$, we can expect that the component functions will not be high-dimensional; thus, quadrature will not pose a computational bottleneck here. It is important to highlight that a truncation of (4.16) is not necessarily an approximation. Indeed, the truncation can be error-free as long as the underlying function is additively separable up to the $K$-th expansion order. This feature is shown in Figure 3, where we display the value of $\rho$ and the relative absolute error with respect to the expansion order $k$ for a 4-dimensional polynomial of varying degree $c$. Both $\rho$ and the relative absolute error reach machine precision when the expansion order reaches $k \geq c$. We note that the expansion criterion is not a measure of convergence, as an increase in the expansion order does not necessarily translate into a decrease in the error.
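The complexity counts above can be made tangible with a few numbers. The sketch below evaluates the leading-order terms $M^d$, $M(\log M)^{d-1}$, and $M(\log M)^{K-1}\binom{d}{K}$ for a representative case; constants are omitted, so these are orders of magnitude, not exact grid sizes.

```python
from math import log, comb

def full_grid(M, d):
    return M ** d

def sparse_grid(M, d):
    return M * log(M) ** (d - 1)

def ddsg(M, d, K):
    # comb(d, K) is the exact number of order-K component index sets,
    # a slightly sharper count than the d^K bound quoted in the text.
    return M * log(M) ** (K - 1) * comb(d, K)

M, d, K = 2 ** 4, 100, 2
print(f"full grid   ~ {full_grid(M, d):.1e}")
print(f"sparse grid ~ {sparse_grid(M, d):.1e}")
print(f"DDSG (K={K}) ~ {ddsg(M, d, K):.1e}")
```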
We identify insignificant component functions within an expansion order by using the active dimension selection criterion. Here, we assume that the underlying function is not constant with respect to any single variable. Thus, we only focus on component function indices with $|u| \geq 2$. Given a predefined active component function threshold $\eta_\epsilon \in \mathbb{R}_+$ and the coefficient $\eta_u$ defined in (4.19), which assesses the relative contribution of the quadrature value of the component function $f_u$ with respect to the previously computed lower-order terms, we deem the component function index $u$ important if $\eta_u > \eta_\epsilon$. Component indices that do not satisfy this condition, as well as any superset of these indices, are not computed. In Figure 4, we schematically depict the active dimension selection criterion for a three-dimensional function and the resulting "pruning" effect on the combinatorial tree of component functions. In the left panel, we see the component functions resulting from a full expansion. In the right panel, we display a scenario for which $\eta_\epsilon > \eta_{1,2}$ holds. In this case, the corresponding component index $\{1,2\}$ is removed from the computation, but so is $\{1,2,3\}$, as $\{1,2\} \subset \{1,2,3\}$. Notice that the values from the quadrature operations in (4.19) can also be shared with (4.18), and vice versa.

5. High-dimensional parallel time iteration framework. This section introduces a performant framework for high-dimensional DDSG function interpolation and approximation, leveraging an optimized adaptive SG framework [47]. In Section 5.1, we describe a vectorized approach to DDSG interpolation, which allows for a performant, cache-efficient execution of function calls. Next, we outline in Section 5.2 the parallelized DDSG time iteration framework for solving large-scale dynamic stochastic economic models.

5.1. Vectorized interpolation. In solving the system of nonlinear equations at a given point, the time iteration algorithm requires frequent interpolation on the policy functions $\tilde{p}$ from the previous iteration (see Section 4.3 for further details). These interpolations typically take up the majority, often far beyond 90% [47], of the computation time needed to solve the nonlinear set of equations. Therefore, the time-to-solution of the time iteration algorithm is highly sensitive to the performance of the DDSG interpolation function call. A direct implementation of (4.16) results in a massive number of repeated computations, as many of the SG interpolants are identical. Consider, for example, the component functions $f_{1,2}$ and $f_{1,2,3}$: by the telescopic summation (4.14), both require the interpolation values of lower-order components such as $\mathcal{I}f(x)|_{x=\bar{x}\backslash x_1}$ and $\mathcal{I}f(x)|_{x=\bar{x}\backslash x_2}$. Notice that the number of repeated computations increases nonlinearly with respect to the function's dimensionality and the DDSG expansion order. Furthermore, a simple lookup-table approach would result in an erratic memory access pattern on a large array; thus, the computation would be plagued with cache misses. We can eliminate all redundant SG interpolations and achieve an ideal access pattern without significant overhead in the memory footprint. This is achieved by separating the telescopic summation in (4.16) and storing the unique SG interpolation values and the corresponding telescopic coefficients in two separate arrays, $a(x)$ and $b$ (5.1). In addition, $b$, as it is independent of $x$, only needs to be computed once. The DDSG interpolation function call then reduces to a desirable dot product $f(x) \approx a(x)^{\top} b$ with a contiguous data access pattern. Notice that, for notational clarity, the formulation above does not consider the two DDSG adaptivity criteria described in Section 4.3 and asserts the full expansion up to the maximum expansion order $K$. The following section provides an explicit algorithm describing how each component function is selected with the respective DDSG adaptivity criteria.
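A minimal sketch of this vectorized evaluation follows, assuming the component interpolants have already been built; plain Python callables stand in for the SG interpolants, and the class name is ours, not an interface from the paper's code.

```python
import numpy as np

class VectorizedDDSG:
    """Evaluate a DDSG surrogate as the dot product a(x) . b, cf. Section 5.1.

    interpolants: one callable per unique SG interpolant;
    coefficients: the matching telescopic coefficients, computed once,
    since they do not depend on the query point x."""

    def __init__(self, interpolants, coefficients):
        self.interpolants = interpolants
        self.b = np.asarray(coefficients, dtype=float)  # fixed vector b

    def __call__(self, x):
        # Fill a(x) once per query; each unique interpolant is evaluated
        # exactly once, removing the redundancy of evaluating (4.16) term by term.
        a = np.array([g(x) for g in self.interpolants])
        return a @ self.b   # contiguous dot product

# Toy usage: a constant term and two stand-ins for SG component functions.
parts = [lambda x: 1.0, lambda x: x[0] ** 2, lambda x: x[1] ** 2]
surrogate = VectorizedDDSG(parts, coefficients=[1.0, 1.0, 1.0])
print(surrogate(np.array([0.3, 0.4])))  # 1.0 + 0.09 + 0.16 = 1.25
```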
5.2. Parallelized DDSG time iteration framework. We begin with a general description of the generic parallelized DDSG Algorithm 5.1 and then outline the inclusion of DDSG in the time iteration Algorithm 3.1. The parallelized DDSG algorithm takes as input the function $f$ to be approximated, the maximum expansion order $K$, the expansion criterion tolerance $\rho_\epsilon$, the active dimension selection tolerance $\eta_\epsilon$, the anchor point $\bar{x}$, the maximum refinement level $\ell$, and the SG adaptivity tolerance $\gamma_\epsilon$.

Algorithm 5.1 Generic Parallel Adaptive DDSG Algorithm
1: initialize the vectorized DDSG arrays $a$, $b$ as empty, $f_\emptyset = f(\bar{x})$, and the reject index set $Z = \emptyset$
2: for $k = 1, \ldots, K$ do
3: $C \leftarrow$ component indices of order $k$ that are not a superset of any index in $Z$
4: load balance given $C$
5: for all $i \in C$ parallel do
6: compute $\mathcal{I}f(x)|_{x=\bar{x}\backslash x_i}$ (using SG adaptivity tolerance $\gamma_\epsilon$)
7: compute $Qf(x)|_{x=\bar{x}\backslash x_i}$ (using SG adaptivity tolerance $\gamma_\epsilon$)
8: compute $\eta_i$
9: if $\eta_i \geq \eta_\epsilon$ then
10: compute $\{a_i(x), b_i\}$ as defined in (5.1)
11: else
12: $Z \leftarrow Z \cup \{i\}$; discard the interpolant and its quadrature values
13: end if
14: end for
15: globally synchronize $a$, $b$, $Z$, and the quadrature values
16: compute $\rho$ from (4.18)
17: if $\rho \leq \rho_\epsilon$ then
18: return $a$, $b$
19: end if
20: end for
21: return $a$, $b$

We begin in step (1) with all compute instances initializing the empty, vectorized DDSG arrays defined in (5.1), the zeroth-order component function $f_\emptyset = f(\bar{x})$, and the reject index set $Z = \emptyset$. The reject index set collects all component function indices excluded from the DDSG expansion as per the active dimension selection criterion. We sequentially progress through the expansion orders in the body of the algorithm, in steps (2)-(20). At expansion order $k$, the current order index set $C$ is defined as the component indices of order $k$ which are not a superset of any of the indices in $Z$. Expanding on the example in Figure 4 with expansion order $k = 3$, the values of the reject and current order index sets will be $Z = \{\{1,2\}\}$ and $C = \emptyset$, as $\{1,2,3\} \supset \{1,2\}$. Subsequently, at step (4), we rebalance the compute resources evenly based on the current order index set, and the computation is carried out in parallel in steps (5)-(14). For each component index $i$, the SG interpolant, the quadrature value, and the active dimension selection coefficient $\eta_i$ are computed in parallel. Next, in steps (9)-(13), we employ the active dimension selection criterion for each component index. If the component function is accepted, we assign the DDSG vector arrays as defined in (5.1). If not, the component index $i$ is added to the reject index set, and the SG interpolant and its quadrature values are discarded. Notice that each compute instance has a local or partial version of the computed variables within the parallel section of the algorithm. We perform the global synchronization upon exiting the parallel region at step (15). In steps (16)-(19), with the quadrature values now globally available, we apply the DDSG expansion criterion (4.18). If the threshold is reached, the routine terminates. Otherwise, we proceed to the next expansion order. The return values of the routine at step (21) are the vectorized DDSG interpolation arrays.
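The distributed layer of this scheme amounts to partitioning the current order index set $C$ across processes. A minimal mpi4py sketch of that partitioning follows; it shows only the load balancing and the final synchronization, with the per-component work stubbed out, and `build_sg_interpolant` and `sg_quadrature` as hypothetical helpers rather than routines from the paper's framework.

```python
from itertools import combinations
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

d, k = 10, 2
C = list(combinations(range(d), k))   # current order index set (no rejects here)

# Step (4): static load balancing; each rank takes a strided slice of C.
my_indices = C[rank::size]

local = []
for u in my_indices:
    # Steps (5)-(14): per-component work, stubbed out in this sketch.
    # interp = build_sg_interpolant(u)   # hypothetical SG construction
    # q = sg_quadrature(interp)          # hypothetical SG quadrature
    q = 0.0
    local.append((u, q))

# Step (15): global synchronization of the partial results.
all_results = [item for chunk in comm.allgather(local) for item in chunk]
if rank == 0:
    print(f"{len(all_results)} component functions synchronized")
```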
In Figure 5, we show the schematic of the parallel DDSG time iteration algorithm. In particular, we display two levels of parallelism: the first is based on distributed memory (dashed blue lines), whereas the second relies mainly on shared memory (dotted red lines). Starting in step (1), we assign the newly computed policy function $\tilde{p}'$ as the current policy function $\tilde{p}$ or, in the case of the initial step, we assign a random (guessed) policy function. Next, in step (2), we begin the HDMR decomposition starting from expansion order $k = 1$ and evenly assign the component functions amongst groups of distributed-memory processes, referred to as process groups. For each process group, in step (3), the respective SGs of the component functions are computed in parallel using a shared-memory approach. At a given refinement level, first, the SG grid points are evenly partitioned amongst the compute resources (threads), and second, at each grid point, we solve the first-order conditions noted in Section 3.2. We incrementally ascend through the SG refinement levels based on the adaptivity criterion described in Section 4.1. Next, in step (4), having computed the SG, we evaluate its quadrature for use in the DDSG adaptivity criteria noted in Section 4.3. In step (5), corresponding to step (15) in Algorithm 5.1, we globally synchronize the distributed computations. We proceed in step (6) by checking the expansion criterion noted in (4.18) and move to the next HDMR expansion order if required, that is, back to step (2). Finally, in step (7), we reconstruct the DDSG policy function for the next iteration in step (1).

With this parallelization approach, we expect the parallel efficiency to be higher in the DD component, the primary layer, than in the SG component, the secondary layer of the algorithm. As highlighted in [12], this is due to the unutilized computing resources in the SG algorithm at low refinement levels, and also because of the repeated synchronization at increasing refinement levels. For example, in a one-dimensional SG with a maximum refinement level of $\ell = 4$, we would have 1, 2, 4, and 8 grid points at the successive refinement levels. Thus, allocating 8 cores to this computation would only achieve full utilization on the last refinement level. Furthermore, at each refinement level, we would require synchronization.

6. Results. This section demonstrates the capabilities of the parallelized DDSG time iteration framework introduced in Section 5. We begin in Section 6.1 with a set of basic performance tests for grid point reduction, vectorization performance, scalability, and speedup. In Section 6.2, we utilize DDSG as a tool to analyze both variants of the IRBC model described in Section 3.1. Using conclusions drawn from this analysis, in Section 6.3, we deploy the parallel DDSG time iteration framework to solve a set of large-scale smooth and non-smooth IRBC models. We introduce the following naming standard for the parameterization of the DDSG routine: $\mathrm{DD}^{K}_{\eta_\epsilon}\mathrm{SG}^{\ell}_{\gamma_\epsilon}$ denotes a DDSG approximation with maximum expansion order $K$, active dimension selection tolerance $\eta_\epsilon$, maximum refinement level $\ell$, and SG adaptivity tolerance $\gamma_\epsilon$. It should be assumed that $\rho_\epsilon = \eta_\epsilon$ (see Section 4.3 for the definitions of $\rho_\epsilon$ and $\eta_\epsilon$) unless otherwise noted. If a nonadaptive variant of the DDSG method is used, the subscripts $\eta_\epsilon$ and $\gamma_\epsilon$ are omitted. To measure our scheme's convergence, we follow the previous literature and report the Euler equation errors (see [6], Appendix C for details). A total of 10,000 error samples are gathered, for which the maximum (Maximum Euler Error) and average (Average Euler Error) are reported on a $\log_{10}$ scale. In all test cases, the reported Average and Maximum Euler errors of our approximate solutions align with the current literature.
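The error metric just described is straightforward to compute once a solved policy is at hand. A hedged sketch follows, with `euler_residuals` as a hypothetical model-specific function returning the unit-free equilibrium-condition residuals at a sampled state; only that residual function is model-specific, while the sampling and aggregation are generic.

```python
import numpy as np

def euler_error_stats(policy, euler_residuals, sampler, n_samples=10_000, seed=0):
    """Average and maximum Euler equation errors on a log10 scale,
    estimated from randomly sampled states of the domain."""
    rng = np.random.default_rng(seed)
    errs = np.empty(n_samples)
    for s in range(n_samples):
        x = sampler(rng)                      # draw a state x in X
        r = euler_residuals(policy, x)        # unit-free FOC residuals at x
        errs[s] = np.max(np.abs(r))           # worst residual at this state
    return np.log10(errs.mean()), np.log10(errs.max())
```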
All experiments were conducted on the Piz Daint supercomputer at CSCS (a Cray XC50 with 12 cores and 64GB of memory per node).

6.1. Unit Tests. We now outline a set of unit test results to highlight the reduction in the number of grid points, the performance of the vectorized interpolation, the parallel scalability, and the speedup of the framework with respect to the state-of-the-art SG framework.

6.1.1. Grid point reduction. In Figure 6, we show the SG to DDSG grid point ratios with respect to the maximum expansion order $K$ for various function dimensionalities. We refer the reader to the beginning of Section 6 for further details on the adopted notation standard $\mathrm{DD}^{K}_{\eta_\epsilon}\mathrm{SG}^{\ell}_{\gamma_\epsilon}$.

Fig. 6. The ratio of SG to DDSG grid points for maximum refinement levels $\ell = 4$ (left) and $\ell = 10$ (right) with respect to the maximum expansion order $K$ at varying dimensions. The thicker line segments represent data points for which the number of DDSG grid points is $< 10^9$.

The red line in each graph denotes a ratio of one, that is, where the numbers of grid points for DDSG and SG are equivalent. The plot lines not marked with thick segments are where using DDSG would result in a number of grid points $> 10^9$. Under these conditions, memory issues would render DDSG, or even SG, inoperable. The two plots presented correspond to two scenarios: one where the underlying function exhibits relatively smooth dynamics (lower maximum refinement levels) and the other where one wishes to resolve non-smooth features (higher maximum refinement levels). For the lower refinement level $\ell = 4$, shown in the left panel, DDSG provides a reduction in grid points up to a maximum expansion order of $K = 2$. At a maximum expansion order of $K = 1$, we see orders of magnitude in grid point reduction. This trend is further exaggerated when we require higher SG refinement levels. We show the same test in the right panel, but with $\ell = 10$. Here there is a reduction in grid points up to an expansion order of $K = 5$. However, at such high expansion orders and refinement levels, the overall number of grid points is much too high for practical usage.

6.1.2. Vectorization. In Figure 7, we show the number of function calls and the relative compute times of one step of the time iteration using $\mathrm{SG}^{4}$ and $\mathrm{DD}^{2}\mathrm{SG}^{4}$. In all tests, the optimizer is set with a termination tolerance of $10^{-4}$. In the left panel, we can see that DDSG provides roughly a 30% reduction in the number of function calls compared to SG. In the right panel, we see the relative compute times of the naive and the vectorized approach (see Section 5.1) for DDSG interpolation. Here, the naive approach is the direct evaluation of (4.16). We can see that the vectorized approach provides a speedup of roughly 2.8 times over the naive implementation.

6.1.3. Scalability. In Figure 8, we show the normalized compute time for one step of the time iteration for 20- and 50-dimensional IRBC models. As noted in Section 5.2 and described in detail in [12], we expect the primary layer of parallelization, the DD component of the DDSG algorithm, to have better parallel efficiency than the SG component, that is, the secondary layer. The left panel shows numerical results for a 20-dimensional model with a fixed maximum refinement level of 10 and varying maximum expansion orders of 1 and 2. Here, we have 20 and 210 component functions for the respective maximum expansion orders. We can see almost ideal strong scaling up to a number of nodes equal to the number of component functions for both tests. Beyond this point, additional parallelism is taken from the secondary, less efficient layer of parallelism in the SG algorithm. In the right panel, we use a 50-dimensional model with a fixed maximum expansion order of 2 and varying maximum refinement levels of 4 and 5. In contrast to the previous test cases, we have 1,275 component functions for both tests, which is larger than the maximum number of nodes.
Here we can observe almost ideal strong scaling up to 1,000 nodes, which is the same for both tests. Furthermore, we see that the difference in the SG maximum refinement level slightly affects the parallel performance, with higher maximum refinement levels providing a marginal advantage in scalability.

6.1.4. Speedup. In Figure 9, we display the speedup of the parallelized DDSG time iteration framework with respect to an SG version. In the left panel, we show a single time iteration step of an 8-dimensional model for DDSG at maximum expansion orders 1 and 2 and at varying refinement levels. This test is done on a single node using shared-memory parallelism, and both the DDSG and SG approximation methods use the same adaptivity coefficient. At refinement level 7, for the respective tests, we can see that the DDSG approach provides 240 and 10 times faster runtimes than the SG approach. In the right panel, we show the time-to-solution for a single DDSG time iteration step at a maximum expansion order of 1, a maximum refinement level of 3, and varying dimensions for the smooth IRBC model. The tests are deployed on a number of nodes equal to the dimensionality of the model. The DDSG routine provides a speedup of up to 10 times over the SG implementation. The reason for this speedup is primarily that DDSG operates on 100 one-dimensional SGs, while the SG time iteration framework operates on a single 100-dimensional SG.

6.2. Model analysis. We now use the active dimension selection criterion outlined in Section 4.3 as an analysis tool to assess the significance of the component functions of the policy function. In Figure 10, we show the aggregate minimum, average, and maximum values of $\eta_u$ for the 8-dimensional smooth and non-smooth models at varying expansion orders. For a given expansion order, an increase in the variability between the maximum and minimum values of $\eta_u$ signifies that active dimension selection could be effective in selecting only a subset of the component functions. In contrast, if both the maximum and minimum values are equivalent to the average, then we would conclude that no component function can be considered more significant than another. In such scenarios, active dimension selection would not be an effective way to reduce the computational cost of approximating the respective policy functions. In the left panel of Figure 10, we can see that the smooth model's policy function has significant component functions only up to the third expansion order. The second- and third-order terms are approximately 100 times less significant than the first-order component functions. Our experiments show that component functions with $\eta_u < 10^{-4}$ do not play a significant role in the approximation of the policy function. Thus, they can be eliminated without significantly degrading the overall approximation quality. Notice that, using $\eta_\epsilon = \rho_\epsilon = 10^{-4}$, the active dimension selection criterion and the expansion criterion would truncate the expansion at the first DDSG expansion order. In the right panel of Figure 10, we see the same analysis for the non-smooth IRBC model, where, in this case, the component functions show significance up to the fourth-order expansion terms. Here we can see that both the first- and second-order component functions are required, while the majority of the third- and fourth-order component functions fall below the $10^{-4}$ threshold.
With this analysis, we proceed with our experiments using $K = 1$ and $K = 2$ for the smooth and non-smooth IRBC models, respectively, and $\eta_\epsilon = \rho_\epsilon = 10^{-4}$ for both cases. We now look at the effect of these parameters on the convergence trajectories. In the top panel of Figure 11, we show the average and maximum Euler errors for the smooth IRBC model using SG, adaptive SG, and the DDSG time iteration algorithm. The horizontal axis corresponds to the cumulative number of grid points evaluated over several time iteration steps. For DDSG to be a superior approximation method compared to SG and adaptive SG, we should attain smaller Euler errors for the same number of grid points. We test our DDSG implementation with $K = 1$ at $\ell = 4$, using $\gamma_\epsilon = 10^{-3}$ and $10^{-4}$. The observed average and maximum Euler errors begin relatively high in both configurations but quickly decrease beyond those of the classical and adaptive SG. Notice that DDSG with $\gamma_\epsilon = 10^{-4}$ does not improve the convergence rate in comparison to DDSG with $\gamma_\epsilon = 10^{-3}$, but there is a reduction in the number of grid points. With this said, DDSG requires roughly 10 times fewer grid points to attain Euler errors equivalent to those of adaptive SG. In the bottom panel of Figure 11, we show the average and maximum Euler errors for the non-smooth IRBC model using SG, adaptive SG, and the DDSG method at varying SG adaptivity coefficients. Unlike the previous case, the non-smooth IRBC model requires significantly higher SG refinement levels to adequately represent its non-smooth policy function. Two tests were conducted using the DDSG method with $K = 2$: one using $\ell = 8$ with $\gamma_\epsilon = 10^{-2}$, and the other using $\ell = 10$ with $\gamma_\epsilon = 5 \times 10^{-3}$. Notice that in high dimensions, at such high SG refinement levels, the number of grid points would almost surely render a model uncomputable for adaptive SGs, even if run on modern supercomputing facilities. For example, using an SG at $\ell = 10$, a 20-dimensional model would consist of about 5 billion grid points. As observed in the bottom right panel, DDSG requires higher refinement levels to sufficiently decrease the maximum Euler error. Even at this refinement level, in comparison to adaptive SG at $\ell = 8$ and $\gamma_\epsilon = 10^{-2}$, the DDSG method allows for a significant reduction in the Euler error with half the number of grid points of the adaptive SG.

6.3. Large-scale models. Based on the analysis conducted in the previous section, we proceed here by adopting the DDSG parameters noted above as a baseline for solving a set of large-scale IRBC models with up to 300 and 60 dimensions for the smooth and non-smooth IRBC models, respectively. At such dimensions, adaptive SGs would not be a fitting numerical approach due to the sheer number of grid points. The effect of this massive increase in the number of grid points is a proportional increase in the computation time, which quickly surpasses a day of runtime on a high-performance computer. In Table 2, we show the results for the two IRBC model variants. First, we achieve low average and maximum Euler errors for the smooth model in all test cases. For the 300-dimensional model, we have used a lower SG refinement level of 3, compared to 4 for the 100- and 200-dimensional models. To compensate for this lower refinement level, we use $\gamma_\epsilon = 10^{-6}$. With this said, we can see that even for the 300-dimensional case, the proposed framework is sufficient to achieve $-2.89$ and $-1.78$ for the average and maximum Euler errors, respectively. For comparative purposes, we show the number of SG grid points at refinement level 4.
There is roughly a four-orders-of-magnitude difference in the required number of grid points between SG and DDSG. Using adaptive SGs would undoubtedly decrease the number of grid points; however, models with $d > 100$ dimensions remain uncomputable with adaptive SGs. The per-iteration runtimes of the smooth IRBC model tests are 0.5, 1.6, and 4.2 hours using 100, 200, and 300 nodes, for model dimensions 100, 200, and 300, respectively. For the non-smooth IRBC models, we look at model dimensions of 20, 40, and 60. Compared to the smooth model, the number of grid points required is significantly larger, as we need a much denser grid to capture the strong nonlinearities in the policy functions.

Table 2 (caption): Average and maximum Euler errors for the smooth and non-smooth IRBC models using the DDSG time iteration method. Note that no SG tests could be conducted at such high dimensions. *The reported SG grid points are for comparison purposes and use $\ell = 4$ and $\ell = 10$ for the smooth and non-smooth model, respectively.

We can efficiently alleviate this problem by using DDSG with a refinement level of 10. The shortfall of a pure SG becomes apparent when we compare the number of SG grid points at the same refinement level: at level 10, a 60-dimensional SG surpasses trillions of grid points. Adaptive SGs can provide some degree of efficiency in lower dimensions; for example, [6] show that a 20-dimensional model can be solved using roughly $10^4$ grid points. That said, the DDSG approximation method still required more than half as many grid points to achieve the same error metrics. Furthermore, in higher dimensions, the number of grid points for both SG and adaptive SG increases significantly, rendering the model uncomputable on contemporary supercomputers. The per-iteration runtimes of the non-smooth (kink) model are 1.8, 9.9, and 16.4 hours using 38, 156, and 354 nodes, for model dimensions 20, 40, and 60, respectively.

7. Conclusion. We introduced a computational framework to solve large-scale dynamic stochastic economic models on practical time scales. As a secondary benefit, it can serve as an a priori analysis tool to shed light on a model's complexity. At the core of the methodology is the DDSG function approximation, which combines an HDMR technique with adaptive SGs. We parallelized the DDSG method by leveraging the intrinsic separability of the computation, embedded it in a time iteration algorithm, and deployed it on the Cray XC50 system installed at CSCS. Our numerical experiments, solving a set of smooth and non-smooth IRBC models, showed a speedup of 10 times over state-of-the-art adaptive SGs for mid-scale models, where both methods are applicable. In addition, we showed that even for a relatively small 50-dimensional model, the proposed framework provides excellent strong scaling up to 1,000 compute nodes. Furthermore, we demonstrated that we can compute global solutions to IRBC models with at least 300 continuous dimensions in only a few hours. This is a substantial improvement over the previous literature and opens new possibilities for a richer set of model specifications. Finally, note that the scope of the presented method is not restricted to models that are recursively formulated via first-order conditions, but extends more broadly to high-dimensional models that can be characterized by the functional equation (1.1).
The latter also nests the common characterizations of recursive equilibria in discrete time, where $p$ is the value function and the operator $H$ captures the Bellman equation it has to satisfy, or the Hamilton-Jacobi-Bellman equation in continuous time [14]. However, tackling such models with the proposed method is subject to further research.

Acknowledgments. We thank Lorenzo Bretscher, Johannes Brumm, Felix Kübler, Philipp Renner, Olaf Schenk, Karl Schmedders, Tony Smith, Fabio Trojani, and seminar participants at the University of Geneva, the University of Lausanne, Università della Svizzera italiana, the Russian Presidential Academy of Science, Stanford University, and the University of Zurich for their extremely valuable comments.

References
Deep equilibrium nets
Adaptive Control Processes: A Guided Tour
Capital mobility and international sharing of cyclical risk
Sparse grids for dynamic economic models
Scalable high-dimensional dynamic stochastic economic modeling
Using adaptive sparse grids to solve high-dimensional dynamic models
Multivariate Quadrature on Adaptive Sparse Grids
Sparse grids
Solving the stochastic growth model by policy-function iteration
Computational suite of models with heterogeneous agents II: Multi-country real business cycle models
Solving the stochastic growth model by parameterizing expectations
Parallelized dimensional decomposition for large-scale dynamic stochastic economic models
Financial frictions and the wealth distribution (working paper)
Chapter 9: Solution and estimation methods for DSGE models
On ANOVA expansions and strategies for choosing the anchor point
Sparse Grid Quadrature in High Dimensions with Applications in Finance and Insurance
Generalized functional ANOVA diagnostics for high-dimensional functions of dependent variables
The effect of large-scale anti-contagion policies on the COVID-19 pandemic
Multi-country real business cycle models: Accuracy tests and test bench
Monetary policy according to HANK
Comparison of solutions to the multi-country Real Business Cycle model
Pareto-improving carbon-risk taxation
Computing equilibrium in OLG models with stochastic production
Handbook of Macroeconomics
Self-justified equilibria: Existence and computation
Stationary equilibria in asset-pricing models with incomplete markets and collateral
On decompositions of multivariate functions
General formulation of HDMR component functions with independent and correlated variables
High dimensional model representations
Recursive Macroeconomic Theory
An adaptive hierarchical sparse grid collocation algorithm for the solution of stochastic differential equations
An adaptive high-dimensional stochastic model representation technique for the solution of stochastic partial differential equations
Chapter 7: Numerical methods for large-scale dynamic economic models
Deep learning for solving dynamic economic models
"Memo" functions and machine learning
A surrogate model for computational homogenization of elastostatics at finite strain using high-dimensional model representation-based neural network
Compact data structure and parallel algorithms for the sparse grid technique
Spatially Adaptive Sparse Grids for High-Dimensional Problems
Spatially adaptive sparse grids for high-dimensional data-driven problems
Efficient input-output model representations
General foundations of high-dimensional model representations
Machine learning for dynamic incentive problems
Machine learning for high-dimensional dynamic stochastic economies
Rethinking large-scale economic modeling for efficiency: Optimizations for GPU and Xeon Phi clusters
Pricing American options under high-dimensional models with recursive adaptive sparse expectations
Theorems and examples on high dimensional model representation
Recursive Methods in Economic Dynamics
On the implementation of an interior-point filter line-search algorithm for large-scale nonlinear programming
On the approximation error in high dimensional model representation
Adaptive ANOVA decomposition of stochastic incompressible and compressible flows