title: A general framework for quantifying uncertainty at scale and its application to fusion research
authors: Farcas, Ionut-Gabriel; Merlo, Gabriele; Jenko, Frank
date: 2022-02-08

In many fields of science, remarkably comprehensive and realistic computational models are available nowadays. Often, the respective numerical calculations call for the use of powerful supercomputers, and therefore only a limited number of cases can be investigated explicitly. This prevents straightforward approaches to important tasks like uncertainty quantification and sensitivity analysis. As it turns out, this challenge can be overcome via our recently developed sensitivity-driven dimension-adaptive sparse grid interpolation strategy. The method exploits, via adaptivity, the structure of the underlying model (such as lower intrinsic dimensionality and anisotropic coupling of the uncertain inputs) to enable efficient and accurate uncertainty quantification and sensitivity analysis at scale. We demonstrate the efficiency of our approach in the context of fusion research, in a realistic, computationally expensive scenario of turbulent transport in a magnetic confinement device with eight uncertain parameters, reducing the effort by at least two orders of magnitude. In addition, we show that our method intrinsically provides an accurate reduced model that is nine orders of magnitude cheaper than the high-fidelity model. Our approach enables studies previously considered infeasible.

During the past few decades, science has been revolutionized through computing. First, model-based computing ('simulation') [5] has allowed us to investigate complex phenomena which are not accessible to theoretical analysis alone, and now data-centric computing [17, 38] is enabling us to more effectively explore, understand, and use large amounts of data originating from experiments, observation, and simulation. Building on these important advancements, we are now entering a new phase in which the focus lies on transitioning from merely interpretive and qualitative models to truly predictive and quantitative models of complex systems through computing.

One obstacle on the path towards predictive physics-based models is uncertainty. Whether stemming from incomplete knowledge of the given system, measurement errors, inherent variability, or any other source, uncertainty is intrinsic to most real-world problems, and this aspect needs to be included in respective modeling efforts. Accounting for, understanding, and reducing uncertainties in numerical simulations of real-world phenomena is carried out within the framework of Uncertainty Quantification (UQ) [16, 33]. In the present work, we assume that uncertainty enters the underlying model through a set of scalar inputs (parameters characterizing the state of the system, the initial or boundary conditions, or other aspects of the system under consideration), whose cardinality is referred to as the stochastic dimension or, simply, the dimension of the problem. In most realistic applications, only a subset of these uncertain inputs is important for the simulation, in the sense that they produce significant variations in the output(s) of interest; identifying and quantifying the influence of the individual inputs is the task of sensitivity analysis (SA). In addition, assuming that nonlinear effects exist in the way subsets of uncertain inputs interact with each other, these interactions are often anisotropic, i.e., the interaction strength varies significantly.
Having information about the importance of uncertain inputs and the strength of their interactions is clearly advantageous for UQ/SA since, for example, the probabilistic space in which the underlying uncertain parameters live can be sampled accordingly, thus potentially decreasing the number of required samples significantly. In general, however, this information, typically obtained via SA, is available only a posteriori, after the simulations have been performed. In the present article, we employ our sensitivity-driven dimension-adaptive sparse grid interpolation approach, which explores and exploits the structure of realistic simulations online via adaptive refinement with a sensitivity-driven refinement policy. The novelty here is that, as will be demonstrated in the following, our structure-exploiting method enables UQ and SA at scale, which goes beyond what most existing methods offer. In addition, we study for the first time multi-dimensional UQ and SA in first-principles-based turbulence simulations of fusion plasmas. In a realistic and practically relevant simulation scenario of turbulent transport in the edge of the DIII-D fusion experiment, with more than 264 million degrees of freedom and eight uncertain inputs, our approach requires a total of only 57 high-fidelity simulations for UQ and SA. In addition, since our approach is based on interpolation, a byproduct is an interpolation-based reduced model of the parameters-to-output-of-interest mapping. The obtained reduced model is accurate and nine orders of magnitude cheaper than the high-fidelity model.

Let $d \in \mathbb{N}$ denote the number of uncertain inputs, modeled as random variables distributed according to a probability density $\pi$. The input density stems from, e.g., experimental data analysis, expert opinion, or a combination thereof. The high-fidelity simulation code calculates an output of interest, which, for simplicity, is assumed here to be a scalar quantity, noting that our approach can be trivially employed in the multivariate case as well by, e.g., treating each output component separately. UQ and SA generally require ensembles of high-fidelity model evaluations at prescribed input values representing samples distributed according to $\pi$. In our sensitivity-driven dimension-adaptive sparse grid interpolation strategy [10, 11], these realizations are points living on a $d$-dimensional (sparse) grid that is constructed online via adaptive refinement. What enables UQ and SA at scale is our specific strategy for adaptive refinement. Sparse grid approximations are constructed as a linear combination of judiciously chosen $d$-variate products of one-dimensional approximation operators. This linear combination is done with respect to a multi-index set $\mathcal{L} \subset \mathbb{N}^d$ comprising $d$-variate tuples $\boldsymbol{\ell} = (\ell_1, \ell_2, \ldots, \ell_d) \in \mathbb{N}^d$ called multi-indices. Each multi-index uniquely identifies a product defined on a $d$-variate subspace. Here, the underlying approximation operation is interpolation defined in terms of (global) Lagrange polynomials, which can be trivially mapped to an equivalent spectral projection approximation. We note, nevertheless, that our sensitivity-driven approach can be implemented for other approximation operations, such as quadrature, as well. Moreover, we note that our approach is hierarchical, meaning that the results corresponding to a multi-index $\boldsymbol{\ell}$ can be re-used at its forward neighbors $\boldsymbol{\ell} + \boldsymbol{e}_i$, where $i = 1, 2, \ldots, d$ and $(\boldsymbol{e}_i)_j = 1$ if $i = j$ and $(\boldsymbol{e}_i)_j = 0$ otherwise.
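To make the multi-index bookkeeping concrete, the following minimal Python sketch (illustrative only, not the authors' implementation; all names are hypothetical) shows one way to generate the forward neighbors $\boldsymbol{\ell} + \boldsymbol{e}_i$ of a multi-index and to check that adding a candidate keeps the multi-index set downward closed.

```python
def forward_neighbors(ell):
    """All forward neighbors l + e_i of a multi-index l (1-based levels)."""
    return [tuple(lv + (1 if j == i else 0) for j, lv in enumerate(ell))
            for i in range(len(ell))]

def keeps_downward_closed(candidate, index_set):
    """True if every backward neighbor of the candidate is already in the set,
    so that adding the candidate leaves the multi-index set admissible."""
    for i, li in enumerate(candidate):
        if li > 1:
            backward = tuple(li - 1 if j == i else lj
                             for j, lj in enumerate(candidate))
            if backward not in index_set:
                return False
    return True

# Example with d = 3: refine the root multi-index (1, 1, 1).
L = {(1, 1, 1)}
for nb in forward_neighbors((1, 1, 1)):
    if keeps_downward_closed(nb, L):
        L.add(nb)
print(sorted(L))   # the root plus its three admissible forward neighbors
```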
We construct $\mathcal{L}$ online via our sensitivity-driven dimension-adaptive refinement strategy. In dimension-adaptivity [15, 19], $\mathcal{L}$ is split into two sets, the old index set $\mathcal{O}$ and the active set $\mathcal{A}$, such that $\mathcal{L} = \mathcal{O} \cup \mathcal{A}$. The active set $\mathcal{A}$ contains the candidate subspaces for refinement. The old index set $\mathcal{O}$ comprises the already visited subspaces. In each refinement step, we ascertain the importance of individual inputs and their interactions to guide the adaptive process. Specifically, we determine the sensitivity information, with respect to the output of interest, of all uncertain inputs in each candidate subspace for refinement. We distinguish here between sensitivities corresponding to individual inputs, referred to as local sensitivity indices or indicators, and sensitivity indices associated with interactions of inputs. We use this information to compute a sensitivity score, which is an integer counting the number of important sensitivity indices. To this end, we compare these indices with user-prescribed tolerances, and whenever a tolerance is exceeded, the corresponding sensitivity index is considered important and the score is increased by one. These tolerances can be viewed as a (heuristic) proxy for the accuracy with which we want the algorithm to explore the directions associated with the sensitivity indicators. Upon computing the scores for all candidate subspaces, we refine the subspace with the largest sensitivity score, noting that if two or more subspaces have identical scores, we select the one with the largest sum of sensitivity indicators. Note that for a problem with $d$ uncertain inputs, we have $2^d - 1$ sensitivity indices in total: $d$ for the individual inputs, $d(d-1)/2$ for pairs of input interactions, and so forth, up to the index measuring the sensitivity of the interaction of all $d$ inputs. When $d$ is small to moderate, e.g., $d \le 15$, we can exhaustively compute and use all $2^d - 1$ sensitivity indices in each refinement step and thus prescribe $2^d - 1$ associated tolerances. For larger values of $d$, however, $2^d - 1$ is prohibitively large. We can, nonetheless, exploit the fact that in most practical applications it is unlikely that interactions beyond a few, e.g., two or three, parameters are important, and can therefore account only for these interactions in our refinement procedure, thus avoiding the computation of all $2^d - 1$ indices. Furthermore, if the prescribed tolerances do not provide the desired accuracy, they can be sequentially decreased at no additional cost since our approach is hierarchical. Our sensitivity-driven refinement policy therefore explores and exploits the sensitivities of individual inputs and interactions thereof, ensuring that the important stochastic directions are preferentially refined. In this way, the obtained multi-index set will emulate the structure of the underlying problem. For example, if we have $d = 20$ uncertain input parameters but only three individual parameters and, furthermore, only four interactions are important, our approach will exploit this structure and construct a multi-index set having more multi-indices in the directions corresponding to the three important individual parameters and the four important interactions. In contrast, if all 20 inputs are important, we will have many multi-indices in all 20 directions.
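A minimal sketch, under the assumptions above, of how the sensitivity score and the tie-breaking rule could be realized in Python; the dictionaries of sensitivity indices and tolerances, as well as all names and numbers, are illustrative rather than taken from the actual implementation.

```python
def sensitivity_score(indices, tolerances):
    """Count how many sensitivity indices exceed their prescribed tolerance.
    Both dicts are keyed by input subsets, e.g. (0,), (2,) or (0, 2)."""
    return sum(1 for key, value in indices.items() if value > tolerances[key])

def select_for_refinement(candidates, tolerances):
    """candidates: list of (multi_index, indices, sum_of_indices).
    Largest score wins; ties are broken by the largest sum of indices."""
    scored = [(sensitivity_score(ind, tolerances), total, mi)
              for mi, ind, total in candidates]
    score, _, best = max(scored)
    return best if score > 0 else None    # None: all tolerances reached, stop

# Example for d = 3 with a tolerance of 1e-4 on every index.
keys = [(0,), (1,), (2,), (0, 1), (0, 2), (1, 2), (0, 1, 2)]
tol = {k: 1e-4 for k in keys}
c1 = ((2, 1, 1), {(0,): 3e-2, (1,): 2e-5, (2,): 1e-3, (0, 1): 0.0,
                  (0, 2): 5e-3, (1, 2): 0.0, (0, 1, 2): 0.0}, 3.6e-2)
c2 = ((1, 2, 1), {k: 1e-6 for k in keys}, 7e-6)
print(select_for_refinement([c1, c2], tol))   # -> (2, 1, 1)
```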
Furthermore, our method can be trivially incorporated into the underlying simulation pipeline, since it only requires (i) the ability to prescribe the simulation inputs by, e.g., accessing the parameters/configuration file and (ii) access to the value of the output of interest. In this way, the framework can be easily used on a wide range of computing systems, ranging from laptops to large supercomputers. At the end of the adaptive process, the sensitivity-driven method yields (i) statistics such as the mean and variance of the output of interest, (ii) the sensitivity indices of all individual parameters and either of all interactions (if $d$ is small to moderate) or of a subset of interactions, and (iii) an interpolation-based reduced model for the parameters-to-output-of-interest mapping. We provide, in Figure 1, a visual representation of our framework through an example in which we consider, for simplicity, a setup with $d = 3$ uncertain inputs.

Figure 1: Visual representation of our framework for an example with $d = 3$ uniform uncertain inputs $\theta = (\theta_1, \theta_2, \theta_3) \in [0, 1]^3$, such that $\theta_3$ is the most important, $\theta_1$ is the second most important, and $\theta_2$ is significantly less important than the other two. Moreover, $\theta_1$ and $\theta_3$ interact rather strongly. We begin in the upper right corner by prescribing $2^d - 1 = 2^3 - 1 = 7$ tolerances. In the following, we describe an adaptive step in the latter stages of the refinement process. Going from top right to top left, we compute the sensitivity scores of the candidate subspaces for refinement (active set, depicted in green). If the imposed tolerances are not reached, we have non-zero scores and therefore refine the subspace with the largest score (the subspaces added upon refinement are depicted in purple). We see that the multi-index set $\mathcal{L}$ (top center) reflects the structure of the underlying problem: most multi-indices are in the $\ell_3$ and $\ell_1$ directions (the two most important), as well as in the $(\ell_1, \ell_3)$ plane, which is due to the strong interaction between $\theta_1$ and $\theta_3$. In the current refinement step, the algorithm adds three subspaces in the second (the least important) direction. The corresponding grid, depicted to the left of the multi-index set, contains only 21 points. This low number of grid points, and hence of necessary high-fidelity simulations, is key to enabling UQ and SA at scale. The three refinement subspaces contain three points, indexed by 19, 20, and 21. We next evaluate the high-fidelity model at these three refinement points. In the case of large-scale applications, these simulations often require supercomputers (bottom left). Note that these evaluations are independent and hence can be executed in parallel. In addition, if the underlying high-fidelity code is executed in parallel as well, we can have multiple layers of parallelism. We then determine the three corresponding outputs of interest, denoted by $y_{\mathrm{hi\text{-}fi}}$. Subsequently, we calculate the sensitivity scores of the three refinement subspaces. If we have non-zero scores, the refinement process continues. Otherwise, it ends here, and we can compute statistics (such as expected values and variances of outputs) and local or interaction sensitivity indices. Moreover, our approach intrinsically provides a reduced model $y_{\mathrm{SG}}$ of the parameters-to-output-of-interest mapping $y_{\mathrm{hi\text{-}fi}}$, which can be used in further studies (bottom right).

Simulations of turbulence in the near-edge region of fusion devices

To demonstrate the capabilities of our sensitivity-driven approach, we employ it to study turbulent transport in magnetic confinement devices, such as tokamaks or stellarators. This is a paradigmatic example of a situation in which UQ and SA are clearly called for, but most standard approaches are infeasible due to the large computational cost of the associated simulations. The experimental error bars of various input parameters (such as the spatial gradients of the density and temperature of a given plasma species) can be relatively large, on the order of a few tens of percent, which makes the SA task especially valuable, since understanding the impact of these uncertainties is critical. Moreover, ascertaining the impact of variations in parameters that characterize the confining magnetic field is crucial as well (e.g., for design optimization). As a practically relevant example of such simulations, we focus here on the near-edge region of fusion experiments, which is recognized as crucial for setting the overall performance of these devices. To achieve core temperatures and densities which are sufficiently high to yield self-heated ('burning') plasmas in future power plants, it is necessary to create and sustain a region of steep gradients in the edge, known as the pedestal; see Figure 2.

Figure 2: From a magnetic confinement device to near-edge turbulence simulations (left-most image courtesy of EUROfusion). A hot hydrogen plasma is confined in a doughnut-like shape with the aid of strong magnetic fields. However, the magnetic confinement is not perfect: turbulent fluctuations driven by micro-instabilities cause heat losses from the hot core towards the colder edge (second figure from the left). In so-called high-confinement (H-mode) discharges, one can induce the formation of a near-edge region characterized by reduced transport and steep gradients (third figure from the left). The properties of this pedestal are influenced by the residual turbulent transport, which can be calculated, e.g., by means of the Gene code (right-most figure). Performing UQ and SA in this context is a vital yet nearly impossible task with standard methods due to the enormous computational cost of each simulation together with the large set of input parameters.

The formation of the pedestal is an extremely complex process, and its understanding is still incomplete at this point. A known key element in this context is plasma turbulence, which develops within the pedestal due to the very large spatial changes of density and temperature, inducing turbulent transport and hence contributing to its self-regulation. An important driver for this kind of dynamics is a plasma micro-instability called the Electron Temperature Gradient (ETG) mode, which tends to operate on sub-mm scales in planes perpendicular to the background magnetic field. Quantifying the impact of ETG turbulence on the pedestal structure is of high practical relevance, as it can aid in the design of configurations with improved energy confinement.

We consider a numerical setup modeling a specific pedestal of the DIII-D tokamak and investigate the edge plasma at a normalized radius of $\rho = 0.95$. Simulations with the gyrokinetic turbulence code Gene [21] show that ETG modes are the main drivers of turbulent transport under these conditions. The employed grid in five-dimensional position-velocity space consists of $256 \times 24 \times 168 \times 32 \times 8 = 264{,}241{,}152$ points. The simulations are performed using 16 compute nodes, i.e., a total of 896 cores, on the Frontera supercomputer at the Texas Advanced Computing Center at The University of Texas at Austin. With this setup, the average run time exceeds 8,000 core-hours; the smallest run time was about 4,000 core-hours, while the largest exceeded 14,000 core-hours. We refer the reader to the Methods section for more details about the employed gyrokinetic model and simulation code Gene.
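As an illustration of the non-intrusive coupling described above, the sketch below (hypothetical names; not the actual Gene/UQ interface) wraps an expensive simulation as a black-box function and evaluates the refinement points proposed in each adaptive step in parallel, with the stopping decision driven by the sensitivity tolerances.

```python
from concurrent.futures import ProcessPoolExecutor

def run_simulation(theta):
    """Hypothetical wrapper: write theta into the code's parameter file,
    submit the high-fidelity run, and parse the scalar output of interest."""
    raise NotImplementedError

def adaptive_loop(initial_points, propose_points, tolerances_reached):
    """Outer loop: evaluate the black box at each step's refinement points
    (independent runs, hence an extra layer of parallelism) until the
    prescribed tolerances are reached."""
    evaluated = {}
    points = list(initial_points)
    while points:
        with ProcessPoolExecutor() as pool:
            for theta, y in zip(points, pool.map(run_simulation, points)):
                evaluated[theta] = y
        if tolerances_reached(evaluated):
            break
        points = propose_points(evaluated)   # at most d new points per step
    return evaluated
```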
The experimental parameters necessary to determine the transport levels caused by ETG modes are the spatially local values of the electron temperature $T_e$ (I) and density $n_e$ (II), together with their normalized inverse scale-lengths $\omega_{T_e}$ (III) and $\omega_{n_e}$ (IV). We also consider the electron-to-ion temperature ratio $\tau$ (V) and account for the effects of plasma impurities via an effective ion charge $Z_{\mathrm{eff}}$ (VI). Basic properties of the magnetic geometry are characterized via the safety factor $q$ (VII) and the magnetic shear $\hat{s}$ (VIII). This leaves us with a total of eight uncertain parameters, summarized in Table 1. We model these inputs as uniform random variables with left and right bounds (columns three and four of Table 1) as follows: the first two parameters are varied by 10% around their nominal value (the center/mean of the uniform interval, shown in the second column), whereas the remaining six inputs are varied by 20% around their nominal values. The Gene output is the electron heat flux calculated over a sufficiently long time interval to collect statistics (see the Methods section for more details). From this, we compute the corresponding time-averaged radial heat flux across a given magnetic surface, denoted by $Q_{\mathrm{hi\text{-}fi}}$ and measured in Megawatts (MW), which represents the output of interest in our simulations. Our main goal in the following is to show that our sensitivity-driven approach enables efficient UQ and SA in these simulations, which would otherwise be impossible with standard methods.
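For concreteness, the following short sketch encodes the input specification just described (the first two parameters varied by 10%, the remaining six by 20% around their nominal values); the nominal values themselves are placeholders, since Table 1 is not reproduced here.

```python
# Placeholder nominal values; the actual values are given in Table 1.
nominal = {"T_e": 1.0, "n_e": 1.0, "omega_Te": 1.0, "omega_ne": 1.0,
           "tau": 1.0, "Z_eff": 1.0, "q": 1.0, "s_hat": 1.0}
variation = {"T_e": 0.10, "n_e": 0.10}                     # +/- 10%
variation.update({p: 0.20 for p in ["omega_Te", "omega_ne",
                                    "tau", "Z_eff", "q", "s_hat"]})

# Uniform random variables with left/right bounds around the nominal values.
bounds = {p: (nominal[p] * (1 - variation[p]), nominal[p] * (1 + variation[p]))
          for p in nominal}
for p, (lo, hi) in bounds.items():
    print(f"{p}: uniform on [{lo:.3f}, {hi:.3f}]")
```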
Accurate uncertainty propagation and sensitivity analysis at a cost of only 57 high-fidelity simulations

The employed tolerances in our sensitivity-driven approach are $\mathrm{tol} = 10^{-4} \times \mathbf{1}_{63}$, which were sufficiently small for our purposes. Here, $\mathbf{1}_{63}$ denotes a vector with 63 unit entries, and $63 = 2^8 - 1$ is the total number of sensitivity indices used in each refinement step. Remarkably, our approach required only 57 high-fidelity simulations to reach the prescribed tolerances. This low number is due to the ability of the sensitivity-driven approach to explore and exploit the facts that, as depicted in Figure 3, (i) only four of the total of eight uncertain parameters are important, with two parameters, the density and temperature scale-lengths, being significantly more important than the remaining six parameters, and (ii) the important parameters interact anisotropically. These findings are consistent with generic qualitative expectations. The remarkable feat of the sensitivity-driven approach is that it is able to learn and exploit this inherent structure automatically and non-intrusively.

To put the cost of our approach in terms of high-fidelity simulations into perspective, a standard full tensor grid-based method with only three points per dimension, e.g., one for the center of the domain and two others for the extrema (assuming that the input density is symmetric and has bounded support), entails a total of $3^8 = 6{,}561$ high-fidelity simulations. While this number might not be considered high for model problems, it would require roughly 53 million core-hours for the scenario under consideration, which is computationally prohibitive. In contrast, our adaptive sensitivity-driven approach required roughly 460,000 core-hours in total, i.e., a factor of 115 less, which is computationally feasible. In general, in large-scale applications it is unrealistic to perform more than a handful of runs, which is on par with what our sensitivity-driven approach typically requires.

For the UQ analysis, we obtain a mean heat flux estimate $\mathbb{E}[Q_{\mathrm{hi\text{-}fi}}] \approx 0.7530$ MW and a variance estimate $\mathrm{Var}[Q_{\mathrm{hi\text{-}fi}}] \approx 0.2571$ MW$^2$. The local sensitivity indices as well as all interactions are visualized in Figure 3. We note that being able to compute the sensitivities of the parameter interactions goes beyond what standard UQ and SA techniques used in plasma physics research offer. We visualize these interactions in the four pie charts in Figure 3. We see that the four important parameters interact with each other, the strongest interaction being between the two scale-lengths. In the four pie charts, we visualize all non-negligible interactions ($> 0.5\%$), which exist between pairs of the four important parameters. We denote the sensitivity index due to the interaction between two parameters $\theta_1$ and $\theta_2$ by $S_{\theta_1, \theta_2}$, noting that it is symmetric by definition, i.e., $S_{\theta_1, \theta_2} = S_{\theta_2, \theta_1}$. Moreover, we denote by $S_{\theta, \mathrm{others}}$ all remaining interactions involving a parameter $\theta$. The strongest interaction occurs between the two most important individual parameters, $\omega_{T_e}$ and $\omega_{n_e}$, with $S_{\omega_{T_e}, \omega_{n_e}} = 0.049$. The second strongest interaction is between $\omega_{T_e}$ and $\tau$, with $S_{\omega_{T_e}, \tau} = 0.011$. All other interactions have indices smaller than 0.01. Our approach is able to exploit this lower dimensionality and anisotropic coupling of the uncertain inputs online, via our sensitivity-driven refinement policy, which is key to it requiring a total of only 57 simulations for UQ and SA.

We additionally obtain, at no extra computational cost, an interpolation-based reduced model of the parameters-to-(time-averaged)-heat-flux mapping. This reduced model can be further used, e.g., in an optimization procedure or as a low-fidelity model in a subsequent multi-fidelity [26] UQ study at, e.g., a neighboring radial position. However, our goal in the following is to ascertain the efficiency of this reduced model, to further emphasize the generality and efficacy of our sensitivity-driven dimension-adaptive sparse grid interpolation approach in the nonlinear scenario under consideration. We draw 32 pseudo-random test samples from the eight-dimensional uniform input density and use them to evaluate both the high-fidelity nonlinear model and our reduced sparse grid interpolation model. Note that the large computational cost of the high-fidelity model prohibits using a large number of testing samples. Nonetheless, as visualized in Figure 4, the heat fluxes at the 32 testing parameter sets vary broadly, from roughly 0.1 to 2.6 MW, indicating that the testing samples are well distributed in the input domain. We denote here the time-averaged heat flux computed with the reduced model by $Q_{\mathrm{SG}}$. The results are displayed in Figure 4. We see that the values yielded by the reduced model closely match the high-fidelity values, suggesting that the reduced model is accurate. To quantify the accuracy of the reduced model, we compute its mean-squared error (MSE), which confirms that it is indeed accurate.
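The validation procedure just described amounts to drawing test samples from the eight-dimensional uniform density and comparing the two models pointwise; a minimal sketch with placeholder bounds and hypothetical array names is given below.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8
lower = np.zeros(d)     # placeholder per-parameter lower bounds (cf. Table 1)
upper = np.ones(d)      # placeholder per-parameter upper bounds
# 32 pseudo-random test samples from the 8-dimensional uniform input density.
test_samples = lower + (upper - lower) * rng.random((32, d))

def mse(Q_hifi, Q_sg):
    """Mean-squared error between high-fidelity and reduced-model heat fluxes."""
    Q_hifi, Q_sg = np.asarray(Q_hifi), np.asarray(Q_sg)
    return float(np.mean((Q_hifi - Q_sg) ** 2))
```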
Moreover, the average evaluation cost of the reduced model is $c_{\mathrm{SG}} = 9.4046 \times 10^{-3}$ seconds, i.e., nine orders of magnitude smaller than the average evaluation cost of the high-fidelity model. We therefore conclude that the obtained reduced model, constructed at a cost of only 57 high-fidelity evaluations, is also extremely efficient.

It is well established that UQ and SA are essential in the context of accurate, predictive numerical simulations of real-world systems. However, in many cases, the computational cost of a single simulation tends to be large, making UQ and SA very challenging and rendering most standard methods computationally infeasible. In the present article, we have demonstrated that these challenges can be overcome with the help of our recently formulated sensitivity-driven dimension-adaptive sparse grid interpolation framework, which enables UQ and SA at scale. It does so by exploring and exploiting, via sensitivity-driven adaptivity, the structure of the underlying problem in terms of lower intrinsic dimensionality and anisotropic coupling of the uncertain input parameters. The framework is fully non-intrusive and requires only the capability to prescribe the input parameters in the underlying high-fidelity code and to read the value of the output of interest. Furthermore, the method is generic and applicable to a broad spectrum of problems, and it can be used on a wide range of computing systems, from laptops to high-performance supercomputers. Examples of practically relevant applications that can benefit from our framework include, but are not limited to, tsunami or earthquake simulations, climate modeling, combustion in rocket engines, or models of epidemic dynamics.

Here, we have demonstrated the power and usefulness of our framework in the context of magnetic confinement fusion research. Our focus was on the challenging and practically relevant question of the nature of turbulent transport in the edge region of tokamak devices. The present study was the first of its kind, applying multi-dimensional UQ and SA to first-principles-based turbulence simulations of fusion plasmas. In a scenario comprising more than 264 million degrees of freedom and eight uncertain parameters, for which a single simulation required (on average) more than 8,000 CPU hours on 896 compute cores on the Frontera supercomputer, our framework required a mere total of 57 high-fidelity simulations for UQ and SA. In addition, it intrinsically provided an accurate interpolation-based reduced model of the parameters-to-output-of-interest mapping that was nine orders of magnitude cheaper than the high-fidelity model. This reduced model can be further used, for example, for optimization purposes or as a low-fidelity model in a subsequent multi-fidelity UQ study. Note that, in the context of the simpler linear gyrokinetic simulations concerning turbulence suppression by energetic particles, we have recently shown that the sensitivity-driven approach can be effectively used as a surrogate model for optimization [12] or as a low-fidelity model in a multi-fidelity study [22]. Since our method is based on globally defined interpolation polynomials, its main drawback is that it is generally not applicable to problems characterized by discontinuities or sharp gradients in the parameters-to-output-of-interest mapping. A possible remedy is an extension to basis functions with local support, e.g., wavelets, which is the subject of our ongoing research.
Methods

Let $f_{\mathrm{hi\text{-}fi}} : X \to Y$ denote the underlying high-fidelity model. The domain $X \subset \mathbb{R}^d$ denotes the set of the $d \in \mathbb{N}$ uncertain inputs $\theta = [\theta_1, \theta_2, \ldots, \theta_d]^T$ which parametrize the high-fidelity model. The scalar-valued output is $y = f_{\mathrm{hi\text{-}fi}}(\theta) \in Y \subset \mathbb{R}$. We note that the presented methodology can be trivially employed for multivariate outputs as well by, e.g., applying it to each output component separately. We assume that $f_{\mathrm{hi\text{-}fi}}$ is bounded and measurable with respect to the Lebesgue measure and the Borel $\sigma$-algebra on $\mathbb{R}$. We make use of the following two assumptions: (i) the set $X$ of uncertain inputs has a product structure, i.e., $X = \prod_{i=1}^{d} X_i$, and (ii) the input density $\pi$ has a product structure as well, that is, $\pi(\theta) = \prod_{i=1}^{d} \pi_i(\theta_i)$, where $X_i$ is the image of the univariate density $\pi_i$ associated with input $\theta_i$. In other words, the $d$ uncertain parameters are assumed to be independent. However, we note that this assumption can be relaxed by using, for example, a (possibly nonlinear) transformation that makes the inputs independent, such as a transport map [24].

Let $\boldsymbol{\ell} = (\ell_1, \ell_2, \ldots, \ell_d) \in \mathbb{N}^d$ denote a multi-index and let $\mathcal{L} \subset \mathbb{N}^d$ be a set of multi-indices. The $d$-variate sparse grid approximation of $f_{\mathrm{hi\text{-}fi}}$ is defined as

$$f_{\mathrm{hi\text{-}fi}}(\theta) \approx U^{\mathrm{SG}}_{\mathcal{L}}[f_{\mathrm{hi\text{-}fi}}](\theta) = \sum_{\boldsymbol{\ell} \in \mathcal{L}} \Delta_{\boldsymbol{\ell}}[f_{\mathrm{hi\text{-}fi}}](\theta), \qquad (1)$$

where $\Delta_{\boldsymbol{\ell}} = \bigotimes_{i=1}^{d} \Delta^{i}_{\ell_i}$ are the so-called hierarchical surpluses, with

$$\Delta^{i}_{\ell_i} = U^{i}_{\ell_i} - U^{i}_{\ell_i - 1}, \qquad U^{i}_{0} := 0. \qquad (2)$$

The multi-index set $\mathcal{L}$ needs to allow for all terms in the hierarchical surpluses (2) to be computed. Such a multi-index set is called admissible or downward closed, i.e., it does not contain "holes". To fully specify the sparse grid approximation (1), we need three ingredients: (i) the one-dimensional approximation operators $U^{i}_{\ell_i}$, $i = 1, 2, \ldots, d$; (ii) the grid points with which we compute these one-dimensional approximations; and (iii) the multi-index set $\mathcal{L}$. In our method, the underlying operation is Lagrange interpolation, implemented in terms of the barycentric formula for improved numerical stability. We note that our approach can also be employed for other approximation operations, such as spectral projection or quadrature. The interpolation points are weighted (L)-Leja points. For a continuous function $g : X_i \to \mathbb{R}$, we define the univariate interpolation polynomial $U^{i}_{\ell_i}$ associated with $\ell_i$ as

$$U^{i}_{\ell_i}[g](\theta) = \sum_{n=1}^{\ell_i} g(\theta_n) L_n(\theta), \qquad (3)$$

where $\{\theta_n\}_{n=1}^{\ell_i}$ are weighted (L)-Leja points computed with respect to the density $\pi_i$,

$$\theta_n = \operatorname*{arg\,max}_{\theta \in X_i} \; \pi_i(\theta) \prod_{m=1}^{n-1} |\theta - \theta_m|, \qquad n \ge 2, \qquad (4)$$

and $\{L_n(\theta)\}_{n=1}^{\ell_i}$ are Lagrange polynomials of degree $n - 1$ satisfying the interpolation condition $L_n(\theta_m) = \delta_{nm}$, where $\delta_{nm}$ is Kronecker's delta. When $\pi_i$ is a uniform density with support $[a, b] \subset \mathbb{R}$, as in our numerical experiments, we set $\theta_1 = (a + b)/2$. We employ (L)-Leja sequences for four main reasons. First, they are an interpolatory sequence, meaning that $n$ (L)-Leja points uniquely specify a polynomial of degree $n - 1$. Note that even though here we employ $\ell_i$ (L)-Leja points at level $\ell_i$, (L)-Leja sequences are arbitrarily granular, and therefore other growth strategies (with respect to $\ell_i$) can be employed as well. Second, (L)-Leja points are hierarchical, meaning that evaluations from previous levels can be reused. In our context, this means that adjacent levels differ by only one (L)-Leja point. Third, (L)-Leja sequences lead to accurate interpolation approximations [25]. These three properties make (L)-Leja sequences an excellent choice for higher-dimensional interpolation approximations. Finally, (L)-Leja points can be constructed for arbitrary probability densities.
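For illustration, the following sketch computes a univariate weighted Leja sequence by the greedy rule in (4) on a fine candidate grid; the discretized maximization and the weight convention are simplifying assumptions and not the authors' implementation.

```python
import numpy as np

def weighted_leja_points(n, a=-1.0, b=1.0, weight=lambda t: np.ones_like(t),
                         n_candidates=10_000):
    """Greedy sketch of a weighted Leja sequence on [a, b]: theta_n maximizes
    weight(theta) * prod_{m<n} |theta - theta_m| over a fine candidate grid.
    (The weight convention is an assumption; unweighted by default.)"""
    candidates = np.linspace(a, b, n_candidates)
    points = [0.5 * (a + b)]                 # theta_1: midpoint, as in the text
    for _ in range(n - 1):
        objective = weight(candidates).copy()
        for p in points:
            objective *= np.abs(candidates - p)
        points.append(candidates[np.argmax(objective)])
    return np.array(points)

print(weighted_leja_points(5))   # five (unweighted) Leja points on [-1, 1]
```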
Let $P_{\boldsymbol{\ell}}$ denote the set of all $d$-variate degrees $\mathbf{p} = (p_1, p_2, \ldots, p_d)$ with $0 \le p_i \le \ell_i - 1$. In addition, denote by $\theta_{\mathbf{p}} = (\theta_{p_1}, \theta_{p_2}, \ldots, \theta_{p_d})$ the $d$-variate (L)-Leja point associated with a $d$-variate degree $\mathbf{p}$. The multivariate extension of the interpolation approximation (3), associated with the multi-index $\boldsymbol{\ell}$, reads

$$U_{\boldsymbol{\ell}}[f_{\mathrm{hi\text{-}fi}}](\theta) = \left(\bigotimes_{i=1}^{d} U^{i}_{\ell_i}\right)[f_{\mathrm{hi\text{-}fi}}](\theta) = \sum_{\mathbf{p} \in P_{\boldsymbol{\ell}}} f_{\mathrm{hi\text{-}fi}}(\theta_{\mathbf{p}}) \prod_{i=1}^{d} L_{p_i + 1}(\theta_i), \qquad (5)$$

where the $d$-variate Lagrange basis polynomials are products of univariate Lagrange polynomials, which follows from the independence assumption of the uncertain inputs. To fully define the sparse grid interpolation approximation (1), we need to specify the third and most important ingredient, the multi-index set $\mathcal{L}$, which we determine online via our sensitivity-driven dimension-adaptive strategy. We begin by determining the equivalent spectral projection representation of the multivariate interpolation operators (5),

$$U_{\boldsymbol{\ell}}[f_{\mathrm{hi\text{-}fi}}](\theta) = \sum_{\mathbf{p} \in P_{\boldsymbol{\ell}}} c_{\mathbf{p}} \, \Phi_{\mathbf{p}}(\theta), \qquad (6)$$

where $\Phi_{\mathbf{p}}(\theta) = \prod_{i=1}^{d} \Phi_{p_i}(\theta_i)$ are orthonormal polynomials with respect to the input density $\pi$ and $c_{\mathbf{p}}$ are the corresponding spectral coefficients. For example, if $\pi$ is the uniform distribution, as in our numerical experiments, the $\Phi_{\mathbf{p}}(\theta)$ are Legendre polynomials. To determine the spectral coefficients $c_{\mathbf{p}}$, we simply solve the linear system given by the interpolation conditions $\sum_{\mathbf{p} \in P_{\boldsymbol{\ell}}} c_{\mathbf{p}} \Phi_{\mathbf{p}}(\theta_k) = f_{\mathrm{hi\text{-}fi}}(\theta_k)$ for all (L)-Leja points $\theta_k$ associated with the multi-index $\boldsymbol{\ell}$. Once we have determined the spectral coefficients $c_{\mathbf{p}}$, we can rewrite the hierarchical interpolation surpluses (2) in terms of hierarchical spectral projection surpluses,

$$\Delta_{\boldsymbol{\ell}}[f_{\mathrm{hi\text{-}fi}}](\theta) = \sum_{\mathbf{p} \in P_{\boldsymbol{\ell}}} \Delta c_{\mathbf{p}} \, \Phi_{\mathbf{p}}(\theta), \qquad (7)$$

with the convention $\Delta c_{\mathbf{0}} = c_{\mathbf{0}}$. We rewrite the hierarchical interpolation surpluses in terms of hierarchical projection surpluses (7) because the latter allow us to trivially compute the desired sensitivity information, which we use to drive the adaptive process. Specifically, from Parseval's theorem and the equivalence between spectral projections and Sobol' decompositions [34] introduced in reference [35], the $L^2$ norm of a surplus decomposes into contributions of the individual inputs and of their interactions, yielding the unnormalized Sobol' indices

$$\Delta \mathrm{Var}_{\mathbf{u}} = \sum_{\substack{\mathbf{p} \in P_{\boldsymbol{\ell}} \\ \mathrm{supp}(\mathbf{p}) = \mathbf{u}}} \Delta c_{\mathbf{p}}^{2}, \qquad \emptyset \neq \mathbf{u} \subseteq \{1, 2, \ldots, d\}, \qquad (8)$$

where $\mathrm{supp}(\mathbf{p}) = \{i : p_i > 0\}$. We prefer these unnormalized indices over their normalized counterparts because the variance contribution of a surplus can be arbitrarily small, in which case such a division would be close to the indeterminate operation $0/0$. We compute the equivalent hierarchical spectral projection surpluses (7) and their $L^2$ norms (8) for all candidate multi-indices for refinement. We then use the unnormalized Sobol' indices given by (8) to compute our refinement indicator, which is an integer $s \in \mathbb{N}$ referred to as the sensitivity score. Initially, $s = 0$. Since (8) comprises $2^d - 1$ unnormalized Sobol' indices in total, we prescribe $2^d - 1$ user-defined tolerances $\mathrm{tol} = (\mathrm{tol}_1, \mathrm{tol}_2, \ldots, \mathrm{tol}_{2^d - 1})$, which are a heuristic for the accuracy with which we want the algorithm to explore the $d$ individual directions and all their $2^d - d - 1$ interactions. We compare the $m$th term in (8) with $\mathrm{tol}_m$, and if this tolerance is exceeded, $s$ is increased by one. In other words, if an individual parameter or an interaction is important in a candidate subspace for refinement, as compared to the prescribed tolerance, the sensitivity score will reflect this information. Therefore, $s$ can be at most $2^d - 1$, which, however, is very unlikely in practice since usually only a handful of parameter interactions are important. In our experiments, we have used $\mathrm{tol} = 10^{-4} \times \mathbf{1}_{63}$, which proved sufficiently small for our purposes.
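A compact one-dimensional sketch of the two computational steps just described: obtaining spectral (Legendre) coefficients from the interpolation conditions at the interpolation points, and summing squared coefficients to obtain an unnormalized variance contribution. The helper names and the example points are illustrative; only the uniform-density Legendre normalization is assumed.

```python
import numpy as np
from numpy.polynomial.legendre import legvander

def spectral_coefficients(points, values):
    """Legendre coefficients of the polynomial interpolating (points, values)
    on [-1, 1], obtained by solving sum_p c_p * Phi_p(theta_k) = f(theta_k)."""
    n = len(points)
    V = legvander(np.asarray(points), n - 1)        # V[k, p] = P_p(theta_k)
    # Orthonormalize w.r.t. the uniform density on [-1, 1]: ||P_p||^2 = 1/(2p+1).
    V = V * np.sqrt(2 * np.arange(n) + 1)
    return np.linalg.solve(V, np.asarray(values))   # coefficients c_p

def unnormalized_variance(coeffs):
    """Parseval: the squared coefficients with p >= 1 measure the variance
    contributed by the corresponding directions (unnormalized Sobol' index)."""
    return float(np.sum(np.asarray(coeffs[1:]) ** 2))

# Example with four distinct points standing in for a Leja sequence.
pts = np.array([0.0, 1.0, -1.0, 0.5])
vals = np.sin(np.pi * pts)              # stand-in for high-fidelity outputs
c = spectral_coefficients(pts, vals)
print(unnormalized_variance(c))
```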
We note that when $d$ is large, $2^d - 1$ will be prohibitively large, making the computation of all $2^d - 1$ sensitivities in (8) infeasible. Nevertheless, as noted above, in most practical applications it is unlikely that interactions beyond a few, e.g., two or three, parameters are important, and therefore (8) can be truncated to comprise only these interactions. We refine the multi-index with the largest sensitivity score, noting that if two or more multi-indices have identical scores, we select the one with the largest variance contribution $\Delta \mathrm{Var}[f_{\mathrm{hi\text{-}fi}}]$. Refining a multi-index means that (i) it is moved to the old index set $\mathcal{O}$ and (ii) all its forward neighbours $\boldsymbol{\ell} + \boldsymbol{e}_i$ with $i = 1, 2, \ldots, d$ that keep $\mathcal{L}$ admissible are added to the active set $\mathcal{A}$. Note, therefore, that we can have at most $d$ refinement points per refinement step, whose corresponding high-fidelity evaluations can be performed embarrassingly parallel. The refinement ends either if the prescribed tolerances $\mathrm{tol}$ are reached, which is the desired scenario, if $\mathcal{A} = \emptyset$, or if a user-defined maximum level $L_{\max}$ is reached. In our experiments, we used $L_{\max} = 20$. Upon termination, we obtain the multi-index set $\mathcal{L}$, and therefore the approximation (1) is fully defined. This approximation intrinsically provides a reduced model for the parameters-to-output-of-interest mapping. In addition, our approach also provides statistics such as the expectation and variance of the output of interest, and sensitivity indices of individual inputs and interactions, computed as follows.

Let $N \in \mathbb{N}$ denote the cardinality of $\mathcal{L}$ and let $\boldsymbol{\ell}_m = (\ell_{m1}, \ell_{m2}, \ldots, \ell_{md})$ denote the $m$th multi-index in $\mathcal{L}$, with $\boldsymbol{\ell}_1 = (1, 1, \ldots, 1)$. The index $m$ is incremented as the adaptive refinement process adds multi-indices to $\mathcal{L}$, such that $\boldsymbol{\ell}_N$ is added last. Let $P_{\mathcal{L}} = \{\mathbf{p}_m := (\ell_{m1} - 1, \ell_{m2} - 1, \ldots, \ell_{md} - 1) : \boldsymbol{\ell}_m \in \mathcal{L}\}$ denote the set of multivariate degrees of the equivalent global spectral projection basis. We can rewrite the interpolation approximation (5) as

$$U^{\mathrm{SG}}_{\mathcal{L}}[f_{\mathrm{hi\text{-}fi}}](\theta) = \sum_{m=1}^{N} \Delta c_{\mathbf{p}_m} \, \Phi_{\mathbf{p}_m}(\theta),$$

where the spectral coefficients $\Delta c_{\mathbf{p}_m}$ are computed analogously to (7). To simplify the notation in the following, denote $P^{*}_{\mathcal{L}} = P_{\mathcal{L}} \setminus \{\mathbf{p}_1\}$. We estimate the expectation and variance of the high-fidelity model using these coefficients as [40]

$$\mathbb{E}[f_{\mathrm{hi\text{-}fi}}] \approx \Delta c_{\mathbf{p}_1}, \qquad \mathrm{Var}[f_{\mathrm{hi\text{-}fi}}] \approx \sum_{\mathbf{p}_m \in P^{*}_{\mathcal{L}}} \Delta c_{\mathbf{p}_m}^{2}.$$

In addition, the (normalized) Sobol' indices corresponding to individual parameters (local sensitivity indices) are computed as [31]

$$S_{\theta_i} = \frac{\sum_{\mathbf{p}_m \in J_i} \Delta c_{\mathbf{p}_m}^{2}}{\mathrm{Var}[f_{\mathrm{hi\text{-}fi}}]}, \qquad J_i = \{\mathbf{p}_m \in P^{*}_{\mathcal{L}} : p_{mk} = 0, \ \forall k \neq i\}.$$

Analogously, indices corresponding to interactions of two parameters $\theta_i$ and $\theta_j$ are computed as

$$S_{\theta_i, \theta_j} = \frac{\sum_{\mathbf{p}_m \in J_{ij}} \Delta c_{\mathbf{p}_m}^{2}}{\mathrm{Var}[f_{\mathrm{hi\text{-}fi}}]}, \qquad J_{ij} = \{\mathbf{p}_m \in P^{*}_{\mathcal{L}} : p_{mk} = 0, \ \forall k \notin \{i, j\}\},$$

and so forth for all other interactions.

Gyrokinetic theory [6] provides an efficient description of low-frequency, small-amplitude, small-scale turbulence in strongly magnetized plasmas. Here, the fast gyromotion is removed from the equations, and electrically charged particles are effectively replaced by respective rings which move in a weakly inhomogeneous background magnetic field and in the presence of electromagnetic perturbations. This process reduces the kinetic description of the plasma from six to five dimensions (three spatial and two velocity space coordinates of the gyrocenters) and, even more importantly, removes a number of extremely small but irrelevant spatio-temporal scales from the problem. In gyrokinetics, each plasma species $s$ is described by a distribution function $F_s(\mathbf{X}, v_\parallel, \mu, t)$ whose dynamics is governed by the following equation:

$$\frac{\partial F_s}{\partial t} + \dot{\mathbf{X}} \cdot \nabla F_s + \dot{v}_\parallel \frac{\partial F_s}{\partial v_\parallel} = C[F_s]. \qquad (9)$$

Here, $\mathbf{X}$ is the gyrocenter position, $v_\parallel$ is the velocity component parallel to the background magnetic field $\mathbf{B} = B\mathbf{b}$, and $\mu$ is the magnetic moment (a conserved quantity in the collisionless limit). $C$ denotes a collision operator describing inter- and intra-species interactions. In the numerical experiments carried out in the present work, we used a linearized Landau-Boltzmann collision operator. The corresponding equations of motion for a gyrocenter of a particle with mass $m$ and charge $q$ read

$$\dot{\mathbf{X}} = v_\parallel \mathbf{b} + \frac{B}{B^{*}_{\parallel}} \left( \mathbf{v}_{E} + \mathbf{v}_{\nabla B} + \mathbf{v}_{\kappa} \right), \qquad (10)$$

$$\dot{v}_\parallel = -\frac{\dot{\mathbf{X}}}{m v_\parallel} \cdot \left( q \nabla \bar{\phi} + q \frac{\partial \bar{A}_\parallel}{\partial t} \mathbf{b} + \mu \nabla B \right), \qquad (11)$$

where $\mathbf{v}_{\nabla B} = (\mu/(m \Omega B))\, \mathbf{B} \times \nabla B$ is the grad-B drift velocity, $\mathbf{v}_{\kappa} = (v_\parallel^{2}/\Omega)\, (\nabla \times \mathbf{b})_{\perp}$ is the curvature drift velocity, and $\mathbf{v}_{E} = (1/B^{2})\, \mathbf{B} \times \nabla(\bar{\phi} - v_\parallel \bar{A}_\parallel)$ is the generalized $E \times B$ drift velocity.
Here, $\Omega = qB/m$ is the gyrofrequency, and $B^{*}_{\parallel}$ is the parallel component of the effective magnetic field $\mathbf{B}^{*} = \mathbf{B} + \frac{B}{\Omega} v_\parallel \nabla \times \mathbf{b} + \nabla \times (\mathbf{b} \bar{A}_\parallel)$. Finally, $\bar{\phi}$ and $\bar{A}_\parallel$ are the gyroaveraged versions of the electrostatic potential and the parallel component of the vector potential, which are self-consistently computed from the distribution function. Assuming a static background distribution function $F_{B,s}(\mathbf{X}, v_\parallel, \mu)$, which allows for the decomposition $F_s(\mathbf{X}, v_\parallel, \mu, t) = F_{B,s}(\mathbf{X}, v_\parallel, \mu) + f_s(\mathbf{X}, v_\parallel, \mu, t)$, $\phi$ can be calculated via the Poisson equation, which, expressed at the particle position $\mathbf{x}$, reads

$$-\nabla \cdot \left( \sum_s \frac{m_s n_s}{B^{2}} \nabla_{\perp} \phi \right) = \sum_s q_s \int \bar{f}_s \,\mathrm{d}^{3} v, \qquad (12)$$

while $A_\parallel$ is obtained by solving the parallel component of Ampère's law for the fluctuation fields:

$$-\nabla_{\perp}^{2} A_\parallel = \mu_0 \sum_s q_s \int v_\parallel \bar{f}_s \,\mathrm{d}^{3} v. \qquad (13)$$

Expressions connecting $f_s(\mathbf{X}, v_\parallel, \mu)$ and $f_s(\mathbf{x}, v_\parallel, \mu)$ can be found in the literature [6]. Note that in these formulas, the time dependence has been suppressed for simplicity.

Our high-fidelity model of plasma turbulence is the Eulerian gyrokinetic code Gene [21], which solves the coupled system of equations (9), (12), and (13) on a fixed grid in five-dimensional phase space using a mix of spectral, finite difference, and finite volume methods. In order to adapt the simulation volume to the dominant influences of the background magnetic field, Gene employs a field-aligned [9] coordinate system $(x, y, z)$: $x$ and $y$ are the two directions perpendicular to the magnetic field, whereas $z$ parametrizes the position along $\mathbf{B}$. When simulating ETG turbulence, the scale separation between turbulent and ambient characteristic spatial lengths is usually large, allowing us to simulate only a relatively small volume around a given field line. This is the so-called flux-tube limit, which is widely used in studies of plasma turbulence, including the present work. Variations of the background fields in the directions perpendicular to the magnetic field across the simulation domain are neglected and replaced by constant gradients. Furthermore, periodic boundary conditions are used for both perpendicular directions $x$ and $y$, which are therefore treated numerically with spectral methods (we will refer in the following to the corresponding Fourier modes as $k_x$ and $k_y$). The magnetic geometry used for all our simulations has been constructed by modeling the DIII-D tokamak, using a Fourier decomposition of the magnetic surface at $\rho = 0.95$ of an experimental magnetohydrodynamic equilibrium. This allows us to specify the values of the safety factor $q$ (the number of toroidal transits per single poloidal transit a given field line winds around the torus) and the magnetic shear $\hat{s} = (x/q)\,(\mathrm{d}q/\mathrm{d}x)$ independently from the flux surface shape, which is kept fixed in all our numerical experiments. Plasma profiles are chosen to be representative of a pedestal. The absolute values of the electron density and temperature, $n_e$ and $T_e$, as well as the corresponding normalized inverse scale-lengths $\omega_{T_e} = -(R_0/T_e)\,(\mathrm{d}T_e/\mathrm{d}x)$ and $\omega_{n_e} = -(R_0/n_e)\,(\mathrm{d}n_e/\mathrm{d}x)$, have been considered uncertain. Here, $R_0 = 1.6$ m indicates the major radius of the tokamak. Collisionality is computed consistently with the plasma parameters, and we also include a non-unitary $Z_{\mathrm{eff}} = \sum_i n_i Z_i^{2} / n_e$, where the sum is over all ion species, to account for impurities. Similarly, the consistent value of $\beta_e = 2 \mu_0 n_e T_e / B_0^{2}$ is adopted to describe electromagnetic fluctuations.
Finally, we have considered a deuterium-electron plasma and assumed an adiabatic response for the ions, i.e., $n_{1,i} = -n_e q_i \phi / T_i$ and $j_{\parallel,i} = 0$, while retaining a non-unitary value of $\tau = Z_{\mathrm{eff}} T_e / T_i$, such that we need to simulate only the evolution of the electron distribution function. All high-fidelity runs have been carried out using a box with $n_{k_x} \times n_{k_y} \times n_z \times n_{v_\parallel} \times n_\mu = 256 \times 24 \times 168 \times 32 \times 8$ degrees of freedom. Moreover, we set $k_{y,\min} \rho_s = 7$ and $L_x \approx 2.7\,\rho_s$, where $\rho_s = c_s/\Omega_i$ is the ion sound radius and $c_s = \sqrt{T_e/m_i}$ is the ion sound speed. In velocity space, a box characterized by $-3 \le v_\parallel/v_{\mathrm{th}} \le 3$ and $0 \le \mu \le 9\, T_e/B_0$ has been used. Convergence tests have been performed, considering in particular all the extrema of the explored parameter space. For a given set of input parameters $\{n_e, T_e, \omega_{n_e}, \omega_{T_e}, q, \hat{s}, \tau, Z_{\mathrm{eff}}\}$, Gene explicitly evolves $f_e$ in time (considering a local Maxwellian for $F_B$), allowing turbulence to fully develop until reaching a quasi-steady state. The steady state of the system is simulated long enough to collect sufficient statistics, typically for a few $R_0/c_s$ units. Throughout the simulation, the turbulent heat flow is evaluated as

$$Q(t) = \int_{\partial S} \mathbf{q}(t) \cdot \mathrm{d}\boldsymbol{\Sigma},$$

where $\mathrm{d}\boldsymbol{\Sigma}$ is the surface element of the surface $\partial S$, in our case the flux surface at $\rho = 0.95$, and $\mathbf{q}(t)$ is the instantaneous energy flux induced by the generalized $E \times B$ advection, i.e., accounting for both electrostatic and electromagnetic perturbations:

$$\mathbf{q}(t) = \int \left( \frac{1}{2} m_e v_\parallel^{2} + \mu B \right) f_e \, \mathbf{v}_{E} \,\mathrm{d}^{3} v.$$

Simulated fluxes are averaged over the quasi-steady-state phase of the run, and the result provides the high-fidelity output of interest $Q_{\mathrm{hi\text{-}fi}}$ in our numerical experiments.
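As a small illustration of this last step, the sketch below averages a heat-flux time trace over the quasi-steady-state window; how $Q(t)$ is read from the Gene flux diagnostics is not specified here, the function name is hypothetical, and the example trace is synthetic.

```python
import numpy as np

def time_averaged_flux(t, Q, t_start):
    """Average the flux trace Q(t) over the quasi-steady window t >= t_start."""
    t, Q = np.asarray(t), np.asarray(Q)
    return Q[t >= t_start].mean()

# Example with synthetic data standing in for the Gene flux diagnostic output.
t = np.linspace(0.0, 20.0, 2001)              # time in R0/c_s units
Q = 0.75 + 0.1 * np.sin(5.0 * t)              # made-up quasi-steady trace (MW)
print(time_averaged_flux(t, Q, t_start=5.0))  # time-averaged heat flux
```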
References

[1] A multivariate statistical approach to predict COVID-19 count data with epidemiological interpretation and uncertainty quantification
[2] The need for uncertainty quantification in machine-assisted medical decision making
[3] Dynamic Programming
[4] A survey of projection-based model reduction methods for parametric dynamical systems
[5] Modeling and simulation
[6] Foundations of nonlinear gyrokinetic theory
[7] Extreme-scale UQ for Bayesian inverse problems governed by PDEs
[8] Bayesian inference of heterogeneous epidemic models: Application to COVID-19 spread accounting for long-term care facilities
[9] Flux coordinates and magnetic field structure: A guide to a fundamental tool of plasma theory
[10] Sensitivity-driven adaptive sparse stochastic approximations in plasma microinstability analysis
[11] Context-aware Model Hierarchies for Higher-dimensional Uncertainty Quantification (Dissertation)
[12] Turbulence suppression by energetic particles: A sensitivity-driven dimension-adaptive sparse grid framework for discharge optimization
[13] Computational challenges in magnetic-confinement fusion physics
[14] Gyrokinetic simulations of turbulent transport
[15] Dimension-adaptive tensor-product quadrature
[16] Stochastic Finite Elements: A Spectral Approach
[17] Learning physics-based models from data: Perspectives from inverse problems and model reduction
[18] Sensitivity analysis and uncertainty quantification on aluminum particle combustion for an upward burning solid rocket propellant
[19] Adaptive sparse grids
[20] Uncertainty quantification and propagation in computational materials science and simulation-assisted materials design. Integrating Materials and Manufacturing Innovation
[21] Electron temperature gradient driven turbulence
[22] Data-driven low-fidelity models for multi-fidelity Monte Carlo sampling in plasma micro-turbulence analysis
[23] A stochastic Newton MCMC method for large-scale statistical inverse problems with application to seismic inversion
[24] Sampling via Measure Transport: An Introduction
[25] Adaptive Leja sparse grid constructions for stochastic collocation and high-dimensional approximation
[26] Survey of multifidelity methods in uncertainty propagation, inference, and optimization
[27] Embedded ensemble propagation for improving performance, portability, and scalability of uncertainty quantification on emerging computational architectures
[28] Uncertainty quantification in climate modeling and projection
[29] Quantification of uncertainty in computational fluid dynamics
[30] Sensitivity Analysis
[31] Global Sensitivity Analysis: The Primer
[32] High performance uncertainty quantification with parallelized multilevel Markov chain Monte Carlo
[33] Uncertainty Quantification: Theory, Implementation, and Applications. Society for Industrial and Applied Mathematics (SIAM)
[34] Global sensitivity indices for nonlinear mathematical models and their Monte Carlo estimates
[35] Global sensitivity analysis using polynomial chaos expansions
[36] Uncertainty quantification in epidemiological models for the COVID-19 pandemic
[37] Extreme scale multi-physics simulations of the tsunamigenic 2004 Sumatra megathrust earthquake
[38] The imperative of physics-based modeling and inverse theory in computational science
[39] Computational medicine: Translating models to clinical care
[40] Numerical Methods for Stochastic Computations: A Spectral Method Approach

Acknowledgements

All authors have been supported by the Exascale Computing Project (No. 17-SC-20-SC), a collaborative effort of the U.S. Department of Energy Office of Science and the National Nuclear Security Administration. The authors also gratefully acknowledge the compute and data resources provided by the Texas Advanced Computing Center at The University of Texas at Austin (https://www.tacc.utexas.edu).