Optimal Regime-Switching Density Forecasts

Graziano Moramarco

2021-10-26

Abstract: This paper proposes an approach for enhancing density forecasts of non-normal macroeconomic variables using Bayesian Markov-switching models. Alternative views about economic regimes are combined to produce flexible forecasts, which are optimized with respect to standard objective functions of density forecasting. The optimization procedure explores both forecast combinations and Bayesian model averaging. In an application to U.S. GDP growth, the approach is shown to achieve good accuracy in terms of average predictive densities and to produce well-calibrated forecast distributions. The proposed framework can be used to evaluate the contribution of economists' views to density forecast performance. In the empirical application, we consider views derived from the Fed macroeconomic scenarios used for bank stress tests.

In recent years, it has become essential for forecasting institutions to characterize the uncertainty around their point forecasts by assigning probabilities to a range of possible economic outcomes. Accordingly, generating economic predictions in the form of continuous probability distributions, or density forecasts, is now common practice (Elliott and Timmermann 2016). The task of forming reliable density forecasts for macroeconomic variables is a challenging one, which requires accounting for the departures from normality that are often observed empirically. In this respect, econometric research has shown that gains in density forecast performance can often be achieved by combining different predictive distributions (Hall and Mitchell 2007).

At the same time, as the global financial crisis and the COVID-19 crisis have highlighted, macroeconomic projections should in general allow for the possibility of abrupt changes or regime shifts occurring in the economy, whether they be outbreaks of financial instability, political changes or pandemics. Relatedly, while many economic agents, such as financial institutions, routinely evaluate their potential losses as random draws from continuous distributions, macroeconomic outlooks are often reduced to a limited number of distinct scenarios or regimes (e.g., Moody's 2017). This logic facilitates communication regarding economic uncertainty and finds important practical applications, e.g., in the design of bank stress tests, which are now an integral part of the financial regulatory framework and of risk management practices in major economies (e.g., Federal Reserve 2018). The specific characteristics of different economic regimes are themselves subject to uncertainty, and a great deal of qualitative assessment is generally required to define macro scenarios, giving rise to different views or beliefs that may be considered when producing density forecasts.

This paper develops an approach to enhance density forecasts for macroeconomic variables using regime-switching models. In this approach, density forecasts are constructed by pooling alternative assumptions (views) on economic regimes or scenarios. The composition of such forecasts is optimized with respect to standard evaluation criteria for density forecasts, such as the log predictive score and a test of uniformity for probability integral transforms (PITs).
Views differ in terms of the assumed number of (unobserved) regimes and/or in terms of priors on the parameters governing the economy under different regimes. Two pooling methods are explored: ex-post combinations of density forecasts from different views, and Bayesian averaging of views. Based on the past performance of forecasts, an optimization procedure selects forecast weights or Bayesian prior probabilities to be used for forecasting future periods. The resulting mixture forecasts are evaluated and compared to alternative approaches by means of a recursive out-of-sample forecasting exercise. Empirically, the approach is illustrated using a Markov-switching autoregressive model (MSAR) for U.S. GDP growth, considering both vague views and strong views derived from the Fed macroeconomic scenarios used in the 2015-2018 bank stress tests. In the application, the approach is found to be especially useful for improving the calibration of forecast distributions. In this respect, it outperforms a number of alternative approaches by generating PITs that are well-behaved according to several criteria. At the same time, the proposed method achieves good accuracy in terms of log scores, in line with the best alternative methods.

The paper relates to the literature on optimal density forecast combinations (Hall and Mitchell 2007, Geweke and Amisano 2011, Ganics 2017). In fact, the proposed approach can be thought of as a convenient alternative to forecast combinations of different models, since it combines views on a single Markov-switching model. Given its ability to produce highly flexible approximations of unknown distributions by means of finite mixtures of normals, it can also be seen as a parsimonious alternative to nonparametric methods. (One such nonparametric approach deals with non-normality by producing extremely flexible regimes using a Dirichlet process; however, when estimating a model for U.S. GDP growth with different breaks for the mean and variance parameters, the authors find that the posterior probability that the number of regimes is at most 5 lies between 98% and 100% for the mean parameters and between 74% and 100% for the variance, depending on the prior used for estimation.) In addition, the approach differs from approaches that assume non-normal errors (e.g., Hansen 1994), in that it allows for a clear economic explanation of non-normality based on different macroeconomic regimes. While the available evidence on the point forecast performance of regime-switching models is mixed, this paper focuses on their usefulness for density forecasting.

Density forecasts can be evaluated using several criteria (see Corradi and Swanson 2006 and Elliott and Timmermann 2016 for reviews). This paper adopts two of the most popular criteria as objective functions to build optimal composite forecasts. The first one is the log score, which measures the ability to assign high probabilities to outcomes that are truly likely to be observed. The second one is a uniformity test on the sequence of PITs, which provides a measure of the calibration of the forecasts. (A well-calibrated forecast is one that does not make systematic errors: if p is the predicted probability assigned to a given random event, then that event should empirically occur with frequency p.) Both measures have been used to compute forecast combinations. Hall and Mitchell (2007) pioneered density forecast combinations using log scores. Geweke and Amisano (2011) use log scores to combine five different models of stock returns. Ganics (2017) provides theoretical results on the use of PITs for optimal forecast combinations and presents an empirical application using linear autoregressive distributed lag (ARDL) models of industrial production. Finally, to evaluate the results, two other measures of correct calibration are also considered, namely two tests of independence based on the first two moments of the PITs (Rossi and Sekhposyan 2014).
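To make the two criteria concrete, the following Python sketch computes the sum of log scores, the PITs, and a Kolmogorov-Smirnov uniformity test for a toy sequence of one-step normal density forecasts; all variable names are illustrative and not from the paper.

```python
import numpy as np
from scipy.stats import kstest, norm

# Toy illustration: one-step-ahead normal density forecasts with
# means mu[t] and standard deviations sd[t], and realizations y[t].
rng = np.random.default_rng(1)
T = 200
mu, sd = np.zeros(T), np.ones(T)
y = rng.normal(mu, sd)  # here the forecasts coincide with the true model

log_scores = norm.logpdf(y, mu, sd)           # log predictive density at each outcome
pits = norm.cdf(y, mu, sd)                    # probability integral transforms
ks_stat, ks_pval = kstest(pits, "uniform")    # PITs should be i.i.d. U(0,1) if well calibrated

print(f"sum of log scores: {log_scores.sum():.1f}, KS p-value: {ks_pval:.2f}")
```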
The remainder of the paper is organized as follows: Section 2 explains the methodology, Section 3 introduces the empirical application and presents the results, and Section 4 concludes.

The MSAR model takes the form:

y_t = β_{S_t} + Σ_{j=1}^p α_j y_{t-j} + ε_t,  ε_t ~ N(0, σ²_{S_t}),   (1)

where S_t is the unobserved state variable at time t, β_{S_t} is the intercept in regime S_t, α_j for j = 1, ..., p is a state-independent autoregressive coefficient, p is the maximum lag, ε_t is the error term and σ²_{S_t} is the regime-dependent variance of the error. In particular, S_t is a Markov chain characterized by a transition matrix ξ, where the element ξ_{kj} in row k and column j represents the probability of transition from state k to state j:

ξ_{kj} = Pr(S_{t+1} = j | S_t = k),   (2)

with k, j = 1, ..., K, where K is the number of regimes in the economy. Therefore, the MSAR captures the typical autocorrelation of macro variables in two ways: by means of the autoregressive coefficients in (1) and through the persistence in the state variable S_t as expressed by the transition matrix. Finally, let ϑ denote the vector of parameters of the MSAR model, i.e. ϑ = (β_1, ..., β_K, σ_1, ..., σ_K, α_1, ..., α_p, ξ), and let θ denote the same vector excluding the transition matrix, i.e. θ = (β_1, ..., β_K, σ_1, ..., σ_K, α_1, ..., α_p).

This section summarizes the Bayesian approach to the estimation of Markov-switching models following Frühwirth-Schnatter (2006) and adopting her notation. Let us define y = (y_0, y_1, ..., y_T) and S = (S_0, S_1, ..., S_T). The posterior distribution p(ϑ|y) for model (1) is obtained using Bayes' theorem:

p(ϑ|y) ∝ p(y|ϑ) p(ϑ),

where p(ϑ) is the prior on the parameters and p(y|ϑ) is the likelihood function, which in this case is a Markov mixture of normals. Treating S as data, the Markov mixture likelihood can be expressed as the sum of the complete-data likelihood p(y, S|ϑ) over all possible values of the state vector S:

p(y|ϑ) = Σ_S p(y, S|ϑ).

In practice, Bayesian estimation samples from the joint posterior p(S, ϑ|y) using data augmentation: the sampler alternates between drawing the state vector S from p(S|ϑ, y) and drawing the parameters ϑ from p(ϑ|S, y).

In line with the estimation framework presented so far, the MSAR (1) is estimated here using MCMC methods and assuming independence priors of the following form:

p(ϑ) = p(ξ) ∏_{k=1}^K p(β_k) ∏_{k=1}^K p(σ²_k) ∏_{j=1}^p p(α_j).

The priors follow conventional distributions, which are:

β_k ~ N(b_{0,k}, B_{0,k}),  σ²_k ~ G⁻¹(c_0, C_0),  α_j ~ N(a_{j,0}, A_{j,0}),

where N and G⁻¹ denote Normal and inverse Gamma distributions, respectively, and b_{0,k}, B_{0,k}, c_0, C_0, a_{j,0}, A_{j,0} are hyperparameters to be selected by the researcher. In addition, for the transition matrix ξ it is assumed that the rows are independent and each row follows a Dirichlet distribution D:

ξ_{k·} ~ D(e_{k1}, ..., e_{kK}),

where e_{k1}, ..., e_{kK} are hyperparameters, for k = 1, ..., K.
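As an illustration of the estimation approach summarized above, here is a minimal Python sketch of a Gibbs sampler with data augmentation for a simplified Markov-switching model with regime-specific intercept and variance but no autoregressive terms. It is a sketch, not the paper's implementation: the paper's MSAR also draws the α coefficients and places a Gamma hyper-prior on C_0, both omitted here for brevity, and all names and hyperparameter values are illustrative.

```python
import numpy as np
from scipy.stats import invgamma, dirichlet

def gibbs_ms(y, K, n_iter=2000, burn=1000, seed=0):
    """Minimal Gibbs sampler for y_t = beta_{S_t} + eps_t, eps_t ~ N(0, sig2_{S_t}),
    with priors beta_k ~ N(b0, B0), sig2_k ~ InvGamma(c0, C0), Dirichlet rows of xi.
    Label switching and identification are ignored for simplicity."""
    rng = np.random.default_rng(seed)
    T = len(y)
    b0, B0, c0, C0 = 0.0, 1.0, 3.0, 0.5               # illustrative hyperparameters
    e = np.full((K, K), 1.0 / max(K - 1, 1))          # e_kj = 1/(K-1) off-diagonal,
    np.fill_diagonal(e, 2.0)                          # e_kk = 2, as in the paper
    beta = rng.normal(y.mean(), y.std(), K)
    sig2 = np.full(K, y.var())
    xi = np.full((K, K), 1.0 / K)
    draws = []
    for it in range(n_iter):
        # 1. Forward filter: P(S_t = k | y_0..t, parameters)
        pred = np.full(K, 1.0 / K)                    # flat initial distribution
        filt = np.zeros((T, K))
        for t in range(T):
            lik = np.exp(-0.5 * (y[t] - beta) ** 2 / sig2) / np.sqrt(2 * np.pi * sig2)
            post = pred * lik + 1e-300                # guard against underflow
            filt[t] = post / post.sum()
            pred = filt[t] @ xi
        # 2. Backward sampling of the state path S (data augmentation step)
        S = np.zeros(T, dtype=int)
        S[-1] = rng.choice(K, p=filt[-1])
        for t in range(T - 2, -1, -1):
            p = filt[t] * xi[:, S[t + 1]]
            S[t] = rng.choice(K, p=p / p.sum())
        # 3. Draw parameters given the sampled states (conjugate updates)
        for k in range(K):
            yk = y[S == k]
            n = len(yk)
            Bn = 1.0 / (1.0 / B0 + n / sig2[k])
            bn = Bn * (b0 / B0 + yk.sum() / sig2[k])
            beta[k] = rng.normal(bn, np.sqrt(Bn))
            sig2[k] = invgamma.rvs(c0 + n / 2,
                                   scale=C0 + 0.5 * ((yk - beta[k]) ** 2).sum(),
                                   random_state=rng)
        # 4. Draw rows of the transition matrix from their Dirichlet posteriors
        for k in range(K):
            counts = np.array([np.sum((S[:-1] == k) & (S[1:] == j)) for j in range(K)])
            xi[k] = dirichlet.rvs(e[k] + counts, random_state=rng)[0]
        if it >= burn:
            draws.append((beta.copy(), sig2.copy(), xi.copy(), S[-1]))
    return draws
```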
The number of regimes is also treated as unknown. Accordingly, a discrete prior is defined for K, fixing a maximum number K̄:

π(K),  K = 1, ..., K̄.   (14)

Note that the letter π will be used throughout the text to denote discrete probability distributions. Next, for any given number of states K, a number P_K of alternative priors on the MSAR parameters are considered. Each prior, denoted ϑ⁰_{K,i} with i = 1, ..., P_K, is identified by a specific set of values for the hyperparameters and is assigned a conditional prior probability:

π(ϑ⁰_{K,i} | K),  i = 1, ..., P_K.   (15)

In other words, a discrete hierarchical prior is defined with respect to ϑ. The unconditional prior probability of ϑ⁰_{K,i} is equal to the joint prior probability of ϑ⁰_{K,i} and the number K of regimes, i.e. π(ϑ⁰_{K,i}) = π(ϑ⁰_{K,i}, K). Using π⁰_{K,i} to denote this unconditional probability, we have that:

π⁰_{K,i} = π(ϑ⁰_{K,i} | K) π(K).   (16)

In what follows, let us refer to ϑ⁰_{K,i} as a view about the regime-switching properties of the economy. Thus, defining a view implies (i) choosing the number of regimes and (ii) choosing a prior for the MSAR parameters ϑ. Also, let π⁰ denote the vector of length Σ_{K=1}^{K̄} P_K containing the unconditional prior probabilities of all views, i.e. π⁰ = (π⁰_{1,1}, ..., π⁰_{K̄,P_{K̄}}).

The posterior probabilities of the views depend on the prior π⁰ and on the marginal likelihood of the MSAR model under the different views. In particular, the posterior probability for view ϑ⁰_{K,i} is equal to the joint posterior probability of ϑ⁰_{K,i} and the number K of regimes, i.e. π(ϑ⁰_{K,i}|y) = π(ϑ⁰_{K,i}, K|y), and is given by:

π(ϑ⁰_{K,i}|y) = p(y|ϑ⁰_{K,i}) π⁰_{K,i} / [ Σ_{K'=1}^{K̄} Σ_{i'=1}^{P_{K'}} p(y|ϑ⁰_{K',i'}) π⁰_{K',i'} ],   (17)

where p(y|ϑ⁰_{K,i}) is the marginal likelihood of the MSAR model under view ϑ⁰_{K,i}.

Computing density forecasts from a MSAR model requires three steps. In what follows, let us add a time subscript to the vector of observations y, so that y_t = (y_0, y_1, ..., y_t). Also, let us assume that the current time period is T and the forecast horizon is one period. The first step consists in using the MCMC algorithm to sample both the current unobserved regime S_T and the MSAR parameters ϑ from the posterior distribution p(S, ϑ|y_T); let (ϑ^(d), S_T^(d)), d = 1, ..., D, denote the resulting draws. Second, for each draw, the future regime S_{T+1}^(d) is computed using the matrix of transition probabilities ξ^(d), i.e. based on (2). Third, for each draw, y_{T+1}^(d) is sampled from the conditional distribution implied by regime S_{T+1}^(d), yielding a sample from the predictive density p(y_{T+1}|y_T).

In particular, conditional on knowing the state of the economy in the future period T+1, the predictive distribution of y_{T+1} is a Normal for any given parameter vector. However, since the future state of the economy is unknown, the density forecast of y_{T+1} produced by the MSAR will be a mixture of the different regime-specific normals, where the mixture weights are given by the probabilities of the economy ending up in the different possible regimes at T+1. As a result, the MSAR is generally able to produce highly flexible, non-normal forecast distributions. Also, the predictive densities are non-linear in y_T and heteroskedastic (Frühwirth-Schnatter 2006). In addition, Bayesian estimation incorporates the uncertainty on the parameters ϑ into the density forecasts. What is more, considering alternative views allows for an additional degree of flexibility, as formalized below.

Assuming a known number of regimes K and a known parameter vector ϑ, the one-step-ahead density forecast at time T is the following finite mixture of K normal components:

p(y_{T+1} | y_T, ϑ) = Σ_{k=1}^K Pr(S_{T+1} = k | y_T, ϑ) f_N(y_{T+1}; β_k + Σ_{j=1}^p α_j y_{T+1-j}, σ²_k),

where f_N(·; m, s²) denotes a normal density with mean m and variance s². Next, as a result of Bayesian estimation, the density forecast for any given view integrates out parameter uncertainty:

p(y_{T+1} | y_T, ϑ⁰_{K,i}) = ∫ p(y_{T+1} | y_T, ϑ_K) p(ϑ_K | y_T, ϑ⁰_{K,i}) dϑ_K,

where, as before, ϑ_K denotes the parameter vector when K regimes are assumed. Finally, averaging over different views ϑ⁰_{K,i}, we get:

p(y_{T+1} | y_T) = Σ_{K=1}^{K̄} Σ_{i=1}^{P_K} π_{K,i} p(y_{T+1} | y_T, ϑ⁰_{K,i}),   (21)

where π_{K,i} = π(ϑ⁰_{K,i}|y_T) depends on the prior probability vector π⁰ and on the marginal likelihoods of the different views according to equation (17). Forecast (21) is a composite forecast in which the weight assigned to the view-specific forecast p(y_{T+1}|y_T, ϑ⁰_{K,i}) is given by the posterior probability of the view, π_{K,i}. Therefore, (21) is a mixture of mixtures. If we take the set of alternative views as given, the forecast combination weights are unambiguously pinned down by the data y_T and by the prior vector π⁰.
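To illustrate the three forecasting steps, the following sketch turns posterior draws into a sample from the one-step-ahead predictive density. It assumes, hypothetically, that each draw also carries the AR coefficients α (the estimation sketch above omits them); the mixture-of-normals character of the forecast emerges because different draws land in different regimes.

```python
import numpy as np

def msar_one_step_forecast(draws, y_hist, rng=None):
    """Step 1 (MCMC sampling of (theta, S_T)) is assumed done: each element of
    `draws` is a tuple (beta, alpha, sig2, xi, S_T). Steps 2 and 3 simulate
    S_{T+1} and y_{T+1}, giving a sample from p(y_{T+1} | y_T)."""
    rng = rng or np.random.default_rng()
    sample = []
    for beta, alpha, sig2, xi, S_T in draws:
        # Step 2: draw the future regime from the transition probabilities (eq. 2)
        S_next = rng.choice(len(beta), p=xi[S_T])
        # Step 3: draw y_{T+1} from the regime-specific conditional normal
        p = len(alpha)
        mean = beta[S_next] + alpha @ y_hist[-p:][::-1]  # lag 1 aligns with alpha[0]
        sample.append(rng.normal(mean, np.sqrt(sig2[S_next])))
    return np.asarray(sample)
```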
In addition to the Bayesian averaging of views in (21), let us also consider standard non-Bayesian forecast combinations. In this case, a forecast combination of different MSAR views, where the vector of combination weights is denoted by w, can be expressed as:

p(y_{T+1} | y_T; w) = Σ_{K=1}^{K̄} Σ_{i=1}^{P_K} w_{K,i} p(y_{T+1} | y_T, ϑ⁰_{K,i}),   (22)

where w_{K,i} ≥ 0 is the weight assigned to view ϑ⁰_{K,i} and Σ_{K=1}^{K̄} Σ_{i=1}^{P_K} w_{K,i} = 1.

The composite density forecasts from the MSAR with multiple views are optimized with respect to two alternative objective functions, based on statistics that are commonly used to evaluate density forecast performance: the log score and the probability integral transform (PIT).

The log score is the log of the predictive density function evaluated at the actual realization of the forecast variable. Let y^o_{t+h} (where "o" stands for "observed") denote the realization of variable y at time t+h, which is not observed at time t, when the forecast for t+h is produced. Also, let R be the length of the timespan over which forecasts are optimized. The first objective function, denoted by f_1, is given by the sum of log scores over the period of interest. For combinations using generic weights w as in (22), the sum of log scores at time τ can be expressed as:

f_1(w; τ) = Σ_{t=τ-h-R+1}^{τ-h} log[ Σ_{K=1}^{K̄} Σ_{i=1}^{P_K} w_{K,i} p(y^o_{t+h} | y_t, ϑ⁰_{K,i}) ].

For combined forecasts using Bayesian averaging as in (21), the objective function can be written as:

f_1(π⁰; τ) = Σ_{t=τ-h-R+1}^{τ-h} log[ Σ_{K=1}^{K̄} Σ_{i=1}^{P_K} π(ϑ⁰_{K,i} | y_t) p(y^o_{t+h} | y_t, ϑ⁰_{K,i}) ].

The PIT is the cumulative predictive density function evaluated at the actual realization of the variable. If the density forecast used to compute the PIT corresponds to the true distribution of the variable, then, for h = 1, the PIT values are the realizations of independently and identically distributed (i.i.d.) Uniform(0,1) variables (Diebold et al. 1998). Therefore, a uniformity test on the PITs can be seen as a test of correct specification of the density forecasts (see also Rossi and Sekhposyan 2014). Accordingly, the second objective function for forecasts of type (22) is given by:

f_2(w; τ) = −ks( {Φ_w(y^o_{t+h} | y_t)}_{t=τ-h-R+1}^{τ-h} ),

where Φ_w(·) denotes the cumulative predictive density function of the combined forecast, i.e.

Φ_w(y^o_{t+h} | y_t) = Σ_{K=1}^{K̄} Σ_{i=1}^{P_K} w_{K,i} Φ(y^o_{t+h} | y_t, ϑ⁰_{K,i}),

while the function ks(·) represents the test statistic of a Kolmogorov-Smirnov (KS) test of uniformity. Maximizing −ks(·) is equivalent to maximizing the p-value of the KS test. Analogously, for Bayesian averaging the PITs are computed from forecasts of type (21) and the resulting objective f_2(π⁰; τ) is maximized with respect to the prior π⁰.

Both the optimization based on f_1 and the one based on f_2 are solved numerically. For each f_i, with i = 1, 2, the optimization algorithm delivers two vectors at time τ: the vector of optimal forecast weights w*_{i,τ} for the set of alternative views, i.e.:

w*_{i,τ} = argmax_w f_i(w; τ)  subject to w_{K,j} ≥ 0 and Σ_{K,j} w_{K,j} = 1,   (28)

and the vector of optimal prior probabilities π⁰*_{i,τ}:

π⁰*_{i,τ} = argmax_{π⁰} f_i(π⁰; τ)  subject to the analogous non-negativity and adding-up constraints.   (29)

The former represents the typical problem explored in the literature on density forecast combination, whereas the latter can be seen as an empirical method for eliciting priors in the context of Bayesian model averaging. The optimal prior π⁰*_{i,τ} represents the discrete prior probability distribution of views such that the resulting posterior π*_{i,τ}, when used as a vector of forecast weights, maximizes the density forecast performance, based on the selected objective function. In practice, the main difference between (28) and (29) is that the first problem directly delivers weights for forecast combination, while in the second case the actual forecast weights also depend on the marginal likelihoods of all views, i.e. p(y|ϑ⁰_{K,i}) for all K, i.
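The two optimization problems can be sketched numerically as follows; this is a minimal illustration and not the paper's code. The hypothetical arrays dens[t, v] and cdfs[t, v] are assumed to hold the view-specific predictive densities and CDFs evaluated at the realizations over the last R periods. Note that the KS statistic is not smooth in w, so in practice a derivative-free optimizer or multiple starting points may be preferable.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import kstest

def neg_sum_log_scores(w, dens):
    # f1 with the sign flipped: log of the combined density at each realization
    return -np.sum(np.log(dens @ w))

def ks_stat(w, cdfs):
    # f2 with the sign flipped: the combined PIT is the weighted sum of view CDFs
    return kstest(cdfs @ w, "uniform").statistic

def optimal_weights(objective, data, n_views):
    cons = ({"type": "eq", "fun": lambda w: w.sum() - 1.0},)  # weights sum to one
    bounds = [(0.0, 1.0)] * n_views                           # and are non-negative
    w0 = np.full(n_views, 1.0 / n_views)                      # start from equal weights
    return minimize(objective, w0, args=(data,), bounds=bounds, constraints=cons).x

# usage (with the hypothetical arrays described above):
# w1 = optimal_weights(neg_sum_log_scores, dens, dens.shape[1])  # eq. (28) with f1
# w2 = optimal_weights(ks_stat, cdfs, cdfs.shape[1])             # eq. (28) with f2
```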
This section assesses the empirical performance of the approach proposed in the paper. The application deals with density forecasts of U.S. real GDP growth and uses quarterly data from 1948Q1 to 2017Q2 (Figure 1).

Let us first consider the Fed-based views. For each of the four stress tests under consideration, two views are constructed, one with K = 3 and the other with K = 5. In the view with K = 3, one of the regimes (which may be called the "normal times" regime) is derived from the Fed baseline scenario, another (the "adverse" regime) from the adverse scenario and the last one (the "severely adverse" regime) from the severely adverse scenario. In particular, each regime is "centered" on the corresponding scenario using the following rule. Consider an AR(5) model whose coefficients are given by the regime-k-specific hyperparameters of the prior ϑ⁰_{K,i}, i.e.:

y_t = b_{0,k} + Σ_{j=1}^5 a_{j,0} y_{t-j} + ε_t.

In this model, the unconditional expectation of y_t is μ_k = b_{0,k} / (1 − Σ_{j=1}^5 a_{j,0}). Then, after making an assumption on the state-independent autoregressive hyperparameters a_{j,0}, the prior mean of the regime-specific intercept is chosen so that μ_k matches the average GDP growth rate in the corresponding Fed scenario:

b_{0,k} = μ_k (1 − Σ_{j=1}^5 a_{j,0}).   (31)

The four stress test-based views with K = 5 expand the views with K = 3 by adding two regimes: a regime which we may call "recovery from adverse shock", designed to match the last 4 quarters of the adverse scenario, and a regime of "recovery from severely adverse shock", which matches the last 4 quarters of the severely adverse scenario. This is done in consideration of the fact that growth rates in the last 4 quarters of the adverse and severely adverse scenarios are assumed to be higher than the baseline rates, implying a rebound of the economy after a negative shock. Of course, such regimes may be more generally interpreted as "favorable regimes" characterized by positive shocks and not necessarily as recoveries from recessions.

In the five vague views, all priors on the intercepts are centered on 0 and have a variance of 1 percentage point, while the priors on the autoregressive coefficients are centered on 0.5 for the first lag and on 0 for the higher-order lags, and have a variance of 1. The combination of these assumptions implies a large prior variance on the regime-specific means of the GDP growth rate. In the Fed-based views, the priors for both β and α are strongly informative, so as to ensure that the regime-specific means are tightly centered on the stress test values, based on equation (31). In particular, both priors are assumed to have minimal variance, equal to 10⁻⁵. For the autoregressive coefficients α, the prior mean is assumed to be 0.9 for the first lag and, as for the vague views, 0 for higher-order lags.

No strong assumption is made regarding the regime-switching error variance σ²_k. Instead, a diffuse hierarchical prior is assumed for all views. Specifically, a Gamma hyper-prior is defined for C_0:

C_0 ~ G(g_0, G_0),

so that the independence prior of the MSAR model becomes p(α_1, ..., α_p, β_1, ..., β_K, σ²_1, ..., σ²_K, ξ, C_0) = p(C_0) p(ξ) ∏_j p(α_j) ∏_k p(β_k) ∏_k p(σ²_k | C_0). To make the prior on σ²_k diffuse, the following values are selected for the hyperparameters: c_0 = 3, g_0 = 0.5 and G_0 = 0.5. These imply that σ²_k has a prior expected value of 0.5 percentage points of GDP and a high prior variance of 1.25 percentage points (see the Appendix for the derivations).

Finally, the hyperparameters for the k-th row of the transition matrix ξ are e_kk = 2 and e_kj = 1/(K−1) if k ≠ j, for all k, j. Given the properties of the Dirichlet distribution, E(ξ_kj) = e_kj / Σ_{l=1}^K e_kl. Therefore, the prior expected probability of remaining in the same state k in the next period is E(ξ_kk) = 2/3 regardless of the number of regimes K, while the prior expected probability of moving to a different, specific state j decreases with the number of regimes. The summary of the alternative views is provided in Table 1.
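As a quick numerical illustration of the centering rule (31), suppose a severely adverse scenario with an average growth rate of −2.5 percent is used to pin down the intercept prior mean of the corresponding regime (the scenario value here is purely illustrative, not taken from the Fed scenarios):

```python
a = [0.9, 0.0, 0.0, 0.0, 0.0]   # prior means of the state-independent AR coefficients
mu_k = -2.5                     # average growth rate in the scenario (illustrative)
b_0k = mu_k * (1 - sum(a))      # equation (31): intercept prior mean for regime k
print(b_0k)                     # -0.25, since the unconditional mean is b/(1 - sum(a))
```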
In the empirical application, a recursive-window estimation scheme is used to generate a sequence of density forecasts. Next, the forecasts are used to carry out the optimization of weights/priors, which is iterated over time. The procedure can be described as follows. Let us assume that we are at time T_w and the forecast horizon is h. For each view under consideration, the MSAR model is recursively estimated using observations between time t_0 and time t, with t = T_0, T_0 + 1, ..., T_w − h. T_0 is therefore the end period of the shortest estimation sample. Estimates at T_0 are used to make forecasts for period T_0 + h, estimates at T_0 + 1 are used to make forecasts for T_0 + 1 + h, and so on. At time T_w, a sequence of past forecasts is available for each view. At this point, the algorithm computes the optimal weights/priors based on the last R forecasts, i.e. maximizes the relevant objective function between T_w − R + 1 and T_w. Once the optimal weights/priors are retrieved, they are used to combine the different view-specific forecasts for the future period T_w + h, which is out of the optimization sample. When the actual value of the variable of interest is observed, at time T_w + h, the performance of the composite forecast is measured. The index T_w runs from T_0 + h + R − 1 to T + h, where T is the end of the largest estimation sample and T + 2h is the last available observation for the target variable. Therefore, the period from T_0 + 2h + R − 1 to T + 2h defines the evaluation sample. Figure 2 summarizes the procedure, which closely follows Ganics (2017).

More specifically, the application to U.S. GDP growth sets t_0 = 1948Q1, T_0 = 1967Q4, R = 40 quarters, h = 1 quarter and T = 2016Q4. Accordingly, the evaluation sample runs from 1978Q1 to 2017Q2. The main results hold true if we set R = 20. The MSAR model is estimated using the MATLAB package bayesf Version 2.0 by Frühwirth-Schnatter (2008); for each MSAR estimate, the MCMC algorithm uses 1000 iterations as burn-in and 1000 iterations to store the results, and a complete probability density function is fitted to the sample of forecasts produced by the MCMC algorithm using standard kernel methods.

Table 3 shows the performance of the optimal forecast weights and optimal priors over the evaluation sample and compares it with five benchmark approaches. The first approach simply uses a linear AR(5) model, corresponding to view no. 1 in Table 1. The second approach uses an AR model estimated on rolling windows of 80 quarters to accommodate time-varying parameters (using rolling windows of 40 quarters gives similar results). The third approach produces forecasts using the individual view that exhibits the highest marginal likelihood, selected recursively across estimation windows. The remaining two approaches consider uniform combination schemes for the alternative views, assigning respectively equal forecast weights and equal prior probabilities to different values of K and, given K, equal weights/probabilities to the alternative views defined using K regimes. (For instance, in the case of equal prior probabilities, it is assumed that π(K) = 1/K̄ for each K and that π(ϑ⁰_{K,i}|K) = 1/P_K for each view ϑ⁰_{K,i}; see (14) and (15).)

As mentioned in Section 2.4, weights w*_1 and priors π*_1 result from the optimization taking the sum of log scores as objective function, while w*_2 and π*_2 are obtained by maximizing the p-value of the Kolmogorov-Smirnov (KS) test of uniformity for the PITs. The table shows the average predictive density (APD), i.e. the average of the exponential of the log scores, and the p-value of the KS test. Besides, two additional measures of correct specification of density forecasts are taken into consideration, namely the p-values of the Ljung-Box tests of serial independence for the first and second moments of the PITs (LB1 and LB2; see Rossi and Sekhposyan 2014).
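A simple implementation of the two independence checks, in the spirit of the description above, applies Ljung-Box tests to the demeaned PITs and to their squares. This is a sketch assuming a recent statsmodels version, in which acorr_ljungbox returns a DataFrame; the lag choice is illustrative and not taken from the paper.

```python
import numpy as np
from statsmodels.stats.diagnostic import acorr_ljungbox

def pit_independence_pvalues(pits, lags=4):
    """LB1: serial independence of the first moment of the PITs.
    LB2: serial independence of the second moment (squared demeaned PITs)."""
    z = np.asarray(pits) - np.mean(pits)
    lb1 = acorr_ljungbox(z, lags=[lags])["lb_pvalue"].iloc[0]
    lb2 = acorr_ljungbox(z ** 2, lags=[lags])["lb_pvalue"].iloc[0]
    return lb1, lb2
```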
The main result is that optimized regime-switching composite forecasts achieve well-behaved PITs, unlike all benchmarks considered. The optimization step generates substantial improvements in density forecast performance as measured by the uniformity of the PITs. As can be seen from Table 3, using the optimal priors π*_2 and the optimal weights w*_2 results in the highest p-values in the KS test of PIT uniformity, 0.32 and 0.21 respectively, while also ensuring that both tests of independence of the PITs do not reject the null hypothesis. By contrast, the recursively estimated linear AR, the two uniform weighting schemes and the approach using the view with the highest marginal likelihood all lead to rejection of the null of uniformity at the 5% level. The AR model estimated on a rolling window gives a p-value of 10% in the KS test, but strongly rejects the serial independence of the second moment of the PITs. In general, for all MSAR-based forecasts the null of independence cannot be rejected, whereas in the case of the linear AR model the independence of the second moment is rejected regardless of the estimation scheme. Interestingly, the weights w*_1 and the priors π*_1 both lead to increases in the KS p-value relative to uniform combinations, even though they are optimized using the log scores as objective function.

Second, the optimization step appears less useful for producing gains in terms of log scores. The APDs of the log-score-optimized forecasts are higher than those achieved by the recursive-window AR, the rolling-window AR and equal forecast weights, but are roughly the same as those obtained by using uniform prior probabilities or by recursively selecting the view with the highest marginal likelihood. Moreover, using the sum of log scores as objective function results in only small increases in APD compared to using the KS statistic. Overall, the comparatively good accuracy in terms of APDs appears to be driven more by the Markov-switching model than by the optimization procedure.

To summarize, optimizing the combinations of views enhances the calibration of density forecasts in terms of PIT uniformity, i.e. improves the specification of the predictive distribution. This, combined with the regime-switching setup, leads to PITs that are not significantly different from i.i.d. uniform variables. At the same time, the approach is capable of producing results in terms of log-score accuracy that are roughly in line with the best ones across several benchmarks.

Turning to the contribution of the Fed-based views, the recursive selection of the view with the highest marginal likelihood (Table 3), which gives as high APDs as the log-score-optimized weights and priors, never selects any Fed-based views. When the PIT-based optimization is considered, the contribution of the Fed-based views is much higher. On average, they account for 33% of the combined forecasts in the case of optimal weights and over 20% in the case of optimal priors. In terms of w*_2, their cumulative weight exceeds 60% in 1982-1983, increases quite rapidly during the period 2007-2009 and remains steadily between 75% and 100% from 2009 to 2017. The Fed-based views also dominate in terms of optimized posteriors for most of the period 2008-2017. Their cumulative posterior probability has a first peak in 1983, while it remains close to zero from 1984 to 2008.
It is important to remark that using Fed-based views is not sufficient to achieve well-calibrated forecasts. None of these views, when considered individually, leads to non-rejection of the PIT uniformity hypothesis in the KS test. Instead, as already stressed, the combination of different views is what drives the good results in terms of calibration.

To evaluate the approach within the broader perspective of non-normal and heteroskedastic models, this section shows the density forecast performance of three alternative models: an AR with Student-t errors, an AR with ARCH errors and an AR with GARCH errors. The models have been estimated on both recursive windows and rolling windows of 40 and 80 quarters. As with the MSAR models, the lag length for the AR component is set to 5 for all three models, while the ARCH and GARCH components have a lag length of 1. For each model, Table 4 shows the APDs and the p-values for the KS, LB1 and LB2 tests over the same evaluation sample as in the previous section. When estimated on recursive windows, all three models generate non-uniform PITs and lower APDs than any MSAR-based method in Table 3. Their performance considerably improves when rolling windows are used, which accommodate structural instabilities. In particular, the AR with t errors achieves the highest APD (0.37) and generates PITs that do not reject the hypotheses of uniformity and independence in the first moment. Regarding independence in the second moment, the LB2 test rejects the null at the 5% level when the model is estimated on 80-quarter windows, whereas it does not reject the null at the 5% level but rejects it at the 10% level when the model is estimated on 40-quarter windows. The models with ARCH/GARCH errors always reject the hypothesis of second-moment independence and are generally outperformed by the MSAR-based methods in terms of APDs. The results suggest that, when the PIT optimization is used, the approach proposed in the paper is able to achieve a more reliable specification of the conditional predictive distribution, based on the joint indications offered by the KS, LB1 and LB2 tests. In terms of log-score accuracy, the approach produces results that are close to but below the best alternative, namely the AR model with Student-t errors estimated on rolling windows.

This paper has proposed a procedure for constructing reliable density forecasts of economic variables using a regime-switching model. The data that support the findings of this study are openly available in the Archival FRED database.

Figure 2 notes: The figure summarizes the density forecast optimization scheme. First, the MSAR model is recursively estimated on actual GDP data (dark blue bar) using alternative views. The sample start date is denoted with t_0, the end date runs from T_0 to T. For each sample window, the estimates generate density forecasts with horizon h (light blue bar). A rolling sequence of R forecasts is used to compute optimal forecast weights and prior probabilities (green bar) for the views. The optimal weights/priors obtained in each period are used to combine the view-specific forecasts for subsequent periods. The resulting composite forecasts (dark yellow bar) are evaluated by comparison with the actual data over the period from T_0 + 2h + R − 1 to T + 2h.

Table 3 notes: The table compares the optimal composite forecasts with alternative benchmark methods.
The optimal pools include log-score-based forecast combinations (optimal weights w*_1), log-score-based Bayesian averaging (optimal prior probabilities π*_1), PIT-based forecast combinations (optimal weights w*_2), where PIT stands for probability integral transform, and PIT-based Bayesian averaging (optimal prior probabilities π*_2). Please refer to Section 3.3 in the paper for further details on the forecasting methods compared here. APD denotes the average predictive density; KS denotes the p-value of the Kolmogorov-Smirnov test of uniformity of the PITs.

Figure notes: The weights (w*_1), plotted through 2017Q2, are obtained using the log-score-based optimization procedure described in the paper. The right panel plots the cumulative weight assigned to the views derived from Fed supervisory scenarios (views 6-13). See Table 1 for the list of views.

Figure notes: The weights (w*_2) are obtained using the PIT-based optimization procedure described in the paper, where PIT stands for probability integral transform. The right panel plots the cumulative weight assigned to the views derived from Fed supervisory scenarios (views 6-13). See Table 1 for the list of views.

Figure notes: The underlying prior probabilities π⁰*_1, plotted through 2017Q2, are obtained using the log-score-based optimization procedure described in the paper. The right panel plots the cumulative weight assigned to the views derived from Fed supervisory scenarios (views 6-13). See Table 1 for the list of views.

Figure notes: The underlying prior probabilities π⁰*_2, plotted through 2017Q2, are obtained using the PIT-based optimization procedure described in the paper, where PIT stands for probability integral transform. The right panel plots the cumulative weight assigned to the views derived from Fed supervisory scenarios (views 6-13). See Table 1 for the list of views.

Appendix. For σ²_k ~ G⁻¹(c_0, C_0) with C_0 ~ G(g_0, G_0), the properties of the Gamma and inverted Gamma distributions imply that:

E(σ²_k | C_0) = C_0 / (c_0 − 1),  Var(σ²_k | C_0) = C_0² / [(c_0 − 1)²(c_0 − 2)],
E(C_0) = g_0 / G_0,  Var(C_0) = g_0 / G_0².

Given the values for the hyperparameters, c_0 = 3, g_0 = 0.5 and G_0 = 0.5, it follows that:

E(σ²_k) = E(C_0) / (c_0 − 1) = 1/2 = 0.5,
Var(σ²_k) = E(Var(σ²_k | C_0)) + Var(E(σ²_k | C_0)) = E(C_0²) / [(c_0 − 1)²(c_0 − 2)] + Var(C_0) / (c_0 − 1)² = 3/4 + 2/4 = 1.25.

References

Financial conditions and density forecasts for US output and inflation.
Autoregressive Moving Average Infinite Hidden Markov-Switching Models.
Corradi, V. and Swanson, N.R. (2006). Predictive Density Evaluation.
Diebold, F.X., Gunther, T.A. and Tay, A.S. (1998). Evaluating density forecasts.
Elliott, G. and Timmermann, A. (2016). Economic Forecasting.
Federal Reserve (2015). 2015 Supervisory Scenarios for Annual Stress Tests Required under the Dodd-Frank Act Stress Testing Rules and the Capital Plan Rule.
Federal Reserve (2016). 2016 Supervisory Scenarios for Annual Stress Tests Required under the Dodd-Frank Act Stress Testing Rules and the Capital Plan Rule.
Federal Reserve (2017). 2017 Supervisory Scenarios for Annual Stress Tests Required under the Dodd-Frank Act Stress Testing Rules and the Capital Plan Rule.
Federal Reserve (2018). 2018 Supervisory Scenarios for Annual Stress Tests Required under the Dodd-Frank Act Stress Testing Rules and the Capital Plan Rule.
Frühwirth-Schnatter, S. (2006). Finite Mixture and Markov Switching Models.
Frühwirth-Schnatter, S. (2008). Finite Mixture and Markov Switching Models: Implementation in MATLAB using the package bayesf Version 2.0.
Ganics, G. (2017). Optimal Density Forecast Combinations.
Geweke, J. and Amisano, G. (2011). Optimal prediction pools.
Hall, S.G. and Mitchell, J. (2007). Combining density forecasts.
Hansen, B.E. (1994). Autoregressive conditional density estimation.
Hamilton, J.D. (1989). A new approach to the economic analysis of nonstationary time series and the business cycle.
Hamilton, J.D. (2016). Macroeconomic Regimes and Regime Shifts.
Moody's Analytics (2017). Macroeconomic Outlook Alternative Scenarios.
Pesaran, M.H., Pettenuzzo, D. and Timmermann, A. (2006). Forecasting time series subject to multiple structural breaks.
Rossi, B. and Sekhposyan, T. (2014). Evaluating predictive densities of US output growth and inflation in a large macroeconomic data set.
Determining the Severity of Macroeconomic Stress Scenarios. Federal Reserve Bank Supervisory Staff Reports.