Portfolio Tail Risk: A Multivariate Extreme Value Theory Approach
Miloš Božović
Entropy (Basel), 17 December 2020. DOI: 10.3390/e22121425

This paper develops a method for assessing portfolio tail risk based on extreme value theory. The technique applies separate estimations of univariate series and allows for closed-form expressions for Value at Risk and Expected Shortfall. Its forecasting ability is tested on a portfolio of U.S. stocks. The in-sample goodness-of-fit tests indicate that the proposed approach is better suited for portfolio risk modeling under extreme market movements than comparable multivariate parametric methods. Backtesting across multiple quantiles demonstrates that the model cannot be rejected at any reasonable level of significance, even when periods of stress are included. Numerical simulations corroborate the empirical results.

The events of the Global Financial Crisis, the Eurozone sovereign debt crisis, and the ongoing COVID-19 pandemic have increased investors' awareness of the need to assess the risk of extreme losses and to protect against future market meltdowns. To quantify financial risks, investors and regulators commonly resort to risk measures based on quantiles of distributions of losses or returns, such as Value at Risk (VaR) or Expected Shortfall (ES). However, the most significant problem with conventional approaches to assessing quantile-based risk measures is their poor performance in characterizing the "tail" behavior of the distribution of returns. Any model aiming to mitigate severe losses should specifically focus on the tail risk. The importance of tail behavior in risk assessment has been emphasized by many authors, such as Poon et al. [1], Rodriguez [2], Ning [3], Okimoto [4], Ning [5], Christoffersen et al. [6], Rocco [7], Gkillas et al. [8], and Gkillas et al. [9], and many distributional assumptions have been introduced so far to improve risk management and asset pricing practices (see, for instance, [10,11]).

Perhaps the most obvious alternative to the usual parametric approaches is Extreme Value Theory (EVT), since it concentrates on extreme losses. McNeil [12], Bensalah [13], Smith [14], Nyström and Skoglund [15], Embrechts et al. [16], and Daníelsson [17], among others, provide excellent methodological overviews of EVT and discuss its applications in risk modeling. McNeil [18], Nyström and Skoglund [15], Harmantzis et al. [19], Marinelli et al. [20], and Lai and Wu [21] show that EVT-based models outperform VaR and ES estimates based on other analytical methods. The discrepancy becomes more pronounced at more extreme quantiles. The EVT approach seems like a natural choice for risk modeling: its implementation is relatively easy, and the method rests on only a few assumptions required for the asymptotics to work.

Although straightforward in the univariate case, EVT becomes of limited practical use when applied to a portfolio of financial instruments or several risk factors at once. Investors usually deal with many risk factors simultaneously, and hence an appropriate method requires a multivariate approach rather than a univariate one based on portfolio-level returns. As shown by Longin and Solnik [22], Campbell et al. [23], and many authors since, multivariate modeling is particularly vital during bear markets, when risk factors exhibit extreme co-movements.
It also allows for risk attribution and additive decomposition of VaR, which is essential for risk management and economic capital allocation in financial institutions.

Defining portfolio-level dynamics of risk factors under extreme market conditions has so far been a daunting task. A seemingly obvious technique involves a multivariate version of EVT based on multidimensional limiting relations (see, for example, [14,24–27]). However, the model complexity then increases faster than linearly with the number of risk factors. Alternatively, the joint distribution of returns can be seen as a product of marginal distributions and a copula. McNeil and Frey [28] and Nyström and Skoglund [29], for example, describe the copula approach to the assessment of the extreme co-dependence structure of risk factors. The copula-based methods introduce an additional model risk, inherent in the assumption of a specific analytical form of the co-dependence function. In addition, they become quite intractable with an increase in dimensionality. Moreover, a typical copula method for multivariate EVT, such as the one described in Nyström and Skoglund [29], requires an additional simulation step to retrieve the residuals from the joint distribution, given the fitted marginals and parameters of the copula. The dimensionality problem in the multivariate setting can be tackled by vine pair copulas, as in Aas et al. [30], Chollete et al. [31], Yu et al. [32], or Trucíos et al. [33].

This paper introduces a multivariate EVT method for risk assessment based on separate estimations of the univariate model. Instead of estimating joint n-dimensional distributions (using multidimensional limiting relations, copulas, or otherwise), the method proposed here works with n orthogonal series that are approximately independent and identically distributed. These series are obtained from the eigenvectors of the joint return series, filtered to remove autocorrelation and heteroskedasticity. The filtering can be achieved, for instance, by assuming that the conditional covariance matrix follows a stationary n-dimensional model from the orthogonal GARCH family, as in Alexander [34] or van der Weide [35]. The tails of the n independent univariate series of residuals are then fitted to a generalized Pareto distribution, allowing for closed-form expressions for VaR and ES.

As an illustration, the technique is applied to a sequence of daily returns on stocks comprising the Dow Jones Industrial Average. For an out-of-sample analysis, daily VaR and ES estimates are compared to the actual portfolio losses over a more extended period, covering the extreme market co-movements during the last quarter of 2008. The results indicate that the method performs well in capturing extreme events jointly across the entire portfolio. The in-sample tests show that the proposed approach is better suited for portfolio risk modeling under extreme market movements than comparable multivariate parametric methods, such as orthogonal GARCH with normal or t-distribution. In addition, backtesting across multiple quantiles demonstrates that the method cannot be rejected at any reasonable level of significance, even for out-of-sample windows that contain stress periods.

This paper contributes to the literature in several important ways. To the best of our knowledge, it is the first method that extends EVT to the multivariate case by combining the relative simplicity of univariate EVT and orthogonalization of return residuals.
By doing so, we can capture tail correlations and extreme co-movements. The proposed method is universal, as it can be applied to any asset class, and it can incorporate any volatility model from the GARCH family.

The remainder of the paper is organized as follows: Section 2 presents the theoretical background behind the proposed multivariate EVT approach and the estimation methodology used in this paper. Section 3 describes the data, provides an example of VaR and ES estimation, and compares the method's in-sample forecasting ability with other similar approaches. Section 4 illustrates the out-of-sample performance of the method through backtesting. Section 5 demonstrates the model performance on a set of simulated data. Concluding remarks are given in Section 6.

Two important theorems are related to the univariate EVT. The first one shows that suitably normalized maxima (extreme order statistics) of independent and identically distributed (i.i.d.) random variables converge in distribution to the generalized extreme value (GEV) distribution [36,37]. The usual approach then is to fit the GEV distribution to the sequence of maxima of portfolio returns or their residuals, as described for example in Smith [14]. An alternative method is based on exceedances over a threshold. The following theorem, first stated by Pickands [38], gives the asymptotic form of the conditional distribution beyond a threshold.

Theorem 1. Let {Z_t} be a sequence of i.i.d. random variables with distribution function F, and define

\[ F_u(x) := \Pr\left(Z_t - u \le x \mid Z_t > u\right) = \frac{F(u + x) - F(u)}{1 - F(u)} \]

to be the distribution of exceedances of Z_t over the threshold u. Let z_F be the right endpoint of the distribution F, possibly a positive infinity. Then, if F is such that the GEV distribution is well-defined, there are constants ξ ∈ ℝ and β := β(u) ∈ ℝ₊ such that

\[ \lim_{u \to z_F} \; \sup_{0 \le x < z_F - u} \left| F_u(x) - G_{\xi,\beta}(x) \right| = 0, \]

where

\[ G_{\xi,\beta}(x) = \begin{cases} 1 - \left(1 + \xi x / \beta\right)_+^{-1/\xi}, & \xi \ne 0, \\ 1 - e^{-x/\beta}, & \xi = 0, \end{cases} \]

is the generalized Pareto (GP) distribution, while the subscript + denotes the positive part of a function, i.e., f_+(x) ≡ max(f(x), 0) for any function f.

The challenges associated with the implementation of EVT-based models are well known in the literature. Since Theorem 1 defines a limiting distribution for the extreme returns, we need a sufficient number of observations and a large enough threshold u. There are different methods of making this choice, and some of them are examined in Bensalah [13] and Gabaix et al. [39]. In addition, the limit theorems hold only if the extreme observations Z_t are i.i.d. Therefore, we cannot apply the results of univariate EVT to returns on financial assets directly without taking into account their autocorrelation and heteroskedasticity.

The method works in several steps. We start from a multivariate series of returns in a portfolio consisting of n assets. Next, we apply orthogonal GARCH to obtain a set of random variables that are approximately i.i.d. Then, we fit the tails of n independent univariate residual series to a generalized Pareto distribution to obtain the quantiles for each component. Finally, we substitute these quantiles into a formula for VaR and ES. Sections 2.2.1-2.2.6 describe the methodology in detail, while Section 2.2.7 provides its summary.

The ultimate goal is to adapt the EVT approach to a portfolio consisting of n assets in order to capture their tail risk and their extreme co-movements. Before applying any of the results of EVT outlined in Section 2.1, we have to construct a set of cross-sectionally uncorrelated random variables. A natural choice is to work with the eigenvectors of the unconditional covariance matrix of returns.
Definition 1. Define ε_t to be an n-dimensional random vector whose components ε_{t,i} have zero mean for each i = 1, 2, ..., n. Let V_∞ = E(ε_t ε_t') be the n-by-n unconditional covariance matrix of ε_t and P be the corresponding orthogonal matrix of normalized eigenvectors, where the prime symbol (') denotes the transpose of a matrix. The eigenvalue decomposition of V_∞ is given by

\[ V_\infty = P \Lambda P', \]

where Λ is a diagonal matrix of the eigenvalues of V_∞, ordered by descending values, and let L := P Λ^{1/2}. Then,

\[ z_t := L^{-1} \varepsilon_t \qquad (2) \]

is called the vector of principal components of ε_t, for any t. The i-th element of the vector z_t is called the i-th principal component of ε_t.

Note that E(z_t) = 0 and var(z_t) = 1_n, the n-by-n identity matrix, which follows from E(ε_t) = 0 and V_∞ = LL', respectively. Hence, the z_t are cross-sectionally uncorrelated and each component has a unit variance. Since ε_t = L z_t, each coordinate of ε_t can be written as a linear combination of the principal components,

\[ \varepsilon_{t,i} = \sum_{j=1}^{n} L_{ij} z_{t,j}, \]

where L_ij are the elements of L. The method can be easily adapted to fewer than n risk factors (see Appendix A).

An alternative linear transformation that can be used to separate multivariate returns into additive subcomponents is independent component analysis. This technique has found successful applications in finance: Back and Weigend [40] use it to extract the structure from stock returns, while Moody and Wu [41] and Moody and Yang [42] apply it to foreign exchange rates. The main goal of independent component analysis is to minimize statistical dependence between the transformed vectors. Unlike principal component analysis, these vectors are neither orthogonal nor ranked by the variance they explain, and the linear transformation matrix analogous to L cannot be obtained in closed form. Therefore, independent component analysis is not entirely suitable for the problem treated in this paper. A good comparison between the two methods can be found in Hyvärinen and Oja [43].

Orthogonalization transforms a cross-sectionally correlated series into a set of uncorrelated ones. We also have to filter out any autocorrelation and heteroskedasticity from the series. As the end result, we will obtain sequences of conditional residuals that are orthogonal and (approximately) serially uncorrelated and identically distributed. Specifically, we can assume that, for each asset i = 1, 2, ..., n, the series of log returns y_{t,i} := ln(S_{t,i}/S_{t−1,i}), where S_{t,i} denotes the price at time t, follows the process

\[ y_{t,i} = \mu_{t,i} + \varepsilon_{t,i}, \]

where µ_{t,i} is the conditional mean of y_{t,i}. Conditionally on the information available at t − 1, the vector of residuals ε_t := [ε_{t,1} ε_{t,2} ... ε_{t,n}]' has a zero mean and a conditional covariance matrix V_t:

\[ \mathrm{E}\left(\varepsilon_t \mid \mathcal{F}_{t-1}\right) = 0, \qquad \mathrm{E}\left(\varepsilon_t \varepsilon_t' \mid \mathcal{F}_{t-1}\right) = V_t, \]

where, for any t, the matrix V_t is positive definite and measurable with respect to the information set F_{t−1}, a σ-algebra generated by the past residuals {ε_1, ε_2, ..., ε_{t−1}}.

To capture the volatility clustering effect, we can assume that the conditional covariance matrix follows a model from the GARCH family. The standard GARCH(p,q) is sufficient to capture most of the clustering and, to some extent, excess kurtosis. The leverage effect can be taken into account, for example, by assuming that the conditional residuals follow an asymmetric distribution, such as skewed Student's t. Alternatively, we can model the asymmetry explicitly in the equation followed by the conditional covariance matrix.
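Before turning to the formal specification, the following minimal Python sketch (numpy only) illustrates how the orthogonalization of Definition 1 and a univariate GJR(1,1)-type filter of the kind just described might be combined in practice; the parameter values and variable names are illustrative assumptions, not estimates from the paper.

```python
import numpy as np

def principal_components(eps):
    """Definition 1: eigendecompose the unconditional covariance of the
    zero-mean residuals eps (T x n) and return z_t = L^{-1} eps_t and L."""
    V_inf = np.cov(eps, rowvar=False)               # unconditional covariance V_inf
    eigval, eigvec = np.linalg.eigh(V_inf)          # eigh returns ascending eigenvalues
    order = np.argsort(eigval)[::-1]                # reorder so eigenvalues descend
    lam, P = eigval[order], eigvec[:, order]
    L = P @ np.diag(np.sqrt(lam))                   # L = P Lambda^{1/2}
    z = eps @ np.linalg.inv(L).T                    # rows are z_t' = (L^{-1} eps_t)'
    return z, L

def gjr_filter(z, omega, alpha, theta, beta):
    """Univariate GJR(1,1)-type recursion for one principal component:
    h_t = omega + (alpha + theta * 1{z_{t-1} < 0}) * z_{t-1}^2 + beta * h_{t-1}."""
    h = np.empty(len(z))
    h[0] = np.var(z)                                # initialize at the sample variance
    for t in range(1, len(z)):
        neg = 1.0 if z[t - 1] < 0.0 else 0.0
        h[t] = omega + (alpha + theta * neg) * z[t - 1] ** 2 + beta * h[t - 1]
    return h

# Illustrative usage on simulated residuals (parameter values are placeholders).
rng = np.random.default_rng(0)
eps = rng.standard_normal((2766, 30)) @ np.diag(np.linspace(0.5, 2.0, 30))
z, L = principal_components(eps)
h1 = gjr_filter(z[:, 0], omega=0.05, alpha=0.05, theta=0.08, beta=0.88)
std_resid_1 = z[:, 0] / np.sqrt(h1)                 # approximately i.i.d. input for the EVT step
```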
Without losing generality of the proposed method, we can assume that the conditional covariance V_t follows a multivariate asymmetric GARCH(p,q) of Glosten et al. [44], also known as the (multivariate) GJR-GARCH(p,q), or simply GJR(p,q), of the form

\[ V_t = \Omega + \sum_{s=1}^{p} \left( A_s + \Theta_s I^{-}_{t-s} \right) E_{t-s} + \sum_{s=1}^{q} B_s V_{t-s}, \qquad (6) \]

where Ω, A_1, ..., A_p, Θ_1, ..., Θ_p, B_1, ..., B_q are constant, positive semidefinite n-by-n matrices of parameters, E_t := ε_t ε_t', and I^−_t := diag(1{ε_{t,1} < 0}, ..., 1{ε_{t,n} < 0}) for any t. As usual, the matrices A_s in (6) measure the extent to which volatility shocks in previous periods affect the current volatility, while A_s + B_s measure the rate at which this effect fades away. The terms proportional to the matrices Θ_s capture the impact of asymmetric return shocks on volatility. For any t, the unconditional covariance matrix of ε_t is given by

\[ V_\infty = \left[ 1_n - \sum_{s=1}^{p} \left( A_s + \tfrac{1}{2}\Theta_s \right) - \sum_{s=1}^{q} B_s \right]^{-1} \Omega, \]

where 1_n denotes an n-by-n identity matrix. Covariance stationarity of the GJR(p,q) process (6) is assured by setting the matrix 1_n − Σ_{s=1}^{p}(A_s + ½Θ_s) − Σ_{s=1}^{q} B_s to be positive definite.

Note that the multivariate GJR process (6) contains the ordinary multivariate GARCH as its special case. There are also many plausible alternatives. For example, the APARCH model of Ding et al. [45] includes the ARCH, GARCH, and GJR-GARCH models as special cases, as well as four other ARCH extensions. These four extensions are the TS-GARCH model of Taylor [46] and Schwert [47], the Log-ARCH model of Geweke [48] and Pentula [49], the N-ARCH model of Higgins and Bera [50], and the T-ARCH model of Zakoïan [51]. Similarly, the EGARCH model of Nelson [52] can be used to capture the leverage effect in a slightly different manner. The method developed in this paper can be applied straightforwardly with any of the multivariate models from the GARCH family.

Cross-sectional correlations are reflected in the off-diagonal terms of the matrices V_t and E_t. This, in turn, makes the parameter matrices non-diagonal. In total, one would have to estimate (1 + 2p + q)(n + 1)n/2 different parameters. Clearly, this number explodes as we increase the number of assets in the portfolio. However, this is only one aspect of the problem. The other is that we cannot apply the results of univariate EVT to the conditional residuals ε_{t,i} directly.

However, we can switch to the orthonormal basis of principal components by applying the linear transformation (2) to the conditional residuals ε_t. In this basis, Equation (6) retains the same form (Equation (7)), with every matrix M replaced by M̃ := L^{−1} M (L^{−1})'. In particular, Ẽ_t = z_t z_t' and Ṽ_t := L^{−1} V_t (L^{−1})' is the conditional covariance matrix of the principal components. As in Alexander [34], it is reasonable to assume that the matrix Ṽ_t is diagonal, since the eigenvectors z_t are orthogonal. Then, the process given by Equation (7) can be estimated separately for each principal component. This gives a set of n independent scalar equations of the form

\[ \tilde{V}_{t,i} = \tilde{\Omega}_i + \sum_{s=1}^{p} \left( \tilde{A}_{s,i} + \tilde{\Theta}_{s,i} \, 1\{ z_{t-s,i} < 0 \} \right) z_{t-s,i}^2 + \sum_{s=1}^{q} \tilde{B}_{s,i} \tilde{V}_{t-s,i}, \qquad (8) \]

where, in general, M_i := M_{ii} is the i-th diagonal element of the matrix M, i being 1, 2, ..., n for the first, second, ..., n-th principal component, respectively. Once we estimate the parameters Ω̃_i, Ã_{s,i}, Θ̃_{s,i}, and B̃_{s,i} for each component, we can use V_t = L Ṽ_t L' for t ≥ max{p, q} to retrieve the series of conditional covariance matrices in the original basis of log returns. This allows us to forecast VaR and ES in a multivariate framework, and for an arbitrary portfolio.

Theorem 1 states that, for a large class of underlying excess distributions, the exceedances over a threshold converge in distribution to the GP as the sample size increases and the threshold is raised. Thus, the GP is the natural model for the unknown excess distribution above the threshold u, and we may conjecture that

\[ F_u(x) \approx G_{\xi,\beta}(x) \qquad (10) \]

for any x satisfying 0 ≤ x < z_F − u. Assuming that we have a set of realizations for the variable Z_t, we can choose a sensible threshold u and estimate the parameters ξ and β. If there are T_u out of a total of T observations that exceed the threshold, the GP distribution will be fitted to the T_u exceedances.
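One way to sketch this step is to set the threshold at an empirical quantile of the filtered series and fit the GP parameters to the exceedances by maximum likelihood (here via scipy's genpareto.fit); the variable names and the threshold level are illustrative.

```python
import numpy as np
from scipy.stats import genpareto

def fit_gp_tail(z, tail_prob=0.10):
    """Fit a GP distribution to the exceedances over the upper-tail threshold u,
    chosen so that a fraction tail_prob of the observations lies above it."""
    u = np.quantile(z, 1.0 - tail_prob)                 # threshold, e.g. F(u) = 0.90
    exceedances = z[z > u] - u                          # the T_u exceedances over u
    xi, _, beta = genpareto.fit(exceedances, floc=0)    # ML estimates of (xi, beta)
    return u, xi, beta, exceedances.size

# Example: upper and lower tails of a standardized principal component.
# u_plus, xi_plus, beta_plus, T_u = fit_gp_tail(std_resid_1, tail_prob=0.10)
# u_minus, xi_minus, beta_minus, _ = fit_gp_tail(-std_resid_1, tail_prob=0.10)  # sign-flipped series
```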
In the literature, several estimators have been used to fit the GP parameters. The most popular ones are the maximum likelihood (ML) and the Hill estimator. The ML estimator assumes that, if the tail under consideration strictly follows a GP distribution, the likelihood function can be written in closed form. The estimators of the parameters ξ and β are then obtained using the standard ML approach. Provided that ξ > −1/2, the ML estimator of the parameters is consistent and asymptotically normal as the length of the series tends to infinity. On the other hand, the Hill estimator is based on a combination of the ML method and a semi-parametric result, which describes the scaling of F when ξ > 0 (see, for instance, [53]). Here, we will use the ML estimator due to its performance and universality, as discussed in Nyström and Skoglund [15].

By combining Theorem 1 and Equation (10), we can write

\[ F(z) \approx F(u) + \left[ 1 - F(u) \right] G_{\xi,\beta}(z - u) \]

for z > u. The only additional element we require to construct a tail estimator is F(u). The simplest choice would be to use the non-parametric method and take the obvious empirical estimator, F(u) = 1 − T_u/T. Alternatively, we can find the T_u that is closest to a predetermined F(u). Thus, for example, in a sample of T = 1000 observations, F(u) = 0.90 will correspond to T_u = 100. The threshold is then set to u = Z_{900}, if {Z_t}_{t=1}^{T} are ordered in an increasing way. Combining the empirical estimate of F(u) with the ML estimates of the GP parameters, we obtain the tail estimator

\[ \hat{F}(z) = \left( 1 - \frac{T_u}{T} \right) + \frac{T_u}{T} \, G_{\hat{\xi},\hat{\beta}}(z - u), \qquad z > u. \qquad (11) \]

Note that, when the scale parameter β tends to infinity, G_{ξ,β}(·) vanishes and the tail estimator converges to the empirical one for any z. Thus, the tail estimator in (11) can be viewed as the non-parametric estimator augmented by the tail behavior captured by the GP distribution.

We formally define VaR in the following way. Definition 2. Value at Risk is the α quantile of the distribution F:

\[ \mathrm{VaR}_\alpha := F^{-1}(\alpha), \]

where α ∈ (0, 1) and F^{−1} is the inverse of F. The usual critique of VaR as a risk measure is that it is not coherent and ignores the structure of losses beyond a specific quantile (see, for example, Artzner et al. [54,55], Acerbi and Tasche [56,57], and Szego [58]). In order to overcome these drawbacks, Expected Shortfall (ES) is often used as an obvious alternative. It is defined as the conditional expectation of the loss that surpasses a fixed level of VaR. Formally, under the same assumptions as in Definition 2, we can define ES as

\[ \mathrm{ES}_\alpha := \mathrm{E}\left( Z \mid Z > \mathrm{VaR}_\alpha \right). \]

As such, ES takes into account tail risk and satisfies the sub-additivity property, which assures its coherence as a risk measure.

Let u_+ ≡ u be the upper-tail threshold, and let the lower-tail threshold u_− be defined symmetrically, that is, by F(u_−) = 1 − F(u_+). Then, for a given upper-tail probability α_+ > F(u_+) or a given lower-tail probability α_− < F(u_−), the general form of the VaR estimate is

\[ \widehat{\mathrm{VaR}}_{\alpha_\pm} = u_\pm \pm \frac{\beta_\pm}{\xi_\pm} \left[ \left( \frac{p_\pm}{1 - F(u_+)} \right)^{-\xi_\pm} - 1 \right], \qquad p_+ = 1 - \alpha_+, \quad p_- = \alpha_-, \]

where the subscript + (−) refers to parameters in the upper (lower) tail. Similarly, the general form of the ES estimate is

\[ \widehat{\mathrm{ES}}_{\alpha_\pm} = \frac{\widehat{\mathrm{VaR}}_{\alpha_\pm}}{1 - \xi_\pm} \pm \frac{\beta_\pm \mp \xi_\pm u_\pm}{1 - \xi_\pm} \]

(see, for example, [12]). As the tail becomes heavier, ξ_± increases toward one and ES becomes progressively greater than VaR.

Given the upper- and lower-tail quantiles α_±, the quantile of the i-th principal component h time steps ahead is given by

\[ z^{\pm}_{\tau,i} = \sqrt{\tilde{V}_{\tau,i}} \; F^{-1}_i(\alpha_\pm), \qquad (15) \]

where τ = T + h and F^{−1}_i(·) is the inverse of the univariate probability function for the set of principal components {z_{t,i}}_{t=1}^{T}, given by Equations (12) and (13).
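The closed-form expressions above translate directly into code. The sketch below implements the GP-augmented tail estimator and the corresponding single-tail VaR and ES for an upper-tail probability, using the standard formulas quoted from McNeil [12]; the numerical values in the example are illustrative assumptions.

```python
import numpy as np

def gp_tail_cdf(z, u, xi, beta, frac_exceed):
    """Tail estimator: F_hat(z) = 1 - (T_u / T) * (1 + xi * (z - u) / beta) ** (-1 / xi)."""
    return 1.0 - frac_exceed * (1.0 + xi * (z - u) / beta) ** (-1.0 / xi)

def var_es_upper(alpha, u, xi, beta, frac_exceed):
    """Upper-tail VaR and ES at confidence level alpha > F(u)."""
    var = u + beta / xi * (((1.0 - alpha) / frac_exceed) ** (-xi) - 1.0)
    es = var / (1.0 - xi) + (beta - xi * u) / (1.0 - xi)
    return var, es

# Example with an assumed fit: xi = 0.2, beta = 0.6, u = 1.3, T_u / T = 0.10.
var99, es99 = var_es_upper(0.99, u=1.3, xi=0.2, beta=0.6, frac_exceed=0.10)
# Lower-tail quantities follow by applying the same functions to the sign-flipped series.
```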
Our final goal is to estimate VaR and ES for a portfolio of n assets. Denote by a the vector of portfolio positions, in monetary units. This, among other things, facilitates the treatment of portfolios with short positions. Then, the h-steps-ahead portfolio VaR is given by

\[ \widehat{\mathrm{VaR}}^{\pm}_{\tau} = a'\mu_\tau \pm \sqrt{a' Q^{\pm}_{\tau} a}, \qquad (16) \]

where Q^±_τ is a real symmetric matrix that can be decomposed as

\[ Q^{\pm}_{\tau} = \left( L D^{\pm}_{\tau} \right) \left( L D^{\pm}_{\tau} \right)', \qquad D^{\pm}_{\tau} = \mathrm{diag}\left( z^{\pm}_{\tau,1}, \ldots, z^{\pm}_{\tau,n} \right). \]

The intuition behind Formula (16) is fairly simple. The first term, a'µ_τ, represents the forecast of the expected portfolio return. The second term is the upper- or lower-tail bound of the confidence interval for the total portfolio return. It is obtained in three steps: first, we use (15) to calculate the quantiles z^±_{τ,i}, i.e., the individual contribution to VaR of each principal component, and stack them into a diagonal matrix D^±_τ; second, we use L to transform this matrix back to the original basis of log returns; finally, we take a square root of the matrix "sandwich" consisting of the vector of portfolio holdings a and the transformed quantiles Q^±_τ to obtain the bounds of confidence intervals around the expected return. If all principal components happen to be identically distributed (which is unlikely, even if they all come from the same family of distribution functions), then the square-root term is simply the α-quantile of this distribution times the square root of the h-steps-ahead forecast of the portfolio variance. A similar rationale can be applied to derive the forecast of portfolio ES:

\[ \widehat{\mathrm{ES}}^{\pm}_{\tau} = a'\mu_\tau \pm \sqrt{a' Q^{\mathrm{ES},\pm}_{\tau} a}, \qquad (17) \]

where, by analogy with Equations (14)-(16), Q^{ES,±}_τ is obtained by stacking the per-component ES estimates into the diagonal matrix in place of the VaR quantiles.

We summarize the methodology through the following procedure:

Step 1. Start by performing the orthogonalization on portfolio returns. In other words, use the eigenvalue decomposition of the unconditional covariance matrix V_∞ of the return residuals ε_t to obtain the matrix L following Definition 1. Then, use Equation (2) to obtain the set of orthogonal vectors z_t. This step assures cross-sectional independence and allows working with univariate series.

Step 2. Filter out the heteroskedasticity of each component from Step 1 to obtain serially i.i.d. random variables. In other words, use the set of orthogonal vectors z_t from Step 1, calculate Ẽ_t = z_t z_t' and run a GARCH-like process such as the one given by Equation (8) on Ẽ_t. This results in a set of univariate estimates of the conditional variance Ṽ_{t,i}. This step is a prerequisite for using the univariate EVT.

Step 3. Use the estimator from Equation (11) to fit the tails of each of the independent series from Step 2. This gives the parameters of the GP distribution.

Step 4. Substitute the parameters from Step 3 into Equation (12) or (13). This step allows us to calculate the quantile for each component, Equation (15).

Step 5. Substitute the quantiles from Step 4 into the closed-form expressions from Equations (16) and (17) to obtain portfolio VaR and ES.

Steps 1 and 2 are analogous to the orthogonal GARCH of Alexander [34]. Steps 3 and 4 are typical for a univariate EVT approach. A combination of Steps 1-4 with Step 5 is essential for the multivariate EVT approach.
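As a schematic reading of the aggregation in Equations (16) and (17), the following sketch stacks the per-component quantiles into a diagonal matrix, rotates them back with L, and takes the quadratic form in the position vector; all matrices and numbers used here are illustrative assumptions rather than the paper's estimates.

```python
import numpy as np

def portfolio_var_es(a, mu, L, z_var, z_es):
    """Aggregate per-component lower-tail quantiles into portfolio VaR and ES.

    a      : (n,) portfolio positions in monetary units
    mu     : (n,) forecast of expected returns, so a @ mu is the expected portfolio return
    L      : (n, n) loading matrix from Definition 1
    z_var  : (n,) per-component VaR quantiles, as in Equation (15)
    z_es   : (n,) per-component ES quantiles
    """
    def bound(quantiles):
        D = np.diag(quantiles)                     # stack quantiles into a diagonal matrix
        Q = (L @ D) @ (L @ D).T                    # symmetric matrix in the original basis
        return np.sqrt(a @ Q @ a)                  # half-width of the confidence bound

    var = a @ mu - bound(z_var)                    # lower-tail convention: subtract the bound
    es = a @ mu - bound(z_es)
    return var, es

# Illustrative three-asset usage with assumed inputs.
a = np.array([1.0e6, 5.0e5, 5.0e5])
mu = np.zeros(3)
corr = np.array([[1.0, 0.3, 0.2], [0.3, 1.0, 0.4], [0.2, 0.4, 1.0]])
L = np.linalg.cholesky(corr) * 0.01                # stand-in loading matrix for this toy example
z_var = np.array([-2.8, -2.5, -2.6])               # assumed lower-tail component quantiles
z_es = np.array([-3.6, -3.2, -3.3])
port_var, port_es = portfolio_var_es(a, mu, L, z_var, z_es)
```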
As discussed, the method is based only on a few assumptions that allow the exploitation of asymptotic properties. From Theorem 1, we can see that it is sufficient to have a set of independent univariate random variables {Z_t}_{t=1}^{T} drawn from the same distribution function F. We can obtain the set of univariate random variables by merely applying a simple eigenvalue decomposition of the unconditional covariance matrix V_∞ in Step 1. This step is well defined as long as the original multivariate series has a finite unconditional covariance matrix, a reasonable assumption for financial time series. The set of i.i.d. random variables having the same distribution function F can be obtained through a GARCH-like filter in Step 2. The filter can be applied provided that the covariance stationarity condition has been met. This condition is merely a restriction on the model parameters. Other assumptions are not required for the method to work, since it relies on the asymptotic properties of the transformed random variables.

The empirical results that follow are based on daily closing prices of the thirty stocks that constituted the Dow Jones Industrial Average at the end of 2011, gathered from Thomson Reuters Datastream. The prices are adjusted for stock splits and dividend payments. The observation period covers all trading days between, and including, 2 January 2001 and 30 December 2011, a total of 2766 returns per stock. Figure 1 displays the returns of an equally-weighted portfolio, which will be used in the later examples.

Following the procedure described in Section 2.2, we first estimate the upper- and the lower-tail parameters of the univariate GP distribution, ξ_± and β_±, for each of the principal components. The upper and lower thresholds, u_+ and u_−, are determined by F(u_+) = 0.90 and F(u_−) = 0.10, respectively. (The choice of threshold u_± is of minor importance for this illustration. For instance, using F(u_−) = 0.05 instead of 0.10 at α = 0.001 changes the level of the VaR forecast by less than a percent in relative terms.) The lower-tail VaR and ES forecasts can then be found easily from Equations (16) and (17). The conditional covariance matrix Ṽ_{τ,i} in (15) is obtained using the steps described in Equations (6)-(8). Here, we illustrate this procedure assuming that the conditional covariance matrix in Equation (6) follows a multivariate GJR(1,1) process, i.e., p = q = 1, or its multivariate GARCH(1,1) restriction, in which Θ_s ≡ 0. Table 1 shows ten-day VaR and ES forecasts for several confidence levels, reported as a percentage of the total portfolio value. (We choose the convention that the negative values obtained in Equation (15) are reported as positive losses.)

We can compare different multivariate volatility models in terms of their ability to fit the conditional distributions of principal components. Here, we use three different distributional assumptions: normal, Student's t, and GP. The results for the normal and Student's t-distribution are calculated from the usual procedure of Alexander [34]. The GP results are based on our multivariate EVT method, as described in Section 2. The results are summarized in Table 2. The second column of Table 2 shows the cumulative R², i.e., the fraction of variance explained by each of the thirty eigenvectors. The remaining columns contain Kolmogorov-Smirnov test statistics for the univariate probability distribution of the standardized eigenvectors z_{t,i}. The methods based on multivariate EVT appear to be the only ones that cannot be rejected at any reasonable significance level for all thirty eigenvectors, everything else being equal. Due to the fat tails of returns, conditional normality can be rejected for almost all eigenvectors. The assumption of a Student's t conditional distribution can be safely rejected for the first principal component, which is the one that explains almost half of the variation of returns in this sample. It is also rejected at more restrictive levels of significance for the remaining eigenvectors.
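A comparison of this kind can be run with standard goodness-of-fit tooling; the sketch below computes Kolmogorov-Smirnov statistics of a standardized component against a normal benchmark and a Student's t fitted by maximum likelihood (the GP-based model would instead be tested against the semi-parametric tail estimator), with illustrative data.

```python
import numpy as np
from scipy import stats

def ks_comparison(z_std):
    """Kolmogorov-Smirnov statistics of a standardized eigenvector against
    a N(0, 1) benchmark and a Student's t fitted by maximum likelihood."""
    ks_norm = stats.kstest(z_std, 'norm')
    df, loc, scale = stats.t.fit(z_std)
    ks_t = stats.kstest(z_std, 't', args=(df, loc, scale))
    return ks_norm.statistic, ks_t.statistic

# Example on heavy-tailed simulated data.
rng = np.random.default_rng(1)
print(ks_comparison(rng.standard_t(df=4, size=2766)))
```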
To visualize the model's ability to fit the lower tail, we can plot the distribution of exceedances F_u(x). Figure 2 shows log₁₀ F_u(x) for the first four principal components, which jointly contribute more than two-thirds of the variation in portfolio returns. Each graph compares the tails of the empirical distribution function with three different distribution functions obtained from the orthogonal-GJR(1,1) residuals. The fitted GP distribution G_{ξ,β}(·) is obtained from the model proposed in this paper. Consistently with Table 2, it provides the best fit to the actual data.

To analyze the out-of-sample performance of the proposed method, we use a 1000-day rolling-window sequence of one-day VaR forecasts and compare them to the corresponding actual returns of the equally-weighted portfolio for each day. In other words, we use the sequence of returns between 3 January 2001 and 14 January 2008 to obtain the VaR forecast for 15 January 2008. Then, we shift the window by one day and use the sequence of returns between 4 January 2001 and 15 January 2008 to obtain the VaR forecast for 16 January 2008, and so on, until the last sequence becomes the one corresponding to the period between 28 December 2004 and 29 December 2011, which is used to obtain the VaR forecast for 30 December 2011. We use five different VaR quantiles (α = 0.001, 0.005, 0.010, 0.050, 0.100) and perform a backtesting procedure on the same models as in Section 3: orthogonal multivariate GARCH(1,1) and GJR(1,1) processes, under three distributional assumptions (normal, Student's t, and GP).

Table 3 compares the actual and expected number of VaR violations for each quantile. We may notice that the number of violations is above the expected value, indicating that the models, on average, underestimate the tail risk. The assumption of conditional normality displays far too many exceptions in the extremes. The models based on the multivariate EVT method with GP distribution have violations whose number is closest to the expected one.

Table 3. Actual and expected number of lower-tail VaR violations for an equally-weighted portfolio of 30 stocks.

Confidence level, 1 − α      0.900   0.950   0.990   0.995   0.999
mv GARCH   Normal             109      69      24      16       7
           Student            118      71      17      10       3
           GP                 117      68      15       8       3
mv GJR     Normal             107      64      25      15       8
           Student            117      65      19      10       1
           GP                 115      62      14       8       0
Expected                      100      50      10       5       1

We further follow Campbell [59] and apply a formal backtesting procedure that assesses the forecasting ability of each of the methods across multiple VaR quantiles simultaneously, separated into K bins on the unit interval. The test is based on the usual Pearson's statistic,

\[ Q = \sum_{k=1}^{K} \frac{\left( N^{\mathrm{obs}}_k - N^{\mathrm{exp}}_k \right)^2}{N^{\mathrm{exp}}_k}, \]

where N^obs_k and N^exp_k are, respectively, the observed and the expected number of violations in the k-th bin. The Q statistic converges in distribution to a χ² with K − 1 degrees of freedom. The results of the test are summarized in Table 4, which indicates that the proposed multivariate EVT method cannot be rejected whenever the comparable methods can. For example, if the orthogonal multivariate GJR process is used, the model based on conditionally normal residuals can be rejected at 5% significance, but the models based on conditionally t-distributed residuals and the multivariate EVT method cannot. However, if the orthogonal multivariate GARCH process is used, then, at 5% significance, the models based on conditionally normal and t-distributed residuals can both be rejected, while the multivariate EVT method cannot. Similar results are obtained with the Christoffersen [60] conditional coverage test across separate quantiles (available upon request).
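The multi-quantile backtest can be sketched as follows: count the returns falling in each probability bin implied by the VaR forecasts over the rolling window and compare the observed with the expected counts via Pearson's Q; the binning and variable names are illustrative.

```python
import numpy as np
from scipy.stats import chi2

def pearson_q_test(returns, var_forecasts, alphas):
    """Pearson's Q over the disjoint probability bins implied by the VaR quantiles:
    Q = sum_k (N_obs_k - N_exp_k)^2 / N_exp_k, asymptotically chi-squared with K - 1 dof.

    returns       : (T,) realized portfolio returns
    var_forecasts : (T, K-1) lower-tail VaR forecasts, one column per alpha, alphas increasing
    alphas        : (K-1,) lower-tail probabilities, e.g. [0.001, 0.005, 0.01, 0.05, 0.10]
    """
    T = len(returns)
    n_below = (returns[:, None] < var_forecasts).sum(axis=0)   # violations of each quantile
    n_obs = np.diff(np.concatenate(([0], n_below, [T])))       # counts in disjoint bins
    n_exp = np.diff(np.concatenate(([0.0], np.asarray(alphas), [1.0]))) * T
    q = np.sum((n_obs - n_exp) ** 2 / n_exp)
    return q, chi2.sf(q, df=len(n_obs) - 1)
```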
In Appendix B, we illustrate the method using exchange-rate data. We obtain results similar to those shown in Table 4.

To demonstrate the proposed method's ability to incorporate various data patterns, we will apply it to simulated data in this section. To this end, we will simulate co-dependent multivariate series using different copulas: a Gaussian copula with the "ordinary" (i.e., Pearson's) linear correlation, a Gaussian copula with Kendall's τ correlation, a t copula with Kendall's τ correlation, and three Archimedean copulas-Clayton, Frank, and Gumbel. The copula parameters will be calibrated using the stock returns from Section 3. We will also use the stock data to obtain the marginal distributions, applying kernel smoothing to the empirical distribution functions. For the Gaussian copulas and the t copula, we apply the full 30-dimensional multivariate structure. Since the applied Archimedean copulas are bivariate, we use two equally-weighted sub-portfolios containing 15 return series each. For every copula, we simulate 10,000 data points in each series.

We again follow the procedure outlined in Section 2.2, with the same choice of thresholds. We then apply Equations (16) and (17) to estimate lower-tail VaR and ES. For the sake of tractability, we focus on the multivariate GJR(1,1) process for the conditional covariance matrix. Table 5 summarizes the ten-day VaR and ES forecasts for the same confidence levels as in Table 1, again reported as a percentage of the total portfolio value. We can notice that the numbers do not change drastically with respect to those reported in Table 1. They are also similar across different joint distributional assumptions. The method's flexibility in fitting the lower tail across various joint distributions of the data can be visualized in Figure 3. We show the log₁₀ F_u(x) of the first principal component for each of the six copula types obtained from the orthogonal-GJR(1,1) residuals. The GP distribution G_{ξ,β}(·) represents an excellent fit to the simulated data.

The out-of-sample performance of the proposed method on the simulated data can be assessed following the same idea as in Section 4. Again, we apply a 1000-day rolling-window sequence of one-day VaR forecasts. We use the same five VaR quantiles as in Section 4 and perform a backtesting procedure using the orthogonal multivariate GJR(1,1) processes, under six joint distributional assumptions (i.e., simulated copulas). Table 6 shows the comparison between the actual and the expected number of VaR violations across quantiles. In all cases, the number of violations is very close to the expected value. We can see this more formally through Pearson's Q test results, shown in Table 7. The proposed multivariate EVT method cannot be rejected for any of the six joint distributional assumptions used to obtain the simulated datasets.
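For the Gaussian-copula case, the simulation design can be sketched as follows: estimate Kendall's τ from the returns, map it to a linear correlation via sin(πτ/2), draw copula samples, and push them through the empirical marginals. The sample size and the marginal construction here are illustrative simplifications (the paper uses kernel-smoothed empirical distributions).

```python
import numpy as np
from scipy import stats

def simulate_gaussian_copula(returns, n_sim=10_000, seed=0):
    """Simulate joint returns from a Gaussian copula calibrated with Kendall's tau,
    pushing the copula samples through the empirical marginals of the observed data."""
    rng = np.random.default_rng(seed)
    T, n = returns.shape
    tau = np.array([[stats.kendalltau(returns[:, i], returns[:, j]).correlation
                     for j in range(n)] for i in range(n)])
    rho = np.sin(np.pi * tau / 2.0)                 # Kendall's tau -> linear correlation
    np.fill_diagonal(rho, 1.0)                      # (a nearest-PSD repair may be needed)
    g = rng.multivariate_normal(np.zeros(n), rho, size=n_sim)
    u = stats.norm.cdf(g)                           # copula samples on the unit cube
    sims = np.column_stack([np.quantile(returns[:, j], u[:, j]) for j in range(n)])
    return sims
```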
This paper proposes a procedure for assessing portfolio tail risk based on Extreme Value Theory, without the necessity to use multivariate limiting relations. The method applies separate estimations of the univariate EVT model. It works with a set of orthogonal conditional residuals obtained from the principal components of the covariance matrix of returns. Autocorrelation, heteroskedasticity, and asymmetry inherent in the original return series can be removed by assuming a GARCH-type process for the conditional variances of the principal components. In this way, we can obtain a set of independent and identically distributed random variables, which is a prerequisite for any univariate EVT approach. The tails of the univariate distributions have been modeled by a generalized Pareto distribution of peaks over the threshold, while the interiors are fitted with an empirical distribution function. The estimation of parameters can be carried out using maximum likelihood, keeping the method computationally as fast as its orthogonal-GARCH counterparts. The method also provides closed-form expressions for Value at Risk and Expected Shortfall.

As an illustration, the method has been applied to a sequence of daily stock returns of Dow 30 companies. In-sample tests show that the proposed approach is more reliable for modeling extreme market movements than comparable multivariate parametric methods, such as orthogonal GARCH-type models with normal or t-distribution. Backtesting on 1000 out-of-sample observations indicates that VaR forecasts obtained by the proposed method perform well across multiple quantiles: the values of the test statistics suggest that the null hypothesis of an equal number of observed and predicted VaR violations could not be rejected for all relevant VaR confidence levels. The importance of these results is further highlighted because the backtesting period covers the extreme market movements that followed the events of the Global Financial Crisis. Numerical simulations have corroborated the empirical results. Using six different joint distributional assumptions, where various copula families drive the co-dependence in the data, we show that the proposed multivariate EVT method represents an excellent fit for the tails of the simulated data. The out-of-sample performance has also been confirmed via simulations: the method cannot be rejected for any of the six joint distributional assumptions.

Even though EVT is a natural choice for tail risk modeling, its major shortcoming is the complexity of its extension to the multivariate case. This paper proposes a way to overcome this problem by combining the simplicity of univariate EVT and orthogonal GARCH while capturing tail correlations and extreme co-movements. The proposed method can be applied to any asset class, and it is straightforward to modify its filtering process to include any volatility model from the GARCH family. Hence, it has many potential applications in volatility forecasting, asset allocation under VaR or ES constraints, risk management of complex portfolios, or assessing individual assets' contributions to systemic risk.

The author declares no conflict of interest.

Definition 1 implies that the fraction of total variation in ε_t explained by the j-th principal component is given by

\[ \frac{\lambda_j}{\sum_{k=1}^{n} \lambda_k}. \]

This property leads to another convenient feature of the principal component approach. Namely, if low-ranked components do not add much to the overall explained variance, which is often the case when the number of risk factors is relatively low, we can work with a reduced number of m principal components, where m < n. The first m components will then explain a fraction

\[ \frac{\sum_{j=1}^{m} \lambda_j}{\sum_{k=1}^{n} \lambda_k} \le 1 \]

of the variation in ε_t. In that case, L is replaced by an n-by-m matrix L_m := P_m Λ_m^{1/2}, where P_m is an n-by-m matrix of the first m normalized eigenvectors and Λ_m is a diagonal matrix of the first m eigenvalues. The m-dimensional vector of the first m principal components of ε_t is then given by

\[ z_t := L_m^{-1} \varepsilon_t \qquad (A1) \]

for any t, where L_m^{−1} is the pseudo-inverse of L_m. It is straightforward to generalize the multivariate EVT approach presented in the paper to the case of m < n principal components. Using definition (A1), Equations (7), (8), (15), and (16) maintain the same form.
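A reduced-dimension variant of the orthogonalization follows directly from Appendix A; the sketch below keeps the first m eigenvectors, forms L_m = P_m Λ_m^{1/2}, and recovers the components via its pseudo-inverse (variable names are illustrative).

```python
import numpy as np

def reduced_principal_components(eps, m):
    """First m principal components of the residuals eps (T x n), as in Appendix A."""
    V_inf = np.cov(eps, rowvar=False)
    eigval, eigvec = np.linalg.eigh(V_inf)
    order = np.argsort(eigval)[::-1]
    lam, P_m = eigval[order][:m], eigvec[:, order][:, :m]
    explained = lam.sum() / eigval.sum()            # fraction of total variation explained
    L_m = P_m @ np.diag(np.sqrt(lam))               # n-by-m loading matrix L_m = P_m Lambda_m^(1/2)
    z_m = eps @ np.linalg.pinv(L_m).T               # z_t = L_m^+ eps_t via the pseudo-inverse
    return z_m, L_m, explained
```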
As an additional robustness check of our method, we present here the results for exchange-rate data. On average, currency portfolios have lower rates of return than stocks but are significantly more volatile and notoriously fat-tailed. Here, we use daily data on 20 major currencies against the US Dollar (USD), retrieved from Refinitiv Eikon. We used the following currencies: Australian Dollar, Brazilian Real, Canadian Dollar, Swiss Franc, Chinese Yuan, Euro, Pound Sterling, Hong Kong Dollar, Indian Rupee, Japanese Yen, South Korean Won, Mexican Peso, Norwegian Krone, New Zealand Dollar, Russian Ruble, Swedish Krona, Singapore Dollar, Turkish Lira, New Taiwan Dollar, and South African Rand. The data consist of middle exchange rates between 1 December 2000 and 1 December 2020, a total of 5217 returns per currency. Figure A1 shows the returns of an "equally-weighted" portfolio, which we will apply in our analysis, analogous to Section 3.

We use a 2000-day rolling-window sequence of one-day VaR forecasts to analyze the out-of-sample performance on the exchange-rate data. As in Section 4, we compare these results to the corresponding actual returns of the equally-weighted portfolio for each day, using the same five quantiles (α = 0.001, 0.005, 0.010, 0.050, 0.100) and the same multivariate methods. The results are summarized in Tables A1 and A2. Table A1 compares the actual and expected number of VaR violations for each quantile. The multivariate EVT method with GP distribution again has the number of violations closest to the expected one. We can see this more formally from the backtesting results in Table A2. Again, the proposed multivariate EVT method cannot be rejected whenever the comparable methods can. This result indicates that the method is robust across different asset classes, observation periods, and backtesting horizons.

References

Extreme value dependence in financial markets: Diagnostics, models, and financial implications
Measuring financial contagion: A copula approach
Extreme Dependence of International Stock Market
New evidence on asymmetric dependence structures in international equity markets
Dependence structure between the equity market and the foreign exchange market-A copula approach
Is the potential for international diversification disappearing? A dynamic copula approach
Extreme value theory for finance: A survey
Extreme Correlation in Cryptocurrency Markets
Uncertainty in Euro area and the bond spreads
Theory of Financial Risk and Derivative Pricing: From Statistical Physics to Risk Management
Risk and Asset Allocation
Extreme Value Theory for Risk Managers
Steps in Applying Extreme Value Theory to Finance: A Review; Working Paper; Bank of Canada
Measuring Risk with Extreme Value Theory
Univariate Extreme Value Theory, GARCH and Measures of Risk
Modeling Extremal Events for Insurance and Finance
Financial Risk Forecasting
Estimating the Tails of Loss Severity Distribution Using Extreme Value Theory
Empirical Study of Value-at-Risk and Expected Shortfall as a Coherent Measure of Risk
A Comparison of Some Univariate Models for Value-at-Risk and Expected Shortfall
Risk Analysis of Rice Losses Caused by Typhoon for Taiwan
Extreme correlations of international equity markets
Increased correlation in bear markets
Financial Modeling under Non-Gaussian Distributions
An Introduction to Functional Extreme Value Theory
Copulas and Multivariate Extremes. In Multivariate Extreme Value Theory and D-Norms
Estimation of tail-related risk measures for heteroscedastic financial time series: An extreme value approach
A Framework for Scenario-Based Risk Management; Working Paper
Pair-copula construction of multiple dependence
Modeling international financial returns with a multivariate regime-switching copula
Measuring Value-at-Risk and Expected Shortfall of crude oil portfolio using extreme value theory and vine copula
Value-at-risk and expected shortfall in cryptocurrencies' portfolio: A vine copula-based approach
A multivariate generalized orthogonal GARCH model
Limiting Forms of the Frequency Distribution of the Largest or Smallest Member of a Sample
Sur la Distribution limite du Terme maximum d'une Série aléatoire
Statistical Inference Using Extreme Order Statistics
Institutional investors and stock market volatility
A first application of independent component analysis to extracting structure from stock returns
High Frequency Foreign Exchange Rates: Price Behaviour Analysis and 'True Price' Models
Term structure of interactions of foreign exchange rates. Computational Finance
Independent Component Analysis: Algorithms and Applications
On the Relation between Expected Value and the Volatility of the Nominal Excess Return on Stocks
A long memory property of stock market returns and a new model
Why Does Stock Market Volatility Change Over Time?
Modeling the Persistence of Conditional Variances: A Comment
Modeling the Persistence of Conditional Variances: A Comment
A Class of Nonlinear ARCH Models
Threshold Heteroskedastic Models
Conditional Heteroskedasticity in Asset Returns: A New Approach. Econometrica
Value at Risk and Extreme Returns; FMG Discussion Paper No. 273; Financial Markets Group
Coherent measures of risk
Expected Shortfall: A Natural Coherent Alternative to Value at Risk
On the coherence of expected shortfall
Measures of risk
A Review of Backtesting and Backtesting Procedures
Evaluating interval forecasts