Adjusted Expected Shortfall

Matteo Burzoni, Cosimo Munari, Ruodu Wang

2020-07-17

Ruodu Wang is supported by the Natural Sciences and Engineering Research Council of Canada (RGPIN-2018-03823).

We introduce and study the main properties of a class of convex risk measures that refine Expected Shortfall by simultaneously controlling the expected losses associated with different portions of the tail distribution. The corresponding adjusted Expected Shortfalls quantify risk as the minimum amount of capital that has to be raised and injected into a financial position $X$ to ensure that Expected Shortfall $ES_p(X)$ does not exceed a pre-specified threshold $g(p)$ for every probability level $p \in [0,1]$. Through the choice of the benchmark risk profile $g$ one can tailor the risk assessment to the specific application of interest. We devote special attention to the study of risk profiles defined by the Expected Shortfall of a benchmark random loss, in which case our risk measures are intimately linked to second-order stochastic dominance.

In this paper we introduce and discuss the main properties of a new class of quantile-based risk measures. Following the seminal paper by Artzner et al. (1999), we view a risk measure as a capital requirement rule. More precisely, we quantify risk as the minimal amount of capital that has to be raised and invested in a pre-specified financial instrument (which is typically taken to be risk free) to confine future losses within a pre-specified acceptable level of security. Value at Risk (VaR) and Expected Shortfall (ES) are the best-known examples of such risk measures. Under VaR, a financial position is acceptable if its loss probability does not exceed a given threshold. In line with our convention, this means that VaR coincides with the lower quantile of the underlying distribution at an appropriate level. Under ES, a financial position is acceptable if, on average, it does not produce a loss beyond a given VaR. In the banking regulatory sector, the Basel Committee has recently decided to move from VaR at level 99% to ES at level 97.5% for the measurement of financial market risk. In the insurance regulatory sector, VaR at level 99.5% is the reference risk measure in the Solvency II and in the forthcoming Insurance Capital Standard framework, while ES at level 99% is the reference risk measure in the Swiss Solvency Test framework. In the past 20 years, an impressive body of research has investigated the relative merits and drawbacks of VaR and ES at both a theoretical and a practical level. This investigation led to a better understanding of the properties of these two risk measures while at the same time triggering a variety of new research questions about risk measures in general. We refer to early work on ES in Acerbi and Tasche (2002), Acerbi (2002), Frey and McNeil (2002), and Rockafellar and Uryasev (2002), and to more recent contributions in Bignozzi et al. (2020), Baes et al. (2020), and Wang and Zitikis (2020). For robustness problems concerning VaR and ES, see, e.g., Cont et al. (2010) and Krätschmer et al. (2014), and for their backtesting, see, e.g., Ziegel (2016), Du and Escanciano (2017), and Kratz et al. (2018). A fundamental difference between VaR and ES is that, by definition, VaR is completely blind to the behavior of the loss tail beyond the reference quantile, whereas ES depends on the whole tail beyond it.
It is often argued that this difference, together with the convexity property, makes ES a superior risk measure compared to VaR. In fact, this is the main motivation that led the Basel Committee to shift from VaR to ES in their market risk framework; see BCBS (2012). However, every risk measure captures risk in a specific manner and, as such, is bound to possess some limitations. This is also the case of ES. Indeed, being essentially an average beyond a given quantile, ES can only provide an aggregate estimation of risk which, by its very definition, does not distinguish across different tail behaviors with the same mean. While in specific situations a finer risk classification can be obtained by means of other risk measures, including spectral and deviation risk measures, our goal is to introduce a general class of convex risk measures that help make that distinction by using ES as their fundamental building block. The advantage of this approach is that it can be directly linked to a regulatory framework based on ES. To this end, we construct a risk measure that is sensitive to changes in the ES profile of a random variable X, i.e., the curve p ↦ ES_p(X) viewed as a function of the underlying confidence level. More specifically, we "adjust" ES into

ES_g(X) := sup_{p∈[0,1]} {ES_p(X) − g(p)},

where g : [0,1] → (−∞, ∞] is a given increasing function. The risk measure ES_g is called the adjusted ES with risk profile g and is a monetary risk measure in the sense of Artzner et al. (1999). Indeed, the quantity ES_g(X) can be interpreted as the minimal amount of cash that has to be raised and injected into X in order to ensure the following target solvency condition:

ES_p(X − ES_g(X)) ≤ g(p) for every p ∈ [0,1].

In this sense, the function g defines the threshold between acceptable and unacceptable ES profiles. Interestingly, ES_g is a convex risk measure but is not coherent unless it reduces to a standard ES. The goal of this paper is to introduce the class of adjusted ES's and discuss their main theoretical properties. In Section 2 we provide a formal definition and a useful representation of adjusted ES together with a number of illustrations. The focus of Section 3 is on some basic mathematical properties. A case of special interest is when the risk profile g is given by the ES of a benchmark random variable. We focus on this situation in Section 4 and show that such special adjusted ES's are strongly linked with second-order stochastic dominance. More precisely, they coincide with the monetary risk measures for which acceptability is defined in terms of carrying less risk, in the sense of second-order stochastic dominance, than a given benchmark random variable. In Section 5 we focus on a variety of optimization problems featuring risk functionals either in the objective function or in the optimization domain and study the existence of optimal solutions in the presence of this type of risk measures. In each case of interest we are able to establish explicit optimal solutions. Throughout the paper we fix an atomless probability space (Ω, F, P) and denote by L^1 the space of (equivalence classes, with respect to P-almost-sure equality, of) P-integrable random variables. For any two random variables X, Y ∈ L^1 we write X ∼ Y whenever X and Y are identically distributed. We adopt the convention that positive values of X ∈ L^1 correspond to losses.
In this setting, Value at Risk (VaR) and Expected Shortfall (ES) are respectively defined as

VaR_p(X) := inf{x ∈ R | P(X ≤ x) ≥ p},  ES_p(X) := (1/(1−p)) ∫_p^1 VaR_q(X) dq for p ∈ [0,1), and ES_1(X) := ess sup X.

The quantities VaR_p(X) and ES_p(X) represent the minimal amount of cash m that has to be raised and injected into X in order to ensure the following target solvency conditions (for 0 < p < 1):

P(X − m > 0) ≤ 1 − p, respectively ES_p(X − m) ≤ 0.

The VaR solvency condition requires that the loss probability of X is capped by 1 − p, whereas the ES solvency condition states that there is no loss on average beyond the (left) p-quantile of X. The focus of the paper is on the following class of risk measures. Here and in the sequel, we denote by G the set of all functions g : [0,1] → (−∞, ∞] that are increasing (in the non-strict sense) and not identically ∞. Moreover, we use the convention ∞ − ∞ = −∞.

Definition 2.1. Consider a function g ∈ G and define the set

A_g := {X ∈ L^1 | ES_p(X) ≤ g(p) for every p ∈ [0,1]}.

The functional ES_g : L^1 → (−∞, ∞] defined by

ES_g(X) := inf{m ∈ R | X − m ∈ A_g}

is called the g-adjusted Expected Shortfall (g-adjusted ES).

To best appreciate the financial interpretation of the above risk measure, it is useful to consider the ES profile associated with a random variable X ∈ L^1, i.e., the function p ↦ ES_p(X). From this perspective, the function g in the preceding definition can be interpreted as a threshold between acceptable (safe) and unacceptable (risky) ES profiles. In this sense, the set A_g consists of all the positions with acceptable ES profile and the quantity ES_g(X) represents the minimal amount of capital that has to be injected into X in order to align its ES profile with the chosen acceptability profile. For this reason, we will sometimes refer to g as the target ES profile or, more generally, the target risk profile. If, for given p ∈ [0,1], we consider the target ES profile

g(q) := 0 for q ∈ [0, p] and g(q) := ∞ for q ∈ (p, 1],

then ES_g(X) = ES_p(X) for every random variable X ∈ L^1. In words, the standard ES is a special case of an adjusted ES. The next proposition highlights an equivalent but operationally preferable formulation of adjusted ES's which also justifies the chosen terminology.

Proposition 2.2. For every risk profile g ∈ G and for every X ∈ L^1 we have

ES_g(X) = sup_{p∈[0,1]} {ES_p(X) − g(p)}.

Proof. Fix X ∈ L^1 and note that for every m ∈ R the condition X − m ∈ A_g is equivalent to

ES_p(X) − m ≤ g(p) for every p ∈ [0,1].

For p = 1 both sides could be equal to ∞. However, in view of our convention ∞ − ∞ = −∞, the above inequality holds if and only if m ≥ ES_p(X) − g(p) for every p ∈ [0,1]. The desired representation easily follows.

Remark 2.3. (i) In line with our main motivation, the adjusted ES is a tool that allows us to distinguish risks with the same tail expectation without leaving the world of ES. In the context of the discussion on tail risk triggered by BCBS (2012), the authors of Liu and Wang (2020) proposed the following way to quantify the degree of tail blindness of a risk measure: for a given p ∈ (0,1), a risk measure ρ is said to satisfy the p-tail property if ρ(X) = ρ(Y) whenever the (left) quantiles of X and Y coincide beyond level p. In this case, ρ does not distinguish between two random losses having the same (left) quantiles beyond level p. It is not difficult to prove that ES_g satisfies the p-tail property if and only if g is constant on the interval (0, p). This provides a simple way to tailor the tail sensitivity of ES_g.

(ii) The risk measures studied in Bignozzi et al. (2020) are close in spirit to adjusted ES's. There, one fixes a decreasing function α : [0, ∞) → [0,1] (the so-called benchmark loss distribution) and defines the acceptance set by

A_α := {X ∈ L^1 | P(X > x) ≤ α(x) for every x ≥ 0}.

The corresponding LVaR is given by

LVaR_α(X) := inf{m ∈ R | X − m ∈ A_α}.

The quantity LVaR_α(X) represents the minimal amount of capital that has to be injected into the position X in order to ensure that, for each loss level x, the probability of exceeding a loss of size x is controlled by α(x). According to Proposition 3.6 in the cited paper, we can equivalently write

LVaR_α(X) = sup_{p∈[0,1]} {VaR_p(X) − α⁻¹₊(1 − p)},

where α⁻¹₊ is the right inverse of α. This highlights the similarity with adjusted ES's.
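The representation in Proposition 2.2 suggests a direct way to estimate adjusted ES from data: replace the law of X with the empirical distribution of a sample and take the supremum over a finite grid of levels. The following Python sketch does exactly this; the helper names, the grid, and the sample profile are our own illustrative choices and not part of the paper.

```python
import numpy as np

def empirical_es(sorted_losses, p):
    """Empirical ES_p(X) = 1/(1-p) * integral of the quantile function over (p, 1)."""
    x = sorted_losses
    n = len(x)
    if p >= 1.0:
        return float(x[-1])
    lo = np.arange(n) / n            # the i-th order statistic is the quantile
    hi = np.arange(1, n + 1) / n     # function on the interval (i/n, (i+1)/n]
    lengths = np.clip(hi, p, 1.0) - np.clip(lo, p, 1.0)   # overlap with (p, 1]
    return float(x @ lengths) / (1.0 - p)

def adjusted_es(losses, g, p_grid=np.linspace(0.0, 0.999, 1000)):
    """ES_g(X) = sup_p {ES_p(X) - g(p)} (Proposition 2.2), approximated on a grid."""
    x = np.sort(np.asarray(losses, dtype=float))
    return max(empirical_es(x, p) - g(p) for p in p_grid)

# Sanity check: the profile g = 0 on [0, 0.975] and +infinity afterwards
# reduces ES_g to the plain ES at level 97.5% (about 2.34 for a standard normal).
g = lambda p: 0.0 if p <= 0.975 else np.inf
sample = np.random.default_rng(0).standard_normal(100_000)
print(f"ES_g = {adjusted_es(sample, g):.3f}")
```

The grid stops short of p = 1 because the empirical ES at the far end of the sample is an unreliable estimate of ES_1.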
Figure 1: Left: density functions of X_1 (blue) and X_2 (black); the vertical lines correspond to the respective 99% quantiles. Right: tails of X_1 (blue) and X_2 (black) beyond the 99% quantile. Below: ES profiles of X_1 (blue) and X_2 (black) for p ≥ 0.99.

To illustrate the functioning of the adjusted ES, we consider the following simple example. Consider two normally distributed random variables X_i ∼ N(µ_i, σ_i²), with µ_1 = 1, µ_2 = 0, σ_1 = 0.125, σ_2 = 0.5. For every probability level p ∈ (0,1) we have

ES_p(X_i) = µ_i + σ_i φ(Φ⁻¹(p))/(1 − p),

where φ and Φ are, respectively, the density and the distribution function of a standard normal random variable. For p = 99% the ES of both random variables is approximately equal to 1.33. In Figure 1 we plot the two density functions. Despite having the same ES, the two risks are quite different, mainly because of their different variance: the potential losses of X_1 tend to accumulate around its mean whereas those of X_2 are more disperse and can be significantly higher (compare the tails in Figure 1). A closer look at the ES profiles of both random variables shows that the ES profile of X_1 is more stable than that of X_2 (see again Figure 1). A simple way to distinguish X_1 and X_2 while, at the same time, focusing on average losses beyond the 99% quantile is to consider the risk profile

g(p) := 0 for p ∈ [0, 0.99], g(p) := r for p ∈ (0.99, 0.9975], g(p) := ∞ for p ∈ (0.9975, 1],   (2)

for a threshold r > 0, so that ES_g(X) = max{ES_{0.99}(X), ES_{0.9975}(X) − r}. For instance, for r = 0.1 we easily obtain ES_g(X_1) ≈ 1.33 and ES_g(X_2) ≈ 1.45. The focus of ES_g is still on the tail beyond the 99% quantile. However, the risk measure ES_g is able to detect the heavier tail of X_2 and penalize it with a higher capital requirement. This is because ES_g is additionally sensitive to the tail beyond the 99.75% quantile and penalizes any risk whose average loss on this far region of the tail is too large. We use a similar target risk profile to compare the behavior of the classical ES and the adjusted ES on real data. We collect the S&P 500 and the NASDAQ Composite indices daily log-returns (using closing prices) from January 01, 1999 to June 30, 2020. Each index has 5406 data points (publicly available from Yahoo Finance). We estimate the risk measures using a standard AR(1)-GARCH(1,1) model with t innovations (see Chapter 4 of McNeil et al. (2015) for details). In line with Basel III guidelines, to obtain less volatile outcomes we compute average risk measure estimates based on a 60-day moving window. We consider the risk profile function (2). In volatile periods, most notably during the COVID-19 crisis in early 2020, ES_g is visibly larger than the reference ES. This illustrates that ES_g may capture tail risk in a more appropriate way than ES, especially under financial stress. As illustrated above, a key feature of adjusted ES is the flexibility in the choice of the target risk profile g. Indeed, the same random loss can be considered more or less relevant depending on a variety of factors, including the availability of hedging strategies or other risk mitigation tools in the underlying business sector. The choice of g can therefore be tailored to the particular area of application by assigning different weights to different portions of the reference tail. Two examples are especially relevant. On the one hand, we consider a continuous risk profile of the form

g(p) := ES_p(L) for every p ∈ [0,1],

where L is a benchmark random loss. In this case, we have

ES_g(X) = sup_{p∈[0,1]} {ES_p(X) − ES_p(L)}.

The associated target solvency condition reads:

ES_p(X − ES_g(X)) ≤ ES_p(L) for every p ∈ [0,1].

This choice of g seems appropriate in the context of portfolio risk management.
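The numbers in this example can be reproduced from the closed-form normal ES; a short check using scipy.stats.norm, with the threshold r = 0.1 as an illustrative choice:

```python
from scipy.stats import norm

def es_normal(mu, sigma, p):
    """ES_p for X ~ N(mu, sigma^2): mu + sigma * phi(Phi^{-1}(p)) / (1 - p)."""
    return mu + sigma * norm.pdf(norm.ppf(p)) / (1 - p)

r = 0.1  # illustrative threshold in the profile (2)
for name, mu, sigma in [("X_1", 1.0, 0.125), ("X_2", 0.0, 0.5)]:
    es99 = es_normal(mu, sigma, 0.99)        # approximately 1.33 for both
    es9975 = es_normal(mu, sigma, 0.9975)    # about 1.39 for X_1, 1.55 for X_2
    es_g = max(es99, es9975 - r)             # adjusted ES for the profile (2)
    print(f"{name}: ES_0.99 = {es99:.2f}, ES_g = {es_g:.2f}")
```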
The distribution of the random loss L may belong to a class of benchmark distributions and the adjusted ES corresponds to the smallest amount of cash that has to be raised and injected in the portfolio to shift its profit and loss distribution until the new distribution dominates the benchmark distribution in the sense of second-order stochastic dominance. In other words, the above adjusted ES incorporates second-order stochastic dominance into a monetary risk measure by

ES_g(X) = inf{m ∈ R | X − m ⪰_SSD L},

where ⪰_SSD denotes second-order stochastic dominance. Despite the importance of such a concept, we are not aware of earlier attempts to explicitly construct monetary risk measures whose underlying acceptability condition is based on second-order stochastic dominance. This paper offers first results in this direction, thereby preparing the theoretical ground for new contributions to the rich literature on the application of stochastic dominance to portfolio risk management, for which we refer to the survey by Levy (1992) and to the more recent contributions by, e.g., Ogryczak and Ruszczynski (2002), De Giorgi (2005), and Hodder et al. (2015). In the second example, we consider a piecewise constant function of the form

g(0) := 0, g(p) := r_i for p ∈ (p_{i−1}, p_i], i = 1, …, n, and g(p) := ∞ for p ∈ (p_n, 1],   (3)

where p_0 := 0, 0 = r_1 < · · · < r_n < ∞, and 0 < p_1 < · · · < p_n < 1. In this case, we have

ES_g(X) = max_{i=1,…,n} {ES_{p_i}(X) − r_i}.

The associated target solvency condition reads:

ES_{p_i}(X − ES_g(X)) ≤ r_i for every i = 1, …, n.

The coefficients r_1, …, r_n represent benchmark risk thresholds whereas p_1, …, p_n correspond to some pre-specified confidence levels. Note that, by design, we always have

ES_g(X) ≥ ES_{p_1}(X).

This choice of g seems appropriate in the context of solvency regulation. If p_1 coincides with a reference regulatory level, e.g. 97.5% in Basel III and 99% in the Swiss Solvency Test, the adjusted ES is by design as stringent as the regulatory ES and the additional thresholds r_2, …, r_n impose extra limitations on the amount of risk that a firm is allowed to take. In particular, different bounds can be imposed for, e.g., the one in a hundred times event, the one in a thousand times event, and the one in a hundred thousand times event. These bounds may correspond to suitable fractions of available capital so that, in case of such adverse events, one can directly quantify the necessary cost for covering the underlying losses. In this way, the actual risk bounds would be firm specific but the rule to determine them would be the same for every company. This is reminiscent of the proposal about Loss VaR in Bignozzi et al. (2020), with ES replacing VaR. It is worth pointing out that imposing additional constraints for higher risks may make it possible to lower the base regulatory requirement by taking p_1 strictly smaller than the reference regulatory level. By doing so, regulators may avoid penalizing firms that are particularly careful about their tail behavior.
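For a piecewise constant profile the supremum in Proposition 2.2 collapses to a finite maximum over the confidence levels, which makes the evaluation immediate; a sketch under the indexing above, with illustrative levels and thresholds:

```python
from scipy.stats import norm

def adjusted_es_piecewise(es_func, levels, thresholds):
    """ES_g(X) = max_i {ES_{p_i}(X) - r_i} for a piecewise constant profile g."""
    return max(es_func(p) - r for p, r in zip(levels, thresholds))

# Standard normal position, base regulatory level 97.5% plus two far-tail caps.
es = lambda p: norm.pdf(norm.ppf(p)) / (1 - p)
levels = [0.975, 0.99, 0.999]
thresholds = [0.0, 0.4, 1.0]   # r_1 = 0 keeps ES_g at least as stringent as ES_0.975
print(f"ES_g = {adjusted_es_piecewise(es, levels, thresholds):.3f}")
```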
A piecewise constant risk profile may be adopted also in other applications. We provide a simple illustration in the context of cyber risk. Differently from other operational risks, cyber risk has a strong geographical component. The empirical study Biener et al. (2015), which takes into account 22,075 incidents reported between March 1971 and September 2009, reveals that "Northern America has some of the lowest mean cyber risk and non-cyber risk losses, whereas Europe and Asia have much higher average losses despite Northern American companies experience more than twice as many (51.9 per cent) cyber risk incidents than European firms (23.2 per cent) and even more than twice as many as firms located on other continents". A possible reason is that North American companies may be better equipped to protect themselves against such events. Cyber risk cannot be properly managed by a simple frequency-severity analysis. In the qualitative analysis of Refsdal et al. (2015), many additional factors are identified, including ease of discovery, ease of exploit, awareness, and intrusion detection. The answers may very well depend on the specific sector if not on the specific firms under consideration. The choice of different reference risk profiles g across companies might be a way to apply the theory of risk measures in the spirit of Artzner et al. (1999) to the rather complex analysis of this type of risk. For example, with reference levels 99%, 99.9%, and 99.999%, it would be possible to set

g(p) := ES_{0.99}(Z_1) for p ∈ [0, 0.99], g(p) := ES_{0.999}(Z_2) for p ∈ (0.99, 0.999], g(p) := ES_{0.99999}(Z_3) for p ∈ (0.999, 0.99999], g(p) := ∞ for p ∈ (0.99999, 1],

where Z_1, Z_2, Z_3 are suitable benchmark random losses. The resulting adjusted ES is

ES_g(X) = max{ES_{0.99}(X) − ES_{0.99}(Z_1), ES_{0.999}(X) − ES_{0.999}(Z_2), ES_{0.99999}(X) − ES_{0.99999}(Z_3)}.

The associated target solvency condition is given by

ES_{0.99}(X − m) ≤ ES_{0.99}(Z_1), ES_{0.999}(X − m) ≤ ES_{0.999}(Z_2), ES_{0.99999}(X − m) ≤ ES_{0.99999}(Z_3).

The choice of g should be motivated by specific cyber risk events (see Refsdal et al. (2015) for a categorization of likelihood/severity for different cyber attacks): the one in a hundred times event could be the malfunctioning of the server, the one in a thousand times event the stealing of the profile data of the clients, the one in a hundred thousand times event the stealing of the credit card details of the customers. Note that it is possible to choose a single benchmark random loss or a different benchmark random loss for each considered incident. This choice could also be company specific so as to reflect the company's ability to react to the different types of cyber attacks. This is in line with Biener et al. (2015), which says that "Regarding size (of the average loss per event), we observe a U-shaped relation, that is, smaller and larger firms have higher costs than medium-sized. Possibly, smaller firms are less aware of and less able to deal with cyber risk, while large firms may suffer from complexity". While in principle a different risk category may call for a different choice of the acceptable ES profile g, it is sometimes important in practice to ensure a certain degree of comparability across risk assessments. Suppose for example that a bank wants to compare the exposure to different risks X_1, …, X_k arising from different business lines. In principle, each business unit may use a specific ES profile g_j. However, if the bank requires that g_1 = · · · = g_k = 0 on [0, p) for a common p ∈ (0,1), we can write

ES_{g_j}(X_j) = ES_p(X_j) + [ES_{g_j}(X_j) − ES_p(X_j)], j = 1, …, k.

For each X_j, the first component in the decomposition is an ES with common confidence level p, which can be used for comparison. The exceedance term ES_{g_j}(X_j) − ES_p(X_j), which is nonnegative because g_j vanishes on [0, p), represents the extra amount of capital that is needed to cover the specific risk type. The above decomposition takes a more explicit form if each g_j is a piecewise constant function as in (3) with customized parameters r_i^j's and p_i^j's. If we take p_1^1 = · · · = p_1^k = p, then we obtain

ES_{g_j}(X_j) = ES_p(X_j) + max_{i=1,…,n} {ES_{p_i^j}(X_j) − r_i^j − ES_p(X_j)}.

In this case, the risk-specific component is activated only when ES_{p_i^j}(X_j) is larger than the penalized benchmark ES term ES_p(X_j) + r_i^j for some index i. The parameters r_i^j's and p_i^j's can be tailored, e.g., to the size of the underlying tails. This example can be easily adapted to include a different number of thresholds for each risk class, i.e., n may also depend on j. The choice may depend, e.g., on the size of the available observation sample and the frequency of tail observations.
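The decomposition into a common ES component and a risk-specific exceedance is easy to compute; a minimal sketch for one business line, reusing the empirical_es and adjusted_es helpers defined in the Section 2 sketch (the Student-t sample and the profile parameters are our own illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(1)
p = 0.975                                  # common base confidence level
x_j = rng.standard_t(df=3, size=50_000)    # a heavy-tailed business-line loss

def g_j(q, r=0.2, p_far=0.995):
    """Business-line profile: zero up to p, one finite step, then +infinity."""
    if q <= p:
        return 0.0
    return r if q <= p_far else np.inf

base = empirical_es(np.sort(x_j), p)       # common component ES_p(X_j)
total = adjusted_es(x_j, g_j)              # ES_{g_j}(X_j) >= ES_p(X_j)
print(f"ES_p = {base:.3f}, exceedance = {total - base:.3f}")
```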
In this section we discuss a selection of relevant properties of adjusted ES. It is a direct consequence of our definition that every adjusted ES is a monetary risk measure in the sense of Föllmer and Schied (2016), i.e., it is monotone and cash additive. The other properties listed below are automatically inherited from the corresponding properties of ES. For every risk profile g ∈ G the risk measure ES_g satisfies the following properties:

• monotonicity: ES_g(X) ≤ ES_g(Y) for all X, Y ∈ L^1 such that X ≤ Y.
• cash additivity: ES_g(X + m) = ES_g(X) + m for all X ∈ L^1 and m ∈ R.
• convexity: ES_g(λX + (1 − λ)Y) ≤ λ ES_g(X) + (1 − λ) ES_g(Y) for all X, Y ∈ L^1 and λ ∈ [0,1].
• normalization: ES_g(0) = 0 if and only if g(0) = 0.

Being convex and law invariant, every adjusted ES is automatically consistent with second-order stochastic dominance; see, e.g., Bellini et al. (2021). In fact, the link between adjusted ES's and stochastic dominance is far stronger. Recall that for any random variables X, Y ∈ L^1 we say that X dominates Y with respect to second-order stochastic dominance, written X ⪰_SSD Y, whenever the following condition holds:

E[u(−X)] ≥ E[u(−Y)] for every increasing and concave function u : R → R.

In the language of utility theory, this means that X is preferred to Y by every risk-averse agent (recall that positive values of a random variable represent losses). We refer to Levy (1998) for a classical reference on stochastic dominance. By convexity and law invariance, for every risk profile g ∈ G the risk measure ES_g satisfies:

• consistency with ⪰_SSD: ES_g(X) ≤ ES_g(Y) for all X, Y ∈ L^1 such that X ⪰_SSD Y.

This implies that ES_g belongs to the class of consistent risk measures as defined in Mao and Wang (2020). In fact, it is shown in that paper that any consistent risk measure can be expressed as an infimum of a collection of risk measures which, using the terminology of this paper, are precisely of adjusted ES type.

Proposition 3.1 (Theorem 3.1 in Mao and Wang (2020)). Let ρ : L^1 → (−∞, ∞] be cash additive and consistent with ⪰_SSD. Then, there exists H ⊂ G such that for every X ∈ L^1 we have

ρ(X) = inf_{g∈H} ES_g(X).

The above proposition shows that adjusted ES can be seen as the building block for risk measures that are consistent with second-order stochastic dominance. This class is large and includes, e.g., all law-invariant convex risk measures. It is well known that, in addition to convexity, ES satisfies positive homogeneity. This qualifies it as a coherent risk measure in the sense of Artzner et al. (1999). In the next proposition we show that ES_g satisfies positive homogeneity only in the case where it coincides with some ES. In other words, with the exception of ES, the class of adjusted ES's consists of monetary risk measures that are convex but not coherent.

Proposition 3.2. For every risk profile g ∈ G the following statements are equivalent:

(a) ES_g is positively homogeneous, i.e., ES_g(λX) = λ ES_g(X) for all X ∈ L^1 and λ ∈ (0, ∞).
(b) g(0) = 0 and g(p) ∈ (0, ∞) for at most one p ∈ (0, 1].
(c) ES_g = ES_p where p = sup{q ∈ [0,1] | g(q) = 0}.

Proof. "(a)⇒(b)": Since ES_g is positively homogeneous we have

−g(0) = ES_g(0) = λ ES_g(0) = −λ g(0) for every λ ∈ (0, ∞).

As g(0) < ∞ by our assumptions on the class G, we must have g(0) = 0. Now, assume by way of contradiction that 0 < g(p_1) ≤ g(p_2) < ∞ for some 0 < p_1 < p_2 ≤ 1. Take now q ∈ (p_1, p_2) and b ∈ (0, g(p_1)) and set

a := −(1 − q) b / (q − p_1).

Note that a < 0. Since the underlying probability space is assumed to be atomless, we can always find a random variable X ∈ L^1 satisfying

P(X = a) = q and P(X = b) = 1 − q.

Moreover, for every p ∈ [p_1, q), the choice of b implies

ES_p(X) < b < g(p_1) ≤ g(p).

As a result, for every p ∈ [0, q) we obtain ES_p(X) − g(p) ≤ 0. Similarly, for every p ∈ [q, 1] we easily see that ES_p(X) − g(p) = b − g(p) ≤ 0. This yields ES_g(X) ≤ 0.
However, since ES_q(X) = b > 0 and g(q) < ∞, taking λ > 0 large enough delivers

ES_g(λX) ≥ λ ES_q(X) − g(q) > 0 ≥ λ ES_g(X),

in contrast to positive homogeneity. As a consequence, we must have p_1 = p_2 and thus (b) holds. "(b)⇒(c)": Set q = sup{p ∈ [0,1] | g(p) = 0}. Note that q ∈ [0,1]. Clearly, we have g(p) = 0 for every p ∈ [0, q) and g(p) = ∞ for every p ∈ (q, 1] by assumption. Take an arbitrary X ∈ L^1. From the definition of ES and the continuity of the integral, it follows that p ↦ ES_p(X) is continuous. As a result, we obtain

ES_g(X) = sup_{p∈[0,q)} ES_p(X) = ES_q(X).

The implication "(c)⇒(a)" is clear since every ES is positively homogeneous.

An adjusted ES is convex but, unless it coincides with a standard ES, not subadditive. It is therefore natural to focus on infimal convolutions of adjusted ES's, which are important tools in the study of optimal risk sharing and capital allocation problems involving non-subadditive risk measures; see, e.g., Barrieu and El Karoui (2005), Burgert and Rüschendorf (2008), and Filipović and Svindland (2008) for results in the convex world and Embrechts et al. (2018) for results beyond convexity.

Definition 3.3. Let n ∈ N and consider ρ_1, …, ρ_n : L^1 → (−∞, ∞]. For every X ∈ L^1 we set

S_n(X) := {(X_1, …, X_n) ∈ L^1 × · · · × L^1 | X_1 + · · · + X_n = X}.

The functional □_{i=1}^n ρ_i : L^1 → [−∞, ∞] defined by

(□_{i=1}^n ρ_i)(X) := inf{ρ_1(X_1) + · · · + ρ_n(X_n) | (X_1, …, X_n) ∈ S_n(X)}

is called the inf-convolution of {ρ_1, …, ρ_n}. For n = 2 we simply write ρ_1 □ ρ_2.

Remark 3.4. Recall that, if ρ_1, …, ρ_n are monetary risk measures, then for every X ∈ L^1

(□_{i=1}^n ρ_i)(X) = inf{m ∈ R | X − m ∈ A_1 + · · · + A_n},

where A_i := {X ∈ L^1 | ρ_i(X) ≤ 0} is the acceptance set induced by ρ_i for i = 1, …, n. This shows that the inf-convolution of monetary risk measures is also a monetary risk measure.

We establish a general inequality for inf-convolutions. More precisely, we show that any inf-convolution of adjusted ES's can be controlled from below by a suitable adjusted ES. This allows us to derive a formula for the inf-convolution of an adjusted ES with itself.

Proposition 3.5. Let n ∈ N and consider the risk profiles g, g_1, …, g_n ∈ G. For every X ∈ L^1

(□_{i=1}^n ES_{g_i})(X) ≥ ES_{g_1+···+g_n}(X).   (4)

In particular, for every X ∈ L^1

(□_{i=1}^n ES_g)(X) = ES_{ng}(X).   (5)

Proof. To show (4), it suffices to focus on the case n = 2. For all Y ∈ L^1 and p ∈ [0,1] we have

ES_{g_1}(Y) + ES_{g_2}(X − Y) ≥ ES_p(Y) − g_1(p) + ES_p(X − Y) − g_2(p) ≥ ES_p(X) − (g_1 + g_2)(p)

by subadditivity of ES. Taking the supremum over p and the infimum over Y delivers the desired inequality. To show (5), note that the inequality "≥" follows directly from (4). To show the inequality "≤", observe that

n ES_g(X/n) = n sup_{p∈[0,1]} {ES_p(X)/n − g(p)} = sup_{p∈[0,1]} {ES_p(X) − n g(p)} = ES_{ng}(X).

As a result, we infer that

(□_{i=1}^n ES_g)(X) ≤ n ES_g(X/n) = ES_{ng}(X).

This yields the desired inequality and concludes the proof.

Remark 3.6. A risk measure that is not subadditive may incentivize the splitting and (internal) reallocation of risk with the sole purpose of reaching a lower level of capital requirements. This is related to the notion of regulatory arbitrage introduced in Wang (2016). In line with that paper, we say that a functional ρ : L^1 → (−∞, ∞] is either free of regulatory arbitrage or has limited or infinite regulatory arbitrage if the quantity (recall our convention ∞ − ∞ = −∞)

ρ(X) − inf_{n∈N} (□_{i=1}^n ρ)(X)

is null, finite, or infinite for every X ∈ L^1. Clearly, every risk measure that is not subadditive admits regulatory arbitrage. The preceding result on infimal convolutions allows us to show that an adjusted ES exhibits regulatory arbitrage only in a limited form. More precisely, for a risk profile g ∈ G with g(0) = 0 the risk measure ES_g has limited regulatory arbitrage: it suffices to note that Proposition 3.5 implies, for every n ∈ N,

(□_{i=1}^n ES_g)(X) = ES_{ng}(X) ≥ ES_0(X) − n g(0) = E[X] > −∞,

so that the regulatory arbitrage of ES_g at X is bounded above by ES_g(X) − E[X].
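Formula (5) makes the limited regulatory arbitrage of Remark 3.6 easy to visualize: splitting X into n identical parts produces the total requirement ES_{ng}(X), which decreases in n but never falls below E[X]. A sketch reusing the empirical helpers from the Section 2 sketch, with an arbitrary illustrative profile g:

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.standard_normal(20_000)                        # a position with E[X] = 0

g = lambda p: 0.5 * p / (1 - p) if p < 1 else np.inf   # increasing, g(0) = 0

# By Proposition 3.5, the n-fold inf-convolution of ES_g with itself is ES_{ng}.
for n in (1, 2, 5, 20, 100):
    g_n = lambda p, n=n: n * g(p)
    print(f"n = {n:3d}: ES_ng(X) = {adjusted_es(x, g_n):.3f}")   # decreases toward 0
```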
We conclude this section by focusing on dual representations, which are a useful tool in many applications, notably optimization problems; see the general discussion in Rockafellar (1974) and the results on risk measures in Föllmer and Schied (2016). In what follows we denote by 𝒫_∞ the set of probability measures on (Ω, F) that are absolutely continuous with respect to P and whose Radon-Nikodym derivative dQ/dP is bounded.

Proposition 3.7. Consider a risk profile g ∈ G. For every X ∈ L^1 we have

ES_g(X) = sup_{Q∈𝒫_∞} {E_Q[X] − g(1 − ‖dQ/dP‖_∞^{−1})}.

Proof. For notational convenience, for every Q ∈ 𝒫_∞ set p_Q := 1 − ‖dQ/dP‖_∞^{−1}. Take X ∈ L^1. The well-known dual representation of ES states that

ES_p(X) = sup{E_Q[X] | Q ∈ 𝒫_∞, ‖dQ/dP‖_∞ ≤ 1/(1 − p)}

for every p ∈ [0,1], with the convention 1/0 = ∞; see, e.g., Föllmer and Schied (2016). Since ‖dQ/dP‖_∞ ≤ 1/(1 − p) holds precisely when p ≥ p_Q, it follows that

ES_g(X) = sup_{p∈[0,1]} sup_{Q∈𝒫_∞, p≥p_Q} {E_Q[X] − g(p)} = sup_{Q∈𝒫_∞} {E_Q[X] − inf{g(p) | p ≥ p_Q}}.

It remains to observe that the above infimum equals g(1 − ‖dQ/dP‖_∞^{−1}) by monotonicity of g.

In this section we focus on a special class of adjusted ES's for which the target risk profiles are expressed in terms of the ES profile of a reference random loss. As shown below, these special adjusted ES's are intimately linked with second-order stochastic dominance.

Definition 4.1. Let ρ : L^1 → (−∞, ∞].

(1) ρ is called a benchmark-adjusted ES if there exists Z ∈ L^1 such that for every X ∈ L^1

ρ(X) = sup_{p∈[0,1]} {ES_p(X) − ES_p(Z)}.

(2) ρ is called an SSD-based risk measure if there exists Z ∈ L^1 such that for every X ∈ L^1

ρ(X) = inf{m ∈ R | X − m ⪰_SSD Z}.

It is clear that benchmark-adjusted ES's are special instances of adjusted ES's for which the target risk profile is defined in terms of the ES profile of a benchmark random loss. The distribution of this random loss may correspond, for example, to the (stressed) historical loss distribution of the underlying position or to a target (risk-class specific) loss distribution. It is also clear that SSD-based risk measures are nothing but monetary risk measures associated with acceptance sets defined through second-order stochastic dominance. The classical characterization of second-order stochastic dominance in terms of ES can be used to show that benchmark-adjusted ES's coincide with SSD-based risk measures. In addition, we provide a simple characterization of this class of risk measures.

Theorem 4.2. For a monetary risk measure ρ : L^1 → (−∞, ∞] the following statements are equivalent:

(i) ρ is a benchmark-adjusted ES.
(ii) ρ is an SSD-based risk measure.
(iii) ρ is consistent with ⪰_SSD and the set {X ∈ L^1 | ρ(X) ≤ 0} has a ⪰_SSD-minimum element.

Proof. Recall that for all X ∈ L^1 and Z ∈ L^1 we have X ⪰_SSD Z if and only if ES_p(X) ≤ ES_p(Z) for every p ∈ [0,1]; see, e.g., Theorem 4.A.3 in Shaked and Shanthikumar (2007). For convenience, set A := {X ∈ L^1 | ρ(X) ≤ 0}.

To show that (i) implies (ii), assume that ρ is a benchmark-adjusted ES with respect to Z ∈ L^1. Then, for every X ∈ L^1

ρ(X) = inf{m ∈ R | ES_p(X − m) ≤ ES_p(Z) for every p ∈ [0,1]} = inf{m ∈ R | X − m ⪰_SSD Z}.

To show that (ii) implies (i), assume that ρ is SSD-based with respect to Z ∈ L^1. Then, we have

ρ(X) = inf{m ∈ R | ES_p(X) − m ≤ ES_p(Z) for every p ∈ [0,1]} = sup_{p∈[0,1]} {ES_p(X) − ES_p(Z)}.

It is clear that (iii) implies (ii). Finally, to show that (ii) implies (iii), assume that ρ is an SSD-based risk measure with respect to Z ∈ L^1. It is clear that Z ∈ A. Now, take an arbitrary X ∈ A. We find a sequence (m_n) ⊂ R such that m_n ↓ ρ(X) and X − m_n ⪰_SSD Z for every n ∈ N. This implies that X − ρ(X) ⪰_SSD Z. Since ρ(X) ≤ 0, we infer that X ⪰_SSD Z as well. This shows that A has a ⪰_SSD-minimum element. To establish that ρ is consistent with ⪰_SSD, take arbitrary X, Y ∈ L^1 satisfying X ⪰_SSD Y. For every m ∈ R such that Y − m ⪰_SSD Z we clearly have that X − m ⪰_SSD Z. This implies that ρ(X) ≤ ρ(Y) and concludes the proof.

The preceding result delivers an interesting representation of a benchmark-adjusted ES in terms of utility functions which helps highlight its "risk aversion" nature. More precisely, we show that an adjusted ES with risk profile given by the ES profile of a benchmark random loss Z ∈ L^1 determines the minimal amount of capital that makes every risk-averse agent better off than being exposed to the loss Z.
In this sense, one may view a benchmark-adjusted ES as a worst-case utility-based risk measure over all conceivable risk-averse profiles. Recall that, if one moves from utility functions to loss functions, then utility-based risk measures correspond to the so-called shortfall risk measures as defined, e.g., in (Föllmer and Schied, 2016, Section 4.9).

Proposition 4.3. Let Z ∈ L^1 and consider the risk profile g(p) = ES_p(Z) for every p ∈ [0,1]. Moreover, let U be the family of all (nonconstant) concave and increasing functions u : R → R. Then, for every X ∈ L^1

ES_g(X) = sup_{u∈U} inf{m ∈ R | E[u(−(X − m))] ≥ E[u(−Z)]}.

Proof. For every u ∈ U set A_u := {X ∈ L^1 | E[u(−X)] ≥ E[u(−Z)]} and A := ∩_{u∈U} A_u. To establish the claim, we can equivalently prove that for every X ∈ L^1

inf{m ∈ R | X − m ∈ A} = sup_{u∈U} inf{m ∈ R | X − m ∈ A_u}.   (6)

To this effect, Theorem 4.A.3 in Shaked and Shanthikumar (2007) implies that

A = {X ∈ L^1 | X ⪰_SSD Z} = A_g.

This implies (6). Indeed, the inequality "≥" is clear. To show the inequality "≤", take any number

k > sup_{u∈U} inf{m ∈ R | X − m ∈ A_u}.

Then, for every u ∈ U we must have X − k ∈ A_u or, equivalently, X − k ∈ A. This yields k ≥ inf{m ∈ R | X − m ∈ A}. Taking the infimum over such k's delivers the desired inequality and completes the proof.

In light of the relevance of benchmark-adjusted ES's, we are interested in characterizing when the acceptable risk profile g of an adjusted ES can be expressed in terms of an ES profile. To this effect, it is convenient to introduce the following additional class of risk measures, which will be shown to contain all benchmark-adjusted ES's. We denote by L^0 the space of all random variables.

Definition 4.4. A functional ρ : L^1 → (−∞, ∞] is called a quantile-adjusted ES if there exists Z ∈ L^0 such that for every X ∈ L^1

ρ(X) = sup_{p∈[0,1]} {ES_p(X) − VaR_p(Z)}.

To establish our desired characterization, for a risk profile g ∈ G we define h_g : [0,1] → (−∞, ∞] by

h_g(t) := (1 − t) g(t).

Here, we set 0 · ∞ = 0 so that h_g(1) = 0. Moreover, we introduce the following sets:

G_VaR := {g ∈ G | g is finite on [0,1), left-continuous on [0,1], and right-continuous at 0},
G_ES := {g ∈ G_VaR | h_g is concave on (0,1) and left-continuous at 1}.

Lemma 4.5. For every risk profile g ∈ G the following statements hold:

(i) g ∈ G_VaR if and only if there exists a random variable Z ∈ L^0 that is bounded from below and satisfies g(p) = VaR_p(Z) for every p ∈ [0,1].
(ii) g ∈ G_ES if and only if there exists a random variable Z ∈ L^1 such that g(p) = ES_p(Z) for every p ∈ [0,1].

Proof. (i) The "if" part is clear. For the "only if" part, let U be a uniform random variable on [0,1] and set Z = g(U). Then, it is well known that VaR_p(Z) = g(p) for every p ∈ [0,1]. Moreover, since g(0) > −∞, we see that Z is bounded from below.

(ii) The "if" part is straightforward. For the "only if" part, let U be a uniform random variable on [0,1]. We denote by h′_g the left derivative of h_g. Then, for every p ∈ [0,1) we have

(1 − p) g(p) = h_g(p) = − ∫_p^1 h′_g(q) dq = (1 − p) ES_p(−h′_g(U)).

This shows that, by taking Z = −h′_g(U), we have g(p) = ES_p(Z) for every p ∈ [0,1). The left continuity of g and ES_·(Z) at 1 gives the same equality for p = 1.

As a direct consequence of the previous lemma we derive a characterization of quantile- and benchmark-adjusted ES's in terms of the underlying risk profile.

Theorem 4.6. For every risk profile g ∈ G the following statements hold:

(i) There exists Z ∈ L^0 that is bounded from below and such that ES_g is a quantile-adjusted ES with respect to Z if and only if g ∈ G_VaR.
(ii) There exists Z ∈ L^1 such that ES_g is a benchmark-adjusted ES with respect to Z if and only if g ∈ G_ES.
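The construction in the proof of Lemma 4.5(i) is easy to check by simulation: composing a uniform variate with g yields a random variable whose VaR profile is exactly g. A quick sketch with an arbitrary illustrative profile:

```python
import numpy as np

rng = np.random.default_rng(4)
g = lambda p: p ** 2 + 0.1 * p             # an increasing, finite profile on [0, 1]

u = rng.uniform(size=200_000)
z = g(u)                                    # Z = g(U), so VaR_p(Z) = g(p)

for p in (0.5, 0.9, 0.99):
    var_p = np.quantile(z, p)               # empirical left p-quantile
    print(f"p = {p}: VaR_p(Z) = {var_p:.4f}, g(p) = {g(p):.4f}")   # they agree
                                            # up to sampling error
```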
Remark 4.7. We infer from Theorems 4.2 and 4.6 that the classical ES does not belong to the class of SSD-based risk measures, as the associated risk profile is not in G_ES (see also Proposition 3.2). Since we clearly have G_ES ⊂ G_VaR, it follows from the above results that every benchmark-adjusted ES is also a quantile-adjusted ES. In particular, this implies that, for every random variable Z ∈ L^1, we can always find a random variable W ∈ L^0 such that VaR_p(W) = ES_p(Z) for every p ∈ [0,1]. In words, every ES profile can be reproduced by a suitable VaR profile. As pointed out by the next proposition, the converse result is, in general, not true. In addition, we also show that an adjusted ES need not be a quantile-adjusted ES.

Proposition 4.8. (i) There exists g ∈ G such that ES_g ≠ ES_h for every h ∈ G_VaR. (ii) There exists g ∈ G_VaR such that ES_g ≠ ES_h for every h ∈ G_ES.

Proof. The second assertion follows immediately from Theorem 4.6 and the fact that the inclusion G_ES ⊂ G_VaR is strict. To establish the first assertion, fix q ∈ (0,1) and define g ∈ G by setting

g(p) := 0 for p ∈ [0, q] and g(p) := ∞ for p ∈ (q, 1].

It follows that ES_g(X) = ES_q(X) for every X ∈ L^1. We claim that ES_g is not a quantile-adjusted ES. To the contrary, suppose that there exists a random variable Z ∈ L^0 that is bounded from below and satisfies

ES_q(X) = sup_{p∈[0,1]} {ES_p(X) − VaR_p(Z)}

for every X ∈ L^1. Take r ∈ (q, 1) and X ∈ L^1 such that ES_r(X) > ES_q(X). Then, for each λ > 0

λ ES_q(X) = ES_q(λX) = sup_{p∈[0,1]} {ES_p(λX) − VaR_p(Z)} ≥ λ ES_r(X) − VaR_r(Z).

By sending λ → ∞, we obtain ES_q(X) ≥ ES_r(X), which contradicts our assumption on X. Note that ES is always finite on our domain. Here, we are interested in discussing the finiteness of adjusted ES's associated with risk profiles in the classes G_VaR and G_ES. We show that finiteness on the whole reference space L^1 can never hold in the presence of a risk profile in G_ES while it can hold if we take a risk profile in G_VaR.

Proposition 4.9. Consider a risk profile g ∈ G. If g ∈ G_VaR, then ES_g can be finite on L^1. If g ∈ G_ES, then ES_g cannot be finite on L^1.

Proof. To show the first part of the assertion, set g(p) = 1/(1 − p) for every p ∈ [0,1] (with the convention 1/0 = ∞). Note that g ∈ G_VaR. Fix X ∈ L^1 and note that, since (1 − p) ES_p(X) → 0 as p → 1, there exists q ∈ (0,1) such that

sup_{p∈[q,1]} {ES_p(X) − g(p)} ≤ 0, while sup_{p∈[0,q]} {ES_p(X) − g(p)} ≤ ES_q(X) − 1 < ∞.

This shows that ES_g is finite on the entire L^1. To establish the second part of the assertion, take Z ∈ L^1 and set g(p) = ES_p(Z) for every p ∈ [0,1]. Note that g ∈ G_ES by Lemma 4.5. If Z is bounded from above, then take X ∈ L^1 that is unbounded from above. In this case, it follows that

ES_g(X) ≥ lim sup_{p→1} {ES_p(X) − ES_p(Z)} = ∞.

If Z is unbounded from above, then take X = 2Z ∈ L^1. In this case, we have

ES_g(X) = sup_{p∈[0,1]} {2 ES_p(Z) − ES_p(Z)} = sup_{p∈[0,1]} ES_p(Z) = ∞.

Hence, we see that ES_g is never finite on L^1.

The next result improves Proposition 3.5 by showing that the inf-convolution of benchmark-adjusted ES's can still be expressed as an adjusted ES.

Proposition 4.10. Let n ∈ N and consider the risk profiles g_1, …, g_n ∈ G_ES. For every X ∈ L^1

(□_{i=1}^n ES_{g_i})(X) = ES_{g_1+···+g_n}(X).

Proof. The inequality "≥" follows from Proposition 3.5. To show the inequality "≤", note that there exist Z_1, …, Z_n ∈ L^1 such that A_{g_i} = {X ∈ L^1 | X ⪰_SSD Z_i} by Theorem 4.6. We prove that

A_{g_1+···+g_n} ⊂ A_{g_1} + · · · + A_{g_n},

which, together with Remark 3.4, yields the desired inequality. Let U be a uniform random variable on [0,1] and, for any X ∈ L^1, denote by F_X^{−1} the (left) quantile function of X. Take i ∈ {1, …, n} and note that ES_p(F_{Z_i}^{−1}(U)) = ES_p(Z_i) = g_i(p) for every p ∈ [0,1]. Set Z := F_{Z_1}^{−1}(U) + · · · + F_{Z_n}^{−1}(U) and note that, by comonotonic additivity of ES, ES_p(Z) = g_1(p) + · · · + g_n(p) for every p ∈ [0,1]. We deduce that each X ∈ A_{g_1+···+g_n} satisfies ES_p(X) ≤ ES_p(Z) for every p ∈ [0,1], which is equivalent to X ⪰_SSD Z. Note that Z ∈ A_{g_1} + · · · + A_{g_n}, so that (□_{i=1}^n ES_{g_i})(Z) ≤ 0. Since the inf-convolution is consistent with ⪰_SSD, as shown in Theorem 4.1 of Mao and Wang (2020), we obtain (□_{i=1}^n ES_{g_i})(X) ≤ (□_{i=1}^n ES_{g_i})(Z) ≤ 0, which implies X ∈ A_{g_1} + · · · + A_{g_n} as desired.
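Theorem 4.2 and Lemma 4.5 can be explored numerically: take g equal to the empirical ES profile of a benchmark sample Z; the resulting benchmark-adjusted ES of X is nonpositive exactly when the ES curve of X stays below that of Z, i.e., when X dominates Z in the sense of SSD. A sketch reusing empirical_es from the Section 2 helpers (the normal samples are illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)
z = np.sort(rng.normal(loc=0.0, scale=1.0, size=50_000))    # benchmark loss Z
x = np.sort(rng.normal(loc=-0.3, scale=1.0, size=50_000))   # candidate position X

grid = np.linspace(0.0, 0.999, 1000)
g = {p: empirical_es(z, p) for p in grid}                   # g(p) = ES_p(Z)

rho = max(empirical_es(x, p) - g[p] for p in grid)          # benchmark-adjusted ES
print(f"ES_g(X) = {rho:.3f}")   # about -0.3: X has the same spread but a smaller
                                # mean, so its ES profile lies below the benchmark's
```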
Using the characterization of benchmark-adjusted ES's established in Theorem 4.2, many optimization problems related to benchmark-adjusted ES's or, equivalently, SSD-based risk measures can be solved explicitly. In this section, we focus on risk minimization and utility maximization problems in the context of a multi-period frictionless market that is complete and arbitrage free. The interest rate is set to be zero for simplicity. As is commonly done in the literature, this type of optimization problems, which are naturally expressed in terms of dynamic investment strategies, can be converted into static optimization problems by way of martingale methods. Below we focus directly on their static counterparts. For more details we refer, e.g., to Schied et al. (2009) or Föllmer and Schied (2016). In addition, to ensure that all our problems are well defined, we work in the space L^∞ of P-bounded random variables. In the sequel, we denote by Q the risk-neutral pricing measure (whose existence and uniqueness in our setting are ensured by the Fundamental Theorem of Asset Pricing), by w ∈ R a fixed level of initial wealth, by x ∈ R a real number representing a constraint, by u : R → R ∪ {−∞} a concave and increasing function that is continuous (at the point where it potentially jumps to −∞) and satisfies lim_{y→−∞} u(y) < x < lim_{y→∞} u(y), and by ρ : L^∞ → (−∞, ∞] a risk functional. We focus on the following five optimization problems:

(A) risk minimization with a budget constraint;
(B) price minimization with controlled risk;
(C) risk minimization with a target utility level;
(D) worst-case utility with a reference risk assessment;
(E) worst-case risk with a reference risk assessment, where the risk in the objective is measured by an SSD-consistent functional that is continuous with respect to the L^∞-norm.

Problem (A) is an optimal investment problem minimizing the risk given a budget constraint. Conversely, problem (B) aims at minimizing the cost given a controlled risk level. Problem (C) is about minimizing the risk exposure with a target utility level, similar to the mean-variance problem of Markowitz (1952). The interpretation of problems (D) and (E) is different from the first three problems: they are not about optimization over risk, but about ambiguity, i.e., in these problems the main concern is model risk. Indeed, the set L^∞ may represent the class of plausible models for the distribution of a certain financial position of interest. In the case of problem (D), the assumption is that the only available information for X is the risk figure ρ(X), evaluated, e.g., by an expert or another decision maker. In this context, we are interested in determining the worst-case utility among all possible models which agree with the evaluation ρ(X) = x (see also Example 5.3 of Wang et al. (2019)). A similar interpretation can be given for problem (E).

Proposition 5.1. Each of the optimization problems (A)-(E) relative to a benchmark-adjusted ES ρ = ES_g for g ∈ G_ES admits an optimal solution of the explicit form Z + z, where Z ∈ L^∞ has the ES profile g and z ∈ R. Moreover, Z is comonotonic with dQ/dP in (A)-(B), and the (binding) constraint uniquely determines z in each problem.

Proof. The result for the optimization problem (A) is a direct consequence of Proposition 5.2 in Mao and Wang (2020). Let Z be comonotonic with dQ/dP and have ES profile g (comonotonicity is only relevant in problems (A) and (B)). Note that ρ(Z) = 0. For any random variable X ∈ L^∞, we set Y_X = Z + ρ(X).
It is clear that ρ(Y_X) = ρ(X) and

ES_p(Y_X) = g(p) + ρ(X) = g(p) + sup_{q∈[0,1]} {ES_q(X) − g(q)} ≥ ES_p(X).

Hence, X ⪰_SSD Y_X. This observation will be useful in the analysis below.

(i) We first look at problem (B). First, since both X ↦ E_Q[X] and ρ are translation invariant, the condition ρ(X) ≤ x is binding, and problem (B) is equivalent to maximizing E_Q[X] over X ∈ L^∞ such that ρ(X) = x. Let X ∈ L^∞ be any random variable with ρ(X) = x and let X′ be identically distributed as X and comonotonic with dQ/dP. Since X′ ∼ X, by the Hardy-Littlewood inequality (see, e.g., Remark 3.25 of Rüschendorf (2013)), we have E_Q[X] ≤ E_Q[X′]. Moreover, for any random variable Y ∈ L^∞ that is comonotonic with dQ/dP, we can write (see, e.g., (A.8) of Mao and Wang (2020))

E_Q[Y] = ∫_{[0,1]} ES_p(Y) µ(dp)

for some Borel probability measure µ on [0,1]. Hence, X′ ⪰_SSD Y_X implies E_Q[X′] ≤ E_Q[Y_X], and we obtain

E_Q[X] ≤ E_Q[X′] ≤ E_Q[Y_X].

Note also that ρ(Y_X) = ρ(X) = x. Hence, for any random variable X ∈ L^∞, there exists Z + z for some z ∈ R which dominates X for problem (B). Since both the constraint and the objective are continuous in z ∈ R, an optimizer of the form Z + z exists.

(ii) We next look at problem (C). Let X ∈ L^∞ be any random variable such that E[u(w − X)] = x. The aforementioned fact X ⪰_SSD Y_X implies that E[u(w − Y_X)] ≤ E[u(w − X)] = x since u is a concave utility function. Therefore, there exists ε ≥ 0 such that E[u(w − (Y_X − ε))] = x, and we take the largest ε satisfying this equality, which is obviously finite. Let z = ρ(X) − ε. It is then clear that E[u(w − (Z + z))] = E[u(w − X)] = x and ρ(Z + z) = ρ(Y_X − ε) = ρ(X) − ε ≤ ρ(X). Hence, Z + z dominates X as an optimizer for problem (C). Since both the constraint and the objective are continuous in z ∈ R, an optimizer of the form Z + z exists.

(iii) Problems (D) and (E) can be dealt with using similar arguments.

Remark 5.2. (i) Recall that ES does not belong to the class of SSD-based risk measures. As a consequence, the results in this section do not directly apply to ES. In particular, although ES is consistent with SSD, its acceptance set does not have a minimum SSD element as required by Theorem 4.2. We refer to Wang and Zitikis (2020) for a different characterization of ES.

(ii) In the context of decision theory and, specifically, portfolio selection, it is sometimes argued that (second-order) stochastic dominance is too extreme in the sense that it ranks risks according to the simultaneous preferences of every risk-averse agent, thus including utility functions that may lead to counterintuitive outcomes. A typical example is the one proposed by Leshno and Levy (2002). Consider a portfolio that pays one million dollars in 99% of cases and nothing otherwise and another portfolio that pays one dollar with certainty. According to the sign convention adopted in this paper, the corresponding payoffs are given by

X = 0 with probability 1%, X = −10^6 with probability 99%, and Y = −1.

Even though X does not dominate Y with respect to ⪰_SSD, most agents prefer X to Y. Thus, the authors argue for the necessity of relaxing SSD in favor of a more reasonable notion. We point out that our approach yields a novel and reasonable generalization of SSD. First, consider the risk profile defined by g(p) = ES_p(Y) = −1 for every p ∈ [0,1] and note that X is acceptable under ES_g precisely when X ⪰_SSD Y. Note also that

ES_p(X) ≤ g(p) ⟺ p ≤ p̄ := 1 − 10^4/(10^6 − 1) ≈ 0.99.
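The cutoff p̄ can be verified directly from the definition of ES for this two-point loss; a quick check in exact arithmetic:

```python
from fractions import Fraction

def es_two_point(p):
    """ES_p for the loss X = -10**6 w.p. 99% and 0 w.p. 1%."""
    if p >= Fraction(99, 100):
        return Fraction(0)          # the worst 1% of outcomes is the zero loss
    # average of the quantile function of X over (p, 1)
    return (Fraction(99, 100) - p) * (-10**6) / (1 - p)

p_bar = 1 - Fraction(10**4, 10**6 - 1)
print(es_two_point(p_bar) <= -1)                        # True: acceptable at p_bar
print(es_two_point(p_bar + Fraction(1, 10**9)) <= -1)   # False: fails just above
```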
This fact has two implications. On the one hand, it confirms that X does not dominate Y with respect to ⪰_SSD and highlights that this failure is due to the behavior of X in the far region of its left tail. On the other hand, it suggests that it is enough to consider the new risk profile defined by h(p) = g(p) for p ≤ p̄ and h(p) = ∞ otherwise to make X acceptable under ES_h. In other words, moving from g to h is equivalent to moving from SSD to a relaxed form of SSD that enlarges the spectrum of acceptability in portfolio selection problems. However, note that ES_h is not an SSD-based risk measure and, hence, the existence results obtained above do not apply to it. A systematic study of optimization problems under constraints of ES_h type requires further research.

References

Spectral measures of risk: A coherent representation of subjective risk aversion.
On the coherence of expected shortfall.
Coherent measures of risk.
Existence, uniqueness, and stability of optimal payoffs of eligible assets.
Inf-convolution of risk measures and optimal risk transfer.
Fundamental review of the trading book. Basel Committee on Banking Supervision. Basel: Bank for International Settlements.
Law-invariant functionals on general spaces of random variables.
Insurability of cyber risk: An empirical analysis.
Risk measures based on benchmark loss distributions.
Allocation of risks and equilibrium in markets with finitely many traders.
Robustness and sensitivity analysis of risk measurement procedures.
Reward-risk portfolio selection and stochastic dominance.
Backtesting expected shortfall: accounting for tail risk.
Quantile-based risk sharing.
Optimal capital and risk allocations for law- and cash-invariant convex functions.
Stochastic Finance: An Introduction in Discrete Time.
VaR and expected shortfall in portfolios of dependent credit risks: conceptual and practical insights.
Improved portfolio choice using second-order stochastic dominance.
Unexpected shortfalls of expected shortfall: Extreme default profiles and regulatory arbitrage.
Comparative and qualitative robustness for law-invariant risk measures.
Multinomial VaR backtests: A simple implicit approach to backtesting expected shortfall.
Preferred by "all" and preferred by "most" decision makers: Almost stochastic dominance.
Stochastic dominance and expected utility: Survey and analysis.
Stochastic Dominance: Investment Decision Making under Uncertainty.
A theory for measures of tail risk. Mathematics of Operations Research, forthcoming.
Risk aversion in regulatory capital principles.
Portfolio selection.
Quantitative Risk Management: Concepts, Techniques, Tools.
Dual stochastic dominance and related mean-risk models.
Cyber-risk Management.
Conjugate duality and optimization.
Conditional value-at-risk for general loss distributions.
Mathematical risk analysis.
Robust preferences and robust portfolio choice. Handbook of Numerical Analysis.
Stochastic Orders. Springer Series in Statistics.
Regulatory arbitrage of risk measures.
Dual utilities on risk aggregation under dependence uncertainty.
An axiomatic foundation for the expected shortfall. Management Science, forthcoming.
Solvency II, or how to sweep the downside risk under the carpet.
Coherence and elicitability.