key: cord-0493701-2b7e4s4g authors: Wang, Liang; Xia, Weixuan title: Power-type derivatives for rough volatility with jumps date: 2020-08-24 journal: nan DOI: nan sha: 696d8d3d9a47abe39886cf52c1ca21dcf7d9c720 doc_id: 493701 cord_uid: 2b7e4s4g In this paper we propose a novel pricing-hedging framework for volatility derivatives which simultaneously takes into account rough volatility and volatility jumps. Our model directly targets the instantaneous variance of a risky asset and consists of a generalized fractional Ornstein-Uhlenbeck process driven by a L'{e}vy subordinator and an independent sinusoidal-composite L'{e}vy process. The former component captures short-term dependence in the instantaneous volatility, while the latter is introduced expressly for rectifying the activity level of the average forward variance. Such a framework ensures that the characteristic function of average forward variance is obtainable in semi-closed form, without having to invoke any geometric-mean approximations. To analyze swaps and European-style options on average forward volatility, we introduce a general class of power-type derivatives on the average forward variance, which also provide flexible nonlinear leverage exposure. Pricing-hedging formulae are based on a modified numerical Fourier transform technique. A comparative empirical study is conducted on two independent recent data sets on VIX options, before and during the COVID-19 pandemic, to demonstrate that the proposed framework is highly amenable to efficient model calibration under various choices of kernels. "Rough volatility" is a relatively new and yet already familiar jargon that has flourished in the financial world since the pioneering research work of [Gatheral et al, 2018 ] [17] , which provided striking empirical evidence suggesting rough sample paths (compared to those of a semimartingale) of volatility observed in high-frequency financial time series and thus the presence of shortterm dependence, while the idea of introducing frictions into volatility quantities goes back to the much earlier work of [Alòs et al, 2007] [2] motivated from observations in option price-implied according to the controlling fraction parameter. For this reason, inclusion of volatility jumps in a model that is already fractional has seemingly been ignored for investigation. Nevertheless, since increased activity levels are an inevitable consequence of increased path roughness, using a fractional Brownian motion with a fraction parameter less than 1 will only increase the activity level of the resultant variance process, which to a degree neglects the empirical findings of [Todorov and Tauchen, 2011] [47] ; see also [Bollerslev and Todorov, 2011] [9] and [Bardgett et al, 2019] [4] for similar confirmations of the necessity of volatility jumps. In connection with this, we expect that replacing the Brownian motion with a purely discontinuous process whose sample paths are less active, combined with suitable modifications, is able to strike a balance between these two important aspects (short-term dependence and volatility jumps) and eventually yield desirable modeling outcomes. These inspire us to take on a new path deviating from the use of a fractional Brownian motion in the establishment of rough volatility and switch to purely discontinuous square-integrable Lévy processes of infinite activity. In more detail, we want to propose a flexible framework for the instantaneous variance based on the sum of a generalized fractional Ornstein-Uhlenbeck process subject to an integrable kernel and an independent bounded process; a key feature of this formulation is that it is not derived from taking logarithms but yet is capable of simultaneously capturing short-term dependence and possible jumps in the instantaneous variance, as well as achieving an arbitrary suitable activity level of the corresponding forward variance curve. It is fundamentally a quasi-linear framework resembling the well-known BNS models of [Barndorff-Nielsen and Shephard, 2001] [5] . We note that models of non-exponential type have also been widely applied in volatility analysis, some recent developments including [Hofmann and Schulz, 2016] [20] and [Issaka and SenGupta, 2017] [22] . Needless to say, despite that inclusion of jumps may not result in significant improvement of the model fit of volatility distributions observed at high frequencies, as noted in [Gatheral et al, 2018, Sect. 6] [17] , it is undoubtedly innocuous and the model distribution in logarithm can also become arbitrarily close to normality by properly tuning scale parameters, thanks to the central limit theorem. On the contrary, introducing jumps into the instantaneous variance gives rise to an analytically tractable structure for the characteristic function of the average forward volatility, facilitating the pricing and hedging of volatility derivatives of interest. In particular, such a structure requires no inexact transformations, such as the geometric-mean approximation adopted in e.g. [Horvath et al, 2020] [21] , for the average forward volatility, which appear to be inevitable under exponential models. Besides, although our main results are given in a general setting, attention will be drawn to three particular types of stationary kernels, all of which are comfortable to work with and have their own advantages. While the first type is recognized for its incommensurable simplicity, the second is compatible with the transformation of the instantaneous variance dynamics into a usual Ornstein-Uhlenbeck process but driven by a fractional Lévy process. The third type is arguably a result of reverse engineering and designed specifically to avoid certain transcendental functions that are relatively costly to implement. In so doing it will be interesting as well to compare the overall suitability of various types of kernels, despite their considerable similarity to each other in shape. As already noted, our ultimate objective in the present paper lies in analyzing European-style financial derivatives written on the average forward volatility, such as the VIX index, including swaps and options. Under the proposed model framework, we obtain pricing-hedging formulae for a more general class of power-type derivatives, which raise the underlying volatility or the standard option payoff to a certain nonnegative power. Noteworthily, derivatives with power payoff functions written on equity have been thoroughly examined in the literature; see [Tompkins, 1999] [48] , [Raible, 2000] [41] , [Macovschi and Quittard-Pinon, 2006 ] [33] , and [Xia, 2017] [51] on single-asset options and [Blenman and Clark, 2005] [8] , [Wang, 2016] [49] , and [Xia, 2019] [52] on exchange options. Similar exchange options on zero-coupon bonds have recently been studied in [Blenman et al, 2020] [7] . Along these lines, consideration of power-type derivatives in the volatility market also has significance in generating nonlinear leverage effects on the investor's risk exposure, and will be conducive to hedging volatility-of-volatility risks. Implementation of the proposed pricing-hedging formulae will heavily rely on numerical Fourier transform techniques. The remainder of this paper is organized as follows. In Section 2 we establish our model framework starting from the instantaneous variance dynamics and provide a comprehensive analysis of its properties, including covariance function and path regularity, and then give an integral representation for the characteristic function of the average forward variance. Some simulation techniques are discussed in Section 3, with pertinent convergence results. Section 4 contains our new pricing-hedging formulae for power-type derivatives that nonlinearly extend standard volatility derivatives, which then initiate a comparative empirical study in Section 5 focused around VIX options, utilizing two independent data sets and two of the proposed kernels. Last but not least, we also provide some insight into how the model framework may be further extended to accommodate the presence of rough volatility of volatility in Section 6 by means of a stochastic time change argument, in catering for the noted finding of [Da Fonseca and Zhang, 2019] [13] . Conclusions and future research directions are outlined in Section 7 and all mathematical proofs presented in the end. We begin by synthesizing some crucial ingredients of a non-Gaussian fractional Lévy process, which are necessary for establishing a model for the instantaneous variance of a risky asset whose sample paths have roughness and jump features. Of course, allowing for positivity of the instantaneous variance the background-driving Lévy process must be nonnegative, i.e., a subordinator. To this end, consider a continuous-time stochastic basis S := (Ω, F , P; F ≡ {F t } t ≥0 ), where the filtration F is assumed to satisfy the usual conditions. Let X ≡ (X t ) be an adapted and squareintegrable Lévy subordinator supported on S, which is exclusively characterized by a Poisson random measure N X defined on (R ++ , R + ). According to the Lévy-Khintchine representation, X 1 has the characteristic exponent where i denotes the imaginary unit and ν X is the intensity measure associated with N X . For practicality we impose the assumption that ν is non-atomic so that the distribution of X 1 is absolutely continuous with respect to Lebesgue measure (see, e.g., [Kohatsu-Higa and Takeuchi, 2019, Theorem 6.3.4] [27] ) and we denote by ξ 1 := E[X 1 ] > 0 and ξ 2 := Var[X 1 ] > 0. Since X has independent and stationary increments, it has the familiar covariance function, for any u > 0, Cov X t , X t +u = ξ 2 t . For a continuously differentiable 2 kernel g ∈ C (1,1) defined in the domain {(t , s) : t > 0, s ∈ [0, t )} satisfying the integrability condition t 0 g 2 (t , s)ds < ∞, ∀t > 0, we then define the fractional Lévy process via the following Volterra-type stochastic integral, which is well-defined P-a.s., and is a.k.a. a Lévy-driven fractionally integrated moving-average process (see [Marquardt, 2006 , Sect. 6.2] [35] ). With this representation, for any t , u > 0, it is easy to see that E X s)ds and, by using the Lévy-Itô isometry (see, e.g., [Lyasoff, 2017, Sect. 16.32] [32] ), the covariance function of the process X (g ) takes the following form, Notably, if lim s t g (t , s) = ∞ for every t > 0, then the integral t 0 g (0,1) (t , s)ds is divergent for every t > 0. In this case, X (g ) exhibits short-term dependence in the sense that there exists ∈ (0, 1) such that Cov X where Var X (g ) t = ξ 2 t 0 g 2 (t , s)ds by the dominated convergence theorem and C (t ) is some constant depending only on t > 0. To for some fraction parameter d ∈ (1/2, 3/2), where 2 F 1 (·, ·; ·; ·) is the Gauss hypergeometric-(2, 1) function ( [Abramowitz and Stegun, 1972, Sect. 15] [1]) and which is non-stationary. We note that the specific form (2.1.2) was initially chosen in [Jost, 2006] [25] in an attempt to match the Weyl integral representation of a fractional Brownian motion living in real-valued time used in [Mandelbrot and van Ness, 1968] [34] . It was shown in [Tikanmäki and Mishura, 2011 ] [46] , however, that such transformation does not necessarily lead to the same finite-dimensional distribution in the more general case of fractional Lévy processes. Another popular choice of g is the following obviously stationary and yet structurally much simpler Riemann-Liouville kernel, 3 In any case, a well-suited candidate for X , having infinitely many jumps on compact time intervals, can be a one-sided tempered stable process, i.e., a tempered stable subordinator, which has three parametersa > 0, b > 0 and c ∈ (0, 1), leading to the following characteristic exponent, so that ν(dz) = ae −bz /z c+1 1 (0,∞) (z)dz, for z > 0, which is clearly an infinite measure. The tempered stable distribution constitutes a fairly general family of infinitely divisible distributions (see [Rosiński, 2007] [42] and [Küchler and Tappe, 2013] [28] , as well as the overview in [Schoutens, 2003, Sect. 5.3] [43]), and it has witnessed many applications in constructing volatility models in discrete time (see, e.g., [Mercuri, 2008] [37] and [Li et al, 2016 ] [30] ). In particular, by taking c 0 and c = 1/2, from X one recovers the well-known gamma process and inverse Gaussian process, respectively; in the former case it is understood that log φ X 1 (l ) = −a log(1 − il /b). With (2.1.5) it is also straightforward to verify that ξ 1 = aΓ(1 − c)/b 1−c and ξ 2 = aΓ(2 − c)/b 2−c . Let us consider, instead of the natural logarithm of the instantaneous volatility of a risky asset, the instantaneous variance process, denoted V ≡ (V t ). Intuitively speaking, our idea is to express V as a Volterra-type stochastic integral, up to shifting and positive scaling, analogous to the fractional Lévy process X (g ) in (2.1.1) with a suitable kernel chosen to allow for short-or long-term dependence as well as long-term mean reversion, which is then supplemented by an independent bounded semimartingale to adjust the vibrancy of the sample paths. This is done by assuming the following quasi-Ornstein-Uhlenbeck structure, whose ingredients are set up as follows: κ > 0 specifies a reversion speed andV ≥ 0 a universal reversion level, h is a continuously differentiable kernel having a power-law left tail and a powerexponential right tail, for a fraction parameter d > 1/2, where (·) + denotes the positive part; Z is a real-valued purely discontinuous Lévy process independent from X , also supported on S, with an absolutely continuous distribution for every fixed t > 0, and ς > 0 is a small (manually adjustable) scaling factor. Subject to the law continuity assumption it suffices that Z 1 admit a characteristic exponent of the form whenever the above integral is finite, where ν Z is the Lévy measure of Z with 0<|z|<1 |z|ν Z (z) = ∞. The condition (2.2.2) subtly embodies the intuition of introducing frictions into V without jeopardizing its mean-reverting property, and it automatically ensures that (2.2.1) is well defined because sup t >0 t 0 h 2 (t , s)ds < ∞. Besides, we assume that V • 0 > 0 has a known value. In fact, if (2.2.2) holds with d ∈ (1/2, 1), then V exhibits mean reversion in the long term but is simultaneously allowed to have short-term dependence. As before, with the Lévy-Itô isometry the process V is seen to have the covariance function equal to for any u > 0, which generally depends on t > 0. Under (2.2.2), if d < 1 then lim s t h(t , s) = ∞ for any t > 0 and there exists C (t ) ∈ R depending only on t such that In this case, the covariance function is rough at the origin and V exhibits short-term dependence. With the right-tail behavior in (2.2.2) V always reverts to a positive mean in the long term, with where we have used lim t →∞ E[cos Z t ] = 0 as |E e iZ 1 | < 1 due to the absolute continuity of the distribution of Z 1 . Observably, our construction of the instantaneous variance V in (2.2.1) consists of two components: a generalized fractional Ornstein-Uhlenbeck process V • and a composite-sinusoidal Lévy process ς(cos Z + 1). Driven by a subordinator (X ), the first component accounts for large upward volatility jumps but depicts significant downward movements as continuous corrections. This is notably in line with recent empirical evidence (see, e.g., [Park, 2016] [39] ) about the relative importance of signed jumps. Meanwhile, the latter component is included for the purpose of capturing small-scale two-sided volatility movements of suitable activity. Simplicity aside, choosing the cosine function has a particular advantage in that it does not alter the nature of (infinite-variation) small jumps of Z , albeit wiping out large values, for, with the Lévy-Itô decomposition in mind, The fractional part of the process V • is actually motivated by the recipe used in [Wolpert and Taqqu, 2004, Sect. 3] [50] based on repeated integration, where h is specialized as the product of a usual exponential kernel and the Riemann-Liouville kernel, which is obviously strictly positive and will be referred to as the type-I kernel. A remarkable difference, however, is that we assume that the instantaneous variance process is only observed starting from time 0. Indeed, the structure (2.2.1) represents a wide range of approaches towards achieving path roughness and mean reversion at the same time, while it gives rise to an Ornstein-Uhlenbeck process driven by a fractional Lévy process (subordinator), i.e., the structure used in [Garnier and Sølna, 2018] [16] , with the choice where g is the kernel mentioned in (2.1.1). With an application of Itô's formula and the Fubini-Tonelli theorem the first equation in (2.2.1) is reformatted into which is the solution of the fractional stochastic integral equation Since the second term on the left-hand side of (2.2.7) is bounded for every fixed t > 0, it holds that lim s t h(t , s)/g (t , s) > 0, provided lim s t g (t , s) = ∞ for any t > 0. Thus, if the sample paths of X (g ) exhibit short-term dependence, then so do those of V , and in fact, their short-term dependence must be of the same degree. If g is further taken to be the stationary Riemann-Liouville kernel, then by straightforward calculations (2.2.7) yields the following type-II kernel which also happens to be stationary, where Γ(·, ·) denotes the upper incomplete gamma function. The correlation structure (2.2.4) can be made more precise with Var X (h) t = ξ 2 t 0 h 2 (s)ds and some which only depends on t . This shows that roughness is established if and only if d ∈ (1/2, 1). However, formed from the Riemann-Liouville kernel the type-II kernel fails to meet the right tail behavior condition imposed in (2.2.2), but obeys instead , thereby indicating that the type-II kernel, unlike the type-I, is not strictly positive when short-term dependence is exhibited with d < 1, but the resultant process V obviously is (by (2.2.8)). With the type-I kernel (2.2.6), it is not possible to interpret V • as an Ornstein-Uhlenbeck process driven by a fractional Lévy process. Nonetheless, V can still have short-term dependence since, for d ∈ (1/2, 1), , as u 0, by using (2.2.10). The long-term mean is however unconditionally finite: As another aspect of our innovation, for d ∈ (1/2, 1) generating short-term dependence we propose to construct h by combining the scaled exponential kernel and the Riemann-Liouville kernel in a piecewise fashion, i.e., where τ > 0 is some time threshold separating the power-law and exponential parts of the kernel and θ > 0 is some scaling factor. To ensure continuous differentiability, τ solves the transcendental equation In order for (2.2.12) to be solvable on R ++ , θ has to satisfy the constraint with W · (·) being the Lambert W function, a.k.a. the product logarithm (see [Corless et al, 1996 ] [12] ). Note that the two solutions in (2.2.14) coincide if and only if equality holds in (2.2.13). We will refer to (2.2.11) as the type-III kernel. 4 Before the type-III kernel can be properly compared to the other two types, we note that the type-I and type-II kernels both have the parametrical limiting property that , which correspond to the Riemann-Liouville kernel without mean reversion and the classical exponential kernel without fractions, respectively. It is desirable that the same hold for the type-III kernel, which is possible by its construction. The particular choice of (τ, θ) that achieves this effect is also not difficult to find by the properties of the Lambert W function, and is given by Then the type-III kernel is uniquely parameterized by κ and d and reads 4 Such a piecewise construction has the same shortcoming as the type-II kernel, namely the restriction of d ∈ (1/2, 1), but instead of sacrificing the mean-reverting property, the type-III kernel is incompatible with differentiability at τ if long-term dependence (d ≥ 1) is required, in which case (2.2.12) has no positive solutions. Using κ = 5 and d = 0.6, Figure 1 below compares the three types of kernels (2.2.6), (2.2.9) and (2.2.15) over the unit time interval. It is clear that they share the same right-tail behavior, generating the same degree of short-term dependence. In fact, with d < 1 the type-I and type-III kernels are both monotone and strictly positive whereas the type-II kernel is not. On a different note, despite that by the formulation (2.2.1) V lacks increment stationarity, there is comprehensibly no negative impact placed on characteristic function-based model calibration. In the next proposition we give some partial results on the regularity of the sample paths of V . Indeed, from its appellation the interpretation of path roughness is not confined to the involvement of short-term dependence, but should be linked to how degree of irregularity of the sample paths. Since cos Z is a purely discontinuous semimartingale, we focus on the generalized fractional Ornstein-Uhlenbeck process V • and consider the case of an infinite Lévy measure ν X allowing for practicality. (iii) If 1/2 < d < 1, then the sample paths of V • are P-a.s. discontinuous and unbounded with P-a.s. infinite quadratic variation over [0, T ]. Notably, for d > 1, the sample paths of V • are smoothed in a way that all the jumps generated by X are expunged and, as will be seen in the proof in Appendix A, they are actually Höldercontinuous for suitable exponents. On the other hand, in the situation of assertion (iii), V can have infinitely large jumps, so that its sample paths form maps from [0, T ] to [0, ∞]. However, this will not be a problem for modeling in practice because V t is a.s. finite for any fixed t ≥ 0 and in fact, it has a finite variance. For the critical value d = 1, V only exhibits mean reversion. For the type-I and type-II kernels it can be easily verified that in the case d = 1 V is exactly the usual Lévy-driven Ornstein-Uhlenbeck process and for the type-III kernel this is also true in the limit as d 1. Letting t 0 ∈ [0, t ] be a fixed time point, we can recast (2.2.1) conditional on F t 0 as or equivalently, The first integral on the right-hand side of (2.2.17) clearly indicates that V cannot be a Markov process or a semimartingale in general. In fact, it is so if and only if h is chosen such that using that both X and Z have independent stationary increments. After constructing the instantaneous variance model with roughness and jumps, we proceed to giving an explicit structure for the forward variance curve, i.e., which in light of (2.2.18) admits the following stochastic representation, On the right-hand side of (2.3.2), is the forward variance rising from the generalized fractional Ornstein-Uhlenbeck process V • , whose fractional integral is associated with the shifted kernel h(t + u, s), and stems from the composite-sinusoidal Lévy process that contains small two-sided volatility jumps. Apart from the contemporaneous instantaneous variance, frictions in the forward variance curve also result from a new fractional Lévy process X (H u ) containing additional information over the entire variance history. In consequence, with a general kernel h the Markov property ofṼ (u) is completely lost. If one prefers to view t + u > t as being time-independent, then (2.3.2) really gives a martingale dynamics for the forward variance curve over [0, t + u], which is similar to the martingale framework developed in [Horvath, 2020, Sect. 3] [21] . However, we deliberately refrain from operating on such a framework as it is primarily beneficial from a simulation-based viewpoint. It is also worth emphasizing that, since lim s t H u (t , s) = h(t + u, t ) = O(1), ∀t , u > 0, the modified process X (H u ) deprives the sample paths ofṼ (u) of any degree of roughness, restoring the semimartingale property ofṼ (u). Under continuous monitoring over a fixed window ∆ > 0, the average forward volatility process is identified as the square root of the ∆-running average of the forward variance process, In the case of the VIX index, for instance, ∆ = 6/73 year (equivalent to 30 days). By using (2.3.2), some direct calculations lead to an integral representation of the corresponding average forward variance as well. For any ∆ > 0 we have using the Fubini-Tonelli theorem that is a ∆-forward integrated kernel. By construction H ∆ is continuously differentiable and both H ∆ and Υ ∆ preserve stationarity from h. After some tedious calculations it can be deduced that, when h is the type-I kernel (2.2.6), then and Similarly, for the type-II kernel (2.2.9) with d ∈ (1/2, 1], and For the type-III kernel in its specialized form (2.2.15) with d ∈ (1/2, 1), we have and Note that all three types of ∆-forward integrated kernels are stationary with constant averaging functions. From a computational viewpoint, a benefit of the piecewise nature of the type-III kernel is that one does not need to deal with incomplete gamma functions, 5 which together yield From the last representation it is also noted that an attendant effect of U t (u), apart from its original purpose of capturing small-scale volatility jumps, is that the resultant average forward variance contains risks that cannot be fully spanned by the forward variance curve, which will only depend on Z , particularly the quadrant of e iZ . 6 5 With the type-III kernel one can also discover a closed-form formula for the conditional characteristic function of the fractional process V • and hence the partial forward varianceṼ • (u), which facilitates derivatives pricing on the instantaneous or forward variances. Details are put into Appendix B as the interest of the present paper is more for derivatives on the average forward variance. 6 We stress that such volatility risks are not an oddity and the same phenomena also prevail when one considers the classical framework of log-volatility without resorting to any geometric average approximation. In the present framework, the tolerance of such risks is directly tied to the auxiliary parameter ς > 0. At this point, we have all the necessary tools to give a comfortable integral formula for the conditional characteristic function of the average forward variance, as the following proposition expounds. In the setting of (2.3.5), given any 0 ≤ t 0 < t ≤ T and ∆ > 0, it holds that where φ Z 1 is recalled to be the characteristic function of the random variable Z 1 , is a kernel-modulated forward variance quantity, and The application of Proposition 2 can be facilitated if the density function of the Lévy process Z , esp. the Fourier inverse (1/π) can be written explicitly, allowing the target characteristic function to involve up to two parallel (multiplied) numerical integrals. Unfortunately, this would be an impractical requirement in consideration of the empirical finding in [Todorov and Tauchen, 2011, Sect. 6] [47] , which suggests that the Lévy measure ν Z of Z typically has a Blumenthal-Getoor index around 1.78, and there is no commonly known Lévy process with such a feature and yet has a closed-form density function. Despite this, benefiting from the composite-sinusoidal structure, Z need not even have a finite variance and we can pick Z from the family of symmetric α-stable processes with the simple characteristic function φ Z 1 (l ) = e −|l | α , for α ≈ 1.78. The associated Lévy measure in (2.2.3) is then ν Z (dz) = − sec(πα/2)/(2Γ(−α)|z| α+1 )dz for z ∈ R \ {0}. Also, since the scale of the composite-sinusoidal process is already incorporated into the parameter ς ∈ (0, 1), it is redundant to introduce additional parameters to Z , and due to symmetry of the assumed stable distribution the effect of the composite-sinusoidal process is solely to generate a suitable activity level of the average forward variance while other large asymmetric movements are taken account of by the partial forward varianceṼ • (u) resulted from the generalized fractional Ornstein-Uhlenbeck process V • . Leastwise, with Proposition 2 one can deduce pricing-hedging formulae for derivatives contracts written on the average forward variance, e.g., the squared VIX index, which will be explained in detail in Section 4. Most importantly, although it is not possible to derive a similar formula for the characteristic function of the average forward volatility, we will demonstrate how this difficulty may be overcome for volatility derivatives by way of power-type extensions. Moreover, by forcing ∆ 0 (2.3.14) is nothing but the conditional characteristic function for the instantaneous variance, i.e., E e ilV t |F t 0 , l ∈ R. Despite general non-stationarity, using (2.2.1) we can still simulate the sample paths of the instantaneous variance process. To do this we discretize the generic global time interval [0, T ] by means of the uniform partition , M ∈ N ++ , M 1. (3.1) Then, using the Gauss quadrature rule ( [Golub and Welsch, 1969 ] [19] ) and the Lévy properties of X , the Volterra-type stochastic integral in the definition of V • in (2.2.1) can be approximated by a finite random sum, Note that the convergence rate is unaffected by the fraction index d of h, which applies to the three types of kernels discussed before. Nonetheless, the above L 2 -convergence fails in the limit as d 1/2. In a similar fashion, we can use the representation (2.3.5) to simulate the sample paths of the average forward variance I 2 (∆) for a given ∆ > 0, by using the estimatoř to which the L 2 -convergence criterion in Proposition 3 naturally applies. Besides, there is no need to discretize the deterministic integrals containing the sinusoidal composites of Z u which can be directly computed according to Proposition 2. In this section we present the main pricing-hedging formulae for European-style derivatives written on the adjusted average forward volatility. The setting of Section 2.2 is adopted throughout, and a fixed maturity date T > 0 is assumed. We start with swaps written on the average forward volatility. The payoff of a power swap to its investor is at T S (p) where p ≥ 0 is a predetermined power coefficient. Here we have disregarded the notional amount for simplicity as it is merely a positive scaling factor. Of course, the extremal case p = 0 corresponds to a fixed cash payment of 1 dollar, while by choosing p = 1 and p = 2, one obtains the standard volatility swap and the standard variance swap, respectively. Convention is that, at inception of trading, the price of the swap is set to achieve a zero fair value, and so the price of the swap at a given time point t 0 before maturity can be simply computed as the expected value of I p T (∆) conditional on F t 0 , i.e., as and is treatable as a potentially fractional moment of I 2 T (∆). Due to the exponential structure (2.3.14) the convergence of (4.1.2) is directly linked to the smoothness of the characteristic function of X 1 and Z 1 (refer to (2.1.5) and (2.2.3)). Comprehensibly, it is far from exorbitant to demand that φ I 2 T (∆)|t 0 (·) ∈ C p/2 +1 (R), which condition remains valid for a wide class of square-integrable Lévy processes X . For instance, for X a tempered stable process and Z a stable process, its characteristic function (2.1.5) immediately renders φ I 2 T (∆)|t 0 (·) ∈ C ∞ (R) so that Proposition 4 is automatically applicable for all values of p ≥ 0. Implementation of (4.1.2) is also nowhere near computationally intense, regardless of specializations of the (∆-)forward integrated kernels, by means of the Gauss quadrature rule for numerical integration and finite-difference approximations for differentiation of integer orders; for example, given the required degree of smoothness, a central approximation reads for > 0 small In particular, by taking p = 1 in (4.1.2) the pricing formula for the standard volatility swap reads whilst that for the corresponding variance swap is none but the F t 0 -conditional mean of I 2 Amidst a non-Markovian setting, it is unrealistic to construct a perfect hedge for these power volatility swaps based on the forward variance curve only. In light of the structure of the characteristic function (2.3.14), the dominant parts of I 2 (∆) not containing the active small-scale jumps of cos Z can leastways be hedged perfectly with the partial forward variance curveṼ • (u), esp. the kernel-modulated forward variance J (T, t 0 , ∆). In other words, we wish to find a partial hedging strategy designated for those dominant parts over the time period [t 0 , T ) which will require the entire forward variance curve {Ṽ t 0 (u) : u ∈ (T − t 0 , T − t 0 + ∆]}, whereas other non-hedged volatility risks are precisely those that cannot be spanned and are all buried in the bounded sinusoidal composites involving the randomness of Z exclusively. To that end we first define the time-indexed differential operator , t ∈ (t 0 , T ], for a fixed t 0 ∈ [0, T ). The first equation in (4.1.4) signifies that if p/2 is an integer, then the size of the (partial) hedge is equal to the spot price of another power variance swap, with decremented power p/2−1. Again, after taking p = 1 in the second equation, we find the hedge for the standard volatility swap as Obviously, for the corresponding variance swap, the hedge is exactly J (T, t 0 , ∆), with T S (2) t 0 = 1. Moreover, we stress that the convergence of (4.1.4) does not rely on integrability of the characteristic function, namely φ t 0 ,T (·; ∆) ∈ L 1 (R). In fact, although this condition is considerably benign and realistic, it is not necessary for the stated results to hold. The only assumption we have made in this regard is that X 1 and Z 1 are continuous random variables, connected with the non-atomic Lévy measures ν X and ν Z . As mentioned since the introduction, our study for asymmetric power options is motivated by the average forward volatility being the square root of the average forward variance. In other words, an option written on the average forward volatility can be effectively treated as a power-type option on the average forward variance, with power exactly equal to 1/2. For convenience and generality we still conduct our analysis subject to a positive power coefficient. Let us consider a European-style put option contract on the average forward volatility I T (∆), having the terminal payoff P (p 1 ,p 2 ,(a)) T where K > 0 is the volatility strike and p 1 , p 2 ≥ 0 are two predetermined power coefficients. We refer to this type as being asymmetric since the power imposed on the strike can differ from that on the average forward volatility. The payoff structure (4. The following proposition is given for arbitrary-time pricing of the asymmetric power option. The price of the asymmetric power put option with terminal payoff (4.2.1) at time t 0 ∈ [0, T ) is given by By taking p 1 = p 2 = 1 one has the pricing formulae for the standard volatility options. In particular, Re K e −iK 2 l + π/2 − Γ(3/2, iK 2 l ) il φ I 2 T (∆)|t 0 (l ) il dl (4.2.5) and, recalling (4.1.3), On the other hand, for the standard put option on the average forward variance with p 1 = 2, there is a significant reduction, The formulae (4.2.5) and (4.2.6) for the standard volatility options can be implemented with substantial efficiency provided that the conditional characteristic function takes the form of (2.3.14) and facilitate calibration of the model on standard option prices. Hedging of the asymmetric power options resembles that of the corresponding power swap, which only makes use of the partial forward variance curve. For the following we adopt the differential operator t for t ∈ (t 0 , T ]. In the setting of Proposition 5, hedges can be constructed as T C (p 1 ,p 2 ,(a)) t 0 = T P (p 1 ,p 2 ,(a)) t 0 is as specified in Corollary 1. Once again, with the choice p 1 = p 2 = 1, the standard volatility options can be hedged in terms of We remark that the hedging strategies constructed in Corollary 1 and Corollary 2 only target the dominant movements in the average forward volatility I (∆) that are governed by the fractional process X (h) , to which the prices of volatility derivatives will be sensitive, whereas the bounded fluctuations controlling the activity level of the I (∆) remain un-hedged. Nevertheless, these partial hedges remain valid with or without rough volatility and can be made perfect when the activity controller process Z is absent. As in the case of equity options, the volatility option investor's risk exposure can also be adjusted by directly forcing a mutual power effect on the standard option payoff (similar to [Raible, 2000, Sect. 3.4] [41] and [Xia, 2019, pp . 120] [52] ). This way of generalization understandably does not build any useful connection between options on the average forward volatility and the corresponding forward variance and is hence considered less important from the viewpoint of this paper's motivation. Nonetheless, for the sake of completeness and our interest we still provide a comprehensive analysis of the pricing-hedging methods for such so-called "symmetric power options." In this connection let a European-style put option contract on I T (∆) have the following terminal payoff, P (p,(s)) T where K > 0 and p ≥ 0. In this structure both the strike price and the average forward volatility undergo the same power impact, and with binomial expansion we can rewrite Besides, we observe that the plots of the payoff functions P (p,p,(a)) T and P (p,(s)) T against I T (∆) are symmetric with respect to the line segment joining the points (0, K p ) and (K , 0) over the interval [0, K ]. For 0 ≤ p < 1, the symmetric power option provides a convex transformation of the standard option payoff whereas its asymmetric power counterpart provides a concave one; for p > 1 one has a reversed relation (see Figure 2) . Therefore, the two types of power put options can be utilized to complement each other in terms of severity of risk adjustment when either deeply in-the-money or closed to at-the-money. However, such effect holds exclusively for put options. a similar symmetric power call option can be decomposed into exactly p weighted power volatility swaps incremented by an infinite sequence of power-type derivatives on the reciprocal average forward volatility I −1 T (∆) conditioned to stay below 1/K , which vanishes if and only if p is an integer. Also, note that in this case the payoff functions C (p,p,(a)) T and C (p,(s)) T are both strictly concave resp. convex in I T (∆) for 0 ≤ p < 1 resp. p > 1 over Based on the two decompositions (4.3.2) and (4.3.3), the next proposition gives the pricing formulae for these symmetric power options in terms of infinite series. Proposition 6. The price of the symmetric power put option with terminal payoff (4.3.1) at time t 0 ∈ [0, T ) is given by dl , (4.3.4) while that of the similar symmetric power call option with (4.3. 3) is Hedges of these symmetric power options using I 2 t 0 (T − t 0 + ∆) also come in similar forms, which yield the next result. Corollary 3. Assume the setting of Proposition 6. Then we have 's, for 0 ≤ k ≤ p, are the contemporaneous power swap hedges as specified in Corollary 2 and In this empirical study we illustrate the performance of our model framework established in Section 2 as well as the pricing-hedging formulae presented in Section 4. Allowing for overall efficiency, we focus on the type-I and type-III kernels, specializing h according to (2.2.6) and (2.2.15), respectively; as explained before, the type-II kernel (2.2.9) derived from the Riemann-Liouville kernel can take negative values and does not possess an exponentially decaying right tail as demanded in (2.2.2), and hence is likely to bring about numerical issues. The Lévy subordinator X is taken to belong to the class of tempered stable subordinators, with characteristic function (2.1.5), while the auxiliary Lévy process Z is as mentioned in Section 2.3 a two-sided 1.78-stable process. 9 Since the VIX index is a popular gauge for the stock market volatility, it would be very interesting to evaluate the model performance under dissimilar market dynamics. For this reason, we have select two independent VIX option price data sets, in the years 2016 and 2020, which correspond to, respectively, normal times and the COVID-19 global pandemic -it is known that in the latter period there has been significantly higher buying pressure into call options amid market fear, generating anomalous trading volumes. For both data sets, strike prices and option prices are quoted in the unit of US$100 (data source: [CBOE Global Markets, Inc., 2020] [11] ) and we adopt t 0 = 0 throughout for simplicity. In more detail, the first data set reflects ordinary market dynamics where the volatility smile is easily justified. It consists of 38 put option prices quoted on Jan 26th, 2016, under four different maturities T = 27, 55, 90, 181 days with the spot price I 0 (∆) = 0.2667 and the strike price K ranging from 0.12 to 0.3. The second data set speaks to an abnormally volatile market dynamics with a conspicuous clustering of call option prices across different maturities. It contains 38 call option prices quoted as of May 11th, 2020, also corresponding to four maturities T = 72, 100, 163, 191 days with the spot price I 0 (∆) = 0.3304 and the strike price K ∈ [0.2, 0.9]. It is clear that with these two data sets we can also simultaneously illustrate the pricing formulae for both call and put options. By the definition of the VIX index we fix ∆ = 6/73 (year). In order to apply Proposition 2 properly, one challenge that immediately comes to attention is the specification of the (F t 0 -measurable) kernel-modulated forward variance J (t , t 0 , ∆) > 0, which arises because of the non-Markovian setting and contains all the information about the spot price of the VIX at time t 0 . Understandably, it would be undesirable to treat the entire quantity J (t , t 0 , ∆) as an independent parameter to be calibrated, which can cause severe instability by disregarding the base level of the spot price. To overcome this difficulty, we employ an expansion argument to transform J (t , t 0 , ∆) into a linear combination of the square of the spot price (i.e., I 2 t 0 (∆)) and a time-dependent remainder term with refined domains. Such an operation can significantly stabilize calibration by permitting the use of VIX index data. In particular, the relations (2.3.2) and (2.3.4) permit writing where the remainder, being bounded using the time-to-maturity and the scaling factor ς, is to be calibrated independently in the domain [−(t − t 0 ) 2 /∆ − 3ς, (t − t 0 ) 2 /∆ + 3ς]. With (5.1.1) and (5.1.2), the corresponding formula (2.3.14) for the characteristic function of the squared VIX is transformed into The calibration of the reversion levelV is automatically encoded into that of the remainder r (t 0 , t ). Again, we will specify the foregoing transformed formulae with t 0 = 0 and t = T for implementation and write for simplicity r (T ) ≡ r (0, T ). On a second look at the standard option pricing formulae (4.2.5) and (4.2.6), there are ultimately four numerical integrals to evaluate, having domains of integration (0, ∞) , R x, (0, T ] ≡ (t 0 , t ] s, and (0, ∞) l , respectively. The first two form a repeated integral, and are then multipled by the third, whose product is nested with the fourth. This structure may seem intimidating at first glance; however, let us observe that: (i) the first integral (1/π) ∞ 0 Re e −i x φ T Z 1 ( ) d is none but the probability density function of the 1.78-stable random variable Z 1 , which are builtin in most software packages using efficient numerical Fourier inversion techniques; (ii) the scaling factor ς has limited contribution since its only purpose is to ensure that small volatility jumps have the desired activity index, 1.78; (iii) for a fixed small value (e.g., 0.01) of ς, the second integral with respect to x can be approximated with arbitrary precision using a cubic smoothing spline thanks to ψ being uniformly bounded by 1. With these observations in mind, we write f Z T (x) := (1/π) ∞ 0 Re e −i x φ T Z 1 ( ) d and proceed to fixing ς = 0.01 to carry out the following approximation of the second x-integral to boost computational efficiency, where f β (l |ς = 0.01) is a cubic spine function defined with parameter β = (β 0 , β 1 , β 2 , β 3 ) which is specified by solving the following minimization problem, where n is the size of the partitioned l -domain and λ ≥ 0 is an auxiliary smoothing parameter. The role of the smoothing parameter is to balance between the fidelity to the true function and the roughness of the estimator by incorporating a penalty term linked to the curvature, and is set to be 0.0295 by default. 10 The l -domain is partitioned by n = 3, 000 evenly distributed points over [−10 4 , 10 4 ], which is verified to be sufficient to produce a mean squared error estimate of 7.6×10 −11 and 8.2×10 −11 for the real and imaginary part, respectively. The effect of approximation for the real part, imaginary part and real-imaginary combined part are visualized in Figure 3 . Calibration is carried out jointly taking into account all four different maturities for each complete data set. With the aid of (5.1.3) and (5.1.4), there are a total of six parameters to calibrate, (a, b, c, d , κ, r (T )), among which the last one with time dependence needs to be considered separately for different maturities and for this purpose we adopt the notation T 1 < T 2 < T 3 < T 4 . More specifically, the objective is to minimize the mean squared error (MSE) between the observed market prices of the VIX options and the corresponding model prices, so that the optimal parameter set is given by In order to solve (5.2.1) efficiently, first we adopt the genetic algorithm, which conduces to locate a decent initial value ϑ 0 of the parameters of interest. Then, we apply the pattern search algorithm taking ϑ 0 as the initial value to continuously refine the parameter set. Both of these algorithms are parallelized so that the computation speed largely scales with the number of cores available. 11 Figure 4 : VIX put and call option prices (market vs model) 11 In general, while the genetic algorithm search takes up to 8 hours on a 32-core processor, implementation of the pattern search with the initial value ϑ 0 is much faster and the parameters can be refined within 2 hours. For this reason, for industry applications we recommend using the genetic algorithm to find the initial value ϑ 0 once and for all, and subsequently implement the pattern search algorithm given ϑ 0 as an initial point for daily updates. Table 1 reports the calibrated parameter values for both data sets under both types of kernels, along with root mean squared errors (RMSE) at the level of the option prices. The model fits are further visualized in Figure 4 in four separate panels. From Figure 4 it is immediately visible that during the COVID-19 pandemic market prices of call options are squeezed over different maturities, unlike the sparse price distribution of the put options in 2016. Despite this unusual discrepancy, the applied models have successfully captured the VIX option price styles for both periods. To be more precise, the calibration results seem promising for options with either short or long maturities, and for both in-the-money and deeply out-of-the-money options. From a comparative viewpoint, the model fits are much better for the first data set (pre-pandemic) which did not undergo abnormal trading volumes, and the same can be said about the model under the type-III kernel than the type-I. The latter observation may be partially explained by the piecewise construction of the type-III kernel, which precludes transcendental functions in the ∆-forward integrated kernel (H ) and thus facilitates numerical computations. As a result, the model under type-I kernel has performed relatively poorly with regard to those long-maturity options. In interpreting the calibrated parameter values, from Table 1 it is clear that all the applied models are able to imply fast mean-reverting volatility, with κ ranging from 3 to 6.4. The shape and scale parameters a and b of the base process X are both close in value, speaking to the stability of calibration, while the family parameter c, with calibrated values around 0.5, suggests that a typical inverse Gaussian-driven (with c = 1/2) Ornstein-Uhlenbeck process could be a suitable model for the instantaneous volatility process to capture volatility jumps. As for the fraction parameter d , since its value significantly differs from 1, a direct implication is that short-range dependence is indeed prevalent in volatility dynamics, even under unusual market environments during the pandemic. In this section we investigate the sensitivity of the pricing of power-type volatility derivatives for the VIX index with respect to varying power coefficients, by using the general formulae proposed in Section 4. For succinctness we only look at one strike price K = 0.25 and one maturity T = 90 days for the first data set on put options and K = 0.35 and T = 100 days for the second on call options, adopting the parameter values calibrated under the type-III kernel in the third and fifth rows of Table 1 , respectively. First, to ease comparison between asymmetric power and symmetric power types we assume for the power coefficients that p 1 = p 2 = p ∈ [0.8, 1.2] and plot the price changes (C ) of corresponding power-type derivatives in Figure 5 . For the infinite series in Proposition 6 we use the approximation 4 k=0 which universally leads to a global error less than 10 −4 . Plots for power swaps are excluded as they are already reflected in the call option pricing formulae thanks to the put-call parity. It is seen that, in terms of leverage exposure, the symmetric power options are able to provide much severer leverage effect for the VIX index compared to the asymmetric power options given the same power coefficients, despite that the latter are much easier to handle in general. Of course, for an asymmetric power option, by letting the two power coefficients vary independently we can generate a power surface for its price (C (p 1 ,p 2 ,(a)) 0 and P (p 2 ,p 2 ,(a)) 0 ), as shown in Figure 6 . Apart from showing the magnificent impact of powers on the VIX option price, these are also a reliable indicator that the general pricing-hedging formulae can be implemented fairly efficiently. [13] provides yet another very interesting implication, that the volatility of the average forward volatility, such as the VVIX index, also exhibits short-term dependence. Although it is noticeably challenging to establish a comfortable framework coalescing both aspects of roughness, we will briefly discuss how the foregoing pricing problems may be tackled inheriting the structure of (2.2.1). For that purpose we recall the setting of Section 2.3 and define the composite process where T 0 = 0 and Y ≡ (Y t ) is an F-adapted square-integrable Lévy subordinator and η a kernel which respectively resemble X and h up to different parameters. The construction (6.1) emulates the initiative work of [Carr and Wu, 2004] [10] on stochastic time change, which has a fundamental root in the famous Dambis-Dubins-Schwartz theorem. Clearly, Y (η) = · 0 η(·, s)dY s is a meanreverting process that introduces frictions into the volatility of the average forward volatility and, if η is associated with a fraction parameter less than 1, say d ∈ (1/2, 1], then Proposition 1 informs that Y (η) also captures volatility-of-volatility jumps. By convention independence between X and Y is assumed. Under (6.1), we are able to at least write the unconditional characteristic function 12 of the time-changed average forward variance in terms of nested integrals. where φ 0,y (l ; ∆) is as given in (2.3.14). Note that the innermost integral in (6.3) can be expressed explicitly if η is any of the three types of kernels specified before, while the other three integrals remain numerical in nature. In particular, the outer two integrals are generally non-interchangeable, i.e., Fubini's theorem is not applicable and they should be computed in sequence. Regardless, we can put (6.3) into the pricing formulae proposed in Section 4 to compute the prices of power volatility derivatives at time 0. However, since there are at least six numerical integrals (some being parallel) involved, coming up with a robust calibration scheme will be an arduous task. As a means of reducing calibration burden towards that end, one possibility is to conduct characteristic function-based estimation (see [Yu, 2004] [53] ) based on volatility-of-volatility index data for the parameters of Y beforehand. The modeling of short-term dependence in instantaneous volatility is far from deep-rooted in Brownian sample paths and can be alternatively realized by way of a purely discontinuous Lévy process, which is able to capture volatility jumps as well. The latter approach is largely motivated from a balance between the two major empirical findings of [Todorov and Tauchen, 2011] [47] and [Gatheral et al, 2018 ] [17] . The pricing-hedging framework presented in this paper is tailored for volatility derivatives and built upon a generalized Lévy-driven Ornstein-Uhlenbeck process with a suitable squareintegrable kernel to establish short-term dependence. An independent two-sided Lévy process in sinusoidal form is utilizable to help attain an empirically evidenced activity level of the averageforward variance. The elegance of the present framework lies in the unconditional positivity of the instantaneous variance, which eludes use of the logarithm, so that integration is largely facilitated leading eventually to a semi-closed characteristic function for the conditional forward variance. This signifies that there is no need of inexact transformations by geometric means (as in [Horvath et al, 2020] [21] e.g.) whose levels of imprecision are, if possible at all, difficult to justify if there is no intention of running simulations. In other words, our approach is inherently analytical, on which basis various advanced numerical integration methods can be developed and directly applied. At the same time, we have discussed three types of kernels, the third of which, being completely new, is argued to be the most computation-friendly by involving only elementary functions and is hence recommended as an ideal substitute for the (commonly used) Riemann-Liouville kernel. Since a volatility derivative is the same as a similar derivative written on the corresponding variance raised to the power 1/2, it is natural to think of a wider class of power-type volatility derivatives also as a means of introducing leverage effect into the option payoffs. This is very much comparable to the original invention of power options written on equity or fixed-income instruments, in terms of functionalities. The general pricing-hedging formulae proposed in Section 4, being entirely analytical in their own right, should be interpreted as model-independent requiring nothing more than a semi-closed characteristic function of the underlying quantity in order to operate. Besides, these new formulae can be thought of as power-exponential analogs of the well-known equity-option pricing formulae with exponential structures that initially appeared in [Bakshi and Madan, 2000] [3] -more precisely, resultant exponential functions are mostly replaced by incomplete gamma functions. Speaking of outcomes, our empirical study on VIX options has demonstrated that the proposed model framework and pricing formulae are generally highly stable and efficient for applications. Combined they are expected to fit considerably well for both short-and long-maturity, in-and deeply out-of-the-money options, and more so when calibrated under the type-III kernel, which is hence preferred over the first type. All the model parameters can be reliably calibrated under this framework, even including the fraction index d and the tempered-stable family parameter c that are tied to highly nonlinear relations, while the activity level of the VIX index is set to be α = 1.78 to allow for properly active small-scale fluctuations in the light of [Todorov and Tauchen, 2011 ]'s [47] discovery. The calibrated parameter values are in keeping with economic interpretations, confirming the prevalence of fast mean reversion (κ) and short-term dependence in volatility (d ). The value of c points out that large volatility jumps can be suitably dealt with by something close to an inverse Gaussian model, also signaling to some degree deviation from the exclusive use of a Brownian motion. In particular, the remarkable fits for the short-maturity options in the second data set, observed during the COVID-19 pandemic when trading volumes witnessed abnormal increases, would not likely have been achieved had (upward) volatility jumps been completely turned aside. Overall, our methodology has provided new insights into how to think about the pricing problem of volatility derivatives allowing for both rough volatility and volatility jumps, by making full use of characteristic functions. With the foregoing remarks in mind, future research could be for instance devoted to understanding the advanced valuation of VIX derivatives under rough stochastic volatility of volatility (mentioning [Da Fonseca and Zhang, 2019] [13] again), following the idea of Section 6 via a temporal composition process. The exploration of the significance of rough volatility in pricing derivatives linked to cryptocurrencies (mentioning [Takaishi, 2020] [45]) would also be an interesting subject. where g is the Riemann-Liouville kernel (2.1.3) with the same fraction parameter d . This relation enables us to restrict our analysis to the Riemann-Liouville fractional Lévy subordinator X (g ) without mean reversion. The expectation on the right-hand side of (A.1.1) is then by the Lévy-Itô isometry In particular, for d > 2 there is the fundamental Riemann integral representation X Suppose d ∈ (1, 2] . We observe that the uniformly continuous map is increasing and convex for every t > 0, hence the only need to compare its left tail behavior against power-law tails. It is not difficult to see that as u 0, [26] ) to conclude that X (g ) , and hence V due to (A.1.1), admits an a.s. continuous modification over R + ; in particular, the modification is guaranteed to be a.s. locally Hölder-continuous for every exponent in (0, min{d − 1, 1/2}). Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables On the short-time behavior of the implied volatility for jumpdiffusion models with stochastic volatility Spanning and derivative-security valuation Inferring volatility dynamics and risk premia from the S&P 500 and VIX markets Non-Gaussian Ornstein-Uhlenbeck-based models and some of their uses in financial economics Pricing under rough volatility Bond power exchange options Power exchange options Tails, fears, and risk premia Time-changed Lévy processes and option pricing Historical options data On the Lambert W function Volatility of volatility is (also) rough Perfect hedging in rough Heston models The characteristic function of rough Heston models Option pricing under fast-varying and rough stochastic volatility Volatility is rough. Quantitative Finance Affine forward variance models Calculation of Gauss quadrature rules. Mathematics of Computation A general Ornstein-Uhlenbeck stochastic volatility model with Lévy jumps Volatility options in rough volatility models Analysis of variance based instruments for Ornstein-Uhlenbeck type models: swap and price index Affine Volterra processes On VIX futures in the rough Bergomi model. Quantitative Finance Transformation formulas for fractional Brownian motion Brownian Motion and Stochastic Calculus Jump SDEs and the Study of Their Densities Tempered stable distributions and processes Remarks on the relation between fractional moments and fractional derivatives of characteristic functions Pure jump models for pricing and hedging VIX derivatives Rough volatility: Evidence from option prices Stochastic Methods in Asset Pricing On the pricing of power and other polynomial options Fractional Brownian motions, fractional noises and applications Fractional Lévy processes with an application to long memory moving average processes Fractional absolute moments of heavy tailed distributions Option pricing in a Garch model with tempered stable innovations Gaussian stationary processes with asymptotic power spectrum The effects of asymmetric volatility and jumps on the pricing of VIX derivatives Positive-part moments via characteristic functions, and more general expressions Lévy Processes in Finance: Theory, Numerics, and Empirical Facts. Dissertation zur Erlangung des Doktorgrades der Mathematischen Fakultät der Albert-Ludwigs-Universität Freiburg i. Br. 169 pages Tempering stable processes. Stochastic Processes and their Applications Lévy Processes in Finance: Pricing Financial Derivatives Product Integration, Its History and Applications. Nečas Center for Mathematical Modeling & History of Mathematics Rough volatility of Bitcoin Fractional Lévy processes as a result of compact interval integral transformation Volatility jumps Power options: hedging nonlinear risks Pricing power exchange options with correlated jump risk Fractional Ornstein-Uhlenbeck Lévy processes and the Telecom process: Upstairs and downstairs Pricing exotic power options with a Brownian-time-changed variance gamma process A stochastic-volatility model for pricing power variants of exchange options Empirical characteristic function estimation and its applications which eventually yields (B.4). Putting things together we arrive at (B.2) as required.We remark that, with X 1 being tempered-stable distributed, if h is a kernel of either type I or type II, no such explicit expression exists for the second Riemann integral in (B.5), the computation of which has to resort to numerical methods such as the Gauss quadrature rule. The reason behind this problem is clear: the type-I and type-II kernels are both formed by multiplying power and exponential functions but no elementary substitution works for integrals of the form (1 − e −κs s d −1 ) c ds, for d > 1/2, κ > 0 and c ∈ (0, 1). For V • defined in (2.2.1) we focus on the stochastic integral part, namely X (h) , the rest being obviously continuous and of finite variation. Since h is continuously differentiable and h(t + u, t ) = O e −κu u (d −1) + = o(u d −1 ) as u → ∞ for any t ≥ 0, by the extreme value theorem there exist two positive constants b h ≥ a h > 0, which depend only on the parameters of h including κ and d , such that, for any u > 0 and t + u ∈ [0, T ], With ν(R ++ ) = ∞, it is a familiar result (see again [Lyasoff, 2017, Sect. 16 ] [32] ) that PE = 1. Then we observe that In consequence, there must exist some t > 0 such thatTo put it another way,then h is uniformly bounded so that h(t , t −) > 0 for any t > 0, and hence (A.1.5) proves the a.s. discontinuity of the sample paths of X (h) . In this case, since X (h) has no Brownian part, its quadratic variation is given by the sum of its squared jumps,where the inequality follows from the finiteness of t ∈[0,T ] (X t − X t − ) 2 > 0. On the other hand, if d < 1, then h(t , t −) = ∞ for any t > 0, which with (A.1.5) gives the a.s. discontinuity and unboundedness of the sample paths of X (h) , and hence V , over [0, T ]. An immediate implication is therefore that the sample paths of V have infinitely large squared jumps over [0, T ] P-a.s., from which follows its (P-a.s.) infinite quadratic variation. Therefore assertions (ii) and (iii) are proved.According to the conditional representation (2.3.13), we write for fixed 0 ≤ t 0 < t ≤ Twhere J (t , t 0 , ∆) is as defined in Proposition 2 and preliminarilyDue to the infinite divisible distribution of Z 1 ,Because the process · t 0 H ∆ (t , s)dX s is additive (or has independent increments) on (t 0 , T ] with t treated as fixed and Z is a Lévy process having an absolutely continuous distribution, using in proper order the independence lemma (for X and Z ), the infinite divisibility of the distribution of X 1 , and the inverse Fourier transform, we have forwhere · · stands for the geometric integral operator (see, e.g., [Slavík, 2007] [44] ) and f Z t −t 0 (x),x ∈ R, denotes the density function of the random variable Z t −t 0 . After rearrangement the desired integral representation (2.3.14) is therefore obtained. First notice that2) constitutes a conventional rectangular Riemann sum approximation and it is familiar thatAlso, E cos n(t ,M ) k=1Ž k − cos Z t = O(M −1 ) as the cosine is continuous differentiable. These show asymptotic unbiasedness. In addition, using the relation for proving the L 2 -convergence rate it is sufficient to note that, in the same vein, The relationship between fractional moments and the characteristic function of a real-valued random variable has been well established. In particular, knowing thatIf p is even, then (A.5.1) is understood as a conventional derivative corresponding to the first equation in (4.1.2). Otherwise, it represents a fractional derivative and can be written ([Laue, 1980,Since φ I 2 T (∆)|t 0 (·) ∈ C p/2 (R), we can apply the dominated convergence theorem together with the substitution l → −l to recast (A.5.2) asUsing thatand the Hermitian property of the characteristic function we arrive at the second equation in (4.1.2). Based on (2.3.14), we have for p/2 ∈ N thatso that Sending l → 0 gives the first equation in (4.1.4).For the second equation in (4.1.4) we must justify that differentiation under T can be done inside the integral. Interchange with the real part is then simply allowed thanks to the Hermitian property. Since φ I 2 T (∆)|t 0 (·) ∈ C p/2 (R) and φ∆)|t 0 (l )) as l → ∞, we only need to check integrability of the tails of Re[φ I 2 T (∆)|t 0 (l )]/l p/2− p/2 for l ≥ 0. From (A.2) we know that J (t , t 0 , ∆) > 0 and is measurable with respect to F t 0 , allowing us to rewritewhere ϕ(l ; T, t 0 , ∆) denotes the characteristic function of the random variablewhose distribution admits a well-defined density (recall that H ∆ is continuous, ν X and ν Z are both non-atomic and infinite). Hence, we have ∞ 0 Re e il J (t ,t 0 ,∆) ϕ(l ; T, t 0 , ∆) dl = 0, which with p/2 − p/2 ∈ (0, 1) for any p ∉ 2N implies the desired tail integrability. Let f (x; T, t 0 , ∆) and F (x; T, t 0 , ∆), for x > 0, respectively denote the density function and the distribution function of I 2 T (∆)|F t 0 , which exist because the distributions of X 1 and Z 1 are both absolutely continuous. For the price of the asymmetric power put option on I T (∆) at t 0 ∈ [0, T ), we adoptp = p 1 /2 to rewrite its terminal payoff so that P (p 1 ,p 2 ,(a)) t 0Further denoteK = K p 2 /p . Using the Fourier inversion formula we havewhere the second equality uses the Fubini theorem since the integral in x is taken over a finite interval. To evaluate the inner integral in (A.7), we apply the substitution x → il x and observe thatwhere the second equality follows because the integrand is analytic over the horizontal half-strip {x : Rex > 0, Imx ∈ (0,K l )} with l > 0. This establishes (4.2.3) after rearrangement. The pricing formula for the similar asymmetric call option results from a standard parity argument that IT (∆) − K p 2 , together with Proposition 4. It is important to note that a single integral representation for the call price is inaccessible due to inapplicability of the Fubini theorem when integration acts over [K , ∞) x. We simply use the bounded-ness and Hermitian property of φ I 2 T (∆)|t 0 (·) in order to apply T to (4.2.3) inside the real part of the integral. For the call option we use the parity relation (4.2.4). First consider the symmetric power put option with the payoff decomposition (4.3.2), so that we may writeand for every k ∈ N ++ using the argument in the proof of Proposition 5 we havēwhich obviously allows the series to be augmented to k = 0 and completes the proof of (4.3.4) after simplification.For the similar symmetric call option price, we rely on the decomposition (4.3.3) to writewhere all the summands with index k > p in conditional expectation are put into Σ Here the Fubini theorem applies because k − p > 0. At this point it suffices to observe thatwhich is well-defined as p − k cannot be an even number. The proof is similar to that of Corollary 2, except that T acts on (4.3.4) and (4.3.5) termwise, where the interchange of integration and differentiation is permitted for the same reason. By mimicking the steps in the proof of Proposition 2, it can be deduced from (6.2) that the characteristic function of T t for a fixed t > 0 is given byBy assumption the process Y has its own filtration {σ((Y s ) s∈[0,t ] )} t ≥0 independent from that of X . Therefore, via subsequent conditioning we have 13Since the distribution of the time change is absolutely continuous, (A.11) can be written using inverse Fourier transform asand this is exactly the same as (6.3). In this section we prove a closed-form formula for the conditional characteristic function ofṼ • t (∆)given F t 0 for fixed 0 ≤ t 0 < t ≤ T and ∆ ≥ 0 using the tempered-stable distribution and the type-III kernel; note thatṼ • t (0) ≡ V • t . The result will be useful for those who are interested in the pricing and hedging of derivatives contracts on the instantaneous or the forward variance under shortterm dependence and volatility jumps, to which all the formulae presented in Section 4 apply. To that end let us recall that the partial forward varianceṼ • t (∆) has the general representatioñ andProof. Following the proof of Proposition 2 it is readily established from (B.1) thatRewriting the type-III kernel asstraightforward integration over the interval [t 0 , t ] thus leads towhere the second equality uses the substitution s → t − s + ∆ and max{τ,∆} ∆ ≡ 0 if ∆ ≥ τ. Similarly, for the second Riemann integral in (B.5) we have with the characteristic exponent (2.1.5) that ξ 1 = aΓ(1 − c)/b 1−c and that t t 0 log φ X 1 (l h(t − s + ∆))ds where (·) · denotes the Pochhammer symbol, a.k.a. the rising factorial, and which after simplification leads to (B.3). The case of I 2 is slightly more involved but can be proved in a similar fashion, and we obtain