key: cord-0523930-nq9kyt2m
authors: Fosset, Antoine; Bouchaud, Jean-Philippe; Benzaquen, Michael
title: Non-parametric Estimation of Quadratic Hawkes Processes for Order Book Events
date: 2020-05-12
journal: nan
DOI: nan
sha: 408d0e8bedca04d5f9118ac785f7cc2db9754ccb
doc_id: 523930
cord_uid: nq9kyt2m

We propose an actionable calibration procedure for general Quadratic Hawkes models of order book events (market orders, limit orders, cancellations). One of the main features of such models is to encode not only the influence of past events on future events but also, crucially, the influence of past price changes on such events. We show that the empirically calibrated quadratic kernel is well described by a diagonal contribution (that captures past realised volatility), plus a rank-one "Zumbach" contribution (that captures the effect of past trends). We find that the Zumbach kernel is a power-law of time, as are all other feedback kernels. As in many previous studies, the rate of truly exogenous events is found to be a small fraction of the total event rate. These two features suggest that the system is close to a critical point, in the sense that stronger feedback kernels would lead to instabilities.

The accumulation of empirical clues over the past few years provides mounting evidence that most of market volatility is of endogenous nature [1] [2] [3] [4] [5] . This obviously does not mean that significant news, such as the very recent Covid-19 crisis, does not impact financial markets, but rather that it only accounts for a small fraction of large price moves. Think for example of the S&P500 flash crash of May 6th, 2010 [6] , see also [7] , which was not triggered by any outstanding piece of news. Furthermore, while one may argue that in some cases large drops are exogenously triggered, their amplification is often due to endogenous mechanisms [8] .

The behaviorally supported idea that agents tend to overreact, especially during crises, has driven the market modeling community to fall back on self-exciting processes, better known as Hawkes processes [9] . The latter have proven to be extremely efficient at tackling the intricate dynamics of the order flow and other self-excited effects in financial markets [10] [11] [12] [13] [14] [15] [16] [17] [18] [19] [20] [21] [22] [23] [24] [25] [26] . Nonetheless, linear Hawkes processes are unable to account for an empirical finding that is essential in our eyes to tackle endogenous instabilities: the Zumbach effect [27] [28] [29] [30] . The latter states that past price trends increase future activity, regardless of their sign. Quadratic Hawkes processes (Q-Hawkes), inspired by quadratic ARCH processes [28, 31] , were recently introduced to circumvent this issue [29, 32] , and have proven key to understanding fat-tails in the distribution of returns, as well as spread, volatility and liquidity dynamics [5] .

In our recent paper [5] we indeed argued that price or spread jumps could be the result of endogenous feedback loops that trigger liquidity seizures, see also [33] . In particular, we empirically showed that Zumbach-like effects exist in order book data, i.e. past trends and volatility tend to promote future activity, and in particular cancellations that diminish liquidity and fragilise the system, possibly leading to a liquidity crisis. Combining Q-Hawkes processes with a stylized order book model [34, 35] revealed an interesting scenario with a second order phase transition between a stable regime for weak feedback and an unstable regime for strong feedback, in which liquidity crises arise with high probability. However, for such a scheme to be relevant for financial markets, the system must sit very close to the instability threshold (perhaps as the result of "self-organised criticality"). As an alternative scenario, we also proposed a non-linear Hawkes process which exhibits liquidity crises as occasional "activated" events, separating locally stable periods of normal activity.

In the present paper, we calibrate on real market data a version of the generalized Q-Hawkes process proposed in our recent work [5] . We provide convincing evidence for the price/liquidity feedback mechanism described above and quantify its implications. In section 2 we briefly recall the ingredients of the model and present the non-parametric calibration procedure, inspired by the methods introduced by Bacry et al. [36] [37] [38] . We apply such calibration to order book data on the EURO STOXX and BUND futures contracts. In section 3 we present an alternative method that needs fewer assumptions to compute the overall effect of past price moves on future liquidity flow. We introduce a low rank (Zumbach-like) approximation that allows us to denoise the feedback kernels and separate the effects of trend and volatility, and apply it to our futures contracts. In section 4, we focus on the liquidity flow and analyse spread time series in relation with adequate trend and volatility signals. Results appear to favour the "self-organized criticality" scenario over the metastable, "activated" scenario discussed above and in [5] . In section 5 we conclude.

We present a simplified version of the Generalized Quadratic Hawkes process (GQ-Hawkes) introduced in [5] , where the influence of the size of the queues on event rates is neglected. Consider a 6-dimensional process

counting six types of order book events: limit orders (LO), cancellations (C), and market orders (MO), for both the bid (b) and ask (a) sides of the order book; we consider best quotes only. We further assume that the process $N_t$ is coupled to the past price process $(P_{t'})_{t'<t}$ in the following way. Denoting by $\lambda_t$ the intensity of the 6-dimensional process $N_t$, we let:

with φ, L and K causal decaying kernels. One can always choose K (u, s) = K (s, u) without loss of generality. Note that φ is a 6×6 matrix, whereas L and K are 6-dimensional vectors. The intensity λ t is the sum of four different contributions, from left to right in the RHS of Eq. (1), one has the base rate α 0 , the standard linear Hawkes contribution, followed by the linear and the quadratic contributions of price fluctuations. As pointed out in [5, 29] , assuming that P t is a martingale makes analytical calculations, and numerical calibration, much more congenial. Finally, assuming as we shall do hereafter that a stationary state is reached allows us to replace the lower bound of the integrals in Eq. (1) by −∞.
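Eq. (1) itself is not reproduced in this text. Reading the four contributions off the description above (base rate, linear Hawkes term, linear and quadratic price terms), and assuming the usual quadratic Hawkes form of [29], it would plausibly read as follows. This is a reconstruction under those assumptions, not the authors' typeset equation; the integration bounds are chosen to match the remark that the lower bound can be sent to −∞ in the stationary state.

```latex
\lambda_t \;=\; \alpha_0
  \;+\; \int_0^t \phi(t-s)\,\mathrm{d}N_s
  \;+\; \int_0^t L(t-s)\,\mathrm{d}P_s
  \;+\; \int_0^t\!\!\int_0^t K(t-s,\,t-u)\,\mathrm{d}P_s\,\mathrm{d}P_u
```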

Here we introduce a non-parametric scheme to calibrate Eq. (1) to real market data. Our method is an extension of the second moment method introduced by Bacry et al. in [36, 37] , see also [28] .

Before deriving the equations that will be used for the calibration, we introduce the following averages and covariances:

where we have assumed for simplicity that the jumps of P and N are not simultaneous. Note that while price jumps can only occur if an order book event triggers them, order book events are so much more frequent than price jumps that this approximation is fully justified. Combining Eq. (1) with Eqs. (2) yields the following set of equations for the first and second moments of the processes. Introducing the notations $||f|| = \int f(t)\,\mathrm{d}t$ and $K_d(t) := K(t,t)$ for the diagonal part of $K$, one obtains for $t, x > 0$ with $t \neq x$:

Provided the number of events generated by price fluctuations is small compared to that generated by the linear Hawkes contribution, i.e. $\sum_{i,k} ||\phi^{ik}||\,\Lambda^k \gg \sum_i ||K^i_d||\,\Delta^2$, Eq. (3b) conveniently simplifies to:

This approximation is relatively well supported by real data for short enough times (see below). It is essential at this stage as it allows us to decouple the estimation of the Hawkes kernel from that of L and K : one can first estimate φ from Eq. (4) and then compute L and K from Eqs. (3c), (3d) and (3e). The base rate is finally obtained from Eq. (3a). Note that while in principle an exact calibration of Eqs. (3) is possible, it does not perform well on real data (but see section 3 below).

In section 2 we stressed that the point process $P_t$ needs to be a martingale for Eqs. (3) to be valid. Yet, it is well established that the mid-price in financial markets displays substantial mean-reversion at short timescales. To circumvent this issue we consider the volume weighted mid-price, sometimes called the micro-price, $P^{\rm micro}_t$, known to be closer to a martingale at high frequency [39, 40] . It is defined as:

where $v^b_t$, $v^a_t$ denote the available volume at the best bid $b_t$ and ask $a_t$ respectively. To further enforce the martingale property we use the so-called surprise price, which we shall henceforth denote by $P_t$, and which consists in subtracting from the price its (linear) statistical predictability. Mathematically speaking, this reads:

where $\rho_P(t-s) := \mathrm{Cor}(\mathrm{d}P^{\rm micro}_t, \mathrm{d}P^{\rm micro}_s)$ denotes the price auto-correlation function. We also note that the intensity of order book events exhibits an intraday U-shape, very much like the well known U-shaped volatility pattern. Computing the total intensity of events $\Lambda_{\rm tot} = \sum_i \Lambda^i$ over 5-minute bins and averaging over trading days, a U-shape is clearly visible. To avoid spurious effects related to these intraday seasonalities, we rescale the flow of time by this average pattern to enforce a constant rate of events in the new time variable.
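As a concrete illustration of the volume weighted mid-price, here is a minimal Python sketch. It assumes the standard imbalance-weighted form, in which the bid price is weighted by the ask volume and vice versa; since the defining equation is not reproduced in this text, that form, the function name `micro_price` and its arguments are assumptions rather than the authors' exact definition.

```python
import numpy as np

def micro_price(bid, ask, vol_bid, vol_ask):
    """Volume weighted mid-price (assumed imbalance-weighted form).

    A large bid volume pulls the price towards the ask, and vice versa.
    Inputs can be scalars or arrays of equal length.
    """
    bid = np.asarray(bid, dtype=float)
    ask = np.asarray(ask, dtype=float)
    vol_bid = np.asarray(vol_bid, dtype=float)
    vol_ask = np.asarray(vol_ask, dtype=float)
    return (vol_bid * ask + vol_ask * bid) / (vol_bid + vol_ask)
```

With balanced queues the result coincides with the usual mid-price; the surprise-price correction described above would then be applied to the increments of this series.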

In order to estimate the kernels from real order book data, one must choose a time grid $t^H_n$ with weights $w^H_n$ for kernel φ, such that $||\phi|| \approx \sum_n \phi(t^H_n)\,w^H_n$. We decide to use quadrature points [37] to ensure a good approximation of the integrals with a minimal number of points. Further, given that we expect power-law kernels, see e.g. [18, 29, 37] , we choose a linear scale at short times that switches to logarithmic at longer times. Finally, given that typical timescales are usually quite different (see below), we choose a different time grid $t_n$, $w_n$ for the kernels L and K . See Appendix A for more details.
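The idea of a grid that is linear at short times and logarithmic at long times can be sketched as follows. This is only an illustration: it uses simple trapezoid weights rather than the quadrature points of [37], and the names `lin_log_grid` and `kernel_norm`, as well as the switch time and the numbers of points, are hypothetical choices.

```python
import numpy as np

def lin_log_grid(t_switch, t_max, n_lin, n_log):
    """Grid that is linear on [0, t_switch) and logarithmic on [t_switch, t_max].

    Returns the points t_n and trapezoid weights w_n such that
    sum_n f(t_n) * w_n approximates the integral of f over [0, t_max].
    """
    lin = np.linspace(0.0, t_switch, n_lin, endpoint=False)
    log = np.geomspace(t_switch, t_max, n_log)
    t = np.concatenate([lin, log])
    dt = np.diff(t)
    w = np.zeros_like(t)
    w[:-1] += dt / 2.0  # each interval contributes half its width
    w[1:] += dt / 2.0   # to both of its end points
    return t, w

def kernel_norm(kernel, t, w):
    """Approximate ||kernel|| as sum_n kernel(t_n) * w_n."""
    return float(np.sum(kernel(t) * w))
```

For a power-law kernel such as (1 + t)^(-1.5) truncated at 1000 s, a few hundred points are enough to recover the norm to better than one percent, which is the motivation for the logarithmic spacing.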

Finally, the empirical covariances are usually very noisy, so we choose to smooth them using a convenient fitting function in order to obtain better behaved kernels. Concerning the volatility covariance $\chi_{P^2P^2}$, it is found to behave like a power law at large times, so the chosen fitting function is $A(1 + t/B)^{-C}$. We also fit the logarithm of $\chi_{NP}(t)$, $\chi_{NP^2}(t)$ by a polynomial in log t, and smooth the off-diagonal kernel, see section 3.2 for details. Plots of the "raw" kernels obtained without smoothing fits are provided in Appendix B. Apart from being more noisy, as expected, these raw kernels are very similar to the smoothed ones. The calibration recipe then amounts to the following steps.
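The log-polynomial smoothing mentioned above can be sketched with an ordinary least-squares fit in log-log coordinates. The sketch below assumes strictly positive covariance values (non-positive points are simply dropped); the function name `smooth_loglog` and the default polynomial degree are illustrative choices, not taken from the paper.

```python
import numpy as np

def smooth_loglog(t, chi, degree=2):
    """Fit log(chi) by a polynomial in log(t) and return the smoothed curve.

    Only points with t > 0 and chi > 0 enter the fit (an assumption of
    this sketch). Returns a callable and the polynomial coefficients,
    highest degree first, as in numpy.polyfit.
    """
    t = np.asarray(t, dtype=float)
    chi = np.asarray(chi, dtype=float)
    mask = (t > 0) & (chi > 0)
    coeffs = np.polyfit(np.log(t[mask]), np.log(chi[mask]), degree)

    def fitted(t_new):
        return np.exp(np.polyval(coeffs, np.log(np.asarray(t_new, dtype=float))))

    return fitted, coeffs
```

With `degree=1` the fit reduces to a pure power law, whose exponent is the leading coefficient.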

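• Compute the surprise price from the micro-price using Eqs (5) and (6).

• Rescale time by the typical daily pattern of $\Lambda_{\rm tot} = \sum_i \Lambda^i$.

• Estimate ∆ k , Λ and the covariances $\chi_{P^2P^2}$, $\chi_{NN}$, $\chi_{NP}$, $\chi_{NP^2}$ and $\chi_{NPP}$ from the data using Eqs (2).

• Estimate the Hawkes kernel φ from Eq. (4).

• Compute L and K from Eqs. (3c), (3d) and (3e).

• Obtain the base rate from Eq. (3a).

Further details on how to solve these equations in practice are provided in Appendix A.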
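The time-rescaling step of the recipe can be illustrated as follows. The sketch assumes that the average intraday event rate has already been estimated on fixed bins (e.g. 5-minute bins) and maps clock time to a new time variable in which that average rate is constant. The function name `rescale_times` and the choice to preserve the total duration of the day are assumptions of this sketch.

```python
import numpy as np

def rescale_times(event_times, bin_edges, mean_rate):
    """Map clock times to a time variable with a flat average event rate.

    bin_edges : increasing array of length n_bins + 1 (clock time)
    mean_rate : average event rate in each bin (length n_bins)
    The rescaled clock starts at 0 and spans the same total duration
    as the original one.
    """
    bin_edges = np.asarray(bin_edges, dtype=float)
    mean_rate = np.asarray(mean_rate, dtype=float)
    widths = np.diff(bin_edges)
    # Expected cumulative number of events at each bin edge.
    cum = np.concatenate([[0.0], np.cumsum(mean_rate * widths)])
    # Normalise so that the rescaled day has the same length as the real one.
    cum *= (bin_edges[-1] - bin_edges[0]) / cum[-1]
    # Piecewise-linear interpolation inside each bin.
    return np.interp(event_times, bin_edges, cum)
```

Busy periods are stretched and quiet periods are compressed, so that the U-shaped pattern no longer contaminates the estimated covariances.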

We now apply the calibration procedure presented above to the EURO STOXX futures contract in the period 2016/09/12 to 2020/02/07. For this contract, the average time between two order book events is τ e ≈ 0.03s, two orders of magnitude below the average time between two price changes τ P ≈ 7s, indicating that the range of the kernels L and K is likely to be greater than that of φ, and allowing one to choose discretisation time grids accordingly. We also apply the procedure to the BUND futures contract but do not show all the (redundant) results for the sake of readability; summarising results are displayed in Fig. 5 and Tables 1, 2 and 3. As specified in section 2.2.2, we start with the calibration of the Hawkes kernel φ. The results are displayed in Fig. 1 for the norms of the kernels, and in Fig. 7 in the Appendix for the full time dependence. The temporal decay of the kernels appears to be a power law with exponent ≈ −1.5, consistent with previous reports [18, 29, 37] .

The calibration leads to a stable Hawkes process with spectral radius of ||φ|| (computed over 1000s) found to be ≈ 0.75 for the EURO STOXX contract and ≈ 0.74 for the BUND [10, 11] . The results show that the expected bid-ask symmetry holds with a high level of accuracy (see [5] ), such that one can average the kernels accordingly to improve the statistics without loss of information.
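The stability criterion used here, namely that the spectral radius of the matrix of kernel norms is below one, is straightforward to check numerically. The sketch below uses a small made-up 2×2 matrix purely for illustration; the function name `spectral_radius` is hypothetical and the numbers are not the calibrated values.

```python
import numpy as np

def spectral_radius(norm_matrix):
    """Largest eigenvalue modulus of the matrix of kernel norms ||phi^{ij}||."""
    eigenvalues = np.linalg.eigvals(np.asarray(norm_matrix, dtype=float))
    return float(np.max(np.abs(eigenvalues)))

# Illustrative (not calibrated) matrix of kernel norms.
example_norms = [[0.50, 0.25],
                 [0.25, 0.50]]
is_stable = spectral_radius(example_norms) < 1.0
```

A value close to one signals a nearly unstable process, in which most events are triggered by past events rather than by the base rate.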

Plugging the obtained Hawkes kernels into Eqs. (3c), (3d) and (3e) allows us to calibrate the kernels L and K , see Fig. 2 . Again the expected bid-ask symmetry properties hold rather well: while the linear kernel L is anti-symmetric (the effect of the positive trend on the bid is the same as that of a negative trend on the ask), the quadratic kernel K is bid-ask symmetric. We will therefore not distinguish further bid and ask events in the following. Figure 2 (c) shows that the quadratic contribution cannot be reduced to the diagonal part K d only. Indeed, the off-diagonal contribution of the kernel is non-zero and rather long-ranged. The decay of the diagonal contribution is a power law with exponent ≈ −1. Such a decay is very slow and means that ||K d || is logarithmically sensitive to long timescales, for which we do not have much information since we only use data belonging to the same trading day to avoid the thorny discussion of overnight effects and how to treat them. Finally, while the Hawkes and price feedback effects are difficult to compare as they do not operate on the same timescales, one can argue that the approximation presented at the end of Sec. 2.2.1 is well supported by data: considering a cut-off of 1000 seconds to compute the norms, one finds:

Another useful piece of information is the global effect of the quadratic term on order book events, measured by $\sum_i ||K^i_d||\,\Delta^2$, which must be compared to the total activity $\sum_i \Lambda^i$. The ratio of these two quantities is found to be 5% for the EURO STOXX and 7% for the BUND (see Table 3 for more details). Although not dominant, this feedback is clearly not negligible. Together with the standard Hawkes contribution, this means that the exogenous contribution α to the total activity is only 19% of the total for the EURO STOXX (17% for the BUND). Note that this fraction is expected to decrease further as the upper cut-off of the slowly decaying kernels is extended beyond 1000 seconds (see e.g. [18] ).

Here we present a framework which improves the above calibration in three ways. As we shall see, (i) it allows us to circumvent the approximation given in Eq. (4) which, we recall, is not perfectly satisfied by real data, (ii) it helps to further clean the noisy off-diagonal contribution of the quadratic kernel, and (iii) it gives a more relevant measure of the global effect of price fluctuations on event rates, without having to consider, or calibrate, the Hawkes contribution.

Using the resolvent method, see [37, 41] , one can rewrite Eq. (1) as:

with M a martingale satisfying

$R = \sum_{n\geq 1} \phi^{*n}$ the resolvent, $\bar{L} = L + R * L$ and

The kernels $\bar{L}$ and $\bar{K}$ account for the overall feedback effect of $P_t$, including all subsequent Hawkes self-excited events that are induced by price fluctuations. The remarkable property of such kernels is that they solve a much simpler set of equations:

where we have again enforced that $\bar{K}$ is symmetric. The results obtained from the inversion of Eqs (8) for the EURO STOXX futures contract are displayed in Fig. 3 . These lead to similar, though slightly cleaner, conclusions to Fig. 2 . In particular, the values of $\sum_i ||\bar{K}^i_d||\,\Delta^2$ are compatible with those obtained above (taking into account the $1 - ||\phi||$ factor, see Table 3 ).

Here we further dissect the results of the calibration presented in the previous section, with the objective in particular of separating the contributions of trend and of volatility to the quadratic feedback. A meaningful approximation for the quadratic kernel $\bar{K}$ was introduced in [29] , as the sum of a purely diagonal matrix and a rank-one contribution:

The first term on the right hand side of Eq. (9) reflects feedback of past volatility on current order book events. Its contribution in Eq. (7) can indeed be written in terms of a local squared volatility $[\sigma^i(t)]^2$, see Eq. (10). The second term is in turn a reflection of the effect of past trends, as measured in Eq. (7) by $[\mu^i(t)]^2$, where:

This last term is reminiscent of the so-called Zumbach effect: past trends, regardless of their sign, lead to an increase in future activity. An alternative interpretation is that $[\mu^i(t)]^2$ is a local measure of a low-frequency volatility, to be contrasted with $[\sigma^i(t)]^2$ which is a local measure of high-frequency volatility. Note that the kernels ψ and Z are normalised:

such that the overall strength of the volatility contribution is $\bar{K}_d$ while that of the trend contribution is $\bar{K}_1$. While in practice such an approximation is of course not perfect, one can check that including higher rank contributions is not essential, as the latter do not carry much additional signal. The rank-one kernel is obtained by minimizing

which consists in finding the first eigenvector of a well chosen linear map, see [42] for more details. The ψ contribution is then obtained by taking the diagonal of $\bar{K}^i$ and subtracting $\bar{K}^i_1 Z^i(t)^2$. Figure 4 displays the kernels ψ and Z as a function of time for the EURO STOXX futures contract. As one can see, the volatility kernel decays roughly as 1/t, although some curvature can be observed. The Zumbach counterpart decays as 1/t, regardless of event types (thereby justifying the choice made in Fosset et al. [5] , where the same functional form for all event types was assumed).
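To make the diagonal-plus-rank-one decomposition concrete, here is a minimal Python sketch on a discretised symmetric kernel. It does not reproduce the weighted low-rank method of [42]. Instead it uses a simple iterative scheme, chosen here for illustration: the diagonal is treated as unknown, filled with the current rank-one estimate, and the leading eigenvector is recomputed until convergence. The sketch assumes a positive rank-one weight, no single dominant component in Z, and a unit Euclidean norm for Z in place of the paper's normalisation. The function name `diag_plus_rank_one` is hypothetical.

```python
import numpy as np

def diag_plus_rank_one(K, n_iter=500, tol=1e-12):
    """Approximate a symmetric matrix K by diag(d) + k1 * outer(z, z).

    Only the off-diagonal entries of K constrain (k1, z): at each step
    the diagonal of a working copy is replaced by the current rank-one
    estimate before the leading eigenpair is recomputed.
    Returns (d, k1, z) with z of unit Euclidean norm.
    """
    K = np.asarray(K, dtype=float)
    K = 0.5 * (K + K.T)  # enforce symmetry
    M = K.copy()
    k1, z = 0.0, np.zeros(len(K))
    for _ in range(n_iter):
        vals, vecs = np.linalg.eigh(M)        # eigenvalues in ascending order
        k1_new, z_new = vals[-1], vecs[:, -1]
        if z_new.sum() < 0:                   # fix the arbitrary sign
            z_new = -z_new
        np.fill_diagonal(M, k1_new * z_new ** 2)
        converged = abs(k1_new - k1) < tol and np.max(np.abs(z_new - z)) < tol
        k1, z = k1_new, z_new
        if converged:
            break
    d = np.diag(K) - k1 * z ** 2              # what remains on the diagonal
    return d, k1, z
```

On a synthetic matrix built exactly as a diagonal plus a rank-one term, the scheme recovers both parts; on noisy empirical kernels it should be read as a denoising step rather than an exact identity.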

So far we have focused on the impact of past price moves on event rates. Here we wish to go one step further and estimate the effect of past price changes on liquidity, i.e. volume weighted events. For this one needs to consider order volumes. The average volumes are given in Tab. 1 for the different types of orders. Assuming bid/ask symmetry (consistent with the empirical results), Fig. 5 displays the amount

1 are obtained as explained in the previous section, V i are given in Tab. 1, and ∆ 2 is defined in Eq. (2a). 4 Introducing the overall average quadratic liquidity flux as:

one consistently finds that the quadratic (price) feedback has an overall negative effect on liquidity, JK < 0, most of it associated with volatility, see Fig. 5 (c). In other terms, the quadratic feedback tends to decrease liquidity on average. Figure 5 (b) shows that both the volatility and Zumbach terms have an average negative impact on liquidity (i.e. the green bars represent less than 50% of the total contribution). The Zumbach term is responsible for non-trivial long-range liquidity anomalies. In particular, Blanc et al. [29] showed that the price process resulting from a quadratic Hawkes process is diffusive with fat tailed stochastic diffusivity at large times, which can be attributed to the Zumbach effect, rather than its volatility counterpart (see also the discussion in [32] ). In any case, we believe that the quadratic feedback of price trends on order book events is a crucial ingredient to understand liquidity crises. In the next section we provide a direct test of this hypothesis.

With the aim of making contact with our previous work [5] , we now focus on the analysis of spread dynamics. Since the EURO STOXX futures is a large tick contract (the spread is equal to one over 99% of the time and seldom higher than two), we characterize the dynamics of liquidity using an effective spread $S^{\rm eff}_t$ which is defined as follows. Calling $v^a_t(x)$ (resp. $v^b_t(x)$) the ask (resp. bid) volume at price level $x$, we construct the cumulative volumes $Q^a_t$ and $Q^b_t$ as

We then choose the average volume at best $V^{\rm best}$ as a reference volume, and define:

where $(Q^{a/b}_t)^{-1}$ denotes the inverse function of $Q^{a/b}_t$. The effective spread is a natural proxy for liquidity in the close vicinity of the midprice: when the liquidity is close to its average, the effective spread coincides with the regular spread; but when liquidity is low, it can be much larger as aggregating the volume of several queues is needed to recover the reference volume $V^{\rm best}$. Figure 6 (a) displays the survival function of the effective spreads, revealing that $\mathbb{P}(S^{\rm eff}) \sim (S^{\rm eff})^{-5}$. This power-law tail is interesting for the following reason: the effective spread can be seen as a proxy for the size of latent price jumps, i.e. the jumps that are likely to happen if an aggressive market order hits the market. Hence, one expects that the distribution of effective spreads is not far from the distribution of price returns r, which is well known to decay as $\mathbb{P}(r) \sim r^{-4}$.
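The construction of the effective spread can be sketched on a single order book snapshot. Since the defining equation is not reproduced in this text, the sketch assumes that the effective spread is the distance between the ask and bid price levels at which the cumulative volume first reaches the reference volume (a generalised inverse of the cumulative volume). The function name `effective_spread` and the input layout (levels ordered from the best quote outwards) are assumptions.

```python
import numpy as np

def effective_spread(bid_prices, bid_vols, ask_prices, ask_vols, v_ref):
    """Distance between the ask and bid levels where cumulative volume reaches v_ref.

    Prices and volumes are ordered from the best quote outwards on each side.
    """
    def depth_price(prices, vols):
        cum = np.cumsum(np.asarray(vols, dtype=float))
        # First level whose cumulative volume is at least v_ref.
        idx = int(np.searchsorted(cum, v_ref, side="left"))
        if idx >= len(prices):
            raise ValueError("not enough visible volume to reach v_ref")
        return float(prices[idx])

    return depth_price(ask_prices, ask_vols) - depth_price(bid_prices, bid_vols)
```

When both best queues hold at least the reference volume, the result is the regular spread; a depleted best queue pushes the corresponding level outwards and widens the effective spread.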

Let us now study the relation between effective spread, square volatility $\sigma^2$ and square trend $\mu^2$, as defined in Eqs. (10) and (11) . Figures 6(b) , (c) and (d) display the correlation functions $C_\mu(\tau) := \mathrm{Cor}(\mu(t+\tau)^2, S^{\rm eff}(t))$, $C_\sigma(\tau) := \mathrm{Cor}(\sigma(t+\tau)^2, S^{\rm eff}(t))$ and the analogous correlation between the ratio $\mu^2/\sigma^2$ at time $t+\tau$ and $S^{\rm eff}(t)$, respectively. Note that a causal positive impact of past trends on future spreads should translate as a strong contribution to $C_\mu(\tau)$ for negative τ. Interestingly, this is compatible with Fig. 6(b) , which confirms in a model-free fashion that the Zumbach-like coupling is important: past square trends increase future effective spread, or equivalently decrease future liquidity. While also slightly asymmetric, the volatility/spread correlation $C_\sigma(\tau)$ does not reveal such a level of asymmetry (see Fig. 6 (c)). Fig. 6(d) shows an even more pronounced asymmetry when we rescale the trend by the local volatility: the ratio $\mu^2/\sigma^2$ is a proxy of the autocorrelation of returns, independently of their amplitude. In this sense, it is a better signature of trend behaviour, as the volatility aspect of recent price changes is discarded.
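The lead-lag correlation functions used in this section can be estimated on regularly sampled series with a few lines of Python. The sketch follows the convention C(τ) = Cor(x(t + τ), y(t)), so that an influence of past x on future y shows up at negative τ. The function name `lagged_correlation` and the assumption of evenly sampled, equal-length series are illustrative.

```python
import numpy as np

def lagged_correlation(x, y, max_lag):
    """Return lags tau and C(tau) = Cor(x(t + tau), y(t)) for |tau| <= max_lag.

    x and y are equal-length series sampled on the same regular grid;
    lags are expressed in number of samples.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    lags = np.arange(-max_lag, max_lag + 1)
    corr = np.empty(len(lags))
    for k, tau in enumerate(lags):
        if tau >= 0:
            a, b = x[tau:], y[:n - tau]   # pairs (x[t + tau], y[t])
        else:
            a, b = x[:n + tau], y[-tau:]  # same pairing for negative tau
        corr[k] = np.corrcoef(a, b)[0, 1]
    return lags, corr
```

In the setting of the text, x would be the squared trend or volatility signal and y the effective spread; an asymmetry of the curve around τ = 0 then indicates which of the two tends to lead the other.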

In this work, we have proposed several actionable procedures to calibrate general Quadratic Hawkes models for order book events (market orders, limit orders, cancellations). One of the main features of such models is to encode not only the influence of past events on future events but also, crucially, the influence of past price changes on such events. We have shown that the empirically calibrated quadratic kernel (describing the part of the feedback that is independent of the sign of past returns) is well described by the shape postulated in [5, 29, 32] , namely:

• a diagonal contribution that captures past realised volatility, and

• a rank-one contribution that captures the effect of past trends.

The latter contribution can be interpreted as the microstructural origin of the Zumbach effect: past trends, independently of their sign, tend to reduce the liquidity present in the order book, and therefore increase future volatility. As we have shown in our companion paper [5] , such coupling can in fact be strong enough to destabilise the order book and lead to liquidity crises.

One of the perhaps unexpected results of our calibration is that the Zumbach kernel is found to be a power-law of time for the futures contracts studied here, and not an exponential as was found in [29] for US stock prices. Hence, all Hawkes kernels in our study are found to be power-laws of time. Furthermore, as in many previous studies [11, 18, 25] , the rate of truly exogenous events is found to be much smaller than the total event rate, typically 1/5 when all kernels are truncated beyond 1000 seconds, and probably even smaller when longer lags are taken into account, due to the slow decay of the kernels. These two features suggest that the system is close to a critical point, in the sense that stronger feedback kernels would lead to instabilities. In our setting, we have shown that the effective spread (which is a measure of the (il-)liquidity of the order book) has itself a power-law tailed distribution, which we see as a precursor of the famous "inverse cubic" power-law tails of the return distribution (in the present context, see e.g. [28, 29] ). Such a power-law is not compatible with the alternative "activated" scenario proposed in [5] , which would rather suggest a bimodal distribution with a hump at large effective spreads. Hence, we favour at this stage the scenario of markets poised close to a point of instability, although the detailed mechanisms that lead to such a fine tuning are still somewhat obscure. We note that near-criticality has also been argued to be crucial to understand the "rough" nature of volatility [32, 43, 44] . We believe that understanding these mechanisms is probably one of the most intellectually challenging (and exciting) issues for microstructure theorists.

[1] What moves stock prices
[2] Events that shook the market
[3] Stock price jumps: news and volume play a minor role
[4] The endogenous dynamics of markets: price impact and feedback loops
[5] Endogenous liquidity crises
[6] The flash crash: High-frequency trading in an electronic market
[7] Back to the future: lessons from the forgotten 'flash crash' of 1962
[8] Trades, quotes and prices: financial markets under the microscope
[9] Spectra of some self-exciting and mutually exciting point processes
[10] An introduction to Hawkes processes with applications to finance. Lectures Notes from Ecole Centrale Paris
[11] Hawkes processes in finance
[12] State-dependent Hawkes processes and their application to limit order book modelling
[13] The role of volume in order book dynamics: a multivariate Hawkes process analysis
[14] Modelling systemic price cojumps with Hawkes factor models
[15] Queue-reactive Hawkes models for the order flow
[16] Hawkes model for price and trades high-frequency dynamics
[17] Modelling microstructure noise with mutually exciting point processes
[18] Critical reflexivity in financial markets: a Hawkes process analysis
[19] Modeling foreign exchange market activity around macroeconomic news: Hawkes-process approach
[20] Simulating and analyzing order book data: The queue-reactive model
[21] Extension and calibration of a Hawkes-based optimal execution model
[22] Dynamic optimal execution in a mixed-market-impact Hawkes price model
[23] Analysis of order book flows using a non-parametric estimation of the branching ratio matrix
[24] Collective synchronization and high frequency systemic instabilities in financial markets
[25] Quantifying reflexivity in financial markets: Toward a prediction of flash crashes
[26] The statistical physics of discovering exogenous and endogenous factors in a chain of events
[27] Volatility conditional on price trends
[28] The fine-structure of volatility feedback I: Multi-scale self-reflexivity
[29] Quadratic Hawkes processes for financial prices. Quantitative Finance
[30] The Zumbach effect under rough Heston
[31] Quadratic ARCH models
[32] From quadratic Hawkes processes to super-Heston rough volatility models with Zumbach effect
[33] How does latent liquidity get revealed in the limit order book
[34] Quantitative model of price diffusion and market friction based on trading as a mechanistic random process
[35] Statistical theory of the continuous double auction
[36] First- and second-order statistics characterization of Hawkes processes and non-parametric estimation
[37] Estimation of slowly decreasing Hawkes kernels: application to high-frequency order book dynamics
[38] Non-parametric kernel estimation for symmetric Hawkes processes. Application to high frequency financial data
[39] The micro-price: a high-frequency estimator of future prices
[40] Queue imbalance as a one-tick-ahead price predictor in a limit order book
[41] Limit theorems for nearly unstable Hawkes processes. The Annals of Applied Probability
[42] Weighted low-rank approximations
[43] Rough fractional diffusions as scaling limits of nearly unstable heavy tailed Hawkes processes
[44] No-arbitrage implies power-law market impact and rough volatility

Here we show how to practically estimate the kernels presented in section 2.2 from empirical data. First, we detail the empirical estimators for averages and covariances, then focus on the time grids used for estimation, and finally discuss the numerical discretisation of Eqs. (3).

We assume that we have a sample of events of type i that happen at times $(T^i_n)_n$, with i = P for the price process. Calling T the total length of observation, the estimators of the average intensities read:

For the covariance estimators, we use a classical approach for asynchronous data. Denoting ∆t, ∆x the time steps associated with times t and x, one has:

Note that, as mentioned above, one can choose different time grids for the Hawkes and price contributions. Symmetry properties of the covariances enable us to estimate them only for positive times:

One can reasonably assume that the covariances are $\mathcal{C}^1$ except in zero.

Quadrature points in log-scale [37] are well suited to accurately account for long range behaviour in the norm of the kernels. Consistently, it is advised to have time intervals increasing at the same rate as the grid of points we use. On the other hand, taking disjoint intervals [t − ∆t/2, t + ∆t/2] enables fast computations of the covariances. To enforce all of this, we compute the differences between the quadrature points, sort them and take the cumulative sum. This gives the disjoint time intervals suited for fast computations. Then, with linear interpolation, we obtain the final values on the quadrature points.

Discretisation

Equations (3) can be discretised in two different ways, using properties of the covariances and time grids. To show how to approximate the integrals, we provide an example of discretisation of $\int_{\mathbb{R}^+} f(s)\,\mathrm{d}s$ for an arbitrary function f using the time grid $(t_n)$. The two possibilities are:

• The quadrature technique: $\int_{\mathbb{R}^+} f(s)\,\mathrm{d}s \approx \sum_n f(t_n)\,w_n$.

• The piece-wise $\mathcal{C}^1$ approximation:

The first approximation is very efficient to compute Tr K or ||φ|| using $(t^H_n)$ and $(w^H_n)$. The second handles very well the behavior around zero and can be useful to solve Eq. (4).