title: Extreme events and emergency scales
authors: Smirnov, Veniamin; Ma, Zhuanzhuan; Volchenkov, Dimitri
date: 2020-05-22
journal: Commun Nonlinear Sci Numer Simul
DOI: 10.1016/j.cnsns.2020.105350

An event is extreme if its magnitude exceeds a threshold. The choice of a threshold is subject to uncertainty caused by the method used, the size of the available data, a hypothesis on the statistics, etc. We assess the degree of uncertainty by the Shannon entropy calculated on the probability that the threshold changes at any given time. If the amount of data is not sufficient, an observer is in the state of Lewis Carroll's Red Queen, who said "When you say hill, I could show you hills, in comparison with which you'd call that a valley". If we have enough data, the uncertainty curve peaks at two values, clearly separating the magnitudes of events into three emergency scales: subcritical, critical, and extreme. Our approach to defining the emergency scale is validated by 39 years of Standard and Poor's 500 (S&P 500) historical data.

Not a single day passes by without hearing about extreme events, which surround us almost everywhere. On one hand, climate change results in droughts, heat waves, tornadoes, storms, etc., and the movement of tectonic plates is responsible for earthquakes and volcanic eruptions. On the other hand, certain aspects of human activities may crash stock markets and influence tensions not only among groups of people in a country, but also between countries, sometimes leading to military confrontations, mass migrations, etc. While there is no single definition of extreme events [1], they are considered events that cause infrastructure failures, economic and property losses, and risk to health and life. In order to quantify extreme events, practitioners have developed several scales: for instance, the Modified Mercalli Intensity scale [2] (describes the intensity of visible earthquake damage), the Beaufort Wind scale [3] (measures wind speed and the observed effects the wind has on the sea and land), the Saffir-Simpson Hurricane scale [4] (measures wind speed), the Fujita scale [5] (rates the intensity of a tornado after it has passed), the US Homeland Security Terror Alert scale [6] (five color-coded terror alert levels), the U.S. Climate Extremes Index [7], etc. The Rohn Emergency Scale [8] unites emergency scales using three independent dimensions: (i) scope; (ii) topographical change (or lack thereof); and (iii) speed of change. The intersection of the three dimensions provides a detailed scale for defining any emergency [8]. In some papers, the threshold for an extreme event is related to the number of standard deviations from the average amplitude [9, 10]. However, existing empirical scales tend to describe the characteristics of the event itself rather than its consequences; such scales are ill-suited to describe emergencies in a way that is meaningful for response [11]. For instance, extreme events of different magnitudes in financial markets range from global recessions (defined by a global annual GDP growth rate of 3.0 percent or less) that happened in 1975, 1982, 1991, and 2009, to "flash crashes" (for instance, on May 6, 2010, the S&P 500 declined 7% in less than 15 minutes and then quickly rebounded). Unlike the Richter magnitude scale, the severity of flash crashes in financial markets is defined by the measures that need to be taken to ease panic, such as halting trading.
Under 2012 rules, market-wide circuit breakers (or "curbs") kick in when the S&P 500 index drops 7% (Level 1), 13% (Level 2), or 20% (Level 3) from the prior day's close. A market decline that triggers a Level 1 or Level 2 circuit breaker before 3:25 p.m. Eastern Time will halt trading for 15 minutes, but will not halt trading at or after 3:25 p.m. Circuit breakers can also be imposed on single stocks as opposed to the whole market. Under current rules, a trading halt on an individual security is placed into effect if there is a 10% change in value of a security that is a member of the S&P 500 Index within a 5-minute time frame, a 30% change in value of a security whose price is equal to or greater than $1 per share, or a 50% change in value of a security whose price is less than $1 per share [14]. Ironically, in August 2015, single-stock circuit breakers produced unprecedented disruption as 327 exchange-traded funds experienced more than 1,000 trading halts during a single day. For the short-term reactions of stock markets, measured in terms of returns, there exist several approaches to defining their severity. For instance, some authors [12] define events in stocks as extreme if prices change by more than 2.5%, for both positive and negative returns; however, others [13] suggest a threshold of 10%.

In general, statistics of extreme events show that extreme events are found in the tails of probability distributions (i.e., the distributions' extremities). Inference over tails is usually performed by fitting an appropriate limiting distribution to observations that exceed a fixed threshold. The choice of the threshold over which to fit a probability distribution is hard and arbitrary. Although tools to guide this choice exist, inference can vary greatly for different thresholds. In addition, different distributions may admit asymptotic tails of the same appearance (a power law). In our paper, we apply two approaches used in practical extreme value analysis to study the S&P 500 time series in the period from January 2, 1980, till December 31, 2018. The first relies on deriving block maxima series as a preliminary step: we fit the actual data to the Generalized Extreme Value Distribution (GEVD). Our results (Sec. 4.1) show that the distribution of block maxima is a composition of several distributions (Fig. 6b). The second approach relies on extracting the peak values reached during any period in which values exceed a certain threshold (or fall below a certain threshold). We cover the major methods of choosing the threshold in Sec. 4.2. Usually, with the second approach, the analysis may involve fitting two distributions: one for the number of events in the time period considered and a second for the size of the exceedances. In practice, however, a suitable threshold is unknown and must be determined for each analysis [15]. We demonstrate several methods of choosing a threshold. An example of an empirical method, the rule of thumb, is studied in Sec. 4.2.3. A choice of the threshold can also be made using statistical analysis tools, such as graphical approaches (Sec. 4.2) and automatic methods (Sec. 4.2.2). Rather than characterizing extreme events solely in terms of the distribution of weekly maxima, threshold methods take account of all events beyond a given high threshold. Every decision we make or approach we choose is made in the face of uncertainty.
For instance, as of March 25, 2020, 414,179 cases and 18,440 deaths due to coronavirus disease 2019 (COVID-19), caused by the novel severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), had been reported worldwide [16]. In attempts to slow down the spread of the disease, the economies of many countries slammed on the brakes and financial markets plummeted in the light of uncertainty; yet the above does not describe the severity of this particular extreme event, but rather the reaction of mankind to COVID-19. One natural characteristic of the severity of a pandemic is the death rate, which in the case of SARS-CoV-2 ranges from 0.66% [17] to 4% [18]. The mortality rate is higher among people with underlying medical conditions, and in some cases deaths are only partly due to the coronavirus. Therefore, some believe that this pandemic is poised to be more dramatic than anything we have seen in our lifetimes [19], while others compare the situation to an elephant being attacked by a house cat: frustrated and trying to avoid the cat, the elephant accidentally jumps off a cliff and dies [20].

The uncertainty depends on the amount of data considered, the method used, etc. Moreover, once new data become available, we cannot guarantee the sustainability of the previously chosen threshold value. In addition, the assessment of the severity of an extreme event is also ruled by its consequences, which may be revealed ex post facto. In Sec. 4.3 we present statistics of extreme events under threshold uncertainty. We analyze the degree of uncertainty of the threshold value (chosen with the rule of thumb) based on the probability of its change on any given day. Fig. 16 demonstrates the degree of uncertainty, which depends on both the amount of data available and the values of the thresholds. We observed several cases. (i) If the amount of data is not sufficient (the solid line corresponding to an 18-day window of observation in Fig. 16), then the uncertainty curve forms a skewed profile attaining a single maximum for some value of the threshold. In this situation, an observer's perception of events is reminiscent of the Red Queen from "Through the Looking-Glass, and What Alice Found There" by Lewis Carroll [21], who said "When you say hill, I could show you hills, in comparison with which you'd call that a valley". Our understanding of whether the events are extreme or not is very limited, and the uncertainty is blurry. As events become more severe, our uncertainty that the events are extreme decreases. The observer realizes that the events are extreme, but a precise point at which the events turn severe cannot be determined. We call this case the Red Queen state. (ii) As the window of observation becomes larger, the uncertainty curve exhibits two maxima, indicating that the amount of data is sufficient. As we further extend the window, the curve sharpens into narrow peaks (Fig. 16). The latter clearly separate the threshold values into three regions: three levels of emergency. (I) In region I (subcritical), the threshold values are too small to raise concern about extreme events. Then we can observe a spike, with the degree of uncertainty attaining its first maximum. This extremum indicates a transition to the next kind of uncertainty. (II) In region II (critical), the uncertainty is conceptualized by the question whether the magnitude of the event is already critical or extreme, or not yet. Further, we see another jump of uncertainty. At this point we are certain that events are not regular anymore.
(III) We consider all events in this region extreme, with our uncertainty decreasing. If these events are not extreme, then what are they? We consider the Red Queen state and the three scales based on the degree of uncertainty to be our contribution to the discussion on extreme events and emergency scales. We conclude in the last Sec. 5.

The analysis of extreme events in the S&P 500 time series of log-returns has been performed on data collected during 9,835 trading days (39 years), in the period from January 2, 1980, till December 31, 2018 (see Fig. 2), acquired from the publicly available source at Yahoo Finance (https://finance.yahoo.com/quote/%5EGSPC/). The data set contains the S&P 500 index at market open, the highest point reached in the day, the lowest point in the day, the index at market close, and the number of shares traded. We have used the index at market close, since this information was always present in the data set. Computations were made using Python's numerical libraries, such as NumPy and Pandas, as well as the R language for statistical analysis. Due to the complexity of financial data, in order to analyze the dynamics of the given financial time series, we computed the log-return, denoted $R_{\ln}$, according to the following formula [22]:
$$R_{\ln}(t) = \ln\!\left(\frac{\text{S\&P Index}(t)}{\text{S\&P Index}(t-1)}\right),$$
where S&P Index(t) is the value of the S&P 500 at market close (Fig. 3). The lowest drop of the log-return in Fig. 3 happened on Monday, October 19, 1987, known as Black Monday, when the S&P 500 shed nearly 23% of its value [23]. Another significant drop occurred in 2008, when the S&P 500 fell 38.49%, its worst yearly percentage loss [24]. In September 2008, Lehman Brothers collapsed as the financial crisis spread. However, on October 13, 2008, the S&P 500 marked its best daily percentage gain, rising 11.58%, and registered its largest single-day point increase of 104.13 points [24]. The distribution of $R_{\ln}$ values (Fig. 4) is asymmetrically skewed, with fat heterogeneous right and left tails.
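For illustration, the log-return computation defined above can be sketched in a few lines of Python with NumPy and pandas (the libraries mentioned above); the file and column names below are illustrative assumptions, not the original data pipeline.

import numpy as np
import pandas as pd

def log_returns(close: pd.Series) -> pd.Series:
    """R_ln(t) = ln( Close(t) / Close(t-1) )."""
    return np.log(close / close.shift(1)).dropna()

# Hypothetical usage with a CSV export of the ^GSPC history:
# prices = pd.read_csv("GSPC_1980_2018.csv", parse_dates=["Date"], index_col="Date")
# r_ln = log_returns(prices["Close"])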
The log-return time series has a scale-invariant structure when the structure repeats itself on sub-intervals of the signal, $X(ct) = c^{H} X(t)$, where the Hurst exponent H characterizes the asymptotic behavior of the auto-correlation function of the time series [25, 26, 27]. A larger Hurst exponent is visually seen as slower-evolving variations (i.e., a more persistent structure) of the time series [26, 28, 29]. Processes with 0 < H < 0.5 exhibit anti-persistence: an increase in the process is likely to be followed by a decrease in the next time interval, resulting in sample paths with a very rough structure [26, 28, 29]. On the contrary, values 0.5 < H < 1 lead to long-range dependence ("long memory") in the time series, with successive increments more likely to have the same sign (persistence) and smoother sample trajectories. Finally, the time series constitutes a random walk when H > 1, with more apparent slowly evolving fluctuations [26, 28, 29]. The q-order Hurst exponent $H_q$ is only one of several types of scaling exponents used to parameterize the multifractal structure of time series [26, 30]. The log-return time series for the S&P 500 exhibits local fluctuations with both extremely small and large magnitudes, as well as short- and long-range dependences on different time scales [31, 32]; it is not normally distributed, and all q-order statistical moments should be considered to describe the spatial and temporal variation that reveals a departure of the log-return time series from simple random walk behavior [26, 28]. The q-order weights the influence of segments with large and small fluctuations: the negative q's are influenced by the time series segments with small fluctuations, while the segments with large fluctuations influence the positive q's. In our work, we use the standard multifractal detrended fluctuation analysis (MFDFA) algorithm [26, 30] for estimating the q-order Hurst exponents and the multifractal spectra directly from the time series (a sketch in Python follows below):
(1) The original time series $x_k$, $k = 1, \dots, N$, is aggregated by computing the cumulative sums $Y(n) = \sum_{k=1}^{n} (x_k - \langle x \rangle)$, $n = 1, \dots, N$, where $\langle x \rangle$ denotes the sample mean;
(2) The aggregated data is divided into $N/s$ non-overlapping segments of length $s$;
(3) The maximum likelihood estimator of the residual variance in segment $\nu$, $F^{2}(\nu, s) = \frac{1}{s}\sum_{i=1}^{s}\left[ Y\big((\nu-1)s + i\big) - y_{\nu}(i) \right]^{2}$, $\nu = 1, \dots, N/s$, is computed, where $y_{\nu}(i)$ is the $m$-degree polynomial fitted to the aggregated observations in the segment;
(4) For each segment length $s$ and for each positive or negative value of the moment order $q$, the $q$-order fluctuation function $F_q(s) = \left[ \frac{1}{N/s}\sum_{\nu=1}^{N/s}\left( F^{2}(\nu, s) \right)^{q/2} \right]^{1/q}$ is calculated; the local fluctuations with large and small magnitudes are graded by the magnitude of the positive or negative q-order, respectively;
(5) A linear regression of $\ln F_q(s)$ on $\ln s$ over all $s$ is performed, and the slope of the linear function $\ln F_q(s) \propto H_q \ln s$ is used as an estimator of the q-order Hurst exponent $H_q$ for each q-order fluctuation function $F_q$.
The fractal structures of the positive and negative log-return time series and their deviations within time periods with large and small fluctuations are assessed by the q-order Hurst exponents (see Fig. 5).
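A minimal Python sketch of steps (1)-(5) is given below. It is not the implementation used for Fig. 5 (the standard MFDFA algorithm of [26, 30] was used there); the scale set, the q-orders, and the omission of q = 0 (which requires a logarithmic average) are simplifying assumptions.

import numpy as np

def mfdfa_hurst(x, scales, q_values, poly_degree=1):
    """Estimate q-order Hurst exponents H_q by the MFDFA steps (1)-(5)."""
    y = np.cumsum(x - np.mean(x))                      # (1) aggregated profile
    n = len(y)
    log_s = np.log(np.asarray(scales, dtype=float))
    fq_log = {q: [] for q in q_values}
    for s in scales:
        n_seg = n // s                                 # (2) non-overlapping segments
        f2 = np.empty(n_seg)
        t = np.arange(s)
        for v in range(n_seg):
            seg = y[v * s:(v + 1) * s]
            coeffs = np.polyfit(t, seg, poly_degree)   # (3) m-degree local trend
            f2[v] = np.mean((seg - np.polyval(coeffs, t)) ** 2)
        for q in q_values:                             # (4) q-order fluctuation function
            fq_log[q].append(np.log(np.mean(f2 ** (q / 2.0)) ** (1.0 / q)))
    return {q: np.polyfit(log_s, np.array(fq_log[q]), 1)[0]   # (5) slope = H_q
            for q in q_values}

# Hypothetical usage:
# hq = mfdfa_hurst(r_ln.values, scales=[16, 32, 64, 128, 256], q_values=[-5, -3, -1, 1, 3, 5])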
There are two primary approaches to analyzing extreme values (the extreme deviations from the median of the probability distributions) in data: (i) the first and more classical approach reduces the data considerably by taking maxima of long blocks of data, e.g., annual maxima; the GEVD function has theoretical justification for fitting to block maxima of data [33]; (ii) the second approach is to analyze excesses over a high threshold; for this second approach, the generalized Pareto distribution (GPD) function has similar justification for fitting to excesses over a high threshold [33].

Generalized extreme value distributions. The GEVD is a flexible three-parameter continuous probability distribution that was developed within extreme value theory to combine the Gumbel, Fréchet, and Weibull extreme value distributions into one single distribution [34, 35]. The GEV distribution has the following pdf [36]:
$$f(x; \mu, \sigma, \xi) = \frac{1}{\sigma}\, t(x)^{\xi + 1} e^{-t(x)}, \qquad t(x) = \left[ 1 + \xi \left( \frac{x - \mu}{\sigma} \right) \right]^{-1/\xi},$$
where $\mu \in \mathbb{R}$ is the location parameter, $\sigma > 0$ is the scale parameter, and $\xi \in \mathbb{R}$ is the shape parameter. When the shape parameter ξ is equal to 0, greater than 0, or lower than 0 [33], the GEV distribution is equivalent to the Gumbel [37], Fréchet [38], and "reversed" Weibull [39] distributions, respectively. The Gumbel distribution, also named the Extreme Value Type I distribution, has the following pdf and cdf:
$$(4.1)\qquad f(x) = \frac{1}{\beta}\, e^{-\left( \frac{x - \mu}{\beta} + e^{-\frac{x - \mu}{\beta}} \right)}, \qquad F(x) = e^{-e^{-(x - \mu)/\beta}},$$
where $x \in \mathbb{R}$, µ is the location parameter, and β > 0 is the scale parameter. In particular, when µ = 0 and β = 1, the distribution becomes the standard Gumbel distribution. Generalizations of the Gumbel distribution, which have flexible skewness and kurtosis due to the addition of one more shape parameter, are widely used for extreme value data as they better fit the data [40]. The distribution in (4.1) has been employed as a model for extreme values [41, 42]. The distribution has a light right tail, which declines exponentially, since its skewness and kurtosis coefficients are constant. The Fréchet distribution, also known as the Extreme Value Type II distribution, has the following pdf and cdf, respectively:
$$f(x) = \frac{\alpha}{\beta} \left( \frac{x}{\beta} \right)^{-1-\alpha} e^{-(x/\beta)^{-\alpha}}, \qquad F(x) = e^{-(x/\beta)^{-\alpha}}, \qquad x > 0,$$
where α > 0 is the shape parameter and β > 0 is the scale parameter. The Weibull distribution is known as the Extreme Value Type III distribution. The pdf and cdf of a Weibull random variable are, respectively,
$$f(x) = \frac{k}{\lambda} \left( \frac{x}{\lambda} \right)^{k-1} e^{-(x/\lambda)^{k}}, \qquad F(x) = 1 - e^{-(x/\lambda)^{k}}, \qquad x \geq 0,$$
where λ > 0 is the scale parameter and k > 0 is the shape parameter.

Further, we show the application of the GEV model to the stock market close price using the weekly-return data, calculated as
$$\frac{(\text{maximum close price of week } t) - (\text{maximum close price of week } (t-1))}{\text{maximum close price of week } (t-1)}.$$
The results of fitting the GEV distribution to the (weekly) block maxima data are presented in Fig. 6 and Table 1, which show the quantile-quantile plot (QQ-plot) of quantiles from a sample drawn from the fitted GEV pdf against the empirical data quantiles, with 95% confidence bands. The maximum likelihood estimators of the GEV distribution are the values of the three parameters (µ, σ, ξ) that maximize the log-likelihood. The magnitude, along with the positive sign, of ξ indicates the fat-tailedness of the weekly-return data, which is consistent with the quantile plot. Based on the statistical analysis presented above (Fig. 6a), we see that the distribution of the weekly-return data can be described by a combination of different distributions. The density plot (Fig. 6b), having two humps, validates the idea of a mixture of distributions.

4.2. How to choose a threshold. The classical approach for modeling extreme events is based on the GPD. It was proved [43] that if a threshold u is chosen and $X_1, X_2, \dots, X_n$ are observations above u, then the limiting distribution for the excesses over the threshold is indeed the GPD. In applications, the GPD is used as a tail approximation [44] for values x − u exceeding the threshold u. The GPD is determined by the scale and shape parameters $\sigma_u > 0$ and ξ, respectively; in terms of the threshold excess x − u, it takes the form
$$H(x - u) = 1 - \left( 1 + \frac{\xi (x - u)}{\sigma_u} \right)_{+}^{-1/\xi},$$
where $f_+ = \max(f, 0)$. When ξ > 0, it takes the form of the ordinary Pareto distribution. This case is the most relevant for financial time series, since they are heavy-tailed. For security returns or high-frequency foreign exchange returns, the estimates of ξ are usually less than 0.5. When ξ = 0, the GPD corresponds to the exponential distribution [45]. There are several properties of the GPD [43], such as the "threshold stability" property: if X is GPD and u > 0, then X − u, provided X > u, is also GPD. Therefore, a Poisson process of exceedance times with generalized Pareto excesses implies the classical extreme value distributions [46]. The above suggests that the generalized Pareto distribution is a practical tool for statistical estimation of extreme values, given a sufficiently high threshold. The rest of this section is devoted to the question of how high a threshold should be.
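Both approaches can be sketched with SciPy's built-in distributions; the snippet below is an illustration under the assumption that the weekly block maxima and the daily log-returns are already available as arrays (note that scipy.stats.genextreme uses a shape parameter c equal to −ξ in the notation above, while genpareto's shape c equals ξ). The default u = 0.016 is the positive-threshold value selected later in this section.

import numpy as np
from scipy import stats

def fit_gev_block_maxima(weekly_maxima: np.ndarray):
    """Fit the GEV distribution to block maxima (approach (i))."""
    c, loc, scale = stats.genextreme.fit(weekly_maxima)
    return {"xi": -c, "mu": loc, "sigma": scale}

def fit_gpd_exceedances(log_returns: np.ndarray, u: float = 0.016):
    """Fit the GPD to excesses x - u over a high threshold u (approach (ii))."""
    excesses = log_returns[log_returns > u] - u
    c, loc, scale = stats.genpareto.fit(excesses, floc=0.0)
    return {"xi": c, "sigma_u": scale, "n_exceedances": excesses.size}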
4.2.1. Graphical approaches to estimate the threshold. One of the most common ways to determine a suitable threshold is to graphically inspect the data. This approach [44] requires substantial expertise, can be subjective, and is time consuming. In some cases, when dealing with several data sets, a uniform threshold may be proposed and kept fixed, making the entire evaluation even more subjective. The most common graphical tools are the mean excess plot [47], the threshold stability plot [44], the QQ-plot [48], the Hill plot [49], return level plots [44], etc.

The mean excess plot is a tool widely used in the study of risk, insurance, and extreme values. One use is in validating a generalized Pareto model for the excess distribution. The distribution of the excess over a threshold u for a random variable X with distribution function F is defined as
$$F_u(y) = \Pr\{ X - u \leq y \mid X > u \} = \frac{F(u + y) - F(u)}{1 - F(u)}, \qquad y \geq 0.$$
This excess distribution is the foundation for peaks-over-threshold modeling, which fits appropriate distributions to data on excesses and is widespread, with many applications in hydrology [50, 51], actuarial science [52, 53], and survival analysis [54]. This modeling is based on the GPD, which is suitable for describing the properties of excesses. The mean excess (ME) function is one of the most common tools for determining a suitable threshold u. The ME function of a random variable X is defined as
$$M(u) = E\left[ X - u \mid X > u \right],$$
provided $E X_{+} < +\infty$; it is also known as the mean residual life function. As Ghosh [47] noted, for a random variable $X \sim G_{\xi, \beta}$ (a GPD with shape ξ and scale β), $E(X) < +\infty$ if and only if ξ < 1, and in this case the ME function of X is linear in u:
$$M(u) = \frac{\beta + \xi u}{1 - \xi},$$
where $0 \leq u < +\infty$ if $\xi \in [0, 1)$ and $u \in \left[ 0, -\frac{\beta}{\xi} \right]$ if ξ < 0. The linearity of the ME function characterizes the GPD class [47]. Davison and Smith [46] developed a simple graphical tool that checks data against a GPD model. Let $X_1 \geq X_2 \geq \dots \geq X_n$ be the order statistics of the data; then the ME plot depicts the points $\{ (X_k, \hat{M}(X_k)) \mid 1 < k \leq n \}$, where $\hat{M}$ is the empirical ME function defined as
$$\hat{M}(u) = \frac{\sum_{i=1}^{n} (X_i - u)\, \mathbf{1}_{\{X_i > u\}}}{\sum_{i=1}^{n} \mathbf{1}_{\{X_i > u\}}}.$$
If the ME plot is close to linear for sufficiently large values of the threshold, then there is no evidence against the use of a GPD model. Another problem is to obtain a natural estimate $\hat{\xi}$ of ξ. There are several methods to estimate $\hat{\xi}$, such as: (i) least squares [44], (ii) maximum likelihood estimation [46], (iii) the Hill estimator [49], (iv) the Pickands estimator [55], (v) the quantile-quantile plot (QQ-plot) [44], and (vi) the moment estimator [56]. For example, for ξ > 0 the QQ-plot depicts the points
$$Q_{m,n} = \left\{ \left( -\log \frac{i}{m},\; \log \frac{X_i}{X_m} \right) \;\middle|\; 1 \leq i \leq m \right\},$$
where m < n; in the case ξ < 0, the QQ-plot is built from analogous points computed with an estimate $\hat{\xi}$ of ξ based on the m upper order statistics. Recently, new graphical diagnostic tools have been introduced: a new multiple-threshold GP model with a piece-wise constant shape parameter [57]; plots measuring surprise at different threshold candidates, using results of Bayesian statistics [58, 59]; and the structure of maximum likelihood estimators has been studied to develop diagnostic plots with more direct interpretability [60].

With this choice for a threshold, u = 0.016, we get 507 exceedances, with the empirical mean residual life becoming close to linear above this choice of the threshold (Fig. 7). Similarly, we find a threshold for negative returns; in this case, all computations were repeated for the absolute values of the negative returns (Fig. 8). With this choice of the threshold, u = 0.017, there are 462 exceedances. One can see that the empirical MRL becomes almost linear above u = 0.017. Ghosh and Resnick [47] noted that, although graphical diagnostics are commonly accepted by practitioners, there are some problems associated with the methods mentioned above: (i) an analyst needs to be convinced that ξ < 1, since for ξ ≥ 1 random sets are the limits of the normalized ME plot; such random limits lead to wrong impressions, and certain methods described above work only with ξ defined on specific intervals; (ii) distributions that are not close to the GPD can mislead the mean excess diagnostics.
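The empirical mean excess (mean residual life) function used in Figs. 7 and 8 can be computed directly from its definition; the sketch below is illustrative and assumes the data are passed as a NumPy array.

import numpy as np

def empirical_mean_excess(x: np.ndarray, thresholds: np.ndarray) -> np.ndarray:
    """\hat{M}(u): average of (x - u) over the observations exceeding u."""
    me = np.full(len(thresholds), np.nan)
    for i, u in enumerate(thresholds):
        excess = x[x > u] - u
        if excess.size:                      # undefined where nothing exceeds u
            me[i] = excess.mean()
    return me

# Hypothetical usage for the positive log-returns:
# us = np.linspace(0.0, 0.05, 200)
# me = empirical_mean_excess(r_ln[r_ln > 0], us)   # plot me against us and look for linearity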
Based on the graphical approach, the threshold for the negative returns of the S&P 500 was chosen at u = −0.017 and for the positive returns at u = 0.016. The distributions of the exceedances over the respective thresholds are shown in Fig. 9. The QQ-plots shown in Fig. 10 demonstrate that, with a few exceptions, the exceedances follow the GPD.

Figure 10. (a) Quantile-quantile plot with maximum likelihood estimation for the negative threshold; (b) QQ-plot with maximum likelihood estimation for the positive threshold.

4.2.2. Automatic methods to estimate thresholds. As was mentioned above, the graphical approaches, as well as the rules of thumb, can be highly subjective and time consuming, and require a certain professional background. Thus, some authors have proposed automatic selection methods that can treat chunks of Big Data: a pragmatic, automated, simple, and computationally inexpensive threshold selection method based on the distribution of the difference of parameter estimates when the threshold is changed [61]; it was shown that better performance is demonstrated by graphical methods and goodness-of-fit metrics that rely on pre-asymptotic properties of the GPD [62], using weighted least squares to fit linear models in the traditional mean residual life plot; and the recently developed stopping rule ForwardStop [63], which transforms the results of ordered, sequentially tested hypotheses to control the false discovery rate [64] and provides reasonable error control [58]. Of particular interest is a method that suggests a way to determine the threshold automatically, without time-consuming and rather subjective visual approaches, based on the L-moments of the GPD, which summarize probability distributions and allow estimation of parameters and hypothesis testing [58]. Probability weighted moments, defined by Greenwood [65], are precursors of L-moments. Sample probability weighted moments, computed from data values $X_1, X_2, \dots, X_n$ arranged in increasing order, are given by
$$b_r = \frac{1}{n} \sum_{j=r+1}^{n} \frac{(j-1)(j-2)\cdots(j-r)}{(n-1)(n-2)\cdots(n-r)}\, X_j, \qquad r = 0, 1, 2, \dots$$
L-moments are certain linear combinations of probability weighted moments that have simple interpretations as measures of location, dispersion, and shape of the data sample. The first few L-moments are defined by
$$\lambda_1 = b_0, \qquad \lambda_2 = 2 b_1 - b_0, \qquad \lambda_3 = 6 b_2 - 6 b_1 + b_0, \qquad \lambda_4 = 20 b_3 - 30 b_2 + 12 b_1 - b_0$$
(the coefficients are those of the shifted Legendre polynomials). The first L-moment is the sample mean, a measure of location. The second L-moment is (a multiple of) Gini's mean difference statistic, a measure of the dispersion of the data values about their mean. By dividing the higher-order L-moments by the dispersion measure, we obtain the L-moment ratios, $\tau_r = \lambda_r / \lambda_2$, r = 3, 4, .... These are dimensionless quantities, independent of the units of measurement of the data; $\tau_3$ is a measure of skewness and $\tau_4$ is a measure of kurtosis, respectively called the L-skewness and the L-kurtosis, and they take values between −1 and +1. For a random variable with a GPD with ξ < 1, the particular relationship between the L-skewness and the L-kurtosis is
$$\tau_4 = \tau_3\, \frac{1 + 5 \tau_3}{5 + \tau_3}.$$
Given a sample $x_1, x_2, \dots, x_n$, the Automatic L-moment Ratio Selection Method (ALRSM) works as follows [58]: 1. Define the set of candidate thresholds $\{u_i\}_{i=1}^{I}$ as I = 20 sample quantiles, starting at 25% by steps of 3.7%. 2. Compute the sample L-skewness and L-kurtosis of the excesses over each candidate threshold, $(\tau_{3,u_i}, \tau_{4,u_i})$, and determine $d_{u_i}$, the Euclidean distance from this point to the theoretical GPD curve above. 3.
The threshold after which the behavior of the tail of the underlying distribution can be considered approximately GPD is then automatically selected as
$$u^{*} = \underset{u_i}{\arg\min}\; d_{u_i},$$
that is, the level above which the corresponding L-statistics fall closest to the curve.

Figure 11. Values of the thresholds for positive and negative log-returns based on L-moments. The solid black and blue lines correspond to negative and positive log-return thresholds, respectively, based on a window of 100 trading days. The dotted black and blue lines correspond to negative and positive log-return thresholds, respectively, based on a window of 400 trading days.

Using the L-moments method, we computed thresholds for the S&P 500 log-returns depending on the observation period: 100 trading days and 400 trading days. Fig. 11 indicates that a threshold is not only time dependent, but also depends on the size of the data set used. Once again, we cannot choose one value of the threshold that is absolutely accurate.

4.2.3. Rules of thumb to choose a threshold. As noted earlier, the threshold sequence is a function of the properties of the GPD, provided that the population is in the domain of attraction of the GPD. In case the distribution function F is known, derivation of the threshold selection is possible; however, in practice, if F is unknown, then there is no general form for the threshold sequence [44]. Practitioners often use so-called rules of thumb, many of which have little to no theoretical justification, for instance, a fixed quantile rule [66] (the upper 10% rule or its derivative 5% rule) or the square root rule $k = \sqrt{n}$ [67]. There is a procedure that tries to find a region of stability among the estimates of the extreme value index [68]. This method depends on a tuning parameter, whose choice is further analysed in [69]. Unfortunately, no theoretical analysis exists for this approach [70]. More comprehensive reviews of the threshold selection methods can be found in [44].

Even though most of the methods mentioned above have no theoretical justification for an exact value of a threshold, we can find an approximate location for a threshold given a data set R that has M values of log-return, $R = [x_1, x_2, \dots, x_M]$ (a Python sketch of the procedure follows below). 1. Let $R_n$ be a list of n consecutive values of log-return, $R_n = [x_1, x_2, \dots, x_n]$. Split $R_n$ into two parts, $R_{n+} = \{x \in R_n \mid x \geq 0\}$ and $R_{n-} = \{x \in R_n \mid x < 0\}$, and sort each of them in increasing order, so that $R_{n+}$ contains p values and $R_{n-}$ contains q values, where $p, q \in \mathbb{N}$, p + q = n, and n is the width of the window of observation. 2. Next, we compute the medians of $R_{n+}$ and $R_{n-}$, call them $x_{m+}$ and $x_{m-}$. These values will be a lower bound for the positive threshold and an upper bound for the negative threshold, respectively. 3. At this step, an upper bound for the positive threshold, $x_u$, and a lower bound for the negative threshold, $x_l$, are estimated using the rule of thumb, namely the fixed quantile rule with the upper 10% for the positive log-return values and the lower 10% for the negative log-return values. The corresponding indices u, l in the sorted lists can be found as u = p/1.111 and l = q/10, rounded to integer indices. 4. A threshold for the negative log-return values then ranges from $x_l$ to $x_{m-}$, and a threshold for the positive values lies between $x_{m+}$ and $x_u$, based on the n observations from $R_n$. Further, we shift the window to $R_n = [x_2, x_3, \dots, x_{n+1}]$ to estimate new values of the thresholds based on the previous n observations. The process is repeated until the entire data set R is exhausted.
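A sketch of one step of this moving-window procedure is given below; rounding the cut-off indices down to integers is an assumption, and the function returns the admissible ranges for the positive and negative thresholds produced by a single window.

import numpy as np

def threshold_ranges(window: np.ndarray):
    """Rule-of-thumb bounds from one window of n consecutive log-returns."""
    pos = np.sort(window[window >= 0])            # R_{n+}, increasing, p values
    neg = np.sort(window[window < 0])             # R_{n-}, increasing, q values
    x_m_plus, x_m_minus = np.median(pos), np.median(neg)
    u = min(int(len(pos) / 1.111), len(pos) - 1)  # upper 10% cut-off index (positives)
    l = min(int(len(neg) / 10), len(neg) - 1)     # lower 10% cut-off index (negatives)
    x_u, x_l = pos[u], neg[l]
    return (x_m_plus, x_u), (x_l, x_m_minus)      # positive range, negative range

# Sliding the window over the whole data set R (hypothetical):
# ranges = [threshold_ranges(R[j:j + n]) for j in range(len(R) - n + 1)]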
We chose a window of 300 days that was moved over the entire dataset, producing a domain of threshold existence as shown in Fig. 12. It is clear from Fig. 12 that certain values of the thresholds cannot be sustained over the entire period and must be updated from time to time. Similarly to the window of 300 trading days, Fig. 13 demonstrates how the ranges of the thresholds change as we move a window of 600 trading days across the available data. Once again, some values of the thresholds can persist for almost the entire period, while others can last only a few months and then must be replaced with an updated value. Based on Figs. 12 and 13, we can see that the choice of the threshold depends on the size of the data set used. Moreover, the rules of thumb, like the graphical approaches, require practitioners' involvement in studying the data before making a final choice of the thresholds.

Equipped with these results, we can compute a validity period τ(u) for a threshold u. Let n be the window of observation. By moving the window over the data set containing M values of log-returns, we obtain the sequences $X_{m+}$, $X_{m-}$, $X_u$, and $X_l$ containing the medians of the positive and negative log-returns and the upper 10% and lower 10% cut-offs for the positive and negative log-returns, respectively, as computed previously (a sketch of the computation follows below). 1. First, we compute the k threshold candidates $u_i$ for the positive log-returns as
$$\left\{ u^{+}_{i} \in \mathbb{R} \;\middle|\; u^{+}_{i} = \min(X_{m+}) + \frac{i}{k}\left( \max(X_u) - \min(X_{m+}) \right),\; i = 1, \dots, k \right\}.$$
In a similar fashion, we compute a set of threshold candidates $\{u^{-}_{i}\}_{i=1}^{k}$ for the negative log-returns, spaced evenly between $\min(X_l)$ and $\max(X_{m-})$. 2. For each threshold candidate $u_i$ found in the previous step, we compute its validity duration, i.e., the number of days $N_{u_i}$ the candidate falls within the admissible ranges. Set $N_{u_i} = 0$; for j = n, ..., M, if $x_{m+,j} < u_i < x_{u,j}$, then $N_{u_i} = N_{u_i} + 1$ for the positive threshold candidates; similarly, if $x_{l,j} < u_i < x_{m-,j}$, then $N_{u_i} = N_{u_i} + 1$ for the negative threshold candidates. 3. The probability that a threshold candidate will be changed on any given day is $\eta(u_i) = 1 - N_{u_i} / (M - n + 1)$.

Figure 14. The probability that a threshold of the log-return values will be changed on any given day, calculated over different data windows ranging from 25 to 6000 trading days.

Fig. 14 shows the probabilities that a particular choice of the threshold can be changed on any day, depending on the size of the dataset used, which is important to know especially when new data become available and are included for consideration. The more data we use to estimate the value of a threshold, the more likely the threshold is to stay unchanged; however, as we add data, the threshold should be reconsidered. This also brings up another issue: in many cases, statistical analysis is performed on a historical dataset that does not reflect the phenomenon we study at the present time.
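The validity-duration computation of steps 1-3 can be sketched as follows for the positive thresholds; the arrays x_m_plus and x_u are assumed to hold, window by window, the lower and upper admissible bounds produced by the rule-of-thumb procedure above (the negative case is symmetric).

import numpy as np

def eta_positive(x_m_plus: np.ndarray, x_u: np.ndarray, k: int = 100):
    """eta(u_i): probability that a positive threshold candidate changes on a given day."""
    lo, hi = x_m_plus.min(), x_u.max()
    candidates = lo + (np.arange(1, k + 1) / k) * (hi - lo)   # step 1: k candidates
    n_windows = x_m_plus.size                                  # equals M - n + 1
    etas = np.empty(k)
    for i, u in enumerate(candidates):
        n_valid = np.sum((x_m_plus < u) & (u < x_u))           # step 2: validity duration
        etas[i] = 1.0 - n_valid / n_windows                    # step 3: eta(u_i)
    return candidates, etas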
The statistics of extreme events under threshold uncertainty can be described with the help of a simple model in which the log-return of the index and the value of the threshold are treated as random variables that may change inconsistently. The model that we adopt and modify was first put forward by us in [71] to describe the behavior of systems close to a threshold of instability and was used later in [72, 73] to model survival under uncertainty and events of mass extinction. In the model of extreme events under threshold uncertainty, the current value of the log-return is quantified by a random number x ∈ [0, 1] drawn according to some probability distribution function Pr{x < z} = F(z). The threshold value, which might change at any time once new data become available, is another random number y ∈ [0, 1], drawn from another probability distribution function, Pr{y < z} = G(z). We assume that the rate of daily variations of the log-return values is greater than or equal to that of the threshold values, ultimately determining whether the current log-return value is extreme or not. In fact, it is the relative rate of random updates of x and y, described in our model by the probability of inconsistency η ≥ 0, that actually determines the statistics of extreme events. At time t = 0, the value of the log-return x is chosen with respect to the probability distribution function F, and the value of the threshold y is chosen with respect to the probability distribution function G. If y ≥ x, the event is regular and the process keeps going to time t = 1. At time t ≥ 1, either, with probability η ≥ 0, the value of the log-return x is drawn anew from the probability distribution function F but the threshold keeps the value y it had at time t − 1, or, with probability 1 − η, the value of the log-return x is updated anew from the probability distribution function F and the threshold y is also updated with respect to the probability distribution function G. As long as the value of the threshold is not exceeded (x ≤ y), the event is classified as regular, but the event is extreme whenever x > y. The value of the probability η > 0 can be interpreted as the reciprocal of the characteristic time interval during which the threshold level remains unchanged, and, vice versa, the probability that a threshold of the log-return values will be changed on any given day can be calculated as the inverse of the time interval during which the threshold value stays put. The probability that a threshold of the log-return values will be changed on any given day, calculated over the different data windows ranging from 25 to 6000 trading days, is shown in Fig. 14.

We are interested in the probability distribution $P_{\eta}(t)$ of the interval duration t between sequential extreme events for given probability distribution functions F and G and a given value of the probability η ≥ 0. A straightforward computation [71, 72] shows that, independently of the value of η, the initial probability of choosing the log-return value below the threshold (to start the process) is $\int_0^1 F(z)\, dG(z)$. The general formulas for $P_{\eta}(t)$ can be found in [71, 72]. When η = 0, the values of the log-return and the threshold are updated coherently, and the resulting probability function decays exponentially fast with t for any choice of the probability distribution functions F and G. In particular, if the threshold level and the log-returns are drawn uniformly at random, dF(z) = dG(z) = dz, over the interval [0, 1], the occurrence of an extreme event is statistically equivalent to simply flipping a fair coin, for which head and tail come up equiprobably,
$$(4.9)\qquad P_{\eta=0}(t) = \frac{1}{2^{\,t+1}}.$$
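The decay laws quoted here and below can be checked with a short Monte Carlo sketch of the model; for simplicity it assumes uniform F and G and treats every inter-event interval as an independent episode whose threshold is drawn afresh from G, which is the averaging over G that reproduces the mixture formula (4.11) in the η = 1 case.

import numpy as np
from collections import Counter

def simulate_P(eta: float, n_episodes: int = 100_000, t_max: int = 100_000, seed: int = 0):
    """Estimate P_eta(t): the waiting time t until the extreme event x > y."""
    rng = np.random.default_rng(seed)
    counts = Counter()
    for _ in range(n_episodes):
        y = rng.random()                  # threshold drawn from G at t = 0
        t = 0
        while t <= t_max:                 # cap the rare, extremely long episodes
            x = rng.random()              # log-return drawn anew from F every day
            if x > y:                     # extreme event: record the interval length
                counts[t] += 1
                break
            if rng.random() >= eta:       # with probability 1 - eta the threshold is redrawn
                y = rng.random()
            t += 1
    total = sum(counts.values())
    return {t: c / total for t, c in sorted(counts.items())}

# simulate_P(0.0) decays like 1 / 2**(t + 1), Eq. (4.9);
# simulate_P(1.0) decays like 1 / ((t + 1) * (t + 2)), Eq. (4.11) below.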
On the contrary, when the threshold level is kept unchanged, η = 1, the statistics of the intervals between sequential extreme events decays asymptotically algebraically for t ≫ 1 [71, 72]. For example, in the special case of uniformly random updates of the threshold and log-return values, the probability function (4.10) decays algebraically as
$$(4.11)\qquad P_{\eta=1}(t) = \frac{1}{(t+1)(t+2)}.$$
For a general family of invariant measures of a map of the interval [0, 1] with a fixed neutral point, defined by the probability distributions F and G absolutely continuous with respect to the Lebesgue measure, i.e.,
$$(4.12)\qquad dF(z) = (1 + \alpha)\, z^{\alpha}\, dz, \qquad \alpha > -1,$$
Eq. (4.10) gives a probability function that exhibits a power-law asymptotic decay for t ≫ 1 [71, 72, 73] (Eq. (4.13)). The asymptotic decay of (4.13) is algebraic,
$$(4.14)\qquad P_{\eta=1}(\tau) \simeq \frac{1}{\tau^{2+\beta}},$$
for β > −1, for any choice of the distributions F and G, although it is mainly the character of the probability function G that determines the rate of decay of $P_{\eta=1}(t)$ with time. In the limiting case when the support of the probability distribution G(x) determining the choice of the threshold level is concentrated close to x = 1, i.e., is zero everywhere in the interval [0, 1] except for a small interval of length ε adjacent to 1, the Zipf power-law asymptote ∝ t^{−1−ε}, ε > 0, follows directly from (4.13) [71, 72, 73]. A possible modeling function for such a probability distribution, forming a thin spike as x → 1, can be chosen with the probability density supported on the interval [0, 1).

Figure 15. The statistics of time intervals (in days) between sequential extreme events for the fixed threshold values u = 0.016 and u = −0.017, for positive and negative fluctuations of the log-return, respectively. The solid line ∝ t^{−2}, corresponding to the asymptotic quadratic decay (4.11), is given for reference.

The straight line shown in Fig. 15 represents the hyperbolic decay of the time intervals between the extreme events. Once we assume statistics for the log-returns, or define ranges for the threshold values using the different methods shown in Sec. 4.1, we introduce an uncertainty in the threshold value that depends on the method itself and on the amount of data we use. We measure this uncertainty to justify an emergency scale representing the extreme events. Using the rule of thumb (Sec. 4.2.3), we determined the ranges for the threshold values depending on the window of observation (Fig. 12) for both positive and negative log-return values. For each admissible positive choice of a threshold u, we determine the probability η(u) that the threshold u can be changed on any given day (Fig. 14); the degrees of uncertainty have been assessed by means of the Shannon entropy for each threshold value and window of observation [73],
$$(5.1)\qquad H(u) = -\eta(u)\ln \eta(u) - \big(1 - \eta(u)\big)\ln\big(1 - \eta(u)\big),$$
where η(u) is the probability that the threshold will be changed on any given day (Fig. 14). The case with negative values of the threshold is analogous. We observed the Red Queen state, and the three emergency scales can be readily interpreted. (i) If the amount of data is not sufficient (the solid line corresponding to the 18-day window of observation in Fig. 16), then the uncertainty curve forms a skewed profile attaining a single maximum of 0.69295 at the threshold value 0.008272. In this situation, an observer's perception of the events is reminiscent of the Red Queen from "Through the Looking-Glass, and What Alice Found There" by Lewis Carroll [21], who said "When you say hill, I could show you hills, in comparison with which you'd call that a valley". Our understanding of whether the events are extreme or not is very limited, and the uncertainty is blurry. As events become more severe, our uncertainty that the events are extreme decreases.
The observer realizes that the events are extreme, but a precise point at which the events turn severe cannot be determined. We call this case the Red Queen state. (ii) As the window of observation becomes larger, 25 days, for instance, the uncertainty curve exhibits two maxima, indicating that the amount of data is sufficient. This happens because H(η) attains its maximum at η = 1/2, and the η curve takes the value 1/2 twice (Fig. 14). As we further extend the window, the curve sharpens into narrow peaks (Fig. 16). The latter clearly separate the threshold values into three regions: three levels of emergency. The locations of the peaks of the curves are summarized in Table 2. The locations of the two peaks are not sensitive to the window of observation. (I) Subcritical. In region I, the threshold values are too small to raise concern about extreme events. Then we can observe a spike, with the degree of uncertainty attaining its first maximum. This extremum indicates a transition to the next kind of uncertainty. For the window of 2000 days, region I lies in the interval [0, 0.00534) (see Fig. 16, Table 2). (II) Critical. In region II, the interval [0.00534, 0.01672) for the 2000-day window, the uncertainty is conceptualized by the question whether the magnitude of the event is already critical or extreme, or not yet. Further, we see another jump of uncertainty, reaching 0.69034. At this point we are certain that events are not regular anymore. (III) Extreme. In the case of the window of 2000 days, the interval [0.01672, ∞) constitutes region III. We consider all events in this region extreme, with our uncertainty decreasing as the threshold values increase. With the analysis presented above, we define an emergency scale of three levels based on the three regions of the threshold values corresponding to the three peaks of the uncertainty curve. This emergency scale is not sensitive to the size of the window, provided a sufficient amount of data is considered.

5. Conclusion. The S&P 500 time series in the period from January 2, 1980, till December 31, 2018, exhibits an asymmetrical skewness of the distribution, with right and left power-law tails. Multifractal detrended fluctuation analysis of the log-return time series for the S&P 500 index reveals a scale-invariant structure of the fluctuations at both small and large magnitudes, as well as short- and long-range dependence on different time scales. Moreover, the segments with small fluctuations have a random-walk-like structure, whereas the segments with large fluctuations have a noise-like structure. We have reviewed different methods of threshold selection and studied the extreme events present in the time series using different statistical approaches. We found that the distribution of the weekly-return data can be described by a combination of different distributions. Based on a graphical approach to threshold selection, we chose separate thresholds for the positive and negative values of the log-return, 0.016 and −0.017, respectively. With this choice, we registered 507 instances of extreme events corresponding to market rises and 462 extreme events related to market declines. With a few exceptions, the exceedances over the threshold (under the threshold for the negative log-return values) follow the GPD. The rule of thumb showed that a threshold value depends on the width of the observation window, and the threshold can change at any moment once new data become available. The uncertainty of the threshold values can be determined by the probability of changing the threshold on any given day.
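As a compact illustration of the resulting scale, the region boundaries reported above for the 2000-trading-day window (Table 2, Fig. 16) can be written as a simple classifier of positive log-return magnitudes; the function is purely illustrative, and the boundaries shift slightly for other windows.

def emergency_level(magnitude: float) -> str:
    """Three-level scale for the 2000-trading-day window (boundaries from Table 2)."""
    if magnitude < 0.00534:
        return "I: subcritical"
    if magnitude < 0.01672:
        return "II: critical"
    return "III: extreme"

# emergency_level(0.02)  ->  'III: extreme'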
The moment we assume statistics of the distributions, or that the dataset is fixed, we introduce an uncertainty in the threshold value, which can be resolved by emergency scales that are robust to variation in the size of the dataset. We suggested a statistical model that describes the registration frequency of extreme events under threshold uncertainty. Our model fits well the statistics of occurrence of the extreme values in the S&P 500 time series.

References
Tolley's Handbook of Disaster and Emergency Management: Principles and Best Practices; Disaster and Emergency Management Systems
A Unified Localizable Emergency Events Scale
Rogue Waves in a Multistable System
Extreme events in epileptic EEG of rodents after ischemic stroke
Furthering Development of a Unified Emergency Scale Using Thurstone's Law of Comparative Judgment: A Progress Report
Risk aversion, uncertain information and market aptitude
Risk aversion, uncertain information and market aptitude: Reexamining the evidence
Bayesian threshold selection for extremal models using measures of surprise
WHO, Coronavirus disease 2019 (COVID-19) situation report 43
Estimates of the severity of coronavirus disease 2019: a model-based analysis
WHO, Coronavirus disease 2019 (COVID-19) situation report 46
Tackling COVID-19: A Problem So Big, You Can See It From Space
A fiasco in the making? As the coronavirus pandemic takes hold, we are making decisions without reliable data
Through the Looking-Glass, and What Alice Found There
Calculating and Comparing Security Returns is Harder than you Think: A Comparison between Logarithmic and Simple Returns
Dynamic asset trees and Black Monday
Anatomy of a meltdown: The risk neutral density for the S&P 500 in the fall of 2008
Statistics for Long-Memory Processes
Introduction to multifractal detrended fluctuation analysis in Matlab
Multifractal detrended fluctuation analysis: Practical applications to financial time series
Multifractality in asset returns: theory and evidence
Multifractal detrended fluctuation analysis of nonstationary time series
Five Years of Phase Space Dynamics of the Standard & Poor's
Multifractal analysis of financial markets: a review
An Introduction to Statistical Modeling of Extreme Values
Limiting forms of the frequency distribution of the largest and smallest member of a sample
Sur la distribution limite du terme maximum d'une série aléatoire
Analysis of Extreme Value at Risk to Amazon Stocks
The beta Gumbel distribution
Comparison of estimation methods for Fréchet distribution with known shape
The Weibull distribution: a handbook
A comparative review of generalizations of the Gumbel extreme value distribution with an application to wind speed data
Extreme Value and Related Models with Applications in Engineering and Science
Small-sample one-sided testing in extreme value regression models
Statistical inference using extreme order statistics
A review of extreme value threshold estimation and uncertainty quantification
EVIM: A Software Package for Extreme Value Analysis in MATLAB
Models for exceedances over high thresholds (with discussion)
A discussion on mean excess plots
The QQ-estimator and heavy tails
How to make a Hill plot
Some problems of flood analysis. Water Resources
A stochastic model for flood analysis
Concepts, Techniques, and Tools
Mean Residual Life: Theory and Applications, Defence Technical Information Center
Refined estimators of the extreme value index
A moment estimator for the index of an extreme-value distribution
Improved threshold diagnostic plots for extreme value analyses
L-moments for automatic threshold selection in extreme value analysis
GPD threshold estimation using measure of surprise
Exploiting Structure of Maximum Likelihood Estimators for Extreme Value Threshold Selection
Automated threshold selection methods for extreme wave analysis
Threshold detection for the generalized Pareto distribution: Review of representative methods and application to the NOAA NCDC daily rainfall database
Sequential selection procedures and false discovery rate control
Automated threshold selection for extreme value analysis via ordered goodness-of-fit tests with adjustment for false discovery rate
Probability weighted moments: Definition and relation to parameters of several distributions expressible in inverse form
Estimating the stable index α in order to measure tail thickness: A critique
On optimising the estimation of high quantiles of a probability distribution
Statistical Analysis of Extreme Values: With Applications to Insurance, Finance, Hydrology and Other Fields
Reiss and Thomas' automatic selection of the number of extremes
Threshold Selection in Univariate Extreme Value Analysis
A System close to a threshold of instability
Survival under Uncertainty: An Introduction to Probability Models of Social Structure and Evolution
Grammar of Complexity: From Mathematics to a Sustainable World