key: cord-0747336-idxv8koy authors: Teplova, Tamara; Gurov, Sergei title: Nonlinear intraday trading invariance in the Russian stock market date: 2022-04-10 journal: Ann Oper Res DOI: 10.1007/s10479-022-04683-7 sha: cc6b87a3586af779196d83404c5a65a9e92d3f7b doc_id: 747336 cord_uid: idxv8koy Using high-frequency transaction-level data for liquid Russian stocks, we empirically reveal a joint nonlinear relationship between the average trade size, log-return variance per transaction, trading volume, and the asset price level described by the Intraday Trading Invariance hypothesis. The relationship is also confirmed during stock market crashes. We show that the invariance principle explains a significant fraction of the endogenous variation between market activity variables at the intraday and daily levels. Moreover, our tests strongly reject the mixture of distributions hypotheses that assume linear relationships between log-return variance and transaction intensity variables such as trading volume or the number of transactions. We demonstrate that the increase in the ruble risk transferred by one bet per unit of business time was accompanied by the rise in the average spread cost. Different aggregation schemes are used to mitigate the impact of errors-in-variables effects. Following the predictions of the Information Flow Invariance hypothesis, we also study the relationship between trading activity and the information process approximated by either the flows of news articles or Google relative search volumes of Russian stocks over the 2018–2021 period. The evidence suggests that a sharp increase in the number of retail investors who entered the Moscow Exchange in 2020 entailed a higher synchronization between trading activity and search queries in Google since February 2020, in contrast to the arrival rates of news articles. 
The changes are driven by the increasing influence of the trading behavior of individual investors using Google Search rather than professional news services as the main source of information. In this paper, we test the implications of the Intraday Trading Invariance (ITI) hypothesis (Andersen et al., 2020) -the extension of the Market Microstructure Invariance (MMI) hypothesis about the variations in microscopic and macroscopic market characteristics over long intervals. Using tick-by-tick transaction-level data for the most liquid stocks of Russian issuers included in the MOEX Russia Index from January 2014 to July 2018, we test the quantitative predictions regarding the average trade size, log-return variance per transaction, trading volume, and the asset price level over five-minute intervals. Since we use the log-linear regression specifications, we include stocks with the highest level of liquidity in the sample to minimize the number of intervals without any trading activity. In addition, we use the quantitative predictions of the Information Flow Invariance (IFI) hypothesis (Kyle & Obizhaeva, 2017b) to investigate possible differences in relations between trading activity in the Russian stock market and different flows of information used in investors' decision-making processes before and during the Covid-19 crisis. The main contribution of this study lies in estimating potential deviations from invariance during major stock market falls from January 2014 to July 2018. In particular, we analyze the plunges in stock prices on March 3, 2014, during the Russian ruble crisis in the middle of December 2014, and during the stock market crash on April 6 and 9, 2018. To the best of our knowledge, the paper also presents the first study of the impact of the increased number of active retail investors in a stock market using the quantitative predictions of microstructure invariance theory about the relationship between information flows and trading flows. 
The sample period spans from the first week of August 2018 to the last week of June 2021. According to the client statistics provided on the Moscow Exchange (MOEX) website, there was a sharp increase in individual investor participation in 2020 (almost 5 million new retail investors over 2020). Moreover, the total inflow of private investors' funds increased by 301 billion rubles (around 4.2 billion dollars according to the average U.S. dollar/Russian ruble exchange rate in 2020) from January 2020 to December 2020. The use of invariance-implied methodology leads to new insights about the changes in the relationship between the trading and the information processes during significant price/volume events in financial markets. The main quantitative prediction of the ITI hypothesis is as follows: log-return variance per transaction is proportional to the −2 power of the average trade size times the stock price. Andersen et al. (2020) confirm this relationship using data on the E-mini S&P 500 futures market. Results supporting this nonlinear relationship have also been reported for individual U.S. stocks. Bae et al. (2020) show that invariance principles also hold for the Korean stock market. Kyle and Obizhaeva (2019) show that the invariance-implied market impact cost model gives more accurate estimates of price declines during bet-induced stock market crashes than some alternative models (e.g., Frazzini et al., 2018; Wurgler & Zhuravskaya, 2002). Using several high-frequency measurement techniques, we test the log-linear regression specifications related to various assumptions about the relationship between trading variables. According to our results, the invariance theory explains the relations between market activity variables for Russian liquid stocks over short time intervals much better than alternative hypotheses assuming linear relationships between such variables. 
Applying the aggregation schemes that mitigate the errors-in-variables effects leads to an increase in the coefficients of determination while maintaining the economic closeness to the invariance-implied quantitative predictions. We also confirm the ITI relationship for the set of trading days characterized by the largest price declines (between January 2014 and July 2018). Additionally, we briefly discuss the intraday dynamics of the bid-ask spread. The Market Microstructure Invariance hypothesis allows us to look at the Russian stock market from another angle. In this paper, we examine whether there was a changepoint (an abrupt shift) in the dynamics of the estimates characterizing the relationship between public information flows (the number of news articles provided by Thomson Reuters Eikon or Google relative search volumes of Russian stocks) and trading activity. We focus on these two sources of information by making the following assumptions. First, the same business-time clock $R$ governs both Russian retail investors' aggregate behavior and Google search activity of Russian stocks. Second, the professional market participants' aggregate activity and the arrival rate of news articles provided by Thomson Reuters Eikon unfold in the same business-time clock $P$. Third, the market-wide business-time clock $M$ is a linear combination of the two clocks, $M = \alpha_R \cdot R + \alpha_P \cdot P$, with positive time-varying coefficients $\alpha_R$, $\alpha_P$ such that $\alpha_R + \alpha_P = 1$, depending on external factors. Intuitively, our conjecture about the possibility of a significant growth of $\alpha_R$ relative to $\alpha_P$ rests primarily on the fact that retail investor participation in the Russian stock market increased drastically in 2020. 
Since local individual investors and professional market participants (e.g., institutionals or asset managers) tend to use different sources of financial information, 1 as we have already noticed, we expect that market-wide trading activity started to be more synchronized with Google search activity of Russian stocks. Our regression analysis confirms this finding. We approximate the information flows by the negative binomial process and show that an abrupt shift upwards in the estimates characterizing this relationship occurred in the last week of February 2020, when the Russian stock market faced a huge inflow of private investors' funds. We next review the literature that analyzes the link between the price and information processes and trading activity. Since the seminal paper by Bachelier (1900) , the connection between the fluctuations of asset prices and several characteristics of the trading process has been under scrutiny. Bachelier's ideas that the movement of stock prices takes the form of a random walk and price volatility is proportional to the square root of time were the main assumptions in the initial attempts to model price-transaction intensity relations. Osborne (1962) was the first to formulate the hypothesis that the return variance is proportional to trading volume and the number of transactions. Many subsequent studies (e.g., Clark, 1973; Crouch, 1970; Epps & Epps, 1976; Jain & Joh, 1988; Tauchen & Pitts, 1983; Wood et al., 1985) also found positive correlations between trading volume and the absolute value of price changes. 2 One of the main theoretical explanations for this phenomenon was the assumption that price changes are selected from a set of distributions characterized by different variances. This speculation became known as the "mixture of distributions hypothesis" (MDH). As a mixing variable, researchers used trading volume V (e.g., Westerfield (1977) ) or the number of transactions N (e.g., Ané & Geman, 2000; Jones et al., 1994) . 
In the previously mentioned article by Andersen et al. (2020), the specifications were referred to as the Mixture-of-Distributions-Hypothesis in Volume ($\sigma^2 \sim V$) and the Mixture-of-Distributions-Hypothesis in Transactions ($\sigma^2 \sim N$). Andersen et al. (2020) were also the first to extend the invariance principles to an intraday dimension. The market structure of the Moscow Exchange provides an opportunity to develop the topic introduced in Andersen et al. (2020) by turning to panel analysis. Like the CME Group Globex platform, the Moscow Exchange is a centralized marketplace. The tick sizes and margin requirements are adjusted by exchange officials promptly, and small lot sizes prevent significant market frictions. The relationship between macroscopic market characteristics and information available in financial markets has been investigated in numerous papers. In some articles, small or even insignificant relationships between public information and market activity were found (e.g., Berry & Howe, 1994; Mitchell & Mulherin, 1994). Several other papers documented a significant relationship between different information flows and several measures of trading intensity (e.g., Da et al., 2011; Heston & Sinha, 2017; Tetlock, 2007). Kyle and Obizhaeva (2017b) were the first to derive empirically testable predictions about the trading and information processes. The authors demonstrate that trading activity and information flow approximated by the arrival rate of news articles are highly synchronized. The researchers implemented count regression tests and showed that the number of bets per news item about a given firm is approximately constant across U.S. stocks. In other words, they showed that an invariant "amount of money changes hands on average per news article." 
In this paper, we do not test this prediction using news and transaction-level data for Russian stocks; rather, we use this methodology to assess the impact of the significant rise in the number of retail investors in 2020. In the next section, we provide a more detailed formulation of the Intraday Trading Invariance hypothesis and the two specifications of the Mixture-of-Distributions-Hypothesis: MDH-V and MDH-N. Before formulating the main testable hypotheses, we introduce the necessary notation, following the methodology described in Andersen et al. (2020). The sample starts at time 0 and contains $D$ trading days. Each day consists of $T$ intraday intervals of length $\Delta t = 1/T$. Thus, the sample contains $D \cdot T$ non-overlapping intervals indexed by $\tau = 1, \ldots, D \cdot T$. For further convenience, we also introduce a double-index notation: $d \in \mathcal{D} = \{1, \ldots, D\}$ for a trading day and $t \in \mathcal{T} = \{1, \ldots, T\}$ for an intraday interval. The double-index notation can be converted to the single-index notation according to the following rule: $\tau = (d - 1) \cdot T + t$. For each stock $i \in I$ and each interval $\tau$ from the sample, we denote random realizations of a variable by placing the "tilde" sign on top of it: $\tilde{Q}_{i\tau}$ is the average number of shares in one transaction for the interval $\tau$; $\tilde{\sigma}^2_{i\tau}$ is the log-return variance for the interval $\tau$ (per unit of time); $\tilde{N}_{i\tau}$ is the number of transactions per unit of time; $\tilde{V}_{i\tau}$ is the cumulative trading volume (in the number of shares per unit of time); $\tilde{P}_{i\tau}$ is the average trading price for the interval $\tau$ (in rubles per share). Lower case letters, in turn, denote the logarithms of these variables: $\tilde{q}_{i\tau} = \ln \tilde{Q}_{i\tau}$; $\tilde{s}_{i\tau} = \ln \tilde{\sigma}^2_{i\tau}$; $\tilde{\gamma}_{i\tau} = \ln \tilde{N}_{i\tau}$; $\tilde{v}_{i\tau} = \ln \tilde{V}_{i\tau}$; $\tilde{p}_{i\tau} = \ln \tilde{P}_{i\tau}$. We next assume that unobservable bet volume coincides with observable trading volume. 3 Given this assumption, the following equalities automatically hold: $\tilde{V}_{i\tau} = \tilde{Q}_{i\tau} \cdot \tilde{N}_{i\tau}$ and $\tilde{v}_{i\tau} = \tilde{q}_{i\tau} + \tilde{\gamma}_{i\tau}$. 
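With this notation, the ITI prediction stated earlier (log-return variance per transaction proportional to the −2 power of trade size times price) can be written in the log-linear form that the regressions use. The short derivation below is a sketch: the symbol $N$ for the number of transactions and the unspecified constant $c$ are notational assumptions, and indices are suppressed.

```latex
\begin{align*}
\frac{\sigma^2}{N} &\propto (Q \cdot P)^{-2} \\
\ln \sigma^2 - \ln N &= c - 2\,(\ln Q + \ln P) \\
s - \gamma + 2p &= c - 2q
\end{align*}
```

The last line is why the tests regress $s - \gamma + 2p$ on $q$ and compare the slope with −2.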
When determining the expectation of a random variable on the interval $\tau$ (conditional on the information available at time $\tau - 1$), we remove the "tilde" sign. For instance, $s_{i\tau} = E_{\tau-1}\{\tilde{s}_{i\tau}\}$. The methodology for estimating these conditional expectations is based on the multiplicative error model (MEM). 4 The dynamics of strictly positive random variables $\tilde{Y}_{i\tau}$ is defined as follows:

$\tilde{Y}_{i\tau} = Y_{i\tau} \cdot \tilde{U}_{i\tau}$, (1)

where $\tilde{U}_{i\tau}$ are strictly positive independent and identically distributed random variables with $E\{\tilde{U}_{i\tau}\} = 1$ and $Var\{\tilde{U}_{i\tau}\} < \infty$. This approach assumes that for each stock $i$ and each interval $\tau$, the estimate $\tilde{Y}_{i\tau}$ is unbiased and has finite variance. According to Andersen et al. (2020), the corresponding logarithmically transformed quantities $\tilde{y}_{i\tau} = \ln \tilde{Y}_{i\tau}$ can be represented as

$\tilde{y}_{i\tau} = y_{i\tau} + \tilde{e}_{i\tau}$, (2)

where $y_{i\tau}$ is the conditional expectation of the logarithmic value of a variable, and $\tilde{e}_{i\tau}$ is a random variable with $E\{\tilde{e}_{i\tau}\} = 0$ and $Var\{\tilde{e}_{i\tau}\} < \infty$. According to Jensen's inequality, $E\{\ln \tilde{U}_{i\tau}\} < \ln E\{\tilde{U}_{i\tau}\} = 0$. Therefore, the constant $c = E\{\tilde{y}_{i\tau}\} - \ln Y_{i\tau} = E\{\ln \tilde{U}_{i\tau}\}$ is negative. Equation (2) implies that significant errors-in-variables problems may arise if the variance of the error term $Var\{\tilde{e}_{i\tau}\}$ is large. To mitigate these effects, we aggregate the variables through the aggregation scheme (3), resulting in $D$ daily observations for each stock $i \in I$. We also use the aggregation technique (4) to obtain $T$ separate time-of-day observations for each stock $i \in I$. According to the law of large numbers, the contribution of the error term $\tilde{e}_{i\tau}$ should decrease considerably after temporal aggregation. The regression-based framework for testing the Intraday Trading Invariance hypothesis and the MDH specifications is described in Appendix 8.2. We use tick-by-tick data on trades and bid-ask quotes ("Top of the book" specification) from January 6, 2014 to July 31, 2018 provided by the Moscow Exchange. The sample consists of only those 32 shares of Russian issuers that were included in the MOEX Russia Index during all 55 months. 
We emphasize once again that such a strict limitation is caused by testing log-linear relations over short intervals. We eliminate those observations that do not belong to the continuous trading period of the main trading session. We aggregate the observations over five-minute intervals for each stock and each trading day. For each interval $\tau$ and each stock $i$, we estimate market activity variables in the following way. $P_{i\tau}$ is calculated as the average trading price of stock $i$ over $\tau$; $Q_{i\tau}$ is the average number of shares of stock $i$ in one transaction over $\tau$; $N_{i\tau}$ is the number of transactions in stock $i$ over the interval $\tau$; $V_{i\tau}$ is the cumulative trading volume (in the number of shares of stock $i$) over $\tau$. We use a standard unbiased high-frequency estimate of the realized return variance (see Andersen et al., 2003). We first calculate the midpoint of ask and bid quotes at the end of each minute and then sum the five consecutive squared one-minute log-returns to obtain an estimate of $\sigma^2_{i\tau}$ for each five-minute interval. Since we use the log-linear regression specifications, we remove all stock-interval observations with zero trading volume or a realized variance estimate close to 0 (less than $10^{-30}$ in absolute value). Table 1 presents summary statistics for trading variables after data cleaning. Although 65.71% of the observations in the initial sample are omitted, the final sample still contains a large number of stock-interval observations: 1,279,833. In this section, we test the quantitative predictions of invariance hypotheses, as well as two specifications of the mixture-of-distributions hypothesis (MDH-N and MDH-V) over the 2014-2018 period. We test the intraday trading invariance hypothesis over the 2018-2021 period in Sect. 7. 
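The construction of the five-minute variables described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' code: the function name and input layout are hypothetical, the "average trading price" is taken as a simple (not volume-weighted) mean, counts are per interval rather than rescaled to a unit of time, and six midquotes (the one preceding the interval plus five end-of-minute values) are assumed to be needed for five one-minute log-returns.

```python
import math


def five_minute_variables(trades, minute_mid_quotes):
    """Build log market-activity variables for one stock and one interval.

    trades: list of (price, shares) tuples executed inside the interval.
    minute_mid_quotes: six bid-ask midpoints (assumed layout: the midpoint
        just before the interval, then one at the end of each minute).
    Returns a dict of logs, or None if the observation must be dropped.
    """
    if not trades:
        return None
    n_trades = len(trades)
    volume = sum(shares for _, shares in trades)
    avg_price = sum(price for price, _ in trades) / n_trades  # simple mean (assumption)
    avg_size = volume / n_trades

    # Realized variance: sum of squared one-minute midquote log-returns.
    log_returns = [
        math.log(later / earlier)
        for earlier, later in zip(minute_mid_quotes, minute_mid_quotes[1:])
    ]
    realized_var = sum(r * r for r in log_returns)

    # Log-linear tests cannot use zero volume or (near-)zero variance.
    if volume == 0 or realized_var < 1e-30:
        return None

    return {
        "p": math.log(avg_price),
        "q": math.log(avg_size),
        "gamma": math.log(n_trades),
        "v": math.log(volume),
        "s": math.log(realized_var),
    }
```

By construction the output satisfies the identity v = q + γ, since volume equals average trade size times the number of transactions.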
The main testable log-linear specification is as follows:

$s_{ik} - \gamma_{ik} + 2p_{ik} = c + \beta \cdot q_{ik} + e_{ik}$,

where $k$ is an index for $\tau$ (without aggregating five-minute observations), $d$ (averaging the five-minute observations for each activity variable and each stock across intervals within a given trading day), or $t$ (averaging the observations for each activity variable, each stock, and each intraday interval across trading days). To test the MDH hypotheses, we exclude the price level, so the regressand is $s_{ik} - \gamma_{ik}$ in this case. In the first step, we perform a regression analysis exploiting five-minute observations. Figure 1 shows a cloud of 1,279,833 points representing $\ln(\sigma^2_{i\tau} P^2_{i\tau} / N_{i\tau})$ against $\ln Q_{i\tau}$ for 32 stocks included in the MOEX Russia Index from January 2014 to July 2018. Different colors represent different stocks henceforth. We also add the line $s_{i\tau} - \gamma_{i\tau} + 2p_{i\tau} = 2.364 - 2q_{i\tau}$, where the slope is fixed at −2 and the intercept is estimated by OLS regression. We can see that observations cluster along the plotted line. The fitted line is $s_{i\tau} - \gamma_{i\tau} + 2p_{i\tau} = 2.388 - 2.003q_{i\tau}$ with Driscoll-Kraay standard errors equal to 0.025 and 0.003, respectively. The coefficient of determination $R^2$ is 0.779. It is worth noting that the hypothesis that the slope is equal to −2 is not rejected at the 5% significance level.

Fig. 1 The figure plots $s_{i\tau} - \gamma_{i\tau} + 2p_{i\tau}$ on the vertical axis against $q_{i\tau}$ on the horizontal axis for each of the 32 stocks of Russian issuers included in the MOEX Russia Index from January 2014 to July 2018, where $s_{i\tau}$ is the logarithmic value of log-return variance, $\gamma_{i\tau}$ is the logarithmic value of the transaction rate, $p_{i\tau}$ is the logarithmic value of the average price level, $q_{i\tau}$ is the logarithmic value of the average number of stocks per transaction. All values are computed at five-minute sampling frequency. The solid line is $s_{i\tau} - \gamma_{i\tau} + 2p_{i\tau} = 2.364 - 2q_{i\tau}$, where the intercept is estimated from an OLS regression with the slope fixed at −2.
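The two lines reported above, one with the slope fixed at −2 and one with a freely estimated slope, correspond to two simple least-squares computations. The sketch below uses hypothetical function names and plain Python; it returns point estimates and $R^2$ only and does not compute the Driscoll-Kraay standard errors used in the paper.

```python
def fixed_slope_intercept(q, y, slope=-2.0):
    """OLS intercept when the slope is imposed: the mean of y - slope * q."""
    return sum(yi - slope * qi for qi, yi in zip(q, y)) / len(q)


def ols_fit(q, y):
    """Unrestricted OLS of y on q; returns (intercept, slope, r_squared)."""
    n = len(q)
    q_bar = sum(q) / n
    y_bar = sum(y) / n
    sxx = sum((qi - q_bar) ** 2 for qi in q)
    sxy = sum((qi - q_bar) * (yi - y_bar) for qi, yi in zip(q, y))
    slope = sxy / sxx
    intercept = y_bar - slope * q_bar
    ss_res = sum((yi - intercept - slope * qi) ** 2 for qi, yi in zip(q, y))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return intercept, slope, 1.0 - ss_res / ss_tot
```

Here `y` stands for the regressand $s - \gamma + 2p$ and `q` for the log average trade size; comparing the slope from `ols_fit` with −2 is the point-estimate counterpart of the invariance test.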
The results of testing this relationship for individual stocks are the following: all 32 slope coefficients are significantly higher than −2; they range from −1.105 to 0.207. 5 The average is −0.177 and the median equals −0.099. 6 As a robustness check, we test the invariance hypothesis using high-frequency data for each month in our sample separately. Figure 2 displays the slope coefficients on the $q_{i\tau}$ variable across months. All coefficients are economically close to the theoretical value of −2. Figure 3 shows the relationships between the logarithms of log-return variance $s_{i\tau}$ and the number of transactions $\gamma_{i\tau}$ (a) and between the logarithms of log-return variance $s_{i\tau}$ and trading volume $v_{i\tau}$ (b) for 32 stocks included in the MOEX Russia Index from January 2014 to July 2018. We note that the linear relationships between the corresponding variables predicted by the MDH-N and MDH-V hypotheses are not observed in the cross-section. For the entire sample, the fitted line is $s_{i\tau} = -18.235 + 0.419\gamma_{i\tau}$ with Driscoll-Kraay standard errors equal to 0.021 and 0.004, respectively. The coefficient of determination $R^2$ equals 0.048. For individual stocks, the slope coefficients range from 0.254 to 1.113. The mean value is 0.741, and the median is equal to 0.809. At the 5% significance level, the MDH-N hypothesis is not rejected for only 4 out of 32 stocks. For the entire sample, the fitted line is $s_{i\tau} = -17.109 + 0.101v_{i\tau}$ with Driscoll-Kraay standard errors equal to 0.019 and 0.001, respectively. The coefficient of determination $R^2$ is 0.017. The slope coefficients vary between 0.129 and 0.754 for individual stocks. The mean is 0.497, and the median equals 0.542. 

Fig. 2 The figure plots the slope coefficient $\beta$ for OLS regressions $s_{i\tau} - \gamma_{i\tau} + 2p_{i\tau} = c + \beta \cdot q_{i\tau} + e_{i\tau}$, estimated for each month separately. The confidence intervals are computed as ± 2 Driscoll-Kraay standard errors. The green dashed line indicates the theoretical value predicted by the invariance hypothesis.
The MDH-V hypothesis is rejected for all 32 stocks. As we previously mentioned, such estimates can be noisy due to the influence of error terms. Therefore, we next perform an analysis using different aggregation schemes. To reduce the influence of errors-in-variables effects, we first use the aggregation scheme (3). Figure 4 shows a cloud of 35,707 points, displaying $\ln(\sigma^2_{id} P^2_{id} / N_{id})$ versus $\ln Q_{id}$ for 32 stocks included in the MOEX Russia Index from January 2014 to July 2018. For comparison, we add the line $s_{id} - \gamma_{id} + 2p_{id} = 2.075 - 2q_{id}$. The slope is fixed at −2 as predicted by the invariance hypothesis, and the intercept is estimated using OLS regression. The fitted line is $s_{id} - \gamma_{id} + 2p_{id} = 3.287 - 2.090q_{id}$ with Driscoll-Kraay standard errors equal to 0.127 and 0.014, respectively. The coefficient of determination $R^2$ equals 0.933. It is worth noting that the hypothesis that the slope is equal to −2 is rejected due to small standard errors. Nevertheless, the coefficient remains economically close to the value implied by the invariance hypothesis. For individual stocks, the slope coefficients range from −2.027 to 0.582. The mean is −0.311, and the median is equal to −0.233. At the 5% significance level, the invariance hypothesis is not rejected for only 1 stock in our sample. We next perform a regression analysis for each month from our first sample. Figure 5 displays the slope coefficients on the $q_{id}$ variable across months. We observe dynamics similar to those found earlier in Fig. 2: a fall to the level of −2.3 and subsequent fluctuations in a range from −2.3 to −2.1.

Fig. 4 The figure plots $s_{id} - \gamma_{id} + 2p_{id}$ on the vertical axis against $q_{id}$ on the horizontal axis for each of the 32 stocks of Russian issuers included in the MOEX Russia Index from January 2014 to July 2018. $s_{id}$ is the logarithmic value of log-return variance, $\gamma_{id}$ is the logarithmic value of the transaction rate, $p_{id}$ is the logarithmic value of the average price, $q_{id}$ is the logarithmic value of the average number of stocks per transaction. The values are averaged in accordance with the scheme (3). The solid line is $s_{id} - \gamma_{id} + 2p_{id} = 2.075 - 2q_{id}$, where the intercept is estimated from an OLS regression with the slope fixed at −2.

Figure 6 shows the relationships between the logarithms of log-return variance $s_{id}$ and the number of transactions $\gamma_{id}$ (a) and between the logarithms of log-return variance $s_{id}$ and trading volume $v_{id}$ (b) for 32 stocks of Russian issuers included in the MOEX Russia Index from January 2014 to July 2018. Here we also do not observe the linear relationships between the corresponding variables predicted by MDH-N and MDH-V in the cross-section. For the entire sample, the fitted line is $s_{id} = -15.458 - 0.027\gamma_{id}$ with Driscoll-Kraay standard errors equal to 0.135 and 0.023, respectively. The coefficient of determination $R^2$ is 0.001. For individual stocks, slope coefficients range from −0.420 to 1.662. The mean equals 0.577, and the median is 0.552. At the 5% significance level, the MDH-N hypothesis is not rejected for only 2 out of 32 stocks. For the entire sample, the fitted line is $s_{id} = -15.928 + 0.026v_{id}$ with Driscoll-Kraay standard errors equal to 0.107 and 0.008, respectively. The coefficient of determination $R^2$ is 0.007. For individual stocks, slope coefficients range from −0.403 to 1.101. The mean is 0.444, and the median equals 0.463. At the 5% significance level, the MDH-V hypothesis is not rejected for only 1 out of 32 stocks. In summary, none of the three tested hypotheses shows high statistical significance, either in the cross-section or for individual stocks. 
Nevertheless, when we test the ITI relationship, we find an increase in the coefficient of determination and the retention of the economic significance of the slope coefficient, which is close to −2. In this subsection, we use the second aggregation scheme, represented by formula (4). Averaging over a small number of five-minute intervals for a sample covering several years of observations necessarily yields fewer points than the aggregation scheme (3). For each stock $i$ and each five-minute interval $t$, we obtain the value of each variable by taking the arithmetic average across all trading days $d \in \mathcal{D}$. Figure 7 plots the resulting observations together with the line $s_{it} - \gamma_{it} + 2p_{it} = 2.185 - 2q_{it}$, where the slope is fixed at −2 and the intercept is estimated using OLS regression. We can observe that a significant part of the points lies very close to this line. The fitted line is $s_{it} - \gamma_{it} + 2p_{it} = 3.068 - 2.126q_{it}$ with Driscoll-Kraay standard errors equal to 0.045 and 0.002, respectively. The coefficient of determination $R^2$ equals 0.982. It is worth noting that the hypothesis that the slope is equal to −2 is rejected due to small standard errors. However, the coefficient is economically close to the value predicted by the invariance hypothesis. We note that the increased variation of the slope coefficients $\beta$ for individual stocks is caused by a decline in the number of observations. 

Fig. 7 The figure plots $s_{it} - \gamma_{it} + 2p_{it}$ on the vertical axis against $q_{it}$ on the horizontal axis for each of the 32 stocks of Russian issuers included in the MOEX Russia Index during January 2014-July 2018. $s_{it}$ is the logarithmic value of log-return variance, $\gamma_{it}$ is the logarithmic value of the transaction rate, $p_{it}$ is the logarithmic value of the average price, $q_{it}$ is the logarithmic value of the average number of stocks per transaction. The observations are averaged in accordance with the scheme (4). The solid line is $s_{it} - \gamma_{it} + 2p_{it} = 2.185 - 2q_{it}$, where the intercept is estimated from an OLS regression with the slope fixed at −2.
The minimum (maximum) value is −2.887 (1.295). The mean equals −0.395, and the median is −0.355. We next investigate the intraday dynamics of the slope coefficient $\beta$ (Fig. 8) by running the OLS regression for each five-minute interval. As we can observe, there is a small variation in this parameter without a pronounced time trend: the $\beta$ coefficient ranges from −2.16 to −2.07. Figure 9 depicts the relationship between the logarithms of log-return variance $s_{it}$ and the number of transactions $\gamma_{it}$ (a) and between the logarithms of log-return variance $s_{it}$ and trading volume $v_{it}$ (b) for 32 stocks included in the MOEX Russia Index from January 2014 to July 2018. As we can see, there is no explicit linear relationship between the variables in the cross-section in either case. For the entire sample, the fitted line is $s_{it} = -15.352 - 0.089\gamma_{it}$ with Driscoll-Kraay standard errors equal to 0.059 and 0.013, respectively. The coefficient of determination $R^2$ is equal to 0.026. For individual stocks, the slope coefficients vary between 0.609 and 2.351. The mean is 1.270, and the median equals 1.260. At the 5% significance level, the MDH-N hypothesis is not rejected for 21 out of 32 stocks (mainly due to large standard errors). For the entire sample, the fitted line is $s_{it} = -15.764 - 0.006v_{it}$ with Driscoll-Kraay standard errors equal to 0.050 and 0.001, respectively. The coefficient of determination $R^2$ is 0.002. For individual stocks, the slope coefficients vary between 0.203 and 1.379. The mean is 0.833, and the median is 0.859. At the 5% significance level, the MDH-V hypothesis is not rejected for 20 out of 32 stocks (also mainly due to large standard errors). In summary, we have shown that the Intraday Trading Invariance hypothesis has the highest explanatory power among the three different models. As an additional check, we repeat the analysis by aggregating the variables at the twenty-minute sampling frequency. 
To obtain a high-frequency unbiased estimate of the realized variance, we sum consecutive squared one-minute log-returns over the corresponding twenty-minute intervals. We exclude the intervals with no trading volume or with a realized variance estimate less than $10^{-30}$. Finally, we obtain a sample consisting of 307,968 observations (32.96% of the initial sample). In this paper, we do not present the results of our tests using the twenty-minute sampling frequency. Nevertheless, we note that no substantial differences in the estimates have been found. Significant deviations from the invariant ratios can occur during various market crises.
1. A breakdown in the ITI relationship may be associated with a sharp decline in market makers' and arbitrageurs' activity. An illustrative example is the events in the E-mini S&P 500 market on May 6, 2010, which took place within minutes before reaching the nadir of the crash at 13:45 Central Time. As shown by Andersen et al. (2020), the collapse in the provision of liquidity was accompanied by a sharp rise in the values of $\ln I_{dt}$ calculated at a granularity of one minute. Due to the absence of arbitrage activity, the values of $\ln I_{dt}$ were at peak levels during those most turbulent minutes.
2. A (fast) execution of large bets can create a short-term but significant impact on asset prices. According to the estimates presented in Obizhaeva (2016), the two-day destabilization in the Russian foreign exchange market in mid-December 2014 was an example of such a collapse. It was triggered by the execution of a large bet to sell rubles of about $6-$9 billion over a short period. Due to the spillover effect, the Russian stock market was also in a crisis: the RTS index lost more than 9.3% on December 15 and more than 11.4% on December 16. 
A few months earlier, on March 3, 2014, there was a massive sale of Russian stocks amid political tensions in Ukraine and the decision of the Bank of Russia to raise the key rate by 1.5 percentage points. As a result, the RTS index fell by more than 10% on that trading day. The third most turbulent episode over the period under review was the collapse of stock prices on April 6 and 9, 2018, associated with the introduction of new U.S. sanctions against some Russian officials and businesspersons, as well as with the aggravated situation in Syria. In two trading days, the RTS index fell in aggregate by more than 11.9%. In this section, we examine the degree of deviation from invariance relationships during the deepest market price falls. As we noted earlier, the invariance principle is a benchmark against which it is convenient to monitor market dynamics during stressful episodes. Such analysis may be of interest not only to market participants with short-term trading strategies but also to financial regulators. 7 This part of the study aims to estimate the average differences in the logarithm of the invariant, $\iota \equiv \ln I = p + q + 0.5s - 0.5\gamma$, between turbulent days, characterized by the most significant market declines, and other periods. We have already indicated the main reason why we expect a significant rise in the value of the trading invariant during a market downturn. During high market volatility associated with external shocks, selling pressure is exacerbated, which can lead to increased liquidity demand and limited arbitrage activity. Due to the limited liquidity supply from intermediaries, the average trade size may increase. As a result, an unusually high value of the logarithm of the trading invariant $\iota$ can be observed. Thus, the main hypothesis is formulated as follows: Hypothesis 1 There is a significant positive deviation of the standardized logarithm of the trading invariant during periods of market stress. 
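The logarithm of the trading invariant defined above is a linear combination of the log variables, so it is straightforward to compute for a single interval. The function below is a minimal sketch with a hypothetical name and hypothetical argument names; it takes the level variables and returns $\iota = p + q + 0.5s - 0.5\gamma$.

```python
import math


def log_trading_invariant(price, avg_trade_size, variance, n_trades):
    """Return iota = p + q + 0.5*s - 0.5*gamma for one stock-interval.

    All inputs must be strictly positive; this is the log of
    price * trade size * volatility / sqrt(number of transactions).
    """
    p = math.log(price)
    q = math.log(avg_trade_size)
    s = math.log(variance)
    gamma = math.log(n_trades)
    return p + q + 0.5 * s - 0.5 * gamma
```

Holding the other inputs fixed, doubling the average trade size raises $\iota$ by $\ln 2$, which is the channel through which larger trades during stress episodes would show up as a higher invariant.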
For each stock i and each five-minute interval t, the standardized logarithm of the trading invariant $\tilde{\iota}_{it}$ is calculated as follows:

$$\tilde{\iota}_{it} = \frac{\iota_{it} - \bar{\iota}_{it}}{S_{\iota_{it}}},$$

where $\bar{\iota}_{it}$ is the mean of $\iota_{it}$ for each stock i over each five-minute interval t, and $S_{\iota_{it}}$ is the square root of the unbiased sample variance of $\iota_{it}$ calculated on the same set. We move to standardized values to allow for possible violations of the assumption about the distribution of the invariant $I_{it}$. Figure 10 shows the logarithms of the trading invariant $\iota_{it}$ averaged according to scheme (4) for each specified five-minute interval and each stock from our sample. It is worth noting that all observations are concentrated in a relatively narrow interval. However, a universal distribution for all stocks is not observed: many lines do not intersect each other. Thus, standardization is used to smooth out such heterogeneities (see footnotes 8 and 9).

Fig. 10 The figure shows the logarithms of the trading invariant $\iota_{it}$ averaged according to scheme (4) for each five-minute interval and each security from our sample. The dotted lines represent the last five-minute intervals of each calendar hour. The timestamp displays the right border of the corresponding five-minute interval.

Based on Fig. 10, we briefly discuss the dynamics of intraday trading activity in the most liquid part of the Russian market. First, it is worth mentioning the U-shaped curve showing the dependence of the logarithm of the trading invariant on time for each stock from our sample. Similar results for the dynamics of intraday bid-ask spreads and/or trading activity were reported by Wood et al. (1985), Chan et al. (1995), Madhavan et al. (1997), Andersen et al. (2020), etc. As shown by Madhavan et al. (1997), the information asymmetry between market participants is high at the opening. As a result, liquidity suppliers are less active due to the adverse selection problem.
In our opinion, the logarithms of the trading invariant have inflated values precisely because of the limited intermediary activity. A similar effect is observed during the last minutes of the trading day, when liquidity suppliers increase the order-processing and/or inventory holding components of the bid-ask spread and limit the size of their positions. Second, it is necessary to pay attention to the local maxima of the logarithms of the trading invariant for various stocks, a significant part of which falls on the last five-minute intervals of each calendar hour. This empirical result may be associated with the corresponding dynamics of financial news releases or the specifics of algorithmic strategies used in the Russian stock market (see footnote 10).

We now test hypothesis H1. In the first step, we apply basic specifications using high-frequency trading variables.

(1.1) Baseline model

$$\tilde{\iota}_{i\tau} = c + \theta \cdot \mathbf{1}\{\text{Market turbulence}\}_d + e_{i\tau}$$

(1.2) Baseline model with firm fixed effects

$$\tilde{\iota}_{i\tau} = \alpha_i + \theta \cdot \mathbf{1}\{\text{Market turbulence}\}_d + e_{i\tau}$$

(1.3) Baseline model with firm and month fixed effects

$$\tilde{\iota}_{i\tau} = \alpha_i + \delta_m + \theta \cdot \mathbf{1}\{\text{Market turbulence}\}_d + e_{i\tau}$$

where $\mathbf{1}\{\text{Market turbulence}\}_d$ is a dummy variable that is equal to 1 for the five most turbulent trading days in the Russian market over January 2014 to July 2018 (March 3, 2014; December 15 and 16, 2014; April 6 and 9, 2018) and 0 otherwise; $\alpha_i$ are firm fixed effects; $\delta_m$ are month fixed effects. The results are presented in Panel A of Table 2. The hypothesis that there are no significant positive deviations of the standardized logarithm of the trading invariant for turbulent trading days is rejected for all three specifications.

Footnote 8: When we use the aggregation scheme (4), the universal distribution for 32 stocks is also not observed.

Footnote 9: As a robustness check, we also consider non-standardized values of the logarithm of the trading invariant and do not find significant differences in the results.

Footnote 10: Further study of the causes of the intraday dynamics of the bid-ask spread takes us beyond the scope of this paper.
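To make the mechanics of the baseline specification (1.1) concrete, the following is a minimal Python sketch, not the authors' implementation. It standardizes the log invariant within each (stock, intraday slot) group and then uses the fact that an OLS regression on a constant and a single 0/1 dummy has a slope equal to the difference in group means. The field names (`stock`, `slot`, `iota`, `turbulent`), the ticker label, and the sample numbers are all hypothetical. Fixed effects and Driscoll-Kraay standard errors are deliberately left out.

```python
from statistics import mean, stdev

def standardize(rows):
    """Add a z-score 'z' of 'iota' within each (stock, intraday slot) group."""
    groups = {}
    for r in rows:
        groups.setdefault((r["stock"], r["slot"]), []).append(r["iota"])
    out = []
    for r in rows:
        values = groups[(r["stock"], r["slot"])]
        if len(values) < 2:
            continue  # the unbiased sample variance needs at least two points
        s = stdev(values)  # square root of the unbiased sample variance
        if s == 0:
            continue
        out.append({**r, "z": (r["iota"] - mean(values)) / s})
    return out

def dummy_ols(rows):
    """OLS of z on a constant and a 0/1 dummy.

    With a single binary regressor the intercept is the mean of the
    dummy = 0 group and the slope is the difference in group means.
    """
    z1 = [r["z"] for r in rows if r["turbulent"] == 1]
    z0 = [r["z"] for r in rows if r["turbulent"] == 0]
    intercept = mean(z0)
    slope = mean(z1) - intercept
    return intercept, slope

# Hypothetical data: one stock, one intraday slot, four trading days.
rows = [
    {"stock": "AAA", "slot": "10:05", "iota": 1.0, "turbulent": 0},
    {"stock": "AAA", "slot": "10:05", "iota": 1.2, "turbulent": 0},
    {"stock": "AAA", "slot": "10:05", "iota": 0.8, "turbulent": 0},
    {"stock": "AAA", "slot": "10:05", "iota": 2.0, "turbulent": 1},
]
z_rows = standardize(rows)
intercept, slope = dummy_ols(z_rows)
print(intercept, slope)
```

A positive slope in this sketch corresponds to a positive deviation of the standardized log invariant on the days flagged as turbulent. Inference on that slope would require the panel-robust standard errors reported in the tables.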
As a robustness check, we turn to an extended set, which consists of the five most turbulent days mentioned above and seven additional days: April 17, 2015; January 15, 2016; March 10, 2015; September 1, 2015; December 8, 2014; January 26, 2015; December 2, 2014. These trading days are also characterized by significant negative changes in the RTS index (from −6.2% to −4%). The results of our tests of the same three basic specifications are presented in Panel B of Table 2. We do not find any significant differences from the previous results: the estimated coefficients on the dummy variable are positive and statistically significant but slightly lower in absolute value. Adding fixed effects to the model does not increase the coefficient of determination. In addition, we test the hypothesis for non-standardized values of the logarithm of the trading invariant $\iota_{i\tau}$ and again find that the estimated coefficients on the dummy variable are significant.

We deliberately do not use scheme (3) to test these specifications at daily intervals. Since there were no trades in some five-minute intervals during the most turbulent trading days, some logarithms of the trading invariant are not defined. Thus, when averaging across the trading days, the previously detected differences in $\iota_{i\tau}$ blur out. Instead, we concentrate on the top 5% of the values of the standardized trading invariant for each stock and each trading day ($\tilde{\iota}^{Top}_{i\tau}$) to test the following hypothesis.

Hypothesis 2 The market downturn is associated with extreme positive values of the standardized logarithm of the trading invariant at the daily level.

According to the results presented in Table 3, the coefficients on the dummy variable are positive and statistically significant. They are close to the estimates obtained earlier when testing specifications (1.1)-(1.3). Fixed effects do not lead to a considerable increase in the explained fraction of the variance of the dependent variable.
We next test the hypothesis $\beta = -2$ for the specification (5), considering only those 5 or 12 days that were defined as turbulent. In both cases, this hypothesis is not rejected at the 5% significance level. We obtain the same result if the dummy variable $\mathbf{1}\{\text{Market turbulence}\}_d$ is added to the model (5): the coefficients on this variable are positive and statistically significant. We also apply the unequal variances (Welch's) t-test to non-standardized values of the logarithm of the trading invariant. The first sample consists of observations belonging to trading days with a significant market drawdown (5 or 12 days), and all other observations form the second sample. According to the test results, the null hypothesis of equality of the mean values of the two samples is rejected in both cases (p-values < 0.01).

To summarize, we find that the nonlinear ITI relationship holds across the 32 most liquid Russian stocks for the sample of trading days with the largest market decline. In other words, we conclude that the fundamental mechanism that determines the average trade size depending on changes in trading intensity does not change in the case of market turbulence. At the same time, we show that the trading invariant I does not have an invariant distribution across different stocks. Bucci et al. (2020) obtained similar results after investigating the ANcerno data on bets. The researchers found only "weak universality" when testing the invariance hypothesis. They demonstrated significant variations in the trading invariant I when examining some American stocks and futures contracts. Nonetheless, the authors confirmed the quantitative predictions about the relationship between trading variables.

Figure 11 shows the average spread cost at five-minute frequency. It is defined as the average trade size computed at five-minute frequency and averaged across stocks, times the bid-ask spread computed at five-minute frequency for every stock and then averaged in the cross-section.
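As a small illustration of the definition just given, the sketch below computes the average spread cost for one five-minute interval as the product of two cross-sectional means. This is one reading of the verbal definition, not the authors' code. The function name, the ticker keys, and the numbers are hypothetical, and the units (shares for trade size, rubles for the spread) are an assumption.

```python
from statistics import mean

def average_spread_cost(trade_size_by_stock, spread_by_stock):
    """Cross-sectional mean trade size times cross-sectional mean bid-ask spread.

    Both inputs map a stock label to its value for the same five-minute interval.
    """
    return mean(trade_size_by_stock.values()) * mean(spread_by_stock.values())

# Hypothetical values for a single five-minute interval.
sizes = {"AAA": 100.0, "BBB": 300.0}   # average trade size, in shares (assumed unit)
spreads = {"AAA": 0.02, "BBB": 0.04}   # bid-ask spread, in rubles (assumed unit)
cost = average_spread_cost(sizes, spreads)
print(cost)
```

Repeating this calculation for every five-minute interval would give a time series of the kind plotted in the figure.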
The red dashed vertical lines indicate March 3, 2014; December 15 and 16, 2014; April 6 and 9, 2018. The green dashed lines represent the seven days from the extended set of turbulent trading days defined earlier. We can observe that the average spread cost had peak values during almost all crisis days. In our opinion, severe information asymmetry and high volatility of securities' fundamentals were among the main causes of the soaring average spread cost of trade and the sharp increase in the ruble risk transferred by one bet per unit of business time.

Similar to Kyle and Obizhaeva (2017b), we consider the extension of the invariance hypotheses about the relationship between microscopic and macroscopic market characteristics to a hypothesis about the information process. However, we apply this methodology to the Russian stock market. According to the Information Flow Invariance (IFI) hypothesis, public information is "expected to arrive at a rate proportional to the rate at which the business-time clock ticks, with a proportionality constant being the same across assets and across time." Quantitatively, the IFI hypothesis predicts that the rate of information flow $\mu$ is proportional to $W^{\gamma}$, where W is trading activity, defined as the product of volatility and ruble volume, and $\gamma = 2/3$. There is a simple intuition behind this coefficient. Suppose the business-time clock related to professional investors' activity slows down to half speed for some reason. As a result, we start to observe two similar effects. Information flows also slow down: providers release half as many news articles about firms. In addition, both ruble volume and variance go down by a factor of two.
Finally, the trading activity decreases by a factor of $2 \cdot 2^{1/2} = 2^{3/2}$, and we obtain the following relationship between the new (marked with an asterisk) and old rates of trading activity and the arrival rates of information:

$$\frac{\mu^{*}}{\mu} = \frac{1}{2} = \left(\frac{W^{*}}{W}\right)^{2/3}.$$

Similar principles apply in the case of retail investors' activity and Google relative search volumes of Russian stocks. To study the time-series variation in the estimated exponent $\gamma$ capturing the contemporaneous relation between the trading and information processes, we implement negative binomial regressions:

$$\mu(W_{it}) \sim \text{Poisson}\left(\eta \cdot \left(\frac{W_{it}}{W^{*}}\right)^{\gamma} \cdot G_{it}(\alpha)\right),$$

where $\mu(W_{it})$ is either the number of news articles $\mu^{News}(W_{it})$ or the Google relative search volume $\mu^{Search}(W_{it})$ of stock i in week t; $\eta$ and $W^{*}$ are constants corresponding to the average number of news articles (or Google relative search volumes) and the trading activity of some benchmark stock; $G_{it}(\alpha)$ is a Gamma variable with a mean of 1 and a variance of $\alpha$. The invariance theory predicts that $\gamma = 2/3$. We next discuss the advantages of using this count data regression.

Google Trends provided the weekly data on relative search volumes of Russian stocks $\mu^{Search}_{it}$. We analyze the relative popularity of search queries only within Russia during the period from the first week of August 2018 to the last week of June 2021. The search query has the following structure: "Name of a public company" + "stocks" written in Russian. The initial sample consists of only those 32 shares of Russian issuers included in the MOEX Russia Index during this period. We exclude relative search volumes of Aeroflot, MTS, and Magnit stocks since the search queries written in Russian have double meanings in these cases. Besides the meaning "stocks of the company," such queries can also be interpreted as "the company's special offers". It is worth noting that Google Trends imposes some limitations on downloading and analyzing data. Firstly, it is impossible to compare search volumes of more than 5 queries simultaneously.
Secondly, the algorithm does not show absolute values. Instead, it divides all numbers by the sample maximum, multiplies by 100, and rounds off all of them to the nearest integer between 0 and 100. We implement the following procedure to obtain relative search volumes data. In the first step, we find a benchmark query with the highest maximum search volume at some week among all queries over August 2018-June 2021. In our case, the benchmark query is "Gazprom stocks" written in Russian. We next get data on the search volumes of all remaining words relative to the benchmark's highest maximum. Ultimately, all relative search volumes are directly related to the benchmark's maximum, which equals 100. It is worth mentioning that we do not get rid of zero relative search volumes using this algorithm. Kyle and Obizhaeva (2017b) note that both the negative binomial (NB) model and the Poisson model can be used to study the relationship between the information flow and the trading flow. However, unlike the Poisson model, the NB model takes into account possible over-dispersion of the news/queries data. The additional variation caused by many zero relative search volumes is the main reason why the negative binomial model is preferable in our case. Thomson Reuters Eikon provided daily news data on 29 Russian stocks included in the Google relative search volumes sample. It also covers the period August 2018-June 2021. Each news article has information on the ticker of a firm, the time stamp, the headline, and the topic code. In the first step, we apply several filters. First, we include in the sample a news item if it is associated with at least one of the following topics: "Significant News", "Significant Company News", "Significant Economic News", "Significant Equity News", "Major News", "Company News". We also exclude all duplicated news articles. 
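The benchmark-based procedure for Google relative search volumes described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' code and not a Google API. `trends_scale` imitates the normalization the text attributes to Google Trends (divide by the maximum in the request, multiply by 100, round to an integer). The raw volumes are hypothetical and are never observable in practice, and Python's built-in `round` stands in for whatever rounding rule Google applies.

```python
def trends_scale(batch):
    """Imitate the scaling described in the text for one request.

    batch maps a query to a list of (hypothetical) raw weekly volumes.
    All values are divided by the largest value in the request,
    multiplied by 100 and rounded to the nearest integer.
    """
    peak = max(max(series) for series in batch.values())
    return {q: [round(100 * x / peak) for x in series]
            for q, series in batch.items()}

def relative_to_benchmark(raw, benchmark, batch_size=5):
    """Request queries in groups of at most batch_size, always with the benchmark.

    Because the benchmark holds the overall maximum, every group is scaled
    by the same peak, so values from different groups are comparable.
    """
    others = [q for q in raw if q != benchmark]
    step = batch_size - 1  # one slot in each request is taken by the benchmark
    result = {}
    for start in range(0, len(others), step):
        names = [benchmark] + others[start:start + step]
        result.update(trends_scale({q: raw[q] for q in names}))
    return result

# Hypothetical raw weekly volumes for a benchmark and two other queries.
raw = {
    "benchmark": [500, 1000, 800],
    "query_a": [100, 40, 3],
    "query_b": [250, 0, 20],
}
scaled = relative_to_benchmark(raw, "benchmark")
print(scaled)
```

In this toy example the smallest raw value rounds down to zero, which shows how zero relative search volumes can appear for rarely searched stocks even when the underlying search activity is not literally zero.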
Additionally, we use daily data on the open ($O_{id}$), high ($H_{id}$), low ($L_{id}$), and closing ($C_{id}$) prices and share volume ($V_{id}$) from August 1, 2018 to June 30, 2021 provided by Thomson Reuters Eikon. The sample consists of the 29 Russian stocks included in the Google relative search volumes dataset. For each day d and each stock i, we estimate market activity variables in the following way. $P_{id}$ is calculated as the average of the opening and closing prices of stock i on day d. We use the Garman and Klass (1980) variance measure:

$$\sigma^2_{id} = \frac{1}{2}\left(\ln H_{id} - \ln L_{id}\right)^2 - (2\ln 2 - 1)\left(\ln C_{id} - \ln O_{id}\right)^2.$$

We next calculate trading activity $W_{id}$ for each stock i and day d as the product of $P_{id}$, $V_{id}$, and $\sigma_{id}$ and convert daily data on trading activity and the number of news articles to weekly data: $W_{id} \rightarrow W_{it}$ and $\mu^{News}_{id} \rightarrow \mu^{News}_{it}$. Finally, we sum the weekly trading activity of ordinary and preferred stocks of three companies (Sberbank, Tatneft, and Surgutneftegas) for each week in the sample and match these data with the corresponding weekly data on news articles and Google relative search volumes. The same matching is implemented for the remaining companies in the sample. Table 4 presents summary statistics for the trading activity, Google relative search volumes, and news articles provided by Thomson Reuters Eikon.

In this subsection, we perform the empirical tests by estimating the coefficients $\gamma$ for the negative binomial model and testing whether there was an abrupt shift in the estimates at the beginning of 2020. We use the CUSUM method to detect a single changepoint in the time-series data. According to the results of the OLS-based CUSUM test, the null hypothesis that there is no mean shift is rejected (p-value = $8.6 \cdot 10^{-11}$), and the estimate of the changepoint is the last week of February 2020.
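The idea behind a CUSUM search for a single mean shift can be illustrated with a short Python sketch. It is a simplified version written for this illustration, not the authors' procedure or the routine of any particular package. It accumulates deviations from the full-sample mean, takes the position where the absolute cumulative sum peaks as the changepoint estimate, and reports the peak scaled by the sample standard deviation and the square root of the sample size. No p-value is computed, and the weekly series below is hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

def cusum_changepoint(x):
    """Estimate a single mean-shift changepoint from cumulative demeaned sums.

    Returns (k, stat): k is the number of observations in the first regime,
    stat is max |S_k| / (sigma * sqrt(n)), where S_k is the cumulative sum
    of deviations from the full-sample mean.
    """
    n = len(x)
    xbar = mean(x)
    sigma = stdev(x)  # residual standard deviation of a constant-mean fit
    running, path = 0.0, []
    for value in x:
        running += value - xbar
        path.append(running)
    # The last cumulative sum is zero by construction, so it is excluded.
    k = max(range(n - 1), key=lambda j: abs(path[j]))
    stat = abs(path[k]) / (sigma * sqrt(n))
    return k + 1, stat

# Hypothetical weekly slope estimates with a level shift after the sixth week.
gammas = [0.55, 0.60, 0.57, 0.58, 0.56, 0.59, 0.73, 0.70, 0.72, 0.71]
k_hat, stat = cusum_changepoint(gammas)
before, after = mean(gammas[:k_hat]), mean(gammas[k_hat:])
print(k_hat, stat, before, after)
```

Splitting the series at the estimated position and averaging each segment gives the kind of before and after levels that can be compared with the theoretical value of 2/3.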
Fig. 12 The figure shows the estimates of the slope $\gamma$ for the NB regressions with $\mu(W_{it}) = \mu^{Search}(W_{it})$, run separately for each week from the first week of August 2018 to the last week of June 2021, where $W_{it}$ is the trading activity, the product of weekly volatility and weekly ruble volume.

Figure 12 shows the time-series variation in weekly estimates of the parameter $\gamma$ in the case of using the Google relative search volumes data. Indeed, we can observe that there is a changepoint in means around February-March 2020. The black dashed lines indicate the averages before and after the changepoint: 0.570 and 0.720, respectively. The green dashed line shows the value $\gamma = 2/3$ predicted by the invariance hypothesis. It is worth noting that the average coefficient $\gamma$ after the changepoint is closer to 2/3 than the average coefficient $\gamma$ before the abrupt shift. Thus, we conclude that trading activity and the information flow approximated by Google relative search volumes became more synchronized.

We replace the Google relative search volumes data with the news articles data and repeat our analysis. In the case of news articles, the estimate of the changepoint is the last week of October 2020. However, we cannot reject the null hypothesis that there is no abrupt shift (p-value = 0.067). Moreover, according to our results, the average value of $\gamma$ before the last week of October 2020 was closer to the theoretical value of 2/3 than the average value of $\gamma$ after this week: 0.527 and 0.437, respectively. Figure 13 shows the time-series variation in weekly estimates of the parameter $\gamma$ in the case of using the news articles data. The black dashed lines indicate the averages before and after the last week of October 2020. The green dashed line shows the value $\gamma = 2/3$ predicted by the invariance hypothesis.

... U.S. dollar/Russian ruble exchange rate in February 2020).
Moreover, it was the last week of February 2020 that was characterized by the most massive inflow, totaling 16.7 billion rubles. Figure 14 shows the dynamics of the inflow/outflow of retail investors' funds during the period January 2020-December 2020. Thus, it is reasonable to believe that the drastic increase in retail investor participation in the Russian stock market was the reason why the trading flow and the information flow approximated by Google relative search volumes better conformed to the same business-time clock. In addition, we do not find any significant differences in the relation between the trading and information processes in the case of using the news articles data. Moreover, the tests show that the level of synchronization between these two flows is somewhat lower in 2021 compared to the 2018-2020 period.

In the first part of our paper, we tested the two specifications of the mixture-of-distributions hypothesis (MDH in Volume and MDH in Transactions) and the Intraday Trading Invariance hypothesis. It should be noted that the original market microstructure hypothesis is formulated for metaorders, which traders usually divide into many smaller orders to minimize trading costs. However, we found that the invariance principles explain much of the endogenous variation between trading variables over short time intervals. The centralized structure of the Russian stock market made it possible to correctly identify the size of transactions, a key variable in our empirical tests. After applying different techniques to mitigate the impact of errors-in-variables problems and performing several robustness checks, we showed that log-return variation per transaction is proportional to the −2 power of the average trade size times the stock price.
At the same time, we did not find confirmation of the invariance hypothesis for individual stocks and associate this result with significant noise in the regressors. We demonstrated that the nonlinear ITI relationship holds when considering trading days with maximum market downturns during the period January 2014-July 2018. However, we found that these days were characterized by a statistically significant increase in the trading invariant, the ruble risk transferred by one bet per unit of business time. In our opinion, the major causes were the sharp increase in the level of information asymmetry and the high volatility of securities' fundamentals, since the average spread cost was soaring during these crashes. Our study also briefly touched upon the intraday dynamics of the trading invariant. We found a U-shaped pattern and drew attention to the presence of individual peaks of trading intensity in the last five-minute intervals of each hour. At the same time, we did not find a universal trading invariant for all stocks, one of the major assumptions of the invariance hypothesis. A more detailed study of the key factors of trading activity at high-frequency intervals and the use of more sophisticated econometric methodology for testing the log-linear regression specifications are among the main directions for future work.

The exploration of the relationship between trading variables over short intervals was based on the Market Microstructure Invariance hypothesis extrapolated to the intraday dimension. At the same time, to study whether the positive shock to retail investors' demand for equities changed the relationship between the trading and information processes in the Russian stock market, we used another extension of invariance principles: the Information Flow Invariance hypothesis.
This invariance-implied methodology allowed us to zoom in on one of the most notable episodes in the Russian stock market over the August 2018-June 2021 period: the considerable increase in individual investor participation since the beginning of 2020. According to our empirical tests, the information flow approximated by the Google search activity for Russian stocks became more synchronized with market trading activity. We showed that a changepoint in the dynamics of the estimates characterizing this relationship occurred in the last week of February 2020, when the Russian stock market faced a huge inflow of retail investors' funds. At the same time, we did not find a statistically significant change in the relationship between trading activity and news activity. To summarize, the level of synchronization between those two flows was quite stable through time but lower than the corresponding level in the case of Google search activity. In the future, we would like to examine alternative sources of public information in the context of the Information Flow Invariance (e.g., tweets, Telegram messages, financial analytics).

Funding The article was prepared within the framework of the Basic Research program at HSE University.

The data that support the findings of this study are available from the Moscow Exchange, but restrictions apply to the availability of these data, which were used under license for the current study and so are not publicly available. These data are, however, available from the authors upon reasonable request and with permission of the Moscow Exchange. Upon acceptance, news and transaction-level data provided by Thomson Reuters Eikon and Google search volumes data will be available on the official website of the Centre for Financial Research & Data Analytics.

Code availability Not applicable.

The authors report no declarations of interest.

See Table 1.
The table contains statistics for the mean values and standard deviations (in parentheses) of trading variables, aggregated by five-minute intervals from January 6, 2014 to July 31, 2018. The cumulative trading volume (V), the transaction rate (trades per unit time) (G), and the average number of stocks per transaction (Q) are computed across each five-minute interval and then averaged across all observations. The variance measure ($\sigma^2$) represents realized variance and is computed from five one-minute squared log-returns across each five-minute interval. Note: On July 1, 2016 the ticker EONR changed to UPRO.

See Table 2. Driscoll-Kraay standard errors in parentheses. In Panel A (B), the results are conditional on the number of trading days considered as turbulent: 5 (12). The sample period is from January 6, 2014 to July 31, 2018.

See Table 3. Driscoll-Kraay standard errors in parentheses. In Panel A (B), the results are conditional on the number of trading days considered as turbulent: 5 (12). The sample period is from January 6, 2014 to July 31, 2018.

See Table 4. The table contains statistics for the mean values and standard deviations (in parentheses) of the trading activity, Google relative search volumes, and news articles provided by Thomson Reuters Eikon, aggregated by weekly intervals from the first week of August 2018 to the last week of June 2021. The benchmark query is "Gazprom stocks" written in Russian (Gazprom ordinary stock's ticker is GAZP). For Tatneft (stocks' tickers: TATN and TATNP), Surgutneftegas (stocks' tickers: SNGS and SNGSP), and Sberbank (stocks' tickers: SBER and SBERP), we sum the weekly trading activity of ordinary and preferred stocks.

The Intraday Trading Invariance hypothesis is formulated as follows: the random variables $I_{i\tau}$, defined by

$$\ln I_{i\tau} = p_{i\tau} + q_{i\tau} + \tfrac{1}{2}s_{i\tau} - \tfrac{1}{2}\gamma_{i\tau},$$

are independent and identically distributed for all stocks i and intervals $\tau$. The economic meaning of this variable is discussed in the invariance literature: $\tilde{I}$ is the monetary (e.g., dollar or ruble) risk transferred by one bet per unit of business time. The assumption about the approximate constancy of this value across assets and across time is based on the no-arbitrage condition. The authors provide the following illustrative example. Let large-cap stocks initially give more opportunities for profit than small-cap stocks. As a result, more informed traders start to take long positions in large-cap stocks. Consequently, the expected arrival rate of bets (the reciprocal of business time) for large-cap stocks rises. Moreover, the distance between trading prices and unobservable fundamental values tends to shrink for such stocks, and traders endogenously adjust their trade sizes to compensate for their trading costs. As a result, the monetary value of the transferred risk per unit of business time remains unchanged in equilibrium.

We can turn to an alternative formulation of Eq. (5) if we take the logarithms of both sides of the equation and then take the conditional expectations. Taking into account the presence of an error term, we get the following log-linear relationship between trading variables.

Intraday Trading Invariance:

$$s_{i\tau} = c + \gamma_{i\tau} - 2q_{i\tau} - 2p_{i\tau} + e_{i\tau},$$

where c is a constant and $e_{i\tau}$ are residuals with zero expectation. The inclusion of the stock price level $p_{i\tau}$ is primarily due to the differences in stock prices in the cross-section.

The mixture-of-distributions hypothesis

As basic models, we consider the following two log-linear regression specifications. We use the notations and assumptions described earlier in this section.

1. The Mixture-of-Distributions Hypothesis in Transactions

$$s_{i\tau} = c + \gamma_{i\tau} + e_{i\tau} \quad \text{for } \tau = 1, \ldots, D \cdot T;\ i \in I, \qquad (15)$$

where c is a constant and $e_{i\tau}$ are residuals with zero expectation.

2. The Mixture-of-Distributions Hypothesis in Volume (MDH-V)

$$s_{i\tau} = c + v_{i\tau} + e_{i\tau} \quad \text{for } \tau = 1, \ldots, D \cdot T;\ i \in I, \qquad (16)$$

where c is a constant and $e_{i\tau}$ are residuals with zero expectation.

Let us generalize formulas (15) and (16), taking into account that $v_{i\tau} = q_{i\tau} + \gamma_{i\tau}$:

$$s_{i\tau} = c + \gamma_{i\tau} + \beta q_{i\tau} + e_{i\tau}, \qquad (17)$$

where c is a constant and $e_{i\tau}$ are residuals with zero expectation. Thus, when $\beta = 0$, Eqs. (15) and (17) become equivalent to each other; when $\beta = 1$, Eqs. (16) and (17) are identical. These assumptions contrast sharply with the Intraday Trading Invariance hypothesis, which predicts $\beta = -2$ (with the inclusion of the term $-2p_{i\tau}$). It is also necessary to mention that the analysis of log-linear models is possible only over intervals with positive trading activity. Therefore, we have to make the following adjustment to the original hypothesis.

Modified Intraday Trading Invariance: $I_{i\tau}$ are independent and identically distributed for all stocks i and those intervals $\tau$ that are characterized by non-zero trading activity.

References

Intraday trading invariance in the E-mini S&P 500 futures market. (NES working paper № 272)
Modeling and forecasting realized volatility
Order flow, transaction clock, and normality of asset returns
Théorie de la spéculation
Invariance of buy-sell switching points. (NES working paper № 273)
Public information arrival
Are trading invariants really invariant? Trading costs matter
Who uses financial reports and for what purpose? Evidence from capital providers
The intraday behavior of bid-ask spreads for NYSE stocks and CBOE options
A subordinated stochastic process model with finite variance for speculative prices
The volume of transactions and price changes on the New York Stock Exchange
Search of Attention
New frontiers for ARCH models
The stochastic dependence of security price changes and transaction volumes: implications for the mixture-of-distributions hypothesis
Trading costs
On the estimation of security price volatilities from historical data
News vs. sentiment: Predicting stock returns from news stories
The dependence between hourly prices and trading volume
Transactions, volume, and volatility
The relation between price changes and trading volume: A survey
News articles and equity trading. (NES working paper № 233)
Market Microstructure Invariance: Empirical hypotheses
Dimensional analysis and market microstructure invariance
Large bets and stock market crashes
A Practitioner's guide to market microstructure invariance
Why do security prices change? A transaction-level analysis of NYSE stocks
The impact of Public Information on the Stock Market
Liquidity Estimates and Selection Bias. (CEFIR/NES Working Paper № 225). Centre for Economic and Financial Research at New Economic School
The Russian ruble crisis of
Periodic structure in the Brownian motion of stock prices
The price variability-volume relationship on speculative markets
Giving content to investor sentiment: the role of media in the stock market
Are leveraged and inverse ETFs the New Portfolio insurers? (Finance and Economics Discussion Series 2013-48)
The distribution of common stock price changes: An application of transactions time and subordinated stochastic models
An investigation of transactions data for NYSE stocks
Does arbitrage flatten demand curves for stocks?