key: cord-0129195-inffl85x
authors: Giancaterini, Francesco; Hecq, Alain
title: Inference in mixed causal and noncausal models with generalized Student's t-distributions
date: 2020-12-03
sha: 89bf1a69267eb762f267eb8925eff6a3e85f9807
doc_id: 129195
cord_uid: inffl85x

This paper analyzes the properties of the Maximum Likelihood Estimator for mixed causal and noncausal models when the error term follows a Student's t-distribution. In particular, we compare several existing methods to compute the expected Fisher information matrix and show that they cannot be applied in the heavy-tail framework. We therefore propose a new approach to make inference on causal and noncausal parameters in finite samples. It is based on the empirical variance computed on the generalized Student's t, even when the population variance is not finite. Monte Carlo simulations show the good performance of our new estimator for fat-tailed series. We illustrate how the different approaches lead to different standard errors in four time series: annual debt to GDP for Canada, the variation of daily Covid-19 deaths in Belgium, monthly wheat prices and the monthly inflation rate in Brazil.

Mixed causal and noncausal models (MAR) are time series processes with both lead and lag components. Such specifications can capture nonlinear features such as bubbles, namely processes that experience a rapid increase followed by a sudden crash. Linear autoregressive models (e.g. ARMA models) cannot exhibit these bubble patterns. MAR models have been successfully applied to several time series, for instance commodity prices, inflation rates, bitcoin and other equity prices. Furthermore, forecasts from mixed causal and noncausal models often beat those from linear ones. They also have an economic flavor.
They are interpreted as situations in which economic agents have more information than econometricians, linking MAR models with the existence of non-fundamentalness in structural econometric models (see Alessi et al. (2011) and Lanne and Saikkonen (2013)). Still, their estimation, and in particular inference on MAR parameters, is far from trivial. This paper analyzes the behaviour of the Maximum Likelihood Estimator (MLE) for mixed causal and noncausal models with an error term following a Student's t-distribution. Although most theoretical results for MARs are derived under the assumption of finite variance of the error term (see i.a. Breidt, Davis, Li & Rosenblatt (1991); Lanne & Saikkonen (2011)), we emphasize that working with the generalized version of the Student's t also allows us to cover infinite variance cases (when the degrees of freedom satisfy 1 < ν ≤ 2). Many studies on commodity prices (see Fries and Zakoian (2019); Voisin (2019, 2020); Hecq, Issler and Telg (2020)) reveal that the estimated degrees of freedom of the Student's t lie between 1.5 and 2. The alternative methods to make inference in the infinite variance cases would be to work with a different asymptotic theory (Davis and Resnick (1985)), to use different distributions (see the work on alpha-stable distributions by Fries and Zakoian (2019)) or, in the case of purely noncausal models, to rely on bootstrap estimators (Cavaliere, Nielsen and Rahbek (2020)).

The rest of the paper is organized as follows. Section 2 introduces mixed causal and noncausal models. Section 3 presents the different ways of obtaining the expected Fisher information matrix for MARs and briefly reviews the existing strategies. Section 4 proposes a new approach to compute the standard errors of causal and noncausal parameters, based on a robust estimator of the variance of the residuals. We show its validity in finite samples.
Section 5 studies, using Monte Carlo simulations, the performance of the existing methodologies and of the new approach. Section 6 is dedicated to the empirical applications on four different time series. Section 7 concludes.

2 Mixed causal and noncausal models

Breidt et al. (1991) introduce a maximum likelihood procedure for estimating the parameters of noncausal processes. Their starting point is the autoregressive model a(L) y_t = ε_t (1), where L is the backshift operator and ε_t is an independent and identically distributed (i.i.d.) non-Gaussian sequence of random variables with mean zero and finite variance. It is assumed that the autoregressive polynomial a(z) = 1 − φ_1 z − · · · − φ_p z^p has no roots on the unit circle, so that a(z) ≠ 0 for |z| = 1. Breidt et al. (1991) further assume that the polynomial a(z) has s roots inside and r roots outside the unit circle. Equation (1) can then be factored as φ(L) ϕ*(L) y_t = ε_t (2), where φ(z) has its r roots outside the unit circle and ϕ*(z) is called the noncausal polynomial since its roots are inside the unit circle, such that 1 − ϕ*_1 z − ... − ϕ*_s z^s ≠ 0 for |z| ≥ 1. Breidt et al. (1991) derive the covariance matrix of the estimated parameters only for probability density functions of ε_t that satisfy a certain set of assumptions listed in Section 3. The generalized Student's t-distribution with degrees of freedom equal to or less than 2 does not satisfy one of these assumptions and, as a consequence, this approach cannot be used in the heavy-tail framework.

Lanne and Saikkonen (2011) directly start with a mixed causal and noncausal model expressed as the product of the backward-looking and forward-looking polynomials, φ(L) ϕ(L^−1) y_t = ε_t (3), where L^−1 produces leads such that L^−1 y_t = y_{t+1}. We denote such a model a MAR(r, s), with φ(L) the causal (autoregressive) polynomial of order r and ϕ(L^−1) the noncausal (lead) polynomial of order s. With this representation it is assumed that both φ(z) and ϕ(z) have their roots outside the unit circle: φ(z) ≠ 0 and ϕ(z) ≠ 0 for |z| ≤ 1.
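To make the MAR(r, s) definition concrete, the following is a minimal sketch of how a MAR(1,1) path with generalized Student's t errors could be simulated. It is not the authors' code: the function names, the ASCII name `psi` for the noncausal coefficient ϕ, the burn-in length and the zero terminal and initial values are all assumptions made for illustration. The noncausal component is solved backward in time and the causal component forward.

```python
import math
import random


def student_t(rng, nu, eta):
    """Draw one Student's t variate with nu degrees of freedom, scaled by eta."""
    z = rng.gauss(0.0, 1.0)
    chi2 = rng.gammavariate(nu / 2.0, 2.0)  # chi-square with nu degrees of freedom
    return eta * z / math.sqrt(chi2 / nu)


def simulate_mar11(phi, psi, nu, eta, T, burn=200, seed=0):
    """Simulate (1 - phi L)(1 - psi L^-1) y_t = eps_t and return (y, eps)."""
    rng = random.Random(seed)
    n = T + 2 * burn
    eps = [student_t(rng, nu, eta) for _ in range(n)]
    # Noncausal part: u_t = psi * u_{t+1} + eps_t, solved backward from a zero terminal value.
    u = [0.0] * (n + 1)
    for t in range(n - 1, -1, -1):
        u[t] = psi * u[t + 1] + eps[t]
    # Causal part: y_t = phi * y_{t-1} + u_t, solved forward from a zero initial value.
    y = [0.0] * n
    prev = 0.0
    for t in range(n):
        y[t] = phi * prev + u[t]
        prev = y[t]
    # Discard burn-in observations at both ends of the sample.
    return y[burn:burn + T], eps[burn:burn + T]
```

Applying the two filters to the simulated series, first y_t − phi·y_{t−1} and then subtracting psi times its one-step lead, should return the simulated errors for all interior observations, which is a convenient sanity check.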
Note that purely causal or purely noncausal models are obtained when ϕ(L^−1) = 1 or φ(L) = 1, respectively. In (3) the parameter vectors φ = (φ_1, ..., φ_r) and ϕ = (ϕ_1, ..., ϕ_s) turn out to be orthogonal to the parameters that describe the distribution of the error term ε_t (see Lemma 1 of Lanne and Saikkonen (2011)). They can be estimated by an approximate maximum likelihood (AML) approach; the estimator is called approximate because we lose the first r and the last s observations when estimating a MAR(r, s). An important and useful feature of mixed causal and noncausal models is that we can set u_t = φ(L) y_t, so that ϕ(L^−1) u_t = ε_t, and v_t = ϕ(L^−1) y_t, so that φ(L) v_t = ε_t. In order to obtain the standard errors of the estimated parameters, Lanne and Saikkonen work with a density function that satisfies assumptions similar to those presented in Breidt et al. (1991) and that, in particular, must have a finite variance. Hecq, Lieb and Telg (2016) propose a new approach to more easily compute the standard errors for a MAR(r, s) using the generalized t-distribution and relying on the results developed for the linear regression model by Fonseca et al. (2008). Their approach, also implemented in the R package MARX, works if and only if E(|ε_t|^2) < ∞, and hence if the degrees of freedom are larger than 2. We show however that this approach can be misleading, as it imposes strong restrictions that can lead to incorrect estimates of the standard errors.

Let us consider a general density function f and denote the likelihood function of θ by L(θ). We indicate with θ_0 = (θ_1, ..., θ_p) the vector of the true values of the causal and noncausal coefficients (p = r + s). The other parameters of the general density function (degrees of freedom and scale parameter) are, for the moment, assumed to be known and equal to their true population values; we will show next that they are independent of the estimation of θ. Furthermore, we assume that ε_t has a finite variance, equal to σ^2.
Taking the logs of L(θ), we obtain the log-likelihood function l(θ). Defining b(θ) = ∂l(θ)/∂θ as the score vector of the log-likelihood, the MLE of θ is given by the solution θ̂ to the p = r + s equations b(θ̂) = 0. If the sample size is sufficiently large, the distribution of the maximum likelihood estimator θ̂ can be well approximated by a normal distribution centred at θ_0 whose covariance matrix is the inverse of the expected Fisher information matrix (Equation (8)). Since it is not always trivial to evaluate analytically the expected value of the Hessian matrix, we can also compute the observed Fisher information matrix, which is based on the Hessian evaluated on the sample. By the law of large numbers, the observed information matrix at θ_0 converges in probability to the expected one. In practice, since the true value of θ is not known, these two matrices are obtained by replacing the population parameters with their ML estimates, that is, by evaluating them at θ̂.

Let us start with the observed information matrix of a MAR(r, s) as described in (3). We consider ε_t i.i.d. and distributed according to a generalized Student's t-distribution, with density function given in (11) and with the corresponding approximate log-likelihood function, conditional on y = [y_1, ..., y_T], given in (12). We indicate with ν_0 and η_0 the true values of the degrees of freedom and of the scale parameter, respectively. Instead, σ_0^2 denotes the true value of the variance of the error term which, in a generalized Student's t-distribution with ν_0 > 2, is equal to η_0^2 ν_0/(ν_0 − 2). In this case, the observed information matrix at θ_0 is expressed in terms of vectors of the form (..., v_{t+s}) and of Y_t, an r × s matrix with elements y_{t−i+j} (i = 1, ..., r and j = 1, ..., s). Section 3.1 shows that in mixed causal and noncausal models the expected Fisher information matrix, unlike the observed one, cannot be computed when the population variance is not finite. In Section 5 we evaluate, by means of Monte Carlo simulations, whether the observed Fisher information matrix still allows Equation (8) to hold in this context.
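For readers who want the density and variance of the error term in computable form, here is a small sketch of the standard zero-location Student's t log-density with scale η and of its population variance. The function names are illustrative, not taken from the paper or from the MARX package, and the parameterization (a standard t variate multiplied by η) is assumed to match the generalized Student's t used in the text.

```python
import math


def t_logpdf(x, nu, eta):
    """Log-density of a zero-location Student's t with nu degrees of freedom and scale eta."""
    return (
        math.lgamma((nu + 1.0) / 2.0)
        - math.lgamma(nu / 2.0)
        - 0.5 * math.log(math.pi * nu)
        - math.log(eta)
        - 0.5 * (nu + 1.0) * math.log1p((x / eta) ** 2 / nu)
    )


def t_variance(nu, eta):
    """Population variance eta^2 * nu / (nu - 2); infinite when 1 < nu <= 2."""
    if nu <= 1.0:
        raise ValueError("this sketch assumes nu > 1")
    return eta ** 2 * nu / (nu - 2.0) if nu > 2.0 else math.inf
```

With ν = 3 and η = 5 the variance is 75, while for ν = 1.8 it is infinite, which is the case in which the expected Fisher information matrix discussed in the text is no longer available.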
Lanne and Saikkonen (2011) propose to calculate the asymptotic covariance matrix using a general (Lebesgue) density function, which depends on a parameter vector λ that collects all the distributional parameters (the scale parameter and the degrees of freedom which, exactly as in the previous section, are indicated with η and ν respectively). Furthermore, the model is characterized by an i.i.d. innovation term with finite and constant variance, equal to σ^2. Conditions similar to those of Andrews et al. (2006) must be satisfied. In detail, these are:
(A1) For all x ∈ R and all λ ∈ Λ, f(x, λ) > 0 and f(x, λ) is twice continuously differentiable with respect to (x, λ).
(A2) For all λ ∈ Λ, ∫ x f′(x, λ) dx = −1.
(A6) The matrix Ω is positive definite.
(A7) For j, k = 1, ..., d and all λ ∈ Λ_0, the functions listed in the condition, among them (∂f(x; λ)/∂λ_j)^2 / f^2(x; λ), are dominated by a_1 + a_2 |x|^{c_1}, where a_1, a_2 and c_1 are nonnegative constants and ∫ |x|^{c_1} f_1(x) dx < ∞.

In this Section we relax the assumption that the distributional parameters of the density f are known. Also, we need to introduce some notation used in their paper. Let ζ_t ∼ i.i.d. (0, 1) and define the AR(r) stationary process u*_t by φ_0(L) u*_t = ζ_t and the AR(s) stationary process v*_t by ϕ_0(L) v*_t = ζ_t.

Theorem 1 (Lanne and Saikkonen, 2011). Given conditions (A1)-(A7), there exists a sequence of local maximizers θ̂ = (φ̂, ϕ̂, η̂, ν̂) of l_t(θ) in (7) such that the estimator is consistent and asymptotically normal with a block diagonal covariance matrix, where Ω^−1 is the asymptotic variance-covariance matrix of the distributional parameters.

Lanne and Saikkonen (2008) show in detail how to obtain the Σ matrix. Furthermore, they show that the block diagonality follows from representation (3) and conditions (A2)-(A4). Due to the block diagonality of the covariance matrix of the limiting distribution, the AML estimators (φ̂, ϕ̂) and (ν̂, η̂) are asymptotically independent. The matrix Σ is positive definite if condition (A5) is true (J > 1). To see for which type of distribution (A5) holds, we need to take into consideration Remark 2 of Andrews et al. (2006).
They show, using (A2) and the Cauchy-Schwarz inequality, a lower bound that holds with equality if and only if f is Gaussian. Hence, (A5) is true for non-Gaussian f, since J can be rewritten as an integral in which the density function is a rescaled density function (that is, one with unit variance). In other words, Σ is positive definite if σ^2 J̃ > 1 or, equivalently, if (A5) is true. In our case, ε_t is i.i.d. according to a generalized Student's t-distribution, with f_σ(ε_t, ν, η) defined in (11). It is easy to see that this approach works if and only if ε_t has finite variance. This is the reason why most authors (e.g. Lanne and Saikkonen (2011)) use a standardized Student's t-distribution (such that σ = η) in their empirical applications. When we consider this type of standardized distribution, the structure of the log-likelihood function, unlike (12), ensures convergence only for ν > 2 (hence finite variance). This is the shortcoming of this approach: we cannot take into consideration series with degrees of freedom less than 2. This is restrictive for series such as stock prices, commodity prices, bitcoin, etc., where heavy tails are observed with degrees of freedom that range between 1.3 and 1.9 (without reaching the Cauchy case ν = 1, though). We also observe that the heavier the tails are, the faster the estimator seems to converge (see Hecq et al. (2016)).

Fonseca et al. (2008) consider the linear regression model y_i = X_i′β + ε_i (17), where X_i and β are both p × 1 vectors and the ε_i are i.i.d. following a generalized Student's t-distribution with ν degrees of freedom and a scale parameter η. The first and second derivatives of the corresponding log-likelihood function with respect to β are obtained by applying the product and chain rules. In order to obtain the expected Fisher information I(·) with respect to β, we have to take the expectation of the second derivative and multiply it by −1; Fonseca et al. (2008) provide the resulting expression. Hecq et al.
(2016) adapt the results obtained by Fonseca et al. (2008) to the noncausal model setup. That is, they consider a general MAR(r, s) model where ε_t ∼ t(ν, η). To ensure a similar model setup, Hecq et al. (2016) use representations (5) and (6). They cast these alternative representations of the noncausal model in the form of the original linear representation (17), so that they can compute the standard errors of the causal and noncausal coefficients using the results of Fonseca et al. (2008). In other words, they obtain the standard errors of the causal coefficients using (5), treating the noncausal parameters as known. For the standard errors of the noncausal parameters, they instead use representation (6), treating the causal coefficients as known. This is of course an approximation, which leads to a block diagonal, conditional expected Fisher information matrix (14). For instance, in a MAR(1,1) they obtain separate conditional expected Fisher information matrices for the causal and the noncausal parts, implying both ∂^2 l(·)/∂φ∂ϕ = 0 and ∂^2 l(·)/∂ϕ∂φ = 0; hence I(φ, ϕ) is block diagonal. Obviously, when we invert I(φ, ϕ), we obtain results that differ from those we would have obtained by inverting the complete Fisher information matrix. Hecq et al. (2016) illustrate that this approximation gives mildly satisfactory results. Furthermore, exactly as Lanne and Saikkonen (2011), Hecq et al. (2016) state that this methodology can be applied only if the error term has a finite variance (hence if ν > 2). When the variance is infinite, a closed form solution for the limiting distributions does not exist (see Davis et al. (1992) and Andrews et al. (2013)), and this problem could be overcome by means of bootstrapping and simulation-based methods. In this section, we investigate the conclusions obtained by Hecq et al. (2016) in the finite variance case of the error term.
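A short worked example may help to see why dropping the off-diagonal terms matters once the information matrix is inverted. The symbols a, b and c below are generic entries of a symmetric positive definite 2 × 2 matrix and are chosen for illustration only; they are not the paper's notation.

```latex
I = \begin{pmatrix} a & b \\ b & c \end{pmatrix},
\qquad
I^{-1} = \frac{1}{ac - b^{2}} \begin{pmatrix} c & -b \\ -b & a \end{pmatrix},
\qquad
\left(I^{-1}\right)_{11} = \frac{c}{ac - b^{2}} = \frac{1}{a - b^{2}/c} \;\ge\; \frac{1}{a}.
```

Equality holds only when b = 0. Setting the cross term to zero therefore replaces the variance c/(ac − b^2) by 1/a, which can never be larger, so a block diagonal approximation of this kind can only shrink the implied standard errors.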
In particular, we want to evaluate to what extent the assumption of block diagonality of the conditional expected Fisher information matrix yields misspecified standard errors. Indeed, their approach is implemented in the R package MARX and has been applied in several studies. For this purpose, we compute the empirical density functions of the percentage difference of the standard errors obtained through the two aforementioned approaches. In particular, we analyze the empirical density functions of the variables Z_{φ,i} and Z_{ϕ,j}, which measure this percentage difference for the causal and noncausal coefficients respectively. The data generating process is a MAR(1,1) with a scale parameter η = 5, T = 1000 observations and 10000 replications. In addition, we consider different values of the degrees of freedom (ν_0 = 3, ν_0 = 4 and ν_0 = 5) and different combinations of values for the causal and noncausal coefficients. We conclude that the standard errors proposed by Lanne and Saikkonen (2011) should be used for non-heavy-tailed models. The approximation developed in Hecq et al. (2016), on the other hand, underestimates the standard errors and consequently provides confidence intervals that are too narrow. Furthermore, this underestimation decreases as the degrees of freedom decrease. Hence, although the approach proposed by Hecq et al. (2016) is easy to implement, it should be reserved for cases of heavy-tailed disturbances, or cases where the method of Lanne and Saikkonen (2011) already shows some convergence problems. This happens, due to estimation uncertainty, when the degrees of freedom are small even though the population variance is finite.

In this section we propose a new methodology to compute the standard errors of MAR parameters. It is valid for mixed causal and noncausal models whenever the error term is distributed according to a generalized Student's t-distribution and the sample size is finite.
Although in the heavy-tail framework it is not possible to derive the theoretical limiting distributions of these parameters, Monte Carlo simulations in the next section show that our new estimator empirically satisfies Equation (8) for ν ∈ (1, D], with D < ∞. In Section 3.1, it is stated that the variance of the error term (σ^2) multiplies the block diagonal matrices of the expected Fisher information matrix defined in (13). Since the Student's t-distribution with heavy-tailed innovations is characterized by an undefined variance, the expected Fisher information matrix cannot be computed in this context. Our alternative strategy consists in replacing the variance of the error term with the variance of the residuals (σ̂^2) in (13).

Figure 3: Density plots of the variables Z_φ and Z_ϕ, based on 3 degrees of freedom and T = 1000 observations.

Furthermore, especially in those cases where the population variance is not finite (ν ∈ (1, 2]), we expect the residuals to take a wide range of values. In order to decrease the effect of huge outliers, we estimate the variance of the residuals using the Median Absolute Deviation (MAD) robust estimator, MAD = median_t |ε̂_t − median(ε̂)| (24), and a consistent estimate of the standard deviation is given by σ̂ = k · MAD (25). Rousseeuw et al. (1993) show that, if we set k to 1.48, Equation (25) is a consistent estimator of the standard deviation for Gaussian data. In our setting the residuals follow a generalized Student's t whose behaviour depends on ν and T, which implies that k is also a function of ν and T, as defined in Equation (26). In other words, k is a random variable with different density functions depending on the values of ν and T. Supposing ν = 1.8 and T = 500, to obtain the empirical density function of k(1.8, 500) we run a Monte Carlo experiment in which the error term is simulated setting ν = 1.8 and T = 500. In each replication we compute the value of k using Equation (26). In this way, the Monte Carlo experiment yields as many values of k as the number of replications. To recover the empirical density function of k from these values, we use kernel density estimation.
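The Monte Carlo construction of k(ν, T) can be sketched as follows. Equation (26) is not reproduced in this text, so the sketch assumes that k is the ratio of the sample standard deviation to the MAD within each replication; that definition, the function names and the use of a unit scale (the scale cancels in the ratio) are assumptions for illustration, not the authors' implementation.

```python
import math
import random
import statistics


def draw_k(rng, nu, T):
    """One replication: sample standard deviation divided by the MAD of T Student's t draws."""
    eps = []
    for _ in range(T):
        z = rng.gauss(0.0, 1.0)
        chi2 = rng.gammavariate(nu / 2.0, 2.0)  # chi-square with nu degrees of freedom
        eps.append(z / math.sqrt(chi2 / nu))
    med = statistics.median(eps)
    mad = statistics.median(abs(e - med) for e in eps)
    return statistics.stdev(eps) / mad


def simulate_k(nu, T, replications, seed=0):
    """Return one value of k per Monte Carlo replication."""
    rng = random.Random(seed)
    return [draw_k(rng, nu, T) for _ in range(replications)]
```

For a very large ν the draws are close to Gaussian and the simulated values should cluster near the familiar constant of about 1.48, while for ν = 1.8 they are far more dispersed, which is why a summary of their distribution is needed rather than a single fixed constant.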
Extreme values of k can affect the non-parametric estimation; to avoid this, we keep only the values of k within a range defined by Q1, Q3 and IQR, where Q1 and Q3 are respectively the first and the third quartile of k and IQR is its interquartile range. With ν = 1.8, T = 500 and N = 700,000 replications, we obtain the empirical density function shown in Figure 4. A large number of replications is important to obtain an empirical density function that is as accurate as possible. Finally, in order to obtain a robust estimate of the standard deviation of the residuals, we take the mode of k, denoted k*. Appendix A provides values of k* for other ν and T. In conclusion, this approach gives us a Fisher information matrix in which the variance of the error term is replaced by σ̂^2, obtained from Equation (25) with k = k*.

So far we have seen that, for mixed causal and noncausal models with an innovation term distributed according to a generalized Student's t-distribution, it is not possible to derive the theoretical limiting distribution in the heavy-tail framework. This section focuses on identifying, through Monte Carlo experiments, which of the aforementioned estimators of the standard errors satisfy Equation (8) in finite samples. As previously stated, we also include in the analysis the standard errors obtained from the observed Fisher information matrix (I(φ̂, ϕ̂)). For this purpose, we run several Monte Carlo simulations with N = 10000 replications each. The data generating process is a MAR(1,1) with a scale parameter η_0 = 3 and sample sizes T = (100, 200, 500, 1000). We also consider several degrees of freedom ν_0 = (3, 1.8, 1.5, 1.2) and different combinations of causal and noncausal coefficients. For each replication we test whether the estimated causal and noncausal coefficients are equal to their respective true values.
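The test applied in each replication can be summarized by an empirical rejection frequency. The sketch below assumes a two-sided test at the 5% level with the standard normal critical value 1.96, in line with the normal approximation in Equation (8); the function name and argument layout are illustrative.

```python
def rejection_frequency(estimates, std_errors, true_value, critical=1.96):
    """Share of replications in which |(estimate - true_value) / se| exceeds the critical value."""
    rejections = 0
    for est, se in zip(estimates, std_errors):
        t_stat = (est - true_value) / se
        if abs(t_stat) > critical:
            rejections += 1
    return rejections / len(estimates)
```

If the standard errors are well calibrated, this frequency should be close to 0.05 across replications; values well above 0.05 indicate standard errors that are too small.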
In particular, we compute two different t-tests: H_0: φ = φ_0 and H_0: ϕ = ϕ_0 against the two-sided alternatives φ ≠ φ_0 and ϕ ≠ ϕ_0, respectively. Tables 1-14 show the empirical rejection frequencies (at the 5% nominal significance level) obtained using the different methodologies to compute the standard errors. In particular, the columns Σ and I(φ̂, ϕ̂) report the empirical rejection frequencies when the standard errors are obtained from the matrices (27) and (23) respectively. We observe that for ν > 2 (Tables 1-4), the method of Hecq et al. (2016) yields an empirical distribution of the t-statistic with fatter tails than a standard normal distribution. The reason is that the denominator of the t-statistic contains underestimated standard errors (see Section 3.3). We also observe that our new approach and the observed Fisher information matrix show fewer distortions for small sample sizes (T = 100, T = 200) than the expected Fisher information matrix. The latter only gets close to the 5% nominal rejection frequency for T = 1000. For ν ≤ 2 (Tables 5-14) the expected Fisher information matrix of the causal and noncausal parameters (Σ) cannot be derived. On the other hand, we lose the normality of these parameters whenever the standard errors are computed through the observed Fisher information matrix. The diagonal conditional expected Fisher information matrix of Hecq et al. (2016) performs better than the observed Fisher information matrix, but the results are still far from those that we would have obtained under a standard normal distribution. Our new approach is the only one that allows us to empirically satisfy Equation (8). Also, this new method provides slightly smaller empirical rejection frequencies for high values of the causal and noncausal coefficients. This is not an issue of great relevance in terms of inference, as high values are likely to be significantly different from zero.

We illustrate the differences and the similarities in the computed standard errors of MAR models on four time series.
These are: (a) the annual debt to GDP ratio for Canada from 1870 to 2015 (source: IMF), (b) the variation of daily Covid-19 deaths in Belgium from 10 March 2020 to 17 July 2020 (source: WHO), (c) monthly wheat prices from January 1990 until September 2020 (source: IMF) and (d) the monthly inflation rate in Brazil (obtained as the year-on-year difference of the IPCA index) observed from January 1997 to June 2020 (source: Central Bank of Brazil). Figure 5 presents the data. With this panel of applications we want to show that MAR models are also of interest for modeling series other than the usual commodity prices. Estimating MAR models involves a series of steps. We first estimate a conventional causal autoregressive model by OLS in order to obtain the lag order p using information criteria (see Lanne and Saikkonen (2011)). We find p = 2 for three out of the four series, namely inflation, the debt to GDP ratio and wheat prices, whereas p = 4 is chosen for the Belgian Covid-19 series. Using an AML approach and searching for the r and s with p = r + s that maximize the generalized Student's t likelihood function, we find that Canadian debt/GDP, wheat prices and Brazilian inflation follow a MAR(1,1) and the variation of Covid-19 deaths a MAR(2,2). We detail next the values of the estimated parameters and their standard errors obtained using the methods reviewed and newly introduced in this paper. From our simulation results, we can expect some differences and similarities given the degrees of freedom estimated for the four variables: ν̂ = 2.37 for the Canadian series, ν̂ = 1.17 for the Covid-19 data, ν̂ = 2.21 for wheat prices and ν̂ = 3.22 for Brazilian inflation. Although we observe fat tails in each series, it is only for the daily Belgian data that the estimated degrees of freedom are below 2. However, none of them is significantly different from two.
To check this, we use the standard errors given by −(T − p)^−1 ∂^2 l_T(φ̂, ϕ̂, θ̂_2)/∂θ_2 ∂θ_2′, with θ̂_2 = (ν̂, η̂), which is a consistent estimator of the expected Fisher information matrix of the distributional parameters Ω (see Lanne and Saikkonen (2011)). This matrix, unlike Σ, has no restrictions and can be computed also when the population variance is not finite.

Tables: Empirical rejection frequencies, MAR(1,1), by sample size, with columns Σ and I(φ̂, ϕ̂). Surviving captions:
φ_0 = 0.65, ϕ_0 = 0.35, ν_0 = 3
φ_0 = 0.5, ϕ_0 = 0.5, ν_0 = 3
φ_0 = 0.35, ϕ_0 = 0.65, ν_0 = 3
φ_0 = 0, ϕ_0 = 0, ν_0 = 1.8
φ_0 = 0.65, ϕ_0 = 0.35, ν_0 = 1.8
φ_0 = 0.5, ϕ_0 = 0.5, ν_0 = 1.8
φ_0 = 0.35, ϕ_0 = 0.65, ν_0 = 1.8
φ_0 = 0, ϕ_0 = 0, ν_0 = 1.5
φ_0 = 0.5, ϕ_0 = 0.5, ν_0 = 1.5
φ_0 = 0.35, ϕ_0 = 0.65, ν_0 = 1.5

... to the approach used to compute the standard errors. In the empirical application concerning the Canadian debt (Table 17), the confidence intervals are too narrow whenever we compute the standard errors using the methodology of Hecq et al. (2016) or of Lanne and Saikkonen (2011). Instead, in the table associated with the Brazilian inflation rate (Table 20), we notice that the standard errors obtained through the robust estimator of the residuals are larger (and consequently so are the confidence intervals) than those obtained by the "traditional" methodologies described in Section 3. The same is true for the causal coefficient in the wheat prices (Table 19).
For the noncausal coefficient of the same empirical application, we obtain confidence intervals that are too narrow when the standard error is computed by I(φ̂, ϕ̂) and by Σ. Finally, the time series related to the variation of Covid-19 deaths in Belgium (Table 18) is characterized by an error term with an undefined variance. Our method yields narrower confidence intervals than those obtained using Hecq et al. (2016) and the observed Fisher information matrix.

In this paper we first review the behaviour of the ML estimator for mixed causal and noncausal models. In particular, we focus on those with an error term distributed according to a generalized Student's t-distribution. We have seen that the expected Fisher information matrix of the causal and noncausal parameters (derived by Lanne and Saikkonen (2011)) can be computed if and only if the probability density function satisfies a certain set of assumptions. The generalized Student's t-distribution with an infinite variance (ν ∈ (1, 2]) does not meet one of these, and hence this methodology is not applicable in this context. This is a strong limitation, since time series such as commodity prices, bitcoin, etc. feature heavy tails with degrees of freedom less than 2. Hecq et al. (2016) propose a new and easier way to compute the standard errors of these parameters that is valid only when the population has a finite variance. It is also implemented in the R package MARX and has been applied in several studies. However, through a simulation study, we show that this approach leads to the underestimation of the standard errors due to the strong restrictions imposed on the expected Fisher information matrix. To overcome these problems, we propose an alternative way to compute the standard errors from the expected Fisher information matrix, based on a simple nonparametric estimator of the variance of the residuals.
Monte Carlo simulations show the good performance of this new estimator, even when the variance of the population is not finite. We estimate MAR models on four macroeconomic and financial time series and we illustrate the differences in the estimated standard errors using the different approaches.

Non-fundamentalness in structural econometric models: A review
Model identification for infinite variance autoregressive processes
Maximum likelihood estimation for all-pass time series models
Maximum likelihood estimation for noncausal autoregressive processes
Bootstrapping noncausal autoregressions: with applications to explosive bubble modeling
Limit theory for moving averages of random variables with regularly varying tail probabilities
M-estimation for autoregressions with infinite variance
Mixed causal-noncausal AR processes and the modelling of explosive bubbles
Time series analysis
Mixed causal-noncausal autoregressions with exogenous regressors
Identification of mixed causal-noncausal models in finite samples. Annals of Economics and Statistics/Annales d'Économie et de Statistique
Predicting bubble bursts in oil prices using mixed causal-noncausal models
Forecasting bubbles with mixed causal-noncausal autoregressive models
Modeling expectations with noncausal autoregressions
Noncausal autoregressions for economic time series
Noncausal vector autoregression
Alternatives to the median absolute deviation