key: cord-0856341-vnjcqrgx
authors: Iyke, Bernard Njindan; Ho, Sin-Yu
title: Stock return predictability over four centuries: the role of commodity returns
date: 2020-08-01
journal: Financ Res Lett
DOI: 10.1016/j.frl.2020.101711
sha: 55030ad89f6ac02bffef405b28efdce582771504
doc_id: 856341
cord_uid: vnjcqrgx

We merge two unique historical datasets on commodity and stock prices covering four centuries and three leading stock markets (Netherlands, UK, and US) to show that, consistent with theoretical predictions, commodity returns can predict stock returns. We show that about 64% and 56% of the commodity returns can predict stock returns in- and out-of-sample, respectively. Aggregating commodity returns by market, returns from agriculture, energy, and livestock and meat markets appear to consistently predict stock returns. These results are robust to recessions and expansions.

We test the theoretical prediction that commodity returns contain essential information to forecast stock returns (see Jacobsen, Marshall, and Visaltanachoti, 2019). Theory suggests that increasing commodity returns are associated with increasing inflation and interest rates and, consequently, bearish stock markets (Black, Klinkowska, McMillan, and McMillan, 2014). Besides, commodities are recognized as safe havens for diversifying investments away from equities. Gorton and Rouwenhorst (2006) contend that commodity and stock returns are negatively correlated due to their behaviour over the business cycle. During the early stages of a recession, commodity returns are positive, whereas stock returns are negative; this trend reverses during the latter stages. However, during an expansion phase, both returns tend to move in the same direction since they indicate the direction of the economy.

If the correlation translates into causation, commodity returns can help predict stock returns. To test this hypothesis, we use two unique datasets covering four centuries and three leading stock markets (Netherlands, UK, and US). Specifically, we consider stock returns and 25 commodity return series over the period of 1629 to 2005. We show that commodity returns contain useful information to forecast stock returns. We find that approximately 64% and 56% of the commodity returns can predict stock returns in-sample and out-of-sample, respectively. Aggregating commodity returns by market, returns from agriculture, energy, and livestock and meat markets appear to predict stock returns. These results are robust to recessions and expansions.

The literature focuses on the role played by financial and macroeconomic variables, such as interest rates, dividend yields, and the consumption-wealth ratio, in predicting stock returns (see Ang and Bekaert, 2007; Campbell and Thompson, 2008; Welch and Goyal, 2008; Golez and Koudijs, 2018). By comparison, the ability of commodity returns to forecast stock returns is not extensively investigated. The main studies considering the role of commodity returns in forecasting stock returns are Black et al. (2014) and Jacobsen et al. (2019). Jacobsen et al. (2019) consider a single commodity return index, while Black et al. (2014) consider seven commodity-return indexes over a relatively shorter period.

We contribute to the scarce literature in three distinct ways. First, our analysis covers four centuries. Such an extensive examination improves the statistical power to reject the null of no return predictability because it introduces independent variation to the data. Second, unlike prior studies, we consider 25 distinct commodity returns as well as four commodity market returns, namely metals, energy, livestock and meat, and agriculture. In addition, we examine three leading stock markets. This provides a rich picture of commodity returns as a predictor of stock returns. Finally, we advance prior studies by simultaneously addressing the heteroskedasticity, persistency, and endogeneity that often feature in time series data.

The paper proceeds as follows. Section 2 outlines our model and data. Section 3 presents the results and robustness tests. Section 4 concludes the paper.

Our model connecting stock returns to commodity returns is as follows:

$$r_t = \alpha + \beta x_{t-1} + \gamma \Delta x_t + \varepsilon_t \qquad (1)$$

where $r_t$ and $x_t$ are, respectively, stock and commodity returns; $\alpha$, $\beta$, and $\gamma$ are parameters of the model; $t$ is the time subscript; and $\varepsilon_t$ is the error term. The model controls for endogeneity by including the lag of $x_t$ (see Westerlund and Narayan, 2012), and deals with persistency by including the first difference of $x_t$ (see Westerlund and Narayan, 2015). Because the variance of $\varepsilon_t$ is likely heteroskedastic, we model it as following an autoregressive conditional heteroskedastic structure:

$$\sigma^2_{\varepsilon,t} = \lambda_0 + \sum_{j=1}^{q} \lambda_j \varepsilon^2_{t-j} \qquad (2)$$

where $\sigma^2_{\varepsilon,t} = \mathrm{var}(\varepsilon_t \mid I_{t-1})$, $I_{t-1}$ is the information available at $t-1$, the $\lambda_j$s are parameters, and $q$ is the optimal lag. The predicted values of $\sigma^2_{\varepsilon,t}$ are used as the weights in the generalised least squares estimation of Eq. (1). Commodity returns predict stock returns in-sample if $\beta$ is significantly different from zero.
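As a rough illustration of the estimation idea described above, the following Python sketch runs a simple two-step feasible GLS: ordinary least squares on the predictive regression, an ARCH-type regression of squared residuals on their own lags, and a weighted re-estimation. It is a simplified sketch under stated assumptions, not the bias-adjusted estimator used in the paper. The function name `arch_weighted_fgls`, the default lag `q=1`, and the variance floor are all hypothetical choices made for illustration.

```python
import numpy as np

def arch_weighted_fgls(r, x, q=1):
    """Two-step FGLS sketch for r_t = a + b*x_{t-1} + c*(x_t - x_{t-1}) + e_t.

    Assumption: a plain ARCH(q) regression supplies the weights; no bias
    adjustment is applied, unlike the estimator referenced in the text.
    Returns the coefficient vector [a, b, c].
    """
    r = np.asarray(r, dtype=float)
    x = np.asarray(x, dtype=float)
    y = r[1:]                                   # r_t for t = 1..T-1
    X = np.column_stack([np.ones(len(y)), x[:-1], np.diff(x)])

    # Step 1: OLS residuals from the predictive regression.
    b_ols, *_ = np.linalg.lstsq(X, y, rcond=None)
    e2 = (y - X @ b_ols) ** 2

    # Step 2: regress squared residuals on a constant and q of their lags.
    n = len(e2)
    Z = np.column_stack([np.ones(n - q)] +
                        [e2[q - j:n - j] for j in range(1, q + 1)])
    lam, *_ = np.linalg.lstsq(Z, e2[q:], rcond=None)
    # Assumed floor keeps the fitted variances strictly positive.
    sigma2 = np.maximum(Z @ lam, 0.1 * e2.mean())

    # Step 3: weighted least squares with weights 1/sigma_t.
    w = 1.0 / np.sqrt(sigma2)
    b_fgls, *_ = np.linalg.lstsq(X[q:] * w[:, None], y[q:] * w, rcond=None)
    return b_fgls
```

In this sketch the second element of the returned vector plays the role of the coefficient on the lagged commodity return, whose significance is what the in-sample test examines. Standard errors are omitted to keep the example short.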

The stock returns data are from Golez and Koudijs (2018) and cover three leading markets: Netherlands and UK (1629-1812), UK (1813-1870), and US (1871-2015). These periods represent key events. Amsterdam was the leading financial center, followed by London, during the 1629-1812 period (Neal, 1990). Similarly, during the 1813-1870 period, London was the leading financial center (Hickson et al., 2011). Finally, during the 1871-2015 period, the US transitioned to the leading economy and New York became the leading financial center (Golez and Koudijs, 2018). The full sample period merges these three samples, thus covering the period 1629 to 2015. We calculate stock returns as $r_t = \ln[(P_t + D_t)/P_{t-1}]$, where $\ln$, $P_t$, and $D_t$ are, respectively, the natural logarithm operator, stock price, and dividend. We calculate commodity returns as $x_t = \ln(C_t/C_{t-1})$, where $C_t$ is commodity price. We collect data on 25 commodity price indexes over the period from 1650 to 2005 from Harvey, Kellard, Madsen and Wohar (2010). Naturally, it would be interesting to extend the data to the present in order to capture recent events, such as the global financial crisis of 2007-2008, the Russia-Saudi Arabia oil price war of 8 March 2020, and the current COVID-19 pandemic (see Iyke, 2020a,b). However, while it is quite straightforward to extend the stock returns data, this is not the case for the commodity returns data, since the manufacturing value-added price index used to deflate the commodity prices is only available to 2005 (see Harvey et al., 2010). Given this issue, combining the two datasets yields a sample period of 1650 to 2005.
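The return calculations described above can be sketched in a few lines of Python. This is a minimal illustration that assumes the stock return is the log of price plus dividend over the previous price, and that the commodity return is the log price change; the function names and sample numbers are hypothetical.

```python
import math

def log_total_returns(prices, dividends):
    """Log returns including dividends: ln((P_t + D_t) / P_{t-1})."""
    return [math.log((prices[t] + dividends[t]) / prices[t - 1])
            for t in range(1, len(prices))]

def log_price_returns(prices):
    """Log price changes: ln(C_t / C_{t-1})."""
    return [math.log(prices[t] / prices[t - 1])
            for t in range(1, len(prices))]

# Hypothetical annual observations, for illustration only.
stock_returns = log_total_returns([100.0, 110.0, 99.0], [0.0, 5.0, 1.0])
commodity_returns = log_price_returns([2.0, 4.0])
```

Each function returns one fewer observation than it receives, because the first period has no previous price.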

We calculate aggregate commodity return indicators for the four main commodity markets, namely agriculture (including banana, cocoa, coffee, cotton, jute, rice, sugar, tea, tobacco, and wheat), energy (coal and crude oil), livestock and meat (including beef, hide, lamb, pig iron, and wool), and metals (such as aluminium, copper, gold, lead, nickel, silver, tin, and zinc), using principal component analysis. We do not outline this procedure, since it is well known. Table 1 reports summary statistics on stock returns, the 25 commodity returns, and four aggregate (market) commodity returns for the full sample period. Annual average stock return is 6%, while annual volatility is 15%. On average, 36% of the commodity returns are positive, while the remaining 64% are negative. The commodity returns are also volatile, with tea returns the most volatile (43%). At the market level, the commodities averaged positive returns with a very high volatility.
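Since the aggregation procedure is not spelled out, the following Python sketch shows one common way to build a market-level indicator with principal component analysis. It assumes the indicator is the first principal component of standardized returns and that the panel has no missing observations; neither assumption is confirmed by the text, and the function name is hypothetical.

```python
import numpy as np

def first_principal_component(returns):
    """First principal component of a (T x k) panel of commodity returns.

    Assumptions: columns are standardized before extraction and the panel
    is complete. Returns (scores, loadings, share of variance explained).
    """
    R = np.asarray(returns, dtype=float)
    Z = (R - R.mean(axis=0)) / R.std(axis=0, ddof=1)
    _, s, Vt = np.linalg.svd(Z, full_matrices=False)
    loadings = Vt[0]
    # The sign of a principal component is arbitrary; orient it so that
    # the loadings sum to a positive number.
    if loadings.sum() < 0:
        loadings = -loadings
    scores = Z @ loadings
    explained = s[0] ** 2 / np.sum(s ** 2)
    return scores, loadings, explained
```

For a market such as energy, `returns` would hold one column per commodity in that market, and `scores` would serve as the aggregate return series.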

There is strong evidence against unit roots in all returns, using the Augmented Dickey-Fuller (ADF) test. However, the autoregressive coefficient of order one (AR(1)) suggests that two returns (metals and agriculture) are persistent. The autoregressive conditional heteroskedasticity (ARCH) test results suggest ARCH effects in three commodity returns. Finally, there is evidence of endogeneity in 10 out of the 25 commodity returns. Our framework controls for these statistical features of the returns.
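Two of the diagnostics mentioned above can be sketched in Python as follows. This is an illustrative sketch only: the AR(1) coefficient comes from a least squares regression of the series on its own lag, and the ARCH check is an Engle-type LM statistic computed from the demeaned series. The function names and the default lag are assumptions, and the paper's exact test settings are not known.

```python
import numpy as np

def ar1_coefficient(x):
    """Slope from regressing x_t on a constant and x_{t-1}."""
    x = np.asarray(x, dtype=float)
    X = np.column_stack([np.ones(len(x) - 1), x[:-1]])
    coef, *_ = np.linalg.lstsq(X, x[1:], rcond=None)
    return coef[1]

def arch_lm_statistic(x, q=1):
    """Engle-type LM statistic: (number of obs) * R^2 from regressing the
    squared demeaned series on q of its own lags. Under the null of no
    ARCH effects it is approximately chi-squared with q degrees of freedom.
    """
    x = np.asarray(x, dtype=float)
    e2 = (x - x.mean()) ** 2
    n = len(e2)
    Z = np.column_stack([np.ones(n - q)] +
                        [e2[q - j:n - j] for j in range(1, q + 1)])
    y = e2[q:]
    coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
    resid = y - Z @ coef
    r2 = 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
    return (n - q) * r2
```

An AR(1) coefficient close to one signals persistence, while a large LM statistic relative to the chi-squared critical value signals ARCH effects.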

We examine the in-sample predictive power of commodity returns by estimating Eq. (1). Table 2 shows these results. We chose the subsamples to mark the distinct periods when the Netherlands, the UK, and the US became the leading international financial centers, consistent with prior work (see Golez and Koudijs, 2018). From Table 2, a maximum of 16 commodities can predict stock returns, which is substantial given a total of 25 commodities in our sample. Considering the entire sample period, over half of the commodities (i.e. 52%) can predict stock returns. The maximum predictability (i.e. 64%) is observed during the 1813-1870 subsample period, followed by 1945-2005 (60%), 1650-1812 (52%), and 1700-1812 (52%). The weakest subsample predictability periods are 1871-2005 (12%) and 1871-1945 (28%). This shows evidence of temporal or time-dependent predictability of stock returns, which is consistent with prior studies, such as Westerlund and Narayan (2015) and Golez and Koudijs (2018), which document time-varying predictability of asset returns.

The results indicate that commodities such as beef, coal, cocoa, copper, gold, jute, silver, tobacco, wheat, and zinc are the most consistent predictors of stock returns. In terms of markets, returns from agricultural and energy markets consistently predict stock returns.

We perform the out-of-sample predictability tests using two widely recognised measures of predictive accuracy, namely the out-of-sample R-squared ($R^2_{OOS}$) of Campbell and Thompson (2008) and Theil's ratio. Based on these measures, we compare the predictive accuracy of our model to the following historical average model:

$$r_t = \alpha + \varepsilon_t \qquad (3)$$

The statistic for testing the predictive accuracy of Eq. (1) relative to Eq. (3) is:

$$R^2_{OOS} = 1 - \frac{MSE_1}{MSE_0} \qquad (4)$$

where $MSE_0$ and $MSE_1$ denote the mean squared error of the historical average model and our model, respectively. Our model is more accurate relative to the historical average if $R^2_{OOS}$ is greater than zero (i.e. $R^2_{OOS} > 0$). Similarly, Theil's ratio can be stated as follows:

$$RU = \frac{U_1}{U_0} \qquad (5)$$

where $U_0$ and $U_1$ denote Theil's statistic of the historical average model and our model, respectively; $RU$ denotes the relative Theil's ratio; each Theil's statistic is computed from the actual values $r_t$ and forecast values $\hat{r}_t$ of stock returns; and $t$ and $T$ denote, respectively, time period and total observations. Our model outperforms the historical average if Theil's ratio is less than one (i.e. $RU < 1$).
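The two accuracy measures described above can be sketched in plain Python. The out-of-sample R-squared follows the one-minus-ratio-of-mean-squared-errors form. For Theil's statistic, the exact expression used in the paper is not recoverable here, so the sketch assumes the bounded U1 form (root mean squared error scaled by the root mean squares of actual and forecast values). All function names are hypothetical.

```python
import math

def mse(actual, forecast):
    """Mean squared forecast error."""
    return sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual)

def oos_r2(actual, model_fc, bench_fc):
    """Out-of-sample R-squared: 1 - MSE(model) / MSE(benchmark)."""
    return 1.0 - mse(actual, model_fc) / mse(actual, bench_fc)

def theil_u1(actual, forecast):
    """Theil's U1 (assumed form): RMSE over the sum of root mean squares."""
    n = len(actual)
    rmse = math.sqrt(mse(actual, forecast))
    scale = (math.sqrt(sum(a * a for a in actual) / n)
             + math.sqrt(sum(f * f for f in forecast) / n))
    return rmse / scale

def relative_theil(actual, model_fc, bench_fc):
    """Ratio of the model's Theil statistic to the benchmark's."""
    return theil_u1(actual, model_fc) / theil_u1(actual, bench_fc)
```

A positive `oos_r2` and a `relative_theil` below one both indicate that the candidate model forecasts better than the benchmark.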

Unlike the in-sample analysis, the predictive accuracy for the out-of-sample analysis is based on the sample periods 1871-2005 and 1629-2005, to ensure adequate observations for the in-sample estimations. We use the first 50% of the sample to estimate both models and the remaining 50% for the out-of-sample predictions. Table 3 shows the out-of-sample R-squared ($R^2_{OOS}$) and Theil's statistics for a one-year ahead prediction of stock returns. The $R^2_{OOS}$ statistic suggests that 28% of the commodities predict returns for the sample periods 1871-2005 and 1629-2005. Using Theil's statistic, we find that 20% and 56% of the commodities predict returns for the sample periods 1871-2005 and 1629-2005, respectively. The moderate out-of-sample performance of the 1871-2005 period compares well with the in-sample performance. Aggregating commodity returns by market, returns from energy, and livestock and meat markets appear to predict stock returns out-of-sample.
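The 50/50 split and one-year-ahead forecasts described above can be sketched as follows. The text does not say whether the estimation window is fixed or updated, so this sketch assumes a recursive (expanding) window. It also assumes a reduced forecasting equation with only a constant and the lagged commodity return, estimated by ordinary least squares, because the contemporaneous first difference is not known when the forecast is made. The function name is hypothetical.

```python
import numpy as np

def recursive_oos_r2(r, x, split=0.5):
    """Out-of-sample R-squared from one-step-ahead recursive forecasts.

    Assumptions: expanding estimation window starting at `split` of the
    sample; forecasting model r_t = a + b*x_{t-1}; benchmark is the
    historical mean of returns up to the forecast origin.
    """
    r = np.asarray(r, dtype=float)
    x = np.asarray(x, dtype=float)
    y, z = r[1:], x[:-1]              # pair r_t with x_{t-1}
    n = len(y)
    start = int(n * split)
    err_model, err_hist = [], []
    for t in range(start, n):
        X = np.column_stack([np.ones(t), z[:t]])
        coef, *_ = np.linalg.lstsq(X, y[:t], rcond=None)
        err_model.append(y[t] - (coef[0] + coef[1] * z[t]))
        err_hist.append(y[t] - y[:t].mean())
    mse_model = np.mean(np.square(err_model))
    mse_hist = np.mean(np.square(err_hist))
    return 1.0 - mse_model / mse_hist
```

Only information available before each forecast date enters the estimation, which is what makes the comparison with the historical average genuinely out-of-sample.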

Stock return predictability depends on whether the economy is in a recession or expansion (Golez and Koudijs, 2018). Our baseline in-sample predictability results may be driven by recessionary and expansionary pressures. We account for this by introducing into our model a recession dummy, which equals one if the economy is in a recession, and zero otherwise. The results, which appear in Table 4, are consistent with the baseline. In fact, the predictive power of commodity returns improves to a maximum of 68%, thus strengthening our conclusion. These results also suggest time-dependent predictability of stock returns. Table 5 reports similar out-of-sample results. As before, we use the first 50% of the sample to estimate both models and the remaining 50% for the out-of-sample predictions. We include the recession dummy and generate one-period ahead forecasts. The out-of-sample R-squared ($R^2_{OOS}$) statistic shows that 24% and 48% of the commodities predict returns for the sample periods 1871-2005 and 1629-2005, respectively. Theil's statistic shows that 80% of the commodities predict returns for the sample periods 1871-2005 and 1629-2005. Consistent with the in-sample predictability, we find that the predictive power of commodity returns improves to a maximum of 80%, controlling for business cycles. In addition to energy and livestock and meat markets, returns from the metal market predict stock returns out-of-sample.

We test the hypothesis that commodity returns can predict stock returns. Using two unique historical datasets on commodity and stock prices covering four centuries and three leading markets, we show that commodity returns do predict stock returns. We show that 64% and 56% of the commodity returns can predict stock returns in-sample and out-of-sample, respectively. Commodity returns from agriculture, energy, and livestock and meat markets appear to consistently predict stock returns. Our results are robust to recessions and expansions. Our estimates imply that investors can potentially enhance their trading strategies by exploiting commodity return information. Similarly, our estimates imply that analysts can better forecast the stock market by considering commodity prices. Although we do incorporate business cycles in our framework, we do not include the popular predictors of stock returns, such as interest rates, dividend yields, and the consumption-wealth ratio. Further studies should consider these variables to substantiate the strength of commodity returns as a predictor of stock returns.

The statistics are mean value (Mean), standard deviation (SD), Augmented Dickey-Fuller test (ADF), autoregressive coefficient of order one (AR(1)), AR conditional heteroskedasticity (ARCH) effect test, and endogeneity test (ENDO). For endogeneity, we test whether the slope coefficient $\theta$ in the regression of $\varepsilon_t$ on $\eta_t$ is zero, where $\varepsilon_t$ is the error term from our predictive regression and $\eta_t$ is the error term from the AR(1) regression of the predictor $x_t$. We report $\theta$ in the table. There is endogeneity if we reject the null hypothesis that $\theta = 0$. N/A, ***, **, and * denote, respectively, non-applicable and the statistical significance at the 1%, 5%, and 10% levels.

This table reports results on stock returns predictability using 25 commodity returns as predictors. The predictive regression model is estimated using a bias-adjusted feasible generalised least squares estimator. We report the coefficients of the predictors. *, **, and *** are significance at the 10%, 5% and 1% levels, respectively. Coeff., p-val, and N/A denote, respectively, coefficient, p-value, and non-applicable due to insufficient observations.

This table reports out-of-sample evaluations using relative out-of-sample R-squared (OOS) and Theil's U statistic (U). We set the out-of-sample period equivalent to 50% of the sample size. *, **, and *** are significance at the 10%, 5%, and 1% levels, respectively.

This table reports out-of-sample evaluations using relative out-of-sample R-squared (OOS) and Theil's U statistic (U) and controlling for recessions. We set the out-of-sample period equivalent to 50% of the sample size. *, **, and *** are significance at the 10%, 5%, and 1% levels, respectively.

Stock return predictability: Is it there?

Forecasting stock returns: do commodity prices help?

Predicting excess stock returns out of sample: Can anything beat the historical average?

Four centuries of return predictability

Facts and fantasies about commodity futures

The Prebisch-Singer hypothesis: four centuries of evidence

The rate of return on equity across industrial sectors on the British stock market

COVID-19: The reaction of US oil and gas producers to the pandemic

The disease outbreak channel of exchange rate return predictability: Evidence from COVID-19. Emerging Markets Finance and Trade

Stock market predictability and industrial metal returns

The rise of financial capitalism

A comprehensive look at the empirical performance of equity premium prediction

Does the choice of estimator matter when forecasting returns?

Testing for predictability in conditionally heteroskedastic stock returns