key: cord-0525070-5q6yul2q
authors: Levy, Bruno P. C.; Lopes, Hedibert F.
title: Trend-Following Strategies via Dynamic Momentum Learning
date: 2021-06-15
journal: nan
DOI: nan
sha: 4ef7a9379a17f414a2d408fb1fddc51cb966cc19
doc_id: 525070
cord_uid: 5q6yul2q

Time series momentum strategies are widely applied in the quantitative financial industry, and academic research on them has grown rapidly since the work of Moskowitz, Ooi and Pedersen (2012). However, trading signals are usually obtained via simple observation of past return measurements. In this article we study the benefits of incorporating dynamic econometric models to sequentially learn the time-varying importance of different look-back periods for individual assets. By using a dynamic binary classifier model, the investor is able to switch between time-varying and constant relations between past momentum and future returns, dynamically combining or selecting different momentum speeds during turning points, which improves trading-signal accuracy and portfolio performance. Using data from 56 futures contracts, we show that a mean-variance investor would be willing to pay a considerable management fee to switch from the traditional naive time series momentum strategy to the dynamic classifier approach.

A significant part of the hedge fund industry nowadays is based on managed futures funds, also known as Commodity Trading Advisors (CTAs). As shown by Hurst, Ooi, and Pedersen (2013), the returns of these funds are usually explained by simple trend-following (also known as time-series momentum) strategies on futures contracts. These strategies exploit the ability of past returns to anticipate future return movements. The work of Moskowitz, Ooi, and Pedersen (2012) was the first to document the ability of time-series momentum strategies to generate significant profits over time and across different futures markets, contradicting the random-walk theory, under which no past information is able to predict future returns. The basic idea of such a strategy is to vary the position in an individual asset based on signals from its past returns over a specific look-back period (traditionally, from one to twelve months). The investor therefore goes long during periods of positive trend and goes short during periods of downtrend.

The time-series momentum strategy is related to, but different from, the cross-sectional momentum strategy (Jegadeesh and Titman, 1993; Asness et al., 2013). The cross-sectional approach exploits the relative performance among different assets, buying those assets with higher past performance (winners) and selling those with lower performance (losers). Hence, even a security with a positive but low past return can be sold if its peers have recently performed better. Time-series momentum, on the other hand, exploits the absolute performance of the security itself, regardless of the performance of its peers. Interestingly, Moskowitz et al. (2012) show that the returns of time-series momentum strategies are not related to compensation for traditional risk factors, such as the value and size factors, but are partially related to the momentum factor.

After the work of Moskowitz et al. (2012), the empirical literature on time-series momentum has grown rapidly, finding evidence that the returns of managed funds can be explained by time-series momentum strategies (Hurst et al., 2013; Baltas and Kosowski, 2013) and documenting its significant performance in different asset classes in emerging and developed markets (Georgopoulou and Wang, 2017), among common stocks (Lim, Wang, and Yao, 2018) and throughout the entire past century (Hurst et al., 2017). Using intraday data, Gao, Han, Li, and Zhou (2018) also show that the first half-hour return on the market is able to predict the last half-hour return. In terms of portfolio allocation, Baltas (2015), Baltas and Kosowski (2020) and Rubesam (2020) show the benefits of correlations and risk parity for improving portfolio diversification in time-series momentum strategies.

Recently, Hutchinson and O'Brien (2020) have shown a link between time-series momentum returns and the business cycle, giving evidence that the returns are stronger during both recessions and expansions. The literature has also recognized the time-series momentum pattern in risk factors. Gupta and Kelly (2019) document robust persistence in the returns of equity factor portfolios, showing that factor timing by time-series momentum produces economically and statistically large excess performance relative to untimed factors. Exploring this idea, Levy and Lopes (2021b) also insert a time-series momentum structure to predict risk factors in a high-dimensional portfolio allocation. In general, the papers cited above either compare the results of portfolios built with different look-back periods (the number of past periods used to form a measure of momentum) or directly take twelve months as the benchmark measure to generate momentum signals. Then, they set a buy or sell trading rule based on the observed momentum sign. This type of decision rule is motivated by practice and by the academic literature that followed. However, we argue in this paper that the absence of an econometric model behind decisions can lead investors to misleading trading actions. For example, what guarantees that the returns from previous months will always have a positive relationship with future returns? Each asset can respond differently not just to the same measure of momentum but also to different look-back periods. Some assets can have a negative (reversal) relation with shorter look-back periods and others a positive one. This pattern can also change over time. Since the environment of the financial market is continuously changing, a pattern that was common in the 1980s can differ from one observed in the 1990s or during financial crises and pandemics. Motivated by these ideas, we use a dynamic binary classification model to infer the future trend of returns.
The approach is able to handle look-back period uncertainty and time-varying parameters in a dynamic fashion. Hence, investors can learn from past mistakes, giving lower importance to look-back periods that have performed worse in the recent past and assigning higher probabilities to look-back periods with higher predictability. Also, through time-varying parameters, the model adapts to changes in the financial environment, switching from periods of momentum to periods of reversal when the data support it.

The literature on return predictability is not new. The seminal paper of Welch and Goyal (2008) shows that it is extremely hard to predict stock returns using well-known predictors in an econometric model, i.e., the predictors are not able to outperform the simple historical average of stock returns. After Welch and Goyal (2008), several other studies have appeared in the literature trying to find better predictors or econometric models that could improve predictability (Campbell and Thompson, 2008; Rapach, Strauss, and Zhou, 2010; Dangl and Halling, 2012; Johannes, Korteweg, and Polson, 2014; Chinco, Clark-Joseph, and Ye, 2019; Gu, Kelly, and Xiu, 2020; Liu, Pan, and Wang, 2021). Two crucial aspects found in several papers that followed Welch and Goyal (2008) are the presence of time-varying coefficients and model combination. In fact, the accumulated academic evidence has shown that allowing for parameter instability helps to handle changes in market sentiment, institutional framework and macroeconomic conditions. Additionally, model combination can substantially improve forecasts, since it combines the economic information contained in each different predictor.

Inspired by the recent advances in the return predictability literature, our goal is to improve trend-following strategies through model selection and model combination, where different look-back periods can be considered to build momentum measures. We follow the approach of McCormick, Raftery, Madigan, and Burd (2012) to build our dynamic trend return classifier. Our classifier relies on a dynamic logistic regression in which parameters can be either constant or time-varying, and uncertainty about how far the investor should look into the past to predict the future is handled through dynamic model probabilities. After assigning probabilities to each model setting, we are able to integrate over this uncertainty by dynamic model averaging (DMA) or dynamic model selection (DMS). The approach is a binary counterpart of the DMA approach recently used with great success in other Bayesian econometric applications (Koop and Korobilis, 2012; Dangl and Halling, 2012; Koop and Korobilis, 2013; Catania, Grassi, and Ravazzolo, 2019; Levy and Lopes, 2021a). Because it uses discounting methods and distribution approximations, there is no need for expensive simulation schemes such as Markov Chain Monte Carlo (MCMC), which makes the whole process much faster to compute. This is a great advantage for quantitative investors, since the number of assets available is growing and trading is getting faster.

The binary approach of McCormick et al. (2012) was originally applied to a medical classification problem and was first introduced in the economic literature in Hwang (2019), where the binary classification method is used to forecast recession periods. To the best of our knowledge, we are the first to introduce this dynamic approach in a financial econometric context. Since our interest here is not to predict raw returns but their future direction (buy or sell signal), it fits the time-series momentum application well. The great advantage of using dynamic model probabilities is the ability to combine the economic information coming from many look-back periods in a sequential fashion. As soon as new data arrive, the model adapts to the new information, assigning higher probabilities to models using the more informative look-back periods.

The idea of combining information from different look-back periods has already appeared in the literature. Han, Zhou, and Zhu (2016) show economic gains when combining information from short, intermediate and long-term look-back periods to build cross-sectional momentum strategies. More recently, Garg, Goulding, Harvey, and Mazzoleni (2020) and Garg, Goulding, Harvey, and Mazzoleni (2021) explore the impact of turning points on time series momentum strategies. They show evidence of an increase in the presence of trend breaks in the last decade, leading to a negative impact on final portfolio performance. This happens because, after a trend reverses its direction, trend-following strategies tend to place bad bets, since past momentum can reflect an old and inactive trend direction. The authors propose a trading rule in which information from both fast and slow momentum look-back periods is considered when a turning point is identified. Also, using a machine learning technique, Jiang, Kelly, and Xiu (2020) use stock-level price images, instead of return information, to detect future price directions. They apply a convolutional neural network model to classify future return signals and run a cross-sectional portfolio strategy based on these signal predictions. The authors find robust evidence that image-based predictions have strong predictive power for future returns. In addition to the increase in trend breaks in the last decade, in Section 5 we also discuss a topic not well explored by the academic literature on time series momentum: the impact of the 2009 market rebound on portfolio performance and drawdowns.
Similar to the momentum crash observed in cross-sectional momentum strategies after the Great Financial Crisis (Barroso and Santa-Clara, 2015; Daniel and Moskowitz, 2016), traditional time series momentum portfolios also suffered from strong trend breaks, leading to large losses as soon as old negative trends reverted to positive ones. Motivated by the literature on time-series momentum and return predictability, our goal is to provide an econometric solution to deal with trend reversals, minimizing portfolio drawdowns.

The great advantage of our classifier model compared to the works mentioned above is its ability to sequentially learn the importance of each look-back period individually and for each asset in parallel. Using a dynamic model, we are able to track the time-varying behavior of different momentum speeds individually and to assign each speed a higher or lower probability, updated from the most recent data. Hence, as soon as a market correction or rebound seems to appear in the data, slower momentum measures start to receive lower probabilities while faster momentum probabilities increase, influencing final predictions. The dynamic classifier approach is therefore able to deal with turning-point problems in a customizable and automatic fashion. Also, by allowing time-varying parameters, the model gains the flexibility to capture positive or negative relations between past accumulated returns and future returns. Hence, in some periods past returns can induce reversal and in others momentum.

Using futures data from 1980 to September 2020 on 56 assets across four asset classes (equity indices, commodities, currencies and government bonds), we build time-series momentum strategies using information from our dynamic classifier model and compare them with the standard naive time-series momentum strategy, where the investor just buys or sells each asset based on specific previous returns. We show that the dynamic classifier approach not only produces better out-of-sample accuracy about future return directions, but also significantly improves portfolio performance. The model specification using time-varying parameters and DMS to predict future trend signals generated a 52% increase in the annualized out-of-sample Sharpe ratio compared to the naive benchmark. The constant-parameter counterpart of the DMS approach also performed quite well, delivering a 44% Sharpe ratio increase compared to the benchmark. We also show that, by using DMS or DMA, our dynamic binary classifier was able to exploit sudden turning points during the 2009 momentum crash. While the naive time series momentum strategy produced strong losses during the crash period (a 25% cumulative loss in 16 months), applying dynamic momentum speed selection with time-varying parameters earned a 22% cumulative gain in the same period. Finally, in the same spirit as Fleming, Kirby, and Ostdiek (2001), we show that a mean-variance investor would be willing to pay 425 basis points as an annualized management fee to switch from the standard naive time-series momentum strategy to our dynamic classifier approach with time-varying parameters and look-back period selection.

The rest of the paper is organized as follows. Section 2 explains the traditional time-series momentum strategy and how to create portfolios based on specific look-back periods. Section 3 describes the econometric methodology behind the dynamic classifier. In Section 4 we describe the data used and explore the empirical results of dynamic portfolio strategies, both in terms of out-of-sample predictability and economic performance. In Section 5 we discuss the economic performance during the well-known 2009 crash period and the subsequent years. Finally, Section 6 concludes.

In this section we describe how the most common time-series momentum strategy is implemented. The definitions are based on the main literature on time series momentum cited above, in particular the work of Moskowitz et al. (2012). Let $r_{it}$ represent the log-return of security $i$ at month $t$. We can define $Mom^L_{it}$ as the momentum measure at time $t$ for security $i$ and look-back period $L$ as:

$$Mom^L_{it} = \sum_{l=1}^{L} r_{i,t-l}, \qquad (1)$$

which is basically the cumulative return from the previous L periods until time t − 1.
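
As a minimal sketch of this definition, the cumulative log-return over the previous L months can be computed as below. The function name `momentum` and the return series are hypothetical and serve only as an illustration; months are indexed from zero, so `t` is the position of the month being predicted.

```python
def momentum(log_returns, t, lookback):
    """Cumulative log-return over the `lookback` periods ending at t - 1.

    `log_returns[k]` is the log-return of month k, so the window
    log_returns[t - lookback:t] never touches month t itself.
    """
    if lookback < 1 or t < lookback or t > len(log_returns):
        raise ValueError("not enough history for this look-back period")
    return sum(log_returns[t - lookback:t])


# Hypothetical monthly log-returns for a single asset (illustrative only).
r = [0.02, -0.01, 0.03, -0.05, 0.01, 0.04]
mom_3 = momentum(r, t=5, lookback=3)  # uses months 2, 3 and 4
mom_1 = momentum(r, t=5, lookback=1)  # uses month 4 only
```

Here the three-month momentum is negative while the one-month momentum is positive, which is the kind of disagreement between momentum speeds that motivates the rest of the paper.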

Using the momentum measure, we can obtain trading signals. If $Mom^L_{it} \geq 0$, the signal is a long position (+1), while $Mom^L_{it} < 0$ indicates a short position (-1) in asset $i$. It is common practice in the literature to size each asset position so that it has an ex-ante annualized volatility target of $\sigma_{tg} = 40\%$ (Moskowitz et al., 2012). Hence, the position size is chosen to be $40\%/\sigma_t$, where $\sigma_t$ is the ex-ante asset volatility estimate. In this manner, the time-series momentum return for asset $i$ at time $t$ will be:

$$r^{TSMOM,L}_{it} = \mathrm{sign}(Mom^L_{it}) \, \frac{40\%}{\sigma_t} \, r_{it}. \qquad (2)$$

The usual volatility model applied in the literature is the EWMA volatility measure. The annualized volatility can be represented as

where $D$ is the number of observations within a year and $\delta$ is a decay factor. We recognize that this volatility measure is a simple one for capturing movements in return volatility. However, it is important to highlight that our goal in this study is not to perform volatility timing, but to show performance improvements via signal predictions based on momentum. In order to stay as close as possible to the format used in the literature and to compare our results fairly, we use the same volatility model for all strategies in the paper. Unlike Kim, Tse, and Wald (2016), who argue that time-series momentum strategies are driven by volatility scaling, we show in our results significant portfolio improvements from modeling return signals alone rather than volatilities. Considering a holding period of one month, the return of the overall portfolio diversifying across the $N_t$ assets available at time $t$ is simply

Note that in the standard time-series momentum strategy, return signals are based only on the observation of past returns, i.e., no econometric model is used to infer the future direction of returns, and we argue here that this can substantially reduce final portfolio performance. Since the standard time-series momentum strategy does not take into account the relationship between past momentum measures and future returns and how it changes over time, the investor gives up the opportunity to learn about new economic environments to improve signal forecasts and may end up taking misleading trading positions. This is why, from now on, we will refer to the standard time-series momentum strategy as naive.
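
The naive rule described in this section (sign of past momentum, position scaled to a 40% volatility target, equal weighting across the available assets) can be sketched as follows. This is only an illustration under stated assumptions: the function names and all numbers are hypothetical, and the ex-ante volatilities are taken as given rather than estimated by EWMA.

```python
def naive_position(mom, sigma, sigma_target=0.40):
    """Signed, volatility-scaled position: +/- sigma_target / sigma."""
    sign = 1.0 if mom >= 0 else -1.0
    return sign * sigma_target / sigma


def naive_tsmom_portfolio_return(moms, sigmas, returns, sigma_target=0.40):
    """Equal-weighted average of the scaled asset returns at one date.

    moms    : momentum measure of each available asset (data up to t - 1)
    sigmas  : ex-ante annualized volatility estimate of each asset
    returns : realized return of each asset over month t
    """
    if not (len(moms) == len(sigmas) == len(returns)) or not moms:
        raise ValueError("inputs must be non-empty and of equal length")
    scaled = [naive_position(m, s, sigma_target) * r
              for m, s, r in zip(moms, sigmas, returns)]
    return sum(scaled) / len(scaled)


# Two hypothetical assets: long the first (size 2.0), short the second (size 4.0).
portfolio_ret = naive_tsmom_portfolio_return(
    moms=[0.05, -0.02], sigmas=[0.20, 0.10], returns=[0.01, 0.03]
)
```

In this toy case the short position loses money because the second asset rose, which is exactly the kind of error a wrong sign prediction produces.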

We now describe in detail the dynamic classifier model used in this paper. It is mainly inspired by the work of McCormick et al. (2012) and can be viewed as an econometric substitute for the traditional naive time-series momentum approach. As before, the investor wishes to anticipate the future direction of asset returns in order to set long or short positions. Therefore, we still have a binary classification problem. The main difference now is that decisions will be based on a statistical model that is better able to digest the complexity of today's information to infer the future return direction and improve trading decisions.

The econometric method is based on a dynamic logistic regression model for each individual return series. The model is written in state-space form, where $s_t$ represents the binary response of an individual asset return, i.e., $s_t = 1$ when the return is greater than or equal to zero and $s_t = 0$ otherwise. Let $x_t$ be a $d$-vector containing a set of possible momentum measures as predictors for the specific asset return signal. Then:

where $p_t$ is the probability of a positive return signal and the $d$-vector $\theta_t$ contains an intercept and the regression coefficients representing the relationship between momentum and the future return direction. Note that coefficients are allowed to evolve over time as random walks. Here, we initially consider a single arbitrary model containing a specific set of momentum predictors $x_t$. Later, we explore our dynamic momentum learning procedure, where different models with different momentum predictors are considered. For now, consider the existence of $K$ possible look-back periods we may use and let $x_t$ contain only one of the $2^K$ possible combinations of predictors we may include in the particular model. Let $D_{t-1}$ represent all the information available until time $t-1$, i.e., $D_{t-1} = \{s_1, \ldots, s_{t-1}\}$. Hence, the posterior distribution for the coefficients at time $t-1$ is

$$\theta_{t-1} \mid D_{t-1} \sim N(m_{t-1}, C_{t-1}),$$

where $m_{t-1}$ and $C_{t-1}$ represent the posterior mean and covariance of $\theta_{t-1}$ at time $t-1$. The prediction equation for time $t$ given information until time $t-1$ is given by

$$\theta_t \mid D_{t-1} \sim N(a_t, R_t),$$

where $a_t = m_{t-1}$ and, by using a discount factor $0 < \lambda_t \leq 1$, we obtain the predicted covariance matrix of the states as $R_t = C_{t-1}/\lambda_t$. The use of discounting methods simplifies estimation and is widely applied in the Bayesian literature. It can be viewed as a way to discount older information more heavily. A discount factor lower than 1 imposes time variation in the coefficients, while $\lambda_t = 1$ sets the coefficients to be constant (see West and Harrison, 1997; Prado and West, 2010; and Raftery, Kárnỳ, and Ettler, 2010 for details about discounting methods).
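
The prediction step with discounting is small enough to sketch directly. The snippet below assumes the standard discounting convention of the literature cited above, in which the predicted covariance is the previous posterior covariance divided by the discount factor; the function name `predict_state` and the numbers are hypothetical.

```python
def predict_state(m_prev, C_prev, lam):
    """Prior for theta_t given data up to t - 1.

    m_prev : posterior mean at t - 1 (list of d floats)
    C_prev : posterior covariance at t - 1 (d x d list of lists)
    lam    : discount factor in (0, 1]; lam = 1 keeps coefficients constant
    """
    if not 0.0 < lam <= 1.0:
        raise ValueError("discount factor must lie in (0, 1]")
    a = list(m_prev)                                 # mean is carried forward
    R = [[c / lam for c in row] for row in C_prev]   # covariance is inflated
    return a, R


m_prev = [0.1, -0.2]
C_prev = [[1.0, 0.0], [0.0, 2.0]]
a_tvp, R_tvp = predict_state(m_prev, C_prev, lam=0.98)  # time-varying case
a_cp, R_cp = predict_state(m_prev, C_prev, lam=1.0)     # constant case
```

With a discount factor below one the prior variance grows between observations, so new data move the coefficients more; with a factor of one no uncertainty is added and the recursion reduces to a recursive static regression.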

After computing the prior state distribution for time t, we are able to generate a return signal prediction for the specific asset as:

which will be used for portfolio decisions, as we explain in Section 4.

At time $t$, the investor observes $s_t$ and is able to update her state estimates by Bayes' rule:

$$p(\theta_t \mid D_t) \propto p(s_t \mid \theta_t) \, p(\theta_t \mid D_{t-1}), \qquad (9)$$

which is simply the product of the likelihood at time $t$ and the prediction equation for $\theta_t$ defined above. Since Equation (9) is not available in closed form, McCormick et al. (2012) approximate this posterior with a normal distribution. Let $l(\theta_t) = \log[p(s_t \mid \theta_t) \, p(\theta_t \mid D_{t-1})]$, with $Dl(\theta_t)$ and $D^2 l(\theta_t)$ being its first and second derivatives. The mean of the approximate distribution will be the mode of Equation (9) and its estimate will be given by

and the state variance is updated by

In order to apply DMA or DMS (see below) and tune discount factors, the predictive likelihood will be taken into account:

However, since this integral is not available in closed form, a Laplace approximation is used such that:

which makes computation much faster, since no expensive simulation schemes are required. In order to tune $\lambda_t$, we propose a grid of values for $\lambda_t$ and sequentially select over time the one that maximizes Equation (12). In our empirical section below, we use $\lambda_t \in \{0.98, 0.99, 1\}$ for the time-varying parameter (TVP) setting. Hence, our approach is able to induce a higher or lower degree of variability in the coefficients when the data support it. For the constant parameter (CP) case, no discounting is applied, so we fix $\lambda = 1$ for all periods.
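
To make the recursion concrete, the sketch below runs one predict-and-update step for a single coefficient (d = 1, no intercept). It is a simplification and rests on assumptions: the gradient and Hessian expressions are our reading of the normal approximation in McCormick et al. (2012), with a single Newton step taken from the previous posterior mean, and the names `sigmoid` and `dlr_step` are hypothetical.

```python
import math


def sigmoid(z):
    """Numerically stable inverse logit."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def dlr_step(m_prev, C_prev, x, s, lam):
    """One step of a scalar dynamic logistic regression (assumed form).

    Returns the one-step-ahead probability of a positive return and the
    updated posterior mean and variance of the coefficient.
    """
    R = C_prev / lam                       # discounted prior variance
    p_hat = sigmoid(x * m_prev)            # prediction made before seeing s
    grad = (s - p_hat) * x                 # first derivative at m_prev
    hess = -1.0 / R - p_hat * (1.0 - p_hat) * x * x  # second derivative
    m_new = m_prev - grad / hess           # Newton step towards the mode
    C_new = -1.0 / hess                    # variance of the normal approx.
    return p_hat, m_new, C_new


# Hypothetical inputs: flat prior guess, unit predictor, positive outcome.
p_cp, m_cp, C_cp = dlr_step(m_prev=0.0, C_prev=1.0, x=1.0, s=1, lam=1.0)
p_tvp, m_tvp, C_tvp = dlr_step(m_prev=0.0, C_prev=1.0, x=1.0, s=1, lam=0.98)
```

With the same observation, the discounted run moves the coefficient slightly further than the constant-parameter run, which is how a discount factor below one lets the relation between momentum and future direction drift or flip sign over time.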

Given the uncertainty about which look-back period carries more information about the future direction of returns, we now explain how momentum (speed) uncertainty can be incorporated in our dynamic classifier model. Suppose there are $K$ possible look-back periods an investor may consider for a time-series momentum strategy on a specific asset. Since there is uncertainty about the amount of economic information each look-back period provides for inferring the future direction of returns, and about which speed (or combination of speeds) is best, the investor faces the problem of momentum uncertainty. How can an investor understand the complexity of the trend structure just by looking at past returns? There are several different paths that may define a positive or negative trend. For instance, one asset may have a slow (long-period) positive momentum trend but a fast (short-period) negative momentum. There are also cases of long and short positive (negative) momentum trends with a negative (positive) intermediate trend. We argue here that these patterns change over time, since the financial market is continuously adapting to new environments. Therefore, the idea of dynamic momentum learning is to compute all the $M = 2^K$ possible models in parallel and assign dynamic model probabilities to each one in such a way that the investor is able to sequentially learn from past model mistakes, switching to the model settings that have performed better in the recent past or combining all of them, weighted by their model probabilities.
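
As an illustration of the size of the model space, the snippet below enumerates every subset of the look-back periods used later in the empirical study (1, 2, 4, 6, 8, 10 and 12 months). Treating the empty subset as an intercept-only model is an assumption made here so that the count equals 2^K; the function name is hypothetical.

```python
from itertools import combinations


def enumerate_models(lookbacks):
    """All subsets of the candidate look-back periods, one per model."""
    models = []
    for size in range(len(lookbacks) + 1):
        models.extend(combinations(lookbacks, size))
    return models


lookbacks = (1, 2, 4, 6, 8, 10, 12)      # K = 7 candidate momentum speeds
models = enumerate_models(lookbacks)      # 2 ** 7 = 128 candidate models
single_predictor = [m for m in models if len(m) == 1]
```

Each tuple lists the momentum measures entering one dynamic logistic regression, so the 128 regressions can be run independently and in parallel for every asset.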

Let $\pi_{t-1|t-1,i}$ denote the posterior probability of model $i$, with its specific subset of momentum predictors, at time $t-1$. Following Raftery et al. (2010) and McCormick et al. (2012), the predicted probability of model $i$ given all the data available until time $t-1$ can be expressed as:

$$\pi_{t|t-1,i} = \frac{\pi_{t-1|t-1,i}^{\alpha}}{\sum_{l=1}^{M} \pi_{t-1|t-1,l}^{\alpha}}, \qquad (13)$$

where $0 < \alpha \leq 1$ is another discounting (forgetting) factor. The main advantage of using $\alpha$ is that it avoids the computational burden associated with expensive MCMC schemes to simulate the transition matrix between possible models over time. This approach has also been extensively used in the Bayesian econometric literature in the last decade (Koop and Korobilis, 2013; Zhao, Xie, and West, 2016; Lavine, Lindon, West et al., 2020; Beckmann, Koop, Korobilis, and Schüssler, 2020). After observing new data at time $t$, we update model probabilities following a simple Bayes' rule:

$$\pi_{t|t,i} = \frac{\pi_{t|t-1,i} \, p_i(s_t \mid D_{t-1})}{\sum_{l=1}^{M} \pi_{t|t-1,l} \, p_l(s_t \mid D_{t-1})}, \qquad (14)$$

which is the posterior probability of model $i$ at time $t$, where $p_i(s_t \mid D_{t-1})$ is the predictive density of model $i$ evaluated at $s_t$. Note that the predictive densities have already been computed as shown in Equation (12), which implies that no extra computations are required here to update model probabilities. Hence, upon the arrival of a new data point, the investor is able to measure the performance of each model $i$ and to assign higher probabilities to the models that generate better out-of-sample performance. One possible interpretation of the forgetting factor $\alpha$ is through its role in discounting past performance. Combining the predicted and posterior probabilities, we can show that

$$\pi_{t|t-1,i} \propto \prod_{j=1}^{t-1} \left[ p_i(s_{t-j} \mid D_{t-j-1}) \right]^{\alpha^{j}}. \qquad (15)$$

Since $0 < \alpha \leq 1$, Equation (15) can be viewed as a discounted predictive likelihood, where older performance is discounted more than recent performance. It implies that models that generated higher out-of-sample performance in the recent past will receive higher predictive model probabilities. How far back the recent past extends is controlled by $\alpha$: a lower $\alpha$ discounts past data more heavily and generates faster switching between models over time, while $\alpha = 1$ represents no forgetting. The value of $\alpha_t$ is sequentially selected over time such that it maximizes the average predictive likelihood over all models:

Similar to the discount factor λ t , we propose a grid of values α t ∈ {0.99, 1} such that the model can switch between forgetting and no-forgetting information over time.
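
The two-step recursion for model probabilities (forgetting, then Bayes update) can be sketched as below. The power-and-renormalize form of the forgetting step follows our understanding of Raftery et al. (2010); the function names and the numbers are hypothetical, and the predictive likelihoods are taken as given.

```python
def predict_model_probs(post_probs, alpha):
    """Forgetting step: raise posterior probabilities to alpha, renormalize."""
    if not 0.0 < alpha <= 1.0:
        raise ValueError("forgetting factor must lie in (0, 1]")
    powered = [p ** alpha for p in post_probs]
    total = sum(powered)
    return [p / total for p in powered]


def update_model_probs(pred_probs, pred_liks):
    """Bayes update: weight predicted probabilities by predictive likelihoods."""
    weighted = [p * lik for p, lik in zip(pred_probs, pred_liks)]
    total = sum(weighted)
    return [w / total for w in weighted]


post = [0.8, 0.2]                                     # two candidate models
pred_no_forget = predict_model_probs(post, alpha=1.0)   # unchanged
pred_forget = predict_model_probs(post, alpha=0.5)      # flattened towards 1/2
new_post = update_model_probs([0.5, 0.5], pred_liks=[0.4, 0.6])
```

A smaller forgetting factor flattens the probabilities before each update, so a model that has dominated for a long time can be overtaken quickly once its recent predictions deteriorate.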

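Using the predictions of each individual model $i$, $\hat{s}^{\,i}_{t|t-1}$, we compute the dynamic model average prediction, $\hat{s}^{\,DMA}_{t|t-1}$, weighting by each individual model probability:

$$\hat{s}^{\,DMA}_{t|t-1} = \sum_{i=1}^{M} \pi_{t|t-1,i} \, \hat{s}^{\,i}_{t|t-1}, \qquad (17)$$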

while dynamic model selection (DMS) is applied by simply selecting the model $i$ with the highest predicted model probability, $\pi_{t|t-1,i}$, for period $t$.
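
The difference between averaging and selecting can be seen in a few lines. The sketch below assumes the predicted model probabilities and the individual model predictions are already available; the function names and numbers are hypothetical.

```python
def dma_prediction(pred_probs, model_preds):
    """Probability-weighted average of the individual model predictions."""
    return sum(p * s for p, s in zip(pred_probs, model_preds))


def dms_prediction(pred_probs, model_preds):
    """Prediction of the single model with the highest predicted probability."""
    best = max(range(len(pred_probs)), key=lambda i: pred_probs[i])
    return model_preds[best]


pred_probs = [0.2, 0.5, 0.3]       # predicted model probabilities at time t
model_preds = [0.40, 0.70, 0.55]   # each model's probability of a positive return
s_dma = dma_prediction(pred_probs, model_preds)
s_dms = dms_prediction(pred_probs, model_preds)
```

DMA shrinks the forecast towards the consensus of all models, whereas DMS commits to the currently best model and can therefore switch momentum speed abruptly from one month to the next.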

For each available asset, the investor can apply the whole procedure described above and classify long or short positions in individual assets based on the out-of-sample signal predictions. In the next section we explain in detail the time-series momentum strategy using the dynamic classifier approach and describe the data used in our empirical study.

We now discuss how to use the output of the dynamic classifier approach as input to a dynamic time-series momentum strategy. Supposing there are $N_t$ assets available for investing at time $t$, instead of considering $\mathrm{sign}(Mom^L_{it})$ as the signal classifying long or short positions, the investor will use the DMA or DMS classification prediction for each individual asset to generate trading signals for her portfolio. Hence, $\hat{s}_{t|t-1} \geq c$ indicates a long position ($\mathrm{sign}(\hat{r}_{t|t-1}) = +1$) and $\hat{s}_{t|t-1} < c$ a short position ($\mathrm{sign}(\hat{r}_{t|t-1}) = -1$) in that specific asset, where $c$ represents a cutoff selected by the investor. The most direct and simple choice is to set $c = 0.5$; alternatively, the investor can apply a sequential grid search to maximize out-of-sample accuracy for each individual asset. In our empirical results we show results using both approaches. Therefore, the return of the overall time-series momentum portfolio using DMA will be given by

and the return of the time-series momentum portfolio with signals obtained from the DMS procedure will be given by

where results with constant and time-varying parameters are shown in the empirical section. We also show portfolio performance when, instead of using model combination or selection, we use models with single predictors. For example, when we display results for TVP-12m, we are referring to a model using the twelve-month momentum measure as the only predictor and allowing time-varying parameters. In this case, portfolios are formed as in Equations (18) and (19), where signals come from this specific single-predictor model. By using these simple diversified portfolios, we follow the majority of the works related to time-series momentum strategies. This allows us to evaluate the economic improvements due exclusively to our dynamic classifier method using momentum combination or momentum selection, compared to the standard time-series momentum classification, and not due to volatility timing effects. We leave volatility timing in time-series momentum strategies as a future research extension of our approach.

Therefore, for each period, using the method described above, the investor uses the signal forecast for each individual asset, $\hat{s}_{it}$, as input to the dynamic portfolio, sizing each position based on ex-ante volatilities as in Equation (18). Since not all assets are available at the beginning of the sample, it is important to notice that the number of assets entering the portfolio, $N_t$, varies over time.
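
The mapping from a predicted probability to a trading signal, and one possible form of the cutoff grid search, can be sketched as follows. The grid values, the tie-breaking rule (the first grid value attaining the best accuracy wins) and the function names are assumptions made for illustration; the text only specifies that the cutoff is either 0.5 or chosen sequentially to maximize out-of-sample accuracy.

```python
def classifier_signal(s_hat, cutoff=0.5):
    """+1 (long) if the predicted probability reaches the cutoff, else -1."""
    return 1 if s_hat >= cutoff else -1


def best_cutoff(past_s_hats, past_outcomes, grid=(0.40, 0.45, 0.50, 0.55, 0.60)):
    """Cutoff in `grid` with the highest accuracy on past predictions.

    past_s_hats   : earlier out-of-sample probabilities of a positive return
    past_outcomes : realized directions (1 = non-negative return, 0 = negative)
    """
    def accuracy(c):
        hits = sum((s >= c) == bool(y)
                   for s, y in zip(past_s_hats, past_outcomes))
        return hits / len(past_outcomes)

    return max(grid, key=accuracy)


# Hypothetical history of predictions and realized directions for one asset.
history_s_hat = [0.52, 0.58, 0.47, 0.61]
history_outcome = [0, 1, 0, 1]
c_star = best_cutoff(history_s_hat, history_outcome)
signal_default = classifier_signal(0.52)           # with c = 0.5
signal_tuned = classifier_signal(0.52, c_star)     # with the tuned cutoff
```

Only predictions and outcomes observed before the trading date should enter the search, so that the chosen cutoff remains an out-of-sample quantity.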

In our empirical study we consider several specifications in order to compare the benefits of using time-varying parameters, model averaging and model selection against the naive time series momentum strategy (Naive-TSMOM). We show statistical and economic results for Naive-TSMOM strategies considering fixed look-back periods of 1, 2, 4, 6, 8, 10 and 12 months. We then report performance for the same fixed look-back periods when the trading signals are obtained from our dynamic classifier model, that is, when there is no model uncertainty and a single momentum measure is used within our econometric model. In these cases, we report results for look-back periods of 1, 2, 4, 6, 8, 10 and 12 months both for constant parameters (CP: $\lambda = 1$) and for time-varying parameters (TVP: $\lambda_t \in \{0.98, 0.99, 1\}$). The CP specification can be viewed as a recursive static logistic regression. Finally, we show results for both CP and TVP when the DMA and DMS approaches are applied, meaning that the investor is dynamically learning momentum speed probabilities and averaging or selecting predictions based on those probabilities.

Following Rubesam (2020), we use continuous price data for 56 futures contracts downloaded from Refinitiv/Datastream for the period from January 1980 to September 2020. The data cover 12 developed-market equity index futures, 25 commodity futures, 11 developed sovereign bond futures and 8 currency pair futures. The contracts are rolled over on the last trading day of the expiry month and adjusted at the roll date to avoid artificial returns. Since our methodology is inherently built for one-step-ahead predictions, and daily portfolio rebalancing prevents the best exploitation of longer momentum effects and increases transaction costs, we follow the great majority of the literature and use monthly log-returns in our study. We consider a holding period of one month, that is, we rebalance portfolios on a monthly basis. Tables (8) and (9) in the Appendix provide a statistical summary of the futures contracts and their start dates.

In our analysis we use the first three years (36 months) of data for each individual asset as a training period for our models and use the remaining periods as an out-of-sample evaluation period. Hence, when a new asset becomes available in the data set, it enters the portfolio only three years later. Although the naive time series momentum strategy does not rely on an econometric approach and does not require a training period, in our empirical results we also discard the same initial sample window in order to compare all approaches fairly on the same data set.

We analyse how the proposed dynamic econometric specifications perform in terms of out-of-sample forecasting by comparing models via mean absolute errors (MAE). In order to provide an overall metric that considers all asset returns and their different sample periods within the real portfolio application, we stack all asset returns as if they were a single series and average the absolute errors over the sum of the test sample periods of all assets. More specifically, consider that for each asset i the training period ends at month T_train,i and its number of months in the test sample is T_test,i; we then compute the MAE of an econometric approach j as:

where T is the last month of our sample (September 2020). Unlike an econometric forecasting model, the naive strategy does not provide a probability prediction per se, so the investor buys or sells based only on the sign of past accumulated returns. Therefore, we compare absolute errors only among the different econometric models, where the chosen benchmark is the binary classifier using the twelve-month momentum as the only predictor and constant parameters (CP-12m). We report relative percentage performance improvements with respect to this benchmark.

Hence, positive numbers represent a percentage reduction in terms of out-of-sample forecasting error compared to the statistical model using just the look-back period of 12 months and constant parameters.
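As a minimal sketch of the stacked metric described above, the following Python snippet pools the absolute errors of all assets over their own test months and expresses the result as a percentage reduction relative to a benchmark. It assumes that the error is measured between the forecasted probability of a positive return and a 0/1 indicator of the realized direction; the function names, the dictionary layout and the toy numbers are hypothetical.

```python
def stacked_mae(outcomes_by_asset, probs_by_asset):
    """Pool absolute errors over all assets and all of their test months.

    outcomes_by_asset: dict asset -> list of 0/1 realized directions (test sample only)
    probs_by_asset:    dict asset -> list of forecasted probabilities of a positive return
    Assets may have test samples of different lengths.
    """
    total_abs_error = 0.0
    total_months = 0
    for asset, outcomes in outcomes_by_asset.items():
        probs = probs_by_asset[asset]
        if len(probs) != len(outcomes):
            raise ValueError(f"length mismatch for asset {asset}")
        total_abs_error += sum(abs(y - p) for y, p in zip(outcomes, probs))
        total_months += len(outcomes)
    return total_abs_error / total_months


def relative_improvement(mae_model, mae_benchmark):
    """Percentage reduction in MAE relative to the benchmark (positive = better)."""
    return 100.0 * (1.0 - mae_model / mae_benchmark)


# Toy example with two assets whose test samples have different lengths.
outcomes = {"asset_a": [1, 0, 1], "asset_b": [0, 1]}
benchmark_probs = {"asset_a": [0.5, 0.5, 0.5], "asset_b": [0.5, 0.5]}
model_probs = {"asset_a": [0.6, 0.4, 0.6], "asset_b": [0.4, 0.6]}

mae_benchmark = stacked_mae(outcomes, benchmark_probs)
mae_model = stacked_mae(outcomes, model_probs)
improvement = relative_improvement(mae_model, mae_benchmark)
```

Pooling before averaging means that assets with longer test samples carry more weight, which matches the idea of treating all asset returns as a single stacked series.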

Since the forecasted value from our dynamic classifier approach is an estimate of the probability of a positive return, a true positive (TP) occurs when the model forecasts a probability of a positive return greater than or equal to a cutoff c and the realized return is positive, while a true negative (TN) occurs when the model forecasts a value lower than the cutoff c and the realized return is negative. On the other hand, false positives (FP) and false negatives (FN) represent the cases where the realized return is the opposite of what the forecasted value indicated. As in Jiang et al. (2020), we compute classification accuracy as:

Accuracy = (TP + TN) / (TP + TN + FP + FN)

where again we consider the sum of TP, TN, FP and FN over all available assets to compute the total accuracy of each strategy. In our main results we consider a cutoff c = 50%. However, as a robustness test, we also report results using cross-validation procedures that sequentially select the best cutoff among different values in two grids, for each asset individually and over time. The first cross-validation procedure (CV1) considers a grid c ∈ {0.49, 0.491, ..., 0.509, 0.51} and the second cross-validation procedure (CV2) considers c ∈ {0.45, 0.46, ..., 0.54, 0.55}. Those are reasonable grids of values, since much higher or lower values tend to induce misleading trading signals. The idea of the grid search CV is to select the c that produces the highest accuracy rate over time. The CV is repeated once a month and we use the last three years of out-of-sample accuracy observations to select the best c. In this sense, such a small grid of values also avoids additional computational burden, since the procedure is repeated several times for each available asset. Inspired by a Bayesian Decision Theory perspective, we show in the Appendix an additional portfolio exercise where the investor maximizes an expected utility in which the probabilities of the different scenarios come from our classifier models. Table (1) below shows the results for forecast performance without CV (c = 50%), for both CV procedures and for the MAE metric.
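The accuracy measure and the rolling cutoff selection described above can be sketched as follows. This is an illustration under stated assumptions rather than the exact procedure used in the paper: zero returns are counted as negative, ties between cutoffs are broken in favour of the first grid value, and the function names and toy data are hypothetical.

```python
def accuracy(probs, returns, cutoff=0.5):
    """Share of months where the predicted direction matches the realized one.

    A forecast counts as 'up' when the probability is >= cutoff; a realized
    return counts as 'up' when it is strictly positive (assumption).
    This equals (TP + TN) / (TP + TN + FP + FN).
    """
    hits = 0
    for p, r in zip(probs, returns):
        predicted_up = p >= cutoff
        realized_up = r > 0
        hits += int(predicted_up == realized_up)
    return hits / len(probs)


def select_cutoff(probs, returns, grid, window=36):
    """Pick the grid value with the highest accuracy over the last `window` months.

    Ties are resolved in favour of the first grid value (assumption).
    """
    recent_probs = probs[-window:]
    recent_returns = returns[-window:]
    return max(grid, key=lambda c: accuracy(recent_probs, recent_returns, c))


# Coarser grid: 0.45, 0.46, ..., 0.55 (rounded to avoid floating-point drift).
cv2_grid = [round(0.45 + 0.01 * k, 2) for k in range(11)]

# Toy history of out-of-sample forecasts and realized monthly returns.
past_probs = [0.52, 0.48, 0.53, 0.47]
past_returns = [0.01, 0.02, 0.03, -0.01]

baseline_accuracy = accuracy(past_probs, past_returns, cutoff=0.5)
best_cutoff = select_cutoff(past_probs, past_returns, cv2_grid)
```

In a sequential application, `select_cutoff` would be called once a month for each asset, using only forecasts and returns that were already observed at that date.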

From Table (1), the column referring to MAE shows forecast error reductions for different models compared to the CP classifier approach using a twelve-month momentum predictor. Both DMA and DMS show larger improvements, especially when TVP are allowed. We note that DMS performs slightly better than DMA and that none of the models using a single predictor was able to outperform model selection or combination. Focusing on classification accuracy, the first column of Table (1) shows results when a single cutoff c = 50% is applied for all assets and for all periods of time.

Since this accuracy only requires a trading signal, it can also be computed for the Naive-TSMOM, whose signals are inferred by simply looking at past momentum measures. The bottom panel of the table shows a good performance for Naive-TSMOM when longer look-back periods are considered. Indeed, the Naive-TSMOM of twelve months generated a forecast accuracy of 52.3%. This is in line with the empirical academic research, where the look-back period of twelve months is the main setting for many studies (Moskowitz et al., 2012 and Rubesam, 2020). The first and second panels of Table (1) show that, by the use of a binary classifier approach, forecast accuracy can be considerably improved. It is interesting to note that i) for model settings with a single momentum predictor, short look-back periods tend to perform better than longer look-back periods, the opposite of what is observed for Naive-TSMOM strategies, ii) single predictor models with constant parameters and longer look-back periods tend to perform better than their time-varying parameter counterparts, while for shorter look-back periods time-varying parameters are slightly better than their constant parameter counterparts, and iii) model combination and selection were able to significantly increase out-of-sample classification accuracy. For both CP and TVP, DMA and DMS delivered accuracies higher than 53%, with DMS-CP reaching 53.3%. In the next sections we show that, although DMS-CP performed slightly better than DMS-TVP in terms of total accuracy, the TVP version has greater versatility to recognize sudden turning points, reducing drawdowns and performing better during bad periods for time-series momentum strategies, which ends up improving its final economic performance.

Although the accuracies in Table (1) may seem low compared with other binary classification applications, it is important to remember that, in the context of return predictability, where forecasting is a huge challenge, even a tiny accuracy improvement can translate into strong economic improvements over time for the investor (Chinco et al., 2019, Gu et al., 2020 and Jiang et al., 2020). In fact, Jiang et al. (2020) show evidence that small accuracy gains on the order of 1% can translate into considerable Sharpe ratio gains for trading strategies based on these predictions.

In order to test the robustness of results for different cutoffs, we let them change over time and for each individual asset. The second and third columns of Table (1) show results when c comes from a cross-validation procedure, where each column uses a different grid of possible values, as explained above. The results remain robust. In the CV1 case, there are small improvements for the TVP models with longer look-back periods and for the DMS approach, whereas for the CP counterpart the results improve slightly for shorter look-back periods and deteriorate for longer ones and for model combination and selection. In the CV2 case, there is a small improvement for short momentum measures in both the TVP and CP models.

In general, the results are still quite similar to those obtained when c = 50%, and model combination and selection continue to show accuracy gains compared not just to single predictor models but especially to the Naive-TSMOM strategy.

At the end of each month, the investor observes the available data and predicts the direction of returns for the end of the next month. After obtaining the forecasting outputs (and calibrating the signal cutoffs when applying the cross-validation procedure), the investor is able to use the predictions as inputs in a portfolio allocation problem, sequentially rebalancing her portfolio. Using predictions from each model setting, we build portfolios as in Equations (18) and (19).

In order to show economic improvements for investors, we report important measures of portfolio performance such as annualized mean excess returns (Mean), volatilities (Vol.), Maximum Drawdowns (Max DD) and Sharpe Ratios (SR). The latter is commonly used among practitioners in the financial market and by academics. Despite its popularity, the SR is an unconditional measure and is not well suited for dynamic allocations with time-varying and sequential predictions (see Marquering and Verbeek, 2004). Also, it does not take into account the investor's risk aversion. In order to overcome these problems and improve our model comparisons, we follow Fleming et al. (2001) and provide a measure of economic utility for investors. We compute the ex-post average utility for a mean-variance investor with a quadratic utility and calculate the performance fee that an investor will be willing to pay to switch from the standard time-series momentum strategy to the dynamic classifier method (DCM):

where γ is the investor's degree of relative risk aversion, R^{DCM}_{p,t} is the gross return of the specific DCM portfolio and R^{Naive-12m}_{p,t} is the gross return of the Naive-TSMOM strategy portfolio when a look-back period of twelve months is considered. As in Fleming et al. (2001), we report our estimates of Φ as annualized management fees in basis points using γ = 10 as risk aversion. Notice that Φ is computed by equating the average utility of the investor applying the Naive-12m strategy with the average utility of the DCM portfolio (or any other alternative specification).
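The fee Φ can be illustrated with a short Python sketch. It assumes the quadratic utility form of Fleming et al. (2001), in which the average utility of a series of gross returns R_t is the mean of R_t − γ/(2(1+γ)) R_t², and it solves the resulting quadratic equation for the monthly fee. The function names and the toy returns are hypothetical, and the root selection assumes that the mean gross return is below (1+γ)/γ, which holds for monthly returns close to one.

```python
import math


def average_utility(gross_returns, gamma):
    """Average quadratic utility of a series of gross returns."""
    a = gamma / (2.0 * (1.0 + gamma))
    return sum(r - a * r * r for r in gross_returns) / len(gross_returns)


def performance_fee(dcm_gross, naive_gross, gamma=10.0):
    """Monthly fee that equates the utility of (DCM - fee) with the naive utility.

    Solves a*fee^2 + b*fee - delta = 0, where b = 1 - 2*a*mean(DCM) and
    delta is the utility gap between the DCM and the naive portfolio.
    """
    a = gamma / (2.0 * (1.0 + gamma))
    mean_dcm = sum(dcm_gross) / len(dcm_gross)
    delta = average_utility(dcm_gross, gamma) - average_utility(naive_gross, gamma)
    b = 1.0 - 2.0 * a * mean_dcm
    discriminant = b * b + 4.0 * a * delta
    if discriminant < 0:
        raise ValueError("no real fee equates the two average utilities")
    # Root that equals zero when both portfolios deliver the same utility (needs b > 0).
    return (-b + math.sqrt(discriminant)) / (2.0 * a)


# Toy example: the DCM portfolio earns exactly 1% more than the naive one each month.
naive = [1.01, 0.98, 1.02, 1.00]
dcm = [r + 0.01 for r in naive]

monthly_fee = performance_fee(dcm, naive, gamma=10.0)
annualized_fee_bps = monthly_fee * 12 * 10_000
```

In this toy case the fee equals the constant monthly return difference, because subtracting it from the DCM returns reproduces the naive returns exactly.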

In an effort to bring our results as close as possible to a real-world example, we follow Baltas and Kosowski (2020) and Rubesam (2020) and report results net of transaction costs. Both rebalancing and rollover costs are taken into account. Each asset class is subject to different transaction costs, following the values in basis points reported in Baltas and Kosowski (2020).

First, we show in Table (2) annualized results when no CV is performed to obtain the cutoffs. Hence, c = 0.5 is selected for all periods of time and for all assets. All time-series momentum strategies are scaled to an ex-post annualized volatility of 10%. Focusing on the bottom panel of Table (2), which refers to the Naive-TSMOM strategy, we note performances similar to those of other studies. When using L = 12 months, the strategy performed particularly well over the last decades. It delivered a SR of 0.81 and an annualized average excess return of 8.1%, both metrics being higher than those obtained with shorter look-back periods. For instance, a naive strategy using a 6-month look-back period would deliver almost half the performance of the traditional 12-month look-back strategy. Additionally, shorter look-back strategies tend to generate much higher portfolio turnover, which also harms final performance due to transaction costs. Within the Naive-TSMOM group, all shorter look-back strategies presented lower maximum drawdowns than the 12-month strategy. Finally, in terms of utility gains, an investor applying any shorter momentum speed would be willing to pay an annualized management fee from 89 to 411 bps to use the naive approach with a 12-month look-back period.
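For reference, the ex-post volatility scaling and the annualized Sharpe ratio mentioned above can be sketched as follows. The snippet assumes monthly simple excess returns, a sample standard deviation and a square-root-of-12 annualization; the function names and the toy series are hypothetical.

```python
import math
import statistics


def annualized_vol(monthly_returns):
    """Annualized volatility from monthly returns (sample standard deviation)."""
    return statistics.stdev(monthly_returns) * math.sqrt(12)


def scale_to_target_vol(monthly_returns, target=0.10):
    """Rescale a return series so that its ex-post annualized volatility equals `target`."""
    leverage = target / annualized_vol(monthly_returns)
    return [leverage * r for r in monthly_returns]


def sharpe_ratio(monthly_excess_returns):
    """Annualized Sharpe ratio of a monthly excess return series."""
    annualized_mean = statistics.mean(monthly_excess_returns) * 12
    return annualized_mean / annualized_vol(monthly_excess_returns)


# Toy monthly excess returns of a strategy.
strategy_returns = [0.02, -0.01, 0.03, 0.00, -0.02, 0.04]
scaled_returns = scale_to_target_vol(strategy_returns, target=0.10)
```

Because the scaling multiplies every return by the same constant, it leaves the Sharpe ratio unchanged while making mean returns and drawdowns comparable across strategies.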

When we focus on the econometric approaches, the results differ from those of the naive approach. We first notice that using the 12-month momentum as a single predictor does not imply a better portfolio performance, while the 1-month predictor was able to deliver strong performance, of a magnitude comparable to the 12-month naive benchmark. Single predictor models tend to induce much lower portfolio turnover than the naive approach, especially when constant parameters are applied. In fact, the constant parameter group delivered similar or even better results compared to the time-varying parameter case, except for the 1 and 6-month momentum predictors. However, when model selection or averaging is applied to different momentum predictors, models with time-varying parameters show superior performance compared to their CP counterparts. The DMS-TVP approach was able to dramatically improve portfolio performance compared to the naive approach, generating an annualized Sharpe ratio of 1.23, an increase of more than 50%, while its CP counterpart delivered a SR of 1.17. For both the CP and TVP panels, DMA performed better than any individual momentum predictor but worse than model selection.

One important aspect of model selection and combination is the ability to reduce maximum drawdowns with higher reductions when TVP are allowed. While the naive benchmark suffered a drastic maximum drawdown of 25.3% (see Section 5 for a deeper discussion on drawdowns and the 2009 momentum crash), the DMA-TVP suffered a maximum drawdown of 13.1%, almost half the size of the losses obtained from the naive benchmark.
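A maximum drawdown of this kind can be computed from a monthly return series as in the following sketch. It assumes simple monthly returns that are compounded into a wealth index and measures the largest peak-to-trough decline; the paper may instead accumulate log-returns, so the function name, the compounding convention and the toy series are assumptions.

```python
def max_drawdown(monthly_returns):
    """Largest peak-to-trough decline of the compounded wealth index.

    Returns a positive fraction, e.g. 0.25 for a 25% drawdown.
    """
    wealth = 1.0
    peak = 1.0
    worst = 0.0
    for r in monthly_returns:
        wealth *= 1.0 + r                 # compound the simple return
        peak = max(peak, wealth)          # running maximum of the wealth index
        worst = max(worst, 1.0 - wealth / peak)
    return worst


# Toy series: wealth peaks at 1.10 and bottoms at 0.8316 before recovering.
toy_returns = [0.10, -0.20, 0.05, -0.10, 0.30]
toy_max_dd = max_drawdown(toy_returns)
```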

In terms of monthly turnover, both model selection and combination increase position changes over time compared to single predictor models and the naive approach, especially for TVP models. However, this turnover increase is more than compensated by higher accuracy and portfolio returns. Our results confirm that dynamically combining economic information from different look-back periods using DMA, or sequentially selecting the best look-back periods by DMS, improves portfolio performance, consistent with our previous out-of-sample accuracy results. The last column of Table (2) shows that a mean-variance investor would be willing to pay 425 bps as an annualized management fee to switch from the naive 12-month time-series momentum strategy to our dynamic classifier approach with momentum speed selection and time-varying coefficients. The economic performances are still strong when using DMA and/or constant parameters.

In order to investigate portfolio performances within different asset classes, in Figure (1) we show the differences in Sharpe ratio across equities, bonds, commodities and currencies (FX). First, it is interesting to notice the diversification effect when different asset classes are combined in the same portfolio, because none of the individual asset classes in Figure (1) was able to deliver a Sharpe Ratio as high as that of the whole portfolio in Table (2). Our dynamic binary classifier method performed especially well among commodities, an asset class traditionally explored by CTAs through trend-following strategies. Regardless of the model specification, applying our econometric approach to commodities delivered substantial improvements compared to the naive benchmark, with even stronger results for TVP-DMS. There was also a small improvement among bonds and a performance similar to the naive benchmark for equities. The only asset class where our econometric approach clearly performed worse than the benchmark was FX.

In Tables (3) and (4) we show results when the classifier cutoffs are obtained from cross-validation procedures, as described in Section (4.2). Out-of-sample portfolio performances are still robust for different cutoffs. Model selection and averaging continue to improve final performance outcomes not just in terms of Sharpe Ratios but also in terms of utility gains for the investor. For CV1 in Table (3), DMS-TVP was even able to slightly improve on the fixed c = 50% case. A mean-variance investor would pay 439 bps to switch from the naive benchmark strategy to the DMS-TVP in this setting, while she would pay 349 bps for its CP counterpart. For the CP panel the performance is still similar, but with tiny decreases, while for TVP there were small improvements in the overall evaluation.

Figure 1: Sharpe Ratios by asset classes

For the second cross-validation procedure (CV2) in Table (4), performances are still strong compared to the naive time-series momentum strategy. There are small declines in portfolio outcomes relative to those observed when c = 50%, but the results are still consistent, with DMS and DMA delivering important improvements and TVP outperforming its CP counterpart. It is interesting to notice smaller portfolio turnovers than in the previous results. Table (4) shows that a mean-variance investor would pay 278 bps to switch from the naive benchmark strategy to the DMS-TVP, while she would pay 240 bps for its CP counterpart. The DMA setting also performed well, with a SR higher than 0.90 for both CP and TVP. Finally, model selection and combination were able to reduce maximum drawdowns, regardless of the dynamics of the coefficients.

The results in this section confirm that the discretionary way of building trading positions based solely on the observation of past momentum is not enough to distinguish between future uptrends and downtrends. By ignoring the time-varying patterns of different momentum speeds and their relations with future returns, the investor is giving up the opportunity to sequentially learn about trend instabilities to improve trading signals. Also, although we observe only small economic gains from introducing dynamics in the coefficients compared to CP models, doing so enabled the investor to learn the time-varying relations between momentum speeds and future returns and, as we show in the next section, this time-varying pattern was highlighted during the 2009 TSMOM crash. Therefore, we argue that, by the use of a dynamic classifier model and momentum speed learning, the investor benefits not just in terms of higher Sharpe Ratios and returns, but also from an increase in final utility after accounting for risk aversion and from a considerable reduction in portfolio drawdowns, reducing losses during bad periods for time series momentum strategies.

The table reports economic performance from classifier models using constant parameters (CP) and time-varying parameters (TVP), considering single predictors (1m, 2m, 4m, ..., 12m) or applying DMA or DMS with all predictors in the model space. Classifier cutoffs are obtained from a sequential cross-validation procedure over time for each individual asset, where c ∈ {0.45, 0.46, ..., 0.54, 0.55}. All strategies are scaled to an ex-post annualized volatility of 10%.

As we previously discussed, the dynamic classifier was able to dramatically reduce drawdowns. In this section we explore the benefits of letting the model automatically learn a turning point that indicates a market rebound. The failure of momentum strategies during market rebounds, in particular the 2009 rebound, is well known. Daniel and Moskowitz (2016) investigate cross-sectional momentum crashes and show that the 2009 crash was largely driven by the market rebound. After a period of negative trends, in March of that year the market started to recover strongly. However, at that time, the 12-month momentum strategy was mainly selling assets with high betas (strong positive correlation with the market) and buying assets with low betas (strong negative correlation with the market). As soon as the market recovered, the momentum strategy faced huge losses, since it was selling assets with a strong positive recovery and buying assets with a weak recovery or even negative growth.

In the present study we investigate a similar pattern for the naive time-series momentum strategy. In March 2009, the naive benchmark started to suffer huge losses that lasted until June 2010, accumulating a total loss of 25.3%, its maximum drawdown in the last four decades. The literature has already recognized the weakness of TSMOM strategies after the Great Recession (Baltas and Kosowski, 2020, Garg et al., 2020 and Garg et al., 2021), where the usual explanations range from higher asset correlations to an increase in trend breaks but, to the best of our knowledge, the literature has not discussed the special period of the 2009 market rebound. This important drawback of naive time-series momentum strategies and the lack of academic discussion on the subject motivate us to explore this problematic period and to show how our dynamic classifier approach is able to deal with trend breaks with great success.

The great weakness of time-series and cross-sectional momentum strategies is their inability to recognize such turning points. A longer momentum speed strategy is not able to recognize a sudden trend break, so the investor keeps following a trend that no longer exists. On the other hand, a very short momentum speed is able to identify new trends when they start to appear, but fails to enter more stable medium or longer trends. Additionally, sticking to a short momentum speed strategy tends to be less profitable and riskier. Therefore, the great challenge is to recognize the periods when these new trends begin, by considering information from fast momentum signals or by learning when older long trends disappear. However, we argue that those patterns are not clear from the simple observation of past returns. Hence, the investor can rely on an econometric approach that is able to digest those patterns in the data. By the use of a dynamic model, the time-varying relations between different momentum speeds and return signals can be captured. Therefore, our dynamic binary classifier approach is well suited to this kind of decision-making problem, since by the use of model selection or model averaging we are able to assign dynamic probabilities to different iterations of momentum speeds, moving from an old type of trend to a new one if it is empirically desirable.
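To make the idea of dynamic model probabilities concrete, the sketch below shows one step of the forgetting-factor recursion commonly used in the DMA literature, applied to binary outcomes. It is an illustration and not the exact specification of this paper: the forgetting factor `alpha`, the Bernoulli likelihood of the realized direction, the function names and the two-model example are all assumptions.

```python
def update_model_probs(prior_probs, pred_up_probs, realized_up, alpha=0.99):
    """One step of a forgetting-factor update of model probabilities.

    prior_probs:   model probabilities after the previous month
    pred_up_probs: each model's forecasted probability of a positive return
    realized_up:   True if the realized return was positive
    alpha:         forgetting factor in (0, 1]; hypothetical default value
    """
    # Prediction step: flatten the probabilities towards equal weights.
    flattened = [p ** alpha for p in prior_probs]
    total = sum(flattened)
    predicted = [p / total for p in flattened]

    # Update step: reward models that assigned high probability to the outcome.
    likelihoods = [q if realized_up else 1.0 - q for q in pred_up_probs]
    weighted = [w * l for w, l in zip(predicted, likelihoods)]
    norm = sum(weighted)
    return [w / norm for w in weighted]


def dma_forecast(model_probs, pred_up_probs):
    """Model averaging: probability-weighted average of the forecasts."""
    return sum(w * q for w, q in zip(model_probs, pred_up_probs))


def dms_forecast(model_probs, pred_up_probs):
    """Model selection: forecast of the model with the highest probability."""
    best = max(range(len(model_probs)), key=lambda k: model_probs[k])
    return pred_up_probs[best]


# Two hypothetical models: a fast (1-month) and a slow (12-month) momentum model.
probs = [0.5, 0.5]
forecasts = [0.7, 0.4]            # the fast model signals an uptrend
averaged = dma_forecast(probs, forecasts)
posterior = update_model_probs(probs, forecasts, realized_up=True)
selected = dms_forecast(posterior, forecasts)
```

After a rebound month in which the fast model was right, its probability rises, so repeated applications of the update shift weight from slow to fast momentum speeds.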

In Table (5) we compare the performance of the dynamic classifier method and the Naive-TSMOM during the crash period. The second column of the table shows the accumulated return during the crash, where the Naive-TSMOM benchmark suffered losses of 25.3%. In fact, one aspect that is not well discussed in the literature is that the naive benchmark recovered from those losses only at the end of 2014. The 8 and 10-month naive strategies also delivered negative returns during the crash period, while shorter momentum strategies were able to survive the crash period with positive returns. Interestingly, the 4-month naive strategy presented a Sharpe ratio of 0.83, while the 12-month benchmark had a strongly negative risk-adjusted return, with a Sharpe ratio of -1.57.

Table (5) gives evidence of the failure of traditional longer TSMOM signals to anticipate drastic trend changes. On the other hand, by looking at the first and second panels of the table, we notice that our dynamic binary classifier was able to perform far better than the naive approach. As expected, fast momentum predictors delivered very high accumulated returns, especially when TVP are allowed. The 1-month single predictor for the TVP setting obtained 41.3% of total accumulated excess returns during the crash period, with an impressive Sharpe Ratio of 2.04. The only look-back period delivering a negative performance was the 12-month single predictor, but the losses were tiny compared to the naive approach. However, as mentioned before, sticking solely to a very fast look-back period can induce lower performance in the long run, so recognizing the time-varying importance of different momentum speeds is crucial for a stronger portfolio with lower risk and drawdowns. Since the DMA and DMS settings were built exactly with the goal of learning different momentum speed dynamics, they provide strong portfolio performances not just in the long run, as we have shown in previous sections, but also during the 2009 momentum crash period. Table (5) also makes clear the advantage of allowing time-varying parameters, since during bad periods the economic relations among financial data can change in a matter of a few periods. When the investor considers the DMS-TVP approach, she is able to obtain 22.1% of accumulated returns during the momentum crash, with a robust SR of 1.32. This means that a mean-variance investor would pay 4,541 bps to switch from the 12-month naive approach to the DMS-TVP method.

In order to provide evidence that the dynamic classifier method with TVP is capable of learning from past mistakes, assigning higher probabilities to those look-back periods that have performed better in the recent past and reducing the probabilities of trends that no longer exist, Figure (2) shows inclusion probabilities for momentum predictors of 1, 2, 10 and 12 months, averaged across all assets. For a given asset, an inclusion probability (IP) for a specific momentum speed L can be defined as

where J represents the subset of models containing the specific momentum predictor L and 1(j ⊂ J) is an indicator function taking the value 1 if model j is contained in J. Hence, a higher IP_L means that models with the momentum predictor L have performed better in the recent past and therefore receive higher model probabilities. Since we average IP_L over all assets available at each time period, Figure (2) gives us a sense of the overall importance of longer or shorter trends over time, especially during the 2009 momentum Crash.
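The inclusion probability described above amounts to summing the probabilities of all models that contain a given look-back period and then averaging that sum across assets. A minimal sketch follows; the representation of each model as a tuple of look-back periods, the function names and the toy probabilities are hypothetical.

```python
def inclusion_probability(model_probs, lookback):
    """Sum the probabilities of all models whose predictor set contains `lookback`.

    model_probs: dict mapping a tuple of look-back periods (the model's
                 predictors) to that model's probability at a given month.
    """
    return sum(prob for predictors, prob in model_probs.items() if lookback in predictors)


def average_inclusion_probability(model_probs_by_asset, lookback):
    """Average the inclusion probability over all assets available that month."""
    values = [inclusion_probability(mp, lookback) for mp in model_probs_by_asset.values()]
    return sum(values) / len(values)


# Toy model probabilities for two assets at a single month.
probs_by_asset = {
    "asset_a": {(1,): 0.4, (12,): 0.1, (1, 12): 0.3, (2,): 0.2},
    "asset_b": {(1,): 0.2, (12,): 0.5, (1, 12): 0.1, (2,): 0.2},
}

ip_1m_asset_a = inclusion_probability(probs_by_asset["asset_a"], 1)
ip_12m_asset_a = inclusion_probability(probs_by_asset["asset_a"], 12)
avg_ip_1m = average_inclusion_probability(probs_by_asset, 1)
```

Inclusion probabilities of different look-back periods need not sum to one, because a model with several predictors contributes to more than one of them.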

It is evident that as soon as the rebound started at the beginning of 2009, models with 10 and 12-month momentum predictors sequentially received lower probabilities, while the 1-month momentum predictor increased in importance. The 2-month momentum predictor continued to oscillate around its earlier inclusion probabilities, but remained higher than the longer momentum predictors. It is interesting to notice that since the 2009 Crash, longer momentum speeds have remained much less important than faster momentum speeds, in line with recent evidence on the increase in trend breaks (Baltas and Kosowski, 2020, Garg et al., 2020 and Garg et al., 2021). At the same time, the 1-month look-back period remains the predictor with the highest inclusion probabilities. In fact, although longer trends were more important than they are nowadays, the 1-month momentum already had greater importance even before the Great Recession. Therefore, Figure (2) gives evidence that sequentially learning the importance of each momentum speed, and combining or selecting this information to generate out-of-sample signal forecasts, was able to deal with the 2009 trend break problem with great success, as Table (5) highlights.

The table reports economic performance after the time-series momentum crash period (2010m07 until 2020m09) from classifier models using CP and TVP, considering single predictors (1m, 2m, 4m, ..., 12m) or applying DMA or DMS with all predictors in the model space. Results consider a classifier cutoff of c = 50%. All strategies are scaled to an ex-post annualized volatility of 10%.

After the Great Recession, the number of turning points increased considerably for different assets. Garg et al. (2020) show that there is a negative relation between the number of turning points and the Sharpe Ratios of TSMOM strategies. In order to show the robustness of our dynamic classifier approach in the periods following the 2009 Crash, Table (6) displays portfolio results from July 2010 to September 2020. Our results confirm the weakness of the Naive-TSMOM strategies in the period after the Great Financial Crisis, as observed in the works of Rubesam (2020), Garg et al. (2020) and Baltas and Kosowski (2020). The naive benchmark strategy obtained a Sharpe ratio of 0.50 during the period, a reduction of about 40% compared to the whole sample evaluated before, and the results remain weaker regardless of the look-back period considered.

When our binary classifier is applied, the picture remains very favorable. Indeed, we observe a much stronger performance than the Naive-TSMOM in the period following the 2009 Crash. For any single predictor setting, the performance is better than that of the naive benchmark. Portfolio improvements are observed regardless of the dynamics induced in the coefficients. The DMS-TVP was able to generate a Sharpe Ratio of 1.10, a 120% increase relative to the naive benchmark, and the investor would be willing to pay an annualized management fee of 609.3 bps to give up the traditional naive benchmark and start using the DMS-TVP model. It is interesting to notice that, for this particular sample period, the CP setting performed even better than the model settings where TVP are allowed. The DMS-CP showed a Sharpe ratio of 1.18, corresponding to a management fee of 698.9 bps to switch from the naive benchmark to this particular model setting. These results show no incremental performance from coefficient dynamics in this period: a simple constant parameter binary classifier model is able to deal successfully with the number of turning points after the 2009 Crash. The most important pattern observed in the period is the strong performance of the single 1-month momentum predictor. The results are in line with Figure (2), where the 1-month look-back emerged with higher inclusion probabilities than longer/slower momentum measures.

Finally, just for the sake of curiosity, one of the largest drawdowns of the naive benchmark strategy occurred exactly during the Covid period. From April 2020 to September of that year, the traditional 12-month time-series momentum strategy accumulated losses of 8.6%. The performance was not worse in 2020 because many assets were signaling negative momentum at the very beginning of the year, so that, when the market suffered huge losses in March, the strategy was able to profit from negative trends, earning 8.7% in that month. Hence, from March to September, the naive benchmark accumulated losses of just 0.6%. On the other hand, the DMS-TVP was able to deliver 18.4% of accumulated returns from March to September. Its CP counterpart, the DMS-CP model, also performed quite well in this period, earning 15.0% of accumulated returns.

Therefore, Table (6) gives evidence that the dynamic classifier performance is robust even in periods with more trend breaks. Since dynamic model probabilities are able to assign higher or lower probabilities to models with different momentum speeds, as we have shown in Section 3, new financial environments are not enough to weaken its final portfolio performance. What we actually see is the opposite: periods of larger changes in financial trends are accompanied by better risk-adjusted returns.

Since the work of Moskowitz et al. (2012), the literature on trend-following strategies has grown rapidly and its applicability has spread throughout the financial industry. However, there is still a lack of discussion of how to incorporate econometric models to help investors learn about better momentum speeds over time. From the investor's perspective, a better understanding of the time-varying relations between past accumulated returns and future return signals is crucial for portfolio construction. Recent evidence has shown that standard discretionary time series strategies tend to suffer from stronger trend breaks and crashes, which dramatically harm risk-adjusted portfolio returns.

In this study we propose the use of a dynamic binary classifier model where investors can sequentially learn the sensitivities between past returns and future signals. By imposing time-varying parameters, the model is able to adapt to changes in the financial market, moving faster from momentum to reversal if it is empirically warranted. Also, by the use of dynamic model probabilities, the approach is able to recognize sudden turning points, sequentially switching from slow to fast momentum measures after a market rebound, dramatically reducing drawdowns and momentum crashes. Our results show not just forecasting accuracy gains compared to the naive time series momentum strategy but also that an investor using the dynamic classifier approach earns annualized Sharpe Ratios much higher than the naive benchmark. We analyze different model specifications, cutoffs and subsamples, and the results remain robust. The performances remained quite strong even after the Great Financial Crisis. Considering a mean-variance investor with a quadratic utility, we show that she would be willing to pay an annualized management fee of 425.1 basis points to switch from the naive 12-month time series momentum strategy to our dynamic classifier approach with model selection and time-varying coefficients. Therefore, the approach generates not just strong portfolio performance, but also great economic utility gains for investors. We show that utility gains are even higher during the 2009 momentum Crash and in the last decade. This is good news for portfolio managers who are interested in improving trend investment strategies in an unstable financial world with high model uncertainty and rapid and complex changes over time.

The strong results obtained using futures contracts, in particular among commodity futures, motivate us to consider a larger set of commodities as an extension for future research. We also intend to extend the analysis to a larger cross-section of equity returns. Inspired by the recent works of Jiang, Kelly, and Xiu (2020) and Kelly, Moskowitz, and Pruitt (2021), where the authors also apply econometric forecasting models to portfolio construction, our future interest is to test the dynamic classifier approach for cross-sectional momentum strategies, building long-short portfolios from different quantiles of model ranking predictions. We believe that this extension can be a strong competitor among return forecasting models in the recent momentum literature.

We explain here the implicit cutoff selection obtained from a Bayesian decision perspective. The main goal of the Bayesian investor is to sequentially select the action that maximizes expected utility. For each period t and each available asset i, the investor faces two simple actions within the set of possible actions A = {Long, Short}, i.e., she can open a long or a short position in asset i. After the true return is realized, each action produces a different utility for the investor, and this utility depends on the actual return direction in period t. If the investor went long an asset whose return turned out to be positive, she should receive a positive utility gain. The same applies if she went short an asset whose return turned out to be negative. When the action taken by the investor does not match the actual direction, she should lose utility. The Table below summarizes these utilities for each action and outcome, where:

• $U_{L,P}$ is the utility when a Long position is opened and the actual return was Positive;

• $U_{L,N}$ is the utility when a Long position is opened and the actual return was Negative;

• $U_{S,P}$ is the utility when a Short position is opened and the actual return was Positive;

• $U_{S,N}$ is the utility when a Short position is opened and the actual return was Negative.

Since we consider a quadratic utility for the mean-variance investor, in the same spirit as Equation ( where the overbar denotes historical sample estimates up to the decision period and the sign subscripts in parentheses filter for positive or negative historical observations. Hence, the investor treats those utility estimates as the possible final outcomes before committing to a specific action.
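A minimal sketch of how the four utility estimates could be computed from an asset's return history. The exact expression is not shown above, so the mean-variance form (mean minus gamma/2 times variance of the position's return), the helper name `utility_estimates`, and the risk-aversion value `gamma` are assumptions made purely for illustration.

```python
import numpy as np

def utility_estimates(returns, gamma=5.0):
    """Estimate the utility of each (action, outcome) pair from past returns.

    Hypothetical helper. Assumes a mean-variance utility,
    mean - (gamma / 2) * variance, of the position's return, computed
    separately on the positive and the negative historical observations.
    Both subsamples must be non-empty.
    """
    r = np.asarray(returns, dtype=float)
    subsamples = {"P": r[r > 0], "N": r[r < 0]}   # filter by realized sign
    signs = {"L": 1.0, "S": -1.0}                 # long earns r, short earns -r
    utilities = {}
    for action, sign in signs.items():
        for outcome, sample in subsamples.items():
            position_returns = sign * sample
            utilities[(action, outcome)] = (
                position_returns.mean() - 0.5 * gamma * position_returns.var()
            )
    return utilities

# Illustrative history only (not data from the paper)
history = [0.02, -0.01, 0.03, -0.02, 0.01, -0.03]
U = utility_estimates(history)
```

Under this assumed form, a position that matches the realized direction receives a positive utility estimate and a mismatched position a negative one, in line with the description above.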

At the end of time t − 1, the investor chooses the action that maximizes her expected utility for time t, where the expected utility of each action depends on the forecasting output for asset i from the model she is considering. As an example, suppose the investor wants to open a position in a specific asset i and uses DMA as the forecasting model to obtain the probability of a positive return in the next period. Hence, she will use the forecasting output as in Equation (17) to weigh the utilities of each action.

Table (7) below shows results when the investor sequentially applies the mechanism explained above for each available asset across time. It is important to highlight that the Bayesian decision is applied only to produce the signs of the trading positions; portfolio construction and weighting still follow the same structure as Equations (18) and (19).
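The expected-utility comparison described above can be sketched as follows. Given a forecast probability `p_up` of a positive return (for example, from DMA) and the four utility estimates, the investor goes long when the expected utility of Long is at least that of Short. Setting the two expected utilities equal and solving for `p_up` gives the implicit cutoff. The function names and the numerical utilities are hypothetical.

```python
def expected_utilities(p_up, u_lp, u_ln, u_sp, u_sn):
    """Expected utility of Long and Short given P(positive return) = p_up."""
    eu_long = p_up * u_lp + (1.0 - p_up) * u_ln
    eu_short = p_up * u_sp + (1.0 - p_up) * u_sn
    return eu_long, eu_short

def bayes_action(p_up, u_lp, u_ln, u_sp, u_sn):
    """Pick the action with the larger expected utility (ties go Long)."""
    eu_long, eu_short = expected_utilities(p_up, u_lp, u_ln, u_sp, u_sn)
    return "Long" if eu_long >= eu_short else "Short"

def implicit_cutoff(u_lp, u_ln, u_sp, u_sn):
    """Probability at which Long and Short have equal expected utility.

    Assumes a correct call beats a wrong one in each state:
    u_lp > u_sp and u_sn > u_ln.
    """
    gain_if_up = u_lp - u_sp      # benefit of being long when returns are positive
    gain_if_down = u_sn - u_ln    # benefit of being short when returns are negative
    return gain_if_down / (gain_if_up + gain_if_down)

# Hypothetical utilities: down moves assumed larger than up moves
u_lp, u_ln, u_sp, u_sn = 0.02, -0.03, -0.02, 0.03
cutoff = implicit_cutoff(u_lp, u_ln, u_sp, u_sn)         # 0.06 / (0.04 + 0.06)
action_55 = bayes_action(0.55, u_lp, u_ln, u_sp, u_sn)   # below the cutoff
action_65 = bayes_action(0.65, u_lp, u_ln, u_sp, u_sn)   # above the cutoff
```

In this illustration the cutoff is 0.6 rather than 0.5, so a forecast probability of 0.55 still leads to a short position. The cutoff moves away from one half whenever the utilities are asymmetric across positive and negative outcomes.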

The general conclusions from Table (7) are the same as those obtained in the main body of the paper.

The table reports economic performance from classifier models using constant parameters (CP) and time-varying parameters (TVP), considering single predictors (1m, 2m, 4m, ..., 12m) or applying DMA or DMS with all predictors in the model space. Classifier cutoffs are obtained implicitly from a sequential Bayesian decision problem, where the investor selects the best trading action based on its expected utility. All strategies are scaled to an ex-post annualized volatility of 10%.
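The ex-post volatility scaling mentioned above can be sketched as below. The sketch assumes monthly strategy returns and annualizes with the square root of 12; the function name `scale_to_target_vol` and the simulated returns are hypothetical and are not results from the paper.

```python
import numpy as np

def scale_to_target_vol(returns, target_vol=0.10, periods_per_year=12):
    """Rescale a return series so its realized annualized volatility hits target_vol.

    Ex-post scaling: the full-sample standard deviation is used, so this is a
    reporting convention rather than a tradable rule.
    """
    r = np.asarray(returns, dtype=float)
    realized_vol = r.std(ddof=1) * np.sqrt(periods_per_year)
    return r * (target_vol / realized_vol)

# Simulated monthly returns, for illustration only
rng = np.random.default_rng(0)
raw = rng.normal(0.01, 0.05, size=240)
scaled = scale_to_target_vol(raw)
scaled_vol = scaled.std(ddof=1) * np.sqrt(12)
```

Because every return is multiplied by the same constant, the ratio of mean to volatility is unchanged; the scaling only makes return and drawdown figures comparable across strategies.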

Value and momentum everywhere

Trend-following, risk-parity and the influence of correlations

Momentum strategies in futures markets and trend-following funds

Demystifying time-series momentum strategies: volatility estimators, trading rules and pairwise correlations

Momentum has its moments

Exchange rate predictability and dynamic Bayesian learning

Predicting excess stock returns out of sample: Can anything beat the historical average?

Forecasting cryptocurrencies under model and parameter instability

Sparse signals in the cross-section of returns

Predictive regressions with time-varying coefficients

Momentum crashes

The economic value of volatility timing

Market intraday momentum

Breaking Bad Trends

Available at SSRN 3489539

The trend is your friend: Time-series momentum strategies across equity and commodity markets

Empirical asset pricing via machine learning

Factor momentum everywhere

A trend factor: Any economic gains from using information over investment horizons?

Demystifying managed futures

A century of evidence on trend-following investing

Time series momentum and macroeconomic risk

Forecasting recessions with time-varying models

Returns to buying winners and selling losers: Implications for stock market efficiency

(Re-)Imag(in)ing Price Trends

Sequential learning, predictability, and optimal portfolio returns

Understanding momentum and reversal

Time series momentum and volatility scaling

Forecasting inflation using dynamic model averaging

Large time-varying parameter VARs

Adaptive variable selection for sequential prediction in multivariate dynamic models

Dynamic Ordering Learning in Multivariate Forecasting

Dynamic Portfolio Allocation in High Dimensions using Sparse Risk Factors

Time-series momentum in nearly 100 years of stock returns

What can we learn from the return predictability over the business cycle?

The economic value of predicting stock index returns and volatility

Dynamic logistic regression and dynamic model averaging for binary classification

Time series momentum

Time series: modeling, computation, and inference

Online prediction under model uncertainty via dynamic model averaging: Application to a cold rolling mill

Out-of-sample equity premium prediction: Combination forecasts and links to the real economy

The Long and the Short of Risk Parity

A comprehensive look at the empirical performance of equity premium prediction

Bayesian forecasting and dynamic models

Dynamic dependence networks: Financial time series forecasting and portfolio decisions