key: cord-0228932-4s2ho96d
authors: Li, Xiaoyue; Uysal, A. Sinem; Mulvey, John M.
title: Multi-Period Portfolio Optimization using Model Predictive Control with Mean-Variance and Risk Parity Frameworks
date: 2021-03-19
journal: nan
DOI: nan
sha: dca409ac498aa3c8081d34f9eb09c0867144f497
doc_id: 228932
cord_uid: 4s2ho96d

We employ model predictive control for a multi-period portfolio optimization problem. In addition to the mean-variance objective, we construct a portfolio whose allocation is given by model predictive control with a risk-parity objective, and provide a successive convex program algorithm that yields solutions 30 times faster and more robustly in our experiments. Computational results on a multi-asset universe show that multi-period models perform better than their single-period counterparts in the out-of-sample period, 2006-2020. The out-of-sample risk-adjusted performance of both the mean-variance and risk-parity formulations beats the fix-mix benchmark, achieving Sharpe ratios of 0.64 and 0.97, respectively.

Portfolio optimization is one of the central problems in financial engineering. Popular allocation methods include mean-variance optimization (Markowitz (1952)), fixed-mix strategies motivated by Merton (1969), and risk parity and risk budgeting (Maillard et al. (2010), Bruder and Roncalli (2012), Chaves et al. (2011)). Fixed-mix strategies such as 60/40 provide a rule of thumb for average investors who seek steady growth of the portfolio over a long horizon with modest downside protection. Mean-variance optimization takes advantage of information by considering estimates and forecasts of the performance of the underlying assets. It successfully catches uptrends when one has a promising mechanism for predicting future returns. On the other hand, empirical results suggest that estimating asset returns is difficult and inaccurate compared to estimating covariance matrices. To alleviate the harm of inferior return predictions, some investors look for strategies that are robust under worst-case scenarios (Gülpınar and Rustem (2007)), while others seek a stable investment strategy that distributes risk evenly across the assets, namely risk parity.

Mean-variance and risk parity allocation strategies with rolling single-period optimization have proven to be successful and elegant methods. On the other hand, single-period models fail to fully address important issues in portfolio management, including transaction costs, changes in market dynamics, intermediate cash flows, and goal-based risk measures. Topaloglou et al. (2008) compare a dynamic stochastic programming model for portfolio management with single-stage and two-stage setups, and find that the two-stage model provides a dominating return-CVaR efficient frontier. In this paper, we introduce a portfolio allocation framework based on a multi-period optimization problem with a choice of allocation model.

It is non-trivial to solve multi-period optimization problems. With traditional numerical methods, the running time grows exponentially as a function of problem size. Even a modest number of time periods, say above 5-10, combined with branching processes exceeds the capability of most modern computers without specialized algorithms. To deal with the curse of dimensionality, we adopt a model predictive control (MPC) framework for portfolio optimization (Boyd et al. (2017)), and compare the performance of MPC based on mean-variance optimization with that based on risk parity. The market environment is described by a hidden Markov model (Nystrup et al. (2019)), which provides the asset parameter forecasts needed by the multi-period optimization models. Our results show that the models have different strengths and that their performance depends upon the underlying market environment, but both succeed in beating the fix-mix benchmark.

The investment framework we propose is as follows:

1. Employ the hidden Markov model on the historical returns to estimate the expected returns and covariance matrices for future periods;

2. Solve two multi-period portfolio allocation problems based on MPC with a mean-variance objective and MPC with a risk-parity objective.

The contribution of this paper is two-fold. First, we extend the model predictive control approach of Boyd et al. (2017) to solve a multi-period risk-parity portfolio problem with transaction control. The resulting portfolio enjoys a high Sharpe ratio and an evenly distributed risk contribution compared to benchmark strategies and the mean-variance approach. We also illustrate the asset allocations of both the mean-variance and risk-parity strategies over a specific time period to show the relative strength of each objective. Second, a successive convex algorithm is derived to solve the MPC problem with a risk-parity objective. The convex formulation of the risk-parity objective shortens the running time and stabilizes the numerical solution. We find empirically that the algorithm converges within a few steps and attains the desired global solution.

Section 2 introduces the hidden Markov model and our implementation to address the market environment for the next time step. Section 3 introduces the multi-period portfolio optimization problem, and Section 4 explains model predictive control approach with the mean-variance and risk parity formulations. Section 5 presents computational results, and Section 6 concludes.

We model market returns with a hidden Markov model (HMM) (Baum et al. (1970)). In this section, we discuss the structure of the HMM and its merits. Popular pricing and return models include geometric Brownian motion, mean-reverting models such as the Ornstein-Uhlenbeck process, and regime-based models such as the HMM. Elegant as geometric Brownian motion is, market returns are often asymmetric and have a heavier left tail than a normal distribution, suggesting that an HMM could provide a more realistic description of the return distribution. Dias et al. (2015) discuss regime classification for 21 stock markets and find that their extended HMM provides a meaningful classification for general markets. In fact, the usefulness of the HMM has been demonstrated by its success in capturing stock price changes during the 2008 financial crisis. Many papers implement HMMs to exploit regime-switching dynamics in investment decision-making frameworks, including Bae et al. (2014), Reus and Mulvey (2016) and Nystrup et al. (2019). The recent paper by Uysal and Mulvey (2021) shows the benefits of regime-switching models in risk parity portfolios.

A hidden Markov model is a pair of series $\{(X_t), (Y_t)\}$ such that

• $\{(X_t)\}$ is a Markov process, whose states are not directly observable; and

• $\{(Y_t)\}$ is a series of $X_t$-measurable variables that can be observed.

For simplicity and interpretability, we model the market returns with two states: a normal regime and a contraction regime. In each regime, the returns follow a multivariate Gaussian distribution whose parameters depend on the underlying regime.

In the context of the HMM, $\{(X_t)\}$ is the series of market regimes, which investors cannot perceive directly; $\{(Y_t)\}$ is the return series that investors observe and from which they infer the likelihood of the current regime.

During normal regimes, the market usually experiences positive expected returns and low (co)variances, whereas in contraction regimes, the market faces low expected returns and high (co)variances. The transitions between these two regimes are described by the transition probability matrix

$$P = \begin{pmatrix} p_{nn} & 1 - p_{nn} \\ 1 - p_{cc} & p_{cc} \end{pmatrix},$$

where $p_{nn}$ is the probability that the next period stays in the normal regime given that the current period is normal, and $p_{cc}$ is the probability that the next period stays in the contraction regime given that the current period is in contraction.

Forecasts of future returns and covariance matrices are based on the fitted HMM parameters. Given that the returns follow $N(\mu_n, \Sigma_n)$ under the normal regime and $N(\mu_c, \Sigma_c)$ under the contraction regime, and that the probability of the market being normal at time $t$ is $q_t$, then

• the probability that the market is in the normal regime at time $t+1$ is $\hat{q}_{t+1} = q_t\, p_{nn} + (1 - q_t)(1 - p_{cc})$;

• the forecast of the expected return vector at time $t+1$ is $\hat{\mu}_{t+1} = \hat{q}_{t+1}\mu_n + (1 - \hat{q}_{t+1})\mu_c$;

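• the forecast of the covariance matrix of returns at time $t+1$ is $\hat{\Sigma}_{t+1} = \hat{q}_{t+1}\Sigma_n + (1 - \hat{q}_{t+1})\Sigma_c + \hat{q}_{t+1}(1 - \hat{q}_{t+1})(\mu_n - \mu_c)(\mu_n - \mu_c)^T$.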

The parameters for times $t+2, t+3, \ldots$ can be estimated iteratively, starting from $\hat{q}_{t+1}$, $\hat{\mu}_{t+1}$ and $\hat{\Sigma}_{t+1}$.
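The iterative forecast described above can be sketched in a few lines of NumPy. This is a minimal illustration rather than the authors' implementation: the function name `hmm_forecasts` and the numerical inputs are hypothetical, and the covariance update assumes the standard covariance of a two-component Gaussian mixture.

```python
import numpy as np

def hmm_forecasts(q_t, p_nn, p_cc, mu_n, mu_c, sigma_n, sigma_c, horizon):
    """Iterate two-regime forecasts for t+1, ..., t+horizon.

    q_t is the probability of the normal regime at time t.
    Returns lists of regime probabilities, mean vectors and covariances.
    """
    mu_n, mu_c = np.asarray(mu_n, float), np.asarray(mu_c, float)
    sigma_n, sigma_c = np.asarray(sigma_n, float), np.asarray(sigma_c, float)
    diff = mu_n - mu_c
    q = float(q_t)
    qs, mus, sigmas = [], [], []
    for _ in range(horizon):
        # One-step Markov update of the normal-regime probability.
        q = q * p_nn + (1.0 - q) * (1.0 - p_cc)
        # Probability-weighted mean of the two regimes.
        mu = q * mu_n + (1.0 - q) * mu_c
        # Assumption: covariance of a two-component mixture, i.e. weighted
        # within-regime covariances plus a between-regime term.
        sigma = (q * sigma_n + (1.0 - q) * sigma_c
                 + q * (1.0 - q) * np.outer(diff, diff))
        qs.append(q)
        mus.append(mu)
        sigmas.append(sigma)
    return qs, mus, sigmas

# Illustrative (made-up) daily parameters for two assets.
mu_n, mu_c = [0.0005, 0.0002], [-0.0010, 0.0003]
sigma_n = [[1e-4, 0.0], [0.0, 1e-5]]
sigma_c = [[4e-4, -2e-5], [-2e-5, 2e-5]]
qs, mus, sigmas = hmm_forecasts(0.9, 0.95, 0.80, mu_n, mu_c,
                                sigma_n, sigma_c, horizon=5)
```

With these inputs the normal-regime probability decays from 0.9 toward the stationary value $(1 - p_{cc}) / (2 - p_{nn} - p_{cc}) = 0.8$, so forecasts far ahead become nearly constant.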

Importantly, in our empirical work, all approaches, including the estimation of the HMM parameters, are evaluated in an out-of-sample test period. We estimate the parameters of the HMM by the expectation-maximization algorithm based on market returns over the past 2000 days. The rolling window is chosen to be 2000 days so that (i) the estimation is made on the most recent market performance, and (ii) the data is likely to include a full market cycle. To illustrate the effectiveness of the HMM, we compare the performance of portfolios based on the parameters from the HMM with those based on naive historical estimates. For stability of the estimation, we fit the regimes with two driving market variables, US equity and US Treasury returns, and use the regime labels to estimate the mean and covariance matrix under each regime for all asset categories. Here, a period is labeled as the normal (growth) regime if the HMM probability of the normal regime for the next period is 0.5 or greater, and as the contraction regime if that probability is less than 0.5.
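The labeling rule and the per-regime estimates can be illustrated as follows. This is a sketch under stated assumptions: `prob_normal` stands for normal-regime probabilities produced by an already fitted HMM (the EM fit itself is not shown), the function name `regime_estimates` is hypothetical, and the data are synthetic.

```python
import numpy as np

def regime_estimates(returns, prob_normal, threshold=0.5):
    """Label each day and estimate the mean/covariance within each regime.

    returns:     (days, assets) array of asset returns
    prob_normal: (days,) HMM probability of the normal regime
    """
    returns = np.asarray(returns, float)
    prob_normal = np.asarray(prob_normal, float)
    # Normal regime if the probability is 0.5 or greater, else contraction.
    is_normal = prob_normal >= threshold
    estimates = {}
    for name, mask in (("normal", is_normal), ("contraction", ~is_normal)):
        sample = returns[mask]
        estimates[name] = (sample.mean(axis=0), np.cov(sample, rowvar=False))
    return is_normal, estimates

# Synthetic inputs, for illustration only.
rng = np.random.default_rng(0)
returns = rng.normal(0.0, 0.01, size=(200, 3))
prob_normal = rng.uniform(0.0, 1.0, size=200)
labels, estimates = regime_estimates(returns, prob_normal)
```

Each regime needs at least two labeled days for the sample covariance to be defined; a production implementation would have to handle windows in which one regime is rare.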

Effective as single-period portfolio optimization models are, they fail to adequately address critical concerns in portfolio management, including transaction costs and taxes, changes in asset return dynamics, and the trade-off between short-term and long-term benefits. Here, we provide a generic model for the multi-period portfolio optimization problem as follows:

where $T$ is the investment horizon, $\pi_0, \pi_1, \ldots, \pi_{T-1} \in \mathbb{R}^n$ are the allocations at the beginning of each period, $\mathbf{1} \in \mathbb{R}^n$ is the vector of all ones, $W_t$ is the dollar value of wealth at the beginning of period $t$, $r_t \in \mathbb{R}^n$ is the vector of returns in period $t$, $\odot$ is the element-wise multiplication operator, and $C(W; \pi^{\rightarrow}, \pi)$ is the dollar value of transaction and market impact costs when the allocation is rebalanced to $\pi^{\rightarrow}$ from $\pi$ with current wealth being $W$. All quantities with $\rightarrow$ correspond to the quantity at the end of the period. Initial wealth $W_0$ and the transaction function $C(\cdot; \cdot, \cdot)$ are given. The distribution of asset returns $r_t$ depends on specific circumstances. The arguments of utility function

Multi-period models provide superior capabilities over single-period models, as they take future events into consideration when planning for the current period. However, multi-period optimization suffers from the curse of dimensionality: the running time grows exponentially as the complexity of the problem increases. In the following section, we introduce a truncated multi-period model that avoids the curse of dimensionality while taking advantage of multi-period forecasting. We consider two variations of the multi-period portfolio problem with the MPC approach: 1) mean-variance, and 2) risk-parity.

The mean-variance framework has been the traditional approach to deciding portfolio allocations on the basis of the return-risk trade-off (Markowitz (1952)). The mean-variance coefficient provides an intuitive choice of return-risk balance, and investors may choose the coefficient according to their risk appetite. Platanakis et al. (2020) show that mean-variance portfolios are superior to 1/n portfolios in the asset allocation problem. However, the framework faces practical drawbacks (Kolm et al. (2014)), including sensitivity to estimated input parameters and concentration of portfolio risk. Among the input parameters, expected return estimates cause large changes in asset allocations and are found to be hard to estimate. On the other hand, van Staden et al. (2021) provide an extensive study of the robustness of mean-variance portfolios under different modeling choices, and they find that dynamic mean-variance portfolios are more robust to model misspecification, especially in the multi-period setting.

Risk management has become an important part of portfolio analysis, especially since the 2008 financial crisis. Investors have moved towards more risk-based investment strategies due to their historically low drawdowns. Risk budgeting portfolio optimization is a popular risk-based asset allocation technique (Bruder and Roncalli (2012)). Here, risk budgets are assigned to each asset's risk contribution, and equalizing all risk budgets in the portfolio is known as the risk parity strategy (Maillard et al. (2010)). Unlike mean-variance, the risk parity strategy provides a balanced risk concentration in the portfolio. Furthermore, risk parity portfolios do not require expected asset return estimates as input, and they have been shown to be robust to estimation errors. By their nature, risk parity portfolios tilt primarily towards low-risk assets (historically fixed income products), and provide low but stable returns over the investment horizon. Although risk parity portfolios can be conservative from an investor's perspective, their returns can be enhanced with leverage and target return constraints. There are conflicting views on risk parity performance in high interest rate environments due to the fixed-income-heavy asset allocation. However, Uysal and Mulvey (2021) show that regime-switching models can help to improve risk parity performance in the historically high interest rate period of the late 1970s. Some researchers (Chaves et al. (2011)) show that the performance of risk parity depends on the selected asset universe. On the computational aspect, find-

To demonstrate the performance and compare relative strengths, we implement multi-period formulations based on both frameworks.

We introduce the mean-variance term and the transaction cost with an $\ell_1$ penalty in the objective function with long-only budget constraints as follows:

where γ risk is the risk-aversion parameter and γ trade is the penalty for transactions.

When transaction costs are incurred, the transaction penalty helps to avoid unnecessary turnover associated with weak signals, which is intuitively consistent with Gârleanu and Pedersen (2013), who suggest a gradual move toward the target portfolio in the presence of transaction costs. $\hat{r}_{\tau|t}$ and $\hat{\Sigma}_{\tau|t}$ are the return and covariance matrix forecasts produced by the HMM at time $t$ for periods $\tau = t+1, \ldots, t+H$. Note that the actual transaction cost is calculated from the difference between the beginning-of-period allocation and the end-of-period allocation of the previous period. In the formulation, for computational simplicity, we replace the end-of-period allocation of the previous period with the beginning-of-period allocation of the previous period. When each time period is short, it is reasonable to assume that the end-of-period allocation is close to that at the beginning of the period. Therefore, $-\gamma_{trade}\|\pi_\tau - \pi_{\tau-1}\|_1$ provides a meaningful control of portfolio turnover.

At each period t, we solve the multi-period Problem 2 over the planning horizon H, which produces allocation vectors π t+1 , . . . , π t+H . We execute the first trade (π t+1 ) and move on to the next period, and solve the problem until the end of investment horizon T . Note that at period t, π t is not a decision variable, it is an input which denotes the current portfolio allocation. Notice that Problem 2 has a convex objective function with linear constraints, and it is efficiently solved by publicly available convex program solvers.
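To make the objective concrete, the sketch below evaluates a candidate plan under an assumed form of Problem 2, namely the sum over $\tau = t+1, \ldots, t+H$ of $\hat{r}_{\tau|t}^T\pi_\tau - \gamma_{risk}\,\pi_\tau^T\hat{\Sigma}_{\tau|t}\pi_\tau - \gamma_{trade}\|\pi_\tau - \pi_{\tau-1}\|_1$ with long-only budget constraints. That form is inferred from the surrounding definitions, and the function names and numbers are hypothetical. The code only scores plans; an actual MPC step would pass the same objective to a convex solver and execute the first row of the optimal plan.

```python
import numpy as np

def mpo_objective(plan, pi_current, r_hat, sigma_hat, gamma_risk, gamma_trade):
    """Value of the assumed mean-variance MPC objective for a candidate plan.

    plan:       (H, n) allocations pi_{t+1}, ..., pi_{t+H}
    pi_current: (n,)   current allocation pi_t (an input, not a decision)
    r_hat:      (H, n) return forecasts
    sigma_hat:  (H, n, n) covariance forecasts
    """
    plan = np.asarray(plan, float)
    r_hat = np.asarray(r_hat, float)
    sigma_hat = np.asarray(sigma_hat, float)
    # Previous allocation for each period: pi_t, pi_{t+1}, ..., pi_{t+H-1}.
    prev = np.vstack([np.asarray(pi_current, float), plan[:-1]])
    ret = np.einsum("hn,hn->h", r_hat, plan)                 # forecast return
    risk = np.einsum("hn,hnm,hm->h", plan, sigma_hat, plan)  # pi' Sigma pi
    trade = np.abs(plan - prev).sum(axis=1)                  # l1 trade size
    return float(np.sum(ret - gamma_risk * risk - gamma_trade * trade))

def is_feasible(plan, tol=1e-9):
    """Long-only budget constraints: weights non-negative, each row sums to 1."""
    plan = np.asarray(plan, float)
    return bool(np.all(plan >= -tol)
                and np.allclose(plan.sum(axis=1), 1.0, atol=tol))

# Illustrative two-asset, two-period example with made-up forecasts.
pi_t = np.array([0.5, 0.5])
r_hat = np.array([[0.01, 0.0], [0.01, 0.0]])
sigma_hat = np.tile(0.04 * np.eye(2), (2, 1, 1))
hold_plan = np.array([[0.5, 0.5], [0.5, 0.5]])
switch_plan = np.array([[1.0, 0.0], [1.0, 0.0]])
hold_value = mpo_objective(hold_plan, pi_t, r_hat, sigma_hat, 5.0, 0.01)
switch_value = mpo_objective(switch_plan, pi_t, r_hat, sigma_hat, 5.0, 0.01)
```

In this toy case holding the diversified portfolio scores higher than concentrating in the asset with the higher forecast return, because the risk term and the one-off trading penalty outweigh the extra expected return.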

Following the same formulation structure as in Problem 2, we implement the risk parity condition with the volatility risk measure. Volatility, a common risk measure in practice, $R(\pi) = \sqrt{\pi^T \Sigma \pi}$, is a homogeneous function of degree one and therefore satisfies the Euler decomposition. Under this risk measure, the marginal risk contribution of asset $i$ is $\partial R(\pi)/\partial \pi_i = (\Sigma\pi)_i / \sqrt{\pi^T \Sigma \pi}$. To enforce the risk parity condition in the asset allocation decision, where all risk contributions are equal, we introduce the following sum of squares term into the objective function:

where $b_i \in [0, 1]$ denotes the risk budget for asset $i$, which equals $1/n$ for the risk parity portfolio. As in Problem 2, the last part penalizes transactions with an $\ell_1$ term. Unlike the mean-variance problem, the MPC approach in the multi-period risk parity portfolio formulation leads to a non-convex problem due to the risk parity term.
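The risk contributions and a sum-of-squares deviation from the budgets can be computed as below. This is an illustrative sketch: the exact risk parity term of the paper is not recoverable from the text, so the code assumes contributions normalized to shares of total volatility and compared with budgets $b_i$; the function names are hypothetical.

```python
import numpy as np

def risk_contributions(pi, sigma):
    """Euler risk contributions pi_i * (Sigma pi)_i / sqrt(pi' Sigma pi)."""
    pi = np.asarray(pi, float)
    sigma = np.asarray(sigma, float)
    vol = np.sqrt(pi @ sigma @ pi)
    return pi * (sigma @ pi) / vol  # the entries sum to the portfolio volatility

def risk_parity_deviation(pi, sigma, budgets=None):
    """Sum of squared gaps between risk shares and budgets (assumed form)."""
    rc = risk_contributions(pi, sigma)
    share = rc / rc.sum()
    n = len(share)
    b = np.full(n, 1.0 / n) if budgets is None else np.asarray(budgets, float)
    return float(np.sum((share - b) ** 2))

# Two uncorrelated assets with 20% and 10% volatility (made-up numbers).
sigma = np.array([[0.04, 0.0], [0.0, 0.01]])
w_equal = np.array([0.5, 0.5])
w_parity = np.array([1.0 / 3.0, 2.0 / 3.0])  # inverse-volatility weights
dev_equal = risk_parity_deviation(w_equal, sigma)
dev_parity = risk_parity_deviation(w_parity, sigma)
```

For uncorrelated assets the inverse-volatility weights equalize the contributions, so `dev_parity` is zero up to rounding, while the equal-weight portfolio carries 80% of its risk in the more volatile asset.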

Problem 3 is non-convex due to the risk parity term. In practice, non-convexity not only increases the running time, but also impairs the stability of results when multiple local solutions exist. In order to solve the model predictive control problem with risk parity efficiently and effectively, we propose a successive convex optimization algorithm for solving Problem 3, based on the techniques introduced by Feng and Palomar (2015).

Let g τ,i (π τ,i ) denote the deviation from desired risk budget of asset i in period τ with allocation vector π τ,i , i.e., g τ,i (π τ,i ) =

We start the successive procedure with an initial solution π 0 = [π 0 t+1 , ..., π 0 t+H ]. At the k-th iteration, a linear expansion of g τ,i around π k τ gives g τ,

where δ τ > 0 is a regularization term for convergence purpose. Expanding the sum of squares, one finds that the first part of the objective function can be written as

Note that by the form of $Q^k_\tau$, all of them are positive definite when $\delta_\tau > 0$, and therefore the objective can be rearranged into a convex quadratic function. In particular, at the $k$-th iteration, the problem becomes
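The positive definiteness claim can be checked numerically for one period. The sketch below is an assumption-laden illustration, not the paper's exact construction: it takes $g_i(\pi) = \pi_i(\Sigma\pi)_i / (\pi^T\Sigma\pi) - b_i$, linearizes it with Jacobian $A$ at $\pi^k$, and forms $Q = 2A^TA + \delta I$ and $q = 2A^Tg(\pi^k) - Q\pi^k$, so that the regularized linearized sum of squares equals $\frac{1}{2}\pi^TQ\pi + q^T\pi$ up to a constant. All function names are hypothetical.

```python
import numpy as np

def g_and_jacobian(x, sigma, b):
    """Deviation g_i(x) = x_i (Sigma x)_i / (x' Sigma x) - b_i and its Jacobian."""
    x = np.asarray(x, float)
    s = sigma @ x
    v = x @ s
    g = x * s / v - b
    # d(x_i s_i)/dx_j = delta_ij s_i + x_i Sigma_ij, and dv/dx_j = 2 s_j.
    jac = ((np.diag(s) + x[:, None] * sigma) / v
           - 2.0 * np.outer(x * s, s) / v**2)
    return g, jac

def convex_subproblem(x_k, sigma, b, delta):
    """Quadratic surrogate 0.5 x'Qx + q'x of the linearized sum of squares
    plus the proximal term (delta / 2) * ||x - x_k||^2."""
    g, jac = g_and_jacobian(x_k, sigma, b)
    Q = 2.0 * jac.T @ jac + delta * np.eye(len(x_k))
    q = 2.0 * jac.T @ g - Q @ x_k
    return Q, q

# Made-up two-asset example.
sigma = np.array([[0.04, 0.006], [0.006, 0.01]])
b = np.array([0.5, 0.5])
x_k = np.array([0.5, 0.5])
delta = 0.1
g, jac = g_and_jacobian(x_k, sigma, b)
Q, q = convex_subproblem(x_k, sigma, b, delta)
min_eig = np.linalg.eigvalsh(Q).min()
```

Because $2A^TA$ is positive semidefinite, every eigenvalue of $Q$ is at least $\delta$, which is why a strictly positive regularization term makes each subproblem strictly convex.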

We solve Problem 3 with Algorithm 1.

Inputs: $\pi_{t-1}$; $\delta_\tau > 0$ for $\tau = t+1, \ldots, t+H$; tolerance

Feng and Palomar (2015) provide the convergence analysis for the single-period model, and the same analysis is applicable to our setting.

In addition to multi-period formulations of mean-variance and risk parity portfolios, we implement the single period formulations to analyze the contribution of multi-period formulations. We employ the same objective functions with transaction penalties in single period formulations, but the only decision variable is π t+1 . Inputs are one period ahead forecasts of asset returns and covariance matrix, and the current portfolio allocation vector π t .

The advantages of using the successive convex formulation for the multi-period risk-parity problem are two-fold: accuracy and running time. First, our experiments show that convergence to the global optimum is more stable with the successive convex programs. In particular, the algorithm usually converges within 10 steps and provides the desired risk-parity allocation. To illustrate the accuracy of the successive convex algorithm on the convex formulation of the multi-period risk parity portfolio (Problem 7), we run it in CVXPY (Diamond and Boyd (2016)) with the OSQP solver (Operator Splitting solver for Quadratic Programs) (Stellato et al. (2020)). The original risk-parity MPC (Problem 3) is solved with the SLSQP solver (Sequential Least Squares Programming) (Kraft (1988)) in the SciPy package. We set the maximum number of iterations to 10,000 for both solvers and leave the other parameters at their default values. Accuracy is measured by the $\ell_1$ norm of the difference between ex post and ex ante risk contributions. We consider portfolios with a close-to-zero transaction penalty ($\gamma_{trade} \le 10^{-3}$) to compare the ex post risk contributions with the nominal risk-parity solution. Error metrics are reported for four planning horizons ($H = 1, 5, 15, 30$) in Table 1. Computations are performed over the training period (1998-2005) with 10 assets. Our algorithm achieves a risk-parity measure of essentially zero, whereas the original formulation fails to converge from time to time. When the transaction penalty increases, our convex formulation yields allocations that deviate from the nominal risk-parity solution in order to balance the risk-parity objective against turnover. The risk contributions of the asset classes stay close to each other, while excessive transactions are avoided.
As the planning horizon ($H$) increases, errors grow faster in the non-convex formulation (Problem 3) than in the convex formulation (Problem 7). The errors also grow with the transaction penalty: for the convex formulation they are almost 10 times larger when $\gamma_{trade} = 10^{-3}$ than when $\gamma_{trade} = 10^{-6}$, because a higher transaction penalty pulls the allocation away from the true risk parity portfolio. In addition, the maximum error of the convex formulation ($71 \times 10^{-4}$) is significantly smaller than that of the non-convex formulation ($6{,}000 \times 10^{-4}$) over the hyperparameter space. For small $H$, the worst-case error of the convex formulation is one order of magnitude smaller than that of the non-convex formulation. For long planning horizons, $H = 15$ and $H = 30$, the non-convex formulation fails to converge on certain days, whereas the convex formulation consistently provides the desired solution.

Second, the successive algorithm shortens the running time relative to the non-convex formulation. For a wide range of hyperparameters, the convex formulation consistently takes less time to converge than the original non-convex formulation.

We implement the multi-period portfolio models in a rolling window whose parameters are learned from the returns of the past 2000 days. In every time period, we train the HMM to estimate asset parameters and the regime transition matrix for the next period. In each period, we assume a transaction cost of 10 basis points for each asset class. We consider various investment strategies in the portfolio analysis: 1) 100% stock, 2) equal weight (1/n), 3) 100% mean-variance ETF based on single- and multi-period mean-variance optimization (SPO, MPO), and 4) 100% risk-parity ETF based on single- and multi-period risk parity (SPO RP, MPO RP). We perform the following steps for the portfolio optimization techniques at each time period $t$, until the end of the investment horizon $T$:

1. Update the HMM parameters and obtain estimates of asset parameters for $H$ periods ahead.

2. Compute the optimal sequence of portfolio allocations $\pi^*_{k,t+1}, \ldots, \pi^*_{k,t+H}$ for each strategy $k \in$ {SPO, MPO, SPO RP, MPO RP} via optimization Problems 2 and 7.

3. Execute the trades $\pi^*_{k,t+1}$ for each strategy and calculate portfolio returns after transaction costs.

4. Return to step 1 and repeat.

5 Performance on Market Data

In the portfolio analysis, we consider daily asset returns from major asset classes over the period from 1991 to 2020. 1-month Treasury bill rate returns are obtained from

Notice that the bond indices have the highest Sharpe ratios due to historical macroeconomic conditions, but the US domestic equity index (S&P 500) has the highest return during the same period.

We separate the market data so that 1991-2005 are used for hyperparameter tuning, and 2006-2020Q3 are fully out-of-sample. After taking out the first 2,000 days for HMM parameter estimation, the actual tuning period we use starts from 1998 until the end of 2005. In particular, the hyperparameters to be tuned involve:

• In the SPO (single-period mean-variance optimization) and MPO (multi-period mean-variance optimization) formulations, one needs to decide the mean-variance coefficient $\gamma_{risk}$ and the transaction penalty $\gamma_{trade}$. In the multi-period setting, the length of the planning horizon $H$ should also be chosen carefully.

• In SPO RP (single-period risk-parity optimization) and MPO RP (multi-period risk-parity optimization) framework, one needs to choose the transaction penalty γ trade , with one extra hyperparameter H in the multi-period setup. This quantity could be different from that of MPO formulation, as the scales of mean-variance objective and risk-parity objective differ from each other. In the convex formulation of risk parity, the selection of hyperparameters may affect the running speed of Algorithm 1, but will not impact the output allocation from the algorithm as long as the solution converges to the proper optimum. Following Feng and Palomar (2015) and after some tuning, we choose γ 0 = 0.8 with updating rule γ k = 1 − 10 −7 γ k−1 and δ τ =

, where n is the number of assets.

Recall that each single-period model is the same as its multi-period counterpart with $H = 1$. In the following subsections, we analyze the hyperparameter tuning for the mean-variance formulation and the risk-parity formulation, respectively. We choose the hyperparameters with consideration of return generation, Sharpe ratio and turnover rate, where the turnover rate measures how much of the portfolio is traded over a period (a lower turnover rate corresponds to a longer average holding time). Mathematically, the annual turnover rate is $\sum_\tau \|u_\tau\|_1$, where $\tau$ runs over all trading days in the year and $u_\tau = w_\tau - w^{\rightarrow}_{\tau-1}$ is the vector of trade amounts on day $\tau$. Notice that the tuning period includes a major contraction period (2001-2002); still, all strategies provide promising Sharpe ratios. Although different families of strategies can bring distinct performance under different market conditions, it is reasonable to assume that, within the same family of strategies, the relationship between hyperparameters and performance is rather stable.
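The turnover definition can be illustrated with a short backtest-style calculation. This is a sketch under stated assumptions: the function name `backtest_turnover` is hypothetical, the portfolio is assumed to start at the first target allocation (so no trade occurs on the first day), and costs are charged as a proportion of wealth at a flat 10 basis points per unit traded.

```python
import numpy as np

def backtest_turnover(weights, returns, cost_rate=0.001):
    """Total turnover sum_tau ||u_tau||_1 and net daily portfolio returns.

    weights: (T, n) beginning-of-day target allocations w_tau
    returns: (T, n) realized asset returns r_tau
    """
    weights = np.asarray(weights, float)
    returns = np.asarray(returns, float)
    drifted = weights[0]  # assumption: holdings start at the first target
    turnover, net = 0.0, []
    for w, r in zip(weights, returns):
        trade = np.abs(w - drifted).sum()      # ||w_tau - w_{tau-1}^->||_1
        turnover += trade
        net.append(float(w @ r) - cost_rate * trade)  # return after costs
        grown = w * (1.0 + r)
        drifted = grown / grown.sum()          # end-of-day allocation
    return turnover, np.array(net)

# Two assets, constant 50/50 target, made-up returns.
weights = np.array([[0.5, 0.5], [0.5, 0.5]])
returns = np.array([[0.10, 0.0], [0.0, 0.0]])
turnover, net_returns = backtest_turnover(weights, returns)
```

After the first asset gains 10%, the holdings drift to roughly 52.4/47.6, so rebalancing back to 50/50 on the second day trades about 4.8% of wealth and costs about 0.5 basis points.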

Portfolios over the tuning period with various choices of planning horizon $H \in \{1, 2, 5, 10, 15, 30\}$, risk aversion coefficient $\gamma_{risk} \in \{0.01, 0.1, 1, 3, 5, 10\}$, and transaction penalty $\gamma_{trade} \in \{0.0001, 0.0005, 0.001, 0.005, 0.01, 0.05, 0.1, 0.5, 1, 5, 10, 25\}$ appear in Figure 2.

First, we decide on the planning horizon of the multi-period model. We find that shorter planning horizons provide better Sharpe ratios across different transaction penalties over the training period. With MPC, since all future parameters are replaced with their forecasting values, the estimation accuracy drops as planning horizon gets longer. There is a trade-off between a longer planning horizon and better estimation of parameters. Our tuning period suggests that a planning horizon of 5 days yields good performance, and we will keep H = 5 for γ risk and γ trade tuning as well as for the out of sample tests.

In general, a higher transaction penalty $\gamma_{trade}$ leads to a lower turnover rate. On the other hand, we find that the turnover rate does not decrease further and the Sharpe ratio stops increasing beyond a certain threshold, which is around 0.01 for MPO. Recall that the purpose of including the transaction penalty is to avoid frequent trading that consumes profit. Therefore, we pick $\gamma_{trade}$ at the kink, so that transactions are controlled in a meaningful way without fully discouraging transactions or preventing profit-generating rebalances. Choices of $\gamma_{trade}$ that are relatively close to the transaction cost of 10bp also have good interpretability with respect to the goal of generating returns (Remark 1), though the value can be higher or lower depending on the investor's transaction preferences.

Remark 1. By choosing a $\gamma_{trade}$ close to the transaction cost, the return and transaction penalty parts of the objective function of Problem 2 approximate the actual returns. Note that the expected return in period $\tau$ with allocation $\pi_\tau$ is $\hat{r}_{\tau|t}^T \pi_\tau$, and the transaction cost paid as a proportion of total wealth is $(\text{transaction cost} \times \|\pi_\tau - \pi^{\rightarrow}_{\tau-1}\|_1)$. Therefore, the expected realized return in period $\tau$ is $\hat{r}_{\tau|t}^T \pi_\tau - (\text{transaction cost} \times \|\pi_\tau - \pi^{\rightarrow}_{\tau-1}\|_1)$, which is approximated by $\hat{r}_{\tau|t}^T \pi_\tau - (\text{transaction cost} \times \|\pi_\tau - \pi_{\tau-1}\|_1)$ when returns are close to zero. When $\gamma_{trade}$ is close to the linear transaction cost, the corresponding return and transaction penalty parts together approximate the expected actual returns.

The $\gamma_{risk}$ parameter controls the trade-off between return and risk, and describes the subjective preference of the investor. Selecting $\gamma_{risk}$ solely on the Sharpe ratio could be dangerous due to market instability. When the market booms, a small $\gamma_{risk}$, which leads to an aggressive approach, will generate a high Sharpe ratio, whereas a conservatively large $\gamma_{risk}$ usually outperforms in a bearish market. In order to provide a reliable return-risk balance, we introduce an approximation of mean-variance optimization to constant relative risk aversion (CRRA) utility in Remark 2, where the CRRA utility function is

$$U(W) = \frac{W^{\gamma}}{\gamma}$$

at wealth level $W$ for risk aversion coefficient $\gamma$. Recalling that return estimation is generally harder and less accurate than covariance estimation, we pick a relatively conservative $\gamma_{risk}$ in order to achieve a stable portfolio. The literature suggests that people usually have a CRRA coefficient $\gamma \in [-10, 0]$, based on which we choose $\gamma_{risk}$ in Problem 2 equal to 5.

Remark 2. With some generic assumptions, we show that there is an approximate equivalence between CRRA utility with risk aversion $\gamma$ and the mean-variance optimization problem with coefficient $\lambda \approx \frac{1-\gamma}{2}$. A second-order Taylor expansion gives

Let $m = E[W]$ and $s^2 = Var(W)$. Note that terminal wealth is $W = W_0(1 + r)$, where $W_0$ is the initial wealth and $r$ is the return. Thus $m = W_0(1 + \bar{r})$ and $s^2 = s_r^2 W_0^2$, where $\bar{r}$ and $s_r^2$ are the mean and variance of $r$, respectively. We have

with a Taylor expansion of $\bar{r}$ around 0
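The steps of Remark 2 can be sketched as follows. This is a reconstruction under the assumption that the CRRA utility takes the form $U(W) = W^{\gamma}/\gamma$, which is the form consistent with the stated coefficient $\lambda \approx \frac{1-\gamma}{2}$; the intermediate expressions are not taken from the original equations.

```latex
\begin{align*}
E[U(W)] &\approx U(m) + \tfrac{1}{2}\,U''(m)\,s^2
         = \frac{m^{\gamma}}{\gamma} - \frac{1-\gamma}{2}\, m^{\gamma-2} s^2
         && \text{(second-order expansion around } m\text{)} \\
        &= W_0^{\gamma}\left[\frac{(1+\bar{r})^{\gamma}}{\gamma}
           - \frac{1-\gamma}{2}\,(1+\bar{r})^{\gamma-2}\, s_r^2\right]
         && \text{(using } m = W_0(1+\bar{r}),\ s^2 = s_r^2 W_0^2\text{)} \\
        &\approx W_0^{\gamma}\left[\frac{1}{\gamma} + \bar{r}
           - \frac{1-\gamma}{2}\, s_r^2\right]
         && \text{(expansion in } \bar{r} \text{ around } 0\text{)}
\end{align*}
```

The first-order term vanishes because $E[W - m] = 0$. Since $W_0^{\gamma} > 0$, maximizing the last expression is equivalent to maximizing $\bar{r} - \lambda s_r^2$ with $\lambda = \frac{1-\gamma}{2}$. For instance, $\gamma = -9$ lies in the range $[-10, 0]$ and corresponds to $\lambda = 5$.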

For mean-variance optimization, we assume the initial allocation is an equal-weighted portfolio. The reason is two-fold. First, it provides a better comparison when all tested strategies start with the same given allocation. Second, and more importantly, starting with an equal-weighted portfolio generates a more stable portfolio. In particular, imagine a starting allocation with zero invested in all categories. With the constraint that all weights add up to one, the first-period transaction penalty will always be $\gamma_{trade} \times 1$, and the optimal portfolio largely depends on the return and covariance estimates of the first day. On the other hand, starting with equal weights offers stability in allocation decisions and provides consistent performance over various periods.

The same ranges for hyperparameters $\gamma_{trade}$ and $H$ are considered for risk parity portfolios, and the results appear in Figure 4. When the transaction penalty is small, it is negligible compared to the risk parity term, and therefore the optimal solution satisfies the nominal risk parity condition. A small transaction penalty fails to provide a meaningful control of turnover. For the multi-period risk-parity model (MPO RP), we choose

We observe that over the training period the optimal multi-period mean-variance and risk parity portfolios perform significantly better than their traditional single-period counterparts. The transaction penalty in the MPC portfolios explains the performance difference from the nominal portfolios. On the other hand, we observe a slight performance gain of the multi-period portfolio over the single-period model, in Sharpe ratio and turnover rate. We believe that this is caused by the model parameter estimates from the HMM. The return estimate is a weighted average of the mean returns of the normal and contraction regimes over the past 2000 days. When multiple asset categories are included in the model, the difference in returns across regimes is not as pronounced as the discrepancy in covariance matrices. Therefore, even though the model provides robust hints on asset returns on a relative basis, without leveraging other micro-structure data, return estimates based on historical returns alone are less informative than desired.

Comparing the problem formulations and the resulting return metrics of these strategies, we find that MPO has the advantage of considering return estimates, whereas MPO RP brings a higher Sharpe ratio. Their relative strengths are consistent with those of their single-period counterparts as discussed in the risk parity literature (Maillard et al. (2010)).

Both the mean-variance and risk-parity objectives with a transaction penalty successfully reduce the turnover rates to below that of the fix-mix benchmark. Since we have chosen a relatively conservative set of hyperparameters for the mean-variance formulation, the reduction in turnover rates is significant in the MPO results. In our mean-variance formulation, the transaction penalty is set to $\gamma_{trade} = 0.01$ to avoid unnecessary trades triggered by false signals, which discourages any trade suggested by a single-period forecast and leads to an essentially zero turnover rate in SPO. Yet, when multi-period forecasting is considered, the trading signal is stronger than that of the single-period counterpart, because there is profit in gradually switching to the desired allocation. On the other hand, a multi-period risk-parity formulation provides better transaction control than the single-period risk-parity formulation by predicting the future covariance matrices.

Risk parity strategies are often employed with leverage to achieve a desired return level. However, leverage introduces new risks to the portfolio. Here, we consider an unlevered portfolio strategy. Both strategies provide meaningful control of volatility and turnover rates, and beat the fix-mix strategy's Sharpe ratio of 0.57. MPO reduces the turnover rate by 95% and offers a Sharpe ratio of 0.64, while MPO RP reduces the turnover rate by 42% and leads to a Sharpe ratio of 0.97 over the out-of-sample period.
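For readers who want to reproduce metrics of this kind, the following is a minimal sketch of an annualized Sharpe ratio and an average turnover rate. The conventions are assumptions rather than the paper's stated definitions: 252 periods per year, sample standard deviation, and one-way turnover defined as half the sum of absolute weight changes. The inputs are hypothetical.

```python
import math
import statistics


def annualized_sharpe(returns, periods_per_year=252, risk_free=0.0):
    """Annualized Sharpe ratio from per-period returns (assumed convention)."""
    excess = [r - risk_free / periods_per_year for r in returns]
    mean = statistics.fmean(excess)
    vol = statistics.stdev(excess)  # sample standard deviation
    return mean / vol * math.sqrt(periods_per_year)


def average_turnover(weights):
    """Mean one-way turnover per rebalance: 0.5 * sum_i |w_t,i - w_{t-1},i|."""
    changes = [
        0.5 * sum(abs(a - b) for a, b in zip(w_now, w_prev))
        for w_prev, w_now in zip(weights, weights[1:])
    ]
    return statistics.fmean(changes)


# Hypothetical inputs, for illustration only.
daily_returns = [0.01, -0.01, 0.02, 0.0]
weight_path = [[0.6, 0.4], [0.5, 0.5], [0.5, 0.5]]
sharpe = annualized_sharpe(daily_returns)
turnover = average_turnover(weight_path)
print(sharpe, turnover)
```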

In addition, we find that the allocations resulting from the MPO and MPO RP strategies are different and have relative strengths and weaknesses under various market conditions, which is consistent with the nature of their objectives. We zoom in on a recent period and compare the allocations in detail in Section 5.3.2.

To justify the use of the HMM, we compare the performance of the same set of strategies with parameters estimated with and without the HMM. Model performance without HMM inputs is presented in Table 5. We replace the forecast of expected return and co- Therefore, SPO and MPO essentially mimic the performance of a buy-and-hold strategy whose starting point is the 1/n portfolio.

To understand the relative strengths of the proposed strategies, we zoom in on the recent period to compare allocation and performance. Figure 5 presents portfolio performance and asset allocations during the market downturn in March 2020 due to the COVID-19 pandemic. The first plot shows that bond indices have the best performance, and long-maturity government bonds have the highest cumulative return. Both MPO and MPO RP succeed in investing in the top performers of this period, and both allocate heavily to US Aggregate Treasury, which turns out to be the most stable winner.

Although both strategies select the asset classes that performed well in the COVID-19 crash, their allocations differ given their objectives and formulations. By taking return forecasts into consideration, MPO weights heavily the top three asset classes in terms of return, and is also able to identify the second tier, in which it invests lightly. Nothing is allocated to the four asset classes that performed worst. On the other hand, MPO RP picks the asset classes that are stable while providing relatively high returns. In particular, MPO RP invests less than MPO in the best performer, US Long Treasury, due to the highly volatile nature of this asset class (Table 3). The losers crashed heavily, leading to high short-term volatility, which keeps the risk-parity strategy from investing heavily in them.
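The link between volatility and risk-parity weights can be made concrete with a small sketch. It computes each asset's risk contribution, w_i (Σw)_i / sqrt(wᵀΣw), and uses the known special case that, for uncorrelated assets, weights proportional to inverse volatility equalize these contributions. The three volatilities are hypothetical and are not the values in Table 3; the zero-correlation assumption is a simplification.

```python
import math


def risk_contributions(weights, cov):
    """Risk contribution of each asset: w_i * (cov @ w)_i / portfolio volatility."""
    n = len(weights)
    marginal = [sum(cov[i][j] * weights[j] for j in range(n)) for i in range(n)]
    port_vol = math.sqrt(sum(weights[i] * marginal[i] for i in range(n)))
    return [weights[i] * marginal[i] / port_vol for i in range(n)]


def inverse_vol_weights(vols):
    """Long-only weights proportional to 1/volatility, normalized to sum to 1."""
    inv = [1.0 / v for v in vols]
    total = sum(inv)
    return [x / total for x in inv]


# Hypothetical annualized volatilities: stable, medium, highly volatile asset.
vols = [0.05, 0.10, 0.20]
# Diagonal covariance matrix (assumes zero correlation, for illustration).
cov = [[vols[i] ** 2 if i == j else 0.0 for j in range(3)] for i in range(3)]
w = inverse_vol_weights(vols)
rc = risk_contributions(w, cov)
print(w, rc)
```

The most volatile asset receives the smallest weight while all three risk contributions coincide, which mirrors the behaviour described above: a spike in an asset's volatility pushes its risk-parity weight down.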

Comparing the overall performance of MPO and MPO RP during this period, we find that both outperform the fix-mix strategy by identifying the winners. The MPO strategy outperforms MPO RP up to mid-to-late March, when the market downturn happened, thanks to its large investment in US Long Treasury, an asset class with high return and mid-to-high volatility during contraction periods.

Over this sample period, both MPO and MPO RP allocate capital consistently with their objective functions. The mean-variance optimization finds a balance between chasing returns and controlling risk, whereas the risk-parity objective prefers the stably growing asset classes.

Table 6: Annualized portfolio metrics over the period from 2020-01-01 to 2020-11-30

One natural question in evaluating the performance of an allocation strategy is how sensitive it is to transaction costs. An underestimation of transaction costs may lead to a strategy that seems profitable but fails to generate returns in real-world applications. To test the robustness of our proposed frameworks, we implement the same allocation strategy with various transaction costs, ranging from 1 bp to 150 bps (Figure 6). The risk-parity formulation (MPO RP) consistently provides a higher Sharpe ratio than the mean-variance formulation (MPO) and the fix-mix benchmark. On the other hand, MPO is hurt less by unexpectedly high transaction costs due to its low turnover rate.
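The reason low turnover protects against cost misspecification can be sketched with simple arithmetic. The example below assumes a proportional cost model, in which the annual return lost to trading equals turnover times cost per unit traded; this model and the baseline turnover of 100% per year are assumptions for illustration. The 95% and 42% turnover reductions are the out-of-sample figures quoted earlier in the text.

```python
def cost_drag(annual_turnover, cost_bps):
    """Annual return lost to trading under an assumed proportional cost model."""
    return annual_turnover * cost_bps / 10_000.0


BASE_TURNOVER = 1.0  # hypothetical fix-mix annual turnover (100%), not from the paper
turnover_by_strategy = {
    "fix-mix": BASE_TURNOVER,
    "MPO": BASE_TURNOVER * (1 - 0.95),     # 95% turnover reduction (from the text)
    "MPO RP": BASE_TURNOVER * (1 - 0.42),  # 42% turnover reduction (from the text)
}

# Sweep a few cost levels within the 1 bp to 150 bps range considered.
for bps in (1, 50, 150):
    drags = {name: cost_drag(t, bps) for name, t in turnover_by_strategy.items()}
    print(bps, {name: round(d, 5) for name, d in drags.items()})
```

Under these assumptions the drag grows linearly in the cost level, so the strategy with the lowest turnover loses the least return when costs turn out higher than expected.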

For investors facing a higher range of transaction costs, we suggest re-tuning the hyperparameters of both MPO and MPO RP to suit those costs. The performance of both frameworks is robust when the estimation error of transaction costs is within a reasonable range.

In this paper, we propose a risk-parity portfolio that can be effectively solved by model predictive control, and provide a successive convex program algorithm that yields faster and more robust solutions. The risk-parity MPC generates a higher Sharpe ratio than the MPC with a mean-variance objective, although each formulation has its own strengths. In the out-of-sample period, MPO and MPO RP achieve Sharpe ratios of 0.64 and 0.97 after transaction costs, respectively, outperforming the fix-mix benchmark's 0.57. Zooming in on the recent market downturn due to COVID-19 in 2020, we observe the different strengths of the mean-variance and risk-parity objectives. In addition, the proposed strategies are robust even when transaction costs are misspecified, owing to their successful control of unnecessary transactions.

There are several next steps that we want to point out for future research. In this paper, we choose the hidden Markov model to describe asset dynamics, but other approaches, such as factor models, can be employed for asset parameter estimation. A good predictive model for asset parameters will improve strategy performance, especially for the multi-period mean-variance portfolio. Leverage and other budget constraints can be introduced to enhance the performance of the mean-variance strategy. However, budget constraints will create computational difficulty in the risk parity portfolio. In addition, we believe that a good predictive signal of cross-asset momentum will provide substantial gains in ETF switching strategies.

Dynamic asset allocation for varied financial markets under regime switching framework

A Maximization Technique Occurring in the Statistical Analysis of Probabilistic Functions of Markov Chains

Multi-period trading via convex optimization

Performance bounds and suboptimal policies for multi-period investment

Managing Risk Exposures Using the Risk Budgeting Approach

Risk parity portfolio vs. other asset allocation heuristic portfolios

CVXPY: A Python-embedded modeling language for convex optimization

Clustering financial time series: New insights from an extended hidden Markov model

Scrip: Successive convex optimization methods for risk parity portfolio design

Dynamic trading with predictable returns and transaction costs

Worst-case robust decisions for multi-period mean-variance portfolio optimization

60 years of portfolio optimization: Practical challenges and current trends

A software package for sequential quadratic programming

The properties of equally weighted risk contribution portfolios

Portfolio selection

Lifetime portfolio selection under uncertainty: The continuous-time case

Multi-period portfolio selection with drawdown control

Multi-period portfolio optimization with investor views under regime switching

Horses for courses: Mean-variance for asset allocation and 1/n for stock selection

Dynamic allocations for currency futures under switching regimes signals

OSQP: An operator splitting solver for quadratic programs

A dynamic stochastic programming model for international portfolio management

A machine learning approach in regime-switching risk parity portfolios

The surprising robustness of dynamic mean-variance portfolio optimization to model misspecification errors