Asset allocation models based on moving averages are usually sold on the basis of historical outperformance of the S&P 500 total return at reduced risk. However, the longer-term backtests shown are often based on non-tradable indexes, such as the S&P 500, the MSCI EAFE and NAREIT, and on assets that are difficult for the retail crowd to trade, such as fixed income, commodities and gold. Before discussing some of the issues related to these backtests, I would like to emphasize that I am not disputing the existence of the momentum premium or the benefits of asset allocation. What I am disputing in this article is the evidence provided to convince the retail crowd that these can be exploited easily. I list four reasons for this below:
- Before 1993 (SPY inception) it was difficult for a retail investor to track the S&P 500 index. An index tracking portfolio was required to minimize transaction cost and that was an art and science known only to investment banks.
- Products for tracking developed stock markets, bonds, gold and commodities appeared after 2000. Before that it was difficult for the retail crowd to effectively allocate to these assets without using derivatives or other securities or funds.
- Some have argued that transaction cost is not important due to the infrequent rebalancing of allocation schemes based on monthly data but, in reality, there was continuous rebalancing of the underlying indexes. For example, any backtest on the S&P 500 index before SPY was available implicitly assumes rebalancing of index tracking portfolios. Note that although the math of index tracking was exciting, this approach lost its appeal in the 1990s due to high transaction cost and tracking error problems.
- More importantly, most asset allocation and momentum systems presented in the literature are data-mined and conditioned on price series properties that may not be present in the future. Showing robustness to moving average variations is not enough to prove that such methods are not artifacts of data-mining bias.
In this article I will concentrate only on the fourth reason. First I will show through a randomization study that a moving average model lacks intelligence, and then I will explain why such models are based on wishful thinking.
Moving average crossover models are not intelligent
One way to show that a trading model is not intelligent is by demonstrating that it underperforms a sufficiently large percentage of random models that have similar properties. For the purpose of this study we will consider adjusted SPY (NYSEARCA:SPY) monthly data, which reflect the total return of the S&P 500, in the period 01/1994 to 07/2015. The "dumb model" is a 3-10 moving average crossover system, i.e., a system that fully invests in SPY when the 3-month moving average crosses above the 10-month moving average and exits the position when the opposite cross occurs. This is a popular moving average crossover used in some widely publicized asset allocation methods. This system has generated 8 long trades in SPY since 01/1994 and has outperformed buy and hold by about 110 basis points per year at a much lower maximum drawdown. The rules of the system are as follows:
Buy at the next open if monthly MA(3) > monthly MA(10)
Exit at the next open if monthly MA(3) < monthly MA(10)
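For readers who want to experiment with these rules, here is a minimal Python sketch. It is not the code behind the backtest in this article. The function names `sma` and `crossover_positions` are made up for illustration, and the sketch assumes a plain list of monthly closes, with the signal computed on one month's close and the position taken in the following month as a stand-in for "the next open".

```python
from typing import List, Optional


def sma(values: List[float], window: int) -> List[Optional[float]]:
    """Simple moving average; None until `window` observations exist."""
    out: List[Optional[float]] = []
    running = 0.0
    for i, v in enumerate(values):
        running += v
        if i >= window:
            running -= values[i - window]  # drop the value leaving the window
        out.append(running / window if i >= window - 1 else None)
    return out


def crossover_positions(closes: List[float], fast: int = 3, slow: int = 10) -> List[int]:
    """Return 1 (long) or 0 (flat) for every month.

    Assumption: the signal seen at month t is acted on in month t + 1.
    """
    fast_ma, slow_ma = sma(closes, fast), sma(closes, slow)
    positions = [0] * len(closes)
    for t in range(len(closes) - 1):
        f, s = fast_ma[t], slow_ma[t]
        if f is None or s is None:
            continue  # not enough history yet, stay flat
        if f > s:
            positions[t + 1] = 1             # long while the fast MA is above
        elif f < s:
            positions[t + 1] = 0             # flat while the fast MA is below
        else:
            positions[t + 1] = positions[t]  # equal averages: keep the state
    return positions
```

On a steadily rising series the sketch stays flat until the 10-month average is available and is long from the next month on; on a steadily falling series it never enters.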
The equity curve of this system is shown below:
Below are some key performance statistics of this system:
Parameter | SPY System | SPY B&H |
CAR | 10.42% | 9.31% |
Max. DD | -15.28% | -50.80% |
Sharpe | 0.57 | |
Win rate | 87.50% | |
Profit factor | 266 | |
Payoff ratio | 38 | |
Trades | 8 | |
Commission | $0.02/share | $0.02/share |
It may be seen that the timing model generated about 110 basis points of annual excess return compared to buy and hold, and did so at a much lower maximum drawdown.
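The two headline statistics in the table can be computed from an equity curve as sketched below. I am assuming the usual definitions here: CAR as the compound annual rate of return and maximum drawdown as the largest peak-to-valley decline. The backtesting software may differ in details, and the function names are only illustrative.

```python
from typing import Sequence


def car(start_value: float, end_value: float, years: float) -> float:
    """Compound annual return implied by growing start_value to end_value."""
    return (end_value / start_value) ** (1.0 / years) - 1.0


def max_drawdown(equity: Sequence[float]) -> float:
    """Largest peak-to-valley decline of an equity curve, as a negative fraction."""
    peak = equity[0]
    worst = 0.0
    for value in equity:
        peak = max(peak, value)                 # highest equity seen so far
        worst = min(worst, value / peak - 1.0)  # deepest decline from that peak
    return worst
```

The excess return quoted above is simply the difference of the two CAR figures in the table: 10.42% minus 9.31% is 1.11 percentage points, or about 110 basis points per year.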
I just want to emphasize at this point that the job of every serious trading system developer is not to try to find support for the result of a backtest but instead to try to discredit it. Unfortunately, exactly the opposite is the case in most publications. For example, varying the moving averages and claiming that the system is robust because it remains profitable is not enough. We will consider an example in the second part of this article, but first we will test this system for intelligence.
One way of testing whether a system possesses intelligence is through a suitable randomization of performance. For this particular moving average system, we will randomize performance by generating, for each entry point, a random moving average crossover with a fast period that ranges from 1 to 8 and a slow period that ranges from 2 to 20. We will consider only those systems where the slow period is longer than the fast period. In addition, we will randomize the entry point by tossing a coin and requiring that, in addition to the crossover condition, heads show up. On top of that, the exit will be set to a number of bars randomly sampled between 5 and 55. Note that the average number of months in a position for the original system was 25.
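The randomization described above can be sketched roughly as follows. This is my reading of the procedure and not the exact code used for the study. It makes several assumptions: the entry condition is taken to be the state "fast MA above slow MA" on the signal month, invalid pairs are simply redrawn, prices are monthly closes, any open trade is closed on the last bar, and commissions are ignored. All names are hypothetical.

```python
import random
from typing import List


def sma_at(closes: List[float], t: int, window: int) -> float:
    """Simple moving average of the `window` closes ending at index t."""
    return sum(closes[t - window + 1 : t + 1]) / window


def random_system_growth(closes: List[float], rng: random.Random) -> float:
    """Growth multiple (final equity / initial equity) of one random system."""
    n = len(closes)
    equity = 1.0
    t = 19  # first index with 20 months of history for the slowest average
    while t < n - 1:
        # Draw a fresh (fast, slow) pair for this candidate entry point and
        # reject pairs where the slow period is not longer than the fast one.
        while True:
            fast, slow = rng.randint(1, 8), rng.randint(2, 20)
            if slow > fast:
                break
        heads = rng.random() < 0.5  # coin toss
        if heads and sma_at(closes, t, fast) > sma_at(closes, t, slow):
            entry = t + 1                     # act on the bar after the signal
            hold = rng.randint(5, 55)         # random holding period in bars
            exit_ = min(entry + hold, n - 1)  # close any open trade at the end
            equity *= closes[exit_] / closes[entry]
            t = exit_
        else:
            t += 1
    return equity


def random_cars(closes: List[float], runs: int, seed: int = 0) -> List[float]:
    """CAR of `runs` independent random systems on monthly data."""
    rng = random.Random(seed)
    years = (len(closes) - 1) / 12.0
    return [random_system_growth(closes, rng) ** (1.0 / years) - 1.0
            for _ in range(runs)]
```

Calling `random_cars(closes, 20000)` on a list of monthly closes gives a sample of CAR values whose cumulative distribution can then be compared with the CAR of the original system.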
The random run is repeated 20,000 times and the CAR of each run is calculated. Then the cumulative frequency distribution of CAR is plotted as shown below:
The CAR of 10.42% of the original 3-10 crossover system results in a p-value of 0.117. This p-value is not low enough to reject the null hypothesis that the system is not intelligent. In fact, the system generated a lower return than about 12% of the random systems, as shown by the vertical red line on the above chart.
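In a randomization test of this kind, the p-value is just the share of random systems that did at least as well as the system under test. A minimal sketch is shown below. The function name is hypothetical, and whether ties count as "at least as well" is my assumption, since it makes no practical difference with 20,000 continuous CAR values.

```python
from typing import Sequence


def randomization_p_value(observed: float, random_results: Sequence[float]) -> float:
    """Share of random systems that did at least as well as the observed one."""
    if not random_results:
        raise ValueError("need at least one random result")
    at_least_as_good = sum(1 for r in random_results if r >= observed)
    return at_least_as_good / len(random_results)
```

With 20,000 random runs, a p-value of 0.117 corresponds to roughly 2,340 random systems with a CAR of 10.42% or higher.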
Note that well curve-fitted systems always result in a low p-value, and that makes this method not very robust in general. However, in this case the method provided an initial indication that the 3-10 moving average crossover system in SPY lacks intelligence. Again, this is because about 12% of the random systems performed better than the original system. However, there is another, more practical way of showing that this system is data-mined and dumb, and that its performance is based on wishful thinking.
Future performance is based on wishful thinking
The future performance of these models is based on wishful thinking because they assume that the future will remain similar to the past. In the case of the SPY system, the model assumes that uptrends and downtrends will be smooth enough and come in V-shapes with no protracted periods of sideways price action. We do not know if this will be the case in the U.S. stock market in the future, but relying on such assumptions is wishful thinking. One can get a taste of what may happen to an account that invests with such a model from a backtest on EEM data from 01/2010 to 07/2015, a period of 5 1/2 years during which the emerging markets ETF moved, for all practical purposes, sideways. Below is the backtested equity curve:
Below are some performance details:
Parameter | EEM system | EEM B&H |
CAR | -7.64% | +0.21% |
Max. DD | -35.28% | -29.12% |
Return | -35.22% | +1.14% |
It may be seen that the 3-10 moving average crossover system based on monthly data performed exceptionally badly during the sideways market period, losing 35.22% as opposed to a gain of 1.14% for buy and hold.
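As a rough consistency check on the table, the total return should be close to the CAR compounded over the test period. The sketch below assumes the period is exactly 5.5 years, which is only approximate because the exact start and end dates of the backtest are not given. The function name is illustrative.

```python
def total_return_from_car(car: float, years: float) -> float:
    """Total return implied by compounding `car` for `years` years."""
    return (1.0 + car) ** years - 1.0


YEARS = 5.5  # assumption: 01/2010 to 07/2015 treated as exactly 5.5 years

system_return = total_return_from_car(-0.0764, YEARS)   # EEM system CAR
buy_hold_return = total_return_from_car(0.0021, YEARS)  # EEM buy and hold CAR
```

Under this assumption both implied returns land within a fraction of a percentage point of the reported -35.22% and +1.14%, so the CAR and return rows of the table agree with each other.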
Can the U.S. stock market move sideways for an extended period of time? I cannot answer this question. My point here is that moving average crossover systems on monthly data, the type used in some asset allocation models, assume V-shaped reversals from downtrends to uptrends with no protracted choppy action in between. Therefore, the future performance of such systems is based on wishful thinking. These systems are dumb and risky.
Conclusion
Ninety-nine percent of systems in the trading literature are data-mined. There is nothing wrong with that in principle, except that 99.999% or more of data-mined systems are curve-fitted to market conditions. It is an art and a science to distinguish those that are not from the many that are, and in fact this is the trading edge; it is not the system. Nowadays, a computer can generate hundreds of systems per minute. Proving that systems are intelligent is the true edge, not their generation. This will remain an art and science that no mechanical process will ever be able to accomplish for all cases.
Asset allocation methods based on moving averages suffer from the lag inherent in price series smoothing operators and do not perform well in fast and sideways markets. It is highly likely that the allocation models presented in the literature that are based on moving averages were data-mined to optimize CAR and minimize drawdown. If the U.S. stock market enters a protracted period of sideways action, these models will generate significant losses.