The Wall Street Journal ran an article on May 2, 2009 called “Odds-On Imperfection: Monte Carlo Simulation.” The sub-title is “Financial Planning Tool Fails to Gauge Extreme Events.” The main point of the article is that Monte Carlo Simulations did not predict the potential for a market meltdown on the scale of what we experienced in 2008. This article reinforces some common misconceptions about Monte Carlo planning tools and probabilistic models in general. As the author of a Monte Carlo planning package, I got quite a few questions about this article.

The main premise of the WSJ piece is that “there is little chance your Monte Carlo simulation, named for the gambling mecca, would have highlighted a scenario like the market slide just seen. Though these tools typically run a portfolio through hundreds or thousands of potential market scenarios, they often assign minuscule odds to extreme market events.” The author then frames this point as a general critique of the use of probabilistic portfolio management tools like Monte Carlo Simulation.

The author is correct that available Monte Carlo models and other risk models assigned very low probability to losses on the scale of what we observed in 2008. I have no problem with this assertion. It is where the argument goes from there that is flawed. It remains unclear what odds should have been assigned to an event like 2008. If we have experienced two events on the scale of 2008 in the last hundred years (2008 and 1929), this is a very small sample. There is no way to really ‘validate’ a model’s assigned probabilities for events at this kind of extreme. After a truly extreme event, we should expect that Monte Carlo models had projected the event as possible, but it is not realistic to believe that models can truly be ‘validated’ on the basis of events that happen once in fifty years or so. Perhaps we should always apply a ‘safety factor’ to estimates of 1-in-50 or 1-in-100 year events (like the safety factors used in structural engineering, for example). There are ongoing efforts to improve the way that extreme events are modeled, including by my firm. This is an important area of research, but focusing just on this issue is a mistake.
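To put rough numbers on the small-sample problem, here is a minimal sketch. It assumes that each year is independent and has the same chance of a 2008-scale loss, which is a simplification of my own for illustration, and the candidate frequencies are hypothetical. It asks how likely it would be to see two or more such events in a century under each candidate.

```python
from math import comb

def prob_at_least(k, n, p):
    """Probability of k or more events in n independent years, each with chance p."""
    return 1.0 - sum(comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(k))

# Hypothetical candidate frequencies for a 2008-scale loss (illustrative only).
for one_in in (200, 100, 50, 20):
    chance = prob_at_least(2, 100, 1.0 / one_in)
    print(f"true odds 1-in-{one_in}: P(2+ events in 100 years) = {chance:.1%}")
```

Under these assumptions, a tenfold range of candidate frequencies all remain reasonably compatible with two events in a hundred years, which is why the historical record alone cannot tell us which tail probability a model "should" have assigned.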

The main purpose of Monte Carlo Models for portfolio management and planning is to show how market volatility can impact an investor’s long-term plans. There are a range of risks that can be understood only with a Monte Carlo simulation or other probabilistic tools. These include sequence-of-returns risk and longevity risk. These models provide insight into how investors can build better long-term plans, accounting for these risks. If I were to read the WSJ article without knowing a great deal about these models, I might walk away thinking that the models were essentially useless and perhaps worse than using nothing at all. The author cites well-known experts in making her argument about the limitations in Monte Carlo models, but she never notes that these same people are strong advocates for the use of Monte Carlo Simulation (even with its limitations).
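Sequence-of-returns risk is easy to show with a toy example. The sketch below uses hypothetical returns, a hypothetical starting balance, and a hypothetical annual withdrawal. It applies the same five annual returns in two different orders to a portfolio that funds a fixed withdrawal at the start of each year.

```python
def final_balance(returns, start=1_000_000, spend=50_000):
    """Withdraw `spend` at the start of each year, then apply that year's return."""
    balance = start
    for r in returns:
        balance = (balance - spend) * (1.0 + r)
    return balance

# Hypothetical annual returns: identical values, opposite order.
good_years_first = [0.20, 0.10, 0.05, -0.10, -0.20]
bad_years_first = list(reversed(good_years_first))

print(f"good years first: {final_balance(good_years_first):,.0f}")
print(f"bad years first:  {final_balance(bad_years_first):,.0f}")
print(f"no withdrawals:   {final_balance(good_years_first, spend=0):,.0f}")
```

Without withdrawals the order of returns does not change the ending balance. With withdrawals, the investor who meets the bad years first ends up with less, and an average-return projection cannot show this. A simulation over many possible orderings can.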

William Bernstein, one of the experts, was one of the early advocates for the use of these tools in retirement planning. Moshe Milevsky, also cited as a critic of Monte Carlo, devotes an entire chapter of a recent book to the use of Monte Carlo Simulations and advises investors to “ask your financial or investment advisor to generate a Monte Carlo illustration of your financial future.”

Another problem with this article is that it focuses exclusively on the issue of ‘fat tails’ and ignores other factors that drive the outputs of these models. The ‘fat tail’ problem refers to the fact that equity market returns generate extreme events (both good and bad) with higher probability than a model that assumes Gaussian returns predicts. The shorter the time horizon, the more this problem is evident. This often occurs because of momentum effects—momentum creates fatter tails in returns.
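As a rough illustration of what ‘fat tails’ means in practice, the sketch below compares how often a four-standard-deviation loss occurs under a Gaussian model and under a fat-tailed alternative with the same variance. The Student-t distribution with four degrees of freedom is my own assumption, chosen purely for illustration. It is not a claim about how any particular planning tool models returns.

```python
import random
from statistics import NormalDist

def fat_tailed_draw(rng, df=4):
    """One Student-t draw rescaled to unit variance (requires df > 2)."""
    z = rng.gauss(0.0, 1.0)
    chi2 = rng.gammavariate(df / 2.0, 2.0)  # chi-square with df degrees of freedom
    t = z / (chi2 / df) ** 0.5
    return t * ((df - 2) / df) ** 0.5

def tail_frequency(threshold=-4.0, n=200_000, df=4, seed=1):
    """Simulated share of standardized returns falling below `threshold`."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(n) if fat_tailed_draw(rng, df) < threshold)
    return hits / n

gaussian_tail = NormalDist().cdf(-4.0)  # exact Gaussian probability
fat_tail = tail_frequency()             # simulated fat-tailed frequency
print(f"Gaussian:   {gaussian_tail:.6f}")
print(f"Fat-tailed: {fat_tail:.6f}")
```

Both distributions have the same standard deviation, yet the fat-tailed one produces the extreme loss far more often. That gap is the quantity at issue in the ‘fat tail’ critique.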

It is true that fat tails are an issue of concern in Monte Carlo models, but a range of other issues deserve at least as much concern when looking at these models. Monte Carlo models must generate not only the probability distribution of returns (where one may look for fat tails), but also (1) the standard deviations of all assets, (2) correlations between assets, and (3) expected returns for all assets. These three factors will have at least as great an impact on long-term planning as the method used to generate the probabilities of extreme events. Taken together, these variables determine the construction of an “optimal” portfolio.
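To show how strongly these three inputs drive the output, here is a minimal two-asset sketch. Every number in it (expected returns, standard deviations, correlation, weights) is hypothetical, and the Gaussian annual returns are a simplifying assumption of mine rather than a description of how any named product works.

```python
import random
from statistics import median

def simulate_terminal_wealth(mu, sigma, rho, weights, years=30, n_paths=2000, seed=7):
    """Terminal value of $1 in a two-asset portfolio rebalanced every year."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_paths):
        wealth = 1.0
        for _ in range(years):
            z1 = rng.gauss(0.0, 1.0)
            # Second shock built to have correlation rho with the first.
            z2 = rho * z1 + (1.0 - rho**2) ** 0.5 * rng.gauss(0.0, 1.0)
            r1 = mu[0] + sigma[0] * z1
            r2 = mu[1] + sigma[1] * z2
            # Floor at zero so a Gaussian draw cannot produce negative wealth.
            wealth *= max(0.0, 1.0 + weights[0] * r1 + weights[1] * r2)
        results.append(wealth)
    return results

# Two hypothetical views of the second (riskier) asset's expected return.
base = simulate_terminal_wealth((0.08, 0.08), (0.16, 0.25), 0.7, (0.7, 0.3))
high = simulate_terminal_wealth((0.08, 0.11), (0.16, 0.25), 0.7, (0.7, 0.3))
print(f"median terminal wealth, lower expected return:  {median(base):.2f}")
print(f"median terminal wealth, higher expected return: {median(high):.2f}")
```

Both runs use the same random shocks, so the difference in outcomes comes entirely from the expected-return assumption. Changing the standard deviations or the correlation in the same way gives a feel for how much each input matters relative to the shape of the tails.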

The fat tails issue is only one part of what makes a good model. Any investor or advisor considering the use of Monte Carlo models would do well to examine the three factors listed above before worrying about how the model captures ‘fat tails.’ Financial Engines (a well-known provider of Monte Carlo simulations) projects that emerging markets have lower long-term expected returns than domestic equities, whereas Quantext Portfolio Planner projects that emerging markets have higher expected returns and higher future risks than domestic equities. Financial Engines projects that small-cap stocks will not outperform large-cap stocks to any meaningful degree over the long term, whereas Quantext Portfolio Planner consistently shows a size effect in which riskier small-cap stocks outperform. These sorts of differences will typically have a substantial impact on portfolio planning.

The irony of the WSJ article in suggesting that Monte Carlo tools have failed investors in this time of crisis is that these tools tend to (1) discourage exceedingly risky portfolios, (2) encourage higher savings rates, and (3) encourage the creation of actively diversified portfolios (as opposed to the ‘passive diversification’ benefits from simply buying some of everything). The first of these is especially important in tempering investors’ tendencies to chase performance. One of the general features of Monte Carlo simulations is to suggest that it is impossible to obtain more than some threshold level of return for a given level of risk over extended periods of time. On that basis, simply betting on asset classes that have generated very high returns over recent years is a mistake. Quantext Portfolio Planner suggested that most major asset classes (and especially real estate and emerging markets) were overvalued in 2007, for example.

While the WSJ article implies that Monte Carlo Simulation is not useful in its current form and even cites recent research by retirement planning expert Moshe Milevsky in apparent support of this argument, that is not the conclusion that Milevsky reaches in his research. To the contrary, Milevsky’s article, cited by the author of the WSJ article, could be seen as a basic refutation of the WSJ article:

“Here is the bottom line. Instead of condemning the entire Monte Carlo Simulation industry for missing the meltdown, let’s take this unique opportunity to properly harness the full power of stochastic methods.”

Milevsky proposes that investors and advisors should stress test Monte Carlo models by looking at how much a bad sequence of years (he chooses 3 years for his examples) would impact an investor’s long-term plans. Milevsky proposes a simple metric to capture this effect from Monte Carlo output that he calls the Sequence of Returns Downside Exposure [SORDEX]. The measure of the “extreme” event that he proposes is the projected 1% worst event from the Monte Carlo model. Alternatively, one might simply choose to use a specific period as the stress test (2007-2008 perhaps). Milevsky’s point is that extreme events can have a major impact on long-term plans and that investors need to be aware of the fact that 1-in-100 events happen and to be prepared to survive such a scenario. I have been doing tests of Milevsky’s stress test metric inside Quantext Portfolio Planner and I am quite impressed at the utility of this approach. Any good Monte Carlo model should support this form of test.
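The general idea of this kind of stress test can be sketched in a few lines. To be clear, the sketch below is not Milevsky’s SORDEX formula, which I do not reproduce here. It is a simplified stand-in that forces a hypothetical run of three bad years at the start of retirement and compares the plan’s simulated success rate with and without that shock. The return assumptions, spending level, and shock sizes are all made up for illustration.

```python
import random

def success_rate(shock_returns=(), years=30, start=1_000_000, spend=45_000,
                 mu=0.06, sigma=0.12, n_paths=5000, seed=11):
    """Share of simulated paths in which the portfolio funds every year's spending."""
    rng = random.Random(seed)
    successes = 0
    for _ in range(n_paths):
        balance = start
        for year in range(years):
            if year < len(shock_returns):
                r = shock_returns[year]   # forced stress-test return
            else:
                r = rng.gauss(mu, sigma)  # ordinary simulated return
            balance = (balance - spend) * (1.0 + r)
            if balance <= 0:
                break                     # plan failed on this path
        else:
            successes += 1                # loop finished without running out
    return successes / n_paths

baseline = success_rate()
stressed = success_rate(shock_returns=(-0.15, -0.15, -0.15))  # hypothetical bad start
print(f"success rate, no forced shock:     {baseline:.1%}")
print(f"success rate, three bad years up front: {stressed:.1%}")
```

The gap between the two success rates is a crude measure of how exposed a plan is to a bad opening sequence. In practice one could substitute the model’s own 1% worst three-year outcome, or the returns of a specific historical period, for the hypothetical shock used here.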

In summary, I believe that the WSJ article does a disservice to investors and advisors in implying that Monte Carlo simulation and other probabilistic models are fatally flawed because they “did not highlight” a scenario like what happened in 2008. There are many ways in which portfolio analytics for long-term planning can be improved, including the addition of ‘fat tailed’ outcomes, and these models will evolve and improve over time. Whether with additional analytics to focus on extreme events or not, investors should stress test their portfolios using a range of conditions, and I think that some form of Milevsky’s SORDEX test is an excellent approach. 2008 notwithstanding, Monte Carlo simulation and related planning tools have helped their users to estimate the interplay of diversification, longevity risk and market risk in the process of building better portfolios and financial plans than they could have otherwise.