Can You Trust Monte Carlo Models?

Includes: QQQ, SPY, XLU
by: Geoff Considine

I recently found an article about the use of Monte Carlo simulation for financial planning and it resonated with some comments that I have heard from people who are considering the use of this type of tool. The article, written by Moshe Milevsky and Anna Abaimova (of The Individual Finance and Insurance Decisions Center at Canada’s York University) appeared in July on the website of The Journal of Financial Planning. The article raises a number of common questions about the use of Monte Carlo tools for financial planning and serves as a nice focus for a discussion of the kinds of testing and validation that are required to make Monte Carlo programs useful for planning and managing a portfolio of real assets.

The title of this article, Will The True Monte Carlo Number Please Stand Up?, is a very good starting point for clearing up some points of confusion about the use of Monte Carlo models for planning. The focus of the article, as the title suggests, is that different Monte Carlo simulations can give different answers as to the survival rates of a generic portfolio for supporting a stream of retirement income. The authors compare six different Monte Carlo models in determining the sustainability of a theoretical retiree’s income.

In this study, the authors state that the retiree’s “entire nest egg was assumed to be invested and rebalanced in a portfolio of diversified equities which was projected to earn an arithmetic average 7 percent (after inflation) each year, which is equivalent to a geometric average of 5 percent with a standard deviation or volatility of 20 percent.” They then attempt to compare the six different Monte Carlo models. Some of the models project survival rates for specific time horizons (the probability of the portfolio sustaining a specific level of income for a specific period of time), while two of the models fold in mortality rates, which means that they calculate the odds of being able to sustain an income until death, where your age at death is uncertain. Needless to say, there are significant differences between these models, largely because the underlying assumptions are different. The authors discuss the large range of differences as though there is something problematic or odd here:
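To make the quoted assumptions concrete, here is a minimal sketch of a fixed-horizon survival calculation. It is not any of the six models in the M-A study and not the Quantext model. It assumes independent lognormal real returns each year and a constant withdrawal taken at the start of the year; the function names, starting balance, and withdrawal amounts are hypothetical and chosen only for illustration. The first helper uses the common approximation that the geometric mean is roughly the arithmetic mean minus half the variance, which reproduces the quoted 7 percent and 5 percent figures at 20 percent volatility.

```python
import math
import random


def geometric_from_arithmetic(arith_mean, vol):
    """Common approximation: geometric mean ~ arithmetic mean - vol^2 / 2."""
    return arith_mean - 0.5 * vol ** 2


def survival_rate(start_balance, annual_spend, years, arith_mean, vol,
                  n_paths=10000, seed=0):
    """Fraction of simulated paths that fund every withdrawal for `years`.

    Illustrative assumptions: i.i.d. lognormal real annual returns and a
    fixed real withdrawal taken at the start of each year.
    """
    rng = random.Random(seed)
    # Parameters of log(1 + r), chosen so that the simple annual return r
    # has mean `arith_mean` and standard deviation `vol`.
    sigma2 = math.log(1.0 + (vol / (1.0 + arith_mean)) ** 2)
    mu = math.log(1.0 + arith_mean) - 0.5 * sigma2
    sigma = math.sqrt(sigma2)

    survived = 0
    for _ in range(n_paths):
        balance = start_balance
        for _ in range(years):
            balance -= annual_spend
            if balance < 0:
                break  # this year's withdrawal could not be funded
            balance *= math.exp(rng.gauss(mu, sigma))
        else:
            survived += 1  # loop finished without running dry
    return survived / n_paths


if __name__ == "__main__":
    print(geometric_from_arithmetic(0.07, 0.20))
    # Hypothetical retiree: balance of 100, spending 4 per year for 30 years.
    print(survival_rate(100.0, 4.0, 30, 0.07, 0.20))
```

Changing only the return distribution, the timing of withdrawals, or the treatment of mortality in a sketch like this will move the survival rate, which is the kind of divergence between models discussed here.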

“Our first reaction was to blame the programmer or manufacturer of the [Monte Carlo models] for building a faulty product. Like all high-tech gadgets on the market, some are better than others, but that is a simplistic, knee-jerk reaction. The true reason for the divergence of results is more complicated and subtle…”

At this point, I beg to differ. There is nothing subtle about the fact that these models were built with very different underlying algorithms and assumptions. Among these models, one simply re-samples historical data while others make arbitrary assumptions about the future returns and volatilities of various asset classes. It should come as no surprise to anyone that these models yield different results. The authors further opine that all of this disagreement could be resolved if the financial planning industry were somehow to adopt standards that make the models generate output that is more similar across platforms.

We performed our own comparison study in 2005 by comparing our own Quantext Monte Carlo simulations to three other models under very controlled conditions, and accounting for differences in the underlying models to the extent that this is possible.

This comparison is similar to the analysis by Milevsky and Abaimova (hereafter referred to as M-A) except that we chose four models that were conceptually similar and that were first reviewed to get a sense that they were reasonably robust. We found that, with consideration given to differences between models, the results were remarkably consistent.

There are several key points that individual investors and their advisors would do well to keep in mind when reading the M-A article, or others like it. First, some of the available Monte Carlo models – such as the one on (one of those included in the M-A study) – are designed for purposes of illustration and certainly not to provide concrete numbers for planning. Second, anyone who uses software for financial planning must do some due diligence. Why would one believe that any model is providing reasonable results? We selected the four models in our analysis because our due diligence indicated a reasonable degree of internal consistency (to the extent that this could be determined) and because they appeared to be professionally executed.

Aside from the basic mathematical issues of Monte Carlo analytics, there is a much larger issue at hand—one that is not mentioned at all in the M-A article. A Monte Carlo tool that calculates survival rates based on either pure history or simply an a priori assumption that your portfolio will generate X% per year with Y% of standard deviation is basically useless—the hard part of the problem is coming up with X and Y for each asset and accounting for the impacts of the differences between real asset performance and hypothetical index performance. The authors pose the following question:

“What is the probability the total return from the S&P 500 index will be greater than 5 percent in 2006? There are a number of philosophical approaches to dealing with such a question. One is to literally “dump” the one-year historical returns of the S&P 500 index in a “hat,” sample from this hat a large number of times, and then count the frequency with which the sample average exceeded 5 percent. This approach is at the heart of Monte Carlo. But as many mathematicians know, another approach is to fit a curve (for example, the normal distribution) to long-term historical returns and then evaluate the curve—that is, compute the tail probability—at the 5 percent mark.”
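The two approaches in the quoted passage can be sketched side by side. This is an illustration under stated assumptions, not the M-A authors' code: the function names are hypothetical, the resampling version draws single years from the "hat" because the question concerns a single year, and the "history" in the usage section is synthetic data generated from the 7 percent mean and 20 percent volatility quoted earlier, not actual S&P 500 returns.

```python
import random
from statistics import NormalDist, mean, stdev


def prob_by_resampling(returns, threshold, n_draws=20000, seed=0):
    """Draw single years from the 'hat' and count how often they beat threshold."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(n_draws) if rng.choice(returns) > threshold)
    return hits / n_draws


def prob_by_normal_fit(returns, threshold):
    """Fit a normal curve to the returns and evaluate its upper tail."""
    fitted = NormalDist(mean(returns), stdev(returns))
    return 1.0 - fitted.cdf(threshold)


if __name__ == "__main__":
    # Placeholder history: synthetic draws, NOT real index returns.
    gen = random.Random(42)
    placeholder_history = [gen.gauss(0.07, 0.20) for _ in range(80)]
    print(prob_by_resampling(placeholder_history, 0.05))
    print(prob_by_normal_fit(placeholder_history, 0.05))
```

Both functions take the return history as given. Neither says anything about whether that history is a sensible guide to the future.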

We can agree that the ability to calculate the probability that the S&P500 will generate a return greater than 5% in any given year is going to be critical to the results from a Monte Carlo model. Here is where things get interesting. To come up with assumptions about the probability of the market generating some percentage of return in a given year requires some critical inputs—namely how to determine the future volatility and expected rate of return for the S&P500. The authors of this article start with an assumption about the average rate of return and its standard deviation (and focus on how the calculation is performed once you have these inputs), but the results that a Monte Carlo model generates will be sensitive to these input assumptions. What should these numbers be, and can we obtain estimates for them that are better than guesswork or using pure history?

The M-A article basically states that there should be some degree of consistency in how the mathematical operations are performed, and I agree. Our own comparison showed a reasonable consistency between four Monte Carlo models. This is only a starting point in determining the value of a Monte Carlo framework, however, and consistency between models in such controlled circumstances falls far short of making Monte Carlo models useful. The really germane issue is demonstrating that a model can generate reasonable inputs to run the Monte Carlo, say the probability that the S&P500 will generate a return of 5% or more in the next twelve months.

In professional portfolio simulation applications, there is a standard approach to testing the reasonableness of these inputs, and it is called ‘mark to market.’ In this approach, the volatility for a given instrument generated by the Monte Carlo model is compared to the implied volatility that is backed out from the price of the underlying and the prices at which options on that instrument are trading. For a detailed explanation and examples using our own Monte Carlo models, see this article (pdf).
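The comparison described above can be sketched in a few lines. This is a simplified illustration, not the procedure in the referenced paper: it assumes a European call on a non-dividend-paying underlying priced with the Black-Scholes formula, and it backs out the implied volatility by bisection. The function names and the 0.02 tolerance in the final check are hypothetical placeholders.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def bs_call_price(spot, strike, years, rate, vol):
    """Black-Scholes price of a European call (no dividends assumed)."""
    root_t = math.sqrt(years)
    d1 = (math.log(spot / strike) + (rate + 0.5 * vol ** 2) * years) / (vol * root_t)
    d2 = d1 - vol * root_t
    return (spot * _STD_NORMAL.cdf(d1)
            - strike * math.exp(-rate * years) * _STD_NORMAL.cdf(d2))


def implied_vol(call_price, spot, strike, years, rate,
                lo=1e-4, hi=5.0, tol=1e-8):
    """Back out volatility from an option price by bisection.

    Works because the call price increases with volatility.
    """
    if not (bs_call_price(spot, strike, years, rate, lo) <= call_price
            <= bs_call_price(spot, strike, years, rate, hi)):
        raise ValueError("price is outside the range reachable by [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if bs_call_price(spot, strike, years, rate, mid) < call_price:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def is_marked_to_market(model_vol, market_implied_vol, tolerance=0.02):
    """True if the model's volatility is within an (arbitrary) tolerance."""
    return abs(model_vol - market_implied_vol) <= tolerance
```

In use, the option price, spot price, strike, and time to expiry would come from market quotes, and `model_vol` would be the volatility that the Monte Carlo model projects for the same instrument over the same horizon.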

Quantext’s tools have been shown to generate Monte Carlo output that is consistent with the implied volatility of the market, which is a good test for determining that the projected future results from the Monte Carlo are plausible. I will note that while the concept of mark-to-market testing is uncommon in financial planning for individuals, it is standard in many areas of finance such as professional risk management applications and accounting standards for dealing with risky assets. One important hurdle for Monte Carlo tools is generating consistent mark-to-market results for high-Beta instruments (such as QQQQ) or low-Beta instruments (such as XLU), as in the paper above, along with the S&P500. The ability to generate consistent mark-to-market results for high-Beta and low-Beta instruments shows that the model is doing a reasonable job of forecasting volatility for assets with high levels of systematic risk and for those with high levels of non-systematic risk (respectively).

While it is very important to be able to simulate future volatility for the S&P500 (i.e. the market as a whole), the real challenge is simulating the statistical parameters for a portfolio made up of real assets. These assets have fees and may, in fact, behave somewhat differently than the underlying indices would imply. This is not a distinction that I would worry much about if an investor looks at the S&P500 as a proxy for SPY, for example, but it can become a major issue in real portfolios—see for example this article by John Bogle.

Monte Carlo models need to have the ability to generate meaningful input parameters for real assets, both individually and in a total portfolio. The interaction of real assets in the total portfolio is often a challenging problem. Asset allocation is all about how to get the greatest benefit from a mix of assets. Do we really expect that individuals or their advisors are going to use tools that require them to somehow perform all of these complex calculations to generate input to Monte Carlo models?

The practical value of Monte Carlo tools is essentially nil if an advisor or individual investor must generate a projection for the average return and standard deviation in return of a portfolio of real assets (stocks, mutual funds, and ETFs) and then provide these as inputs to the Monte Carlo model, along with the additional input statistics for capturing the correlations between these assets. The potential variation in the estimates for these inputs typically far exceeds the importance of the other input or modeling assumptions. To discuss the range of possible outputs from Monte Carlo models without addressing the core problem of where the input assumptions about future asset performance come from is, in my opinion, of limited value. For Monte Carlo models to have value for practical financial planning applications, they must be able to take a real portfolio of assets as input and generate reasonable projections of the future risk and return for the total portfolio.
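To give a sense of the input burden described here, the sketch below combines per-asset assumptions into a portfolio mean and standard deviation using the standard mean-variance formulas, and counts how many separate estimates a user would have to supply. The function names are hypothetical, and every number passed in would be the user's own guess, which is exactly the problem.

```python
import math


def portfolio_stats(weights, exp_returns, vols, corr):
    """Portfolio expected return and standard deviation.

    weights, exp_returns, vols: one entry per asset.
    corr: square correlation matrix as a list of lists.
    """
    n = len(weights)
    mean = sum(w * r for w, r in zip(weights, exp_returns))
    variance = 0.0
    for i in range(n):
        for j in range(n):
            variance += weights[i] * weights[j] * vols[i] * vols[j] * corr[i][j]
    return mean, math.sqrt(variance)


def n_inputs(n_assets):
    """Estimates needed: n returns, n volatilities, n(n-1)/2 correlations."""
    return 2 * n_assets + n_assets * (n_assets - 1) // 2


if __name__ == "__main__":
    # Two hypothetical assets, equally weighted, uncorrelated.
    print(portfolio_stats([0.5, 0.5], [0.06, 0.08], [0.20, 0.20],
                          [[1.0, 0.0], [0.0, 1.0]]))
    print(n_inputs(10))
```

Even a ten-asset portfolio needs dozens of separate estimates before the simulation can run, and small changes in any of them flow straight through to the projected portfolio risk and return.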

To be fair, the authors of the article cited at the start of this paper did not intend to address the issue of where the inputs come from. Indeed, they are mainly saying that multiple Monte Carlo models, given exactly the same inputs, should generally provide very similar outputs. This is not unreasonable. Any analytical models that are to be used for something as important as financial planning must be testable. That said, aside from uncovering coding errors, such tests are ultimately not terribly helpful to an advisor or individual investor who wants to use Monte Carlo tools for portfolio planning. My point is that for Monte Carlo models to be useful, you must go far beyond this type of test and deal with the combination of the underlying simulation and the input values for simulating a portfolio of real assets. Monte Carlo models are not useful for the vast majority of investors or their advisors unless the models generate their own parameters and can be used to simulate portfolios of real assets without the potential users needing a Ph.D. in finance to run them.