Monte Carlo portfolio planning tools allow investors to account for the effects of volatility on their long-term plans. These models simulate many possible future outcomes and calculate the probability that an investor’s portfolio will be able to provide a long-term income stream or meet some other goal. To do so, they must generate statistical projections for the future risks and returns of all assets in a portfolio and account for the relationships between asset classes.

In light of recent years, especially 2008, there has been a growing misconception that these tools failed to show investors the potential for substantial losses and thus are of limited use in long-term planning. This may be true for some Monte Carlo models, but certainly not all. In this article, I demonstrate a straightforward way to stress test these models, using Quantext Portfolio Planner (QPP) as an example. The technique that I will present is related to Moshe Milevsky’s SORDEX ratio (pdf). The basic idea behind stress testing is to see how some extreme but possible event will impact long-term plans. Milevsky proposes the following:

1) Design a plan that meets your criteria and has a high probability of meeting your goals using the Monte Carlo Simulation

2) Run the Monte Carlo simulation three years forward

3) Look at the worst 1% outcomes from the Monte Carlo Simulation

4) Calculate the probability of meeting your goal assuming that three years have passed and the worst 1% outcome occurs

5) Look at how much the probability of success has declined from the original analysis

This approach makes a great deal of sense: it is a fairly simple stress test that requires two runs of the Monte Carlo simulation rather than one. Milevsky shows that a Monte Carlo simulation run through late 2007 would have clearly demonstrated that a heavy allocation to equities for a recent retiree was very risky under this stress test, even though the baseline run showed a high probability of success. The motivation for this approach is that people should plan to be able to survive an event with a 1% probability. We buy homeowners insurance that will replace our homes and contents if the house is destroyed, even though this event has a very low probability of occurring. A further notion motivates the stress test: very low probability events that have severe consequences for one’s well-being should be emphasized in planning. There is a large literature on this issue that goes under the name ‘utility theory.’

I am a proponent of stress testing of any planning model, and Milevsky’s article is very timely in this regard. I have developed a closely related stress test that is easier to apply in most Monte Carlo or other planning models. I will demonstrate this using Quantext Portfolio Planner.

Imagine that we are standing at the end of 2007 and trying to build a plan going forward. First off, there were a variety of reasons to believe that most asset classes were over-valued. If we were building a plan in late 2007, how would this go?

**Step 1: Build a Strategic Asset Allocation Plan**

As always, we need a straw man, and for today it’s Bob. Bob is 65 and has just retired. He has $1M in invested assets and plans to draw $50K per year for his retirement, starting now and increasing by 3% per year to cover inflation. Bob has a fairly generic portfolio, which we have analyzed using QPP with all default settings and data only through 12/31/2007 to see what the world would have looked like back then.

Bob’s portfolio is shown below:

The Portfolio Stats box shows the projected future risk and return for this portfolio, and the Average Annual Return column shows the projected average annual return for each asset class.

Based on this allocation, QPP generates the following probabilities that Bob will run out of money by a certain age:

Bob looks like he is in good shape. He has a 20% chance of running out of money by age 93—or, conversely, an 80% chance of making it to 93 without running out of money.

This process is how Monte Carlo is usually run—we generate survival probabilities. These results depend on the portfolio, as well as other information about the individual. The main variables are projected annual savings vs. income draws each year.
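To make the idea of survival probabilities concrete, here is a minimal sketch of this kind of calculation. It is not QPP's model: it assumes independent, normally distributed annual returns for the whole portfolio, a draw taken at the start of each year, and a fixed growth rate for the draw. The function name and parameters are hypothetical, and the output will not match the QPP figures quoted in this article.

```python
import random


def failure_probability_by_age(start_balance, first_draw, draw_growth,
                               mean_return, volatility,
                               start_age, end_age, n_paths=5000, seed=0):
    """Return {age: fraction of simulated paths depleted by that age}.

    Assumption (not from QPP): annual portfolio returns are i.i.d. Gaussian.
    """
    rng = random.Random(seed)
    years = end_age - start_age
    depleted_by_year = [0] * (years + 1)
    for _ in range(n_paths):
        balance = start_balance
        draw = first_draw
        for year in range(1, years + 1):
            balance -= draw  # income is withdrawn at the start of the year
            if balance <= 0:
                # Once depleted, the path counts as failed for all later ages.
                for later in range(year, years + 1):
                    depleted_by_year[later] += 1
                break
            balance *= 1.0 + rng.gauss(mean_return, volatility)
            draw *= 1.0 + draw_growth  # inflation adjustment
    return {start_age + y: depleted_by_year[y] / n_paths
            for y in range(years + 1)}


# Bob's inputs from the article: $1M, $50K draw growing 3% per year,
# 8.7% expected return, 10.6% standard deviation, retiring at 65.
curve = failure_probability_by_age(1_000_000, 50_000, 0.03,
                                   0.087, 0.106, 65, 100)
for age in (80, 85, 90, 95):
    print(age, round(curve[age], 3))
```

The returned curve is cumulative, so it can only rise with age. Reading off the age at which it crosses 20% gives the kind of figure quoted above.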

**Step 2: Estimate the ‘worst case’**

The next step is to determine a realistic ‘worst case,’ given that we cannot really know the absolute worst case. I believe that the following process is a good baseline stress test.

1. The expected net change in portfolio value over an N-year period, as a fraction of the starting balance, is estimated to be:

(8.7% - 5%) * N

in which the 5% is the estimated draw rate ($50K / $1000K) and the 8.7% is the expected annual return.

2. We are going to use a 3 standard deviation event as our estimated worst case, which leads to the following downside risk estimate over an N-year period:

-3 * 10.6% * N^(1/2)

The factor of three in this equation is because we are looking at a 3 standard deviation event. The 10.6% is the estimated future annual standard deviation for the portfolio. The square root of N comes in because the standard deviation of a random walk increases as the square root of time. Setting N = 1 and adding the two terms, our total estimated loss over the ‘worst case’ one-year period is:

8.7% - 5% - 3*10.6% = -28.1%, or roughly -28%
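The two terms above combine into a single formula that is easy to apply to other portfolios and horizons. The sketch below is one way to write it; the function name and argument names are my own, and the calculation uses the same simple additive approximation as the text (no compounding).

```python
import math


def worst_case_return(expected_return, draw_rate, volatility,
                      years=1.0, n_sigmas=3.0):
    """Estimated worst-case fractional change in portfolio value.

    drift:  (expected return - draw rate) * N
    shock:  n_sigmas * volatility * sqrt(N)
    """
    drift = (expected_return - draw_rate) * years
    shock = n_sigmas * volatility * math.sqrt(years)
    return drift - shock


# Bob's one-year case: 8.7% expected return, 5% draw, 10.6% volatility.
loss = worst_case_return(0.087, 0.05, 0.106, years=1)
print(f"worst-case one-year change: {loss:.1%}")
print(f"balance after worst case: ${1_000_000 * (1 + loss):,.0f}")
```

For Bob this gives about -28.1%, which leaves roughly $719K. The article rounds the loss to 28% and the remaining balance to $720K.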

**Step 3: Simulate probability of success, assuming the worst case comes to pass**

If the worst case occurs, Bob has only $720K after the next year. When we look at QPP for Bob at age 66, with $720K, his odds of being able to sustain his desired income have plummeted:

Bob now has a 20% probability of failure at age 82, 11 years earlier than in the original calculation. Bob has a 50% chance of running out of funds by age 88. It is something of a standard of practice to manage to an 80% success rate / 20% failure rate, which is why I am focusing on this percentile. The loss of eleven years of funding at this percentile is the impact of the very bad 1-year event.

Milevsky looks at how the probability of failure changes, and I am looking at how many years of projected retirement you lose at the targeted percentile, but the process is the same. I find the number of lost years to be a good basic metric. If you are not confident of survival in the event of this worst case, it is a good idea to consider alternatives.
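The lost-years metric can be computed from any two failure-probability curves, one from the baseline run and one from the post-shock run. The sketch below assumes each curve is a mapping from age to cumulative probability of running out of money; the helper names are hypothetical. In the example curves, only the points quoted in this article (20% at 93 for the baseline; 20% at 82 and 50% at 88 after the shock) come from QPP. The remaining points are illustrative placeholders.

```python
def age_at_failure_threshold(failure_curve, threshold=0.20):
    """Earliest age at which cumulative failure probability reaches threshold.

    Returns None if the curve never reaches the threshold.
    """
    for age in sorted(failure_curve):
        if failure_curve[age] >= threshold:
            return age
    return None


def lost_years(baseline_curve, stressed_curve, threshold=0.20):
    """Years of funding lost at the given failure threshold."""
    base_age = age_at_failure_threshold(baseline_curve, threshold)
    stressed_age = age_at_failure_threshold(stressed_curve, threshold)
    if base_age is None or stressed_age is None:
        return None
    return base_age - stressed_age


# Points at 93 (baseline) and at 82 and 88 (stressed) are from the article;
# all other values are made-up placeholders to fill out the curves.
baseline = {80: 0.02, 82: 0.04, 88: 0.10, 93: 0.20}
stressed = {80: 0.15, 82: 0.20, 88: 0.50, 93: 0.70}

print(lost_years(baseline, stressed))  # 93 - 82 = 11
```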

It is important to keep in mind that we have assumed that Bob will continue to draw his $50K in current dollars, no matter that his portfolio has lost 28% of its value.

**Step 4: Determine whether the worst case is survivable**

In real life, Bob is not going to keep on spending at the same level after a major reduction in his portfolio value. What Bob needs to determine is whether this worst case is survivable. Obviously it has significant odds of not being survivable without some reduction in spending. Bob determines that in this ‘perfect storm,’ he could reduce his living expenses by 25%. We then re-calculate his odds of survival:

If Bob could reduce his annual draw from this portfolio to $40K per year in the event of the worst case outcome, he is now back up to a 20% probability of running out of funds by age 88. If Bob considers this to be acceptable in the event of a ‘Black Swan’ event, his plan is in okay shape. If Bob wants to get all the way back to his original odds of success, he will need to either (1) change his asset allocation, or (2) be willing and able to further reduce his income in the event of the worst case outcome.

**Step 5: Mean reversion**

I noted near the start of this article that QPP projected that almost every asset class was over-valued as of the end of 2007. Bob’s portfolio was only moderately exposed to this over-valuation because of his 50% allocation to bonds, but his portfolio is projected to generate an average return of 8.7% per year and the most recent three years have returned an average of 9.2%. This suggests that a period of lower-than-expected returns is likely so that the long-term average will converge towards the expected value. This portfolio would need to return an average of 8.2% over the next three years to bring the long-term balance into equilibrium (to average the 9.2% with 3 years of lower returns and get an average of 8.7%). On the other hand, things could revert to the mean a lot faster (as they did in 2008), but we cannot know that ahead of time. For a more extreme test, we might assume that the 3-year excess returns will mean revert over the next year, in which case the expected return over the next year is:

4*8.7% - 3*9.2% = 7.2%

so the new ‘worst case’ estimate becomes:

7.2% - 5% - 3*10.6% = -29.6%
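The mean-reversion adjustment is simple averaging: pick the return over the coming period that brings the combined average back to the long-run expected value. The sketch below generalizes the two cases discussed above (reversion over three years and over one year). The function name is my own, and like the text it uses arithmetic averages and ignores compounding.

```python
def mean_reverting_return(long_run_mean, trailing_mean,
                          trailing_years, horizon_years):
    """Average annual return over the next `horizon_years` that pulls the
    combined (trailing + future) arithmetic average back to `long_run_mean`.
    """
    total_years = trailing_years + horizon_years
    return (total_years * long_run_mean
            - trailing_years * trailing_mean) / horizon_years


# Bob's portfolio: 8.7% long-run expected return, 9.2% trailing 3-year average.
over_three_years = mean_reverting_return(0.087, 0.092, 3, 3)
over_one_year = mean_reverting_return(0.087, 0.092, 3, 1)
print(f"{over_three_years:.1%}", f"{over_one_year:.1%}")

# Revised one-year worst case: adjusted return, 5% draw, 3 x 10.6% shock.
revised_worst_case = over_one_year - 0.05 - 3 * 0.106
print(f"{revised_worst_case:.1%}")
```

With Bob's numbers this gives 8.2% per year for a three-year reversion and 7.2% for a one-year reversion, and a revised one-year worst case of about -29.6%.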

This will make the probability of running out of funds even worse. In Bob’s case this is not a huge effect. In the case of investors who were overweight sectors like emerging markets, this would have been far more severe.

There is another factor that is worth noting: mean reversion in volatility. The trailing volatility for this portfolio for the three years through 2007 was 6.03%. The projected volatility was 10.6% (see the QPP output shown earlier). If we assume that volatility also mean reverts, we would need to adjust the projected volatility considerably upwards (as we adjusted return down). I will discuss this topic in a future article.

**Summary**

In 2008, Bob’s asset allocation lost only 14%, but his total portfolio value is down 18%-20%, depending upon the schedule upon which he drew his income. This is more like a 2-standard-deviation event than our estimated ‘worst case’ 3-standard-deviation event. If Bob had run the kind of stress test described above in 2007, he would have had a good idea of his potential exposure to an extreme event—and this is the whole point of stress testing.

Embedded in this process, there are two important ideas. First, a 3-standard-deviation event is roughly a 3-in-1000 event under the assumption of Gaussian returns (no fat tails) when both tails are counted; the downside alone is about half as likely. In reality, however, we don’t know the true probabilities of these events. Even if returns are Gaussian, we have what is sure to be only a rough approximation of the true standard deviation of the returns, and this estimation error raises the probability of under-stating the risks. Thus, the use of a 3-standard-deviation event is a reasonable stress test. If we incorporated the mean reversion in volatility, the projected 28% loss would of course have been estimated as far more probable than a 3-standard-deviation event, but we are using it as a basic ‘worst case’ that needs to be survivable.
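The Gaussian tail probability referred to above can be checked directly with the standard normal distribution. The short sketch below uses only the Python standard library; the function name is my own. It applies only under the Gaussian assumption, which, as noted, likely understates the real-world odds.

```python
import math


def gaussian_lower_tail(n_sigmas):
    """P(Z <= -n_sigmas) for a standard normal variable Z."""
    return 0.5 * math.erfc(n_sigmas / math.sqrt(2.0))


one_sided = gaussian_lower_tail(3.0)   # downside only
two_sided = 2.0 * one_sided            # either direction
print(f"one-sided: {one_sided:.5f}, two-sided: {two_sided:.5f}")
```

The two-sided figure is about 0.27%, or roughly 3 in 1000. The downside tail alone is about 0.135%.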

The second important idea is that the consequences of an extreme outcome are so bad that we need to pay attention to this kind of event, even if it has a very low probability. As I noted earlier, this follows from utility theory.

The simple equation that I provide for estimating worst-case scenarios for stress testing of financial plans will make it easier to apply this approach to various models. Most planning tools do not provide the estimated 1% tails from which to estimate a ‘worst case.’ The reason for this is quite sensible: the estimation error of 1% tails is very high because of model uncertainties and limitations. A more substantial problem is that very few Monte Carlo or other planning tools (with QPP being a notable exception) allow the user to see what the model would have projected from some earlier date.

The number of lost years at the 20% failure (80% survival) level provides an intuitive metric for seeing the impacts of extreme events, and you can look at extreme events on various time scales. I used 1 year in this case, but Milevsky uses 3 years in his example. While he does not say so, he probably chose 3 years because this is a time horizon on which statistical models tend to do a better job than on one year. Putting it another way, fat tails are less extreme on 3-year time horizons than on 1-year horizons.

The bottom line, as it were, is that stress testing can help investors and advisors make sure that extreme events are survivable. This does not mean that these models have a magic formula for predicting the ‘true’ probability of events like 2008, but rather that these stress tests provide ‘emergency drills.’ An important take-away is that the potential impact of extreme events is determined by the specifics of the investor (age, income flexibility, planned income, risk tolerance) and the portfolio. Our model investor, Bob, is at the highest risk point in his investing career, the years surrounding the onset of retirement. The long-term impact of a severe market downturn will typically be lower for both older and younger investors.