In Part 1 of this series on backtesting, we looked at how backtesting is intended to be used with portfolio strategies, and how important it is to look at how the strategy performed over as many different time periods as possible.
To recap, my definition of backtesting is as follows:
Backtesting is the use of historical data to assess how a rational "buy and hold" portfolio investment thesis has performed in the past over several extended periods. After a prudent assessment, the strategy may be considered to perform similarly in the future over an extended period.
In Part 1 of the series, we looked at the key statements:
- "rational 'buy and hold' portfolio investment thesis", and
- "several extended periods"
In this article we look at what makes a "prudent assessment".
The "Prudent Assessment"
So we have a rational investment thesis, and we have tested it over several time periods as far back into the past as possible. Now it's time to interpret the data.
I will refer to the low EV/EBIT strategy for the Large and Mid Cap, US Market throughout this article. For reference, the stocks screened on 01 Jan of this year are as follows:
Ticker | Name | EV/EBIT (as of 01 Jan 2016) |
APOL | Apollo Education Group Inc | 0.3 |
PDLI | PDL BioPharma Inc | 1.53 |
MOH | Molina Healthcare Inc. | 1.8 |
NSU | Nevsun Resources Ltd | 1.94 |
BPI | Bridgepoint Education Inc | 2.58 |
AGX | Argan Inc | 2.73 |
CYD | China Yuchai International Ltd | 2.79 |
TA | TravelCenters of America LLC | 3.01 |
EXTN | Exterran Corp | 3.26 |
HPQ | HP Inc | 3.43 |
GBX | Greenbrier Companies Inc. (The) | 3.49 |
RPXC | RPX Corp | 3.49 |
BPT | BP Prudhoe Bay Royalty Trust | 3.6 |
CYOU | Changyou.com Ltd | 3.61 |
ENTA | Enanta Pharmaceuticals Inc | 3.69 |
ATW | Atwood Oceanics Inc. | 3.76 |
CALM | Cal Maine Foods Inc | 3.83 |
ARLP | Alliance Resource Partners LP | 4.03 |
OUTR | Outerwall Inc | 4.1 |
WLKP | Westlake Chemical Partners LP | 4.11 |
EPE | EP Energy Corp | 4.51 |
PZE | Petrobras Argentina SA | 4.54 |
UEPS | Net 1 Ueps Technologies Inc | 4.56 |
CENX | Century Aluminum Co | 4.74 |
GME | GameStop Corp. | 4.77 |
SSL | Sasol Ltd | 4.78 |
AR | Antero Resources Corp | 4.86 |
SAFM | Sanderson Farms Inc | 4.9 |
ESV | ENSCO Plc | 4.91 |
VLO | Valero Energy Corp | 4.98 |
(Source: Portfolio123 Data)
For this strategy, the table below provides typical summary results since 1999:
(Source: Portfolio123 Data and Author Calculations & Table)
Let's take a closer look at these metrics.
The Returns
What do nearly all investors look for first when they're looking at new strategies? The return, of course. This is how the strategy has performed, and what investors hope to emulate. Returns are presented as averages to capture performance over many periods. The averages can be presented in different ways, and it's important to understand each one.
To illustrate, let's look at an investment that has performed over 10 years as follows:
Year | Annual Return, % | Balance |
0 | | $1,000 |
1 | 10.0% | $1,100 |
2 | 10.0% | $1,210 |
3 | 45.0% | $1,755 |
4 | -25.0% | $1,316 |
5 | -35.0% | $855 |
6 | 15.0% | $984 |
7 | -8.0% | $905 |
8 | 8.0% | $977 |
9 | 15.0% | $1,124 |
10 | 5.0% | $1,180 |
We have returns for each year, but how did the strategy do over the full period?
If we look at the arithmetic average (or mean, each yearly return summed and divided by 10 years), we get an average annual return of 4%.
While helpful, this annual rate looks at each year in isolation, without any consideration for compound interest. Looking at this another way, if you assumed a return of 4% per year, and compounded each year at 4%, you would arrive at a final portfolio value of about $1,480, which is not what happened in reality: the actual ending balance was $1,180.
Year | Annual Return, % | Actual Balance | Arithmetic Average | Balance at Arithmetic Average |
0 | | $1,000 | | $1,000 |
1 | 10.0% | $1,100 | 4.00% | $1,040 |
2 | 10.0% | $1,210 | 4.00% | $1,082 |
3 | 45.0% | $1,755 | 4.00% | $1,125 |
4 | -25.0% | $1,316 | 4.00% | $1,170 |
5 | -35.0% | $855 | 4.00% | $1,217 |
6 | 15.0% | $984 | 4.00% | $1,265 |
7 | -8.0% | $905 | 4.00% | $1,316 |
8 | 8.0% | $977 | 4.00% | $1,369 |
9 | 15.0% | $1,124 | 4.00% | $1,423 |
10 | 5.0% | $1,180 | 4.00% | $1,480 |
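To make the gap concrete, here is a minimal Python sketch that recomputes the arithmetic average and both ending balances from the yearly returns in the table. The variable names are my own for illustration; only the returns and the $1,000 starting balance come from the example.

```python
# Yearly returns from the example table, written as fractions (10% -> 0.10).
annual_returns = [0.10, 0.10, 0.45, -0.25, -0.35, 0.15, -0.08, 0.08, 0.15, 0.05]
start_balance = 1000.0
years = len(annual_returns)

# Arithmetic mean: sum the yearly returns and divide by the number of years.
arithmetic_mean = sum(annual_returns) / years

# Actual ending balance: compound each year's real return in sequence.
actual_balance = start_balance
for r in annual_returns:
    actual_balance *= 1 + r

# Hypothetical ending balance if every year had earned the arithmetic mean.
mean_compounded_balance = start_balance * (1 + arithmetic_mean) ** years

print(f"Arithmetic mean: {arithmetic_mean:.2%}")
print(f"Actual ending balance: ${actual_balance:,.0f}")
print(f"Balance compounding the mean: ${mean_compounded_balance:,.0f}")
```

Run as written, this should report 4.00%, $1,180 and $1,480, the same figures as the table: compounding the arithmetic average overstates what the investor actually ended up with.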
This is where the geometric average comes in handy. This rate tells you the average compounded rate you would have achieved per year, taking compound interest into account. In our example, the geometric average for the period is 1.67% per year. Compounding this rate each year, we arrive at the actual value of our portfolio after the 10 years.
Year | Annual Return, % | Actual Balance | Geometric Mean | Balance at Geometric Mean |
0 | | $1,000 | | $1,000 |
1 | 10.0% | $1,100 | 1.67% | $1,017 |
2 | 10.0% | $1,210 | 1.67% | $1,034 |
3 | 45.0% | $1,755 | 1.67% | $1,051 |
4 | -25.0% | $1,316 | 1.67% | $1,068 |
5 | -35.0% | $855 | 1.67% | $1,086 |
6 | 15.0% | $984 | 1.67% | $1,104 |
7 | -8.0% | $905 | 1.67% | $1,123 |
8 | 8.0% | $977 | 1.67% | $1,142 |
9 | 15.0% | $1,124 | 1.67% | $1,161 |
10 | 5.0% | $1,180 | 1.67% | $1,180 |
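The geometric average can be recovered from the same yearly returns: multiply the yearly growth factors together, take the tenth root, and subtract one. A short Python sketch of that calculation, with variable names that are mine rather than anything from the backtest:

```python
import math

# Yearly returns from the example table, written as fractions.
annual_returns = [0.10, 0.10, 0.45, -0.25, -0.35, 0.15, -0.08, 0.08, 0.15, 0.05]
start_balance = 1000.0
years = len(annual_returns)

# Total growth factor over the whole period (about 1.18 for this example).
total_growth = math.prod(1 + r for r in annual_returns)

# Geometric mean: the single yearly rate that compounds to the same growth.
geometric_mean = total_growth ** (1 / years) - 1

# Sanity check: compounding the geometric mean reproduces the actual balance.
actual_balance = start_balance * total_growth
geometric_balance = start_balance * (1 + geometric_mean) ** years

print(f"Geometric mean: {geometric_mean:.2%}")
print(f"Actual ending balance: ${actual_balance:,.0f}")
print(f"Balance compounding the geometric mean: ${geometric_balance:,.0f}")
```

Because the rate is derived from the total growth over the whole period, it changes whenever the start or end date of the period changes.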
It's important to note that the geometric mean is not a "realtime" value. It can only be determined when looking over a period of the investment; it also changes with the period of time that is being covered.
Arithmetic and geometric average return values in isolation don't quite give you the full story. The arithmetic mean gives you the "standalone" average rate per year, but cannot be used as a compounding rate to estimate the value of a portfolio over time.
The geometric mean takes into account compounding, but is not the actual rate you would have achieved in a given year.
An investor needs to take information from both of these rates to assess the return of the strategy over time.
Volatility
What neither of these rates accounts for is volatility. How can you measure how much the returns vary from year to year, or from month to month? This is where standard deviation (or SD) comes in. The SD is simply a measure of how scattered the individual returns are around the mean. The lower the value, the less scatter, and thus the less volatility in the strategy.
For our example strategy, we have a historic SD of 21.3%. Essentially this means that a given year's return typically landed about 21.3 percentage points above or below the period's average return.
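As a rough check, the 21.3% figure appears to match the population standard deviation of the ten yearly returns in the example table above. The sketch below uses Python's standard `statistics` module; which convention (population or sample) a given backtesting tool reports is an assumption you should verify for your own data.

```python
import statistics

# Yearly returns from the example table, written as fractions.
annual_returns = [0.10, 0.10, 0.45, -0.25, -0.35, 0.15, -0.08, 0.08, 0.15, 0.05]

# Population SD divides the squared deviations by N.
population_sd = statistics.pstdev(annual_returns)

# Sample SD divides by N - 1, so it is slightly larger on short histories.
sample_sd = statistics.stdev(annual_returns)

print(f"Population SD: {population_sd:.1%}")
print(f"Sample SD: {sample_sd:.1%}")
```

With only ten data points the two conventions differ by about a percentage point (roughly 21.3% versus 22.4%), which is worth keeping in mind when comparing SD values from different sources.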
How can you determine if volatility (or SD) is high or low? This will vary depending on the type of strategy you are looking at. In most strategies that I look at, the average is about 18 - 22% going back to 1999. Anything less (15 - 18%) is low, and would be considered good. Values greater than 22% will really test an investor's staying power. Conventional wisdom tells us that you need to take on more risk/volatility to achieve higher returns; however, this may not always be the case. Be sure to compare any performance against the risk and volatility by referring to the standard deviation.
The takeaway here is the lower the SD the better. When looking at a strategy's return, it is always good to weigh the potential return against the volatility. Many, but not all, high return strategies are accompanied by higher levels of volatility. Return is important, but more importantly, if you don't have the stomach to hold on to the strategy during times of high volatility, then the return becomes meaningless.
Annual or monthly SD?
Measuring volatility over more time periods is better than over fewer. In technical terms, returns and their deviation from the mean are measured monthly. This allows investors to see how a strategy has fluctuated from month to month; if you look only at annual deviation, you miss all of the peaks and valleys within the year.
While we encourage investors to avoid very frequent checks on their portfolio returns, it is very difficult to not check for an entire year. For this reason, monthly SD values are more useful. When researching backtest data be sure to verify which SD value is being presented.
Risk Adjusted Returns
You will hear this term often in quantitative performance statistics. This is essentially the return you are achieving above and beyond the risk free rate (say, a US T-bill), per unit of volatility. This measure weighs risk against reward: is it worth the trouble of investing in this strategy, or would it have ended up being the same as just investing hassle free in a T-bill?
In mathematical terms, this is simply the strategy return less the risk free rate over the same period, divided by the volatility (the standard deviation above). This is known as the Sharpe Ratio. A positive Sharpe value tells us that the strategy return beats the risk free rate; the higher the value, the more return is earned for each unit of volatility. Lower values indicate a smaller beat over the risk free rate, more volatility, or both.
To sum up, the higher a Sharpe ratio the better.
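Here is a minimal sketch of that calculation in Python, applied to the ten yearly returns from the earlier example. The 2% risk free rate is a hypothetical placeholder, not a figure from the backtest, and I use the population SD so the volatility matches the 21.3% above; other tools may use the sample SD or monthly data.

```python
import statistics

def sharpe_ratio(returns, risk_free_rate):
    """Average return above the risk free rate, per unit of volatility."""
    mean_return = sum(returns) / len(returns)
    volatility = statistics.pstdev(returns)
    return (mean_return - risk_free_rate) / volatility

# Yearly returns from the example table, written as fractions.
annual_returns = [0.10, 0.10, 0.45, -0.25, -0.35, 0.15, -0.08, 0.08, 0.15, 0.05]

# Hypothetical risk free rate, chosen only for illustration.
risk_free_rate = 0.02

sharpe = sharpe_ratio(annual_returns, risk_free_rate)
print(f"Sharpe ratio: {sharpe:.2f}")
```

With a 4% average return, a 2% risk free rate and 21.3% volatility, the result is a little under 0.1: a small reward for a lot of scatter.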
Sometimes it is the downside that investors are more concerned about, which is where the Sortino Ratio comes in. This is the same concept as the Sharpe ratio, but instead of using the standard deviation, the Sortino ratio uses the downside deviation (which measures only how far values fall below the mean). Like the Sharpe ratio, the higher the value, the better.
You may find a strategy with a very good return, but also with higher volatility. The Sharpe and Sortino ratios can be very helpful here in that they tell you how much return (above the risk-free rate) you are achieving per unit of volatility. When comparing two strategies, the one with the higher Sharpe and/or Sortino ratio is delivering more return for each unit of volatility taken on.
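The downside deviation can be sketched in a few lines of Python. Following the description above, this version measures shortfalls below the mean return; many published definitions measure them below a chosen target return instead, so the `target` parameter is left adjustable. The 2% risk free rate is again a hypothetical placeholder.

```python
import math

def downside_deviation(returns, target):
    """Root mean square of the shortfalls below the target return."""
    # Returns at or above the target contribute zero.
    squared_shortfalls = [min(r - target, 0.0) ** 2 for r in returns]
    return math.sqrt(sum(squared_shortfalls) / len(returns))

def sortino_ratio(returns, risk_free_rate, target=None):
    """Average return above the risk free rate, per unit of downside deviation."""
    mean_return = sum(returns) / len(returns)
    if target is None:
        target = mean_return  # assumption: deviations are measured below the mean
    return (mean_return - risk_free_rate) / downside_deviation(returns, target)

# Yearly returns from the example table, written as fractions.
annual_returns = [0.10, 0.10, 0.45, -0.25, -0.35, 0.15, -0.08, 0.08, 0.15, 0.05]
risk_free_rate = 0.02  # hypothetical, for illustration only

mean_return = sum(annual_returns) / len(annual_returns)
downside_dev = downside_deviation(annual_returns, mean_return)
sortino = sortino_ratio(annual_returns, risk_free_rate)

print(f"Downside deviation: {downside_dev:.1%}")
print(f"Sortino ratio: {sortino:.2f}")
```

Only three of the ten years fall below the 4% mean, so the downside deviation (about 15.8%) is smaller than the full standard deviation, and the Sortino ratio comes out higher than the Sharpe ratio computed on the same inputs.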
How Low Can you Go?
The ugliest statistic, maximum drawdown, tells you the most the strategy would have fallen in value, from peak to trough, in its worst performing period. In the last 17 years, this is often, but not always, during the 2007-2009 period. Drawdown is unavoidable; however, you should work to minimize it. If a strategy has had significant drawdown at some point, take this into account. It is much more difficult to hold on when your portfolio has lost 75% and the benchmark only 50% or less. Yet it is often after these drawdowns that the recovery happens.
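To illustrate, here is a small Python sketch that finds the maximum drawdown of the 10 year example from earlier in this article, using year-end balances only. The function name is my own; backtesting tools typically work from monthly or daily values, which can reveal deeper troughs than year-end data.

```python
def max_drawdown(balances):
    """Largest peak-to-trough decline, as a fraction of the peak."""
    peak = balances[0]
    worst = 0.0
    for balance in balances:
        peak = max(peak, balance)  # highest value seen so far
        worst = max(worst, (peak - balance) / peak)
    return worst

# Yearly returns from the example table, written as fractions.
annual_returns = [0.10, 0.10, 0.45, -0.25, -0.35, 0.15, -0.08, 0.08, 0.15, 0.05]

# Rebuild the year-end balances, starting from $1,000.
balances = [1000.0]
for r in annual_returns:
    balances.append(balances[-1] * (1 + r))

drawdown = max_drawdown(balances)
print(f"Maximum drawdown: {drawdown:.2%}")
```

In the example, the balance peaks at about $1,755 in year 3 and bottoms at about $855 in year 5, a decline of roughly 51%, and it has still not regained that peak by year 10.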
What are the Odds?
Once compiled, rolling periods of data can provide very concise results and a clear picture (we discussed rolling periods at length in Part 1 of this series). This is done by calculating the odds of beating a benchmark over a given holding length, using all rolling periods within the test window. For example, assuming an investment is made in each month of the year, the 5 years from 2010 to 2014 give you 5*12 = 60 starting points for a rolling 5 year period. From 2005 - 2014, there are 120, and so on. Backtesting will then compare the results from each rolling 5 year period of the strategy to the same rolling 5 year period of the benchmark. The number of times the strategy beats the benchmark, divided by the total number of periods, then gives you the odds of beating the benchmark, or the "base rate".
It is also helpful to see by what average value the strategy beats the benchmark. This is a very powerful statistic, and it should be considered with any strategy. You will see that most strategies improve their base rates with longer holding periods. Seeing how the strategy has performed on a rolling basis for 1, 3, 5, 7 and 10 year rolling periods provides a very good picture of the strategy in only a few lines. Many funds and strategies report 3 or 5 year performance; however, it is usually for only one period, e.g. Jan 2010 - Dec 2014. It is important to understand the performance on a rolling basis.
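The base rate calculation can be sketched in Python as follows. Everything here is illustrative: the function names are mine, and the monthly returns are randomly generated placeholders, not Portfolio123 data. The sketch only counts complete windows that fit inside the data, so 120 months of returns yield 61 rolling 5 year periods.

```python
import random

def cumulative_return(monthly_returns):
    """Compound a run of monthly returns into one total return."""
    growth = 1.0
    for r in monthly_returns:
        growth *= 1 + r
    return growth - 1

def rolling_base_rate(strategy, benchmark, window):
    """Return (share of rolling windows won, average excess return)."""
    if len(strategy) != len(benchmark):
        raise ValueError("strategy and benchmark must cover the same months")
    n_windows = len(strategy) - window + 1
    if n_windows < 1:
        raise ValueError("not enough data for one full window")
    excess = []
    for start in range(n_windows):
        s = cumulative_return(strategy[start:start + window])
        b = cumulative_return(benchmark[start:start + window])
        excess.append(s - b)
    wins = sum(1 for e in excess if e > 0)
    return wins / n_windows, sum(excess) / n_windows

# Placeholder data: 120 months of made-up returns with a fixed seed.
rng = random.Random(42)
benchmark = [rng.gauss(0.006, 0.04) for _ in range(120)]
strategy = [rng.gauss(0.009, 0.05) for _ in range(120)]

base_rate, avg_excess = rolling_base_rate(strategy, benchmark, window=60)
print(f"Base rate over rolling 5 year periods: {base_rate:.0%}")
print(f"Average excess return per period: {avg_excess:.1%}")
```

Calling the same function with `window=12`, `36`, `84` and `120` on a long enough history gives the 1, 3, 7 and 10 year rows of a summary table.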
For an example of this, see below for a summary table on the low EV/EBIT strategy I wrote about recently (Large & Mid Cap, US):
(Source: Portfolio123 data and Author Calculations)
How do you apply this information? Say you're invested in a quantitative strategy, and you're 5 years in. Performance begins to drop. What do you do? When you chose the strategy, you found out that it has beaten the benchmark over 97% of 5 year rolling periods. If you are lagging the benchmark, look at your results over the full 5 year period, and you have the comfort of knowing that you are still within the odds.
Even so, there are no guarantees in investing. While quantitative investors purposely test as far back as possible to capture the strategies over different periods, this does not guarantee that past results will continue. There will be new events that may change these statistics. That said, using base rates is one of the most powerful tools at your disposal in quantitative investing.
"Extended Period"
The final line of our definition of backtesting states:
After a prudent assessment, the strategy may be considered to perform similarly in the future over an extended period.
While it is important to test strategies over periods as far back as possible, the results of the past may not show themselves in the near future after you take a position in the strategy. A single year or two of performance may differ from the mean data. Or the strategy may be in a lull and actually have a stretch of underperforming years.
It is easier said than done, but the results that backtesting provides are intended to be applied over the long term.
Concluding Remarks:
We have discussed several key metrics of backtested data. I hope that this will encourage investors to review backtest results thoroughly before coming to a conclusion about a given strategy.
In the 3rd and final installment of this series, we will take a closer look at testing universes, data sources and other important considerations in backtesting.
Stay tuned!
Disclosure: I am/we are long AGX, ARLP, BPI, HPQ, NSU, RPXC, SSL, UEPS, VLO.
I wrote this article myself, and it expresses my own opinions. I am not receiving compensation for it (other than from Seeking Alpha). I have no business relationship with any company whose stock is mentioned in this article.
Additional disclosure: I am a user of Portfolio123, and have included affiliate links in this article.