Simulated Backtests May Not Be Realistic For Volatility ETNs

Includes: SVXY, VXX, XIV
by: Robin Hewitt

Summary

The volatility ETNs are most beneficial when used with a trading strategy.

Trading strategies are commonly evaluated with synthetic data to cover longer time periods.

Backtesting these ETNs on synthetic data is widely considered valid.

This article shows that, on the contrary, short-lived tracking errors can dramatically change the results.

The volatility ETNs VelocityShares Daily Inverse VIX Short-Term ETN (NASDAQ:XIV) and iPath S&P 500 VIX ST Futures ETN (NYSEARCA:VXX) have attracted much interest. Since these products provide the greatest value when used in conjunction with trading strategies that seek to avoid large drawdowns, numerous strategies have been developed. However, both ETNs were brought into existence after the most recent major crisis in 2009. That means they've only been through part of the full volatility cycle that runs through boom, bust, and recovery. It's therefore desirable to backtest these trading strategies against longer time periods to include a diversity of conditions.

Two data sources are available for simulating ETN performance beyond their lifetimes: 1) the index these ETNs track, which goes back to 1/31/2006, and 2) the VIX futures on which this index is based and for which there is data going back to their introduction in 2004. It's become common practice for anyone presenting a trading strategy based on these ETNs to demonstrate and compare their strategy against other strategies using one or both of these data sources.

The indexes are generally considered a safe substitute for these ETNs when backtesting and comparing trading strategies because 1) the ETN managers have an obligation to track these indexes and have set up safeguards to correct tracking errors, and 2) various people have found that the ETNs track their indexes fairly well (within a few percent) year over year. Figures 1 and 2 show the ETNs over their full lifetime with their indexes. VXX looks pretty good. XIV obviously has some drift.

Figure 1. VXX tracking its index.

Figure 2. XIV tracking its index.

Despite its long-term drift, XIV tracks reasonably well on a yearly basis, generally slipping 0.5% to 3%, although sometimes slipping 4% to 6%. Monthly, XIV does better yet, slipping less than 4%. Weekly slippage for XIV is also below 4%. VXX, as you might guess, does better on a yearly and monthly basis. On a week-to-week basis, however, VXX sometimes sees wider swings than XIV, slipping as much as 5% to 6%.

Where things get really interesting, though, is with the daily slippages for XIV and VXX. Figures 3 and 4 show what these look like.

Figure 3. VXX daily slippage by percent index change.

Figure 4. XIV daily slippage by percent index change.

As you can see in Figures 3 and 4, there's a distinctive non-linearity when the index change is large in magnitude. Above 7.5% and below -7.5%, the ETNs tend to compress their index. The other thing that happens is the range of slippage becomes generally wider at these extremes. The range of slippage is quite wide in several other bins as well. This wide range of scatter has the potential to be a significant problem when backtesting with simulated data. Imagine if 10% increases were handed out from time to time to some strategies and not to others. That would clearly skew the results!
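As a rough illustration of how daily slippages can be grouped by the size of the index change, here is a minimal Python sketch. It is not the code behind Figures 3 and 4. The function names, the 2.5% bin width, and the sample numbers are all assumptions made up for the example; the slippage formula is the one defined in the Notes.

```python
import math
from collections import defaultdict

def daily_slippage(etn_change, index_change):
    """Solve (1 + P) = (1 + I) * (1 + s) for s on a single day."""
    return (1 + etn_change) / (1 + index_change) - 1

def slippage_by_index_bin(etn_changes, index_changes, bin_width=0.025):
    """Group daily slippages by the size of the index change.

    Returns {bin_lower_edge: {"n", "min", "max", "mean"}}, ordered by edge.
    bin_width is an assumed value; the article does not state its bin size.
    """
    bins = defaultdict(list)
    for p, i in zip(etn_changes, index_changes):
        edge = round(math.floor(i / bin_width) * bin_width, 6)
        bins[edge].append(daily_slippage(p, i))
    return {
        edge: {"n": len(s), "min": min(s), "max": max(s), "mean": sum(s) / len(s)}
        for edge, s in sorted(bins.items())
    }

# Made-up daily changes for illustration only (not real ETN or index data).
index_changes = [0.010, 0.012, -0.090, 0.080]
etn_changes = [0.011, 0.010, -0.082, 0.074]

for edge, stats in slippage_by_index_bin(etn_changes, index_changes).items():
    print(f"{edge:+.3f}: n={stats['n']} "
          f"range=({stats['min']:+.4f}, {stats['max']:+.4f})")
```

In this made-up sample, the ETN moves less than the index on the two large days, which is the kind of compression described above. The min-to-max range per bin corresponds to the scatter.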

However, we also see that the distributions are fairly well balanced. While there's a lot of scatter, it's spread around in both negative and positive directions. So is this a problem or not? We can't answer that from just these charts. It looks like there's a possibility of problems, but there's also a reassuring symmetry in these tracking errors that might make them cancel out.

The only way to find out is to test. I did that by running backtests of several common trading strategies on both the ETNs and their index, over the period since the ETNs began. Let me note up front that there are some peculiarities to backtesting with the index. It has no open and close prices, just a daily number from settle to settle. To match that as closely as possible using the ETNs, I buy and sell only at the close. I also use that day's trading signal as the decision rule for opening and closing positions that same day. This roughly simulates buying and selling at the close based on a signal that fires just before the close. In this way, I use end-of-day data for both the ETN and the index, which should help make the tests more comparable.
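The close-to-close convention described above can be sketched in a few lines of Python. This is a simplified illustration, not the backtester used for the article: it assumes a single instrument that is either held or not (the real strategies switch between XIV, VXX, and cash), and the function name, signal format, and price series are hypothetical.

```python
def backtest_close_to_close(closes, signals):
    """Total gain from trading one instrument at the daily close.

    closes  : daily closing (or settle) values, oldest first
    signals : signals[t] is True if the day-t signal says to hold the
              instrument from the close of day t to the close of day t+1
    """
    equity = 1.0
    for t in range(1, len(closes)):
        if signals[t - 1]:  # position was opened (or kept) at the prior close
            equity *= closes[t] / closes[t - 1]
    return equity - 1.0

# Made-up series for illustration only (not real ETN or index data).
etn_closes = [100.0, 110.0, 99.0, 108.9]
index_settles = [200.0, 220.0, 200.0, 218.0]
signals = [True, False, True, False]

etn_gain = backtest_close_to_close(etn_closes, signals)
index_gain = backtest_close_to_close(index_settles, signals)
print(f"ETN gain:   {etn_gain:.1%}")
print(f"Index gain: {index_gain:.1%}")
```

Running the same signals over both series is what makes the two results comparable: any difference in gain comes from the ETN and the index diverging on the days the strategy was invested.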

The trading strategies I backtested are:

  • Vratio
  • Vratio10
  • CB_10_9_-8_-7
  • CB_5_2_-8_-7
  • CB_5_2_-5_-4

Vratio and Vratio10 are from Tony Cooper's Easy Volatility Investing. The CB strategies are contango-backwardation strategies with four thresholds: XIV-Buy, XIV-Sell, VXX-Buy, and VXX-Sell. I picked these strategies because I think they are widely used. The first CB strategy is what got me started investigating slippage. A commenter asked about using a high threshold for contango as a conservative buy indicator. He proposed 10%, but didn't supply any further thresholds, so I picked three more in the same conservative spirit and ran a backtest. It gave a decent return. I thought it would be a good idea to backtest over a longer timeframe, so I set up to test with the index, checked my setup by backtesting with the index over the same timeframe as the ETN... and well, you'll see what happened! The third CB strategy uses thresholds that I already knew would probably do fairly well, and the second is a hybrid of the other two. All backtests ran from 03-Jan-2011 to 12-Feb-2016.

Let's get straight to the results:

Backtesting with ETN vs. with Index

Strategy         ETN Gain    Index Gain    Net Slippage
Vratio           259.1%      238.6%        6.1%
Vratio10         312.3%      299.3%        3.3%
CB_10_9_-8_-7    170.1%      64.2%         64%
CB_5_2_-8_-7     227.8%      84.2%         78%
CB_5_2_-5_-4     328.8%      162.0%        64%

With the two Vratio tests, we see a small amount of slippage, consistent with the expectation that positive and negative slippages would likely cancel out. But the three contango-backwardation backtests had extremely large net slippages. So much so that while its index performance puts CB_5_2_-5_-4 in the middle of the pack for net gains, tracking errors moved it to first place in the ETN results!

If these results are correct (and I'd encourage readers to check me on this since it's quite surprising), we must accept that backtesting with index data is not a good proxy for ETN performance -- at least for some trading strategies.

Is there anything we can do to get around this problem? One possibility is to check each strategy for excessive slippage in the overlapped period when both ETN and index are available. If slippage is mild, that suggests the strategy is evenly distributing the conditions that give positive and negative tracking errors, and may be more reliably backtested over a longer period. With strategies that do show bias, a deeper analysis may make it possible to adjust for that bias. On a related note, if short-lived tracking errors can make this much difference, it would be helpful to occasionally re-evaluate the slippage effects of strategies one uses, to see if they've changed.
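A minimal Python sketch of such a screening step follows, using the gains from the results table. The 10% cutoff and the function names are assumptions chosen for illustration; the article does not propose a specific threshold. Net slippage is computed with the definition given in the Notes.

```python
# (ETN gain, index gain) per strategy, taken from the results table.
RESULTS = {
    "Vratio": (2.591, 2.386),
    "Vratio10": (3.123, 2.993),
    "CB_10_9_-8_-7": (1.701, 0.642),
    "CB_5_2_-8_-7": (2.278, 0.842),
    "CB_5_2_-5_-4": (3.288, 1.620),
}

MAX_NET_SLIPPAGE = 0.10  # hypothetical cutoff, not from the article

def net_slippage(etn_gain, index_gain):
    """Solve (1 + P) = (1 + I) * (1 + s) for s over the whole backtest."""
    return (1 + etn_gain) / (1 + index_gain) - 1

def screen(results, max_slippage):
    """Map each strategy to True if its net slippage is within the cutoff."""
    return {
        name: abs(net_slippage(etn, index)) <= max_slippage
        for name, (etn, index) in results.items()
    }

for name, ok in screen(RESULTS, MAX_NET_SLIPPAGE).items():
    s = net_slippage(*RESULTS[name])
    verdict = "index backtest plausible" if ok else "biased by tracking error"
    print(f"{name:<15} {s:6.1%}  {verdict}")
```

With this cutoff, the two Vratio strategies pass and the three contango-backwardation strategies do not. Passing the screen only suggests that a longer index backtest may be more reliable for that strategy; it does not guarantee it.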

Finally, it's my opinion that futures data prior to 2006Q3 is not valid for backtesting these ETNs in any case. The reason is that M1 and M2 (the first- and second-month contracts) are not consistently present in the futures data prior to that time. Since it's questionable whether we can do a meaningful backtest even with M1 and M2 data, taking on the substantial additional uncertainty of missing futures data is surely overreach.

Notes

Definition of "slippage" as used in this article:

I define a change in the ETN as the product of the change in the index and the change due to slippage:

(1+P) = (1+I)*(1+s)

Where

P is the gain/loss rate of the ETN over some time T,

I is the gain/loss rate of the Index over time T, and

s is the slippage factor over time T.

Rearranging this definition, slippage is calculated as

s = ((1+P)/(1+I))-1.
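A useful consequence of this definition is that slippage compounds the same way returns do: the slippage over a period equals the compounded daily slippages. The short Python sketch below checks this on made-up numbers. The function names and sample data are assumptions for illustration only.

```python
import math

def slippage(p, i):
    """s = ((1 + P) / (1 + I)) - 1"""
    return (1 + p) / (1 + i) - 1

def compound(rates):
    """Combine a sequence of per-day rates into one rate for the period."""
    return math.prod(1 + r for r in rates) - 1

# Made-up daily gain/loss rates for illustration only (not real data).
etn_daily = [0.020, -0.051, 0.012]
index_daily = [0.021, -0.055, 0.010]

daily_s = [slippage(p, i) for p, i in zip(etn_daily, index_daily)]
period_s = slippage(compound(etn_daily), compound(index_daily))

print(f"compounded daily slippages: {compound(daily_s):.6f}")
print(f"slippage over the period:   {period_s:.6f}")
```

The two printed values agree up to floating-point rounding. A small one-sided daily slippage, repeated on many of the days a strategy is invested, can therefore compound into a large net slippage over a multi-year backtest.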

Does the ProShares Short VIX Short-Term Futures ETF (SVXY) have less slippage than XIV?

While I did not backtest with SVXY, I did plot its daily slippages, and they're very similar to XIV.

Disclosure: I/we have no positions in any stocks mentioned, but may initiate a long position in EITHER XIV OR VXX over the next 72 hours.

I wrote this article myself, and it expresses my own opinions. I am not receiving compensation for it. I have no business relationship with any company whose stock is mentioned in this article.