Date: 2020-08-10

Figure 1: overview of our backtesting results (basis: 1)

This is the first article in a series in which we (Andrii and I) give you our take on how you can, how you should and how you shouldn’t trade volatility. In the series we plan to cover a range of volatility forecasting models (Historical Volatility/ARMA/GARCH) and different optimization methods for them (SR/OLS/ML), as well as answer other questions related to volatility trading. This article gives an introduction to volatility trading for those not familiar with the concept and also aims to answer the following research questions:

Which of the Historical Volatility models tested (Historical Mean/Simple Moving Average/Exponential Moving Average) yields the highest out-of-sample Sharpe Ratio?

Which optimization method is more suitable for trading: Sharpe Ratio or OLS?

Which volatility should you forecast: Implied or Realized?

Whether you are researching your first investment or are already an experienced trader, you most likely know what COVID-19 did to the financial markets. Since the beginning of 2020 more ETFs have closed than opened, and the behaviour of major equity indices has resembled a rollercoaster more than an investment opportunity. To make matters even worse, fixed income markets do not look particularly attractive either, with interest rates being negative or only slightly positive in most developed economies.

As happens after each economic downturn, the market is now extra curious about alternative investments, which aim to deliver positive returns irrespective of the market environment. In this article, we would like to present one such alternative investment strategy: volatility trading.

The idea of volatility trading is simple: you estimate the future volatility of a certain underlying and then trade the volatility of that underlying to make a profit. In our case, we forecast the volatility of the S&P 500 and trade it by comparing our forecast with the market's expectation of volatility. As mentioned in the introduction, the forecasting models we work with in this article are Historical Volatility models. We use them to predict either implied or realized volatility, and we test a common (OLS) and an uncommon (Sharpe ratio) optimization method. The following two sections give you a brief introduction to the VIX, VIX products, our methodology and the forecasting models.

ABOUT THE VIX AND WHY IT IS DIFFICULT TO TRADE

The expected volatility of the S&P 500 is measured by the VIX, an index that aims to derive the market's expectation of future volatility from the implied volatility of options on the S&P 500 that expire on average in 30 days. There is a problem with the VIX though - you can’t trade it. As the VIX is no more than a mathematical indicator, what you actually can trade are VIX futures and VIX ETNs. The product that we will use to trade the VIX is the VXX, which is a VIX ETN where the weights of each VIX future are selected in such a way that the average expiration of the futures is 30 days. Going more into detail on why we decided to use the VXX instead of other VIX products would be beyond the scope of this article, but our choice is based on a more extensive analysis of VIX products, which you can find here.
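The 30-day weighting can be illustrated with a little arithmetic. The sketch below is a simplification under stated assumptions, not the actual index methodology behind the VXX: it takes two futures with hypothetical days to expiration and solves for the weights that make the weighted average maturity equal to 30 days. The function name and the example maturities are made up for illustration.

```python
def constant_maturity_weights(days_front, days_back, target_days=30.0):
    """Return (front, back) weights whose weighted average maturity is target_days.

    Simplified illustration only: assumes exactly two futures and
    days_front <= target_days <= days_back with days_front < days_back.
    """
    if not (days_front <= target_days <= days_back and days_front < days_back):
        raise ValueError("target must lie between the two maturities")
    weight_front = (days_back - target_days) / (days_back - days_front)
    return weight_front, 1.0 - weight_front


# Hypothetical example: the front future expires in 12 days, the next in 40 days.
w_front, w_back = constant_maturity_weights(12, 40)
average_maturity = w_front * 12 + w_back * 40
print(round(w_front, 3), round(w_back, 3), round(average_maturity, 1))
```

As the front future approaches expiration, its weight shrinks and the weight of the next future grows, which is why such a product constantly rolls from one contract into the next.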

Before we move on, it is, however, important to emphasize that neither VIX futures nor VIX ETNs can actually track the VIX. As the figure below demonstrates, while the VIX always reverts back to its long-term average over time, the VXX slowly decays towards 0.

Figure 2: VXX vs VIX. Please note the logarithmic scale of the VXX. In 2008, the SPVXSTR index was used as a proxy for the VXX

The reason for that behaviour is simple: as it is generally known that the VIX reverts to its mean, the further its current level is above the mean, the fewer investors are willing to buy it and vice versa. In order to compensate investors for that, the term structure of the VIX futures changes depending on the current VIX level. When the VIX is above its mean, the futures are in backwardation, implying that investors collect a roll yield when they are willing to buy the VIX despite its mean-reverting property. When the VIX is below its long-term mean, VIX futures are in contango, resulting in the opposite behaviour: a long position pays a roll yield. As the VIX spends most of its time below its long-term mean, a long position in VIX futures, and hence the VXX, tends to pay this cost most of the time, which produces the slow decay.

Figure 3: Term Structure of VIX futures

VOLATILITY FORECASTING MODELS: HISTORICAL VOLATILITY

Historical Volatility models are the simplest models one can use to forecast volatility. All you need are the previous volatility levels; almost no data manipulation is required. We have picked three models that we will test in this article:

Historical Mean

The main idea behind this model is the mean-reversion of the VIX. Each day we calculate the mean of all previous closing prices of the VIX for our IVOL forecast, and the mean of all previous annualized realized volatilities for our RVOL forecast. We then take a long position in the VXX if that mean is above the current VIX level and a short position if it is below. This is the only model that does not require optimization, as it has no free parameters.
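The Historical Mean forecast can be sketched as an expanding average. This is a minimal illustration, not the code used for the backtest: the function name and the sample values are made up, and we assume here that the mean includes the most recent close.

```python
def historical_mean_forecast(levels):
    """For each day, return the mean of all observations up to and including that day."""
    forecasts = []
    running_sum = 0.0
    for count, level in enumerate(levels, start=1):
        running_sum += level
        forecasts.append(running_sum / count)
    return forecasts


# Made-up VIX closing levels, for illustration only.
vix_closes = [20.0, 22.0, 18.0, 16.0]
forecasts = historical_mean_forecast(vix_closes)
print(forecasts)
```

The same function applies to the RVOL variant if the input is a series of annualized realized volatilities instead of VIX closes.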

Simple Moving Average

The Simple Moving Average is a model that is based on the phenomenon called volatility clustering. The best way to describe it is to quote Mr Mandelbrot: “large changes tend to be followed by large changes, of either sign, and small changes tend to be followed by small changes”. This means that in the short run volatility is likely to stay at the same level, be it high or low. To capitalize on this phenomenon, each day we calculate the mean of the last n closing prices of the VIX (or the mean of the last n annualized realized volatilities) and expect the VIX to revert back to that short-run mean, i.e. to stay at the same level. As you might have already guessed, this model has to be optimized for the number of days n.
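A rolling mean over the last n observations could look like the sketch below. The function name and data are illustrative assumptions; days with fewer than n observations are given no forecast, which is one of several possible conventions.

```python
def sma_forecast(levels, n):
    """Mean of the last n observations for each day; None until n observations exist."""
    if n < 1:
        raise ValueError("n must be at least 1")
    forecasts = []
    for i in range(len(levels)):
        if i + 1 < n:
            forecasts.append(None)  # not enough history yet
        else:
            window = levels[i + 1 - n : i + 1]
            forecasts.append(sum(window) / n)
    return forecasts


# Made-up VIX closing levels, for illustration only.
vix_closes = [20.0, 22.0, 18.0, 16.0]
print(sma_forecast(vix_closes, n=2))
```

With n = 1 the forecast is simply the latest observation, so it never differs from the current level.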

Exponential Moving Average

This model is an extension of the Simple Moving Average model. Here we assume that more recent observations are more useful to us than older ones, and thus we assign more weight to them. In other words, we smooth the time series with weights that decrease exponentially over time. The weights assigned depend on the exponential smoothing factor, which in turn depends on the number of days n. Therefore this model also has to be optimized for n.
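The exponential smoothing can be sketched as a simple recursion. The article does not state how the smoothing factor is derived from n, so the common convention alpha = 2 / (n + 1) is assumed here, and the series is seeded with its first observation; both choices are assumptions for illustration.

```python
def ema_forecast(levels, n):
    """Exponentially weighted average with smoothing factor alpha = 2 / (n + 1) (assumed)."""
    if not levels:
        return []
    alpha = 2.0 / (n + 1)
    ema = levels[0]  # seed with the first observation (an assumption)
    forecasts = [ema]
    for level in levels[1:]:
        # Newer observations get weight alpha; the older average decays by (1 - alpha).
        ema = alpha * level + (1.0 - alpha) * ema
        forecasts.append(ema)
    return forecasts


# Made-up VIX closing levels, for illustration only.
vix_closes = [20.0, 22.0, 18.0, 16.0]
print(ema_forecast(vix_closes, n=3))
```

A small n gives a large alpha and a forecast that follows the latest observations closely, while a large n gives a smoother and slower forecast.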

To derive a trading decision, the forecasts of the volatility models listed above are compared with the level of the VIX. Whenever the forecast of the model is higher than the VIX, we enter a long position in the VXX and vice versa. If you are interested in the methodology of our backtest and optimization, you can find it in the appendix below the article.
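The decision rule itself fits in a few lines. In this sketch the function name is hypothetical, and staying flat when the forecast exactly equals the VIX is an assumption, since that case is not spelled out above.

```python
def vxx_position(forecast, vix_level):
    """Return +1 (long VXX), -1 (short VXX) or 0 (flat, assumed for an exact tie)."""
    if forecast > vix_level:
        return 1
    if forecast < vix_level:
        return -1
    return 0


# Made-up (forecast, VIX) pairs, for illustration only.
positions = [vxx_position(f, v) for f, v in [(21.0, 18.0), (15.0, 18.0), (18.0, 18.0)]]
print(positions)
```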

RESULTS

In this chapter, we will provide you with an overview of our backtest results.

Figure 4: performance of realized volatility models where the models were OLS optimized (basis: 1)

Figure 4 shows that when RVOL predictions are OLS optimized, SMA outperforms EMA and EMA outperforms the historic mean. When these models are Sharpe ratio optimized, the results are slightly different. While SMA and EMA keep outperforming the historic mean, SMA now underperforms EMA, as demonstrated in figure 5. Comparing the two figures shows the dominance of SR optimization over OLS optimization, with SR optimized models performing much better than OLS optimized ones.

Figure 5: performance of realized volatility models where the models were Sharpe ratio optimized (basis: 1)

When figures 4 and 5 are compared to figures 6 and 7, it can clearly be seen that all RVOL models outperform the IVOL models, with all IVOL models yielding negative results. Figures 6 and 7 demonstrate that for IVOL the historic mean is the best estimation, even though “best” has to be taken with a grain of salt, given that all models generated a negative return.

Figure 6: performance of implied volatility models where the models were OLS optimized (basis: 1). SMA was excluded as the optimized n was 1, implying that today’s volatility is tomorrow’s volatility. As this would imply that markets are perfect, the optimal thing would be not to trade, which is why we excluded it from the figure

Figure 7: performance of implied volatility models where the models were Sharpe ratio optimized (basis: 1)

While the reasons for the performance differences will be discussed in the next section, it is important to briefly address one crucial point: luck. In the out-of-sample period, there was one day (05.02.2018) on which the VIX spiked more than 100%, leading to a massive spike in the VXX as well. This event is analyzed in more detail in this article, but for our purposes it is sufficient to mention that it skews our analysis. The IVOL Hist. Mean strategy (figure 6) is the best example: it would usually massively underperform the SMA and the EMA model, but since it “got it right” on that one day, it becomes the best performing IVOL model.

Table 1: out-of-sample Sharpe ratios

As this one-day event has a massive impact on the Sharpe ratios, we decided to show you the final results in two tables. The table above contains the unadjusted Sharpe ratios, while the table below contains the Sharpe ratios assuming we do not trade on that Monday in February 2018. The table above is therefore a better reflection of the real world, as it shows the risks associated with this sort of trading, while the table below is probably better for comparing the performance of our models.

Table 2: out-of-sample Sharpe ratios excluding 05.02.2018
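How strongly a single day can move a Sharpe ratio can be illustrated with a small sketch. The returns below are made up and are not our backtest results. The annualization with 252 trading days and a risk-free rate of zero are assumptions, as the exact Sharpe ratio formula is not spelled out in this article.

```python
import math
import statistics


def annualized_sharpe(daily_returns, periods_per_year=252):
    """Mean over sample standard deviation of daily returns, scaled by sqrt(252).

    Assumes a risk-free rate of zero.
    """
    mean_return = statistics.mean(daily_returns)
    volatility = statistics.stdev(daily_returns)
    return mean_return / volatility * math.sqrt(periods_per_year)


# Made-up daily strategy returns keyed by ISO date; 2018-02-05 is the outlier day.
returns_by_date = {
    "2018-02-01": 0.002,
    "2018-02-02": -0.001,
    "2018-02-05": -0.20,
    "2018-02-06": 0.003,
    "2018-02-07": 0.001,
}

sharpe_all_days = annualized_sharpe(list(returns_by_date.values()))
sharpe_excluding = annualized_sharpe(
    [r for day, r in returns_by_date.items() if day != "2018-02-05"]
)
print(sharpe_all_days, sharpe_excluding)
```

In this toy series the single outlier turns a positive Sharpe ratio into a negative one, which is the kind of distortion the second table is meant to remove.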

INTERPRETATION

IVOL vs. RVOL

When it comes to the performance of the strategy, it is quite evident that RVOL models outperform IVOL models. A potential reason for this is the input into the models. The IVOL models consider the last n closing prices of the VIX and weight them in different ways. The problem is that this results in a mean reversion strategy: as long as the VIX is dropping, the model will suggest a long position and vice versa. While the VIX is indeed mean-reverting, the VXX is not, which explains the poor performance. It is a different story with the RVOL models, which deliver decent returns. These models always consider the last n RVOL values, and what is important to know is that they consistently underestimate implied volatility when the markets are calm and overestimate implied volatility when markets are crashing, as demonstrated by figure 8 below.

Figure 8: comparison of the VIX/100 (IVOL) with realized volatility (RVOL)

As a result, the model would mostly short volatility when futures are in contango and go long volatility when futures are in backwardation. This allows the investor to collect the previously mentioned roll yield, so the model implicitly takes the term structure into consideration. While we mainly focus on the term structure in our analysis, it is worth mentioning that what we are actually measuring by comparing RVOL with the VIX is the volatility risk premium. As a discussion of how the term structure and the volatility risk premium are related would be beyond the scope of this article, we will skip it at this point. Nevertheless, it is important to mention it, as the thesis that IVOL and RVOL predictions are comparable is one that certainly can be challenged.
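The comparison of RVOL with the VIX can be sketched as follows. The article does not specify how realized volatility is computed, so this sketch assumes the sample standard deviation of daily log returns over a short window, annualized with 252 trading days. The window length and all prices are made up for illustration.

```python
import math
import statistics


def realized_volatility(closes, window, periods_per_year=252):
    """Annualized standard deviation of the last `window` daily log returns (assumed definition)."""
    if len(closes) < window + 1:
        raise ValueError("need at least window + 1 closing prices")
    recent = closes[-(window + 1):]
    log_returns = [math.log(recent[i] / recent[i - 1]) for i in range(1, len(recent))]
    return statistics.stdev(log_returns) * math.sqrt(periods_per_year)


# Made-up SPX closes and a made-up VIX level, for illustration only.
spx_closes = [3000.0, 3015.0, 2990.0, 3005.0, 3020.0, 3010.0]
vix_level = 18.0

rvol = realized_volatility(spx_closes, window=5)
implied = vix_level / 100.0  # the VIX is quoted in percentage points
risk_premium = implied - rvol
print(round(rvol, 4), round(risk_premium, 4))
```

A positive difference means implied volatility is above realized volatility, which corresponds to the calm-market case described above.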

OLS vs. SHARPE RATIO OPTIMIZATION

As the evidence suggests, Sharpe ratio optimized models perform better in the out-of-sample period than the OLS optimized models. Since Sharpe ratio optimization is not very common for models of this kind, we believe it is worth discussing this topic in more detail. The OLS optimization aims to find the best fitting model to correctly estimate the next day’s VIX, whereas the Sharpe ratio optimization only aims to choose the parameters in such a way that the maximum Sharpe Ratio is generated. In the end the best performing strategy will be the directionally correct one, not the one that delivers the smallest prediction error. This is why, even if the OLS model’s prediction is closer to the final value of the VIX than the SR model’s, the SR model will deliver the better performance whenever it is directionally correct and the OLS model is not. To demonstrate that higher precision does not by itself translate into good trading performance, we plotted the loss function of the best performing RVOL-OLS and the best performing RVOL-SR model in the figure below.

Figure 9: comparison of squared error loss functions for the EMA OLS RVOL and EMA SR RVOL optimization
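The difference between precision and directional correctness can be made concrete with a toy example. All numbers are made up, and the sketch assumes for simplicity that the VXX rises on a day on which the VIX rises.

```python
vix_today = 20.0
vix_tomorrow = 22.0
vxx_return = 0.05  # assumed VXX move on the day the VIX rises

# Two hypothetical forecasts of tomorrow's VIX.
forecasts = {
    "precise_but_wrong_side": 19.5,    # close to 22, but below today's VIX
    "imprecise_but_right_side": 26.0,  # further from 22, but above today's VIX
}

results = {}
for name, forecast in forecasts.items():
    squared_error = (forecast - vix_tomorrow) ** 2
    position = 1 if forecast > vix_today else -1  # long if forecast above the VIX, else short
    results[name] = (squared_error, position * vxx_return)

for name, (squared_error, pnl) in results.items():
    print(name, squared_error, pnl)
```

The first forecast has the smaller squared error but ends up short on a day when the VXX rises, while the less precise forecast is on the right side of the trade.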

Please keep in mind that the aim of this article is NOT to develop a volatility forecasting model that is ideal to price options for example, but one that can act as an indicator for a trading strategy. If the goal was to get the best estimation of volatility possible, our approach would have been totally different.

CONCLUSION

To sum it up, this article quite clearly demonstrates that due to the term structure of VIX futures, predicting realized volatility is a more sensible choice for a long/short VXX trading strategy than predicting implied volatility. Moreover, it was demonstrated that while OLS optimization might result in a higher prediction accuracy, Sharpe ratio optimization is better suited for a trading strategy. In addition, we saw that both SMA and EMA models yield quite attractive results and clearly outperform the historic mean model. Would we consider trading these strategies ourselves? Hell no! Having 25% of your money in an unhedged short VXX position is a very risky trade that can lead to massive day-over-day losses, as demonstrated by some of our backtests. How a short VXX position can be hedged was discussed in the article mentioned above, and we consider it a topic of further research to assess whether VXX put options would be a suitable substitute for “cash” VXX.

As we mentioned at the very beginning, this article is the first in our series of articles about volatility trading. The topic of our next article will be more sophisticated historical volatility models. If you are interested, please subscribe to our accounts (Andrii, Michael). As we will be posting the articles alternately, the next one will be posted on Andrii's page!

If you have any thoughts about what we did please share them in the comment section, we’d love to have a discussion!

You can also follow us on Medium:

Andrii

Michael

APPENDIX: Methodology

Backtest

The dataset we will be using is historical daily closing data from 01.01.2008 - 09.06.2020 of the SPX, VXX and VIX. The in-sample period for which we optimize our models is 01.01.2008 - 31.12.2015 and the out-of-sample period where we test their performance is 01.01.2016 - 09.06.2020. We assume that we start with 10 000 000 $, trade daily at market close and always allocate 25 % of our capital into either a long or short position in the VXX based on our estimation of volatility tomorrow. Furthermore, it is assumed that the cash position does not earn any interest, the bid/ask spread for the VXX is 0.01 $ and there are no shorting fees for the VXX.
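The capital allocation described above can be sketched as a simple loop. This is a simplified illustration and not the backtest behind our results: it ignores the bid/ask spread, uses made-up prices and signals, and assumes that the position decided at one close earns the VXX return up to the next close.

```python
def backtest(signals, vxx_closes, start_capital=10_000_000.0, allocation=0.25):
    """Equity curve for a rule that puts `allocation` of capital long (+1) or short (-1) VXX.

    signals[t] is the position chosen at the close of day t and held until the
    close of day t + 1. Cash earns no interest; trading costs are ignored here.
    """
    equity = [start_capital]
    for t in range(len(vxx_closes) - 1):
        vxx_return = vxx_closes[t + 1] / vxx_closes[t] - 1.0
        pnl = equity[-1] * allocation * signals[t] * vxx_return
        equity.append(equity[-1] + pnl)
    return equity


# Made-up VXX closes and positions, for illustration only.
vxx_closes = [100.0, 90.0, 99.0, 108.9]
signals = [-1, -1, 1]

equity_curve = backtest(signals, vxx_closes)
print([round(value, 2) for value in equity_curve])
```

Because only a quarter of the capital is exposed, a 10% move in the VXX changes the portfolio value by 2.5% in either direction.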

Optimization

As we mentioned above, we have several variations of each model. The first distinction is the type of volatility forecasted: we try forecasting both implied and realized volatility (if you made it this far in the article and still don’t know the difference - the time is now!). The second distinction is the optimization method used: we test OLS and Sharpe ratio optimization. OLS optimization means adding up all the squared differences between actual volatility and forecasted volatility in the in-sample period and running the optimization tool to minimize that sum. In this way you minimize the forecasting error of your model. Optimizing for the Sharpe ratio is nothing more than telling the optimization tool to pick the values of the variables in such a way that the in-sample Sharpe ratio is maximized. As what we are developing are trading strategies, all the models are compared based on the out-of-sample Sharpe ratio.
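The two objectives can be sketched as a grid search over n. This is an illustration under several assumptions, not the optimization tool used for our results: it uses the SMA model on VIX closes, made-up data, a handful of candidate values for n, a Sharpe ratio that is not annualized (which does not change the ranking), and it ranks a rule that never trades last.

```python
import statistics


def sma(levels, n, t):
    """Mean of the n observations ending at index t (inclusive)."""
    return sum(levels[t + 1 - n : t + 1]) / n


def sum_squared_errors(levels, n, start):
    """OLS-style objective: squared gap between the forecast made at t and the level at t + 1."""
    return sum(
        (sma(levels, n, t) - levels[t + 1]) ** 2
        for t in range(start, len(levels) - 1)
    )


def in_sample_sharpe(vix, vxx_returns, n, start):
    """Sharpe-style objective: mean over standard deviation of the rule's daily returns."""
    strategy_returns = []
    for t in range(start, len(vix) - 1):
        forecast = sma(vix, n, t)
        signal = (forecast > vix[t]) - (forecast < vix[t])  # +1 long, -1 short, 0 flat
        strategy_returns.append(signal * vxx_returns[t + 1])
    spread = statistics.pstdev(strategy_returns)
    if spread == 0:
        return float("-inf")  # no variation, e.g. a rule that never trades
    return statistics.mean(strategy_returns) / spread


# Made-up in-sample data; vxx_returns[t] is the VXX return from day t - 1 to day t.
vix = [20.0, 22.0, 19.0, 25.0, 30.0, 24.0, 21.0, 23.0, 27.0, 22.0, 20.0, 19.0]
vxx_returns = [0.0, 0.04, -0.05, 0.10, 0.08, -0.09, -0.05, 0.03, 0.06, -0.07, -0.04, -0.02]

candidates = range(1, 6)
start = max(candidates) - 1  # same evaluation days for every candidate n

best_n_ols = min(candidates, key=lambda n: sum_squared_errors(vix, n, start))
best_n_sr = max(candidates, key=lambda n: in_sample_sharpe(vix, vxx_returns, n, start))
print(best_n_ols, best_n_sr)
```

Note that with n = 1 the forecast always equals the current VIX, so the rule never trades; this is the case that led to the SMA model being excluded from figure 6.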

Disclosure: I/we have no positions in any stocks mentioned, and no plans to initiate any positions within the next 72 hours. I wrote this article myself, and it expresses my own opinions. I am not receiving compensation for it. I have no business relationship with any company whose stock is mentioned in this article.

Additional disclosure: My portfolio trades an algorithm that goes long/short/neutral VXX and hedges short VXX positions with VIX call options. By the time this article gets published, I, therefore, might have a long/short/no exposure to the VXX and a long/no exposure to VIX call options.