By Leo Chen, Ph.D.

On the first day of class, our econometrics professor warned us to be careful with correlation: A strong correlation doesn't necessarily mean causality. This is the case with the VIX. It is strongly negatively correlated with the S&P 500, but the relationship may not be causal. I will never forget the example our professor gave us: The number of people who drowned by falling into a swimming pool, it turns out, is highly correlated with the number of movies Nicolas Cage has appeared in. Unlike school vending machines, which can be causally linked to childhood obesity, Nicolas Cage didn't need a scientific study to prove his innocence.

However, investors sometimes mistakenly assume causality between seemingly related events. If a rise in the VIX is accompanied by some down days in the stock market, did the VIX cause stocks to fall? To answer this question, we need to know how the VIX is calculated. The VIX uses inputs from the options market, one of which is the forward index level derived from short-term, out-of-the-money option prices. But this construction calls for caution about endogeneity: The VIX is derived from traders' perceptions about the future, but actual futures prices also affect traders' perceptions. How can we be sure that the VIX causes stock prices to change? If anything, the VIX is more likely the effect than the cause in this relationship. Analogously, more firemen are sent to bigger fires: The size of the fire is the cause, and the number of firemen is the effect. Because it is built from perceptions about the future, the VIX more closely resembles the firemen, who respond in numbers according to the size of the fire, than the fire itself.
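The endogeneity point can be illustrated with a toy simulation. This is only a sketch under stated assumptions: the `news` shock, the coefficients, and the two series are hypothetical and are not the VIX or the S&P 500. Neither series enters the other's equation, yet the two still come out strongly negatively correlated because both respond to the same hidden driver.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate_common_cause(n_days=2000, seed=7):
    """Hypothetical toy model: one hidden 'news' shock drives both series."""
    rng = random.Random(seed)
    index_returns, vol_changes = [], []
    for _ in range(n_days):
        news = rng.gauss(0.0, 1.0)  # unobserved common driver (assumption)
        # Good news lifts the index and lowers the volatility gauge.
        # Neither series appears in the other's equation.
        index_returns.append(0.8 * news + rng.gauss(0.0, 0.5))
        vol_changes.append(-1.5 * news + rng.gauss(0.0, 0.8))
    return index_returns, vol_changes


index_returns, vol_changes = simulate_common_cause()
print(f"correlation: {pearson(index_returns, vol_changes):.2f}")
```

Correlation is also symmetric: swapping the two arguments of `pearson` returns the same number, so the statistic by itself cannot say which series is the fire and which is the firemen.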

What is causation as opposed to correlation, then? Here's an example of causation: Consuming alcohol can cause a hangover the next morning. Unfortunately, causality is not as easy to identify in finance as it is in the case of having too much to drink. Oftentimes, what appears to be causation turns out to be just correlation. But why is causation so important for investors? The reason is simple: If a variable can cause stock prices to change, then it is a predictor. Who wouldn't like to know tomorrow's stock prices? If we have causation, we have a way to anticipate what's coming. Why is it so difficult to identify a causal relationship? It is not because there is a limited supply of crystal balls; rather, the stock market is too complex. The stock market is sensitive to all information pertaining to the future. Therefore, it's not just one variable but many that affect the stock market. Those variables can be as simple as a new product or a quarterly earnings report, or as intricate as a tax bill or an interest rate change.

At this point, we seem to be mired in a paradox: If there are so many factors that can cause stock prices to change, why is it so difficult to identify one? The answer, once again, lies in the complexity of the market. Investors and financial engineers have studied the stock market extensively. Although many models, such as the Fama-French three-factor model, have been created to explain stock prices, no model yet invented captures the complex interactions of the stock market. This limitation makes it extremely difficult to test any variable against the so-called benchmark. Currently, one of the most-used ways to mitigate this problem is to test one variable at a time against a bundle of existing factors. Nevertheless, our modern research methodologies still only allow us to conclude with a probability rather than certainty. To complicate matters further, the stock market is dynamic. This challenges any model used to predict the stock market. In other words, a factor may have a causal relationship with the stock market at a certain time but not at all times. The explanation is straightforward. If a factor were known to cause stock prices to change, then investors would use that factor repeatedly. By Goodhart's Law, that factor would be rendered ineffective over time.
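Testing one variable against a bundle of existing factors can be sketched as a regression with controls. The example below is a minimal illustration on synthetic data, not actual Fama-French data: the three `factors`, the `candidate` series, and all coefficients are assumptions made up for the demonstration. The candidate merely echoes the first factor and has no effect of its own on returns, so it looks like a predictor when tested alone and fades once the existing factors are included.

```python
import random


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ols(columns, y):
    """Least-squares coefficients; the intercept is the first coefficient."""
    design = [[1.0] * len(y)] + columns  # each regressor is a list of length n
    xtx = [[sum(u * v for u, v in zip(ci, cj)) for cj in design] for ci in design]
    xty = [sum(u * v for u, v in zip(ci, y)) for ci in design]
    return solve_linear_system(xtx, xty)


rng = random.Random(11)
n_obs = 3000
# Three hypothetical "existing factors" (stand-ins, not real factor data).
factors = [[rng.gauss(0.0, 1.0) for _ in range(n_obs)] for _ in range(3)]
# The candidate mostly echoes the first factor; it has no effect of its own.
candidate = [0.9 * f + rng.gauss(0.0, 0.5) for f in factors[0]]
returns = [
    0.5 * f1 + 0.3 * f2 - 0.2 * f3 + rng.gauss(0.0, 1.0)
    for f1, f2, f3 in zip(*factors)
]

alone = ols([candidate], returns)                # candidate tested alone
controlled = ols(factors + [candidate], returns)  # candidate plus the bundle
print(f"candidate slope, tested alone:     {alone[1]:.3f}")
print(f"candidate slope, with the factors: {controlled[4]:.3f}")
```

A real test would also report standard errors and a significance level, which is why the conclusion comes as a probability rather than a certainty. This sketch prints only the point estimates.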

Finally, the key difference between causation and correlation is simple. Causation requires that A happen before B, while correlation only says that A and B tend to move together. Let's use the figure below, for instance. As a rule of thumb, household income is likely to lead personal expenditures, which may explain why a slowdown in income precedes each decrease in expenditures.

Personal Consumption Expenditures and Median Household Income in the United States
Source: St. Louis Fed.
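The idea that one series leads another can be checked with lagged correlations. The sketch below uses synthetic data rather than the St. Louis Fed series in the figure: `income_growth`, `spending_growth`, and the two-period lag are assumptions built into the toy data. Keep in mind that finding a lead only shows that A comes before B; it does not prove that A causes B.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def lagged_correlation(leader, follower, lag):
    """Correlate leader[t] with follower[t + lag], for lag >= 0."""
    if lag == 0:
        return pearson(leader, follower)
    return pearson(leader[:-lag], follower[lag:])


rng = random.Random(3)
n_periods = 600
true_lag = 2  # assumption baked into the toy data, not an empirical estimate

income_growth = [rng.gauss(0.0, 1.0) for _ in range(n_periods)]
# Spending responds to income from two periods earlier, plus noise.
spending_growth = [
    (0.7 * income_growth[t - true_lag] if t >= true_lag else 0.0)
    + rng.gauss(0.0, 0.6)
    for t in range(n_periods)
]

correlations = {
    lag: lagged_correlation(income_growth, spending_growth, lag)
    for lag in range(5)
}
best_lag = max(correlations, key=correlations.get)
for lag, value in correlations.items():
    print(f"lag {lag}: correlation {value:+.2f}")
print(f"strongest at lag {best_lag}")
```

In this toy setup the same-period correlation is close to zero even though income drives spending by construction, so looking only at what happens "at the same time" would miss the relationship entirely.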