Summary
- While we cannot totally eliminate our bias, we can look at data objectively to reach a less biased conclusion.
- Black Monday has a very significant impact on the statistical risk measures for the S&P 500.
- It is justified for long-term investors to remove Black Monday from the risk analysis.
- If Black Monday is included, then gold has somewhat lower risk measures than stocks; if it is excluded, stocks appear safer.
There is no such thing as emotionless investing, and few investments bring out emotions as strong as gold does. For some, gold is the ultimate safe haven for their wealth, shielded from the influence of central banks' "money printing." An even more extreme example is the belief that a return to the gold standard is inevitable. Ron Paul is likely the most prominent figure to advocate gold (and the gold standard), having said that "the United States ought to link its currency to gold or silver again" (The New York Times, July 22, 2007). On the other side of the argument, figures like Keynes have called gold a "barbarous relic." It's not hard to find criticisms of gold as an investment; I published an article critical of gold a short time ago on Seeking Alpha, to give one example (other, non-self-promoting examples: Bloomberg View and The Irrelevant Investor). Google searches bring up countless other examples of both positive and negative opinions on gold.
Given how much has already been said about gold, it's fair to ask what can be said that has not already been covered somewhere else. Several aspects of gold have remained relatively unexplored, including some of its statistical properties. This brings me back to my first point about emotionless investing being impossible. We cannot, unfortunately, consider any statistical analysis to be free from the analyst's bias. It's all too easy to make the data say what you want it to say; cherry-picking start and end dates is one of the most common ways to do that. All too often, readers are not told what assumptions were made to arrive at a given conclusion. Even so, I think there is a great deal that can be learned by correctly using statistics to analyze gold; that is what I plan to do in this article. I will also explain exactly how I arrived at my conclusions, and I will provide a few counterpoints to think about. My goal is to provide you with an analysis of gold that is as removed as possible from my personal opinions about gold. All of the data I used is publicly available, and all assumptions I make will be stated clearly.
I'll start with the data I used in my analysis. For gold, I used historical data from the Federal Reserve Bank of St. Louis. As you can see in the image above, their source is the London Bullion Market Association. All of these results also hold true for the gold ETF, GLD. I used all of the provided data, that is, daily prices between 04/01/1968 and 07/18/2014. This provides a sample of 11,707 data points. As a point of reference, Nixon ended dollar-to-gold convertibility in August of 1971. For the stock market, I used the S&P 500 (SPX) as a proxy for the overall market. The data is the daily closing value provided by Yahoo Finance (^GSPC). Again, I used the widest date range possible rather than selecting a specific range myself; the data is from 01/03/1950 to 07/17/2014. There are 16,238 data points in the SPX time series.
Log Returns:
I used the daily log returns in my risk analysis. If you are unfamiliar with the reason for log returns, here is a brief explanation (otherwise, skip this section). In most situations, investors are interested in percentage returns, but there is a problem with using percentage returns in a statistical analysis. Consider a stock with an initial price of $100 per share as of today. If it takes a 50% loss tomorrow, then it will have a price of $50 per share. It will then require a 100% gain for the price to return to breakeven after the 50% drop, so equal-sized percentage moves in opposite directions do not cancel. Contrast this with the logarithmic return, defined as:
r_t = \ln(P_t / P_{t-1})
where P_t is the price on day t and P_{t-1} is the price on the previous day.
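So, in the case of the price going from $100 to $50 per share, the log return is ln(50/100) = -0.69315. In order for the stock to return to breakeven, a log return of +0.69315 is required. Log returns are therefore symmetric, and they are also additive: the log return over an entire year can be calculated by simply adding the daily log returns. I will be stating all my results using log returns; however, if you would like to convert between log returns and percentage returns, the formulas are:
R = e^{r} - 1
r = \ln(1 + R)
where r is the log return and R is the percentage return written as a decimal (so a 50% loss is R = -0.5).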
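The $100 to $50 example above can be checked with a few lines of Python. This is only a minimal sketch: the function names and the three-price series are illustrative assumptions, not part of the original analysis.

```python
import math

def log_return(p_prev, p_next):
    """Log return between two consecutive prices: ln(p_next / p_prev)."""
    return math.log(p_next / p_prev)

def log_to_pct(r):
    """Convert a log return to a percentage return written as a decimal."""
    return math.exp(r) - 1.0

def pct_to_log(R):
    """Convert a percentage return (as a decimal) to a log return."""
    return math.log(1.0 + R)

# Hypothetical price path from the text: $100 -> $50 -> back to $100.
prices = [100.0, 50.0, 100.0]
daily = [log_return(a, b) for a, b in zip(prices, prices[1:])]

# Log returns add up across days, so the round trip sums to (about) zero.
total = sum(daily)

print("daily log returns:", daily)
print("sum over the period:", total)
print("as percentage returns:", [log_to_pct(r) for r in daily])
```

The two daily log returns have the same magnitude and opposite signs, while the corresponding percentage returns are -50% and +100%.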
Results:
Statistic | SPX | Gold |
Mean | 0.0002936 | 0.0003024 |
Standard Deviation | 0.0097358 | 0.0129182 |
Variance | 0.000094787 | 0.0001669 |
Skewness | -1.03088 | 0.0660539 |
Kurtosis | 27.728668 | 13.263068 |
Minimum | -0.228997 | -0.160286 |
Maximum | 0.109572 | 0.1253449 |
Median | 0.0004741 | 0.0000000 |
Data Points | 16,238 | 11,707 |
Start Date | 1/3/1950 | 4/1/1968 |
End Date | 7/17/2014 | 7/18/2014 |
The two images above show the summary statistics for the two data sets (gold is the top image and SPX the bottom). To make it easier to compare, I've recorded the important information in the table above. I should also point out that I am not using the S&P total return index (so no reinvested dividends), which somewhat lowers the average return relative to what it would really be. At first glance, the table seems to strongly support gold for several reasons. First, the average return for gold is slightly higher than for SPX (though SPX would be higher if total returns were used). Second, although SPX has a lower standard deviation (and therefore a lower variance), indicating lower volatility than gold, the values for skewness and kurtosis suggest SPX has far more tail risk than gold. I've talked a little about skewness and kurtosis in the past, but it's worth going over again here. Let's start with skewness (skew for short).
Source: Wikipedia.
I find the picture above to be very helpful for understanding skew. The dotted line shows what a normal distribution looks like, and the red line shows the result of a high positive or negative skew. A good rule of thumb is that skew is significant when skewness is above +1 or below -1. A strong negative skew means the left tail is heavier than the right. In the context of investing, we can interpret that to mean that there is a greater likelihood of significant negative outliers (think Black Swans). The kurtosis is generally considered to measure how heavy a distribution's tails are. Higher kurtosis means extreme events occur more often. Now, there are two common definitions for kurtosis, and statistics software doesn't always say which is used. In the case above, it is excess kurtosis; that means the kurtosis of a normal distribution is 0. In the other definition, the kurtosis of a normal distribution is 3. The two definitions differ only by that constant of 3. I will just write kurtosis from now on, but I am referring to excess kurtosis. I will also use the same rule of thumb as for skew, treating values outside the -1 to +1 range as "significant" kurtosis. Kurtosis is significant for both gold (13.26) and SPX (27.73), indicating both have much heavier tails than a normal distribution. That SPX's kurtosis is slightly more than twice the kurtosis of gold suggests stocks have much heavier tails than gold.
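For readers who want to see how these two measures are computed, here is a minimal sketch using the simple moment-based (population) formulas. It is an assumption that these are close to what the statistics software behind the tables uses; many packages apply small-sample corrections, which barely matter with thousands of data points. The function names and the two tiny samples are hypothetical.

```python
def central_moment(xs, k):
    """k-th central moment of a sample (dividing by n, no bias correction)."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** k for x in xs) / n

def skewness(xs):
    """Third central moment scaled by the standard deviation cubed."""
    return central_moment(xs, 3) / central_moment(xs, 2) ** 1.5

def excess_kurtosis(xs):
    """Fourth central moment scaled by variance squared, minus 3,
    so that a normal distribution scores 0."""
    return central_moment(xs, 4) / central_moment(xs, 2) ** 2 - 3.0

# Hypothetical samples, chosen only to show the sign of the skew.
symmetric = [-2.0, -1.0, 0.0, 1.0, 2.0]
left_tailed = [-10.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0]

print("symmetric skew:", skewness(symmetric))
print("left-tailed skew:", skewness(left_tailed))
print("symmetric excess kurtosis:", excess_kurtosis(symmetric))
print("left-tailed excess kurtosis:", excess_kurtosis(left_tailed))
```

The symmetric sample has zero skew, while the sample with one large negative value has a negative skew, which is the pattern the text associates with significant negative outliers.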
So, it seems that gold is a safer investment according to every statistical measure except the standard deviation. However, I don't think that is a very accurate conclusion for one major reason: Black Monday (October 19, 1987). On Black Monday, the log return for SPX was -0.228997, or a percentage loss of 20.4669%, in just one day. The Black Monday loss was more than twice as large as the next largest one-day loss (in the time period after 1950). Just how significant is Black Monday in the results above? I ran the same analysis, but this time I excluded Black Monday. The results are very different.
Statistic | SPX | SPX ex-Black Monday | Gold |
Mean | 0.0002936 | 0.0003077 | 0.0003024 |
Standard Deviation | 0.0097358 | 0.0095684 | 0.0129182 |
Variance | 0.000094787 | 0.0000915 | 0.0001669 |
Skewness | -1.03088 | -0.242797 | 0.0660539 |
Kurtosis | 27.728668 | 9.6227044 | 13.263068 |
Minimum | -0.228997 | -0.094695 | -0.160286 |
Maximum | 0.109572 | 0.109572 | 0.1253449 |
Median | 0.0004741 | 0.0004751 | 0.0000000 |
Data Points | 16,238 | 16,237 | 11,707 |
Start Date | 1/3/1950 | 1/3/1950 | 4/1/1968 |
End Date | 7/17/2014 | 7/17/2014 | 7/18/2014 |
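How much a single observation can move the higher moments is easy to reproduce on made-up data. The sketch below is not the analysis behind the table: it assumes a synthetic series of 5,000 normally distributed daily log returns (an arbitrary seed, mean, and standard deviation chosen for illustration) with one return overwritten by a Black Monday-sized value of -0.229, and it uses the simple population formulas for each statistic.

```python
import math
import random

def summary(xs):
    """Standard deviation, skewness, excess kurtosis and minimum of a sample,
    using population (divide-by-n) moments."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    return {
        "std": math.sqrt(m2),
        "skew": m3 / m2 ** 1.5,
        "excess_kurtosis": m4 / m2 ** 2 - 3.0,
        "min": min(xs),
    }

# Synthetic data only: NOT the SPX series used in the article.
rng = random.Random(0)
returns = [rng.gauss(0.0003, 0.01) for _ in range(5000)]
crash_index = 2500
returns[crash_index] = -0.229  # one hypothetical crash-sized day

with_crash = summary(returns)
without_crash = summary(returns[:crash_index] + returns[crash_index + 1:])

for name in with_crash:
    print(f"{name:16s} with: {with_crash[name]: .6f}   without: {without_crash[name]: .6f}")
```

Dropping one point out of 5,000 changes the standard deviation only modestly, but it changes the skewness and kurtosis by a lot, because those measures raise each deviation to the third and fourth power.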
By excluding Black Monday, all of the risk measures for SPX in the table are considerably improved, and the average return becomes slightly higher than for gold. The standard deviation (the most common measure of volatility) for SPX is now about 25% lower than for gold. SPX still has a more negative value for skewness than gold, but at -0.24 it is now well inside the -1 to +1 threshold for significance. The kurtosis of SPX becomes less than gold's kurtosis; however, both still have a very significant kurtosis, so tail risk is meaningful for both. The difference between the two values of kurtosis is large enough that I would say gold has a greater tail risk than SPX. My overall conclusion is that, based on the data above, gold is measurably more risky than stocks. Now, removing outliers in order to perform statistical analysis is not something I invented. There are a wide variety of good reasons to remove outliers, but it usually comes down to saying that they are not representative of the situation you are analyzing. Choosing what to remove is always highly subjective, and the choice matters because outliers can have a major impact on the results (see results above). Here are my justifications for removing Black Monday from the analysis.
The graph above shows the 1-day log returns for SPX from 1970 to today. The first thing I want to point out is the difference between Black Monday and the 2008 financial crisis. You can see that the volatility of the returns is far from constant over time and, furthermore, that the volatility is clustered: high volatility is usually followed by high volatility, and low volatility by low volatility. Periods of high volatility are strongly associated with bear markets and crashes, while low volatility mostly occurs during bull markets. The 2008 crisis shows a spike in volatility that appears (visually) symmetric. The highest one-day return occurs during the crisis, as does the second lowest one-day return. In contrast, Black Monday appears to be a one-off event that had a more limited impact over the next several years. For at least three years on either side of Black Monday, the volatility is not significantly different. To me, this suggests that Black Monday is a different kind of event than either the financial crisis or even the dot-com crash. However, looking at rolling returns over periods longer than a day makes an even stronger case for the limited reach of Black Monday.
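One simple way to put a number on the volatility clustering described above is a rolling standard deviation of the daily log returns. The following is a minimal sketch on a hypothetical series with a calm stretch followed by a turbulent one; the window length, the data, and the function name are all assumptions made for illustration.

```python
import statistics

def rolling_volatility(returns, window):
    """Population standard deviation of each trailing `window` of returns.
    The first value covers returns[0:window]."""
    return [
        statistics.pstdev(returns[i - window:i])
        for i in range(window, len(returns) + 1)
    ]

# Hypothetical daily log returns: 20 calm days, then 20 turbulent days.
calm = [0.001, -0.001] * 10
stormy = [0.03, -0.03] * 10
returns = calm + stormy

vol = rolling_volatility(returns, window=10)
print("first window:", vol[0])
print("last window:", vol[-1])
```

On real data, a one-off shock would show up as a brief jump in this series that fades once the shock leaves the window, while a crisis would keep the rolling volatility elevated for much longer.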
This is a chart of the rolling log returns over time periods ranging from 1-day to 20-years between 1980 and today. You will likely need to zoom in to see the image more clearly. The zero line is marked on each chart, but the scales are different for each. Black Monday does not reverse the long-term trends for nearly as long or as much as the financial crisis does. For investors with a long holding period, events like the 2008 crisis are much more significant risks than events like Black Monday. Obviously, Black Monday should be included in my analysis above if you are assessing risk with a shorter holding period.
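Because log returns are additive, a rolling return over any holding period is just the sum of the daily log returns in that window. The sketch below shows one way to compute it with prefix sums. The series is hypothetical (a constant daily drift of 0.0003, roughly the size of the SPX mean in the tables, with a single -0.229 day inserted), so it illustrates the mechanics rather than reproducing the chart.

```python
from itertools import accumulate

def rolling_log_returns(daily, window):
    """Sum of daily log returns over each trailing `window` of days."""
    prefix = [0.0] + list(accumulate(daily))
    return [prefix[i] - prefix[i - window] for i in range(window, len(prefix))]

# Small hypothetical series to show the mechanics.
sample = [0.01, -0.02, 0.03, 0.00, -0.01]
two_day = rolling_log_returns(sample, 2)
print("2-day rolling returns:", two_day)

# Hypothetical long series: steady drift with one crash-sized day.
series = [0.0003] * 1000
series[500] = -0.229

worst_one_day = min(rolling_log_returns(series, 1))
full_period = rolling_log_returns(series, 1000)[0]
print("worst 1-day return:", worst_one_day)
print("1000-day return including the crash:", full_period)
```

In this made-up series the crash dominates the 1-day view, yet the 1,000-day return is still positive, which is the same qualitative effect the longer-period panels are meant to show.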
That leads to my final conclusion: stocks are less risky than gold for long-term investors, but gold seems less risky for short-term traders.