A 2-Factor Model Of Gold Industry Stock Returns In The U.S. Market

Includes: GLD, IAU, SPY
by: Alberto Abaterusso

Summary

I combine, in a two-factor model, variables that have been suggested and tested in the empirical literature as having explanatory power for the returns on gold stocks.

The S&P 500 and gold (COMEX) are the independent variables in the two-factor model. I run the regression over the period 2001-2015 and over three sub-periods.

Gold (COMEX) is always the variable with significant explanatory power for gold industry returns in the U.S. stock market.

Statistical evidence of negative autocorrelation is observed during the 2006-2010 sub-period. It was likely due to the effects of the Great Recession on the financial markets.

I want to combine, in a two-factor model, "those variables that have been suggested and tested in the empirical literature as having explanatory power for the returns on gold stocks" (Faff, Robert, and Howard Chan, "A multifactor model of gold industry stock returns: evidence from the Australian equity market." Applied Financial Economics 8.1 (1998): 21-28).

As the index for gold stock industry returns, I use the NYSE Arca Gold BUGS Index (HUI).

The NYSE Arca Gold BUGS Index and the Philadelphia Gold and Silver Index (XAU) are the two most watched gold indexes on the market. The HUI Index takes into account only gold mining stocks.

I examine a two-factor model of the form:

$$R_{g,t} = \alpha + \beta_{mkt} R_{mkt,t} + \beta_{Gpr} R_{Gpr,t} + u_t$$

Where:

$R_{g,t}$ is the monthly return on the NYSE Arca Gold BUGS Index over the period from 2001 to 2015;

$R_{mkt,t}$ is the monthly return on the S&P 500 over the period observed; and

$R_{Gpr,t}$ is the monthly return on gold (COMEX price).

All prices are in U.S. dollars.
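As a rough illustration of how a model of this form can be estimated, here is a minimal sketch using ordinary least squares in NumPy. The data are synthetic and the function names (`simple_returns`, `fit_two_factor`) are my own placeholders, not the tools or data behind the tables in this article.

```python
import numpy as np

def simple_returns(prices):
    """Convert a 1-D price series into period-over-period simple returns."""
    prices = np.asarray(prices, dtype=float)
    return prices[1:] / prices[:-1] - 1.0

def fit_two_factor(r_gold_stocks, r_market, r_gold):
    """OLS fit of R_g = alpha + beta_mkt * R_mkt + beta_gpr * R_gpr + u."""
    y = np.asarray(r_gold_stocks, dtype=float)
    X = np.column_stack([np.ones_like(y), r_market, r_gold])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    residuals = y - X @ coef
    return coef, residuals

# Synthetic monthly returns (assumption: NOT the article's data).
rng = np.random.default_rng(0)
n = 180
r_mkt = rng.normal(0.005, 0.04, n)   # stand-in for S&P 500 returns
r_gpr = rng.normal(0.008, 0.05, n)   # stand-in for gold price returns
r_g = 0.001 + 0.5 * r_mkt + 1.5 * r_gpr + rng.normal(0.0, 0.01, n)

coef, resid = fit_two_factor(r_g, r_mkt, r_gpr)
alpha, beta_mkt, beta_gpr = coef
print(f"alpha={alpha:.4f}  beta_mkt={beta_mkt:.3f}  beta_gpr={beta_gpr:.3f}")
```

With real data, the three return series would first be built from month-end index and gold prices with something like `simple_returns`, aligned on the same dates.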

Table 1 shows the results of the regression:

The results in Table 1 show that the gold industry was less risky than the stock market (S&P 500; SPY) but riskier than gold (COMEX; relevant ETFs include GLD and IAU) over the entire period observed, from 2001 to 2015.

Both variables used in this two-factor model have explanatory power for the return on the gold stock industry, since both are statistically significant. Multiple R and R Square are sufficiently high. The observed data fit the two-factor model very well. The period observed includes the "Great Recession", Dec. 2007-June 2009, in the USA (see the darker area in the picture below).
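For readers who want to see where significance and fit measures of this kind come from, the sketch below computes coefficient standard errors, t-statistics and R Square for an OLS fit, again on synthetic data. The helper name `ols_summary` is hypothetical. "Multiple R" in a typical spreadsheet regression report is the square root of R Square.

```python
import numpy as np

def ols_summary(y, X):
    """Return coefficients, standard errors, t-statistics and R-squared.

    X must already contain a column of ones for the intercept.
    """
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    xtx_inv = np.linalg.inv(X.T @ X)
    coef = xtx_inv @ X.T @ y
    resid = y - X @ coef
    sigma2 = resid @ resid / (n - p)            # unbiased residual variance
    std_err = np.sqrt(np.diag(sigma2 * xtx_inv))
    t_stats = coef / std_err
    ss_tot = np.sum((y - y.mean()) ** 2)
    r_squared = 1.0 - (resid @ resid) / ss_tot
    return coef, std_err, t_stats, r_squared

# Synthetic example (assumption: illustrative numbers only).
rng = np.random.default_rng(0)
n = 180
r_mkt = rng.normal(0.005, 0.04, n)
r_gpr = rng.normal(0.008, 0.05, n)
r_g = 0.001 + 0.5 * r_mkt + 1.5 * r_gpr + rng.normal(0.0, 0.01, n)

X = np.column_stack([np.ones(n), r_mkt, r_gpr])
coef, std_err, t_stats, r_squared = ols_summary(r_g, X)
multiple_r = np.sqrt(r_squared)
print("t-statistics:", np.round(t_stats, 2))
print("R Square:", round(r_squared, 3), "Multiple R:", round(multiple_r, 3))
```

A coefficient is usually called statistically significant at the 5% level when the absolute value of its t-statistic is roughly above 2 for samples of this size.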

Graphic 1

The picture below shows summary statistics over the entire period observed, from 2001 to 2015:

The gold industry and gold (COMEX) delivered higher returns than the stock market (S&P 500) over the entire period observed.

I also examine this two-factor model of gold industry stock returns over three sub-periods:

  1. from 2011 to 2015;
  2. from 2006 to 2010; and
  3. from 2001 to 2005.
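Running the same regression on sub-periods amounts to slicing the monthly observations by calendar window and refitting. A minimal sketch under the assumption that each observation carries its calendar year follows. The data are synthetic and `fit_by_subperiod` is a hypothetical helper.

```python
import numpy as np

def fit_by_subperiod(years, y, x_mkt, x_gpr, windows):
    """Run the two-factor OLS separately on each (start_year, end_year) window."""
    years = np.asarray(years)
    y = np.asarray(y, dtype=float)
    x_mkt = np.asarray(x_mkt, dtype=float)
    x_gpr = np.asarray(x_gpr, dtype=float)
    results = {}
    for start, end in windows:
        mask = (years >= start) & (years <= end)
        X = np.column_stack([np.ones(mask.sum()), x_mkt[mask], x_gpr[mask]])
        coef, *_ = np.linalg.lstsq(X, y[mask], rcond=None)
        results[(start, end)] = coef          # alpha, beta_mkt, beta_gpr
    return results

# Synthetic monthly data for 2001-2015 (assumption: not the article's data).
years = np.repeat(np.arange(2001, 2016), 12)
rng = np.random.default_rng(1)
x_mkt = rng.normal(0.0, 0.04, years.size)
x_gpr = rng.normal(0.0, 0.05, years.size)
y = 0.3 * x_mkt + 1.4 * x_gpr + rng.normal(0.0, 0.01, years.size)

windows = [(2011, 2015), (2006, 2010), (2001, 2005)]
betas = fit_by_subperiod(years, y, x_mkt, x_gpr, windows)
for window, coef in betas.items():
    print(window, np.round(coef, 3))
```

Each five-year window holds 60 monthly observations, which is the sample size used later when reading the Durbin-Watson table for the sub-periods.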

Tables 2, 3 and 4 show the results of the regression over the three sub-periods observed:

Table 2

The gold price is the only variable with explanatory power for the return on the gold industry.

Table 3

Both the market and the COMEX are statistically significant.

Table 4

COMEX is the only statistically significant variable over the sub-period from 2001 to 2005.

$$R_{g,t} = \alpha + \beta_{mkt} R_{mkt,t} + \beta_{Gpr} R_{Gpr,t} + u_t$$

$u_t$ is the unobserved error term, which may be autocorrelated; this is what I am going to test for.

I want to make sure that there are no patterns between the $R_{g,t}$ series and the residuals. I estimate the model by OLS (ordinary least squares, or linear least squares) to obtain the residuals $u^*_t$, t = 1, 2, 3, ..., 181 observations (months) from January 2, 2001, to January 4, 2016.

A key assumption in regression is that the error terms are independent of each other.

First: Inspection of the residual plot.

Absence of autocorrelation would imply that the error terms are independent. I would expect the graph to show no dependence from one error term to the next, so that they look random. When I look at Graphic 2, I cannot identify "runs" of data. The error terms are on average around zero, and I do not see stretches of time where the error terms are clustered above zero and others where they are clustered below zero. I would therefore expect the error terms to be independent.

Graphic 2
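The visual search for "runs" can be complemented with a simple count. This is not the procedure used in this article, only a sketch of one common complement: count the runs of consecutive same-signed residuals and compare with the number expected if the signs were independent (the Wald-Wolfowitz expectation). The function name `count_sign_runs` is hypothetical.

```python
import numpy as np

def count_sign_runs(residuals):
    """Return (observed runs, expected runs under independence) of residual signs."""
    signs = np.sign(np.asarray(residuals, dtype=float))
    signs = signs[signs != 0]                       # ignore exact zeros
    runs = 1 + int(np.sum(signs[1:] != signs[:-1]))
    n_pos = int(np.sum(signs > 0))
    n_neg = int(np.sum(signs < 0))
    expected = 2.0 * n_pos * n_neg / (n_pos + n_neg) + 1.0
    return runs, expected

# Toy residuals: two positive, two negative, one positive.
runs, expected = count_sign_runs([0.4, 0.1, -0.3, -0.2, 0.5])
print(runs, expected)
```

Far fewer runs than expected would hint at positive autocorrelation (long stretches on one side of zero), while far more runs than expected would hint at negative autocorrelation (signs flipping almost every period).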

So, from the inspection of this graphic, autocorrelation does not seem to be a problem with my estimate. Just to be sure, I can compute a test statistic that detects the presence of autocorrelation, the Durbin-Watson statistic.

Second: Formal test for autocorrelation (Durbin-Watson statistic).

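I first calculate the lag-1 correlation of the residuals, $\rho^* = \mathrm{Corr}(u^*_t, u^*_{t-1})$. The Durbin-Watson statistic is then approximately:

$$DW \approx 2 (1 - \rho^*)$$

$\rho^* = -1$ gives $DW = 4$

$\rho^* = 0$ gives $DW = 2$

$\rho^* = 1$ gives $DW = 0$

Anything between 0 and 2 indicates positive autocorrelation, whereas anything between 2 and 4 indicates negative autocorrelation. A value of d = 2 means there is no autocorrelation.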
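To make the link between the lag-1 correlation and the statistic concrete, here is a small sketch that simulates error series with known autocorrelation and compares the Durbin-Watson statistic with 2(1 - ρ*). The simulated AR(1) series and the helper names are assumptions for illustration; they are not the residuals of the regression above.

```python
import numpy as np

def durbin_watson(resid):
    """Sum of squared first differences divided by the sum of squares."""
    resid = np.asarray(resid, dtype=float)
    return np.sum(np.diff(resid) ** 2) / np.sum(resid ** 2)

def lag1_correlation(resid):
    """Sample correlation between u_t and u_{t-1}."""
    resid = np.asarray(resid, dtype=float)
    return np.corrcoef(resid[1:], resid[:-1])[0, 1]

def simulate_ar1(rho, n, seed):
    """Simulate u_t = rho * u_{t-1} + e_t with standard normal shocks."""
    rng = np.random.default_rng(seed)
    shocks = rng.normal(size=n)
    u = np.empty(n)
    u[0] = shocks[0]
    for t in range(1, n):
        u[t] = rho * u[t - 1] + shocks[t]
    return u

results = {}
for rho in (-0.6, 0.0, 0.6):
    u = simulate_ar1(rho, n=2000, seed=42)
    d = durbin_watson(u)
    r = lag1_correlation(u)
    results[rho] = (d, r)
    print(f"rho={rho:+.1f}  d={d:.3f}  2*(1-rho*)={2 * (1 - r):.3f}")
```

Positive autocorrelation pushes the statistic below 2 and negative autocorrelation pushes it above 2, in line with the rule of thumb stated above.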

The Durbin-Watson test uses the following statistic:

Figure 1
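For reference, the standard textbook form of the statistic, written in terms of the OLS residuals $u^*_t$ for t = 1, ..., T (here T = 181), is shown below. Expanding the numerator shows why, for large T, the statistic is approximately 2(1 - ρ*).

```latex
d = \frac{\sum_{t=2}^{T} \left(u^*_t - u^*_{t-1}\right)^2}{\sum_{t=1}^{T} \left(u^*_t\right)^2}
  = \frac{\sum_{t=2}^{T} \left(u^*_t\right)^2 + \sum_{t=2}^{T} \left(u^*_{t-1}\right)^2 - 2\sum_{t=2}^{T} u^*_t\, u^*_{t-1}}{\sum_{t=1}^{T} \left(u^*_t\right)^2}
  \approx 2\,(1 - \rho^*)
```

The first two sums in the numerator are each close to the denominator when T is large, and the cross term divided by the denominator is close to ρ*.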

The Durbin-Watson statistic, d, is very close to 2 (d = 2.030); presumably ρ = 0.

Testing for negative autocorrelation: 4 - d = 4 - 2.02960454 = 1.970.

It seems that there are no patterns between the Rg,t series and the residuals. The result lies between no autocorrelation and very mild negative autocorrelation.

Hypothesis Testing

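If d < dL, reject H0 : ρ ≤ 0 (and so accept H1 : ρ > 0)

If d > dU, do not reject H0 : ρ ≤ 0 (presumably ρ = 0)

If dL < d < dU, the test is inconclusive

To test for negative autocorrelation, the same rules are applied to 4 - d in place of d, with H0 : ρ ≥ 0 and H1 : ρ < 0.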

Figure 2 - From the Durbin-Watson Table

In our case, α = .05, n = 181 and k = 2. From the Durbin-Watson table, dL lies between 1.706 (n = 150 and k = 2) and 1.748 (n = 200 and k = 2), and dU lies between 1.760 (n = 150 and k = 2) and 1.789 (n = 200 and k = 2). Since 4 - d = 1.97 > dU, I do not reject the null hypothesis of no negative autocorrelation (presumably ρ = 0), and conclude that there is no statistical evidence that the error terms are negatively autocorrelated.
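Because n = 181 falls between two tabulated sample sizes, one rough way to pin down the bounds is linear interpolation between the n = 150 and n = 200 rows. This is an approximation I am adding for illustration, not a value read from the table, and `interpolate_critical` is a hypothetical helper.

```python
def interpolate_critical(n, n_lo, value_lo, n_hi, value_hi):
    """Linearly interpolate a tabulated critical value between two sample sizes."""
    weight = (n - n_lo) / (n_hi - n_lo)
    return value_lo + weight * (value_hi - value_lo)

# Table values quoted in the text for k = 2, alpha = 0.05.
d_lower = interpolate_critical(181, 150, 1.706, 200, 1.748)
d_upper = interpolate_critical(181, 150, 1.760, 200, 1.789)

four_minus_d = 4 - 2.02960454   # statistic used for the negative-autocorrelation test
print(round(d_lower, 3), round(d_upper, 3), round(four_minus_d, 3))
```

The conclusion does not hinge on the interpolation: 4 - d is above even the larger tabulated upper bound of 1.789.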

I applied the Durbin-Watson test and table (n = 60 months; k = 2 independent variables [S&P 500; COMEX]; α = 0.05) to each of the three sub-periods. The table below shows a summary report:

Sub-period 2011-2015: d = 2.13292; 4 - d = 1.86708 > dU = 1.652. There is no statistical evidence that the error terms are negatively autocorrelated.

Sub-period 2006-2010 (which includes the Great Recession period): d = 2.66164; 4 - d = 1.33836 < dL = 1.514. There is statistical evidence that the error terms are negatively autocorrelated.

Sub-period 2001-2005: d = 2.0999; 4 - d = 1.900 > dU = 1.652. There is no statistical evidence that the error terms are negatively autocorrelated.
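The three decisions above follow one mechanical rule, which can be sketched as a small function. The bounds and statistics are the ones quoted in this article; the function name `dw_negative_autocorr_test` is a placeholder of mine.

```python
def dw_negative_autocorr_test(d, d_lower, d_upper):
    """Apply the Durbin-Watson bounds to 4 - d to test for negative autocorrelation."""
    stat = 4.0 - d
    if stat < d_lower:
        return "reject H0: evidence of negative autocorrelation"
    if stat > d_upper:
        return "do not reject H0: no evidence of negative autocorrelation"
    return "inconclusive"

# Bounds for n = 60, k = 2, alpha = 0.05, as quoted in the text.
D_LOWER, D_UPPER = 1.514, 1.652

sub_periods = {
    "2011-2015": 2.13292,
    "2006-2010": 4 - 1.33836,   # d recovered from the reported 4 - d = 1.33836
    "2001-2005": 2.0999,
}
decisions = {
    label: dw_negative_autocorr_test(d, D_LOWER, D_UPPER)
    for label, d in sub_periods.items()
}
for label, decision in decisions.items():
    print(label, "->", decision)
```

Only the 2006-2010 window falls below the lower bound, which is why it is the one sub-period flagged for negative autocorrelation.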

Conclusion

I combined, in a two-factor model, "those variables that have been suggested and tested in the empirical literature as having explanatory power for the returns on gold stocks."

Gold (COMEX) is the variable with significant explanatory power for gold industry returns in the U.S. stock market, both over the entire period and over each of the three sub-periods.

Betas: Gold betas in excess of unity in all periods suggest that the gold industry is super-cyclical. The stock market (S&P 500) has significant explanatory power for gold stock returns over the entire period and over the 2006-2010 sub-period.

The 2006-2010 sub-period shows statistical evidence of negative autocorrelation, hence there may be an issue affecting the proper specification of the two-factor model. One explanation for the autocorrelation may be the Great Recession, which characterized the second sub-period observed.

Disclosure: I/we have no positions in any stocks mentioned, and no plans to initiate any positions within the next 72 hours.

I wrote this article myself, and it expresses my own opinions. I am not receiving compensation for it (other than from Seeking Alpha). I have no business relationship with any company whose stock is mentioned in this article.