With 2010's first month of trading in the record books, the perennial references to "The January Barometer" are being repeated ad nauseam across a variety of media outlets. In a January 29th WSJ article - "January Proves Tough for Stocks" - the venerable newspaper makes the following statement:
History suggests that a weak January performance is a worrisome sign for the rest of the year...In years when the Dow has risen in the first month of the year, the median rise for the rest of the year is 10.4%. In years when the Dow has fallen, the median rise for the next 11 months is just 0.28%.
Dissatisfied with the statistical methodology used above (a median return conditioned only on whether January's % Return was positive or negative), I set out to apply a regression analysis to the data. I pulled data on the S&P 500 for every year from 1952 to 2009, setting each year's January % Return as the explanatory (X) variable, and that year's full-year % Return as the response (Y) variable. The results are plotted below to provide a visual:
If a strong linear relationship between X and Y existed, the points in the plot above would cluster around a straight line. This isn't quite the case. In fact, the r-squared value for this data is 0.1012, meaning that only 10.12% of the variation in Y (entire year's stock market return) is explained by X (return in January).
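For readers who want to reproduce this kind of calculation, here is a minimal Python sketch of r-squared for a one-variable linear regression. The function name `r_squared` and the sample figures are illustrative assumptions, not the actual S&P 500 data or the tool used for the analysis above.

```python
def r_squared(x, y):
    """Share of the variation in y explained by a straight-line fit on x."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squares and cross-products around the means
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each vary")
    # For simple linear regression, r-squared is the squared correlation
    return sxy ** 2 / (sxx * syy)


# Made-up placeholder returns (in %), NOT real S&P 500 figures
january = [2.1, -1.4, 3.3, -0.6, 1.8, -2.9]
full_year = [12.0, 3.5, 9.1, -4.2, 15.3, 1.0]
print(f"r-squared: {r_squared(january, full_year):.4f}")
```

With one explanatory variable, r-squared is simply the square of the correlation between X and Y, which is why no explicit slope or intercept is needed here.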
However, I realize that regression analysis of this nature is not very resistant to outlying/extreme values; that is, a few extreme observations have the potential to significantly affect the portion of Y's variation that is attributable to X. For this reason, I arranged January's % returns into quartiles, calculated the inter-quartile range (IQR), and deleted any observations that fell more than 1.5*IQR below the first quartile or above the third quartile.
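The 1.5*IQR rule can be sketched in Python as follows. Quartile conventions differ between tools, so the choice of `statistics.quantiles` with `method="inclusive"` is an assumption and may not match the quartiles computed for this article; the function names are hypothetical as well.

```python
import statistics


def iqr_fences(values, k=1.5):
    """Return the (lower, upper) cutoffs: Q1 - k*IQR and Q3 + k*IQR."""
    # Assumption: "inclusive" quartiles; other conventions shift the fences slightly
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def drop_january_outliers(january, full_year, k=1.5):
    """Drop each year whose January return falls outside the fences."""
    low, high = iqr_fences(january, k)
    kept = [(j, f) for j, f in zip(january, full_year) if low <= j <= high]
    return [j for j, _ in kept], [f for _, f in kept]
```

Note that the filter is applied to January's returns only; each full-year return is kept or dropped together with its January, so the X and Y series stay aligned for the second regression.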
Interestingly, only two observations throughout a 57-year period met the above criterion for being considered "extreme" - the S&P 500's return during January 2001 and January 2009. I deleted both observations, and recalculated below:
After removing the two extreme values, the new r-squared value is 0.1544 (15.44% of the variation in Y explained by X) - a relative improvement of roughly 50%, but still well below any reasonable threshold that might demonstrate the predictive value of January.
In conclusion, I will not be using January's stock market decline as a basis for any prediction concerning the full-year performance of the S&P 500. That determination is better made - in my opinion - by recognizing that not all recoveries are created equal, by closely monitoring the mortgage market's response to the Federal Reserve's exit from its mortgage-backed security purchase program, and by examining the structural implications of sustained double-digit (real) unemployment.
Disclosure: Long several stocks in the S&P 500