Seeking Alpha

UNRATE: A Market Timing Indicator Tested From 1949

Includes: IVV, SPY, VOO
by: Fred Piard
Summary

UNRATE is the identifier of the Civilian Unemployment Rate in the St. Louis Fed database.

It can be tested as a timing indicator through all recessions since 1949.

Here is a way to use it.

The Civilian Unemployment Rate (UNRATE) is calculated by the U.S. Bureau of Labor Statistics from a sample. It is usually released on the first Friday of every month for the previous month. The reading on 10/4/2019 was at 3.5%, a value not seen since December 1969. Data are available from 1948.

U.S. Bureau of Labor Statistics, Civilian Unemployment Rate [UNRATE], retrieved from FRED, Federal Reserve Bank of St. Louis, October 7, 2019.

Because it is measured as a percentage with a 1-decimal precision, UNRATE is less sensitive to sampling errors than payroll jobs. However, UNRATE values may be revised, which raises a question: should we use the initial releases or the revised data?

Doing a backtest means simulating a series of past decisions using only the data that were available at each decision point. From this point of view, initial releases are preferable. But is it always smart to use point-in-time data to evaluate an indicator? Is a “backtest” the best tool to study data that may change after publication? I don’t think so. The purpose of a backtest is to simulate a time machine and act “as if” we were in the past. The purpose of my market timing study is to identify relations between input signals from various indicators and an output signal: a stock index. My preference is not to do “strict” backtests, but to have as little noise as possible in the input signals.

When a data series is revised, I want to know whether revisions are part of the information or noise due to measurement errors. If they are part of the information, it is better to use point-in-time data. If they are noise due to measurement errors, it is better to use the latest revised data, because they are the closest to the real state of the system at every point in the past. I think a model should be based on the real states of the system.

Moreover, we can expect measurement processes to improve with time. Not only can we expect it, we can also show it. The next chart plots a metric of UNRATE’s “noise”: a 10-year sum of the absolute values of revisions, where a revision is the revised value minus the initial release.
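The noise metric described above can be sketched in a few lines. This is only an illustration, not the code behind the chart: the function name `revision_noise`, the list-based inputs and the 120-month window (10 years of monthly data) are assumptions.

```python
def revision_noise(initial, revised, window=120):
    """Rolling sum of |revised - initial| over the last `window` months.

    `initial` and `revised` are equal-length lists of monthly UNRATE values
    (first release and latest revision). Returns a list of the same length,
    with None until a full window is available.
    """
    if len(initial) != len(revised):
        raise ValueError("both series must have the same length")
    abs_revisions = [abs(r - i) for i, r in zip(initial, revised)]
    noise = []
    running = 0.0
    for k, value in enumerate(abs_revisions):
        running += value
        if k >= window:
            # Drop the revision that just left the window.
            running -= abs_revisions[k - window]
        # Rounding only hides floating-point residue in the running sum.
        noise.append(round(running, 10) if k >= window - 1 else None)
    return noise
```

A falling value of this metric over time would mean that initial releases are getting closer to the revised figures.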

Spikes of noise are possible, but the measurement process of UNRATE improved greatly until 2006 and has been stable on average since then. It doesn’t make sense to build a model that includes errors from a time when data were collected and consolidated manually. Therefore, I systematically use revised data for UNRATE.

I will show market timing test results based on monthly decisions. Indicators are observed on the 1st day of every month. Each indicator gives a binary signal: “bullish” (0) or “bearish” (1). Every indicator is tested by calculating the performance of an investment in the S&P 500 (VOO, IVV or SPY) with a market timing strategy that goes gradually out of the market during the month of a bearish signal. Gradualness is simulated by using the average of daily closing prices as the monthly price. It means a trade “off” or “on” is smoothed over the month, as if 1/21 of the trade were executed at every market close for an average month of 21 trading days. The first advantage is that a free and reliable price series built on this rationale is easy to get over a very long period (Robert Shiller’s online data). The second advantage of using smoothed monthly prices is a lower sensitivity to short-term moves: there is no risk of unintentionally curve-fitting a model to a series of specific daily prices (the first trading days of every month). There is a third advantage: it is more realistic for investors who cannot make a big move on a single day because of capital size or compliance (especially fund managers).
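As a rough illustration of the smoothed monthly price, the sketch below averages daily closes by calendar month. The function name `monthly_average_close` and the input format (pairs of an ISO date string and a close) are assumptions made for the example; Shiller’s data set already provides monthly averages, so this step is only needed when starting from daily prices.

```python
from collections import defaultdict


def monthly_average_close(daily_closes):
    """Average daily closing prices by calendar month.

    `daily_closes` is an iterable of (date_string, close) pairs, with dates
    formatted as "YYYY-MM-DD". Returns a dict {"YYYY-MM": average close},
    ordered by month.
    """
    buckets = defaultdict(list)
    for date_string, close in daily_closes:
        buckets[date_string[:7]].append(close)  # "YYYY-MM" prefix is the key
    return {month: sum(v) / len(v) for month, v in sorted(buckets.items())}
```

Trading at this average is equivalent to executing an equal fraction of the order at every close of the month, which is the gradualness described above.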

The following tests simulate going to cash on a bearish signal. Obviously, this is rarely the best strategy. Opening or increasing hedging positions is usually a better way to manage riskier periods, incurring lower trading costs when the portfolio holds many positions or when holdings are not very liquid. It also keeps dividends coming when there are some.
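A going-to-cash simulation of this kind could look like the sketch below. It is a simplified reading of the test, not the exact procedure used for the tables: the name `simulate_timing` is hypothetical, and the code assumes that cash earns nothing, that there are no trading costs, and that dividends are ignored.

```python
def simulate_timing(monthly_prices, bearish):
    """Equity curve of a strategy that holds the index unless the signal is bearish.

    monthly_prices[m] is the smoothed price of month m; bearish[m] is the
    signal (1 = bearish, 0 = bullish) observed on the 1st day of month m.
    Assumption: the position decided in month m is traded at that month's
    average price and earns the index return from month m to month m+1.
    Cash is assumed to earn zero.
    """
    if len(monthly_prices) != len(bearish):
        raise ValueError("prices and signals must have the same length")
    equity = [1.0]
    for m in range(len(monthly_prices) - 1):
        growth = monthly_prices[m + 1] / monthly_prices[m]
        equity.append(equity[-1] * (1.0 if bearish[m] else growth))
    return equity
```

Passing a list of zeros as `bearish` gives the buy-and-hold benchmark on the same price series.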

In theory, on the 1st day of a month “m”, we can only use the value published about 4 weeks earlier, which covers the month before its release and which I name UNRATE(m-2). However, I think it doesn’t make sense to simulate systematically waiting 4 weeks to make a decision from a data point, and then to simulate trades spread over an additional month. It is a lot of delay. Therefore, I choose to use UNRATE(m-1) to simulate decisions on the 1st day of month “m” and trades at the average monthly price of the same month. It means a look-ahead bias of a few days, which most quantitative analysts may consider a heresy. Once again, I do it on purpose because it is more realistic than simulating a trade out of or into the market that starts almost 1 month after the data are published and is spread over one more month. As a safety net, I have run robustness tests with data delayed by 1 month, and simulations of weekly decisions with data delayed by 1 week (not reported in this article).

I have tested several indicators on the UNRATE data series. I will show results for three of them. The first one is bearish when UNRATE has gone up over 3 months and bullish otherwise. The second one is bearish when it has gone up over 6 months and bullish otherwise. The third one is bearish when the 3-month average (3mma) is above the 1-year average (12mma) and bullish otherwise.
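The three indicators can be written down as follows. This is a minimal sketch under stated assumptions: the names `unrate_signals`, `mom3`, `mom6` and `ma_cross` are made up for the example, the `lag` parameter stands for the choice between UNRATE(m-1) (lag=1) and the delayed data of the robustness tests (lag=2), and both moving averages are assumed to end at the latest value used, which the article does not state explicitly.

```python
def unrate_signals(unrate, m, lag=1):
    """Return the three signals (1 = bearish, 0 = bullish) for decision month m.

    unrate[k] is the unemployment rate of month k. The latest value used is
    unrate[m - lag], so lag=1 corresponds to UNRATE(m-1).
    """
    last = m - lag
    if last < 11 or last >= len(unrate):
        raise ValueError("need 12 months of data ending at month m - lag")
    three_month_avg = sum(unrate[last - 2:last + 1]) / 3
    twelve_month_avg = sum(unrate[last - 11:last + 1]) / 12
    return {
        "mom3": int(unrate[last] > unrate[last - 3]),      # up over 3 months
        "mom6": int(unrate[last] > unrate[last - 6]),      # up over 6 months
        "ma_cross": int(three_month_avg > twelve_month_avg),  # 3mma > 12mma
    }
```

With lag=1, `mom3` compares UNRATE(m-1) with UNRATE(m-4) and `mom6` compares UNRATE(m-1) with UNRATE(m-7).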

In the tables below:

  • CAGR is the annualized return in percent.
  • Ddmax is the maximum drawdown depth, also in percent.
  • DLmax is the maximum drawdown duration in months.
  • MAR ratio is a risk-adjusted performance metric defined as MAR = CAGR / Ddmax.
  • The first column gives the starting year of each test; the end date is always 1/1/2019.
  • In all tables, benchmark data (S&P 500, buy and hold) are repeated in the first four columns after the starting year to facilitate comparisons.
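For readers who want to reproduce these metrics, here is a possible computation from a monthly equity curve. It is a sketch, not the code used for the tables: the function name `performance_metrics` is hypothetical, and drawdown duration is assumed to be the number of consecutive months spent below the previous peak.

```python
def performance_metrics(equity, months_per_year=12):
    """Compute CAGR (%), Ddmax (%), DLmax (months) and MAR from monthly equity values."""
    years = (len(equity) - 1) / months_per_year
    cagr = ((equity[-1] / equity[0]) ** (1 / years) - 1) * 100
    peak = equity[0]
    dd_max = 0.0
    dl_max = 0
    months_below_peak = 0
    for value in equity[1:]:
        if value >= peak:
            # New high (or back to the old one): the drawdown is over.
            peak = value
            months_below_peak = 0
        else:
            months_below_peak += 1
            dd_max = max(dd_max, (1 - value / peak) * 100)
            dl_max = max(dl_max, months_below_peak)
    mar = cagr / dd_max if dd_max > 0 else float("inf")
    return {"CAGR": cagr, "Ddmax": dd_max, "DLmax": dl_max, "MAR": mar}
```

As a check of the MAR definition against the benchmark figures since 1993: 7.12 / 50.82 ≈ 0.14.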

In the first series of tests below, the bearish signal is given by UNRATE(m-1) > UNRATE(m-4), meaning the 3-month momentum is positive.

Since   S&P 500, buy and hold             Timing on 3-month momentum
        CAGR   MAR    Ddmax   DLmax       CAGR   MAR    Ddmax   DLmax
1993    7.12   0.14   50.82   80          9.44   0.65   14.62   37
1949    7.61   0.15   50.82   89          6.23   0.23   27.45   81

In the second series of tests below, the bearish signal is given by UNRATE(m-1) > UNRATE(m-7), meaning the 6-month momentum is positive.

Since   S&P 500, buy and hold             Timing on 6-month momentum
        CAGR   MAR    Ddmax   DLmax       CAGR   MAR    Ddmax   DLmax
1993    7.12   0.14   50.82   80          8.97   0.73   12.29   40
1949    7.61   0.15   50.82   89          5.56   0.21   26.84   127

In the third series of tests below, the bearish signal is given by 3mma > 12mma.

Since   S&P 500, buy and hold             Timing on 3mma > 12mma
        CAGR   MAR    Ddmax   DLmax       CAGR   MAR    Ddmax   DLmax
1993    7.12   0.14   50.82   80          8.45   0.70   12.14   45
1949    7.61   0.15   50.82   89          5.83   0.22   26.84   66

Chart since 1993:

The three indicators based on UNRATE improve the MAR ratio and drawdown on both intervals. However, the return lags the benchmark since 1949. The 3-month momentum gives slightly better results. Robustness tests show a deterioration using delayed data (not shown here), but the risk-adjusted performance metric (MAR ratio) stays above the benchmark, or very close to it in the worst case.

These tests have a look-ahead bias, but I explained why I think they are more relevant than true backtests using point-in-time data. Anyway, there is no guarantee about future risks and returns.

Disclosure: I am/we are long SPY. I wrote this article myself, and it expresses my own opinions. I am not receiving compensation for it (other than from Seeking Alpha). I have no business relationship with any company whose stock is mentioned in this article.