Tomorrow the Labor Department releases the monthly employment report (Nonfarm Payrolls, or NFP), arguably the most anticipated report each month. About two weeks ago I came across this chart on a blog, with some hyperbole in the title, suggesting the possibility of a very large negative NFP number because of the large decline in the Philly Fed Survey.

Source: Bloomberg via http://www.zerohedge.com/news/scariest-chart-ever-philly-fed-versus-non-farm-payrolls

This led me to start researching econometric models to forecast a possible distribution of outcomes for the NFP number. I also looked at other economic reports that would already have August data available. I found that the Philly Fed Survey, the Dallas Fed Survey (lagged), the 4-week average of jobless claims (change from a month ago), and the lagged differences of NFP were significant. The correlation matrix below covers data from June 2004 to July 2011; the critical value for the correlation at the 99% level is approximately .283. One can see that every correlation shown is statistically significant.

| | Philly | NFP | NFP-1 | NFP-2 | Dallas-1 |
|---|---|---|---|---|---|
| Philly | 1 | 0.5974 | 0.553 | 0.507 | 0.6479 |
| NFP | | 1 | 0.6889 | 0.6967 | 0.8047 |
| NFP-1 | | | 1 | 0.689 | 0.8144 |
| NFP-2 | | | | 1 | 0.7911 |
| Dallas-1 | | | | | 1 |
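The post does not say how the .283 critical value was computed. One standard route, sketched here as an assumption rather than as the method actually used, converts a two-sided critical t value into a critical correlation via r = t / sqrt(t^2 + df). The inputs `df = 80` and `t = 2.639` are my assumptions (they are not in the post); they happen to reproduce a value close to .283.

```python
import math

def critical_r(t_crit: float, df: int) -> float:
    """Smallest |r| that is significant, given the two-sided critical t value for df = n - 2."""
    return t_crit / math.sqrt(t_crit ** 2 + df)

# Assumed inputs, NOT stated in the post: 80 degrees of freedom and the
# two-sided 99% critical t value for that df (about 2.639 from a t table).
r_crit = critical_r(2.639, 80)

# Off-diagonal correlations copied from the matrix in the post.
correlations = [0.5974, 0.553, 0.507, 0.6479,
                0.6889, 0.6967, 0.8047,
                0.689, 0.8144,
                0.7911]

print(f"critical r = {r_crit:.3f}")
print("all significant:", all(r > r_crit for r in correlations))
```

With a different sample size the threshold shifts slightly, but the smallest correlation in the matrix (0.507) sits well above any plausible value for roughly seven years of monthly data.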

I ran a multivariate regression, using HAC standard errors, and received the following results:

| Variable | Coefficient | Std. error | t-ratio | p-value |
|---|---|---|---|---|
| constant | -51.3032 | 18.9197 | -2.712 | 0.0082 |
| dallas_1 | 3.79735 | 1.10702 | 3.43 | 0.001 |
| d_claims | -0.00252297 | 0.00052 | -4.85 | 6.09E-06 |
| philly | 2.629 | 1.05501 | 2.492 | 0.0148 |
| nfp_d_1 | 0.373798 | 0.107461 | 3.478 | 0.0008 |
| nfp_d_2 | 0.207402 | 0.066711 | 3.109 | 0.0026 |

The adjusted R^2 was 84.4% and the standard error of the regression was 111.75 (in thousands of jobs).
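As a quick sanity check on the regression output, here is a minimal Python sketch that recomputes each t-ratio as coefficient divided by standard error and shows how a point forecast is assembled from the coefficients. The function names and the dictionary layout are mine, for illustration only; the actual estimation was done in gretl, and the August input values are not given in the post, so none are filled in here.

```python
# Coefficients and HAC standard errors copied from the regression results in the post.
COEF = {
    "constant": (-51.3032, 18.9197),
    "dallas_1": (3.79735, 1.10702),
    "d_claims": (-0.00252297, 0.00052),
    "philly": (2.629, 1.05501),
    "nfp_d_1": (0.373798, 0.107461),
    "nfp_d_2": (0.207402, 0.066711),
}

def t_ratios(coef):
    """t-ratio of each term: estimated coefficient divided by its standard error."""
    return {name: b / se for name, (b, se) in coef.items()}

def point_forecast(inputs, coef=COEF):
    """Constant plus coefficient * value for every regressor supplied in `inputs`."""
    total = coef["constant"][0]
    for name, value in inputs.items():
        total += coef[name][0] * value
    return total

for name, t in t_ratios(COEF).items():
    print(f"{name:9s} t = {t:7.3f}")
```

Calling `point_forecast` with a dictionary of the latest readings for `dallas_1`, `d_claims`, `philly`, `nfp_d_1`, and `nfp_d_2` returns the fitted value in the same units as the dependent variable; with no inputs it simply returns the constant.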

(Need help understanding statistics? See http://en.wikipedia.org/wiki/Statistics)

Plugging in the data for August, this "model" gives a point estimate of -42,773 jobs created. The current consensus estimate for this report is +60,000 jobs created, as reported by Bloomberg (although there is a report that Goldman cut its estimate today). The range of estimates is from -5,000 to +150,000.

I then estimated the probability of various scenarios from the point estimate and the standard error, treating the forecast error as normally distributed.

Above +150,000 = 4.2%

Above +60,000 = 17.9%

Above Zero (no job creation) = 35.1%

Below -5,000 = 63.2%
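The scenario probabilities can be reproduced with a few lines of Python. This is a sketch under one assumption of mine: that the outcome is normally distributed around the point estimate with a standard deviation equal to the regression's standard error (both in thousands of jobs). Under that assumption the results match the figures listed above.

```python
import math

POINT_ESTIMATE = -42.773  # thousands of jobs, from the post
STD_ERROR = 111.75        # standard error of the regression, from the post

def prob_above(threshold, mean=POINT_ESTIMATE, sd=STD_ERROR):
    """P(outcome > threshold) for a normal distribution with the given mean and sd."""
    z = (threshold - mean) / sd
    return 0.5 * math.erfc(z / math.sqrt(2))

def prob_below(threshold, mean=POINT_ESTIMATE, sd=STD_ERROR):
    """P(outcome < threshold), the complement of prob_above."""
    return 1.0 - prob_above(threshold, mean, sd)

# Thresholds are in thousands of jobs.
print(f"Above +150,000: {prob_above(150):.1%}")
print(f"Above  +60,000: {prob_above(60):.1%}")
print(f"Above zero:     {prob_above(0):.1%}")
print(f"Below   -5,000: {prob_below(-5):.1%}")
```

Using the regression standard error as the forecast standard deviation ignores the uncertainty in the estimated coefficients, so the true forecast distribution is somewhat wider than this sketch implies.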

This data shows a much greater likelihood of a downside surprise when the NFP report is released. What could be wrong here? First, I don't like the constant being statistically significant; in academic circles it should not be. Second, I downloaded the NFP data from the St. Louis Fed's FRED database, but it might be revised data, which would introduce a bias. I tried using the ALFRED data and also got significant results, but that data had large jumps too and did not appear much different from the more popular NFP series.

Here is how the model performed when I ran the regression on data from June 2004 through January 2009 and then forecast forward from there.

Bottom line: this is not a tool to gamble a trade on until it proves itself in real time, and does so often. However, it does make me shift more time to thinking through what I will do if a "bad" employment number is reported.

I will try to update the model over the next month and see how it does with other data series plugged in, such as the ISM data. But for the time being, the statistics clearly show a downside surprise is more likely. Get your trading plans ready!

(Any quants out there, please let me and other readers know if I made a mistake. :-) The software used was gretl, free from http://gretl.sourceforge.net/ )

**Disclosure:** I have no positions in any stocks mentioned, and no plans to initiate any positions within the next 72 hours.

**Additional disclosure:** Past performance is not indicative of future results. Just ask the statistical quant managers at AIG!!! (no position)