The False Confidence That Comes From Cheating In Investment Forecasting Models

Includes: DIA, QQQ, SPY
by: Jeff Miller

Most investors are paralyzed with fear.

They are not thinking about the right stocks to buy. They are scammed by online gold sites (see this official FINRA warning). They are scared witless [TM OldProf euphemism] by sensational predictions of the end of the world.

The Most Important Issue: Asset Allocation

Here is a clue: There are very few investors for whom the right stock allocation is zero!

Sometimes the most important issues for the individual investor are too complicated for a single neat article ending with "actionable investment advice." I regularly complain that journalists and bloggers alike play for page views by their choice of story. I do not want to do the same.

To illustrate my point I am devoting some time this week to basic investor education. These may not be the most popular of articles, but I am encouraged by the emails from those who are playing along -- people who are doing some critical thinking about the typical claims they see every day and trying to analyze probability questions.

I'll return to the "doctor question" tomorrow when everyone has had a chance to play.

Meanwhile, let's turn to cheating in investment forecasting models.

The Illusion from Cheating

As the first step, I encourage readers to consider the online poker scandal of 2007. Some players had inside access through the software, so they could see everyone's hole cards. Even the worst poker player can win when he knows the hands. The story was reported on 60 Minutes and recently featured on CNBC. Here is a brief video that provides the essence of the cheating story.

From our perspective the lesson is simple: If you know the hands, you can win even if you are a terrible poker player.

The 2012 Election

One of my political science colleagues in the professoriat has a new book and a prediction for the 2012 elections. He is challenged by a young upstart lacking the formal poli-sci credentials. Some readers think I place too much emphasis on formal training and credentials, so you might be surprised at my reaction.

I am a big fan of Nate Silver and a regular reader of his work. He commands respect with his use of data. He is not writing about investments, which fits this week's theme of taking lessons from other disciplines. He caught my attention with this tweet:

@fivethirtyeight Nate Silver

Anyone who, at this point in time, claims that Obama is certain to win or certain to lose doesn't understand forecasting, full stop.

In a full article on the topic he reviews the methods. He carefully discusses the various "keys" used by the author and then does a statistical analysis of past elections with an error margin for the key differential. It is excellent, and worth reading as we look to the election.
Here is his conclusion:

By the way — many of these concerns also apply to models that use solely objective data, like economic variables. These models tell you something, but they are not nearly as accurate as claimed when held up to scrutiny. While you can’t manipulate economic variables — you can’t say that G.D.P. growth was 5 percent when the government said it was 2 percent, at least if anyone is paying attention — you can choose from among dozens of economic variables until you happen to find the ones that pick the lock.

These types of problems, which are technically known as overfitting and data dredging, are among the most important things you ought to learn about in a well-taught econometrics class — but many published economists and political scientists seem to ignore them when it comes to elections forecasting.

In short, be suspicious of results that seem too good to be true. I’m probably in the minority here, but if two interns applied to FiveThirtyEight, and one of them claimed to have a formula that predicted 33 of the last 38 elections correctly, and the other one said they’d gotten all 38 right, I’d hire the first one without giving it a second thought — it’s far more likely that she understood the limitations of empirical and statistical analysis.
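Silver's point is easy to demonstrate with a short simulation (my own hypothetical sketch, not his actual analysis). Suppose we dredge through a few hundred "indicators" that are pure random noise, scoring each one against 38 past binary outcomes. The variable names and counts below are made up for illustration:

```python
import random

random.seed(42)

N_ELECTIONS = 38      # past outcomes to "explain"
N_INDICATORS = 500    # candidate indicators to dredge through

# Actual outcomes: coin flips standing in for past election results.
outcomes = [random.randint(0, 1) for _ in range(N_ELECTIONS)]

best_hits = 0
for _ in range(N_INDICATORS):
    # Each indicator is pure noise -- it has no real predictive power.
    indicator = [random.randint(0, 1) for _ in range(N_ELECTIONS)]
    hits = sum(i == o for i, o in zip(indicator, outcomes))
    best_hits = max(best_hits, hits)

print(f"Best of {N_INDICATORS} random indicators: {best_hits}/{N_ELECTIONS}")
```

A single honest coin flip averages 19 of 38; the dredged "winner" typically scores in the high twenties. It looks impressive, but the apparent skill is entirely an artifact of the search.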

Applying This to Investing

What Nate Silver is saying about this type of analysis is that the modelers have cheated -- perhaps inadvertently.

You can easily find predictions from big-time economists and fund managers who think that the odds of a recession are now 100%. (Let us assume that they stipulate a time frame of a year or so.) Silver's rule tells you to say NO! to this. The apparent certainty of these methods comes from the omniscience of "knowing all of the hands."

Your first reaction should be the same as Silver's when he saw the 100% Obama forecast. How can anyone be 100% sure of a recession? One of the perpetrators of the investment mythology had a 100% recession forecast last year. It didn't work, but he is back with another one this year featuring new variables! Doesn't anyone monitor this stuff? This guy is on financial TV with fawning anchors explaining how he has been "right."

To emphasize, if you take hundreds of variables, selectively choose the ones that fit your thesis, throw out those that do not, and then adjust the levels to fit the forecast -- well -- you can prove anything.
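The recipe above also explains why these models fail going forward. Here is an illustrative simulation (the variables and sample sizes are invented for the sketch, assuming coin-flip outcomes and pure-noise indicators): pick the variable that best "predicts" past recessions, then test it on new data it never saw.

```python
import random

random.seed(7)

PAST, FUTURE, N_VARS = 40, 40, 200

# Coin flips standing in for recession / no-recession periods.
past_recessions = [random.randint(0, 1) for _ in range(PAST)]
future_recessions = [random.randint(0, 1) for _ in range(FUTURE)]

# Each "variable" is noise over the full history; none has real skill.
variables = [[random.randint(0, 1) for _ in range(PAST + FUTURE)]
             for _ in range(N_VARS)]

def accuracy(signal, actual):
    return sum(s == a for s, a in zip(signal, actual)) / len(actual)

# The cheat: keep only the variable with the best fit to the past.
best = max(variables, key=lambda v: accuracy(v[:PAST], past_recessions))

print(f"In-sample accuracy:     {accuracy(best[:PAST], past_recessions):.0%}")
print(f"Out-of-sample accuracy: {accuracy(best[PAST:], future_recessions):.0%}")
```

In sample, the selected variable looks like forecasting genius; out of sample, it reverts to roughly a coin flip. That is the difference between fitting the past and predicting the future.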

An Honest Approach

Honest analysis is more difficult and it does not sell as well. Even those of us who have a long-term record of significant edge over the market averages experience plenty of fluctuation. About five years ago I wrote this true story about the importance of honest research methods. I encourage you to read it and enjoy a chuckle at my expense.

Meanwhile, you will appreciate the need to start with the hypothesis and then test. Those who start by looking at all of the hands and then tell you that the data proves they are right did not take the right classes, even if they did get a PhD.