Please Note: Blog posts are not selected, edited or screened by Seeking Alpha editors.

Does The Feuerstein-Ratain Rule Predict Failure For The Aldoxorubicin Phase 3 Study?

Includes: CytRx Corporation (CYTR)

CytRx (NASDAQ:CYTR) is neither a large cap nor a small cap developmental stage biotech company--it is a micro cap biotech company with a candidate that has progressed to phase 3 (P3) development. Generally I am averse to investing in such companies, especially when they are close to reporting topline P3 data for an oncology product. The reason is that, over the last 16 years, the market has been pretty efficient at assigning low market caps ("MC") to companies whose P3 oncology trials were advanced on spurious grounds. In fact, CYTR is the first exception I've made.

A rule known as the Feuerstein-Ratain (F-R) rule exemplifies just this sort of efficient market hypothesis. However, though highly predictive thus far (CPXX the most prominent exception), the sample being used to validate the F-R rule is admittedly "small."

...we calculated the market capitalization (ie, the total shares outstanding times the price per share) for each company at 120 days before each of the public announcements using data derived from Supplementary Tables 1 and 2 in Rothenstein et al. and publically available information. This analysis demonstrated a remarkable difference between companies that had positive and negative announcements.

Specifically, the median market capitalization was approximately 80-fold greater for the companies with positive trials vs companies with negative trials ($17.8 billion vs $220 million, P < .001, two-sided Mann-Whitney test). There were no positive trials among the 21 micro-cap companies (ie, companies with less than $300 million market capitalization), whereas 21 of 27 studies reported by the larger companies analyzed (greater than $1 billion capitalization) were positive.

The F-R rule plainly states that no--or, as since amended, "very few"--phase 3 studies will be successful if sponsored or co-sponsored by a micro cap company (< $300mm MC). The passage quoted above covers 48 studies: 27 reported by companies with greater than $1 billion MC and 21 sponsored by micro caps. F-R updated their rule in early 2014, uncovering a further 15 such failures.

Adam Feuerstein stated:

When Dr. Mark Ratain and I came up with the concept for the F-R Rule in 2011, we were limited to analyzing a data set of 59 phase III oncology clinical trials conducted between 2000 and 2009.

For the update, I asked the research staff at BioMedTracker to help me bridge the gap between the initial analysis and the present by compiling a new list of phase III oncology studies conducted from 2009 through February 2014. BioMedTracker delivered to me a list of 72 oncology phase III trials. [These were the findings]:

Companies with a MC > $1B:

- 37 pivotal P3 trials
- 20 were positive (54%)
- 17 were negative (46%)

Small cap companies (MC $300mm - $999mm):

- 11 pivotal P3 trials
- 6 were positive (55%)
- 5 were negative (45%)

Micro cap companies (MC < $300mm):

- 15 pivotal P3 trials
- 0 were positive
- 15 were negative

(9 trials were not included in the analysis because the sponsor was either privately held or its MC could not be determined.)
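The tallies quoted above can be rechecked with a few lines of Python. This is only a minimal sketch: the counts are copied from the list, while the names `TIERS`, `EXCLUDED`, `success_rate` and `total_trials` are my own illustrative choices, not anything from the F-R analysis.

```python
# Counts as quoted in the update: (positive, negative) per market cap tier.
# All names here are illustrative, not taken from the original analysis.
TIERS = {
    "large (MC > $1B)": (20, 17),
    "small (MC $300mm - $999mm)": (6, 5),
    "micro (MC < $300mm)": (0, 15),
}
EXCLUDED = 9  # privately held, or market cap could not be determined


def success_rate(positive: int, negative: int) -> float:
    """Fraction of trials in a tier that reported a positive outcome."""
    return positive / (positive + negative)


def total_trials(tiers: dict, excluded: int) -> int:
    """All trials on the list, including those left out of the analysis."""
    return sum(pos + neg for pos, neg in tiers.values()) + excluded


if __name__ == "__main__":
    for name, (pos, neg) in TIERS.items():
        rate = success_rate(pos, neg)
        print(f"{name}: {pos}/{pos + neg} positive ({rate:.0%})")
    print("total trials on list:", total_trials(TIERS, EXCLUDED))
```

The three tiers plus the nine excluded trials add up to the 72 trials BioMedTracker delivered, so the list is at least internally consistent.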

He concluded:

We're beyond calling the F-R Rule a fluke finding at this point. Combined, the two data sets encompass 112 oncology phase III clinical trials from 2000 to February 2014. Of these 112 trials, 36 were conducted by companies with market caps of $300 million or less, measured four months prior to results being announced. NONE of these 36 trials reported a positive outcome.

There is a real and noticeable phenomenon here. However, any time we find a correlation there is also the potential to wrongfully assume causation.

Journalists are constantly being reminded that "correlation doesn't imply causation;" yet, conflating the two remains one of the most common errors in news reporting on scientific and health-related studies. In theory, these are easy to distinguish--an action or occurrence can cause another (such as smoking causes lung cancer), or it can correlate with another (such as smoking is correlated with high alcohol consumption). If one action causes another, then they are most certainly correlated. But just because two things occur together does not mean that one caused the other, even if it seems to make sense.

Have Adam and Dr. Ratain identified a correlation? Yes, it seems they have. But have they identified the cause? Does a sub-$300mm MC, four months before topline data are reported, cause a phase 3 study to fail? No, of course not--that should be obvious. So what is the true cause of failure? Adam sought to provide a reasonable explanation as follows:

The internal discipline at larger companies that kill off weaker cancer drug candidates before they reach phase III studies are totally absent in micro and small-cap companies.

All drugs are "stars" at small companies. R&D budgets are tighter, which steers these companies towards running smaller phase II studies -- most non-randomized and uncontrolled -- which too often deliver clinical data skewed positive. The positive signals seen in phase II studies cannot be replicated in larger, better-designed phase III studies.

The management teams of small cancer drug companies are incentivized to ignore signals their drugs may not work and just push ahead with new trials.

Larger companies certainly don't have perfect cancer drug track records, but they do have fatter pipelines, which forces early-stage drugs to compete for R&D dollars. The bar for moving a drug into phase III studies is higher at larger companies, which is why their success rates are higher.

The market also incentivizes larger companies to be more disciplined with R&D expenses. Wasted R&D steals from earnings -- something investors do not like.

And so in essence what he is suggesting is that the market, being a collective of minds with experience and insight, tends to come to accurate conclusions about the potential success or failure of a company or drug candidate. This is reflected (in large part) in the MC.

But when has the market ever been wrong? The answer is: many a time. It doesn't take long to come up with any number of examples. AAPL, BRK.A, GILD, and CPXX are all stocks that were undervalued, or severely undervalued, at one point or another, even when the information needed to decide otherwise had already been disseminated. The market is much better at accurately valuing what it can see clearly than it is at predicting anything. In fact, the market seems at times completely incompetent at predicting. Those times that stand out in my mind in which it has tried were often failures--such as the tech bubble. The market, like us, because it is us, is either hopeful or pessimistic prior to everything being "out on the table," and often expresses exaggerations of either predisposition.

Is there an alternative explanation for why we are seeing this supposed correlation (< $300mm MC = 0% oncology P3 success)? It is my opinion that Adam mostly has it right. The market has predicted it very accurately here. Almost shockingly so. Although I think he overextends when he also claims it cannot be a fluke, now that the sample has grown to a massive n=36... (I jest). Instead, we are likely seeing an inflation of probability. In other words, variance.

There have been hundreds of P3 oncology studies sponsored by big pharma during the time those 36 micro cap sponsored P3 studies reported topline (13 years). The meta-analyses are skewed in this respect. Big pharma P3 studies are vastly under-represented in the sample. It would be very easy to find 50 or 100 such BP failures during those years. Were there stretches of time in which BP recorded 10, 15, 20 failures in a row? Certainly. Especially (and this is key) if futility halts were included in the list of fails.

Most larger companies have futility stopping boundaries included in their study designs. This of course saves them money and time put to better use in developing therapies that have a greater chance of success. Micro caps often forgo the futility analysis, or make the stopping boundaries that may trigger a halt for futility abnormally high (or make them "non-binding"), because, as Adam points out, they have their whole companies essentially riding on the result. It's unpalatable for them to potentially halt a study for futility before its completion, even if the chance of success after an interim review is extremely small.

So when BP reports topline data, most of these studies have already passed one or multiple futility looks. You actually don't even hear about many of the ones that didn't pass, as it is not a material event for them. The F-R rule is making an unequal comparison by not factoring in this variable. It's a classic example of ascertainment bias. But even if it did avoid such bias, the sample is too small, from a statistical point of view, to support a reliable conclusion.

The F-R rule is an artifact of consequence. What we are seeing with the rule is an observation made in a window of time. It would likely take 1,000 or more studies to begin to have a grasp on the probability inherent to this exercise, and only if all other variables could be controlled (say with flipping an unbiased coin). In other words, with all other things being equal.

Of course, by the time 1,000 micro cap sponsored studies were conducted, everything in the "environment" will have changed; there is no way to control the variability that may impact data in this exercise. The extraneous circumstances surrounding micro cap bios running P3 studies from 2000-2014 will never be repeated--at least not in exactly the same way. Therefore the high predicting power of the F-R rule at present is limited, and may even be imaginary. Perhaps the "true" probability is closer to 65% - 80%. As with flipping a coin, anything can happen in a short window of time (12 heads vs 2 tails) that will not accurately reflect the true probability of outcome. I say "short" relative to the time it would take to have a statistically meaningful sample.
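To put a number on the coin example, here is a minimal sketch of the binomial arithmetic. It assumes the "12 heads vs 2 tails" case means 14 independent flips of a fair coin; the function names are my own.

```python
from math import comb


def prob_exact_heads(heads: int, flips: int, p: float = 0.5) -> float:
    """Probability of exactly `heads` heads in `flips` independent flips."""
    return comb(flips, heads) * p**heads * (1 - p) ** (flips - heads)


def prob_at_least_heads(heads: int, flips: int, p: float = 0.5) -> float:
    """Probability of `heads` or more heads in `flips` independent flips."""
    return sum(prob_exact_heads(k, flips, p) for k in range(heads, flips + 1))


if __name__ == "__main__":
    # Assumed reading of the example: 12 heads and 2 tails in 14 fair flips.
    print(f"exactly 12 of 14:  {prob_exact_heads(12, 14):.4%}")
    print(f"at least 12 of 14: {prob_at_least_heads(12, 14):.4%}")
```

For a fair coin the exact-12 case is 91 of the 16,384 equally likely sequences, a bit over half a percent. That is rare for any single run of 14 flips, but it is the kind of streak that will turn up if enough runs are observed.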

So the question naturally arises, how reliable is an observation noted from a small sample in a "short" window of time in a vastly dynamic environment? I would posit that the reliability of such an observation continually repeating, and to what degree, is largely unknown.

Not that the F-R rule was the intention of the original analyses anyway, which I'll get into in a moment. But CPXX recently broke the rule at a very small MC (approx $64mm from 120 days to 1 day before topline). The market then awarded CPXX only a $350mm MC after topline was reported, and the company was acquired a few months later for $1.5B--an incredible and utter "misread" by the market. CTIC also broke the rule not long before that. It is therefore possible that the true predicting power of the F-R rule is much less than the 95% accuracy it currently touts. In fact, if we took only the last 4 micro cap sponsored P3 studies that reported topline data, and compared that success rate with big pharma's last 4, the micro caps would be doing better. I'm making a faulty conclusion, of course, based on a very small sample--but then probably so does the F-R rule.

The rule began as a retrospective analysis of a meta-analysis, one that sought to find trends in stock prices before positive data or regulatory approval. It was a follow-on study, inspired by this one:

We examined stock prices of biotechnology products before and after announcement of Phase III clinical trial and Food and Drug Administration (FDA) Advisory Panel results for indirect evidence of insider trading.

Biotechnology stock prices were recorded for 98 products undergoing Phase III clinical trials and 49 products undergoing FDA Advisory Panel review between 1990 and 1998. Prices were recorded for 120 consecutive trading days before and after public announcement of these two events. We compared the average change in stock price of successful products ('winners') with unsuccessful products ('losers') before the public announcement of results for both critical events.

The difference between average stock price change from 120 to 3 days before public announcement of results of Phase III clinical trial winners (+27%) and losers (-4%) was highly significant (P = 0.0007). A similar but non-significant difference was observed between the average stock price of winning (+27%) and losing products (+13%) before FDA Advisory Panel review announcements (P = 0.25).

I would be very curious to learn how many < $300mm MC companies were "winners" in the above analysis (pre-2000), if any, that were omitted in F-R's analysis (post-2000). It's an interesting question: how has this rule performed historically, adjusted for inflation?

The above meta-analysis shows a clear trend from 120 days to 3 days before topline data were divulged, wherein the stock price of companies that reported successful outcomes trended up in the 4 months leading to data, while those with unsuccessful outcomes trended down. This led the authors to suspect leaked data from one source or another (CRO perhaps, or someone with access to confidential patient data at a site that had enrolled a significant number of patients, MOB, etc.).

The follow-on meta-analysis examined these findings further, and also found an up-trend in stock price from companies that reported positive or negative data from 120 days in, though not significant (p=0.09). They also found a stronger and statistically significant uptrend from 60 days in (p=0.03).

Adam and Dr. Ratain made a retrospective discovery from the second of the two above analyses. A significant discovery, in my opinion, though the meta-analyses were not designed to determine such a thing. Have there really only been 36 phase 3 oncology studies from micro cap companies that reported topline data between 2000-2014? Nowhere do the authors make the claim that the list is exhaustive. Also, no claims are made on micro cap bio companies prior to 2000. And it appears based on the first of the above two meta-analyses that there may have been a few of them among the "winners."

So although helpful as a guide, it appears the F-R rule is lacking on several fundamental points. For one, the sample is small (n=36) and confined to a "short" window of time (2000-2014). Hence, and due to variance, the true probability behind this rule has not yet been determined. Currently it is trending very high (just 2 out of some 40 P3 studies sponsored by micro caps were successful--95% predictive), but it would take a good while longer to establish reliable predictive power. Over the last 4 topline readouts it has only been 50% predictive (with micro caps doing better than BP).
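One conventional way to express how much a small sample can and cannot say is an exact (Clopper-Pearson) upper confidence bound on the underlying success rate. The sketch below is an illustration only. It assumes every micro cap P3 trial is an independent draw with the same success probability, which is precisely the kind of "all other things being equal" assumption that does not hold in practice. The function names are my own.

```python
from math import comb


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def upper_bound(successes: int, trials: int, alpha: float = 0.05) -> float:
    """One-sided exact upper bound on the true success probability.

    Uses bisection to find the p at which observing `successes` or fewer
    positive trials has probability `alpha`. Assumes independent trials
    that all share one success probability (a strong simplification).
    """
    if successes >= trials:
        return 1.0
    lo, hi = 0.0, 1.0
    for _ in range(100):
        mid = (lo + hi) / 2
        if binom_cdf(successes, trials, mid) > alpha:
            lo = mid  # data still plausible at this p, so the bound is higher
        else:
            hi = mid
    return (lo + hi) / 2


if __name__ == "__main__":
    print(f"0 of 36 positive: true rate below {upper_bound(0, 36):.1%}")
    print(f"2 of 40 positive: true rate below {upper_bound(2, 40):.1%}")
```

With zero successes the bound has the closed form 1 - 0.05^(1/36), or about 8%, and the bisection result can be checked against that. Under these idealized assumptions, then, 0-for-36 does not establish a success rate of zero; it only rules out rates above roughly 8% at the 95% level, and the bound widens once the two exceptions are counted.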

As an example of variance, let's say you rolled a die 36 times and never saw a "1." It would be very rare (about 1 in 700), but it is possible--and given enough rounds, will almost certainly happen eventually. If that were to occur on a given round of 36 rolls, you might come to conclude, "The '1' very, very rarely ever comes--maybe it never does." And let's say you also noted that the "6" came up 15/36 times (42%), and further concluded the "6" comes up more often than any other number. Of course all of that would be wrong--you were only experiencing variance. The true probabilities of rolling a "1" or a "6" on a six-sided die are equal at about 16.7%. After rolling the die 1,000--or better yet, 10,000 times, you would see a much more balanced distribution of outcomes, with each number coming up closer and closer to 16.7% of the time.
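The dice example is easy to check directly. The chance of never seeing a "1" in 36 rolls of a fair die is (5/6)^36, which works out to roughly 0.14%, or about 1 in 700. The sketch below computes that figure and simulates longer runs to show the frequencies settling toward 1/6. The function names and the fixed seed are my own choices for illustration.

```python
import random
from collections import Counter


def prob_face_never_appears(rolls: int, sides: int = 6) -> float:
    """Chance that one given face never shows up in `rolls` fair rolls."""
    return ((sides - 1) / sides) ** rolls


def face_frequencies(rolls: int, sides: int = 6, seed: int = 0) -> dict:
    """Simulate `rolls` fair rolls and return the observed share per face."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    counts = Counter(rng.randint(1, sides) for _ in range(rolls))
    return {face: counts[face] / rolls for face in range(1, sides + 1)}


if __name__ == "__main__":
    p = prob_face_never_appears(36)
    print(f"no '1' in 36 rolls: {p:.4%} (about 1 in {round(1 / p)})")
    for n in (36, 1_000, 10_000, 100_000):
        freqs = face_frequencies(n)
        shares = ", ".join(f"{f:.1%}" for f in freqs.values())
        print(f"{n:>7} rolls: {shares}")
```

In short runs the per-face shares can stray far from 16.7%, and they tighten as the number of rolls grows. Unlike a die, of course, clinical trials have no fixed underlying probability to converge to.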

What I see as the main issue with the F-R rule is that there is no underlying law of nature to force the probability to abide by, with calculated variance. An "odds-on" scenario that '95% of P3 oncology studies sponsored by a micro cap will fail' (currently, until it goes down even lower), cannot strictly apply for fundamental reasons. This is not a vacuum in which probability can be determined simply by dividing 1 by the number of potential outcomes. It just isn't that simple. The 'magic' number "$300mm" will actually not tell you anything.

There are some things we do know: a) the market will be wrong at predicting outcomes, at least a significant % of the time; b) the F-R rule has been broken 2 out of 40 known times, and interestingly 2 out of the last 4; c) the true predictive power of the F-R rule is unknown due to variance, variability of external stimuli, and small sample size; d) in the case of the F-R rule, correlation does not imply causation; e) the F-R rule does not consider futility halts from larger market cap sponsors that go unreported, which introduces ascertainment bias; and lastly, f) a more complete analysis should be conducted going back to 1980 (adjusted for inflation), with verification that every single oncology P3 study sponsored by a micro cap was covered, with none left out of the analysis. Although even this much larger data set would be lacking--past performance does not guarantee future outcome.

The F-R rule is certainly not something that can be ignored, however, and should be weighed against any long thesis--but it is no law, and its 95% predictive power (currently) could be, and probably is, overblown. It is still very much a hypothesis, or at best a general maxim or 'trend' that should allow for exceptions. And in my opinion it places too much confidence in the market's ability to predict, something that has been spurious in the past.