ACADIA Pharmaceuticals Inc. (ACAD) recently announced successful top-line results from its Phase III trial evaluating pimavanserin in patients with Parkinson's disease psychosis (PDP). The company claims that pimavanserin met the primary endpoint in the Phase III trial by showing highly significant (p = 0.001) antipsychotic efficacy as measured using the 9-item SAPS-PD scale. Furthermore, Acadia claims that the drug also met a key secondary endpoint for motoric tolerability as measured using Parts II and III of the Unified Parkinson's Disease Rating Scale (UPDRS). These positive results were supported by a highly significant improvement on the Clinical Global Impression Improvement, or CGI-I, scale (p = 0.001). In sum, Acadia Pharmaceuticals is arguing on the back of this Phase III trial that pimavanserin is both a safe and effective treatment for PDP patients.
These top-line results caused investors to cheer, and ACAD shares more than doubled in a single day upon the release of the much-anticipated data. Even though the results appear stellar at first glance, a deeper reading of the press release reveals some problems with the statistical treatment of the data, and hence with the efficacy claims for pimavanserin. I explore those concerns in this article.
I know such a statement will be controversial, and attempts will be made to misconstrue this article as a hit piece. As such, I would like to make some initial disclosures. First and foremost, I have no position in ACAD, and no plans to ever take one. I follow ACAD for scientific purposes only, and yes, there are people in the world interested in science, not money. Second, my view of the company's clinical trial of pimavanserin is that of a physiologist, not a psychiatrist or behavioral biologist. This is a critical distinction to note. Most of my criticisms can be applied to a wide array of similar behavioral studies, but I firmly believe most physiologists will share my view of the statistical treatment carried out by Acadia in its Phase III trial. Why does this matter? Because Acadia's ultimate goal is to get pimavanserin approved by the FDA, and to do so, the drug will be reviewed by a panel of medical experts with a diversity of backgrounds, not a group of psychiatrists. So it is my belief that the current data analysis could be problematic during a formal FDA review. To explain my reasoning to a general audience, I will give a brief introduction to variable types, and then explain why Acadia chose an inappropriate test for the type of data collected in its Phase III trial.
Variable types and why they matter in statistical analysis
There are two broad kinds of measurement variables: qualitative (discrete) and quantitative (continuous). In their Phase III trial of pimavanserin, Acadia researchers scored patients on what are known as "summated psychological scales." Psychological scales generate variables that have two or more categories that can be ordered or ranked, and variables of this type are commonly known as "ordinal" variables. To be clear, ordinal variables are qualitative in nature. Even so, psychiatrists frequently assume an interval structure underlying the scale, and thus treat scale data as continuous. Scientists with a formal biostatistics background often view this transformation of qualitative into quantitative data as controversial. The reason is that one is making a major assumption about the underlying structure of the data without direct evidence.
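To make the interval assumption above concrete, here is a minimal Python sketch. The ratings, group names, and both numeric codings are invented for illustration and have nothing to do with Acadia's trial data. It shows that a comparison of means can change sign when the categories are relabeled in an order-preserving way, while a comparison of medians does not.

```python
from statistics import mean, median

# Hypothetical ordinal ratings on a 5-level scale (0 = none ... 4 = severe).
# These values are invented for illustration; they are NOT trial data.
group_a = [0, 0, 0, 0, 4]
group_b = [1, 1, 1, 1, 1]

# Two numeric codings of the same five ordered categories.
# Both preserve the ranking; they differ only in the assumed spacing.
equal_spacing = {0: 0, 1: 1, 2: 2, 3: 3, 4: 4}
stretched_top = {0: 0, 1: 1, 2: 2, 3: 3, 4: 10}


def recode(scores, coding):
    """Relabel each category using an order-preserving coding."""
    return [coding[s] for s in scores]


def mean_difference(a, b, coding):
    """Difference in means (a minus b); depends on the assumed spacing."""
    return mean(recode(a, coding)) - mean(recode(b, coding))


def median_difference(a, b, coding):
    """Difference in medians (a minus b) under the given coding."""
    return median(recode(a, coding)) - median(recode(b, coding))


for name, coding in [("equal spacing", equal_spacing),
                     ("stretched top", stretched_top)]:
    print(name,
          "| mean difference:", round(mean_difference(group_a, group_b, coding), 2),
          "| median difference:", median_difference(group_a, group_b, coding))
```

With these made-up numbers, group A has the lower mean under equal spacing but the higher mean once the top category is stretched, whereas the median comparison points the same way under both codings. That is the sense in which a mean-based analysis leans on the assumed spacing between categories.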
To remedy this situation, a number of behavioral studies have integrated physiological measurements, and directly tested this assumption prior to a formal statistical treatment. However, Acadia did not incorporate any physiological data (hormone levels, neuroimaging, etc.) into its study. Without a strong correlation between the scale rankings and a physiological parameter directly associated with the disease, there is no way to know that the scale indeed has the assumed interval structure. While it could be argued that the trial design itself (i.e., double-blind, placebo-controlled) should take care of this concern, I am suspicious of this claim because of my personal experience performing behavioral trials relying on subjective categorical rankings (i.e., they turned out to be purely manmade superimposed constructs, not real biological phenomena). Furthermore, the highly skewed distributions that these types of studies generally yield belie an assumption of continuity. For these reasons, I believe Acadia should have taken a cautious approach to its statistical analyses, and treated ordinal data as, well, ordinal data.
Acadia's statistical analysis and possible alternatives
Nevertheless, Acadia decided to take a controversial approach and treat its scale data as a continuous variable in a mixed-model repeated-measures (MMRM) analysis, resulting in highly significant p-values for all endpoints. From a more conservative viewpoint, Acadia could have used any number of cluster-specific logit models. For example, cumulative logit modeling would have been an excellent choice, and would be hard to argue with from a statistical standpoint. An Information Theory approach or Bayesian analysis would also have been much more appropriate and less controversial. From what I've seen of the experimental design, a Directed-Graphical Analysis employing a Bayesian network would have been particularly appropriate. All of these statistical treatments would be more rigorous than treating the scale data as continuous in an MMRM. In this way, Acadia could have done away with making unsupported assumptions about the data structure.
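For readers unfamiliar with cumulative logits, the following Python sketch shows the quantity such a model works with. It is only a descriptive calculation on invented category counts (the category labels and the counts are hypothetical, not trial data), and it does not fit a model or handle repeated measures. A real analysis would estimate these terms with dedicated statistical software.

```python
import math

# Hypothetical counts of patients in each ordered response category,
# listed from worst to best. Invented for illustration; NOT trial data.
categories = ["worse", "no change", "improved", "much improved"]
placebo_counts = [20, 40, 30, 10]
drug_counts = [10, 30, 40, 20]


def cumulative_logits(counts):
    """Return log-odds of P(Y <= category j) for every cut point but the last.

    The last cut point is skipped because P(Y <= top category) is always 1.
    Only the ORDER of the categories is used, never a numeric spacing.
    """
    total = sum(counts)
    running = 0
    logits = []
    for count in counts[:-1]:
        running += count
        p = running / total
        logits.append(math.log(p / (1 - p)))
    return logits


placebo_logits = cumulative_logits(placebo_counts)
drug_logits = cumulative_logits(drug_counts)

# A proportional-odds (cumulative logit) model assumes the group effect is
# the same shift in log-odds at every cut point; here we just inspect it.
for label, lp, ld in zip(categories, placebo_logits, drug_logits):
    print(f"P(Y <= {label}): placebo {lp:+.2f}, drug {ld:+.2f}, "
          f"difference {ld - lp:+.2f}")
```

The counts were chosen so that the drug-minus-placebo difference comes out the same at all three cut points, which is the pattern a proportional-odds model assumes. With real data the differences would vary, and checking how much they vary is part of judging whether the model is appropriate.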
So this leaves me with a question: Did Acadia perform more robust statistical analyses, but fail to achieve the level of statistical significance it was seeking? We have a saying in science: "if you don't like the result, change the statistical analysis." I have no idea why Acadia chose the controversial statistical treatment it did, although I suspect differences between disciplines play a major role. In fact, I would bet dollars to donuts that psychiatrists do not find the analysis controversial in the least. All I know is that I personally do not believe the results the company announced on November 27, 2012, regarding the safety and efficacy of pimavanserin, and this is how physiologists on an FDA panel will likely react to the study.
Even so, I am hopeful the company will take steps to bolster the broader scientific community's confidence by publishing the p-values from more conservative statistical approaches. And for the sake of patients suffering from PDP, I sincerely hope my concerns are only academic in nature.
In conclusion, my intent with this article is to inform the Acadia investor base about alternative viewpoints on the company's recent Phase III results. Not everyone was bowled over, and it's important to hear an alternative voice when making investing decisions. Remember Charlie Munger's famous mantra: invert, always invert! As a physiologist, I always want to know that a subjective behavioral rank has a physiological underpinning, and is thus a "real" biological phenomenon. Otherwise, a behavioral categorical ranking could all too easily be a subjective construct of the human mind.