Kevin Gorman - President and CEO
Neurocrine Biosciences, Inc. (NBIX) Morgan Stanley Healthcare Conference September 10, 2013 3:30 PM ET
Yesterday, I am sure a lot of people have seen it maybe not everyone. Can you just walk us through what you’ve …
Thank you very much, and thanks to Morgan Stanley for the opportunity to be here today. First off I’d like to say that I’ll be making forward-looking statements, so I direct you to our recent SEC filings on the risks with the company. So we have a drug that’s in Phase II clinical trials for tardive dyskinesia. That’s the movement disorder that's caused by antipsychotic drugs. Data from the first of the two Phase IIb trials that we have running was rolled out yesterday at the close of market, and we got a few surprises in that trial, some pleasant and some not so pleasant.
To start out, we missed the primary efficacy endpoint in the trial. We have two doses in there, a 50 milligram dose and a 100 milligram dose. We expected 50 milligrams to be a very efficacious dose and well tolerated. We put in the 100 milligram dose because we thought certainly we’d see efficacy with it, but we thought that would be an intolerable dose. The data that we had at the time, normal healthy volunteer safety data, told us that we were going to have a lot of safety side effects with that.
As it turned out, the 100 milligram dose was extremely well tolerated in the study. The 50 milligram dose did not show its efficacy in the trial, and that is a bit different from a small Phase IIa crossover study that we had reported out last year, where the 50 milligram dose did show efficacy. What was confusing to us is that not only did the 50 milligram dose not show any efficacy, the 100 milligram dose didn’t show efficacy when using the scores on the primary endpoint, which is the Abnormal Involuntary Movement Scale.
Prospectively in the design of the trial we had said that the primary endpoint is going to be the AIMS score coming from the AIMS raters at each of the 33 sites. It was going to be their AIMS score that gives you the primary endpoint. We had already designed into the trial, as in previous trials, to always have data from a central, blinded video AIMS expert. And when we saw that we basically had no movement at all, whether you’re talking about placebo, 50 or 100, from the onsite AIMS raters that we had trained up over time to do this, we then turned to and ran the data from the blinded AIMS video expert, and there we saw a substantial and robust efficacy signal from the 100 milligram dose. The 50 milligram dose gave you a directional signal but certainly not clinically or statistically significant.
So you have two conflicting AIMS scores here, one from the investigators onsite, who we spent a decent amount of time training on applying the AIMS, and the other from a blinded expert AIMS reader from video. Which one do you believe, which one do you put more confidence in? Well, what you look for is some independent, other signals in the study. We had also incorporated into this study a key secondary endpoint, and that’s the global impression of change in tardive dyskinesia by the treating physician onsite. So at each of the 33 clinical sites you had an AIMS rater who just applied the AIMS test. That could have been a nurse practitioner or PhD applying the AIMS test to the patient.
And then you have the treating physician, the treating psychiatrist onsite. And these two are the top. The treating psychiatrist onsite is responsible for the patient's wellbeing. They apply the global impression of change. So what we did was we took a rigorous look at the CGI-TD that they'd applied and we looked at only those top two scores on that scale of being very much improved or much improved. And here we found that they recognized a large change in the 100 milligram patients. They recognized approximately 36% of their patients had very much improved or much improved. And in the placebo group it was quite low, it was about 7% that they had seen there.
That hung together with what the video AIMS rater was looking at. As a matter of fact, when we looked at those 36% of patients with the very much improved or much improved CGI, so the responders by the treating physician, and went to their blinded video expert scores, those patients have on average a seven-point decrease in AIMS, over a 50% decrease from baseline, highly significant clinically.
We then looked at the PK. We took a PK sample each time the patients came in for an AIMS assessment, at week two, week four and week six. We then looked at PK and checked the blood levels of our drug [indiscernible] and you will find that you have a PK/PD relationship that correlates just fine with the CGI and with the video AIMS, meaning higher blood levels of the drug lead to greater responsiveness on AIMS and on CGI. None of those held together with the onsite AIMS raters.
So we have two Phase II clinical trials now in a row. Last year it was with the small Phase IIa crossover study, where the onsite AIMS raters did not give us useful data and a post-hoc analysis by a blinded video expert did. This time it's not a post-hoc analysis, it’s already in the protocol from the very beginning that there would be video and that there would be onsite, and again despite our best efforts, the onsite AIMS did not work out but the video clearly did. So there is a little bit of a summary here. We took a lot of learnings from last year’s Phase IIa, we applied them to our Phase IIb program. We had the wrong patients, we had too many mild tardive dyskinesia patients in that earlier Phase IIa.
So we put in safeguards such that we only got moderate to severe tardive dyskinesia patients in our Phase IIb and that worked. With this trial the patient population was perfect, we had very severe, very much suffering AIMS, tardive dyskinesia patients in here.
We had a large placebo effect in that Phase IIa. We put in safeguards in order to minimize the placebo effect. That worked also. Our placebo effect was right in the range that we would have predicted, approximately 20% here, and that was very good. We monitored the study very closely for compliance, and that seemed to work out very well too. But despite our best efforts at training up onsite people to apply the AIMS exam, that didn’t work. The takeaway for us, now in a future Phase IIb trial and in the pivotal program, is that we will go for central video AIMS readings going forward.
So maybe we can dig into the data a little bit. There seems to be a little bit of an inconsistency: with the onsite rater there was no dose response, and then when you move to the video rater, the placebo and the 50mg arm changed I think about 1.5 to 1.8 on the AIMS scale. The 100mg arm though seemed to change 3 points on the AIMS scale. So, one, what do you think is going on there, and two, how can you be confident that this signal that you're seeing is real given the variation in the changes there?
A couple of things there on how you can be confident. There are multiple sensitivity tests, basically, that you can subject the data to, and I think one of the best is looking at the cumulative change over a response continuum. Say you decide that you are going to look at people who have anywhere from a 30% decrease in AIMS up to a 70% decrease in AIMS: how do placebo and the 100 milligrams track in that? If they are touching and spreading then you know you don’t have a real signal. If they stay completely spread in that entire range then you can feel better that that looks real, and that’s exactly what we see. It's part of a slide set that is up on our website now. We have tried to provide the investment community with basically all the data that we have from the study, and we’ve put it on the website. So you can look at it and interrogate it for yourself, and you do see that big separation there. And then secondly, as I said before, what are other independent assessments that hang with that data or don’t hang with that data? The CGI hung with it very well. The PK hung with it. The exposure levels of the drug hung with it very well.
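[Editor's illustration] The responder comparison described above can be sketched in a few lines of Python. This is a minimal sketch, not the company's actual analysis: the function names, the 10-point threshold step, and every patient value below are assumptions invented for illustration. It computes, for each threshold from a 30% to a 70% decrease in AIMS, the fraction of patients in each arm who reach that decrease, and then checks whether the active arm stays above placebo across the whole range.

```python
from typing import List, Sequence


def percent_decrease(baseline: float, final: float) -> float:
    """Percent decrease in AIMS total score from baseline (positive = improvement).

    Assumes baseline > 0, which holds for moderate to severe patients.
    """
    return 100.0 * (baseline - final) / baseline


def responder_curve(decreases: Sequence[float], thresholds: Sequence[int]) -> List[float]:
    """Fraction of patients whose percent decrease meets or exceeds each threshold."""
    n = len(decreases)
    return [sum(d >= t for d in decreases) / n for t in thresholds]


def stays_separated(active: Sequence[float], placebo: Sequence[float]) -> bool:
    """True if the active curve is strictly above placebo at every threshold."""
    return all(a > p for a, p in zip(active, placebo))


# Hypothetical percent decreases per patient, invented purely for illustration.
placebo = [0.0, 10.0, 20.0, 35.0, 5.0]
active = [40.0, 75.0, 10.0, 55.0, 80.0]

thresholds = list(range(30, 71, 10))  # 30, 40, 50, 60, 70 percent
placebo_curve = responder_curve(placebo, thresholds)
active_curve = responder_curve(active, thresholds)

for t, p, a in zip(thresholds, placebo_curve, active_curve):
    print(f">= {t}% decrease: placebo {p:.0%}, active {a:.0%}")
print("separated across the range:", stays_separated(active_curve, placebo_curve))
```

In this sketch, curves that touch or cross at any threshold would make `stays_separated` return False, which corresponds to the "no real signal" case in the answer above.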
So, that gives us confidence and we do believe we found in this space to be -- we found a very efficacious and very well tolerated dose. It happened to be a higher dose than what we thought. It's at the 100, it's not at the 50.
So then how confident are you that two weeks is kind of enough here to gauge safety and efficacy. We know that Tetrabenazine for example had depression. That happened longer term and maybe you're not seeing it and efficacy wise, this kind of drug, does that difference between placebo and drugs tend to stay over time or does it get smaller two weeks?
No, great question and I’m glad we’re talking about safety here. So let’s talk about that. And there is one thing: during our conference call last night we were asked a very good question, and that was that there were two drop-outs in the study due to adverse events. And the question came as to what were those adverse events in the 109-patient population, what were the AEs that caused them to drop out, and do we know what dose level they were in?
We did not have that at our fingertips before the conference call ended. We got it immediately afterwards, and just to be clear on those two adverse events, one of them was in the placebo group and that was a patient who stopped taking placebo, stopped taking all their antipsychotic meds, basically stopped taking all their meds, and two weeks later went into a psychotic episode. That was one of the AEs. It was in placebo.
The second AE that led to discontinuation was in the 100 milligram group, and it was a woman who in that first two weeks got a urinary tract infection, not dose related, and she dropped out. I think it happened fairly shortly after the trial had just begun.
Now, what gives you confidence that you can extrapolate out 100 milligrams in two weeks to longer? You can only extrapolate out so far on certain AEs, on certain intolerabilities. If you’re going to see somnolence, slowness of thinking, akathisia, real restlessness and uncomfortableness, those would be an exaggeration of the pharmacology of the drug. Those you’ll definitely see within the first two weeks. You’re going to see those within the first few days, and what you see at two weeks is pretty much what you’re going to see later on. Parkinsonism, you will see that starting in two weeks. Sometimes it can take a little longer. So that question might be a little bit open there, but we didn’t see any signal and we tested for that. We have a proactive test in the study for Parkinsonism.
When you start going up and saying things like a real over suppression of VMAT2, so that even though you are going to initially be suppressing dopamine, eventually at high enough doses you’re going to start to suppress the other monoamines such as norepinephrine, serotonin, histamine, those are where we would think okay, now you might start seeing a depression signal. That you’re going to have to look farther out for. That’s just by definition, you have to have about three months of dosing to see that. So, that is an open question at this point.
The only thing I would say is with the 100 milligrams for two weeks, we didn’t see any exaggeration of pharmacology on dopamine. So the only thing I could say now is I wouldn’t think we would be seeing an over suppression of any of the other monoamines at this point, but it’s an open question until you do the longer term trials.
Okay. Then you spoke about the discontinuation due to AEs. Looking at the overall discontinuation rate, if I'm remembering correctly, I think it was -- the placebo and the 50mg arm overall had lower discontinuation rates than the 100mg, even though the last four weeks of the 100mgs were really 50mgs, and I'm just wondering in that 100mg arm with the higher discontinuation rate, did a lot of those happen in the first two weeks and what were those other discontinuations for?
Yes, no you are right that it goes over the period of time. So these are people who were on 100mg for two weeks and then the 50mg for the following four weeks. And the way that it worked out is in the trial it was a 50-50 randomization. So you have approximately 50 patients who are on placebo and 50 on drug. There were seven discontinuations in placebo, there were nine discontinuations on drug, three in the 50, six in the 100 milligram, if I am recalling correctly.
Again, one of those in the 100 milligram arm was that UTI that was there. We can take that one off the table. The other five are things that don't have anything to do with tolerability. They moved. They just moved and left town and couldn’t be in the study anymore. They had used cocaine or heroin, therefore they had to be removed from the study. They are not discontinuations that have anything to do with the drug. So not seeing any signal there.
Really overall the discontinuation rate, just overall in this trial was about 35%. You are working with a population of schizophrenics who have severe tardive dyskinesia and they have been this way for 20 years or 30 years. They are polypharmacy, just on the prescribed drugs. It was a very nice low discontinuation rate that we had. We were expecting somewhere around 50, 55 if you would start seeing things.
Does anybody in the audience have any questions?
Can you just walk us through the AIMS score and how could you have such variability between [indiscernible] sites and an independent video site. Can you give us more detail on the video, how that works, how they were blinded? How many independent assessors there were behind the video et cetera?
So it's a good question. The AIMS, the Abnormal Involuntary Movement Scale, it's a zero to four point scale on seven body regions. One at the legs, one at the arms, one at the trunk, and then there are four that all have to do with oral, facial, buccal, tongue. Zero is that there is nothing going on there, one is minimal, two is mild, three is moderate, four is severe. It's a structured neurological exam that's done by the AIMS rater in a room. It takes about 10 to 12 minutes to do. There is a high definition video camera that is being utilized there, and then it's easily uploaded.
When it came to who qualified for the trial, it was a team of video reviewers, experts who made the call, the determination of who was in and who was out, just on a global impression of are they mild, moderate or severe AIMS patients.
Then when it comes to applying the AIMS test on the video scoring as the trial went on, it was a single blinded AIMS expert who did the video scoring of the AIMS. To the heart of your question, that is the one real unknown that we are curious to get to, and we currently are still blinded on an individual patient level, because the study is ongoing. There is a six week open label at 50 milligrams. So we continue to be blinded.
We look forward in the coming few weeks to being unblinded finally on a per-patient basis so that we can try to get to the answer to that question: why were these onsite AIMS raters’ assessments completely insensitive to showing change? There clearly was change. Their treating physician colleague at each site saw profound change. The video expert rater saw change in these patients, and to give you an idea, the patients come in with a real range of AIMS total scores, but they are all moderate to severe suffering. You have some patients whose AIMS score went down on 100 milligrams by 20 points, that's an enormous change. The onsite raters didn't pick that up. We do not understand why that is yet. We will get back to you before year end, when we are completely unblinded, to see if we have more information that comes to bear on that.
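[Editor's illustration] The scoring described above, seven body regions each rated zero to four, can be sketched as follows. This is a minimal sketch under stated assumptions: the function name `aims_total` and the example ratings are hypothetical, and the code only sums the seven region scores as described in the answer; it is not the trial's scoring software.

```python
# Severity anchors as described in the answer above.
SEVERITY = {0: "none", 1: "minimal", 2: "mild", 3: "moderate", 4: "severe"}
NUM_REGIONS = 7  # seven body regions, each rated 0 to 4


def aims_total(region_scores):
    """Sum the seven region ratings into a total score (range 0 to 28)."""
    if len(region_scores) != NUM_REGIONS:
        raise ValueError(f"expected {NUM_REGIONS} region scores, got {len(region_scores)}")
    for score in region_scores:
        if score not in SEVERITY:
            raise ValueError(f"each rating must be an integer from 0 to 4, got {score!r}")
    return sum(region_scores)


# Hypothetical ratings for one patient, invented purely for illustration.
example = [3, 2, 2, 3, 2, 1, 2]
print(aims_total(example), [SEVERITY[s] for s in example])
```

With seven regions capped at four points each, the maximum total is 28, which gives a sense of scale for the point changes discussed in this session.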
There's a question in the back.
Did the central rater also do a baseline reading, or did they only look at the kind of treatment reading compared to the site readings of baseline?
No, the video AIMS rating by the external reviewer, the blinded AIMS video rater, is done for each of the visits at which the AIMS is given in the study. So each time that any of the sites are doing an AIMS rating, a video AIMS rating is done as well.
Because the change that you saw at the Central site was based on their baseline reading with Central Research [indiscernible]
That’s exactly right. One of maybe the confusing things here is, as is well known and well studied in our hands and in others, that a video rating does shorten up the scale. So you would expect by that, if two experts, one in the room with the patient and one looking at video, were to rate that same patient, and the person in the room would rate that patient as 15, then on average the video rater, who would be as expert (you don’t have to worry about expertise here), would rate him as a 12 or 13. It is a little flatter. Video is just flatter than being in real life. So you lose some fidelity there. Now you might say, well, if you lose fidelity, if you shortened your dynamic range somewhat, then video shouldn’t be as sensitive as the onsite rater. And I would say, yes, that’s the way it should have worked. But we have now got two Phase II trials in a row where no, that’s not the way that it does work. The video raters have had greater sensitivity and more reliability than the onsite raters. Did that answer your question?
Is the fact that the video assessments, they lose a little bit of the subtlety in movement, is that a problem for the FDA? And what’s their view on using video then?
The FDA actually very much likes central reading, in a number of different situations. And certainly for neurologic diseases, they like central rating. As you’ve seen with psychiatric diseases such as depression, most depression trials have gone to central reading. So across the FDA, by and large central reading is actually very welcome. But you have to bring them Phase II data that justifies your design that utilizes a central reading. And that’s what we plan on doing.
So you’ve got, still another tardive dyskinesia trial ongoing with the next two trials little different in design, testing doses that don’t get to a 100mgs and now that you've seen KINECT-1, what are your expectations going into KINECT-2. If you don’t see a dose response in KINECT-2, what’s the decision from there?
So let’s first describe KINECT-2 and the couple of important differences between KINECT-2 and KINECT-1. KINECT-2 again is moderate to severe suffering tardive dyskinesia patients. KINECT-2 is very different in their underlying disease. All of the KINECT-1 patients had schizophrenia or schizoaffective disorder. All of them during the trial were on a variety of antipsychotic meds.
KINECT-2 is a different patient population, still TD, moderate to severe, but half the patients are schizophrenic or schizoaffective, and the other half of the patients are bipolar or major depressed patients. So two very different psychiatric diseases that have been treated with antipsychotics, which caused the irreversible tardive dyskinesia disease, and virtually all the patients in KINECT-2 who are bipolar and major depressive, they are no longer on an antidepressant. They have been off of them, and the TD persists.
When we started out doing all of this, we said there are as many arguments by experts that the bipolar patients would need a lower dose of our drug, which dampens presynaptic dopamine, as there were arguments that said the bipolar patients would need a higher dose for that presynaptic dampening. So at that time we had thought 50 was going to be that dose. So that’s why we found it on either side of 25 milligrams and 50 milligrams, and it’s a titration study, a titration dose optimization study. So you have placebo and you have active. Everyone on active starts out at 25 milligrams for two weeks. The AIMS rater onsite says I’m still seeing tardive, goes to the treating physician, still seeing tardive, they’re available for up titration. That treating physician will then do a CGI and will also then examine the patient, look for adverse events, how are they tolerating? If they are tolerating it great, we are going to go up. The patient walks out unknown to them that they have been up titrated, and they may have been getting twice as much placebo or they are going up to 50 milligrams.
They come back two weeks later and now the same is done, but now the decision can be different. We leave them there, we bring them down, or they get to go up to 75 milligrams and they go up. At the end of the six weeks, it is all patients regardless of the dose that they’re on, because you’ve optimized the dose through this process for efficacy and safety, against placebo. That is what the primary endpoint is.
You’re right, there is no 100 in this. So we may end up in a situation that we haven’t gone high enough in dose. I would say that for the half of the patients who are schizophrenic or have schizoaffective disorder, the KINECT-1 study has shown us that 100 is where you’re seeing it. We still don’t know. The jury is out, let’s see what the data is. What we have learned is that we can’t depend on onsite AIMS raters. So we are putting in an amendment of the protocol right now to change the primary endpoint from being the onsite rater to having the video rater be the primary endpoint, and that’s fine. We can do that now. That’s closer to do that.
So we’re doing that and we will see what that looks like. Primarily important to us right now is that we see that these doses are safe and well tolerated in the bipolar patients, and then let’s see what efficacy comes out, but primarily that, and then we’re going to get actually for the first time some experience with what titration looks like in these patients. What we are doing independent of that data coming out yet, with the data that we have so far from our KINECT study, is we have begun designing an additional Phase IIb study.
Now, we know that we’ve got one dose that’s very efficacious and well tolerated, it’s the 100 milligram. We know we have plenty of tox coverage, preclinical and clinical, to go higher in dose. So we have a lot of room up there. So we’re designing that. We’re going to push ourselves now in yet another Phase IIb study utilizing 100, at least 150, and then actually go up to 200 milligrams. So we’re in the study design there, and the basic effect of the data that we have today is that we’re delayed one year from our anticipated end of Phase II meeting. We anticipated that would be February of next year, it will be February of 2015.
Okay, we’re almost out of time so I'm going to just sneak in one more here. [Indiscernible] in the middle of 2015, are you confident that 854 is differentiated enough so that tetrabenazine being generic won’t be a problem?
Oh, I think so for multiple reasons. I’ll just get to it. The first is that tetrabenazine has a black box warning for QTc prolongation. It has warnings for suicidality on it. It’s priced even as a generic for the exceptionally tiny ultra-orphan [ph] disease of Huntington's chorea patients, just those that have chorea. No, tetrabenazine is not going to move into that tardive patient population. It’s a restricted formulary access. So I’m confident of that, and if anything what the data told us today is this: one of the fundamental underpinnings of our program has always been that we have a drug that's as efficacious as tetrabenazine, but safer, and now we appear to have one that hits that, but is even much safer than what we even thought.
All right great. Unfortunately, we’re out of time so I’ll stop there. Thanks everyone for joining us and thanks Kevin.
Thank you very much Sarah [ph].