Recently, Exact Sciences (NASDAQ:EXAS) of Madison, WI, released the top-line results from its pivotal Deep-C trial, which was designed to nail down the performance characteristics of its proprietary stool DNA test. The release led to one of the strangest trading days you'll ever see: the news initially dropped the stock to around $7/share pre-market (a drop of around 30%), and the stock then gradually recovered to end the day at $9/share. Some of the drop was driven by inaccurate reporting by analysts like Zarak Khurshid of Wedbush Securities (the same fellow who recently had a $0.5 price target on SQNM), who reported that the sensitivity missed the 50% cutoff for pre-cancer detection mandated by the FDA, when the FDA has no such target for pre-cancers whatsoever.
Since this is a company with one product in the near-term pipeline, it is important to ask: is this product going to be a game-changing cancer screening tool? Currently, colorectal cancer is the number two cancer killer in the US, yet it is also one of the most preventable cancers, since the cancers grow so slowly and can be detected (and removed) with an invasive colonoscopy.
There are two primary problems with colonoscopy as a screening tool, even though its performance is still the gold standard. The first is that it is very expensive, and so represents a lousy choice as a primary screen since most people will screen negative. The second major problem is that fully one-half of the population chooses never to get a colonoscopy over their lifetime, thus compromising the ability to detect the cancers and pre-cancers early, when they are either preventable or treatable. The Exact test, Cologuard, has been years in development and is based upon mutations detected in DNA extracted from the stool. Early validation studies were quite promising, and the company has just finished the largest-ever screening trial of stool DNA in an average-risk population.
So the question is whether the Deep-C clinical trial met its primary and secondary endpoints or not. The trial is the largest yet attempted to assess, as a screening tool, the performance of a test that examines gene mutations and methylation changes known to be associated with colorectal cancer. The trial enrolled over 12,000 subjects, with the primary and secondary endpoints being:
The primary objective is to determine the sensitivity and specificity of the Exact Colorectal Cancer (CRC) screening test for colorectal cancer, using colonoscopy as the reference method. Lesions will be confirmed as malignant by histopathologic examination.
The secondary objective is to compare the performance of the Exact CRC screening test to a commercially available FIT assay, both with respect to cancer and advanced adenoma. Lesions will be confirmed as malignant or precancerous by colonoscopy and histopathology.
Thus, this is the first, truly definitive study of sDNA with comparisons to the two most commonly performed CRC screens, colonoscopy and fecal occult blood testing in the same subjects and stool samples (for the FIT and sDNA).
As judged by articles and blogs one reads in the financial press it is clear that many, if not most, commentators don't understand the science of CRC screening nor do they seem to possess much understanding of medical statistics. By way of a little background, so we can understand whether this test was a game-changer or not, it is helpful to understand the origins of stool DNA (sDNA) testing and its rationale.
The pioneer of this method was Dr. Bert Vogelstein of Johns Hopkins University. He characterized many of the mutations in the pathways that lead from pre-cancerous lesions to colorectal cancer, and is honored by the term "Vogelgram" for the concise diagram of that progression.
Most pre-cancerous adenomas grow very slowly, as stalks, and invade the wall of the colon and rectum only when they finally progress into full-blown CRC. Two facts here are important for understanding the rationale of CRC screening using sDNA. First, the fact that the pre-cancers are slow growing means that there is a good deal of time to detect them before they become too dangerous. Second, the fact that they don't invade the wall of the colon until they become much larger means that tests that detect blood-borne products will inherently have lower sensitivity at the early stages, when you have a better chance of removing the polyps. This was the original, and still scientifically sound, rationale for using feces instead of blood for testing, since stool comes into intimate contact with the cells sloughed off by the adenomas.
Exact Sciences was founded by Stan Lapidus, and developed a large patent estate around isolation of DNA from stool (much of it being the work of their star scientist at the time Anthony Shuber), and from licensing patents for gene markers from Johns Hopkins University and the Mayo Clinic. The chief scientific driver of Exact now is Dr. David Ahlquist of the Mayo Clinic whose lab has been responsible for the discovery of many of the methylation markers used in the Cologuard test. Exact came pretty close to folding in the mid-2000s after all their work failed to yield a commercially viable, medically compelling test. It was then that Kevin Conroy, fresh from his success at developing a molecular diagnostic for human papilloma virus at Third Wave Technologies (bought out by Hologic in June of 2008), was appointed CEO by the board. Conroy focused on commercializing and streamlining development of the test and recruited Dr. Ahlquist to the company. The fruits of these labors have now been tested with the Deep-C trial. In that time the stock price has shot from around $1.4 to its current value of $10.53 (05/15/2013).
To understand whether the top-line results reported represent a truly game-changing medical diagnostic, we have to be able to make sound comparisons with the existing technology. In making such comparisons one has to be careful to compare apples to apples when it comes to the quality and type of clinical trials. There is a field of medical science called meta-analysis where one groups together a large number of clinical trials, and performs statistical analysis on the larger sample size, to obtain results that are more robust than any single trial alone. Paramount in the combination of the trials is the necessity of determining the quality of the trials to be combined, and rejecting those of lower quality. The Deep-C trial was a "gold-standard" trial of the highest quality. Why? It was a large, randomly selected clinical trial of an average-risk population. The test was compared against the gold standard of colonoscopy (which is an imperfect gold standard - more on this later) and was also compared against the current non-invasive standard of fecal immunochemical testing (FIT), which tests for the presence of hemoglobin (from blood that comes from lesions that bleed in the colon), all in the same subjects and stool samples. Further, all the cancers and pre-cancers detected in the colonoscopies were evaluated histologically. The results were run by a CRO and were blinded.
Therefore, to compare the results of the Deep-C trial to other technologies one must examine other, similarly large trials in average-risk populations. Further, one would like to include only trials where results were reported as a function of adenoma size as well as cancer stage, and where those values were determined using colonoscopy. When restricted to these criteria, it turns out there is a small pool of trials to compare to. For instance, a competing technology using blood samples tests for methylation of a single gene, septin-9. The large trial of this test (called the PRESEPT trial) was completed in 2010, and yet the full results, broken down by cancer stage and adenoma size, have never been published.
Therefore, when this test is compared to sDNA, one could just as easily cherry-pick sDNA trials in which the performance is always better than that of the septin-9 test. Dr. Ahlquist published a preliminary comparison of the sDNA test to the septin-9 test, using the same laboratory used by Epigenomics (ARUP Laboratories) to run the septin-9 tests (Clin Gastroenterol Hepatol. 2012 Mar;10(3):272-7). In the direct head-to-head comparison, septin-9 had a sensitivity of 14% for adenomas with a median size of 2cm, whereas sDNA had 82% sensitivity. Note, however, that this was a very small trial with only 49 subjects with CRC and advanced adenomas. The point here is: don't compare results from small trials with results from trials in large, average-risk populations when trying to determine whether Cologuard is a success or a failure. Unfortunately, we are still waiting to hear the full results for the septin-9 test.
The appropriate trial to compare to Deep-C for FIT is the one conducted by Morikawa et al. (Gastroenterology. 2005 Aug;129(2):422-8.). This is by far the largest trial of FIT (over 21,000 subjects) in an average-risk population with validation by colonoscopy. That trial detected advanced adenomas with a sensitivity of about 17% (11% in the proximal colon and 24% in the distal colon, whereas Cologuard's sensitivity does not differ by location). For high-grade dysplasia the sensitivity was about 33%. To quote from the conclusions of that article:
Although the screening of asymptomatic patients with immunochemical FOBT can identify patients with colorectal neoplasia to a certain extent, the sensitivity is relatively low and different according to the tumor location. Therefore, programmatic and repeated screening by immunochemical FOBT may be necessary to increase sensitivity for colorectal cancer detection.
The sensitivity of Cologuard does not depend upon the cancer location, which provides an advantage to Cologuard. Before we compare to colonoscopy, let's examine FIT a little more closely, since this is the test that Cologuard would like to replace. FIT has been proven to reduce cancer mortality. It needs to be used programmatically, i.e. regularly, to have maximal benefit. If one uses a per-test sensitivity of 17% and treats repeated tests as independent, the chance that at least one of three tests detects the lesion is 1 - (1 - 0.17)^3, or about 43%. This is what is meant by programmatic sensitivity.
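The programmatic sensitivity arithmetic can be sketched in a few lines of Python. This is only an illustration under the same simplifying assumption used in the text, namely that repeated tests are statistically independent (in practice a lesion that doesn't bleed one year may well not bleed the next, so the real figure could be lower). The function name is my own, not anything from the trial.

```python
def programmatic_sensitivity(per_test_sensitivity: float, n_tests: int) -> float:
    """Chance that at least one of n_tests detects a lesion.

    Assumes every test is an independent trial with the same
    per-test sensitivity (a simplification, not a clinical fact).
    """
    miss_every_time = (1.0 - per_test_sensitivity) ** n_tests
    return 1.0 - miss_every_time


# FIT sensitivity to advanced adenomas from Morikawa et al.: about 17%
for n in range(1, 5):
    print(n, round(programmatic_sensitivity(0.17, n), 3))
# 1 0.17
# 2 0.311
# 3 0.428
# 4 0.525
```

The third line of output is the "about 43%" figure quoted above for three independent tests.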
For Cologuard we don't have the full breakdown of the numbers from the Deep-C trial. For adenomas >1cm the sensitivity was 42% and for adenomas >2cm it was 66%. Thus the overall sensitivity (to compare to the Morikawa trial) will be somewhere between 42% and 66%. Let's use the midpoint, 54%. For FIT to approach one Cologuard test, one therefore has to use four yearly FIT tests (about 53% programmatic sensitivity). Since Cologuard will likely be priced at about $300 and FIT is about $25, FIT would appear to have a cost advantage.
This comparison, however, breaks down when looking at the proximal colon, where FIT sensitivity is only 11% and it would require 6-7 tests to attain the sensitivity of one Cologuard test. Frankly, a sensitivity of 11% is on the borderline of being medically worthless. Obviously, it will require the full data set to perform a true cost-effectiveness analysis of Cologuard, but CEO Kevin Conroy stated in the conference call describing the top-line results of the Deep-C trial that Cologuard beat FIT by a "wide margin." Thus, the release of all the data, stratified, classified, fully analyzed and published in a peer-reviewed article, should provide another catalyst for the stock.
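For readers who want to check the "four tests" and "6-7 tests" figures, here is a rough sketch that inverts the programmatic sensitivity formula. It again assumes independent yearly tests, uses the 54% midpoint assumed above for Cologuard, and takes the approximate $25 FIT price quoted above. The function name is hypothetical, and kit price alone is not a cost-effectiveness analysis: it ignores compliance, follow-up colonoscopies and the years of delay in detection.

```python
import math


def equivalent_tests(per_test_sensitivity: float, target_sensitivity: float) -> float:
    """Number of independent tests (possibly fractional) at which
    1 - (1 - s)^n reaches target_sensitivity."""
    return math.log(1.0 - target_sensitivity) / math.log(1.0 - per_test_sensitivity)


COLOGUARD_TARGET = 0.54  # assumed midpoint of the 42%-66% range
FIT_PRICE = 25           # approximate price per FIT, from the text

for label, fit_sens in [("overall", 0.17), ("proximal colon", 0.11)]:
    n = equivalent_tests(fit_sens, COLOGUARD_TARGET)
    whole_tests = math.ceil(n)
    print(f"{label}: {n:.2f} tests, i.e. {whole_tests} whole tests "
          f"(${whole_tests * FIT_PRICE} in FIT kits)")
```

With these inputs the overall figure comes out a little above four tests and the proximal figure between six and seven, which is where the numbers in the text come from.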
The other comparison one needs to make is with colonoscopy. While it is clear that colonoscopy will remain the gold standard, it has three major problems. First, it is awfully expensive to use as a primary screening tool. Second, it is clear that we have a major compliance issue with colonoscopies. Even though they are recommended for everyone over 50, only about 50% of the population complies over a lifetime, and fully one third of adults over 50 never get screened. The preparation for a colonoscopy requires a day off work to thoroughly cleanse the bowel before going in for the procedure. Numerous studies show the preparation is more of a deterrent than the actual test. Third, recent data indicate that colonoscopy may not be as good as previously thought. For instance, it is now clear that many colonoscopists don't do well at detecting flat, sessile serrated lesions. To quote from an excellent article by the National Cancer Institute:
In 2011, authors of one study reported variability of detection rates for proximal serrated polyps. They studied 15 colonoscopists on faculty at one university and showed, during the years 2000 to 2009, a wide variation in detection rate for proximal serrated polyps, ranging (per colonoscopy) from 0.01 to 0.26, suggesting that many proximal serrated lesions may be missed on routine exam. The overall proportion of polyps that are "serrated" is unknown, in part because these lesions have been unappreciated and/or difficult to identify.
In that same article they also point out evidence that colonoscopies may miss many right-sided lesions, and that this was associated with a substantial difference in mortality outcomes between right-sided and left-sided CRC diagnosed by colonoscopy.
Therefore, it is clear that the "gold standard" may not be so golden. This may have affected the results of the Deep-C trial. Prior validation studies suggested that sDNA was good at picking up these flat lesions, and that there is no left-right, proximal-distal difference in detection rates. Therefore, the somewhat disappointing specificity of the Deep-C trial, which came in at 87% instead of the targeted 90%, may actually be due in part to false negatives in the colonoscopy: a lesion that Cologuard flags but the colonoscopist misses is scored as a Cologuard false positive. This issue is a thorny one and may be difficult to pin down quantitatively in the near term, since it is hard to design truly definitive studies to assess the accuracy of your "gold standard." However, there is enough evidence to suggest that some of the specificity hit in Cologuard's performance may be due to problems with colonoscopy as the gold standard. Another hit in the specificity performance likely came from the inclusion of a large number of older subjects. Age, similar to cancer, leads to increased methylation of DNA. Since a number of the markers in the Cologuard test are methylation markers, they may be subject to some age-related specificity hits. It will be interesting, therefore, to see whether the Deep-C data are broken down by age, and whether specificity decreases in the older cohort.
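To see how an imperfect reference can drag down measured specificity, here is a purely illustrative sketch. The model and its inputs are my own assumptions, not Deep-C data: it supposes a "true" specificity equal to the 90% target, supposes that lesions missed by colonoscopy are picked up by the stool test at the 42% rate quoted earlier for adenomas >1cm, and asks what fraction of colonoscopy-negative subjects would need to harbor a missed lesion to pull the apparent specificity down to 87%. All function and parameter names are hypothetical.

```python
def apparent_specificity(true_specificity: float,
                         sensitivity_to_missed: float,
                         missed_fraction: float) -> float:
    """Specificity as measured against an imperfect reference.

    missed_fraction is the share of reference-negative subjects who
    actually carry a lesion the reference missed. Those subjects test
    negative only when the screening test also misses the lesion.
    """
    truly_clean = (1.0 - missed_fraction) * true_specificity
    missed_by_both = missed_fraction * (1.0 - sensitivity_to_missed)
    return truly_clean + missed_by_both


def missed_fraction_needed(true_specificity: float,
                           sensitivity_to_missed: float,
                           observed_specificity: float) -> float:
    """Solve apparent_specificity(...) == observed_specificity for the missed fraction."""
    drop_per_unit = true_specificity - (1.0 - sensitivity_to_missed)
    return (true_specificity - observed_specificity) / drop_per_unit


# Hypothetical inputs: 90% target, 42% sensitivity, 87% observed
f = missed_fraction_needed(0.90, 0.42, 0.87)
print(f"missed fraction needed: {f:.1%}")
print(f"check: {apparent_specificity(0.90, 0.42, f):.3f}")
```

Under these assumptions, missed lesions alone would have to be present in a sizable share of colonoscopy-negative subjects to explain the whole gap, which fits the point above that other factors such as age-related methylation likely contribute as well.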
To summarize - the Deep-C trial is the largest, most comprehensive assessment of this novel screening tool. Colorectal cancer is the second biggest cancer killer in the US, but is also one of the most preventable since the cancers grow slowly. The ACS, NCI and USPHS all recommend colon cancer screening, with almost any technique being preferable to no screening. The top-line data from the Deep-C trial might not be the "home run" necessary to render colonoscopy obsolete as a screening tool.
However, Cologuard's performance - in terms of sensitivity to adenomas and insensitivity to colon location - combined with its relatively modest price, is certainly good enough to warrant the gradual replacement of FIT. Remember, for FDA approval, it only has to meet non-inferiority to FIT and at least 78% sensitivity to all cancers overall. Cologuard hit 92% sensitivity for cancer overall, and beat FIT by a wide margin. It is almost a certain bet, as well, that the test marker profile can be improved, and that DNA sequencing prices will decrease over time, making it the non-invasive screening tool of choice - especially for the 50% of the population who refuse to get colonoscopies.
Exact Sciences has a large patent portfolio around the markers used (licensed from the Mayo Clinic and Johns Hopkins University) as well as for stool DNA extraction. These patents have been successfully defended in Europe, thus providing the potential for traction in that large, aging population. When this test is approved by the FDA later this year or early next, it isn't hard to predict that it will become the non-invasive screen of choice. Currently, over 10 million fecal blood screening tests are sold annually in the US. A test that performs much better than FIT could reach a large market penetration, since there is some evidence that fecal blood testing is underused because many primary care physicians perceive its performance characteristics as not good enough to warrant its use.
Going by the current demographic profile of the US, it is quite possible that the market could approach an asymptote of 20 million tests/year, assuming a screening interval of three years. The FDA review will also occur concurrently with the national coverage review by CMS to determine reimbursement eligibility and parameters. This will facilitate market penetration when the test is approved. In short, this is a game-changing medical tool that will likely meet its most important goal - to increase the rate of CRC screening with a test that will have excellent programmatic sensitivity to pre-cancers. At the current price, this stock still has a great deal of headroom to run in the next couple of years.
Disclosure: I am long EXAS. I wrote this article myself, and it expresses my own opinions. I am not receiving compensation for it (other than from Seeking Alpha). I have no business relationship with any company whose stock is mentioned in this article.