Seeking Alpha

How Many Natural Products Are Being Found? And How Many Are There?

by: Derek Lowe

Time to talk natural products for a post. There was a paper earlier this year in PNAS that looked at the entire set of microbial and marine natural products reported from 1941 to 2015, looking for trends that might be useful. (Plant-derived compounds were not included, unfortunately, because there was no suitable database as there was for the other classes.) They found that the number of new structures reported went up steeply until the mid-1990s, and then leveled off (but doesn't seem to have come back down). Interestingly, measures of chemical diversity (and the number of structural outliers) also seem to have leveled off - new compounds and compound classes are still being found, apparently, but not at the rate of increase of former years. Say the authors:

Overall, this analysis indicates that the discovery rate of new molecular architectures among natural products has increased since the origins of this field and has remained at a significant rate despite the ever-increasing number of published natural products. However, it should also be noted that an increasing number of the total reported natural products do have structural precedent in the literature, and thus constitute derivative structures. Overall, structurally unique compounds represent a decreasing percentage of the total number of compounds isolated from natural sources. Therefore, if structural novelty is an important and valued component of natural products research, a central question for the field becomes how do we prioritize the discovery of these unique molecules from within this large pool of natural products with known structural scaffolds.

The paper goes on to an optimistic conclusion, though, pointing out that structurally new molecules have continued to be discovered at a roughly constant rate over the last twenty years, even as the field has continued to mature. The problem will be to keep this rate of discovery going, and to find better ways to exploit what's been found. One interesting sidelight is the amount of natural product space that's actually filled by real natural products (that we know of). Just to pick the cyclic tetrapeptides, out of over 40,000 possibilities (just the 20 natural amino acids, and accounting for symmetry), there are only 65 known examples, and those bin into just a few general classes. So there's what would have to be described as "natural-product-like" chemical space that has apparently either not been filled out by nature, or has not been so in a way that we've ever been able to find.
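For anyone who wants to check the "over 40,000" figure, here is a minimal sketch of one way to arrive at it. The assumptions are mine, not the paper's: I'm taking "accounting for symmetry" to mean only that rotating the ring gives the same compound, with the backbone direction fixed and no D-amino acids or other variations. The function names are made up for illustration. The first function uses Burnside's lemma, and the second just enumerates every sequence as a cross-check.

```python
from itertools import product
from math import gcd


def count_cyclic_sequences(alphabet_size: int, ring_size: int) -> int:
    """Count ring sequences, treating rotations of the ring as identical.

    Burnside's lemma: average, over all rotations, the number of
    sequences that each rotation leaves unchanged. A rotation by
    `shift` positions fixes alphabet_size ** gcd(shift, ring_size) of them.
    """
    fixed_total = sum(
        alphabet_size ** gcd(shift, ring_size) for shift in range(ring_size)
    )
    return fixed_total // ring_size


def count_by_enumeration(alphabet_size: int, ring_size: int) -> int:
    """Brute-force cross-check: keep one canonical rotation per ring."""
    canonical = set()
    for seq in product(range(alphabet_size), repeat=ring_size):
        rotations = [seq[i:] + seq[:i] for i in range(ring_size)]
        canonical.add(min(rotations))
    return len(canonical)


if __name__ == "__main__":
    possible = count_cyclic_sequences(20, 4)  # 20 amino acids, 4 positions
    known = 65  # known cyclic tetrapeptides, as quoted above
    print(possible)                   # 40110
    print(f"{known / possible:.2%}")  # 0.16%
```

Under those assumptions the count comes out to 40,110, which matches "over 40,000", and the 65 known examples cover roughly 0.16% of it. A different definition of symmetry would give a different total.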

The paper attracted a published response, which made the case that the effects seen in the number and diversity of natural products didn't have much to do with natural product discovery at all, but were statistical phenomena to be expected with any large data set of this type. The response's authors took the original paper to be saying that the rate of discovery of new products was falling off, and they felt that things were actually in much better shape. The original authors responded in turn, saying that's not at all what they were trying to imply - the percentage of truly novel natural products as compared to the whole is going down, but their rate of discovery isn't. Looking over the three articles, it really seems like the exchange can be boiled down to "The future for natural products is bright!" "No, that's wrong - actually, natural products have a bright future!" "We appreciate your response, but we have to continue to insist on a bright future for natural products!"

Varying interpretations (or misinterpretations) aside, I think that the original paper's points probably stand: as time goes on, the number of novel structures is bound to be a smaller percentage, simply because you're adding to an increasingly large database of things that have already been discovered. At the same time, that doesn't mean that there are fewer novel things being described every year, in absolute numbers. But keeping that line from dipping, which it might well do at some point, will probably require new collection, culture, and isolation techniques. That, though, is just what the field provides incentives for - there's less interest in describing the seventeenth (or seventy-second) member of a well-trodden class of compounds than there is in finding new frameworks, unusual structures, and new biological activities. As long as those are what we're seeking - and we should be - we should be able to keep finding them for some time. When that rate does start to decrease, though, it'll be time to start wondering if the end is in sight or not.

And the whole field of near-natural products should take heart, too. If the arguments here about chemical space are correct, there's a lot of useful work to be done. The counterpoint is that - to use the example of the tetrapeptides - we don't see those products because they don't do much of anything, and have had no evolutionary selection to keep them around for us to find. That's surely true up to a point (they can't all be biologically active or of the same degree of utility), but I find it hard to imagine that that's the whole reason. I think that evolution takes a lot of random blind turns, and those could easily have led to different natural products than the ones we have. We'll be putting that to the test as more compounds get made in such space...