The breakthroughs in Artificial Intelligence powered by deep learning (specifically, deep neural networks) have opened the door to soaring commercial interest and applications. According to Tractica, annual worldwide AI revenue will grow from $643.7 million in 2016 to around $37 billion by 2025, a more than 50% annualized growth rate. The state-of-the-art platforms for deep learning all rely on GPUs: their parallel computing architecture makes them well-suited to the heavy matrix computations involved in training and evaluating neural networks. As the dominant player in the GPU arena, Nvidia (NASDAQ:NVDA) naturally and quickly became the go-to hardware provider for the booming AI market.
The two biggest segments in Nvidia's AI business are data center and automotive. Even though current AI revenue is much smaller than that of the company's core gaming business (in the latest quarter, the data center and automotive segments brought in $240 million and $127 million respectively, versus gaming's $1.2 billion), the growth they have demonstrated is very strong. Nvidia's stock enjoyed a spectacular ride in 2016 with a return of 227%, which clearly reflects investors' high expectations for the future growth of the company's AI business. Nvidia currently trades at about 55 times trailing earnings, dwarfing giant Intel's (NASDAQ:INTC) 17x.
Given the hefty AI premium baked into Nvidia's stock price, this article will try to explain why the author thinks investors should not be too optimistic about Nvidia's AI business. That opinion is based not on potential competition in the GPU domain from rival AMD (NASDAQ:AMD), but on the fact that the GPU itself is far from the optimal hardware for deep learning. An order-of-magnitude improvement is achievable in two critical areas - computing performance and power - with products that have already been deployed or are coming in 2017.
Why GPU has become the go-to hardware for deep learning.
Training deep neural networks (DNNs) is a compute-intensive task dominated by matrix and vector operations, especially when the data set is big, the model is complicated, and parameters require tuning. While CPUs can be used for this purpose, they are designed for general-purpose computing, and their speed falls desperately short of the type and amount of workload involved in deep learning.
GPUs, in contrast, are less flexible hardware specifically designed to handle graphics. Graphics processing happens to be largely defined by linear algebra such as matrix operations, and hence the GPU architecture offers built-in parallel-computing acceleration that efficiently performs the computations deep learning needs. It is not rare for a GPU to achieve an order of magnitude or more of speedup over a CPU when training DNNs.
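A toy benchmark makes the contrast concrete. This is only an illustrative sketch: NumPy's matrix multiply dispatches to an optimized, vectorized BLAS routine, standing in here for specialized parallel hardware, while the hand-written triple loop stands in for serial general-purpose execution. The sizes and variable names are the author's own illustration, not from any real benchmark.

```python
import time
import numpy as np

# One layer of a toy network: multiply a batch of inputs by a weight matrix.
rng = np.random.default_rng(0)
x = rng.standard_normal((64, 128))    # batch of 64 input vectors
w = rng.standard_normal((128, 128))   # weight matrix

def matmul_loops(a, b):
    """Naive serial matrix multiply, element by element."""
    out = np.zeros((a.shape[0], b.shape[1]))
    for i in range(a.shape[0]):
        for j in range(b.shape[1]):
            s = 0.0
            for k in range(a.shape[1]):
                s += a[i, k] * b[k, j]
            out[i, j] = s
    return out

t0 = time.perf_counter()
slow = matmul_loops(x, w)             # serial, interpreted loop
t_loop = time.perf_counter() - t0

t0 = time.perf_counter()
fast = x @ w                          # vectorized, dispatched to optimized BLAS
t_blas = time.perf_counter() - t0

print(f"loop: {t_loop:.3f}s  vectorized: {t_blas:.6f}s")
```

The two paths compute identical results; only the execution model differs, which is exactly where the GPU's advantage over the CPU comes from.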
Until recent years, deep learning was largely confined to the academic world. No dedicated deep-learning hardware (which requires substantial R&D expense) was available on the market by the time commercial interest broke out. As a result, the hardware choice for AI applications came down to CPU versus GPU, where the GPU has a clear edge. It did not take Nvidia long to leverage its GPU dominance and capture this golden opportunity.
What GPU still lacks.
Being the most accessible, state-of-the-art hardware for deep learning does not change the fact that the GPU is still far from optimal for such applications. After all, the GPU's DNA is optimized for video gaming, not for deep learning. Substantial improvements can be gained with dedicated deep-learning designs.
- Precision: Research has shown that DNN performance actually degrades little when numeric precision is lowered to 16-bit or even 8-bit, if compensated with larger networks. This indicates the high-precision arithmetic logic in a GPU is overkill; the equivalent data bandwidth could be utilized more wisely.
- Memory access and caching: A DNN typically accesses data far more predictably than the highly non-linear texture fetches common in GPU rendering. But GPUs, as game engines, are by design optimized for the latter rather than exploiting the friendlier DNN-type data access. Data throughput can therefore be significantly increased with DNN-oriented designs.
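The precision point can be sketched quickly. In this hypothetical NumPy illustration (the array simply stands in for a layer's weights), casting 32-bit values down to 16-bit floats halves the memory and bandwidth they occupy while perturbing each value only slightly:

```python
import numpy as np

rng = np.random.default_rng(42)
w32 = rng.standard_normal(10000).astype(np.float32)  # stand-in for full-precision weights

# Cast down to half precision, then compare against the originals.
w16 = w32.astype(np.float16)
rel_err = np.abs(w16.astype(np.float32) - w32) / (np.abs(w32) + 1e-12)

print(f"bytes: {w32.nbytes} -> {w16.nbytes}")              # storage/bandwidth halved
print(f"median relative error: {np.median(rel_err):.1e}")  # on the order of 1e-4
```

A relative error around 1e-4 is tiny next to the noise a network is trained to tolerate, which is why reduced-precision arithmetic costs so little accuracy in practice.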
Moreover, high-performance GPUs are known to be power-hungry, which is a serious liability in the data center. A typical server consumes around 500 to 1200 watts. Taking 850W as the average, a single 200W graphics card would increase the server's total power consumption by nearly a quarter. That only worsens data centers' notoriously gigantic energy consumption. Fortunately, far more power-efficient designs are achievable, as we'll discuss below.
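As a back-of-the-envelope check of that estimate, using only the wattage figures cited above:

```python
# Rough power arithmetic for adding one discrete GPU to a typical server.
server_watts = 850   # midpoint estimate of the 500-1200 W range cited above
gpu_watts = 200      # a single high-performance graphics card

increase = gpu_watts / server_watts
print(f"{increase:.1%}")   # prints 23.5%
```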
Why ASIC is better.
Application-specific integrated circuits (ASICs) designed for deep learning could be more promising for future AI applications than GPUs. By being highly task-specific, an ASIC can deliver higher computing performance at lower power consumption, most likely with a cost advantage as well.
A GPU is less flexible than a CPU but much more powerful at its specialized job (graphics, in this case). The same comparison holds between the ASIC and the GPU: the ASIC is designed as the less flexible counterpart that handles deep learning tasks far more effectively.
- ASIC delivers much higher performance.
Intel acquired Nervana, a two-year-old AI startup, in 2016 for an estimated $350-400 million. In 2017 Nervana plans to release the Nervana Engine, an ASIC optimized for deep learning that addresses the aforementioned GPU inefficiencies in precision, caching and data throughput. Nervana claims their ASIC "achieves unprecedented compute density at an order of magnitude of more computing power than today's state-of-the-art GPUs." They also indicated further breakthroughs are possible, since the chip "achieves this feat with an ASIC built using a commodity 28nm manufacturing process which affords Nervana room for further improvements by shrinking to a 16nm process in the future."
While we don't know whether the Nervana chip will dent Nvidia's AI market share, it does indicate two things. First, a huge performance gain over the traditional GPU is possible with an ASIC customized purely for deep learning. Second, given that Nervana is only a two-year-old company, the barrier to designing deep-learning chips is much lower than that of designing GPUs, which must address far more complicated applications and requires decades of expertise.
- ASIC consumes much less power.
With minimal hardware redundancy and low-power standard cells, the ASIC is the unquestionable winner on power consumption, and we have real-world numbers from deep learning applications to back this up. Google (NASDAQ:GOOG) (NASDAQ:GOOGL) has built and deployed its own ASIC, the Tensor Processing Unit (TPU), for a number of its machine learning tasks; the same chip was used in the well-known AlphaGo vs. Lee Sedol Go series. Google claims the ASIC offers a 10x improvement in performance per watt versus traditional solutions for specific tasks. Such dramatic power savings could well motivate other tech giants to develop and deploy customized ASICs for their data centers as well.
- ASIC could be more cost effective.
While we don't have actual figures for what the Nervana or Google ASICs cost, it is well known in the semiconductor industry that ASICs, as dedicated hardware, cost significantly less than more general solutions. It is therefore reasonable to assume the ASICs will hold a cost advantage over GPUs as well once they are commercialized.
To summarize, Nvidia's GPU dominance has made it the go-to hardware provider for AI applications, and investors appear confident, judging by the stock price, that the company's AI business has a secured growth trajectory ahead. But this may change in the long term as more ASIC solutions mature and reach the market. These alternatives do not require decades of GPU expertise to design, yet have the potential to deliver significantly higher performance at much lower power consumption and a better price. Nvidia's wide moat in the GPU space therefore does not translate directly into an equivalently formidable moat in AI. The author thinks Nvidia may face tough competition on the AI battlefield, and investors should take this risk into consideration.
Disclosure: I am/we are short NVDA.
I wrote this article myself, and it expresses my own opinions. I am not receiving compensation for it (other than from Seeking Alpha). I have no business relationship with any company whose stock is mentioned in this article.