Date: 2020-10-08

Investment Thesis

AI was one of the main investment theses for another competitor in this space, Intel (INTC), which I detailed here: Intel To Lead The AI Revolution Over The Next Decade (NASDAQ:INTC). AI is widely seen as one of the most important and quickly growing workloads both in the data center and at the edge. As such, it will be a major driver for compute demand.

In particular, AI consists of training deep learning (DL) models and applying them. The latter is called inference and is expected to become a much larger market than training over time. For comparison, the majority of Nvidia’s (NVDA) $1B per quarter data center business (pre-Mellanox) is likely still based on training. Nevertheless, since Turing in late 2018 and more recently with Ampere, Nvidia has also been targeting the inference market specifically. The breadth of inference workloads and competitors is easy to see from the CPU offerings of Intel, the FPGA offerings of Xilinx (XLNX) and Intel, and the dedicated accelerator offerings of plenty of start-ups.

Qualcomm (NASDAQ:QCOM) is now also entering this space with its Cloud AI 100. Qualcomm recently detailed its specs, and the product looks quite competitive on paper, if not leading. Despite the heavy competition just mentioned, Qualcomm may become one of the key players in this nascent market, which could become a new revenue driver over time. The Cloud AI 100 will ship in H1’21.

Cloud AI 100 Performance

AnandTech, for example, has covered the Qualcomm announcement, so I won’t rehash all that information and will focus on the key performance metrics instead.

In the last year or so, as AI models have kept increasing in complexity, the widely used ResNet-50 benchmark has drawn some criticism, but it should still serve as one relevant performance indicator.

To summarize the performance and how it stacks up:

Intel Nervana NNP-I, 10nm: ~3600 inferences/sec at 10W with efficiency of 4.8 TOPS/W (source: Intel Details Its Nervana Inference and Training AI Cards - ExtremeTech)

Intel Habana Goya, 16nm: ~15000 inferences/sec at ~100W (~2.5 TOPS/W?)

Nvidia V100, 12nm: ~5000 inferences/sec at ~275W

Nvidia T4, 12nm: ~4000 inferences/sec at ~70W

Nvidia A100, 7nm: ~24000 inferences/sec at ~325W

Qualcomm Cloud AI 100, 7nm: ~26000 inferences/sec at ~70W

The Cloud AI 100 is rated at 400 TOPS in its highest-end configuration.

Looking at the list above, it should be clear that two products are a league above the rest in performance per watt: Qualcomm’s Cloud AI 100 and the Nervana NNP-I.
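To make the comparison concrete, the figures in the list can be reduced to a single efficiency number, inferences per second per watt. The following is a minimal sketch, not a benchmark: it uses only the approximate ResNet-50 throughput and power values quoted above, and the function and variable names are my own illustrative choices.

```python
# Approximate ResNet-50 figures as quoted in the list above:
# (inferences per second, power in watts). All values are rough.
ACCELERATORS = {
    "Intel Nervana NNP-I": (3600, 10),
    "Intel Habana Goya": (15000, 100),
    "Nvidia V100": (5000, 275),
    "Nvidia T4": (4000, 70),
    "Nvidia A100": (24000, 325),
    "Qualcomm Cloud AI 100": (26000, 70),
}


def inferences_per_watt(throughput, watts):
    """Return efficiency in inferences/sec per watt."""
    return throughput / watts


def rank_by_efficiency(accelerators):
    """Return (name, efficiency) pairs sorted from most to least efficient."""
    scored = {
        name: inferences_per_watt(ips, watts)
        for name, (ips, watts) in accelerators.items()
    }
    return sorted(scored.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for name, score in rank_by_efficiency(ACCELERATORS):
        print(f"{name:<24}{score:7.1f} inferences/sec/W")
```

On these numbers, the Cloud AI 100 comes out at roughly 371 inferences/sec/W and the NNP-I at 360, while Goya lands at 150 and all three Nvidia parts stay below 75. Since the inputs are vendor-quoted approximations at different batch sizes and precisions, the ranking is indicative rather than exact.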

Indeed, when both products were announced in 2019, this looked like it would be the battle for dedicated inference accelerators. In late 2019, however, Intel acquired Habana. Intel said it would still ship the NNP-I, but that it would focus on Habana going forward, and it has not said much about its data center accelerator portfolio or roadmap since. So while customers may still be able to get hold of the NNP-I, the relevance of a product with no further roadmap is limited.

For reference, although the point is now mostly academic, Intel’s NNP-I delivered the power and performance metrics closest to those of the Cloud AI 100 among the products considered in this article (Intel had also planned versions with higher power and hence higher performance). Benefiting from Intel’s 10nm process, it also had a unique value proposition: it included two Ice Lake (Sunny Cove) CPU cores with Intel’s DLBoost, to cater exactly to the breadth of AI workloads mentioned in the investment thesis (which is the reason why CPUs are still mostly used for inference).

Intel’s new alternative, the Habana Goya, is built on an inferior process node (16nm) and so actually seems to be a step down in competitiveness compared to the NNP-I, at least until a 7nm (or perhaps 5nm) Habana product arrives. In fact, unless Habana launches its next-gen accelerator next year, either the Sapphire Rapids Xeon with its dedicated AMX units for AI or the recently launched Stratix 10 NX for AI might be Intel’s fastest solution for inference in 2021.

So with Intel’s AI portfolio still unclear given its roadmap changes, and with at least the Nervana roadmap shelved, Qualcomm looks like the leader in this space for now, at least based on the information above. That said, Qualcomm also seems a bit delayed compared to the 2019 expectation of an H2 2020 launch.

Takeaway

Qualcomm may benefit from some of the dynamics in the market for AI accelerators.

On one hand, Nvidia continues its strategy of offering adapted GPUs for AI workloads, which may not be as efficient as dedicated AI chips. Qualcomm’s data suggests this is indeed the case: Qualcomm is delivering competitive performance (400 TOPS) at much lower power consumption.

On the other hand, Intel’s AI roadmap has seen drastic changes over the last year as it has shelved many years of Nervana work. (To be sure, as the 10nm process shows, the NNP-I was homegrown and not a product from the 2016 Nervana acquisition.) Given where the current 16nm Goya product lands on the performance and power curve, Intel may not have a competitive product in the near term. At least on the inference side, Intel seems to be taking a step back in favor of a unified Habana hardware and software stack, even though the NNP-I likely would have been competitive.

Nevertheless, it is likely Habana will move to 7nm as well sooner or later. Intel often holds an AI Day in November in conjunction with the Supercomputing conference, so it might still be a while until more information about the Habana plans gets disclosed.

In any case, with Intel stumbling on and changing its roadmap, Qualcomm might be in a league of its own once it launches next year. So for now, Qualcomm seems on track to offer a leading product for this still quite nascent accelerator market.

As evidence of the growth opportunity, Intel attributed $1.8B of its 2018 data center CPU revenue to AI workloads, and reported $3.8B in AI revenue across all its businesses in 2019, growing ~30% YoY. If a portion of those workloads transitions to dedicated accelerators, this could become a market, and a new revenue opportunity for Qualcomm, worth billions of dollars, although likely shared between multiple competitors.
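The sizing logic here can be sketched as back-of-the-envelope arithmetic. The snippet below is an illustration only: the $3.8B base and the ~30% growth rate come from the Intel figures above, while the assumption that this growth rate persists and the shares of workloads that shift to dedicated accelerators are hypothetical inputs of mine, not forecasts. Note also that the base is Intel’s AI revenue alone, not the whole market.

```python
AI_REVENUE_2019_BN = 3.8  # Intel's stated 2019 AI revenue in $B (from the text)
YOY_GROWTH = 0.30         # ~30% YoY growth (from the text)


def projected_revenue(base_bn, growth, years):
    """Compound the base forward, assuming the growth rate holds (hypothetical)."""
    return base_bn * (1 + growth) ** years


def accelerator_opportunity(pool_bn, shifted_share):
    """Portion of a revenue pool that moves to dedicated accelerators."""
    if not 0.0 <= shifted_share <= 1.0:
        raise ValueError("shifted_share must be between 0 and 1")
    return pool_bn * shifted_share


if __name__ == "__main__":
    # Hypothetical revenue pool two years after 2019.
    pool = projected_revenue(AI_REVENUE_2019_BN, YOY_GROWTH, years=2)
    # Hypothetical shares, for illustration only.
    for share in (0.10, 0.25, 0.50):
        value = accelerator_opportunity(pool, share)
        print(f"{share:.0%} shift of ${pool:.1f}B -> ${value:.2f}B")
```

Even a modest shifted share of such a pool reaches the hundreds of millions to billions of dollars, which is the sense in which the opportunity could be sizable while still being split among several vendors.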