When I first began this journey into the world of modern CPUs/SoCs/APUs I saw a window of opportunity for Advanced Micro Devices (AMD) to find its way back to profitability and possibly long-term relevance. This was mostly based on the early bullish talk surrounding Jaguar cores and their potential to simply provide superior all-around performance at specific high-volume price points in traditional mobile computing like laptops and notebooks. As this story has unfolded, however, it has become clear just how flexible and competitive Jaguar is across a number of different market segments, and with it my estimation of AMD's future has risen.
When AMD announced that it was releasing its Kyoto server chips, branded Opteron-X and based on Jaguar, I didn't react immediately with some form of article to capitalize on the headline. Instead I wanted to take some time to digest the news, do more research and let the story unfold a bit more.
Act I: Entering Stage Right
The story with Jaguar always comes down to the same plot point: price. In a price comparison, Intel's (INTC) Atom S-1200 series processors run $54-64 apiece, and AMD's Opteron-X 1150 sits at the top of that range at $64. However, AMD's chip offers twice the potential performance, since it is a quad-core part with higher memory bandwidth, more L2 cache and more addressable memory, so it's pretty obvious that Jaguar stacks up very nicely. Throw in that the TDP is programmable in BIOS and the chips offer even greater flexibility depending on the task they are being asked to perform.
The Opteron-X 2150 has the GPU turned on for $35 more, operates in a higher TDP window and retains the ability to fine-tune both the CPU and GPU in BIOS to match the load requirements of more compute-intensive workloads. This is something that Intel cannot touch at the moment, but that is likely to change with Silvermont-based Avotons, at least on the CPU side of things.
Act II: Setting the Stage
At this point it looks to me that the microserver environment is a completely open field. There are three competing solutions trying to find the right mix of performance per watt and blend it with the right connection fabrics to create a new generation of great hardware solutions while not requiring the building of nuclear power plants to run our data centers. ARM Holdings (ARM), Intel and AMD all have low-power, small core CPUs that are going to compete in this market.
What is obvious from the initial reports on Calxeda's Boston Viridis server, though, is that hundreds of tiny ARM Cortex-A9s can compete with the massive single-threaded performance of Intel Xeons on distributed workloads. The news that Calxeda has revamped its card format to be simpler to build, maintain and operate, and therefore cheaper, mitigates one of the few drawbacks of its initial products: the price.
The real fight, however, is not today, but tomorrow. For now, AMD and Jaguar have the best all-around solution in this space with the Opteron-X.
Performance per watt is superior to current Atom and ARM
Full SoC architecture simplifies designs
Price is more than competitive
Integrated Radeon GPU gives real options for compute-heavy workloads while not increasing board complexity.
It's a good mix that should see quick adoption in the market. AMD's SeaMicro division has kept mum about its plans for the Opteron-X, but using that silence to argue that SeaMicro will choose Silvermont over its in-house solution is silly. For now? Yes, SeaMicro is shipping Atom S-1200 x86 servers, but at a guess that should change by the end of Q3, once Jaguar production really takes off.
AMD has Jaguar committed to a lot of different product lines, and it really does highlight how flexible its reusable IP block strategy is. From tablets and notebooks in the consumer space to embedded and server chips, as well as semi-custom designs for Sony's (SNE) PlayStation 4 and Microsoft's (MSFT) Xbox One, AMD is getting a lot of mileage out of the R&D on that one design. The Register went over the price per teraflop that AMD is offering with the Opteron-X 2150, and by its figures the GPU portion comes to roughly a quarter of the per-teraflop cost of a discrete Quadro K5000 from Nvidia (NVDA), an area that Nvidia has been making a lot of money in.
The issue then becomes how many of these Kyoto chips can you cram onto a card and feed with memory. With an HP (HPQ) "Redstone" Moonshot 1500 microserver enclosure, you could put about 277 teraflops into a rack (not counting the calculating capabilities of the CPUs, that is just the GPUs) at single precision at a cost of $178,000 for just the processors, or about $643 per teraflops at the GPU level. And if you want to be really fair, you have to multiply the cost of the X1150 by 1.9 divided by 2 (the clock speed ratio), which comes out to $60.80, and subtract that from the cost of the X2150, which comes out to $38.20, to figure out the cost of the GPU on the X2150 chip. When you do that, the GPU cost of a rack of Moonshot servers with 277 teraflops comes to $248 per teraflops.
For comparison, a Quadro K5000 card from Nvidia, based on the GK104 graphics chip that is designed for SP work, is rated at 2.1 teraflops single precision. It costs $2,249, which works out to $1,071 per teraflops for just the GPU card.
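The arithmetic in that comparison is easy to check. The short Python sketch below reruns it using only the figures quoted above; the $99 X2150 price (the X1150's $64 plus the $35 GPU premium) and the 1,800-chip rack count (roughly $178,000 divided by $99) are my own inferences from those figures, not numbers stated outright, and the function names are purely illustrative.

```python
# Rough check of the price-per-teraflop comparison quoted above.
X1150_PRICE = 64.00        # USD, CPU-only part
X2150_PRICE = 99.00        # USD, inferred: $64 plus the $35 GPU premium
CLOCK_RATIO = 1.9 / 2.0    # X2150 CPU clock relative to the X1150
RACK_TFLOPS = 277.0        # single precision, GPUs only
RACK_CHIPS = 1800          # inferred: about $178,000 / $99 per chip

QUADRO_PRICE = 2249.00     # USD, Quadro K5000 card
QUADRO_TFLOPS = 2.1        # single precision


def cost_per_teraflop(total_cost, teraflops):
    """Dollars spent per teraflop of single-precision throughput."""
    return total_cost / teraflops


def gpu_share_of_price(apu_price, cpu_price, clock_ratio):
    """Estimate the GPU's share of the APU price by subtracting a
    clock-adjusted price for the CPU-only part."""
    return apu_price - cpu_price * clock_ratio


gpu_cost = gpu_share_of_price(X2150_PRICE, X1150_PRICE, CLOCK_RATIO)
whole_chip = cost_per_teraflop(RACK_CHIPS * X2150_PRICE, RACK_TFLOPS)
gpu_only = cost_per_teraflop(RACK_CHIPS * gpu_cost, RACK_TFLOPS)
quadro = cost_per_teraflop(QUADRO_PRICE, QUADRO_TFLOPS)

print(f"GPU share of X2150 price: ${gpu_cost:.2f}")
print(f"Rack, whole chips:        ${whole_chip:.0f} per teraflop")
print(f"Rack, GPU share only:     ${gpu_only:.0f} per teraflop")
print(f"Quadro K5000:             ${quadro:.0f} per teraflop")
print(f"GPU-only cost vs Quadro:  {gpu_only / quadro:.0%}")
```

Under those assumptions the script reproduces the $38.20, $643, $248 and $1,071 figures, which puts the GPU-only cost at a little under a quarter of the Quadro's per-teraflop price.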
And since HP has announced it will be using the Opteron-X for just this purpose, the analysis above is very relevant to the discussion.
Intel will ship 22nm Avotons which should negate most, if not all, of AMD's current performance advantage. However, I remain skeptical about the price, which is always Intel's Achilles' heel. ARM has no problem with price but the Cortex-A9 is not in the same league in terms of performance. In a sense, the players are still putting their pieces on the board and AMD has made the first big move.
Act III: The Big Brawl
I see Calxeda's and SeaMicro's current offerings as proofs of concept. The future for ARM is the Cortex-A57, which even AMD is developing a version of, presumably to run in conjunction with Jaguar/Beema in an HSA-compliant architecture to take advantage of each one's strengths.
This is a tremendous growth opportunity for all of the manufacturers and, frankly, discussions of who is going to win the war are fruitless. IHS iSuppli has the market growing to 300,000 units in 2013 and growing at a CAGR of 230% between now and 2016.
All of these solutions have their pluses and minuses and look to fill slightly different workload niches. AMD's advantage will most definitely be graphics-based, with its flexible and powerful compute offerings. As Jaguar grows up in terms of hUMA and HSA implementation, this advantage will only grow. Since ARM is a member of the HSA Foundation, you can expect to see ARM vendors like Calxeda push in this direction as well.
The arrival of Kyoto/Opteron-X chips from AMD is well-timed for the company's near-term plans and re-affirms its presence in the server market. For now it is well-placed, offering superior performance in the x86 server market versus Atom. But the future, like much of AMD's future, lies with hUMA and HSA, which I'm convinced are fundamental game changers for future computing.