Seeking Alpha
Newsletter provider, tech, gold & precious metals, currencies

Since Advanced Micro Devices (NASDAQ:AMD) launched Kaveri a month ago, real-world information about the effects of the new technology the chips support has trickled in slowly. Specifically, the performance advantages of both HSA (Heterogeneous Systems Architecture) and Mantle (a new API for Graphics Core Next GPUs) had more questions surrounding them than answers. Not surprisingly, the major technology media sites have been silent since the initial reviews or, in the case of Mantle, hastily published previews. Now that both are no longer part of the news cycle and rival Nvidia (NASDAQ:NVDA) has released its first cards based on the Maxwell chip architecture, I believe it's important to discuss the answers we have found so far. Those answers should clear up any misconceptions about the potential for Mantle, HSA and both the consumer and server versions of Kaveri.

HSA Has the Goods

Over the weekend, WCCFTech linked to some results from the Korean site IYD, which offer the most comprehensive look at Kaveri I've seen so far. They tested a variety of Intel (NASDAQ:INTC) and AMD chips, everything from a K10 core-based Llano A8-3870k to a Haswell Core i5-4670k, and ran the Kaveri A10-7850k both with and without HSA enabled. Before I get to the results: for those who are not familiar with Heterogeneous Systems Architecture (HSA), I covered it in detail in an article last summer. The amount of HSA-enabled software on the market is still limited, but these results cover most of what is available at the consumer level today, including results from PCMark 8.

(Chart: IYD benchmark results, with and without HSA enabled)

This graph condenses the results into a simple picture, one that takes the $173 A10-7850k from uninteresting to a market leader. Without HSA, Intel's Core i5-4670k is clearly better in every way, offering a real-world 50% increase in computational performance over the A10-7850k. But that changes radically once HSA is enabled and tasks can be pushed more efficiently to the computing core that is most appropriate. In essence, what you are looking at is the degree to which nearly every computer sold before Kaveri has been hampered by a design that treats the GPU as a non-equal to the CPU.

These results show how fundamentally inefficient past designs have been and how we as consumers have had to pay far more than necessary for a given level of performance. Enabling HSA on smaller processors for small notebook and tablet chips, such as ARM (NASDAQ:ARMH) Cortex A57s and AMD's small cores similar to those in the SoCs designed for Sony's (NYSE:SNE) PlayStation 4 and Microsoft's (NASDAQ:MSFT) Xbox One, would be similarly disruptive and would create mobile computing devices offering a far greater value proposition as functional work devices.

The crowd-funded Tango micro-PC is based on an A6-5200 APU and is a perfect example of where the traditional business-class PC is heading. HSA performance enhancements for complex math can be brought not just to lower price points but to whole new ways of designing PCs. The initial success of Apple's (NASDAQ:AAPL) iPhone was this type of disruption: it was a full-featured device that created a new way of looking at your computing needs. The iPad began revolutionizing the mobile office, but there is still a need for improved real-world performance.

Moreover, these results answer questions about how much impact the upcoming server version of Kaveri, code-named Berlin, can have on high performance computing, an $11 billion market in 2012 due to grow at nearly a 7% CAGR over the next 2 years. In this segment performance per watt is extremely important, and HSA should help AMD claw back some market share from Intel. According to this IDC report on big data, the need to keep the costs of calculation and transmission down is what will drive growth in data analytics. And, again from the IYD review of HSA, the above slide shows just how much potential there is for big data adoption at the server level.
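To put the market-size figure in perspective, here is a rough sketch of the compounding arithmetic. It assumes that "nearly 7%" can be rounded to exactly 7% and that the $11 billion figure is the starting point for two full years of growth; both are my simplifications, not numbers taken from the IDC report.

```python
def project_market(base_billions: float, cagr: float, years: int) -> float:
    """Compound a starting market size at a constant annual growth rate."""
    return base_billions * (1.0 + cagr) ** years


# Assumptions: $11B base in 2012, 7% CAGR (rounded up from "nearly 7%"), 2 years.
BASE_2012 = 11.0
ASSUMED_CAGR = 0.07
YEARS = 2

projected = project_market(BASE_2012, ASSUMED_CAGR, YEARS)
print(f"Projected HPC market after {YEARS} years: ${projected:.2f} billion")
```

Under those assumptions the market would be worth roughly $12.6 billion after two years, so even a small share gain for AMD is meaningful in dollar terms.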

The Performance Mantle

I talked about Mantle's impact on the low end of the CPU market in my last article on the subject, as it is designed to alleviate CPU-bound situations in modern games. Since then I've been waiting for a review to show how strong Mantle is at improving performance at the high end of the gaming spectrum. The results there are actually more impressive than they are at the low end, and they confirm my suspicions that wider, more parallel chip architectures will provide a better platform for GPU throughput efficiency than more vertical designs.

Here are some results from Tweaktown.com that highlight the difference Mantle makes in performance between the 6-core Intel Core i7-4930k and the 8-core AMD FX-8350 running the latest version of the Starswarm demo suite from Oxide Games. I have reservations about these results because the article is not clear about which demo was run, so it is difficult to cross-reference them with others out there. However, under similar circumstances we can see that the CPU with more cores, by properly scheduling parallel tasks, posts the larger relative performance gain.

Battlefield 4 (FPS)    DirectX    Mantle    % Increase
Core i7-4930k          34.8       47.5      36.5%
FX-8350                22         33.8      53.6%
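The percentage column can be reproduced from the two frame-rate columns. The short sketch below uses only the four FPS figures from the table; the function and variable names are my own, chosen for illustration.

```python
def percent_increase(baseline: float, improved: float) -> float:
    """Relative gain of `improved` over `baseline`, expressed in percent."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (improved - baseline) / baseline * 100.0


# Frame rates (FPS) copied from the table: (DirectX, Mantle).
results = {
    "Core i7-4930k": (34.8, 47.5),
    "FX-8350": (22.0, 33.8),
}

gains = {cpu: percent_increase(dx, mantle) for cpu, (dx, mantle) in results.items()}

for cpu, gain in gains.items():
    print(f"{cpu}: {gain:.1f}% faster with Mantle")
```

Note that the FX-8350 posts the larger relative gain while still trailing the Core i7-4930k in absolute frame rate under both APIs.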

I uncovered this thread at extremesystems.com where users ran tests with 8-core Xeon 5570 and 4-core i7-3770k systems, and relative performance increases were seen with the wider, slower CPUs. This speaks to why both Sony and Microsoft went for 8-core Jaguar-based APUs for their consoles. The graphics API in the PlayStation 4 is said to be very similar to Mantle, which is why the console can achieve its performance without needing a huge GPU to overcome API inefficiencies.

Over the Next Hill

Looking ahead, it makes sense to expect AMD, when it makes the switch to GlobalFoundries' 20nm bulk process, to double the x86-64 CPU core counts from the current two and four while adding as many GCN compute cores on-die as possible, depending on the application. Whether this will happen with Carrizo next year is anyone's guess, but it is unlikely.

My feeling is that we'll see an eight-core A57 ARM APU built on 20nm first, after the introduction of 20nm discrete graphics cards. Until AMD updates us with new information on its x86 CPU plans, however, this is idle speculation. The upcoming Beema/Mullins will be 28nm and will not be fully HSA compliant.

What is not speculation, however, is how quickly the industry will move to adopt Mantle and HSA, given improvement numbers like these and the relative ease with which developers can support them. A recent interview with Oxide Games developer Dan Baker made it clear that supporting Mantle was not a huge hurdle in terms of resources.

Maximum PC: Do you see a world where developers will have to write for DX and Mantle? How much of a challenge is it to write for both APIs?

Baker: APIs come and go. Once you support more than one, it's pretty easy to support a dozen, assuming there is parity in the hardware features, and assuming you don't have to rewrite your shaders in an entirely different language. If you release a title right now, you would end up with likely six paths. An Xbox360, a PS3, a PS4, a Xbox One, a DX9, and a DX11. For us, the graphics system is just a module that talks to the API. All we did for Mantle was replace the D3D module with a Mantle one. It's about 3,000 to 4,000 lines of code for the Mantle version, which took me personally about two months to write. In terms of support, at least for us, it wasn't terribly difficult.
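The structure Baker describes, an engine that talks to a swappable graphics module, can be sketched in a few lines. This is only an illustration of the pattern and not Oxide's code: every class and method name below is hypothetical, and the backends return strings rather than calling any real Direct3D or Mantle function.

```python
from abc import ABC, abstractmethod


class GraphicsBackend(ABC):
    """Hypothetical interface the engine uses instead of a concrete API."""

    @abstractmethod
    def draw(self, mesh: str) -> str:
        ...


class D3DBackend(GraphicsBackend):
    # Stand-in only: a real module would issue Direct3D calls here.
    def draw(self, mesh: str) -> str:
        return f"D3D: draw {mesh}"


class MantleBackend(GraphicsBackend):
    # Stand-in only: a real module would issue Mantle calls here.
    def draw(self, mesh: str) -> str:
        return f"Mantle: draw {mesh}"


class Engine:
    """Engine code depends only on the interface, never on a specific API."""

    def __init__(self, backend: GraphicsBackend) -> None:
        self.backend = backend

    def render_frame(self, meshes: list[str]) -> list[str]:
        return [self.backend.draw(mesh) for mesh in meshes]


# Swapping APIs means swapping one module; the engine itself is unchanged.
scene = ["ship", "asteroid"]
d3d_frame = Engine(D3DBackend()).render_frame(scene)
mantle_frame = Engine(MantleBackend()).render_frame(scene)
print(d3d_frame)
print(mantle_frame)
```

With this kind of separation, the cost of a new API is confined to one module, which is consistent with the few thousand lines Baker quotes.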

So, a couple of man-months' worth of work to double the performance of their game engine. Mr. Baker is right to call this disruptive technology. Whether this news is driving GPU sales for AMD right now is questionable, but in the U.S. reports of shortages of the R9 series of graphics cards continue. Other factors are also at work, including Mac Pro sales and cryptocurrency mining demand.

Regardless, the more information we get about the real performance increases from these new technologies, the clearer it becomes that they will drive fundamental changes to the industry as a whole. The stock price has pushed up against strong resistance near $3.70 per share and is consolidating in a very tight range right now. Any significant news will likely send it through that level and back over $4.00 in short order due to the gap in the daily chart.

Disclosure: I am long AMD. I wrote this article myself, and it expresses my own opinions. I am not receiving compensation for it (other than from Seeking Alpha). I have no business relationship with any company whose stock is mentioned in this article.

Source: New Technology Proving Its Worth For Advanced Micro Devices