Advanced Micro Devices, Inc. (NASDAQ:AMD) Advancing AI Event December 6, 2023 1:00 PM ET

Lisa Su

Good morning, everyone. Welcome to all of you who are joining us here in Silicon Valley and to everyone who's joining us online from around the world. It has been just an incredibly exciting year with all of the new products and all the innovation that has come across our business and our industry. But today, it's all about AI. We have a lot of new AI solutions to launch today and news to share with you. So let's go ahead and get started.

Now I know we've all felt this year. I mean it's been just an amazing year. I mean if you think about it a year ago, OpenAI unveiled ChatGPT and it's really sparked the revolution that has totally reshaped the technology landscape. In just this short amount of time, AI hasn't just progressed, it's actually exploded. The year has shown us that AI isn't just kind of a cool new thing, it's actually the future of computing. And at AMD, when we think about it, we actually view AI as the single most transformational technology over the last 50 years. Maybe the only thing that has been close has been the introduction of the Internet. But what's different about AI is that the adoption rate is just much, much faster.

So although so much has happened, the truth is, right now, we're just at the very beginning of the AI era. And we can see how it's so capable of touching every aspect of our lives. So if you guys just take a step back and just look, I mean, AI is already being used everywhere. Think about improving health care, accelerating climate research, enabling personal assistants for all of us and, for greater business productivity, things like industrial robotics, security and providing lots of new tools for content creators.

Now the key to all of this is generative AI. It requires a significant investment in new infrastructure. And that's to enable training and all of the inference that's needed. And that market is just huge. Now a year ago, when we were thinking about AI, we were super excited. And we estimated the data center AI accelerator market would grow approximately 50% annually over the next few years, from something like $30 billion in 2023 to more than $150 billion in 2027. And that felt like a big number. However, as we look at everything that's happened in the last 12 months, and the rate and pace of adoption that we're seeing across the industry, across our customers, across the world, it's really clear that the demand is just growing much, much faster.

So if you look now at what it takes to enable AI infrastructure, of course, it starts with the cloud, but it goes into the enterprise. We believe we'll see plenty of AI throughout the embedded markets and into personal computing. We're now expecting that the data center accelerator TAM will grow more than 70% annually over the next 4 years to over $400 billion in 2027. So does that sound exciting for us as an industry?
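The two market estimates above are compound-growth figures, and the arithmetic can be checked with a short sketch. The starting size, growth rate and horizon for the earlier estimate come from the talk; the helper names `project` and `implied_cagr` are made up for this illustration, and the 2023 base behind the revised estimate is not stated in the talk, so the sketch only backs out what a flat 70% rate ending at $400 billion would imply.

```python
def project(start_billion: float, annual_growth: float, years: int) -> float:
    """Compound a starting market size forward at a fixed annual growth rate."""
    return start_billion * (1.0 + annual_growth) ** years


def implied_cagr(start_billion: float, end_billion: float, years: int) -> float:
    """Annual growth rate that takes start to end over the given number of years."""
    return (end_billion / start_billion) ** (1.0 / years) - 1.0


# Earlier estimate from the talk: about $30B in 2023, about 50% per year, to 2027.
earlier_2027 = project(30.0, 0.50, 4)

# Revised estimate: more than 70% per year to over $400B in 2027.
# Assumption: exactly 70% and exactly $400B, used only to back out a 2023 base.
implied_base_2023 = 400.0 / (1.70 ** 4)

print(f"earlier estimate for 2027: ${earlier_2027:.1f}B")
print(f"2023 base implied by the revised estimate: ${implied_base_2023:.1f}B")
```

Four years at 50% turns $30 billion into roughly $152 billion, which matches the "more than $150 billion" quoted for the earlier estimate.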

I have to say for someone like me who's been in the industry for a while, this innovation is faster than anything I've ever seen before. And for us at AMD, we are so well positioned to power that end-to-end infrastructure that defines this new AI era, thinking about massive cloud server installations to -- we're going to talk about on-prem enterprise clusters -- to the next generation of AI in embedded and PCs. Our AI strategy is really centered around 3 big strategic priorities.

First, we must deliver a broad portfolio of very performant energy-efficient GPUs, CPUs and adaptive computing solutions for AI training and inference. And we believe, frankly, that you're going to need all of these pieces for AI. Second, it's really about expanding our open, proven and very developer-friendly software platform to ensure that leading AI frameworks, libraries and models are all fully enabled for AMD hardware and that it's really easy for people to use. And then third, it's really about partnership. You're going to see a lot of partners today. That's who we are as a company. It's about expanding the co-innovation work and working with all parts of the ecosystem, including cloud providers, OEMs, software developers. You're going to hear from some real AI leaders in the industry, to really accelerate how we work together and get that widespread deployment of our solutions across the board.

So we have so much to share with you today. I'd like to get started. And of course, let's start with the cloud. Generative AI is the most demanding data center workload ever. It requires tens of thousands of accelerators to train and refine models with billions of parameters and that same infrastructure is also needed to answer the millions of queries from everyone around the world to these smart models. And it's very simple. The more compute you have, the more capable the model, the faster the answers are generated. And the GPU is at the center of this generative AI world.

And right now, I think we all know it, everyone I talk to says it: the availability and capability of GPU compute is the single most important driver of AI adoption. Do you guys agree with that? So that's why I'm so excited today to launch our Instinct MI300X. It's the highest performance accelerator in the world for generative AI. MI300X is actually built on our new CDNA 3 data center architecture, and it's optimized for performance and power efficiency. CDNA 3 has a lot of new features. It combines a new compute engine. It supports sparsity, the latest data formats, including FP8. It has industry-leading memory capacity and bandwidth, and we're going to talk a lot about memory today. And it's built on the most advanced process technologies and 3D packaging.

So if you compare it to our previous generation, which frankly was also very good, CDNA 3 actually delivers more than 3x higher performance for key AI data types like FP16 and BF16 and a nearly 7x increase in INT8 performance. So if you look underneath it, how do we get MI300X? It's actually 153 billion transistors, 153 billion. It's across a dozen 5-nanometer and 6-nanometer chiplets. It uses the most advanced packaging in the world. And if you take a look at how we put it together, it's actually pretty amazing. We start with 4 I/O die in the base layer. And what we have on the I/O dies are 256 megabytes of Infinity Cache and all of the next-gen I/O that you need, things like 128-channel HBM3 interfaces, PCIe Gen 5 support, our fourth gen Infinity Fabric that connects multiple MI300Xs, so that we get 896 gigabytes per second. And then we stack 8 CDNA 3 accelerator chiplets or XCDs on top of the I/O die, and that's where we deliver 1.3 petaflops of FP16 and 2.6 petaflops of FP8 performance, and then we connect the 304 compute units with dense through-silicon vias or TSVs, and that supports up to 17 terabytes per second of bandwidth. And of course, to take advantage of all of this compute, we connect 8 stacks of HBM3 for a total of 192 gigabytes of memory at 5.3 terabytes per second of bandwidth. That's a lot of stuff. I have to say it's truly the most advanced product we've ever built and it is the most advanced AI accelerator in the industry.
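The package figures quoted above can be broken down per chiplet and per memory stack with simple division. The sketch below uses only the totals from the talk; the assumption that compute units, capacity and bandwidth are split evenly across XCDs and HBM3 stacks is ours and is not stated in the talk.

```python
# Totals quoted in the talk for one MI300X.
IO_DIES = 4
XCDS = 8                  # CDNA 3 accelerator chiplets
COMPUTE_UNITS = 304
HBM_STACKS = 8
HBM_TOTAL_GB = 192
HBM_BANDWIDTH_TBPS = 5.3
FP16_PFLOPS = 1.3
FP8_PFLOPS = 2.6

# Assumption: resources are divided evenly across chiplets and stacks.
chiplets = IO_DIES + XCDS                          # the "dozen" chiplets
cus_per_xcd = COMPUTE_UNITS // XCDS
gb_per_stack = HBM_TOTAL_GB / HBM_STACKS
tbps_per_stack = HBM_BANDWIDTH_TBPS / HBM_STACKS
fp8_over_fp16 = FP8_PFLOPS / FP16_PFLOPS           # halving precision doubles peak rate

print(f"chiplets: {chiplets}")
print(f"compute units per XCD: {cus_per_xcd}")
print(f"HBM3 per stack: {gb_per_stack:.0f} GB at {tbps_per_stack:.2f} TB/s")
print(f"FP8 / FP16 peak ratio: {fp8_over_fp16:.1f}")
```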

Now let's talk about some of the performance and why it's so great. For generative AI, memory capacity and bandwidth are really important for performance. If you look at MI300X, we made a very conscious decision to add more flexibility, more memory capacity and more bandwidth. And what that translates to is 2.4x more memory capacity and 1.6x more memory bandwidth than the competition.

Now when you run things like lower precision data types that are widely used in LLMs, the new CDNA 3 compute units and memory density actually enable MI300X to deliver 1.3x more teraflops of FP8 and FP16 performance than the competition. Now these are good numbers, but what's more important is how things look in real-world inference workloads.
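The comparison ratios above can be inverted to see what competitor figures they imply. This is a rough sketch: the MI300X numbers and the ratios are the ones quoted in the talk, the ratios are rounded, and the talk does not name the competing part or its specifications, so the outputs are implied values rather than published ones.

```python
# MI300X figures quoted in the talk.
MI300X = {
    "memory_gb": 192.0,
    "bandwidth_tbps": 5.3,
    "fp16_pflops": 1.3,
    "fp8_pflops": 2.6,
}

# Advantage ratios quoted in the talk (rounded as spoken).
QUOTED_ADVANTAGE = {
    "memory_gb": 2.4,
    "bandwidth_tbps": 1.6,
    "fp16_pflops": 1.3,
    "fp8_pflops": 1.3,
}

# Implied competitor value = MI300X value / quoted advantage.
implied_competitor = {
    name: MI300X[name] / QUOTED_ADVANTAGE[name] for name in MI300X
}

for name, value in implied_competitor.items():
    print(f"{name}: {value:.2f}")
```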

So let's start with some of the most common kernels used by the latest AI models. LLMs use attention algorithms to generate precise results. So for something like FlashAttention-2 kernels, MI300X actually delivers up to 1.2x better performance than the competition. And if you look at something like the Llama2-70b LLM, and we're going to use this a lot throughout the show, MI300X again, delivers up to 1.2x more performance. And what this means is the performance at the kernel level actually directly translates into faster results when running LLMs on a single MI300X accelerator. But we also know we talked about these models getting so large. So what's really important is how that AI performance scales when you go to the platform level and beyond.

So let's take a look at how MI300X scales. Let's start first with training. Training is really hard. People talk about how hard training is. When you look at something like the 30 billion parameter model from Databricks, MPT LLM, it's a pretty good example of something that is used by multiple enterprises for a lot of different things. And you can see here that the training performance for MI300X is actually equal to the competition. And that means it's actually a very, very competitive training platform today.

But when you turn to the inference performance of MI300X, this is where our performance really shines. We're showing some measured data here on two widely used models: BLOOM 176B, which is the world's largest open multi-language AI model and generates text in 46 languages, and Llama2-70b, which is also very popular, as I said, for enterprise customers. And what we see in this case is a single server with 8 MI300X accelerators is substantially faster than the competition, 1.4 to 1.6x.

So these are pretty big numbers here. And what this performance does is it just directly translates into a better user experience. You guys have used it. When you ask the model something, you'd like it to come back faster, especially as the responses get more complicated. So that gives you a view of the performance of MI300X. Now as excited as we are about the performance, we are even more excited about the work we're doing with our partners. So let me turn to our first guest, very, very special. Microsoft is truly a visionary leader in AI. We've been so fortunate to have a deep partnership with Microsoft for many, many years across all aspects of our business. And the work we're doing today in AI is truly taking that partnership to the next level.

So here to tell us more about that is Microsoft's Chief Technology Officer, Kevin Scott. Kevin, it is so great to see you. Thank you so much for being here with us.

Kevin Scott

It's a real pleasure to be here with you all today.

Lisa Su

We've done so much work together on EPYC and Instinct over the years. Can you just tell our audience a little bit about that partnership?

Kevin Scott

Yes, I think Microsoft and AMD have a very special partnership. And as you mentioned, it has been one that we've enjoyed for a really long time and started with the PC. It continued then with a bunch of custom silicon work that we've done together over the years on Xbox. It's extended through the work that we've done with you all on EPYC for the high performance computing workloads that we have in our cloud. And like the thing that I've been spending a bunch of time with you all on the past couple of years, like actually a little bit longer even, is on AI compute, which I think everybody now understands how important it is to driving progress on like this new platform that we're trying to deliver to the world.

Lisa Su

I have to say we talk pretty often.

Kevin Scott

We do.

Lisa Su

But Kevin, what -- so what I admire so much is just your vision, Satya's vision about where AI is going in the industry. So can you just give ourselves, give us a perspective of where are we on this journey?

Kevin Scott

Yes. So we have been with a huge amount of intensity over the past 5 years or so been trying to prepare for the moment that I think we brought the world into over the past year. So it is almost a year to the day since the launch of ChatGPT, which I think is perhaps most people's first contact with this new wave of generative AI. But the thing that allowed Microsoft and OpenAI to do this was just a deep amount of infrastructure work that we've been investing in for a very long while. And one of the things that we realized fairly early in our journey is just how important compute was going to be and just how important it is to think about the sort of full systems optimization.

So the work that we've been doing with you all has been not just about figuring out like what the silicon architecture looks like, but that's been a very important thing in making sure that like we together are building things that are going to intercept where the actual platform is going to be years in advance, but also just doing all of that software work that needs to be done to make this thing usable by all of the developers of the world.

Lisa Su

I think that's really key. I think sometimes people don't understand, they think about like AI as this year. But I mean, the truth is we've been building the foundation for so many years. Kevin, I want to take this moment to really acknowledge that Microsoft has been so instrumental in our AI journey, I mean the work we've done over the last several generations, the software work that we're doing, the platform work that we're doing. We're super excited for this moment.

Now I know you guys just had Ignite recently, and Satya previewed some of the stuff you're doing with 300X but can you share that with our audience?

Kevin Scott

We're super enthusiastic about 300X. Satya announced that the MI300X VMs were going to be available in Azure. It's really, really exciting right now sort of seeing the bring-up of GPT-4 on MI300X, seeing the performance of Llama2, like getting it rolled into production. And the thing that I'm excited about here today is we will have the MI300X VMs in preview available today.

Lisa Su

I completely agree with you. The thing that's so exciting about AI is every day we discover something new and we're learning that together. So Kevin, we're so honored to be Microsoft's partner in AI. Thank you for all the work that your teams have done, that we've done together and we look forward to a lot more progress.

Kevin Scott

Yes. Likewise. Thank you very much.

Lisa Su

All right. So look, we certainly do learn a tremendous amount every day, and we're always pushing the envelope. Let me talk to you a little bit about how we bring more people into our ecosystem. So when I talk about the Instinct platform, you have to understand, our goal has really been to enable as many customers as possible to deploy Instinct as fast and as simply as possible. And to do this, we've really adopted industry standards.

So we built the Instinct platform based on an industry standard OCP server design and I'd actually like to show you what that means because I don't know if everyone understands. So let's bring her out. Her or him? I don't know. Let me show you the most powerful Gen AI computer in the world.

Now those of you who follow our shows know that I'm usually holding up a chip. But we've shown you the MI300X chip already. So we thought it would be important to show you just what it means to do generative AI at a system level. What you see here is 8 MI300X GPUs, and they're connected by our high-performance Infinity Fabric in an OCP-compliant design. Now what makes that special? So this board actually drops right into any OCP-compliant design, which is the majority of AI systems today. And we did this for a very deliberate reason. We want to make this as easy as possible for customers to adopt. So you can take out your other board and put in the MI300X Instinct platform. And if you take a look at the specifications, we actually support all of the same connectivity and networking capabilities of our competition. So PCIe Gen 5, support for 400-gig Ethernet, that 896 gigabytes per second of total system bandwidth. But all of that is with 2.4x more memory and 1.3x more compute per server than the competition. So that's really why we call it the most powerful Gen AI system in the world.

Now I've talked about some of the performance in AI workloads, but I want to give you just a little bit more color on that. When you look at deploying servers at scale, it's not just about performance. Our customers are also trying to optimize power, space, CapEx and OpEx and that's where you see some really nice benefits of our platform. So when you compare our Instinct platform to the competition, I've already shown you that we deliver comparable training performance and significantly higher inference performance, but in addition, what that memory capacity and bandwidth gives us is that customers can actually either run more models, if you're running multiple models on a given server, or you can run larger models on that same server.

So in the case where you're running multiple different models on a single server, the Instinct platform can run twice as many models for both training and inference than the competition. And on the other side, if what you're doing is trying to run very large models, you'd like to fit them on as few GPUs as possible. And so with the FP16 data format, you can run twice the number of LLMs on a single MI300X server compared to our competition. And this directly translates into lower CapEx. And especially if you don't have enough GPUs, this is really, really helpful.
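The capacity argument above comes down to how many bytes a model's weights occupy and how many GPUs that takes. The following is a minimal sketch, assuming weights only (it ignores KV cache, activations and framework overhead, so real deployments need more headroom). The 192 GB figure is from the talk; the 80 GB comparison point is an assumption derived from the 2.4x capacity ratio quoted earlier (192 / 2.4), and the function names are hypothetical.

```python
import math

BYTES_PER_PARAM = {"fp16": 2, "bf16": 2, "fp8": 1}


def weights_gb(params_billion: float, dtype: str) -> float:
    """Weight footprint in GB: 1e9 params * bytes per param / 1e9 bytes per GB."""
    return params_billion * BYTES_PER_PARAM[dtype]


def gpus_needed(params_billion: float, dtype: str, gpu_memory_gb: float) -> int:
    """Fewest GPUs whose combined memory holds the weights (weights only)."""
    return math.ceil(weights_gb(params_billion, dtype) / gpu_memory_gb)


def copies_per_server(params_billion: float, dtype: str,
                      gpu_memory_gb: float, gpus_per_server: int = 8) -> int:
    """How many independent copies of the model an 8-GPU server could hold."""
    return gpus_per_server // gpus_needed(params_billion, dtype, gpu_memory_gb)


MI300X_GB = 192.0          # from the talk
ASSUMED_OTHER_GB = 80.0    # assumption: 192 GB / 2.4x quoted capacity ratio

for label, mem in (("MI300X", MI300X_GB), ("assumed 80 GB GPU", ASSUMED_OTHER_GB)):
    n = gpus_needed(70, "fp16", mem)
    c = copies_per_server(70, "fp16", mem)
    print(f"{label}: 70B FP16 needs {n} GPU(s), {c} copies per 8-GPU server")
```

Under these assumptions a 70-billion-parameter model in FP16 (about 140 GB of weights) fits on one 192 GB GPU but needs two 80 GB GPUs, which is one way to arrive at a 2x difference in models per server.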

So to talk more about MI300X and how we're bringing it to market, let me bring our next guest to the stage. Oracle Cloud and AMD have been engaged for many, many years in bringing great computing solutions to the cloud. Here to tell us more about our work together is Karan Batta, Senior Vice President at Oracle Cloud Infrastructure.

Lisa Su

Hey, Karan.

Karan Batta

Hi, Lisa.

Lisa Su

Thank you so much for being here and thank you for your partnership. Can you tell us a little bit about the work that we're doing together?

Karan Batta

Yes. Thank you. Excited to be here today. Oracle and AMD have been working together for a long, long time, right, since the inception of OCI back in 2017 and so we've launched every generation of EPYC as part of our bare-metal compute platform. And it's been so successful, customers like Red Bull as an example. And we've expanded that across the board for all of the portfolio of PaaS services like Kubernetes, VMware, et cetera. And then we are also collaborating on Pensando DPUs, where we offload a lot of that logic so that customers can get much better performance, flexibility. And then earlier this year, we also announced that we're partnering with you guys on Exadata, which is a big deal, right? So we're super excited about our partnership with AMD and then what's to come with 300X.

Lisa Su

Yes. I mean, look, we really appreciate OCI has really been a leading customer as we talk about how do we bring new technology into Oracle Cloud. Now you're spending a lot of time on AI as well. Tell us a little bit about your strategy for AI and how we fit into that strategy?

Karan Batta

Absolutely. We're spending a lot of time on AI, obviously. We're doing that across the stack from infrastructure all the way up to applications; Oracle is an applications company as well. And so we're doing that across the stack. But from an infrastructure standpoint, we're investing a lot of effort into our core compute stack, our networking stack. We announced the clustered networking. And what I'm really excited to announce is that we're going to be supporting MI300X as part of that bare-metal compute stack.

Lisa Su

We are super thrilled about that partnership. We love the fact that you're going to have 300X. I know your customers and our customers are talking to us every day about it. Tell us a little bit about what customers are saying.

Karan Batta

Yes. We've been working with a lot of customers. Obviously, we've been collaborating a lot at the engineering level as well with AMD and customers are seeing incredible results already from the previous generation. And so I think that will actually carry through with the 300X. And so much so that we're also excited to actually support MI300X as part of our generative AI service that's going to be coming up live very soon as well. So we're very, very excited about that. We're working with some of our early customer adopters like Naveen from Databricks. So we're very excited about the possibility. We are also very excited about the fact that the ROCm ecosystem is going to help us continue that effort moving forward. So we're very pumped.

Lisa Su

That's wonderful, Karan. Thank you so much. Thank your teams. We're so excited about the work we're doing together and look forward to a lot more.

Karan Batta

Thank you, Lisa.

Lisa Su

Thank you. Now as important as the hardware is, software actually is what drives adoption. And we have made significant investments in our software capabilities in our overall ecosystem. So let me now welcome to the stage AMD President, Victor Peng, to talk about our software and ecosystem progress.

Victor Peng

Thank you, Lisa. Thank you, and good morning, everyone. Last June at the AI event in San Francisco, I said that the ROCm software stack was open, proven and ready. And today, I'm really excited to tell you about the tremendous progress we've made in delivering powerful new features as well as the high performance on ROCm, and how the ecosystem partners have been significantly expanding the support for Instinct GPUs and our entire product portfolio.

Today, there are multiple tens of thousands of AI models that run right out of the box on Instinct and more developers are running on the MI250 and soon, they'll be running on the MI300. So we've expanded deployments in the data center, at the edge, in client and embedded applications of our GPUs, CPUs, FPGAs and adaptive SoCs, really end-to-end. And we're executing on that strategy of building a unified AI software stack, so any model, including generative AI, can run seamlessly across our entire product portfolio.

Now today, I'm going to focus on ROCm and expanding ecosystem support for our Instinct GPUs. We architected ROCm to be modular and open source to enable very broad user accessibility and rapid contribution by the open source community and AI community. Open source and the ecosystem are really integral to our software strategy. And in fact, really open is integral to our overall strategy. This contrasts with CUDA, which is proprietary and closed.

Now the open source community, everybody knows, moves at the speed of light in deploying and proliferating new algorithms, models, tools and performance enhancements. And we are definitely seeing the benefits of that in the tremendous ecosystem momentum that we've established. To further accelerate developer adoption, we recently announced that we're going to be supporting ROCm on our Radeon GPUs. This makes AI development on AMD GPUs more accessible to more developers, start-ups and researchers.

So our foot is firmly on the gas pedal with driving the MI300 to volume production and our next ROCm release. So I'm really super excited that we'll be shipping ROCm 6 later this month. I'm really proud of what the team has done with this really big release. ROCm 6 has been optimized for Gen AI, particularly large language models, has powerful new features, library optimizations, extended ecosystem support and increases performance by factors. It really delivers for AI developers. ROCm 6 supports FP16 and BF16 and the new FP8 data types for higher performance while reducing both memory and bandwidth needs.

We've incorporated advanced graph and kernel optimizations and optimized libraries for improved efficiency. We're shipping state-of-the-art attention algorithms like FlashAttention-2 and PagedAttention, which are critical for performant LLMs and other models. These algorithms and optimizations are complemented by the new release of RCCL, our collective communications library for efficient, very large-scale GPU deployments. So look, the bottom line is ROCm 6 delivers a quantum leap in performance and capability.

Now I'm going to first walk you through the inference performance gains you'll see with some of these optimizations on ROCm 6. So for instance, running a 70 billion parameter Llama2 model, PagedAttention and other algorithms speed up token generation by paging attention keys and values, delivering 2.6x higher performance. HIPGraph allows processing to be defined in graphs rather than single operations, and that delivers a 1.4x speed up. FlashAttention, which is a widely used kernel for very high-performance LLMs, delivers a 1.3x speed up.
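The idea of paging attention keys and values can be illustrated with a toy block table. This is a minimal sketch of the general technique, not the ROCm or any library's implementation: the cache is split into fixed-size physical blocks, each sequence keeps a list of the blocks it owns, and a new block is claimed only when the current one fills up, so memory is not reserved up front for a sequence's maximum length. The class name, method names and block sizes are all hypothetical.

```python
class PagedKVCache:
    """Toy bookkeeping for a paged key/value cache (no tensors, indices only)."""

    def __init__(self, num_blocks: int, block_size: int) -> None:
        self.block_size = block_size
        self.free_blocks = list(range(num_blocks))  # physical block ids not in use
        self.block_tables = {}                      # seq_id -> list of physical block ids
        self.seq_lens = {}                          # seq_id -> number of tokens stored

    def append_token(self, seq_id: str) -> tuple:
        """Reserve a slot for one more token; return (physical_block, offset)."""
        n = self.seq_lens.get(seq_id, 0)
        table = self.block_tables.setdefault(seq_id, [])
        if n % self.block_size == 0:                # no block yet, or last block is full
            if not self.free_blocks:
                raise MemoryError("no free KV blocks")
            table.append(self.free_blocks.pop())
        self.seq_lens[seq_id] = n + 1
        return table[n // self.block_size], n % self.block_size

    def free(self, seq_id: str) -> None:
        """Return all blocks of a finished sequence to the free pool."""
        self.free_blocks.extend(self.block_tables.pop(seq_id, []))
        self.seq_lens.pop(seq_id, None)


cache = PagedKVCache(num_blocks=4, block_size=2)
slots_a = [cache.append_token("a") for _ in range(3)]  # 3 tokens need 2 blocks
slot_b = cache.append_token("b")                       # a second sequence shares the pool
print(slots_a, slot_b, cache.free_blocks)
cache.free("a")                                        # blocks become reusable
print(cache.free_blocks)
```

Because blocks are handed out on demand and returned when a sequence finishes, more concurrent sequences can share the same memory than with one contiguous maximum-length buffer per sequence.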

So all those optimizations together deliver an 8x speed up on the MI300X with ROCm 6 compared to MI250 and ROCm 5. That's 8x performance in a single generation. So this is one of the huge benefits we provide to customers with this great performance improvement with the MI300X.
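As a rough sanity check on how the individual gains relate to the 8x total, the sketch below multiplies the three quoted speedups. It assumes the gains are independent and compose multiplicatively, which the talk does not state; the talk also does not break down the remainder, so the leftover factor simply covers everything else in the comparison, such as the move from MI250 to MI300X hardware and the other optimizations mentioned.

```python
from math import prod

# Speedups quoted in the talk for ROCm 6 optimizations on Llama2 70B.
QUOTED_GAINS = {
    "PagedAttention and other algorithms": 2.6,
    "HIPGraph": 1.4,
    "FlashAttention": 1.3,
}
OVERALL_SPEEDUP = 8.0  # MI300X with ROCm 6 vs. MI250 with ROCm 5, as quoted

# Assumption: the quoted gains are independent and multiply.
software_product = prod(QUOTED_GAINS.values())
remaining_factor = OVERALL_SPEEDUP / software_product

print(f"product of quoted gains: {software_product:.2f}x")
print(f"factor left for everything else: {remaining_factor:.2f}x")
```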

So now let's look at it from a competitive perspective. Lisa had highlighted the performance of large models running on multiple GPUs. What I'm sharing here is the performance of smaller models running on single GPUs. In this case, with the 13 billion parameter Llama2 model, the MI300X and ROCm 6 together delivered 1.2x higher performance than the competition. So this is the reason why our customers and our partners are super excited about training the next innovations in AI on the MI300X.

So look, we're relentlessly focused on delivering leadership technology and very comprehensive software support for AI developers. To fuel that drive, we've been significantly strengthening our software teams through both organic and inorganic means, and we're expanding our ecosystem engagements. So we recently acquired Nod.ai and Mipsology. Nod brings world-class expertise in open source compilers and runtime technology. They've been instrumental in the MLIR compiler technology as well as in the communities. And as part of our team, they are significantly strengthening our customer engagements and they're accelerating our software development plans. Mipsology also strengthens our capabilities, especially in delivering to customers in very AI-rich applications like autonomous vehicles and industrial automation.

So now let me turn over to the ecosystem. In addition to working closely with the -- sorry, we announced that we had the partnership with Hugging Face just last June. Today, they have 62,000 models running daily on Instinct platforms. And in addition, we worked closely on getting these LLM optimizations as part of their Optimum library and toolkit.

Our partnership with PyTorch Foundation has also continued to thrive with CI/CD pipelines and validation, enabling developers to target our platforms directly. And we continue to make very significant contributions to all the major frameworks, including upstream support for AMD GPUs in JAX, OpenXLA, GPAI and even initiatives like DeepSpeed for science. Just yesterday, the AI Alliance was announced with over 50 founding members that also include AMD, IBM and Meta and other companies. And I'm really delighted to share some very late-breaking news. AMD GPUs, including the MI300, will be supported in the standard OpenAI Triton distribution starting with the 3.0 release.

We're really thrilled to be working with Philippe Tillet, who created Triton, and the whole OpenAI team. AI developers using OpenAI Triton are more productive working at a high level of design abstraction, and they still get really excellent performance. This is great for developers and aligned with our strategy to empower developers with powerful and open software stacks and GPU platforms. This is in contrast to the much greater effort developers would need to invest working at a much lower level of abstraction in order to [eke out] performance.

Now I've shared a lot with you about the progress we made on software. But the best indication of the progress we've really made are the people using our software and GPUs and what they're saying. So it gives me great pleasure to have 3 AI luminaries and entrepreneurs from Databricks, Essential AI and Lamini join me on stage. Please give a very warm welcome to Ion Stoica, Ashish Vaswani and Sharon Zhou.

Okay, great. Welcome Ion, Ashish and Sharon. Thank you so much for joining us here. Really appreciate it. So I'm going to ask each of you a bit about, first, the mission of your company, and to share about the innovations you're doing with our GPUs and software and what the experience has been like. So let me start with you, Ion. Now Ion, you're not only a founder of Databricks, but you're also on the staff at UC Berkeley, Director of Sky Computing Lab, and you've also been working with Anyscale and many AI start-ups. So maybe you could talk about your engagement with AMD as well as your experience with the MI200 and MI300?

Ion Stoica

Yes. Thank you very much. Very glad to be here. And yes, indeed, I collaborated with AMD wearing multiple hats: Director of Sky Computing Lab at Berkeley, which AMD is supporting, and also founder of Anyscale and Databricks. And in all my work over the years, one thing I really focus on is democratizing the access to AI. What this means, it's improving the scale, performance and cost, reducing the cost to run these large AI applications, which means everything from AI workloads, everything from training, fine-tuning, inference and generative AI applications.

Just to give you some examples, we developed vLLM, which is arguably now the most popular open source inference engine for LLMs. We have developed Ray, another open source framework, which is used to distribute machine learning workloads. Ray has been used by OpenAI to train ChatGPT. And more recently, at Sky Computing, one of the projects there is SkyPilot, which helps you to run your applications or machine learning applications and workloads across multiple clouds. And why you want to do that is because you want to alleviate the scarcity of the GPUs and reduce the cost.

Now when it comes to our collaborations, we collaborate on all these kinds of projects, and one thing which was a very pleasant surprise is that it was very easy to run and include ROCm in our stack. It really -- it runs out of the box from day 1. Of course, you need to do more optimization for that. And this is what we are doing, and we are working on. So for instance, we added support for MI250 to Ray, and we are working, actually collaborating with AMD like I mentioned, to optimize the inference for vLLM, again, running on MI250 and MI300X. And from the point of view of SkyPilot, we are really looking forward to having more and more MI250s and MI300Xs in various clouds, so you have more choices.

Victor Peng

Thank you so much for all the collaboration. So Ashish, why don't you tell us about Essential's mission and also your experience with ROCm and Instinct?

Ashish Vaswani

Good to be here, Victor. At Essential, we're really excited. We're really excited to push this -- push the boundaries of human-machine partnership at enterprises. We should be able to do -- we're at the beginning stages where we'll be able to do 10x or 50x more than what we can just do by ourselves today. So we're extremely excited. And what that's going to take, our belief is it's going to be a full stack approach. So you're building the models, serving infrastructure. But more importantly, understanding workflows in enterprises today and giving people the tools to configure these models, teach these models to configure them for their workflows end-to-end. And so the model is done with feedback. They get better with feedback, they get smarter and then they're eventually able to even guide non-experts to do tasks that they were not able to do.

So we're really excited. And we actually were lucky to start to benchmark the MI250 earlier this year. We want to solve a couple of hard problems, scientific problems, and we were like, hey, are we going to get long context? Check. Okay. So you've got to be able to train larger models and be able to serve larger models on fewer chips. And the ease of using the software was also very pleasant. And then we saw how things are progressing. For example, I think in 2 months, I believe, FlashAttention, which is a critical component to actually scale to longer sequences, appeared. So I'm generally very happy. I'm just impressed with the progress and excited about the chips.

Victor Peng

Thanks so much, Ashish. And Sharon, Lamini has a very innovative business model, working with enterprises on their private models. Why don't you share the mission and how the experience with AMD has been?

Sharon Zhou

Yes. Thanks, Victor. So by way of quick background, I'm Sharon, Co-Founder and CEO of Lamini. Most recently, I was on the computer science faculty at Stanford, leading a research group in generative AI. I did my PhD there also, under Andrew Ng, and I teach about a quarter of a million students and professionals online in generative AI. And I left Stanford to co-found Lamini on the premise of making the magical, difficult, expensive pieces of building your own language model inside an enterprise extremely accessible and easy to use, so that companies who understand their domain-specific problems best can be the ones who can actually wield this technology and, more importantly, fully own that technology. In just a few lines of code, you can run an LLM and be able to imbue it with knowledge from millions of documents, which is 40,000 times more than hitting Claude 2 Pro, that API. So just a huge amount of information can be imbued into the technology using our infrastructure. And more importantly, our customers get to fully own their models. For example, NordicTrack, one of our customers, makes all the ellipticals and treadmills in the gym. Their parent company is iFit, and they have over 6 million users on their mobile app platform. And so they're building an LLM that can actually create this personal AI fitness coach, imbued with all the knowledge they have in-house on what a good fitness coach is, and it turns out it's actually not a professional athlete. They tried to hire Michael Phelps; it did not work. So they have real knowledge inside of their company, and they're imbuing the LLM with that so that we can all have personal fitness trainers.

So we're very excited to be working with AMD. We actually have had an AMD cloud in production for over the past year on MI200s, so MI210s and MI250s. And we're very excited about the MI300s. And I think something that's been super important to us is that with Lamini software, we've actually reached software parity with CUDA on all the things that matter with large language models, including inference and training.

And I would say we have reached even beyond CUDA for things that matter for our customers. So that includes higher memory, and higher capacity means bigger models, and our customers want to be able to build bigger and more capable models. And then a second point, which Lisa kind of touched on earlier today, is that these chips can actually, given higher bandwidth, return results with lower latency, which matters for the user experience. Certainly for a personal fitness coach, but for all of our customers as well.

Victor Peng

Super excited. That's great. So, Ion, back to you, to change it up a little bit. You heard that several key components of ROCm are open source. We did that for rapid adoption and also to get more enhancements from the community, both open source and AI. So what do you think about the strategy? And how do you think this approach might help some of the companies that you've founded?

Ion Stoica

So obviously, given my history, I really love open source. I love the open source ecosystem, and we have tried over time to make our own contributions. And I think one thing to note is that many of the gen AI tools today are open source. We are talking here about Hugging Face, about PyTorch, Triton, like I mentioned, vLLM, Ray and many others. And many of these tools can actually run on AMD's ROCm stack today. And this makes ROCm another key component of the open source ecosystem. And I think this is great. And in time, I'm sure actually quite fast, the community will take advantage of the unique capabilities of AMD's MI250 and MI300X to innovate and to improve the performance of all these tools, which are running at a higher level of the gen AI stack.

Victor Peng

Great. And that's our purpose and aim. So I'm going to jump over to Sharon. So Sharon, how do you think AI workloads are evolving in the future? And what role do you think AMD Instinct GPUs, which you have great experience with, and ROCm can play in that future of AI development?

Sharon Zhou

Okay. So maybe a bit of a spicy take. I think that GOFAI, good old-fashioned AI, is not the future of AI. And I really do think it's LLMs, or some variant of these models, that can actually soak up all this general knowledge that is missing from these traditional algorithms. And we've seen this across so many different algorithms in our customers already. Those who are even at the leading edge of recommendation systems, forecasting systems and classification are using this because of that general knowledge that it's able to learn. So I think that's the future. It's maybe more known as Software 2.0, coined by my friend Andrej Karpathy. And I really do think Software 2.0, which is hitting these models time and time again instead of writing really extensive software inside a company, will be supporting enterprises 2.0, meaning the enterprises of the future, of the next generation. And I think the AMD Instinct GPUs are critical to ubiquitously supporting the Software 2.0 of the future. And we absolutely need compute to be able to run these models efficiently, to run lots of these models, more of these models and larger models with greater capabilities. So overall, I'm very excited with the direction of not only these AI workloads, but also the direction that AMD is taking in doubling down on these MI300s that, of course, can take on larger and more capable models for us.

Victor Peng

So Ashish, we'll finish up with you, and I'll give you the same kind of question. So what do you think about the future of AI workloads? And what role do you think our GPUs and ROCm can play, and how are you driving things at Essential?

Ashish Vaswani

Yes, so I think that we have to improve reasoning and planning to solve these complex tasks. Take an analyst. If they want to absorb an earnings call and figure out how they should revise their opinion on whether to invest in a company, or what recommendations they should provide, right? It's actually going to take reasoning over multiple steps. It's going to take ingesting a large document and being able to extract information from it, apply their models, actually ask for information when they don't have any, get world knowledge, but also maybe have some outside reasoning and planning there.

So when I look at the MI300 with very large HBM and high memory bandwidth, I think of what's going to be unlocked, which capabilities are going to be improved and what new capabilities will be available. I mean, even with what we have today, just imagine a world where you can process long documents, or you can make these models much more accurate by adding more examples in the prompt. But imagine complete user sessions that you can maintain in model state, and how they would actually improve the end-to-end user experience, right? And I think that we're moving to a kind of architecture where what typically used to happen in inference, a lot of search, is now going to go into training, where the models are going to explore thousands of solutions and eventually pick the one that's actually the best solution for the goal. And definitely, the large HBM and high bandwidth is going to be important not only for serving larger models with low latency for a better end-to-end experience, but also for some of these new techniques that we're just exploring that are going to improve the capabilities of these models. So I'm very excited about the new chip and what it's going to unlock.

Victor Peng

Great. Thank you, Ashish. Ion, Ashish, Sharon, this has been really terrific. Thank you so much for all the great insights you have provided us. Thank you for joining us today.

It's just so exciting to hear what companies like Databricks, Essential AI and Lamini are achieving with our GPUs, and I'm just super thrilled that their experience with our software has been so smooth and really a delight. So you can tell, they see absolutely no barriers, right, and they are extremely motivated to innovate on AMD platforms.

Okay. To sum it up, what we delivered over the past 6 months is empowering developers to execute their mission and realize their vision. We'll be shipping ROCm 6 very soon. It's optimized for LLMs, and together with the MI300X, it's going to deliver an 8x gen-on-gen performance improvement, and it has higher performance in inference than the competition. We have 62,000 models running on Instinct today, and more models will be running on the MI300 very soon. We have very strong momentum, as you can see, in the ecosystem, adding OpenAI Triton to our extensive list of industry standard frameworks, models, run times and libraries. And you heard from the panel, right, our tools are proven and easy to use. Innovators are advancing state-of-the-art AI on AMD GPUs today. ROCm 6 and the MI300X will drive an inflection point in developer adoption. I'm confident of that.

We're empowering innovators to realize the profound benefits of pervasive AI faster on AMD. Thank you.

And now I'd like to invite Lisa back on the stage.

Lisa Su

Thanks, Victor. And weren't those innovators great? I mean you love the energy and just all of the thoughts there. So look, as you can see, the team has really made great, great progress with ROCm and our overall software ecosystem. Now, as I said, we really want broad adoption for MI300X. So let's go through and talk to some additional customers and partners who are early adopters of MI300X. Our next guest is a partner really at the forefront of Gen AI innovation, working across models, software and hardware. Please welcome Ajit Mathews of Meta to the stage.

Hello, Ajit. It's so nice of you to be here. We're incredibly proud of our partnership together. Meta and AMD have been doing so much work together. Can you tell us a little bit about Meta's vision in AI because it's really broad and key for the industry?

Ajit Mathews

Absolutely. Thanks, Lisa. We are excited to partner with you and others and innovate together to bring generative AI to people around the world at scale. Generative AI is enabling new forms of connection for people around the world, giving them the tools to be more creative, expressive and productive. We are investing for the future by building new experiences for people across our services, and advancing open technologies and research for the industry.

We recently launched AI stickers, image editing, Meta AI, which is our AI assistant that spans our family of apps and devices, and lots of AIs for people to interact with in our messaging platforms. In July, we opened access to our Llama 2 family of models. And as you've seen, we have been blown away by the reception from the community, who have built some truly amazing applications on top of them. We believe that an open approach leads to better and safer technology in the long run, as we have seen from our involvement in the PyTorch Foundation, the Open Compute Project and across dozens of previous AI model and data set releases. We are excited to have partnered with the industry on our generative AI work, including AMD. We have a shared vision to create new opportunities for innovation in both hardware and software to improve the performance and efficiency of AI solutions.

Lisa Su

That's so great, Ajit. We completely agree. I mean, we completely agree with the vision. We agree with the open ecosystem and that really being the path to get all of the innovation from all the smart folks in the industry. Now we've collaborated a lot on the product front as well, both EPYC and Instinct. Can you talk a little bit about that work?

Ajit Mathews

Yes, absolutely. We have been working together on EPYC CPUs since 2019 and most recently deployed Genoa and Bergamo-based servers at scale across Meta's infrastructure, where they now serve many diverse workloads. But our partnership is much broader than EPYC CPUs, and we have been working together on Instinct GPUs starting with the MI100 in 2020. We have been benchmarking ROCm and working together on improvements for its support in PyTorch across each generation of AMD Instinct GPU, leading up to MI300X now.

Over the years, ROCm has evolved, becoming a competitive software platform due to optimizations and ecosystem growth. AMD is a founding member of the PyTorch Foundation and has made a significant commitment to PyTorch, providing day zero support for PyTorch 2.0 with ROCm, torch.compile, torch.export; all of those things are great. We have seen tremendous progress on both Instinct GPU performance and ROCm maturity and are excited to see ecosystem support grow beyond PyTorch 2.0, like to OpenAI Triton, today's announcement with respect to being a default back end of AMD, that's great. FlashAttention-2 is great. Hugging Face, great, and other industry frameworks. All of these are great partnerships.

Lisa Su

It really means a lot to hear you say that, Ajit. You know, I think we also view that it's been an incredible partnership. I think the teams work super closely together, and that's what you need to do to drive innovation. And the work with the PyTorch Foundation is foundational for AMD, but really the ecosystem as well. But our partnership is very exciting right now with GPUs. So can you talk a little bit about the 300X plans?

Ajit Mathews

Oh, here we go. We are excited to be expanding our partnership to include Instinct MI300X GPUs in our data centers for AI production workloads. So just to give a little background, MI300X leverages the OCP accelerator module standard and platform, which has helped us adopt it in record time. In fact, MI300X is trending to be one of the fastest design-to-deployment solutions in Meta's history. We have also had a great experience with ROCm and the performance it is able to deliver with MI300X.

The optimizations and the ecosystem growth over the years have made ROCm a competitive software platform. As model parameters increase and the Llama family of models continues to grow in size and power, which it will, the MI300X with its 192 GB of memory and higher memory bandwidth meets the expanding requirements for large language model inference. We are really pleased with the ROCm optimizations that AMD has done, focused on the Llama 2 family of models on MI300X. We are seeing very promising performance numbers, which we believe will benefit the industry.

So to summarize, we are thrilled with our partnership and excited about the capabilities offered by the MI300X and the ROCm platform as we start to scale their use in our infrastructure for production workloads.

Lisa Su

That is absolutely fantastic, Ajit.

Ajit Mathews

Thank you, Lisa.

Lisa Su

Thank you so much. We are thrilled with the partnership, and we look forward to seeing lots of the MI300Xs in your infrastructure. So thank you for being here.

So super exciting. We said cloud is really where a lot of the infrastructure is being deployed. But enterprise is also super important. So when you think about the enterprise right now, many enterprises are actually thinking about their strategy. They want to deploy AI broadly across both cloud and on-prem. And we are working very closely with our OEM partners to bring very integrated enterprise AI solutions to the market.

So to talk more about this, I would like to invite one of our closest partners to the stage, Arthur Lewis, President of Dell Technologies Infrastructure Solutions Group. Welcome, Arthur. I am so glad you could join us for this event. Dell and AMD have had such a strong history of partnership. I actually also think, Arthur, you have a very unique perspective on what's happening in the enterprise just given your purview. So can we just start with giving the audience a little bit of a view of what's happening in enterprise AI?

Arthur Lewis

Yes. Lisa, thank you for having me today. We are at an inflection point with artificial intelligence. Traditional machine learning and now generative AI is a catalyst for much greater data utilization, making the value of data tangible, and therefore, quantifiable. Data, as we all know, is growing exponentially. One hundred zettabytes of data were generated last year, more than doubling over the last three years. And IDC projects that data will double again by 2026, and it is clear that data is becoming the world's most valuable asset. And this data has gravity. 83% of the world's data resides on-prem, and much of the new data will be generated at the edge. Yet customers are dealing with years of rapid data growth, multiple copies on-prem and across clouds, and proliferating data sources, formats and tools. These challenges, if not overcome, will prevent customers from realizing the full potential of artificial intelligence in maximizing real business outcomes.

Today, customers are faced with two suboptimal choices. Number one, stitch together a complex web of technologies and tools and manage it themselves; or, two, replicate their entire data estate in the public cloud. Customers need and deserve a better solution. Our job is to bring artificial intelligence to the data.

Lisa Su

That's great perspective, Arthur. And that 83% of the data and where it resides, I think is something that sticks in my mind a lot. Now let's move to a little bit of the technology. I mean, we have been partnering together to bring some great solutions to the market. Tell us more about what you have planned from a tech standpoint.

Arthur Lewis

Well, today is an exciting day. We are announcing a much anticipated update to our PowerEdge XE9680 family, the fastest growing product in Dell ISG history, with the addition of AMD's Instinct MI300X accelerator for artificial intelligence. Effective today, we are going to be able to offer a new configuration of 8 MI300X accelerators providing 1.5 terabytes of coherent HBM3 memory, delivering bandwidth of 5.3 terabytes per server. This is an unprecedented level of performance in the industry and will allow customers to consolidate large language model inferencing onto a fewer number of servers while providing for training at scale and also reducing complexity, cost and data center footprint.

We are also leveraging AMD's Instinct Infinity platform, which provides a unified fabric for connecting multiple GPUs within and across servers, delivering linear scaling and low latency for distributed AI. Further, and there's more, through our collaboration with AMD on software and open source frameworks, which you talked a lot about today, including PyTorch and TensorFlow, we can bring seamless services for customers and an out-of-the-box LLM experience.

We talked about making it simple. This makes it incredibly simple. And we've also optimized the entire stack with Dell storage, specifically PowerScale and ObjectScale, and with ultra-low latency Ethernet fabrics, which are designed specifically to deliver the best performance and maximum throughput for generative AI training and inferencing. This is an incredibly exciting step forward. And again, effective today, Lisa, we're open for business, we're ready to quote and we're taking orders.
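[Editor's note: The 1.5 terabyte memory figure for the 8-accelerator configuration can be checked against the 192 GB per MI300X mentioned earlier in the event. The following is a minimal sketch of that arithmetic only; the function name is hypothetical, and the choice of 1,024 GB per TB for the conversion is an assumption, not something stated by the speakers.]

```python
# Back-of-the-envelope check of the quoted aggregate HBM capacity.
# Figures from the transcript: 192 GB per MI300X, 8 accelerators per server.
HBM_PER_GPU_GB = 192
GPUS_PER_SERVER = 8


def total_hbm_gb(gpus: int = GPUS_PER_SERVER, per_gpu_gb: int = HBM_PER_GPU_GB) -> int:
    """Return the aggregate HBM capacity of one server in GB (hypothetical helper)."""
    return gpus * per_gpu_gb


total_gb = total_hbm_gb()
# Assumption: convert with 1,024 GB per TB; with 1,000 GB per TB the result is 1.536 TB.
total_tb = total_gb / 1024
print(f"{total_gb} GB is about {total_tb:.1f} TB")
```

[Either conversion rounds to the roughly 1.5 terabytes quoted on stage.]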

Lisa Su

I like the sound of that. Look, so great to see how this all comes together. Our teams have been working so closely together over the last few years and definitely over the last year. Tell us, though, there's a lot of co-innovation and differentiation in these solutions. So just tell us a little bit more about that.

Arthur Lewis

Well, our biggest differentiator is really the breadth of our technology portfolio at Dell Technologies. Products like PowerScale, which is our OneFS file system for unstructured data storage, have been helping customers in industries like financial services, manufacturing and life sciences solve the world's most challenging problems for decades, as the complexity of their workflows and the scale of their data estates increase.

And with AMD, we are bringing these components together with open networking products and AI fabric solutions, taking the guesswork out of building tailored gen AI solutions for customers of all sizes, again, making it simple. We have both partnered with Hugging Face to ensure transformers and LLMs for generative AI don't just work on our combined solutions but are optimized for AMD's accelerators and easy to configure and size for workloads with our products.

And in addition to Dell Validated Designs, we have a comprehensive and growing array of services and offerings that can be tailored to meet the needs of customers, from a complimentary Gen AI strategy consultation all the way up to a fully managed solution for generative AI.

Lisa Su

Fantastic Arthur, great set of solutions, love the partnership and love what we can do for our enterprise customers together. Thank you so much for being here.

Our next guest is another great friend. Supermicro and AMD have been working together to bring leadership computing solutions to the market for many years, based on AMD EPYC processors as well as Instinct accelerators. Here to tell us more about that, please join me in welcoming CEO Charles Liang to the stage.

Lisa Su

Hello, Charles. Thank you so much for being here. I mean, Supermicro is really well known for building highly optimized systems for lots of workloads. We've done so much together. Can you share a little bit about how you're approaching Gen AI?

Charles Liang

Thank you. Our building block solutions are based on modularized design, so that enables Supermicro to design products quicker than others and deliver products to customers quicker, too, to better leverage inventory and to be better with service. And thank you for the close relationship. Thank you for all the help. That's why we are able to design products and get to market as soon as possible.

Lisa Su

Well, I really appreciate that our teams also work very closely together. And we now know that everybody is calling us for AI solutions. You've built a lot of AI infrastructure. What are you seeing in the market today?

Charles Liang

The market continued growing very fast. The only limitation is…

Lisa Su

Very fast, right.

Charles Liang

Very fast, maybe more than very fast. All we need is just more chips. So, today, including the U.S.A., Netherlands, Taiwan and Malaysia, we have more than 4,000 racks per month of capacity. And customers are facing not enough power and not enough space problems. So we have our rack-scale building block solution, optimized for free air cooling, for hybrid air and free air cooling, and for liquid cooling. That can help customers save energy by up to 30% to even 40%, and that allows customers to install more systems with a fixed power budget, or the same power, same systems, but less energy cost.

So, all of those together with our rack-scale building block solution. We install a whole rack, including generative CPU, GPU, storage, switch, software, management software and security functions. And when we ship to customers, customers simply plug in two cables, power cable and data cable, and then they are ready to run, ready to go online. Liquid cooling customers, for sure, also need a water tube. So that lets customers easily go online once chips are available.

Lisa Su

Yes. No, that's fantastic. Thank you, Charles. Now let's talk a little bit about MI300X. What do you have planned for MI300?

Charles Liang

Okay. We have products based on MI300X: 8U for air cooling and 4U optimized for liquid cooling. For air cooling, per rack, we support up to 40 kW or 50 kW. For liquid cooling, we support up to 80 kW or 100 kW. And it's all rack-scale, plug and play. So when customers need it, once we have chips, we can ship to customers quicker.

Lisa Su

That sounds wonderful. Well, look, we appreciate all the partnership, Charles, and we will definitely see a lot of opportunity to collaborate together on generative AI. So thank you so much.

Charles Liang

Thank you so much. Thank you.

Lisa Su

Okay. Now let's turn to our next guest. Lenovo and AMD have a broad partnership as well that spans from data center to workstations and PCs and now to AI. So here to tell us about this special partnership. Please welcome to the stage, Kirk Skaugen, EVP and President of Infrastructure Solutions Group at Lenovo.

Lisa Su

Hello, Kirk. Thank you so much for being here. We truly appreciate the partnership with Lenovo, and you have a great perspective as well. Tell us about your view of AI and what's going on in the market.

Kirk Skaugen

Well, AI is not new for Lenovo. We've been talking and innovating around AI for many years. We just had a great Supercomputing conference, where we're the number one supercomputer provider to the TOP500. And we're proud that IDC just ranked us number three in AI server infrastructure in the world as well. So it's not new to us, but you were at Tech World. So thanks for joining us in Austin. We're trying to help shape the future of AI from the pocket to the edge to the cloud. And we've had this kind of concept of AI for all.

So what does that mean? Pocket, meaning Motorola, smartphone, AI devices and then all the way to the cloud with our ODM plus model. So our collaboration with our customers is really to accelerate AI adoption. And we recently announced another $1 billion to the original $1.2 billion we announced a few years ago to deliver AI solutions to businesses of all sizes. From the smallest business to the largest cloud. So we believe that generative AI will ultimately be a hybrid approach. And fundamentally, we do want to bring AI to the data. I think one of the most exciting things for me is I think like Arthur said, right, we'd see data doubling in the world over the next few years. 75% of that compute is moving to the edge. And today, we're only computing 2% of it.

So we're throwing away 98%. So more data is going to be created in the next few years than in the entire history of the world combined. And together, we're bringing AI to the edge with the ThinkEdge SE455 that we recently announced. We think that there are kind of three views of generative AI: public AI, private AI and personal AI. And the key for us is protecting privacy and addressing data security. So there's public AI, where you'd use, obviously, public data; enterprise AI, where you use only your enterprise data within your firewall; and then, on things like an AI PC, personal AI, things that you choose to have only on your device, whether that's a phone, a tablet or a PC.

Lisa Su

Yes, it's a very comprehensive vision, and we see it very much the same way. Now you talked a lot about your AI strategy at Tech World and you had some key pillars there. Do you want to just tell us a little bit more about that?

Kirk Skaugen

Yes. So there are three fundamental pillars of our AI vision and strategy. First, we have an AI product road map that I think is second to none, from a rich smart device portfolio, and we'll talk about the AI PC probably more on another day, to smartphones and tablets. Then we have a huge array now of over 70 AI-ready server and storage infrastructure products. And then we recently launched a whole set of solutions and services around that as well.

So more than 70 products, and we'll talk about the new ones we're announcing today, which are very exciting. The second thing is we have something called an AI innovators program. What's really daunting to people is there's over 16,000 AI start-ups out there. So if you have an IT department of a few dozen people, how do you even start.

So we've gone and scoured the earth. We found 65 ISVs and 165 solutions where we've optimized them on top of Lenovo infrastructure for some of the key verticals and are delivering kind of simplified AI to the customer base. And then at Tech World, we launched a comprehensive set of professional services. Now at Lenovo, more than 40% of our revenue is non-PC. So we're transforming into data center and services.

So we're doing everything in AI, from just basic customer discovery of what you can do, if you're a stadium, what are the best-in-class stadium solutions, if you're a fast food chain, if you're a supermarket, all the way to AI adoption, and then even, from a sustainability perspective, things like asset recovery services to make sure you have a sustainable AI journey as well.

Lisa Su

Yes, it makes a lot of sense. And Gen AI and large language models are like sort of the defining moment for us right now. And you're spending a lot of time with customers. What are you hearing from them? And what are their challenges?

Kirk Skaugen

Yes. So I think the key message is that customers need help in simplifying their AI journey. I mean there's so much coming at them. So our investments, that $2 billion we talked about, are really expanding our AI-ready portfolio to deliver fully integrated systems that bring AI-powered computing to everywhere data is created, especially the edge, and helping businesses easily and efficiently deploy generative AI applications.

We're also hearing that customers want choice. Choice in systems, choice in software, choice in services. And definitely large language models and model training are creating a lot of buzz. But over time, I think we all know inference is going to become the dominant AI workload as data flows from these billions of connected devices at the edge.

So generative AI, from our perspective, like you said, I think, in your opening comments, needs high-performance compute, large and fast memory and a software stack to support the leading AI ecosystem solution. So with that, I believe Lenovo and AMD are really uniquely positioned to take advantage of these trends.

Lisa Su

Yes, absolutely. And our teams are doing a lot of work together and working closely on MI300X. Tell us more about your plans.

Kirk Skaugen

We have a long proven track record as a PC company and as a data center company, including bringing Ryzen AI to our ThinkPads. And we're committed to being time to market on large language models and on inferencing, and we're working with AMD to develop our next-gen AI product road map and our solution portfolios. So we're incredibly excited today about the addition of the MI300X to the Lenovo ThinkSystem platform. It's going to be very exciting.

So we're committed to be time to market with a dual-EPYC, 8-GPU MI300X system, and we have a lot of customer interest in that. So bottom line, from edge to cloud, we are incredibly excited about what's ahead for us. We're going to have all of this available as a service through our Lenovo TruScale as well. So you only have to pay for what you need. So as we move to an as-a-service model, everything we talked about today will be available through that as well. So thank you very much, and I look forward to continuing the collaboration.

Lisa Su

Absolutely, Kirk, thank you so much. Thanks for the partnership.

Kirk Skaugen

Alright, thank you.

Lisa Su

So that's great. Big thank you to Kirk and Arthur and Charles for all the work that we're doing together to really bring MI300X to our customers. It really does take an entire ecosystem. We're very proud of actually the broad OEM and ODM ecosystem that we have brought together to bring a wide range of MI300X solutions to market in 2024. And in addition to the OEM and ODM ecosystem, we're also significantly expanding our work with some of these specialized AI cloud partners.

So I'm happy to say today that all of these partners are adding MI300X to their portfolio. And what's important about this is it will actually make it easier for developers and AI start-ups to get access to MI300X GPUs as soon as possible, with a proven set of providers who each have their own unique value and capabilities. So that tells you a little bit about the ecosystem that we're putting together for MI300X.

Now we've given you a lot of information already. But what is very, very important is not just the hardware and the software and all of our customer partnerships, but it's also the rest of the system partnerships. So now let me welcome to the stage Forrest Norrod to talk more about our AI networking and high-performance computing solutions.

Forrest Norrod

Thank you, Lisa. Good morning. So far, we've talked about the amazing GPU and open software ecosystem that AMD is building to power generative AI systems. But there's a third element that's equally important to the performance and scalability of these large AI deployments, and that's networking. The compute required to train the most advanced models has increased by a factor of 50 billion over the past decade.

While GPU performance has also increased, what that performance demand means is we need many GPUs in order to deliver the required total performance. Leading AI clusters are now tens of thousands of GPUs, and that's only going to increase. So the first way we've scaled to meet that demand is within the server. A typical server has perhaps a couple of high-performance x86 CPUs and perhaps 8 GPUs. You've seen that today.

These are interconnected with a high-performance, low-latency, non-blocking local fabric. In the case of NVIDIA, that's NVLink; for AMD, that's Infinity Fabric. Both have high signaling rates and low latency, and both are coherent. Both have demonstrated the ability to offer near-linear scaling of performance as you increase the number of GPUs, and both have been proprietary, effectively only supported by the companies that created them.

I'm pleased to say that today, AMD is changing that. We are extending access to the Infinity Fabric ecosystem to strategic partners and innovative companies across the industry. Doing so allows others to innovate around the AMD GPU ecosystem to the benefit of customers and the entire industry. You'll hear more about this from one of our partners in a few minutes and much more on this initiative next year. But beyond the node, we still need to connect and scale to much larger numbers.

We need fabrics to connect the servers to one another, welding them into one resource. Now there are usually two networks connected to each of these GPU servers: a traditional Ethernet network used to connect the server to the rest of the traditional data center infrastructure and, more importantly, a back-end network to interconnect the GPUs, allowing them to share parameters, results and activations and to coordinate in the overall training and inference tasks. When we're connecting thousands of nodes like we do in AI systems, the network is critical to overall performance.

It has to deliver fast switching rates and very low latency. It must be efficiently scalable so that congestion problems don't limit performance. And at AMD, we believe it must also be open to allow innovation. Today, there are two options for the back-end fabric, InfiniBand or Ethernet. At AMD, we believe Ethernet is the right answer. It's a high-performance technology with leading signaling rates. It has extensions such as RoCE and RDMA to efficiently move data between nodes.

A set of innovations developed for leading supercomputers over the years. It's scalable, offering the highest-rate switching technology from leading vendors such as Broadcom, Cisco and Marvell, and we've seen tremendous innovation recently in advanced congestion control to deal with the issues of scale effectively. And most of all, it's open.

Open means companies can extend Ethernet, innovating on top as needed to solve new problems. We've seen that from Hewlett Packard Enterprise with their Slingshot technology, which powers the network at the heart of Frontier, the world's fastest supercomputer, enabling it to achieve exascale performance. And we've seen Google and AWS, who run some of the largest clusters in the world, develop their own Ethernet extensions.

And finally, maybe most importantly, we've seen the industry come together to create the Ultra Ethernet Consortium and standard, where leaders across the field have united to drive the future of Ethernet and ensure it's the best high-performance interconnect for AI and HPC. And we're proud to welcome to the stage today some of those networking leaders: Andy Bechtolsheim from Arista, Jas Tremblay from Broadcom and Jonathan Davidson from Cisco.

Welcome, gentlemen. It's not often that we have such a panel of Ethernet experts on the stage, but before we jump right into Ethernet, perhaps we can talk a little bit about the work of enabling an ecosystem for AI solutions, what that looks like, and why it is so important to have an open approach. And maybe, Jonathan, you could start.

Jonathan Davidson

Sure, absolutely. Well, first of all, congratulations on the announcements today. We look at how Ethernet is so critical because I remember back in the day doing testing on 10-megabit Ethernet interoperability. We're now at 400 gig, 800 gig, we have line of sight to 1.6 terabit. It is absolutely ubiquitous across the industry, and it's also interoperable. It's a beautiful thing. So that open standard is really important for us to be able to make this successful.

Forrest Norrod

Absolutely, and Jas, your thoughts as well?

Jas Tremblay

I 100% agree. Forrest, you and I share a vision of the power of the data center ecosystem. If you think about a data center, you've got thousands of companies coming together to work as one, and this is really enabled by open standards and a code of conduct that we shall interop. We're going to make things work together across companies, in some cases, across competitors. And I'm especially excited about the work that you and I have been doing on Infinity Fabric xGMI, and we want to let the industry know that the next generation of Broadcom PCIe switches, which are used as the internal fabric inside servers, are going to support Infinity Fabric xGMI, and we'll be sharing more details around that over the next few quarters. But I think this is -- it's important that we offer choices and options to customers and that we come together and jointly innovate.

Forrest Norrod

I completely agree. And Andy, you've long been a proponent of open.

Andy Bechtolsheim

Yes. Well, open standards have been the driving force for a lot of innovation throughout the industry's history. But nowhere is this more true than in the case of Ethernet, where the incredible progress we have seen for the last 40 years would not have happened without the contributions of many, many ecosystem participants, including the companies represented on this stage.

Forrest Norrod

Absolutely. Well, okay. So since this is a panel of Ethernet luminaries, let's talk about Ethernet in particular. What are the advantages of Ethernet for AI? What are the advantages of Ethernet in general? And how are customers using it today? And we'll talk about the future in a minute. But let's reflect on current state. Maybe Andy, you can start out.

Andy Bechtolsheim

Yes. So Ethernet, at least to me, is the clear choice for AI fabrics. And for a very basic reason: it doesn't have a scalability limit. It can truly support not just tens of thousands of nodes today, but hundreds of thousands, perhaps a million nodes in the future. And there is no other network technology that has that attribute. And without that scalability, you're just boxing yourself in.

Forrest Norrod

Yes, very true. And Jonathan, I know you guys have been doing quite a bit on AI and networking systems as well.

Jonathan Davidson

Well, for today, specifically, we see that the majority of hyperscalers, and we've had some of them on the stage today, are either using Ethernet for AI fabrics or have a high desire to move to Ethernet for their AI fabrics. And so that requires a lot of collaboration from the folks up here on stage to make that happen. We also have been helping customers deploy their AI networks for enterprise use cases globally, and it might have started more in the financial trading sector in the past.

We're seeing a tremendous amount of interest in use cases for that whole system and how you pull all those things together from network, the GPU, the NIC, the DPU, all the way to how you wrap the software around that to really make it simple and understand how things are working and when they're not working, why and making that simple for them to do that as well.

Forrest Norrod

Absolutely. And Jas, I know, well, all of us have been working together in deploying Ethernet-based solutions for AI leaders today. But I mean we've been working with the two gentlemen on the end on switching, but Jas, maybe you can reflect on the NIC as well.

Jas Tremblay

I think the NIC is critical. People want choices, and we need to move the innovation even faster in the NIC, and you'll see much more linkage between the NIC and the switch. Before, you had a compute domain and a network domain, and these things are really coming together, and AI is a driving force of that because the complexity is going up so much.

Forrest Norrod

Yes, absolutely. Well, okay. So let's talk about the future a little bit. The Ultra Ethernet Consortium: all three -- all four companies on stage are founding members, and there are many others that have joined. UEC is one of the fastest growing, or maybe the fastest-growing, consortium under the Linux Foundation, which has been great to see. I think UEC is going to shape the future of AI networking. And so let's unpack that because I think that's a critical topic for folks. And maybe, Jas, why don't you go ahead and start.

Jas Tremblay

Yes. So first of all, Ethernet is ready today for AI, but we need to continue to innovate. And UEC started with a group of eight companies, including the four of our companies here: cloud providers, system providers and semiconductor providers coming together around a common vision.

And the vision is that AI networks need to be open and standards-based. We need to offer choices, and we need to enhance them. And with that common vision, the engineers we've assigned from our companies really got together and rolled up their sleeves, and the innovation happened extremely quickly. It's quite exciting actually.

And one of the things that I'm most excited about is that we're not building something new. We are jointly going to enhance Ethernet, which has existed for 50 years. So it's not starting from scratch, it's enhancing. It's recognizing that Ethernet is what people want. We just need to continue to enhance it and make this open and standards-based.

Forrest Norrod

Absolutely. And Jonathan, I know Cisco has been a huge proponent of UEC as well. Maybe you can reflect on your thoughts of where this is going.

Jonathan Davidson

Absolutely. Well, I think that UEC absolutely is very critical for Cisco, everyone on the panel and the whole industry, so that we can continue to drive that movement towards open. It always takes time. You've got to debate what the right technical way to solve things is. But I think that overall, it's moving in the right direction. What I see happening here is that we're going to have to have interoperability in more than just one area.

Andy, I want to talk about LPO and all the things that we need to do there to make that actually happen. And what's happening with UEC is another important part. And what I see happening between now and when the first standard comes out is really a coalition of the willing. Like how do we get all of this together to drive towards those open interfaces?

Whether it be at the Ethernet layer, whether it be the things that you need to plug into it, how the GPUs connect into that, how you're actually going to spray traffic across a very broad radix, how you're going to make sure you can reorder packets in a consistent way. These are all things that we need to make sure that we are driving towards from an interoperability perspective. And we've got our own silicon, we've got optics, but we also are in the component business at Cisco.

And so we sell those things. Hyperscalers might want to just buy pieces from us, like the silicon, and enterprises may want the full system, but we want to make sure that it's absolutely 100% interoperable in every single environment.
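
The packet spraying and reordering mentioned above can be pictured with a small Python sketch. It is an illustration only, under assumed conditions: the round-robin spraying policy, the per-path delays and the function names `spray`, `deliver` and `reorder` are hypothetical and are not taken from the UEC work or from any vendor's implementation.

```python
import heapq

def spray(seqs, num_paths):
    """Assign each packet to a path round-robin (hypothetical spraying policy)."""
    return [(seq, seq % num_paths) for seq in seqs]

def deliver(sprayed, path_delay):
    """Return sequence numbers in arrival order.

    Packet `seq` is sent at time `seq` and arrives at `seq + path_delay[path]`,
    so packets on slower paths can be overtaken by later packets.
    """
    arrivals = sorted((seq + path_delay[path], seq) for seq, path in sprayed)
    return [seq for _, seq in arrivals]

def reorder(arrived):
    """Release packets strictly in sequence order using a min-heap buffer."""
    buffer, next_seq, released = [], 0, []
    for seq in arrived:
        heapq.heappush(buffer, seq)
        # Drain the buffer while the next expected packet is available.
        while buffer and buffer[0] == next_seq:
            released.append(heapq.heappop(buffer))
            next_seq += 1
    return released

sprayed = spray(range(6), num_paths=3)
arrived = deliver(sprayed, path_delay=[5, 1, 3])  # assumed delays per path
in_order = reorder(arrived)
print(arrived)   # out of order on the wire
print(in_order)  # restored by the receiver
```

Spraying uses every available link, but the receiver then has to restore order, and both ends have to agree on how that is done, which is where interoperability comes in.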

Forrest Norrod

Absolutely. And Andy, maybe you can hone in a little bit more. I mean I think many people that aren't familiar with networking may think, hey, how hard can this be? We're just shuffling bits around between systems, but there's a lot of problems to solve.

Andy Bechtolsheim

Yes. So UEC is in fact solving a very important technical problem, which, the way we describe it, is modern RDMA at scale. And this has not been solved before. To be clear, RoCE exists today, but it has its limitations. And it does take an ecosystem effort, and it involves, in particular, the adapter, the NIC silicon vendors, but also the whole end-to-end interoperability of that architecture. And we're very excited to be part of this. We're not in the NIC business ourselves, but this is absolutely key to enable scaling of RDMA across hundreds of thousands, if not 1 million nodes.

Forrest Norrod

Yes, absolutely. And when you look at what's being predicted in terms of million node, hundreds of thousands up to 1 million node systems. I mean, we're all -- we all have our work cut out for us. But working together, I know we can solve the problems.

Well, guys, thanks so much for coming to talk to us today. I'd like to thank you all for your partnership in this journey, and thank you all for coming today.

I'd really like to thank our partners from Arista, Broadcom and Cisco for attending and for their partnership in driving this critical third leg that determines the performance of AI systems.

Now let's turn our focus to high-performance computing, the traditional realm of the world's largest systems. AMD has been driving HPC technology for many years. In 2021, we delivered the MI250, introducing third-generation Infinity architecture. It connected an EPYC CPU to the MI250 GPU through a high-speed bus, Infinity Fabric. That allowed the CPU and the GPU to share a coherent memory space and easily trade data back and forth, simplifying programming and speeding up processing. But today, we're taking that concept one step further, really to its logical conclusion.

With the fourth-generation Infinity architecture, we're bringing the CPU and the GPU together into one package, sharing a unified pool of memory. This is an APU, an accelerated processing unit. And I'm very proud to say that the industry's first data center APU for AI and HPC, the MI300A, began volume production earlier this quarter and is now being built into what we expect to be the world's highest performing system.

Now Lisa already showed you what our chiplet technologies make possible with MI300X. The MI300A takes those same building blocks in a slightly different fashion. Now the I/O die is laid down first as before and contains the Infinity Cache and connections to memory and I/O. The XCD accelerator chiplets are bonded on top as in the MI300X. But with the MI300A, we also take CPU chiplets, leveraged directly from our fourth-generation EPYC CPUs, Genoa, and we put those on top of the I/O dies as well, thus bringing together our leading Zen CPU and CDNA technologies into one amazing part.

Finally, 8 stacks of HBM3 with up to 128 gigs of capacity complete the MI300A. A key advantage of the APU is no longer needing to copy data from one processor to another, even through a coherent link, because the memory is unified, both in the RAM as well as in the cache. The second advantage is the ability to optimize power management between the CPU and the GPU. That means dynamically shifting power from one processor to another depending on the needs of the workload, optimizing application performance.

And very importantly, an APU can dramatically streamline programming, making it easier for HPC users to unlock its full performance. And let's talk about that performance: 61 teraflops of double-precision floating point, FP64, and 122 teraflops of single precision. Combined with that 128 gigabytes of HBM3 memory at 5.3 terabytes a second of bandwidth, the capabilities of the MI300A are impressive. And they're impressive, too, when you compare it to the alternative. When you look at the competition, MI300A has 1.6x the memory capacity and bandwidth of Hopper.
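
A quick back-of-the-envelope calculation in Python shows how the quoted figures relate to one another. The inputs are the numbers stated above; the derived quantities and the variable names are illustrative only, and they are peak-rate arithmetic, not measured results.

```python
# Peak figures as quoted for the MI300A in the talk.
FP64_TFLOPS = 61.0       # double-precision teraflops
FP32_TFLOPS = 122.0      # single-precision teraflops
HBM_CAPACITY_GB = 128.0  # HBM3 capacity in gigabytes
HBM_BW_TBPS = 5.3        # memory bandwidth in terabytes per second

# Ratio of single- to double-precision peak throughput.
fp32_to_fp64 = FP32_TFLOPS / FP64_TFLOPS

# FP64 operations available per byte moved from memory at peak rates.
# A kernel doing fewer operations per byte than this is bandwidth-bound.
fp64_flops_per_byte = (FP64_TFLOPS * 1e12) / (HBM_BW_TBPS * 1e12)

# Time to read the entire HBM capacity once at peak bandwidth.
full_sweep_ms = (HBM_CAPACITY_GB * 1e9) / (HBM_BW_TBPS * 1e12) * 1e3

print(f"FP32/FP64 peak ratio: {fp32_to_fp64:.1f}")
print(f"FP64 flops per byte:  {fp64_flops_per_byte:.1f}")
print(f"Full memory sweep:    {full_sweep_ms:.1f} ms")
```

On these figures, a code needs roughly a dozen double-precision operations per byte fetched before it becomes compute-bound rather than bandwidth-bound, which is why memory capacity and bandwidth are quoted alongside the teraflops.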

For low-precision operations like FP16, the two are at parity in terms of computational performance. But where precision is needed, MI300A delivers 1.8x the double- and single-precision, FP64 and FP32, floating point performance. And beyond simple benchmarks, the real advantages of an APU come with the performance of real-world applications which have been tuned for the APU architecture. For example, let's look at OpenFOAM. OpenFOAM is a set of computational fluid dynamics codes widely used across research, academia and industry. With MI300A we see 4x the performance of Hopper on common OpenFOAM codes.

Now that performance comes from several places: from higher-performance math operations, as we talked about, larger memory and the increased memory bandwidth. But much of that uplift really comes from that unified memory eliminating the need to copy data around the system, which can give tuned applications truly transformative performance. And I'm also proud to say that beyond performance, AMD has stayed true to its heritage, to its history, of leading in power efficiency. At the node level, the MI300A has twice the HPC performance per watt of the nearest competitor.

Customers can thus fit more nodes into their overall facility power budget and better support their sustainability goals. With the MI300A, we set out to help our customers advance the frontiers of research, and not just by running traditional HPC applications. One of the most exciting new areas in HPC is actually the convergence with AI, where AI is used in conjunction with HPC techniques to help steer simulations, thus getting much better results much faster. A great example of this is CosmoFlow. It couples deep learning with traditional HPC simulation methods, giving researchers the ability to probe more deeply and allowing us to learn more about the universe at scale.

CosmoFlow is one of the first applications targeted to be run on El Capitan, which we believe will be the industry's first true two-exaflop supercomputer running double-precision flops when it's fully commissioned at Lawrence Livermore National Labs.

It's going to be an amazing machine. So let's hear more about El Capitan and its applications for HPC and AI from our partners at LLNL and Hewlett Packard Enterprise.

[Presentation]

Forrest Norrod

We are proud to have partnered with Hewlett Packard Enterprise to design and now build this amazing system. And so I'd like to invite to the stage Trish Damkroger, the Senior Vice President and Chief Product Officer for HPC, AI and Labs from Hewlett Packard Enterprise.

Welcome, Trish. The AMD and HPE teams have been working closely together over the years to deliver some next-generation supercomputers. Most recently, of course, we've broken the exascale bar. I got to say it again. We broke the exascale barrier with Frontier for Oak Ridge National Labs, and now we're looking forward to powering another exascale system, another bench -- another record with you with El Capitan for Lawrence Livermore National Labs, another U.S. Department of Energy lab. Maybe you can share more with this audience about our journey together and the innovations that we've ushered in this journey to exascale.

Trish Damkroger

Sure. First, I want to echo the long partnership that we've had with AMD. Frontier continues to be the fastest computer in the world. Many doubted our ability to actually reach exascale. But we were able to achieve this feat with industry-leading liquid cooling infrastructure, next-generation high-performance interconnect with Slingshot, our highly differentiated system management and Cray Programming Environment software, along with the incredible MI250.

With Frontier, exascale computing has already made breakthroughs in areas such as aerospace, climate modeling, health care and nuclear physics. Frontier is also one of the world's top 10 greenest supercomputers. In fact, HPE and AMD have the majority of the world's top 10 energy-efficient supercomputers.

I am very excited to deliver El Capitan to Lawrence Livermore. As you know, I worked there for over 15 years. El Capitan's computing prowess will fundamentally shift what the scientists and engineers will be able to achieve. El Capitan could be 15x to 20x faster than their current system. Supercomputing is truly essential to the mission of the Department of Energy. Lawrence Livermore has been at the forefront, driving the convergence of HPC and AI, demonstrated by work at the National Ignition Facility and other national security programs.

I'm really looking forward to continuing our journey of bringing more leadership class systems to the world.

Forrest Norrod

Absolutely. I couldn't agree more, Trish. It's been a rewarding journey working together with HPE. Speaking of our shared success in building these record-breaking systems, can you tell us a bit more about El Capitan and how HPE is deploying the Instinct MI300A APU in El Capitan?

Trish Damkroger

Great. Yes, El Capitan will feature the HPE Cray EX supercomputer with the MI300A accelerators to power large AI-driven scientific projects. The HPE Cray EX supercomputer was built from the ground up with end-to-end capabilities to support the magnitude of exascale. El Capitan nodes include the MI300A, coupled with our Slingshot fabric to operate as a fully integrated system. Supercomputing is the foundation needed for large-scale AI, and HPE is uniquely positioned to deliver this with our Cray supercomputers.

El Capitan will be that engine for AI and deep learning for the Department of Energy. They will be recreating the experimental environment and simulations and training the AI models with all of that vast amount of data. El Capitan will be one of the most capable AI systems in the world. And beyond El Capitan, we're excited to have expanded our supercomputing portfolio with the MI300A to bring next-generation accelerated compute to a broad set of customers.

Forrest Norrod

Yes. So Trish, it's fantastic. And actually, let's double-click into that a little bit more. I know that there are a growing number of supercomputing customers, not just at LLNL, that are really applying AI to their projects. Can you tell us a little bit more about that?

Trish Damkroger

Sure. So AI undoubtedly will be the catalyst to transform scientific research. As I said earlier, supercomputing is the foundation needed to run AI. And HPE is the undisputed leader in delivering supercomputers. Some examples where AI will be fundamental on El Capitan include the National Ignition Facility, where they will be using 1D, 2D and 3D simulations along with trained AI models to develop a more robust design for higher-yield fusion reactions. Just imagine fusion energy in our future.

Another application is high-resolution earthquake modeling, essential for understanding building structural integrity and also emergency planning. And another is bioassurance, where simulation and AI models will be key in developing rapid therapeutics. Supercomputing and AI are tools that allow engineers and scientists to find the unknown. I'm thrilled to be part of the journey of accelerating scientific discovery and the scale of impact it has on changing the way people live and work.

Forrest Norrod

Well, Trish. Thank you. I'm so excited about the opportunities that researchers and scientists will have with the systems that we're bringing to the market together. Thanks so much.

On behalf of AMD and the entire team, I really want to thank HPE and our customers for the opportunity to participate in the development of these massive systems, because El Capitan will be an amazing machine and a real showcase for the MI300A, which defines leadership at this critical juncture as HPC and AI converge. AMD is proud of the leadership systems powered by MI300A, which will be available soon from partners around the world. I can't wait to see what researchers and scientists are going to do with these systems.

And with that, I'd like to welcome Lisa back on stage to conclude our journey today. Thank you.

Lisa Su

All right. Thank you, Forrest, and thank you to all of our partners who joined us. You've heard from Victor, Forrest and our key partners: we have significant momentum, and we're building on that for the data center AI platforms. To cap off the day, let me now talk about another important area for AMD where we're delivering leadership AI solutions, and that's the PC.

Now for the PCs, we recognized several years ago that on-chip AI accelerators or NPUs, would be very, very important for next-generation PCs. And the NPU is actually the compute engine that will enable us to reimagine what it means to build a truly intelligent and personal PC experience. At AMD, we're on actually a multiyear journey. We have a strong road map to deliver the highest performance and most power-efficient NPUs possible.

We were actually the first company to integrate an NPU into an x86 processor when we launched the Ryzen Mobile 7040 series earlier this year, and we integrated the XDNA architecture that actually came from our acquisition of Xilinx. It actually took us less than a year to bring Xilinx's proven technology into our PC products.

Let me tell you a little bit about XDNA. It's a scalable and adaptive computing architecture. It's built around a large computing array that can efficiently transfer the massive amounts of data required for AI inference and as a result, XDNA is both extremely performant and also very energy efficient. So you can run multiple AI workloads simultaneously in real time.

Now I'm happy to say that we've already shipped millions of Ryzen AI-enabled PCs into the market with all of the leading PC OEMs. And all of this provides the hardware foundation for developers to leverage this first wave of AI PCs. Now if you look at some of the applications, today, Ryzen AI powers hundreds of different AI functions, things like advanced motion tracking and sharpening to de-blur 4K video enabling production level digital production capabilities with unlimited virtual cameras, all in an ultrathin notebook for the very first time.

We're also working with key software leaders like Adobe and Blackmagic, and they're using our on-chip Radeon GPU to accelerate the AI-enabled editing features so that you can dramatically improve productivity for content creators. And of course, we've worked very, very closely with Microsoft to enable Windows 11 Studio Effects on Ryzen AI.

Now today, we're launching some additional capabilities. So Ryzen AI 1.0 software will make it easier for developers to add advanced Gen AI capabilities. So with this new package, developers can create an AI-enabled application that's ready to run on AI hardware -- on Ryzen AI hardware just by choosing a pretrained model. So for example, you can choose one of the models that are available on Hugging Face, quantize it based on your needs and then deploy it through ONNX Runtime.
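
To make the "quantize" step concrete, here is a minimal Python sketch of symmetric 8-bit weight quantization. It is a generic illustration of the idea, assuming a single per-tensor scale; the function names `quantize_int8` and `dequantize` are hypothetical, and this is not the API or the method used by the Ryzen AI software or by ONNX Runtime.

```python
def quantize_int8(weights):
    """Map floats to integers in [-127, 127] using one shared scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs else 1.0  # avoid dividing by zero
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale

def dequantize(quantized, scale):
    """Recover approximate float values from the integer codes."""
    return [q * scale for q in quantized]

weights = [0.5, -1.27, 0.0, 1.0]  # made-up example values
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(q, scale, max_error)
```

Each weight is stored in one byte instead of four, at the cost of a rounding error bounded by half the scale, which is the trade-off a developer tunes when quantizing a model for an NPU.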

So this is a major step forward when you think about the broad ecosystem that wants to run AI apps for Windows, and we can't wait to see what ISVs will do when they really capture the leadership performance that you can get from an NPU in Ryzen AI.

Now of course, we know developers always want more AI compute. So today, I'm very happy to say that we're launching our Hawk Point Ryzen 8040 series mobile processors. Hawk Point combines all of our industry-leading performance and battery life, and it increases AI TOPS by 60% compared to the previous generation.

So if you just take a look at some of the performance metrics for the Ryzen 8040 series, if you look at the top of the stack, so Ryzen 9 8945, it's actually significantly faster than the competition in many areas, delivering more performance for multi-threaded applications, 1.8x higher frame rates for games and 1.4x faster performance across content creation applications.

But when you look at the AI improvements of Ryzen 8040, you really see some substantial improvements. So I talked about additional TOPS in Hawk Point, and that results in faster performance when you're running the key models. So on things like Llama 2 7B, we run 1.4x faster, and also 1.4x faster on things like AI image recognition and object detection models. So all of this, what does it do? It provides faster response times and overall better experiences.

Now I really believe that we're actually at the beginning of this AI PC journey, and it's something that is really going to change the way we think about productivity at a personal level. So we've been working very closely with Microsoft to ensure that we are co-innovating across hardware and software to enable that next generation of AI PCs. To share more about this work, I'm pleased to welcome Pavan Davuluri, Corporate Vice President of Windows and Devices at Microsoft, to the stage.

Pavan Davuluri

Thank you. Good to be here.

Lisa Su

Pavan, thank you so much for being here. We started the show with Kevin Scott talking about the great partnership between Microsoft and AMD and all the work we're doing on the big iron in the cloud in Azure. And it seems fitting that we closed the show with the other very, very important work that we're doing together on the client side. So can you tell us a little bit, Pavan, about all the great work and your vision for client AI?

Pavan Davuluri

For sure. As you and Kevin covered, Microsoft and AMD have a long partnership together across Azure and Windows. And it's incredible to see us moving that partnership together into the next wave of technology with AI. As you shared, Lisa, for us, there are millions of PCs right now with Ryzen 7040 AI in market. And that's amazing because these are the first x86 PCs with integrated NPUs, enabling enhanced AI.

Lisa Su

You told me everybody wanted NPUs.

Pavan Davuluri

Absolutely. Right now, we get to see some incredible AI, something you talked about: Windows Studio Effects coming to life across the scale of the ecosystem. Absolutely fantastic, I would say. Now for us at Microsoft and for the ecosystem, our marquee AI experience is really Copilot. Similar to how the Start button is the gateway into Windows, Copilot for us is the entry point into this world of AI on the PC. It has a fundamental impact on everything we will do on a computer, from work and school to play, entertainment and creation.

Lisa Su

I completely agree, Pavan. I think Copilot is so transformational. I mean, for everyone who's had a chance to experience it, it really changes the way we do work. So let's talk about the tech that's underneath it. So to enable Copilot and everything that we want to do on PCs.

Pavan Davuluri

We are putting together new systems architectures that really power those experiences going forward. And they really pull together the NPU and certainly the cloud as well. And quite honestly, we're seeing customer habits change early at this point in time. And we believe, to your point earlier, we're early in the cycle of innovation that's coming. When we have these powerful NPUs like the ones you're building, it gives us an opportunity to create apps that take advantage of both local and cloud inferencing.

And to me, that's what the Windows AI ecosystem is about, and that's what we're building in partnership with you. It's designed to enable these scenarios, with the ONNX Runtime, of course, and the Olive toolchain to back this up. Applications are going to have many models running, like Llama, which you mentioned, and Phi-2, and they will run very capably within the TOPS that we will have. And of course, not to mention the foundation models that are powered by the GPU in the cloud.

Lisa Su

Yes. I mean, I think this is an area where Microsoft and AMD really have a very unique position because we have so much capability in the cloud. We also have access to the client and the local view. Can you share a bit about how we're thinking across all of these -- the cloud and the local view?

Pavan Davuluri

Yes. With AMD, we're making it simpler to incorporate what we call the hybrid pattern or the hybrid loop into applications. And we want to be able to load shift between the cloud and the client to provide the best of computing across both those worlds. For us, it's really about seamless computing across the cloud and the client.

It brings together the benefits of local compute, things like enhanced privacy and responsiveness and latency with the power of the cloud, high-performance models, large data sets, cross-platform inferencing. And so for us, we feel like we're working together to build that future where Windows is the destination for the best AI experiences on PCs.

Lisa Su

Yes. No, I think that sounds great. Now one of the things, though, that you definitely are always talking to me about is more TOPS. Pavan has been asking for more TOPS all the time. So look, we completely believe that, and to enable your vision for AI experiences, we've really thought about how we actually accelerate our client AI road map.

So I want to share a little bit of our road map today. With Ryzen 7040 and 8040, we've already delivered those industry-leading NPU capabilities, but today, I'm very excited to announce that our next-gen Strix Point Ryzen processors will actually include a new NPU powered by our second-generation XDNA 2 architecture, coming in 2024.

A little bit about XDNA 2. It's designed really for leadership Gen AI performance. It delivers more than 3x the NPU performance of our current Ryzen 7040 series. And Pavan, I'm very happy to share, and I know your teams already know this because you have the silicon, that today Strix Point is running great in our labs, and we're really excited about it. Our teams have been working really closely together to make sure that all of those great future Windows AI features run really well on Strix Point. So I can't wait to share more about that later this year.

Pavan Davuluri

Lisa, that's awesome, and we will use every TOP you will provide us.

Lisa Su

You’ve promised, right?

Pavan Davuluri

Absolutely. I mean, it's not just the neural engines. The dramatic increase in efficiency, the performance per watt, of these next-generation NPUs will, we think, bring a whole new level of capabilities to the market, enabling personalization on every interaction on these devices.

Together with Windows, we feel like we're building that future for Copilot, where we will orchestrate across multiple apps, services and devices, quite frankly, functioning as an agent in your life that has context and maintains context across entire workflows. So we're very excited about these devices coming to life for the Windows ecosystem. We're excited to see what developers will do with this technology and, quite frankly, at the end of the day, ultimately, what customers will do with all this innovation.

Lisa Su

Thank you so much, Pavan. And we are so excited about the partnership. We appreciate all the long-term work we're doing together and look forward to lots of great things to come.

Pavan Davuluri

Thank you for having me.

Lisa Su

All right. So it's been such a fun day, but now it's time for me to wrap up a bit. We've shown you a lot of new products, a lot of new platforms, a lot of new technologies that are all about taking AI infrastructure to the next level. The MI300X and MI300A accelerators are all shipping today in production. They're already being adopted by Microsoft, Oracle, Meta, Dell, HP Enterprise, Lenovo, Supermicro and many others.

You heard from Victor how we're expanding the ecosystem of AI developers working with us. With ROCm 6 software and the open ecosystem, our goal is to make it incredibly easy for everyone to use Instinct GPUs.

You've heard from Forrest and our panel on the overall system architecture. In our work with Arista, Broadcom and Cisco, we believe that to create this high-performance AI infrastructure, it has to be open. And that's what we're doing together for scale-out AI solutions.

And then you heard what we're doing on the other side, the client part of our business because we actually believe AI should be everywhere. So our latest Ryzen processors really extend our compute vision and our AI leadership. I hope you can see that AI is absolutely the number one priority at AMD.

Our goal is to push the envelope, to bring innovation to the market, to do more than anyone thought was possible, because we believe that, as wonderful as our technology is, it is about doing it together in a partner ecosystem where everybody brings their best to the market.

I want to say, on a personal level, today is an incredibly proud moment for AMD. If you think about all of the innovation, everything that we bring to the market, to be part of AI at this time, at the beginning of this era, to work with these amazing people throughout the industry, throughout the ecosystem and at AMD, I can say that I've never seen something more exciting.

A very, very special thank you to all of our partners who joined us today, and thank you all for joining us.

