NVIDIA Corporation (NASDAQ:NVDA) GTC Financial Analyst Q&A Call March 19, 2024 11:30 AM ET

Company Participants

Jensen Huang - Founder and Chief Executive Officer

Colette Kress - Executive Vice President and Chief Financial Officer

Conference Call Participants

Ben Reitzes - Melius Research

Vivek Arya - Bank of America Merrill Lynch

Stacy Rasgon - Bernstein Research

Matt Ramsay - TD Cowen

Tim Arcuri - UBS

Brett Simpson - Arete Research

C.J. Muse - Cantor Fitzgerald

Joseph Moore - Morgan Stanley

Atif Malik - Citi

Pierre Ferragu - New Street Research

Aaron Rakers - Wells Fargo

Will Stein - Truist Securities

Jensen Huang

Good morning. Nice to see all of you. All right. What's the game plan?

Colette Kress

Okay. Well, we've got a full house and we're thanking you all for coming out for our first in-person event in such a long time. Jensen and I are here to go through any questions that you have, questions from yesterday.

And we're going to have a series of folks in the aisles, so you can just reach out to us: raise your hand, and we'll get to you with a mic. Jensen and I are here to answer any questions from yesterday.

We thought that would be a better plan for you. I know you have already asked quite a few questions, both last night and this morning, but rather than giving you a formal presentation, we're just going to go through a good Q&A today. Sound like a good plan?

I'm going to turn it to Jensen to see if he wants to add some opening remarks because we have just a quick introduction. We'll do it that way. Okay.

Jensen Huang

Yeah. Thank you. First, great to see all of you. There were so many things I wanted to say yesterday and probably have said -- and wanted to say better, but I got to tell you, I've never presented at a rock concert before. I don't know about you guys, but I've never presented in a rock concert before. The -- I had simulated what it was going to be like, but when I walked on stage, it still took my breath away. And so anyways, I did the best I could.

Next, after the tour, I'm going to do a better job, I'm sure. I just need a lot more practice. But there were a few things I wanted to tell you. Is there a clicker -- oh, look at that. See, this is like spatial computing. It's -- by the way, if you get -- I don't know if you'll get a chance, because it takes a little setup, but if you get a chance to see Omniverse in Vision Pro, it is insane. Completely incomprehensible how realistic it is.

All right. So we spoke about five things yesterday and I think the first one really deserves some explanation. I think the first one is, of course, this new industrial revolution. There were two -- there are two things that are happening, two transitions that are happening. The first is moving from general purpose computing to accelerated computing. If you just looked at the extraordinary trend of general-purpose computing, it has slowed down tremendously over the years.

And in fact, we've known that it's been slowing down for about a decade and people just didn't want to deal with it for a decade, but you really have to deal with it now. And you can see that people are extending the depreciation cycle of their data centers as a result. You could buy a whole new set of general purpose servers and it's not going to improve your throughput of your overall data center dramatically.

And so you might as well just continue to use what you have for a little longer. That trend is never going to reverse. General purpose computing has reached its end. We're going to continue to need it and there's a whole lot of software that runs on it, but it is very clear we should accelerate everything we can.

There are many different industries that have already been accelerated, some that are very large workloads that we really would like to accelerate more. But the benefits of accelerated computing are very, very clear.

One of the areas that I didn't spend time on yesterday that I really wanted to was data processing. NVIDIA has a suite of libraries that before you could do almost anything in a company, you have to process the data. You have to, of course, ingest the data, and the amount of data is extraordinary. Zettabytes of data being created around the world, just doubling every couple of years, even though computing is not doubling every couple of years.

So you know that data processing, you're on the wrong side of that curve already on data processing. If you don't move to accelerated computing, your data processing bills just keep on going up and up and up and up. And so for a lot of companies that recognize this, AstraZeneca, Visa, Amex, Mastercard, so many, so many companies that we work with, they've reduced their data processing expense by 95%, basically 20 times reduction.
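The equivalence between a 95% expense reduction and a roughly 20 times reduction can be checked with a small calculation. This is only an illustrative sketch; the function name `reduction_factor` is hypothetical and not something from the call.

```python
def reduction_factor(percent_saved: float) -> float:
    """Return how many times smaller the remaining cost is after saving percent_saved percent."""
    if not 0.0 <= percent_saved < 100.0:
        raise ValueError("percent_saved must be in [0, 100)")
    # Fraction of the original bill that is still being paid.
    remaining_fraction = 1.0 - percent_saved / 100.0
    return 1.0 / remaining_fraction


factor = reduction_factor(95.0)
print(f"A 95% saving leaves 5% of the bill, a {factor:.0f}x reduction")
```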

To the point, the acceleration is so extraordinary now with our suite of libraries called RAPIDS that the inventor of Spark, who started a great company called Databricks, and they are the cloud large-scale data processing company, they announced that they're going to take Databricks' Photon engine, which is their crown jewel, and they're going to accelerate that with NVIDIA GPUs.

Okay. So the benefit of acceleration, of course, pass along savings to your customers, but very importantly, so that you can continue to sustainably compute. Otherwise, you're on the wrong side of that curve. You'll never get on the right side of the curve. You have to accelerate. The question is today or tomorrow? Okay. So accelerated computing. We accelerated algorithms so quickly that the marginal cost of computing has declined so tremendously over the last decade that it enabled this new way of doing software called generative AI.

Generative AI, as you know, requires a lot of flops, a lot of flops, a lot of computation. It is not a normal amount of computation, an insane amount of computation. And yet it can now be done so cost effectively that consumers can use this incredible service called ChatGPT. So, it's something to consider that accelerated computing has driven down the marginal cost of computing so far that it enabled a new way of doing something else.

And this new way is software written by computers with a raw material called data. You apply energy to it. There's an instrument called GPU supercomputers. And what comes out of it are tokens that we enjoy. When you're interacting with ChatGPT, you're getting all -- it's producing tokens.

Now, that data center is not a normal data center. It's not a data center that you know of in the past. The reason for that is this. It's not shared by a whole lot of people. It's not doing a whole lot of different things. It's running one application 24/7. And its job is not just to save money, its job is to make money. It's a factory.

This is no different than an AC generator of the last industrial revolution. The raw material coming in was, of course, water. They applied energy to it and it turned into electricity. Now it's data that comes into it. It's refined using data processing, and then, of course, generative AI models.

And what comes out of it is valuable tokens. This idea that we would apply this basic method of software, token generation, what some people call inference, but token generation. This method of producing software, producing data, interacting with you, ChatGPT is interacting with you.

This method of working with you, collaborating with you, you extend this as far as you like, copilots to artificial intelligence agents, you extend the idea as long as you like, but it's basically the same idea. It's generating software, it's generating tokens and it's coming out of this thing called an AI generator that we call GPU supercomputers. Does that make sense?

And so the two ideas. One is the traditional data centers that we use today should be accelerated and they are. They're being modernized, lots and lots of it, and more and more industries one after another. And so what is a trillion dollars of data centers in the world will surely all be accelerated someday. The question is, how many years would it take to do? But because of the second dynamic, which is its incredible benefit in artificial intelligence, it's going to further accelerate that trend. Does that make sense?

However, the second data center, the second type of data center called AC generators or excuse me, AI generators or AI factories, as I've described it as, this is a brand new thing. It's a brand new type of software generating a brand new type of valuable resource and it's going to be created by companies, by industries, by countries, so on and so forth, a new industry.

I also spoke about our new platform. People are -- there are a lot of speculations about Blackwell. Blackwell is both a chip at the heart of the system, but it's really a platform. It's basically a computer system. What NVIDIA does for a living is not build the chip. We build an entire supercomputer, from the chip to the system to the interconnects, the NVLinks, the networking, but very importantly the software.

Could you imagine the mountain of electronics that are brought into your house, how are you going to program it? Without all of the libraries that were created over the years in order to make it effective, you've got a couple of billion dollars' worth of asset you just brought into your company.

And anytime it's not utilized is costing you money. And the expense is too incredible. And so our ability to help companies not just buy the chips, but to bring up the systems and put it to use and then working with them all the time to make it -- put it to better and better and better use, that is really important.

Okay. That's what NVIDIA does for a living. The platform we call Blackwell has all of these components associated with it that I showed you at the end of the presentation to give you a sense of the magnitude of what we've built. All of that, we then disassemble. This is the hard -- this is the part that's incredibly hard about what we do.

We build this vertically integrated thing, but we build it in a way that can be disassembled later and for you to buy it in parts, because maybe you want to connect it to x86. Maybe you want to connect it to a PCI-Express fabric. Maybe you want to connect it across a whole bunch of fiber, okay, optics.

Maybe you want to have very large NVLink domains. Maybe you want smaller NVLink domains. Maybe you can use Arm, maybe so on and so forth. Does it make sense? Maybe you would like to use Ethernet. Okay, Ethernet is not great for AI. It doesn't matter what anybody says.

You can't change the facts. And there's a reason for that. There's a reason why Ethernet is not great for AI. But you can make Ethernet great for AI. In the case of the Ethernet industry, it's called Ultra Ethernet. So in about three or four years, Ultra Ethernet is going to come, it'll be better for AI. But until then, it's not good for AI. It's a good network, but it's not good for AI. And so we've extended Ethernet, we've added something to it. We call it Spectrum-X that basically does adaptive routing. It does congestion control. It does noise isolation.

Remember, when you have chatty neighbors, it takes away from the network traffic. And AI, AI is not about the average throughput. AI is not about the average throughput of the network, which is what Ethernet is designed for, maximum average throughput. AI only cares about when did the last student turn in their partial product? It's the last person. A fundamentally different design point. If you're optimizing for highest average versus the worst student, you will come up with a different architecture. Does it make sense?

Okay. And because AI has all-reduce, all-to-all, all-gather, just look it up in the algorithm, the transformer algorithm, the mixture of experts algorithm, you'll see all of it. All these GPUs all have to communicate with each other and the last GPU to submit the answer holds everybody back. That's how it works. And so that's the reason why the networking has such a large impact.
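The point that the last GPU holds everybody back can be illustrated with a toy simulation. This is a minimal sketch under stated assumptions, not a model of any real network: the worker count, the uniform jitter, and the names `step_time` and `simulate` are all hypothetical. It only shows that a synchronous collective finishes at the slowest worker's time, so the step time tracks the maximum rather than the average.

```python
import random


def step_time(worker_times):
    """A synchronous collective (e.g. all-reduce) completes only when the slowest worker does."""
    return max(worker_times)


def simulate(num_workers=8, steps=1000, jitter=0.2, seed=0):
    """Return (average per-worker time, average step time) over many synchronized steps."""
    rng = random.Random(seed)
    total_mean = 0.0
    total_max = 0.0
    for _ in range(steps):
        # Assumed model: each worker takes 1.0 time unit plus random network jitter.
        times = [1.0 + rng.uniform(0.0, jitter) for _ in range(num_workers)]
        total_mean += sum(times) / num_workers
        total_max += step_time(times)
    return total_mean / steps, total_max / steps


avg_worker, avg_step = simulate()
print(f"average worker time: {avg_worker:.3f}, average step time: {avg_step:.3f}")
```

Under this assumed model, adding more workers pushes the step time toward the worst-case jitter even though the average worker time stays the same, which is the design point described above.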

Can you network everything together? Yes. But will you lose 10%, 20% of utilization? Yes. And what's 10% to 20% utilization if the computer is $10,000? Not much. But what's 10% to 20% utilization if the computer is $2 billion? It paid for the whole network, which is the reason why supercomputers are paid -- are built the way they are. Okay.
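The comparison above is simple arithmetic and can be sketched as follows. The assumption, labeled here and in the code, is that lost utilization is valued as a proportional share of the system's purchase price; the function name `idle_cost` is hypothetical.

```python
def idle_cost(system_cost_usd: float, utilization_loss: float) -> float:
    """Value of lost capacity, assuming it scales linearly with purchase price (an assumption)."""
    return system_cost_usd * utilization_loss


for cost in (10_000, 2_000_000_000):
    low = idle_cost(cost, 0.10)   # 10% utilization lost
    high = idle_cost(cost, 0.20)  # 20% utilization lost
    print(f"${cost:,} system: ${low:,.0f} to ${high:,.0f} of capacity lost")
```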

And so anyways, I showed examples of all these different components and our company creates a platform and all the software associated with it, all the necessary electronics, and then we work with companies and customers to integrate that into their data center, because maybe their security is different, maybe their thermal management is different, maybe their management plane is different, maybe they want to use it just for one dedicated AI, maybe they want to rent it out for a lot of people to do different AI with.

The use cases are so broad. And maybe they want to build an on-prem and they want to run VMware on it. And maybe somebody just wants to run Kubernetes, somebody wants to run Slurm. Well, I could list off all of the different varieties of environments and it is completely mind blowing.

And we took all of those considerations and over the course of quite a long time, we've now figured out how to serve literally everybody. As a result, we could build supercomputers at scale. But basically what NVIDIA does is build data centers. Okay. We break it up into small parts and we sell it as components. People think as a result, we're a chip company.

The third thing that we did was we talked about this new type of software called NIMs. These large language models are miracles. ChatGPT is a miracle. It's a miracle not just in what it's able to do, but in the team that put it together so that you can interact with ChatGPT at a very high response rate. That is a world class computer science organization. That is not a normal computer science organization.

The OpenAI team that's working on this stuff is world class, is a world class team, some of the best in the world. Well, in order for every company to be able to build their own AI, operate their own AI, deploy their own AI, run it across multiple clouds, somebody is going to have to go do that computer science for them. And so instead of doing this for every single model, for every single company, every single configuration, we decided to create the tools and tooling and the operations and we're going to package up large language models for the very first time.

And you could buy it. You could just come to our website, download it and you can run it. And the way we charge you is all of those models are free. But when you run it, when you deploy it in an enterprise, the cost of running it is $4,500 per GPU per year. Basically, the operating system of running that language model.

Okay. And so the per instance, the per-use cost is extremely low. It's very, very affordable. And -- but the benefit is really great. Okay. We call that NIMs, NVIDIA Inference Microservices. You take these NIMs and you're going to have NIMs of all kinds. You're going to have NIMs of computer vision. You're going to have NIMs of speech and speech recognition and text to speech and you're going to have facial animation. You're going to have robotic articulation. You're going to have all kinds of different types of NIMs.

These NIMs, the way that you would use it is you would download it from our website and you would fine tune it with your examples. You would give it examples. You say the way that you responded to that question isn't exactly right. It might be right in another company, but it's not right in ours. And so I'm going to give you some examples that are exactly the way we would like to have it. You show it your work products. This is the way -- this is what a good answer looks like. This is what right answer looks like, whole bunch of them.

And we have a system that helps you curate that process, tokenize that, all of the AI processing that goes along with it, all the data processing that goes along with it, fine tune that, evaluate that, guardrail that so that your AIs are very effective, number one, and also very narrow.

And the reason why you want it to be very narrow is because if you're a retail company, you would prefer your AI just didn't pontificate about some random stuff, okay. And so whatever the questions are, it guardrails it back to that lane. And so that guard railing system is another AI. So, we have all these different AIs that help you customize our NIMs and you could create all kinds of different NIMs.

And we gave you some frameworks for many of them. And one of the very important ones is understanding proprietary data, because every company has proprietary data. And so we created a microservice called Retriever. It's state-of-the-art and it helps you take your database, which is structured or unstructured images or graphs or charts or whatever it is and we help you embed them.

We help you extract the meaning out of that data. And then we take the -- it's called semantics, and that semantic is embedded in a vector. That vector is now indexed into a new database called a vector database, okay. And that vector database, then afterwards you can just talk to it. You say, hey, how many mammals do I have, for example. And it goes in there and says, hey, look at that. You got a cat, you have a dog, you have a giraffe.
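The embed, index, and query flow described here can be sketched in a few lines. This is a toy illustration only and not the Retriever microservice or its API: the bag-of-words `embed` function stands in for a learned embedding model, and `VectorIndex` and the sample records are hypothetical.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: word counts. A real system would use a learned embedding model."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


class VectorIndex:
    """Minimal in-memory stand-in for a vector database."""

    def __init__(self):
        self._items = []  # list of (original text, embedding vector)

    def add(self, text: str) -> None:
        self._items.append((text, embed(text)))

    def search(self, query: str, top_k: int = 1):
        q = embed(query)
        ranked = sorted(self._items, key=lambda item: cosine(q, item[1]), reverse=True)
        return [text for text, _ in ranked[:top_k]]


index = VectorIndex()
index.add("cat mammal in warehouse inventory")
index.add("giraffe mammal in warehouse inventory")
index.add("steel bolts in warehouse inventory")
results = index.search("which mammal do I have", top_k=2)
print(results)
```

A word-count embedding only matches shared words; the appeal of learned embeddings is that a question about "mammals" could retrieve a record that says only "giraffe".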

This is what you have in inventory, in your warehouse you have, okay, so on and so forth, all right. And so all of that is called NeMo and we have experts to help you. And then we put our -- we put a canonical NVIDIA infrastructure we call DGX Cloud in all of the world's clouds. And so we have DGX Cloud in AWS, we have DGX Cloud in Azure, we have DGX Cloud in GCP and OCI.

And so we work with the world's enterprise companies, particularly the enterprise IT companies, and we create these great AIs with them, but when they're done, they can run in DGX Cloud, which means we're effectively bringing customers to the world's clouds. A platform like us, a platform company, brings system makers customers, and CSPs are system makers.

They rent systems instead of sell systems, but they're system makers. And so we bring customers to our CSPs, which is a very sensible thing to do just as we brought customers to HP and Dell and IBM and Lenovo and so on and so forth and Supermicro and CoreWeave, so on and so forth, we bring customers to CSPs because a platform company does that. Does that make sense?

If you're a platform company, you create opportunities for everybody in your ecosystem. And so the DGX Cloud allows us to land all of these enterprise applications in the world's CSPs. And if they want to do it on-prem, we have great partnerships with Dell, which we announced yesterday, HP and others, so that you can land those NIMs in their systems.

And then I talked about the next wave of AI, which is really about industrial AI. This -- that the vast majority of the world's industries, the largest in dollars, are heavy industries and heavy industries have never really benefited from IT. They've not benefited from a lot of the design and all of the digital.

It's called not digitization, but digitalization, putting it to use. They've not benefited from digitalization, not like our industry. And because our industry is completely digitalized, our technology advance is insanely great. We don't call it chip discovery. We call it chip design. Why do they call it drug discovery, like, tomorrow could be different than yesterday? Because it is.

And it's so much -- it's so complicated -- it's so complicated biology, it's so changed -- and the longitudinal impact is so great, because, as you know, life evolves at a different rate than transistors. And so therefore, cause and effect is harder to monitor because it happens over a large scale, large scale of systems and large scale of time. These are very complicated problems. Physics is very similar.

Okay. Industrial physics is very similar. And so we finally have the ability using large language models, the same technologies. If we can tokenize proteins, if we could tokenize -- if we can tokenize words, tokenize speech, tokenize images, we can tokenize articulation. This is no different than speech, right?

We can tokenize proteins moving, that's no different than speech, okay. Just -- we can tokenize all these different things. We can tokenize physics then we can understand its meaning just like we've understood the meaning of words.

If we can understand its meaning and we can connect it to other modalities then we can do generative AI. So I just explained very quickly that 12 years ago I saw it, our company saw it with ImageNet. The big breakthrough was literally 12 years ago.

We said, interesting, but what are we actually looking at? Interesting, but what are you looking at? ChatGPT, I would say, everybody should say interesting, but what are we looking at? What are we looking at? We are looking at a computer software that can emulate you -- emulate us.

By reading our words, it's emulating the production of our words. Why -- if you can tokenize words and if you could tokenize articulation, for example, why can't it imitate us and generalize it in a way that ChatGPT has. So the ChatGPT moment for robotics has got to be around the corner. And so we want to enable people to be able to do that. And so we created this operating system that enables these AIs to be able to practice in a physically based world and we call it Omniverse.

Omniverse is not a tool. Omniverse is not even an engine. Omniverse is APIs, technology APIs that supercharge other people's tools. And so I'm super excited about the announcement with Dassault. They're connecting to the Omniverse API to supercharge 3DEXCITE. Microsoft has connected it to Power BI.

Rockwell has connected it to their tools for industrial automation. Siemens has connected it to theirs. So it's a bunch of APIs that are physically based, that produce images or articulation, and that connect a whole bunch of different environments. And so these APIs are intended to supercharge third party tools. And I'm super delighted to see the adoption across it, particularly in industrial automation. And so those are the five things that we did.

I'll do this next one very quickly. I'm sorry I took longer than I should, but let me do this next one really quickly. Look at that. All right. So this chart, don't over stare at it, but basically it communicates several things. On top are developers. NVIDIA is a market maker, not share taker. The reason for that is everything we do didn't exist when we started doing it. There is no such -- you just go up and down. In fact, even originally, 3D computer games didn't exist when we started working on it.

And so we had to go create the algorithms necessary. Real time ray tracing did not exist until we created it. And so all of these different capabilities did not exist until we created it. And once we created it, there are no applications for it. So we had to go cultivate and work with developers to integrate this technology we have just created so that applications could be benefited by it.

I just explained that for Omniverse. We invented Omniverse. We didn't take anything from anybody, didn't exist. And in order for it to be useful, we now have to have developers, Dassault, Ansys, Cadence, so on and so forth. Does that make sense? Rockwell, Siemens.

We need the developers to take advantage of our APIs, our technologies. Sometimes they're in the form of an SDK. In the case of Omniverse, I'm super proud that it's in the form of cloud APIs, because now it's so easy to use that you could use it in both ways, but APIs are much, much easier to use, okay. And we host Omniverse in the Azure cloud. And notice whenever we connect it to a customer, we create an opportunity for Azure.

So Azure is on the foundation, they're the system provider. Back in the old days, system providers used to be OEMs and they continue to be, but system providers are on the bottom, developers on top. We invent technology in the middle. The technology that we invent happens to be chip last.

It's software first. And the reason for that is without a developer, there will be no demand for chips. And so NVIDIA is an algorithm company first and we create these SDKs. They call them DSLs, domain specific libraries. SQL is a domain specific library. You might have heard of Hadoop; it is a domain specific library in storage computing.

NVIDIA's cuDNN is potentially the most successful domain specific library short of SQL the world has ever seen. cuDNN is the domain specific library. It's the computation engine library for deep neural networks. Without cuDNN, none of them would have been able to use CUDA. So cuDNN was invented.

Real-time ray tracing, OptiX, which led to RTX. Makes sense? And we have hundreds of domain specific libraries. Omniverse is a domain specific library. And these domain specific libraries are integrated with developers on the software side, which then when the applications are created and there's demand for that application, creates opportunities for the foundation below. We are market makers, not share takers. Does that make sense?

And so what's the takeaway? The takeaway is you can't create markets without software. It has always been the case. That has never changed. You could build chips to make software run better, but you can't create a new market without software. What makes NVIDIA unique is that we're the only chip company I believe that can go create its own market and notice all the markets we're creating.

That's why we're always talking about the future. These are things that we're working on. We really -- nothing would give me more joy to work with the entire industry to create the computer aided drug design industry, not drug discovery industry, drug design industry.

We have to do drug design the way we do chip design, not chip discovery. And so I expect every single chip next year to be better than the one before, not as if I'm looking for truffles, which is discovery. Some days are good, some days are less good.

Okay, all right. So we have developers on top. We have our foundation on the bottom. The developers want something very, very simple. They want to make sure that your technology is performing, but they have to solve the problem, that they couldn't solve any other way. But the most important thing for a developer is installed base. And the reason for that is they don't sell hardware, their software doesn't get used if nobody has the hardware to run it.

Okay. So what developers want is installed base. That has not changed since the beginning of time, and it has not changed now. Artificial intelligence, if you develop artificial intelligence software and you want to deploy so that people could use it, you need installed base.

Second, the systems companies, the foundation companies, they want killer apps. That's the reason why the term "killer app" exists, because where there is a killer app, there is customer demand, and where there is customer demand, you can sell hardware.

And so, it turns out this loop is insanely hard to kick-start. And how many accelerated computing platforms can you really, really build? Can you have an accelerated computing platform for generative AI as well as industrial robotics, as well as quantum as well as 6G as well as weather prediction as well.

And you can have all these different versions because some of it is good at fluids. Some of it's good at particle. Some of it is good at biology. Some of it is good at robotics. Some of it is good at AI. Some of it is good at SQL. The answer is no. You need a general -- sufficiently general purpose accelerated computing platform. Just as the last computing platform was insanely successful because they ran everything.

Now, for NVIDIA, it's taken us a long time, but we basically run everything. If your software is accelerated, I am very certain it runs on NVIDIA. Does that make sense? Okay. If you have accelerated software, I am very, very certain it runs on NVIDIA. And the reason for that is because it probably ran on NVIDIA first.

Okay. All right. So this is the NVIDIA architecture. Whenever I give keynotes, I tend to touch on all of them, different pieces of it, and some new things that we did in the middle, in this case, Blackwell. There was so much good stuff, and you really have to go to our talks; it looks like 1,000 talks. 6G research: how is 6G going to happen? Of course, AI. And what do you use the AI for? Robotic MIMO.

Why is MIMO so pre-installed, meaning, why does the algorithm come before the site? We should have site-specific MIMO, just like robotic MIMO. And so, reinforcement learning that deals with the environment, and so 6G of course is going to be software-defined, of course it's going to be AI.

Quantum Computing, of course, we should be a great partner for the quantum computing industry. How else are you going to drive a Quantum Computer? To have the world's fastest computer sitting next to it.

And how are you going to simulate a Quantum Computer, emulate the Quantum Computer? What is the programming model for a Quantum Computer? You can't just program a Quantum Computer all by itself. You need to have classical computing sitting next to it. And so quantum would be kind of a quantum accelerator.

And so that -- who should go do that, well we've done that and so we work with all the industry on that. So across the board, some really, really great stuff. I wish I could have covered, we could have a whole keynote just on all that stuff. But we cover the whole gamut. Okay. So that was kind of yesterday. Thank you for that.

Question-and-Answer Session

Colette Kress

Okay. We have them going around and we'll see if we can grab your questions.

Jensen Huang

That was the question that I'm sure the first question goes to. If you could have done the keynote in 10 minutes, why didn't you just do yesterday's in 10 minutes? Good question.

Ben Reitzes

Yeah. Hi, Jensen.

Jensen Huang

Hi.

Ben Reitzes

Ben Reitzes with Melius Research. Nice to see you. Thanks for being here.

Jensen Huang

Thank you, Ben.

Ben Reitzes

It's a big thrill, I think, for all of us. So I wanted to ask you a little bit more about your vision with software. You are creating industries. You have a full-stack approach. It's clear your software makes your chips run better. Do you feel that your software business over the long term could be as big as your chip business? How do you look at -- if we look out 10 years, and you're not a chip company, what do you think you look like, given what you're seeing with the momentum in software and how you're building these industries? It would seem like you're going to be a lot more.

Jensen Huang

Yeah. Thank you, Ben. I appreciate that. First of all, I appreciate all of you coming. This is a very, very different type of event as you know. Most of the talks are software talks, and they're all computer scientists, and they're talking about algorithms.

What NVIDIA -- the NVIDIA software stack is about two things. It's either algorithms that help the computer run better, like TensorRT-LLM. It's an insanely complicated algorithm, and it explores the computing space in a way that most compilers never have to do. And TensorRT-LLM can't even be built without a supercomputer. And it's very likely that TensorRT in the future, TensorRT-LLM in the future, will actually just have to run on a supercomputer all the time in order to optimize AIs for everybody's computer. And so that optimization problem is very, very complicated. So that would be an example of software that we create, the optimization, the runtime.

The second software we create is whenever there's an algorithm where the principled algorithm is well known. For example, Navier-Stokes or Schrödinger's equation. However, maybe the expression of it in a supercomputing or accelerated computing or real-time way -- ray tracing is a great example -- has never been discovered. Does that make sense? Okay. And so, as you know, Navier-Stokes is an insanely complicated algorithm. And to be able to refactor that in a way that can run in real time is insanely complicated as well and requires a lot of invention. Some of our computer scientists in our company have Oscars. They're award-winning computer scientists because they've solved these problems at such a large scale that you use it for movies. And their inventions, their algorithms, their data structures are computer science in itself. Okay. And so we'll dedicate ourselves to these two layers.

And then when you package it -- back in the old days, that was useful for entertainment, media entertainment, science, so on and so forth. But today, because AI has brought this technology so close to application, simulating molecules used to be a thing that you do in universities. Now you can do that at work. So as we now reformulate all of these algorithms for the consumption of enterprise, it becomes enterprise software. Enterprise software like nobody's ever seen before. We're going to put them in NIMs, these packages. We'll have hundreds of them, and we'll manufacture these things and support them and maintain them and keep them performant and so on, to support customers with it. And so we'll produce NIMs at a very large scale, is my guess. And the entire bucket of software underneath, we call NVIDIA AI Enterprise. A NIM is basically an AI in a microservice for enterprise.

And so my expectation is that this is going to be a very large business, and this is part of the industrial revolution. If you saw that, there's the IT industry today, SAP and great companies, ServiceNow and Adobe and Autodesk and Cadence, that layer, that's today's IT industry. That's not where we're going to play. We're going to play on the layer above. That layer above is a bunch of AIs and these algorithms, and really, we are the right company to go build them. And so we'll build some with them, we'll build some ourselves, but we'll package them up and deploy it at enterprise scale. Okay. And so I appreciate you asking the question. And while she's walking there. Go ahead. Yeah.

Vivek Arya

Hi. Vivek Arya from Bank of America Securities. Thank you, Jensen. Thank you, Colette, for the presentation. So Jensen, my question is perhaps a little more near to medium term, which is just the size of the addressable market, because your revenues have gotten big so quickly. When I look at how much they represent as a percentage of the spending of some of your large customers, they are like 30%, 40%, 50%, sometimes more, but when I look at how much money they are generating from generative AI, it is like less than 10% of their sales. So, how long can this gap persist? Right. And then more importantly, are we kind of midway through how much of their spending can be spent on your products? I think in the past you have given us kind of a trillion-dollar market, going to $2 trillion. If you could just educate us on how large the market is? And where are we in that adoption curve, based on how much it's being monetized in the near-to-medium term?

Jensen Huang

Okay. I'm going to first give you the super-condensed version, and I'll come back and work it out. Okay. So how big is the market? How big we can be has to do with the size of the market and what we sell. Remember, what we sell is a data center. I just broke it into parts. But in the end, I sold the data center. Notice the last image you saw at the keynote: it's a reminder of what we actually sell. We showed a bunch of chips. But remember, we don't really sell that. The chips don't work all by themselves. You can buy the chips, but they don't work. You need to build them into our system. And most importantly, the system software and the ecosystem stack is really complicated. And so NVIDIA builds entire data centers for AI. And we just break it up into parts so that it fits into your company. So that's number one: what do we sell? And what is the opportunity? The opportunity for the world today, the data center size, is $1 trillion. Right. It's $1 trillion worth of installed base, $250 billion a year. We sell an entire data center in parts, and so our percentage of that $250 billion per year is likely a lot, lot, lot higher than somebody who sells a chip. It could be a GPU chip or CPU chip or networking chip. That opportunity hasn't changed from before. But what NVIDIA makes is an accelerated computing platform at data center scale. Okay. And so our percentage of $250 billion will likely be higher than in the past. Now, second question. How sustainable is it? There are two answers for that. One reason that you buy NVIDIA is for AI. If you just build TPUs, if your GPU is only used for one application, then you have to hang your hat on 100% of that. What can you monetize of AI today? Token generation returns.
However, if your value proposition is not just AI token generation but also training the AI model and, very importantly, reducing the cost of computing -- accelerated computing, sustainable computing, energy-efficient computing -- that's what NVIDIA does for a living at its core. It's just that we did it so well that generative AI was created. Okay. And now people forgot that. It's a little bit like how our first application was computer graphics, and the first application was games. We did that so well, we did it so passionately, that people forgot we are an accelerated computing company. They thought, hey, you're a gaming company, and a whole generation of young people grew up. They learned on RIVA 128 and they went to college with GeForce, and then when they finally became adults, they thought you were a gaming company. And so we do accelerated computing so well, we do AI so well, people think that that's all we do. But accelerated computing is a trillion -- it's $250 billion a year. $250 billion a year should go to accelerated computing with or without AI, just for the sake of sustainable computing, just to process SQL, which is, as you guys know, one of the largest consumers of computing in the world. Okay. So I would say $250 billion a year should go to accelerated computing no matter what. And then on top of that there is generative AI. How sustainable do I think generative AI is going to be? You know how I feel about it. I think we're going to be generating words, images, videos, proteins, chemicals, kinetic action, manipulation. We're going to be generating forecasts. We're going to be generating build plans. We're going to be generating bills of materials. The list goes on.

Stacy Rasgon

Hi, Jensen, Colette. Thanks. It's Stacy Rasgon, Bernstein Research. I wanted to ask about the interplay between CPUs and GPUs. Most of the benchmarks, if not all of them, that you showed yesterday were really around the Grace Blackwell system that had, I guess, two GPUs and one CPU, which sort of doubled the CPU per GPU ratio versus Grace Hopper. You didn't talk a lot about benchmarks relative to the standalone GPUs. Is this a shift? Are you guys looking for much more CPU content, I guess, in these AI servers going forward? And then how do I think about the interplay between the ARM CPUs that you're developing and x86? It seems like you're putting a little less emphasis on the x86 side of things going forward.

Jensen Huang

Yeah, Stacy. Appreciate the question. There is actually zero concern about either one of them. I think x86 and ARM are both perfectly fine for data centers. There's a reason why Grace is built the way it is. The benefit of ARM is that we could mold the NVIDIA system architecture around the CPU, so that we can create this thing called chip-to-chip, the NVLink that connects the GPU and the CPU. We can make the two sides coherent, meaning when the CPU touches a register, it invalidates the same register on the GPU side. As a result, the two sides can work together on one variable coherently. You can't do that today between x86 and peripherals, and so we were able to solve some problems that we couldn't solve otherwise. And as a result, Grace Hopper is insanely great for CAE applications, which are multi-physics. Some of it is running on CPUs, some of it is on GPUs. It's insanely great for different combinations of CPUs and GPUs, so that we can have very large memories associated with each GPU, maybe one GPU or two GPUs, coherently. And so we can solve some of these problems. Data processing, for example, is insanely great on Grace Hopper. Okay. And so it's just harder to solve, not because of the CPU itself but because we couldn't adapt the system. Second, I will say that there was one chart where I showed Hopper versus Blackwell on x86 systems, B100 and B200, and then also GB200, which is Grace Blackwell. The benefit of Blackwell in that case wasn't because the CPU's better. It's because in the case of Grace Blackwell we were able to create a larger NVLink domain. And that larger NVLink domain is really, really important for the next generation of AI, for the next three to five years, which is as far as we can see right now. If you really want good inference performance, you're going to need NVLink. That was the message I was trying to deliver. And we're going to talk more about this.
It's abundantly clear now that these large language models are never going to fit on one GPU. Okay. That's not the point, anyways. And in order for you to be sufficiently responsive and have high throughput to keep the cost down, you need a lot more GPUs than what it would even fit in. And in order to have a lot of GPUs working together without the overhead, the IO overhead, getting in the way, you need NVLink. Everyone always thought NVLink's benefit is in training. NVLink's benefit in inference is off the charts. That's the difference between 5X and 30X. That was another 6X. It's all NVLink. NVLink and the new Tensor Core, excuse me. Yeah, okay. And so Grace gives us the ability to architect a system exactly as we needed, and it's harder to do that with x86. That's all. But we support both. We'll have two versions of both. And in the case of B100, it just slides into where H100 and H200 go. And so the transition from Hopper to Blackwell is instantaneous. The moment it's available you just slide it in, and then you can figure out what to do about the next data center. Okay. So we get the benefit of extremely excellent performance at the limit of the architecture as well as an easy-peasy transition.

Stacy Rasgon

Thank you.

Matt Ramsay

Hey there. It's Matt Ramsay from TD Cowen. Hey, Jensen, Colette. Good morning, and thank you for doing this. Jensen, I wanted you to comment on a couple of topics that I've been noodling on. One of which is NIMs, which you guys talked about yesterday. It seems like a vertical-specific accelerant for people to get into AIE and onboard customers more quickly. I wonder if you could just give us an overview of how your company is going at broader enterprise and just what different vehicles there are for people to onboard into AI? The second topic is on power. My team has been spending a good bit of time on power. I'm trying to decide if I should spend more time there or less. Some of the systems you introduced yesterday are up to 100 kilowatts or more. I know that scale of computing couldn't be done without the integration that you guys are doing, but we are also getting questions on power generation at the macro level and power delivery to the cabinet at that density. I just would love to hear your thoughts about how your company is working with the industry to power these systems. Thanks.

Jensen Huang

Okay. I'll start with the second first. Power delivery: 100 kilowatts, as you know, is a lot for a computer, but 100 kilowatts is a commodity. You guys know that, right? The world needs a lot more than 120 kilowatts. And so the absolute amount of power is not an issue. The delivery of the power is not an issue. The physics of delivering the power is not an issue. And cooling 120 kilowatts is not an issue. We can all agree on that. Okay. And so none of this is a physics problem. None of this requires invention. All of it requires supply chain planning. Makes sense? So that's the way. And how big of a deal is supply chain planning? A lot. I mean, we take it very seriously. And so we think about supply chain planning all the time, and that's the reason why we have great partnerships. If you look at Vertiv, I think on their front page is a paper that we wrote together, with Vertiv and NVIDIA engineers working on cooling systems. Okay. And so Vertiv is very important in the supply chain of designing liquid-cooled and other data centers. We have great partnerships with Siemens. We have great partnerships with Rockwell and Schneider, for all the same reasons. This is exactly the same as having great partnerships with TSMC and Samsung and SPIL and Wistron and so on and so forth. And so our company's supply chain relationships are quite broad and quite deep. And the fact that we build our own data centers really helps with that. We've been building supercomputers now for quite some time. This is not our first time. Our first supercomputer was DGX-1 in 2016, which kind of puts it in perspective. And we've built one every year, and this year we're building several. And so the fact that we're building them gives us a tactile sense of who we're working with and who is the best, and that's one of the reasons we do it. NIMs. There are two ways to onboard into enterprise.
There is the most impactful way, and then there's the other way. Okay. They're both important. I'll start with the other. The other way is that we are going to create these NIMs. We are going to put them on our website. And we're going to go through GSIs and a lot of solution providers, and they're going to help companies turn these NIMs into applications. And that's going to be a whole thing, okay. And so that go-to-market includes large GSIs and smaller specialized GSIs and so on and so forth, okay. We have lots of partnerships in that area. The other area, which I think is really quite exciting, and I think this is really where the big action is going to happen, is the trillion dollars of enterprise companies in the world. They create tools today. In the future they're going to offer you tools plus copilots. Remember, the single most pervasive tool in the world is Office. And now there are copilots for Office. There are other tools that are super important to NVIDIA: Synopsys, Cadence, Ansys. We would like to have copilots for all of them. Notice we are building copilots for our own tools. We call them ChipNeMo. And ChipNeMo is super smart. ChipNeMo now understands NVIDIA lingo, NVIDIA chip talk, and it knows how to program NVIDIA programs. And so for every engineer that we hire, the first thing we're going to tell them is, here's ChipNeMo, and then there's the bathroom, and then there's the cafeteria, in that order. And so they will be productive right away. While they're at lunch, ChipNeMo could be doing some stuff. And so that just gives you an example. But we have copilots being built on top of our own tools all over the place. Most companies probably can't do this, and we can teach the GSIs to do this, but in the area of these tools, Cadence and others, they're going to build their own copilots. And they will rent them out, hire them out, as engineers. I think they're sitting on a goldmine.
SAP is going to do that. ServiceNow is going to do that, and they're very specialized copilots. They understand languages like, in the case of SAP, ABAP -- is that right? -- which is a language that only an SAP lover would love. And as you know, ABAP is a very important language for the world's ERP systems. Every company runs on it. We use ABAP. And so now they have to go create a Chat ABAP, just like ChipNeMo or the ChatUSD that we created for Omniverse. And so Siemens will do that, Rockwell will do that, so on and so forth. Does that make sense? And that, I think, is another way you get to enterprise. ServiceNow is going to do that. Lots and lots of copilots, they're building. And that's how they can create another industry on top of their current industry. It's almost like an AI workforce industry. Yeah. I am super excited about the partnerships we have with all of them. I'm just so excited for them. Every time I see them, I tell them, you're sitting on a goldmine, you're sitting on a goldmine. I mean, I'm so excited for them.

Tim Arcuri

Jensen, hi. It's Tim Arcuri at UBS. I had a question also about the TAM and it's more greenfield versus brownfield, because up until now H100 was pretty much all greenfield. So people weren't taking A100s and ripping them out and replacing them with H100s, could B100 be the first time, where you see some brownfield upgrades where we go in and we rip out A100s and we replace them with B100s? So that may be the TAM, if the $1 trillion goes to $2 trillion, you have a four-year replacement cycle. You're talking about $500 billion, but much of that growth comes from upgrading the existing installed base. Wondering if you can comment on that.
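The arithmetic behind the figures in this question can be sketched as follows. This is a minimal illustration, assuming annual spend is simply the installed base divided by the replacement cycle; the function name is hypothetical, and the four-year cycle is the questioner's own assumption rather than a company figure.

```python
def annual_replacement_spend(installed_base_usd: float, cycle_years: float) -> float:
    """Annual spend implied by refreshing an installed base once per cycle."""
    if cycle_years <= 0:
        raise ValueError("cycle_years must be positive")
    return installed_base_usd / cycle_years


TRILLION = 1e12
BILLION = 1e9

# $1 trillion installed base on a four-year cycle (figures quoted in the call).
today = annual_replacement_spend(1 * TRILLION, 4)
# The hypothetical $2 trillion installed base raised in the question.
doubled = annual_replacement_spend(2 * TRILLION, 4)

print(f"${today / BILLION:.0f} billion per year")
print(f"${doubled / BILLION:.0f} billion per year")
```

Under these assumptions the two cases work out to $250 billion and $500 billion a year, which matches the numbers used in the question and in the earlier answer on market size.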

Jensen Huang

Yeah, really good question. Today, we are upgrading the slowest computers in the data center, which would be the CPUs. And so that's what should happen. And then eventually you'll get around to the Amperes, and then you'll get around to the Hoppers. I do believe that in five, six, seven, eight years -- pick a year out there, I'm not picking one -- I'm just saying in the outer years, you're going to start seeing replacement cycles, obviously, of our own infrastructure. Yeah, but I wouldn't think that that's the best utilization of capital at the moment. Amperes are super productive, as you know.

Brett Simpson

Yeah. Hi, Jensen. It's Brett Simpson here at Arete Research, and thanks for hosting a great event this last couple of days. My question was on Inference. I wanted to get your perspective on -- you put up some good performance numbers with the B100 in terms of how Inference compares with H100. How -- what's the message you're giving to customers on cost of ownership around this new platform? And how do you think it's going to compare with ASICs or other Inference platforms in the industry? Thank you.

Jensen Huang

I think for language models, large language models, Blackwell with the new transformer engine and NVLink is going to be very, very, very hard to overcome. And the reason for that is the dimensionality of the problem is so large. And there's TensorRT-LLM, this exploration tool, this optimization compiler that I talked about. The architecture underneath the Tensor Cores is programmable. NVLink allows you to connect up a whole bunch of GPUs working in tandem with very, very low overhead, basically no overhead. Okay. And so as a result, 64 GPUs is the same as one, programmatically. It is incredible. And so you have 64 GPUs without overhead. Without NVLink, if you have to go over a network like Ethernet, it's over. You can't do it. You just wasted everything. And because they all have to communicate with each other, it's called all-to-all. Whenever all have to communicate with each other, the slowest link is the bottleneck, right? It's no different than having a city on one side of the river and a city on the other side of the river. That bridge, that's it. That defines the throughput. Okay. And that bridge will be Ethernet. On one side is NVLink, on the other side is NVLink, and Ethernet in the middle makes no sense. So we have to turn that into NVLink. And now we have all of the GPUs working together, generating tokens one at a time. Remember, it's not as if you splat out the tokens all at once, because the transformer has to generate the tokens one at a time, in sequence. And so this is a very complicated parallel computing problem, okay. And so I think Blackwell has raised the bar a lot. Just mountains. Utterly mountains, ASIC or otherwise.
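The "slowest link is the bottleneck" point in this answer can be illustrated with a toy model. This is only a sketch under simplifying assumptions: each GPU sends the same payload to each peer, and the step finishes when the slowest link finishes. The function name and both bandwidth figures are hypothetical placeholders, not measured NVLink or Ethernet numbers.

```python
def all_to_all_step_time(bytes_per_peer: float, link_bandwidths: list[float]) -> float:
    """Time for one all-to-all exchange: everyone waits on the slowest link."""
    if not link_bandwidths or min(link_bandwidths) <= 0:
        raise ValueError("need at least one link with positive bandwidth")
    return bytes_per_peer / min(link_bandwidths)


FAST = 900e9  # bytes/s, placeholder for a fast GPU-to-GPU link
SLOW = 50e9   # bytes/s, placeholder for a slower network hop (the "bridge")

payload = 1e9  # bytes each GPU sends to each peer, illustrative only

all_fast = all_to_all_step_time(payload, [FAST] * 8)
one_slow = all_to_all_step_time(payload, [FAST] * 7 + [SLOW])

print(f"all fast links: {all_fast * 1e3:.2f} ms")
print(f"one slow link:  {one_slow * 1e3:.2f} ms")
print(f"slowdown from a single slow link: {one_slow / all_fast:.0f}x")
```

In this model, replacing just one of eight links with the slower one slows the whole exchange by the full ratio of the two bandwidths, which is the bridge analogy in numbers.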

C.J. Muse

Hello, Jensen and Colette. C.J. Muse with Cantor. Thank you for hosting this. And it's great to see you both. Question on your pricing strategy. Historically, you talked about the more you buy, the more you save. But it sounds like initial pricing on Blackwell is coming in at perhaps maybe a lower premium than the productivity that you're offering. So, curious, as you think about maybe razor, razor blade and selling software and the full system, how that might cause you to kind of evolve your pricing strategy and how we should think about kind of normalized margins within that construct? Thank you.

Jensen Huang

The pricing that we create always starts from TCO. I appreciate that comment, C.J. We always come from TCO. However, we also want to set the TCO for the main body of customers. When you only have one particular domain of customers, let's say molecular dynamics, if it's only one application, then you set the TCO based on that one application. It could be a medical imaging system. And all of a sudden, the TCO is really very, very high, but the market size is quite small. In every single generation that goes by, our market size is growing, isn't that right? And we want the entire market to be able to afford Blackwell. And so in a way, it's kind of a self-carrying problem. As we solve for the TCO for a much larger market, some customers would get too much value, if you will. But that's okay. You're making the business simpler, having one basic product, and you're able to support a very, very large market. Now, over time, if the market were to bifurcate, then we can always segment, but we're nowhere near that today. And so I think we have the opportunity to create a product that delivers extraordinary value for many and extremely good value for all. And that's our purpose.

Joseph Moore

Hi. Joe Moore from Morgan Stanley. It seems like the most impressive specs that you showed were around GB200, which you just described as a function of having that bigger NVLink domain. Can you contrast what you're doing with GB200 with what you did with GH200? And why you think it could be a much bigger product this time around?

Jensen Huang

Oh, great question. The simple answer is that before GH200, Grace Hopper, could really take off significantly, Grace Blackwell is already here. And Grace Hopper had an additional burden that Hopper didn't have. Hopper fit right into where Ampere left off. A100s went to H100s, they're going to go to B100s, so on and so forth. And so that particular chassis or that particular use case is fairly well established, and we'll just keep on moving. Software is built for it. People know how to operate it, so on and so forth. Grace Hopper is a little different, and it addressed a new class of applications that we didn't address very well before. And I was mentioning some of it earlier: multi-physics problems with a CPU and GPU having to work closely together, very large data sets, so on and so forth. Problems that are difficult to parallelize, for example. Those kinds of problems Grace Hopper was really good for. And so we started developing software for that. My recommendation for most customers is, at this point, just gear for Grace Blackwell, and I have given them that recommendation. And so everything that they do with Grace Hopper will be completely architecturally compatible. That's the wonderful thing. And so whatever they have, whatever they buy, is still fantastic, but I would recommend that they put all their energy into Grace Blackwell because it's so much better.

Unidentified Analyst

Jensen, Colette, thanks for having us here today. I want to ask a question on robotics. It seems like every time we come back to GTC, you sneak something in at the end. And in a couple of years, we go, wow, he has been talking about that for a while. I heard this week you guys mentioned that robotics may be getting close to its ChatGPT moment. Can you describe what that means and where you start to see that robotics evolution in our day-to-day lives? That would be super helpful. Thank you.

Jensen Huang

Okay, several things. First of all, I appreciate that. I showed Earth-2 two years ago. And two years later, we have this new algorithm that is able to do regional weather prediction at 3 kilometers. The supercomputer you need to do that is 25 times larger, excuse me, 25,000 times larger than the one that you currently use to do weather simulations at NOAA and in Europe and so on and so forth. 3-kilometer resolution is very high resolution, if you will, right above your head, okay. And weather simulation also requires a whole lot of what are called ensembles, because the world looks chaotic and you want to simulate a lot of distributions, sample a lot of different parameters, a lot of different perturbations, and try to figure out what that distribution is, and the middle of that distribution is likely going to be the weather pattern. Well, if it takes that much energy just to do it one time, they're not going to do it more than one time. But in order to predict where weather is going to be a week from now, especially extreme weather that can change so dramatically, you're going to need a lot of what they call members, a lot of ensemble members, a lot of samplings. And so we're basically doing weather simulation 10,000 times, okay. And that's because we trained an AI to understand physics. It has to be physically possible and it can't hallucinate, so it has to understand the laws of physics and such. And so two years ago I showed it, and today we connected it to the most trusted source of weather in the world, The Weather Company. And so we're going to help people do regional weather all over the world. If you're a shipping company, you need to know weather conditions. If you're an insurance company, you need to know weather conditions. If you're in the Southeast Asia region, you have so many hurricanes and typhoons and things like that, you need some of this technology. And so we're going to help people adapt it for their region and their use case.
Well, I did that a couple of years ago. The ChatGPT moment kind of works like this. Take a step back and ask yourself, what happened with ChatGPT? The technology is insanely great, okay. It's really incredible. But there are several things that happened. One, it learned from a whole lot of human examples. We wrote the words, right? It was our words. So it learned from our human examples and it generalized. So it's not repeating back the words. It can understand the context and it can generate an original form. It understood the context, meaning that it adapted itself, okay, or it adapted to the current circumstance, the context. And then the third thing is, it could now generate original tokens. Now I'm going to take everything back into tokens. Forget words, just tokens now. Use all the same words that I just used, but replace words with tokens. If I could just figure out how to communicate with this computer what this token means, okay, if I can just tokenize this. Just as when you do speech recognition, you tokenize my sound, my voice. Just as when we reconstructed proteins, we tokenized the amino acids. You can tokenize almost everything. You can digitize a simple way of representing each chunk of the data, okay. So once you can tokenize it, then you can learn it. We call it learning the embeddings of it, the meanings of it. And so if I can tokenize motion, okay, the world, and I can generalize, and I can tokenize articulation, kinematics, and I can learn and generalize it and then generate, I just did the ChatGPT moment. How is it any different? The computer doesn't know. Now, of course, the problem space is a lot more complicated because it's physical things. So you need this thing called alignment. And what was the great invention of ChatGPT? Reinforcement learning human feedback, alignment. Is that right? So it would try something. You say, no, that's not as good as this. It would try something else. You say, no, that's not as good as this.
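The ensemble idea described above can be sketched with a toy chaotic system. This is an illustration only, assuming the logistic map as a stand-in for a weather model (it is not a weather model, and it is unrelated to Earth-2); the function names, the perturbation size, and the member count are all hypothetical choices.

```python
import random
import statistics


def toy_model(x0: float, steps: int, r: float = 3.9) -> float:
    """Iterate the logistic map, a simple chaotic system standing in for a forecast."""
    x = x0
    for _ in range(steps):
        x = r * x * (1.0 - x)
    return x


def run_ensemble(x0: float, members: int, steps: int,
                 noise: float = 1e-4, seed: int = 0) -> list[float]:
    """Run many members, each starting from a slightly perturbed initial state."""
    rng = random.Random(seed)
    outcomes = []
    for _ in range(members):
        # Perturb the initial condition and keep it inside the valid range [0, 1].
        perturbed = min(max(x0 + rng.uniform(-noise, noise), 0.0), 1.0)
        outcomes.append(toy_model(perturbed, steps))
    return outcomes


outcomes = run_ensemble(x0=0.4, members=1000, steps=50)
print("ensemble mean:  ", statistics.mean(outcomes))
print("ensemble spread:", statistics.pstdev(outcomes))
```

Because the system is chaotic, tiny differences in the starting state grow into a wide spread of outcomes, so a single run says little by itself; the distribution across members is what carries the forecast. That is also why the cost per run matters so much when thousands of members are needed.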
Human feedback, reinforcement learning and it keeps -- it takes that reinforcement and improves itself. And so what is Omniverse for? Well, if it's in a robot, then how would you do feedback? And what is feedback about? It's physical feedback, physics feedback. It generalized -- it generated a movement to go pick up a cup, but it tipped a cup over. It needs the reinforcement learning to know when to stop. Does that make sense? And so that feedback system is not human. That feedback system is physics. And that physics simulation feedback is called Omniverse. So Omniverse is reinforcement learning, physical feedback, which grounds the AI to the physical world, just as reinforcement learning human feedback grounds the AI to human values. Are you guys following me? I just described two completely different domains using exactly the same concepts. And so what I've done is I've generalized general AI. And by generalizing it, I can reapply it somewhere else. And so we made this observation some time ago and we started preparing for this. And now you're going to find that Isaac Sim, which is a gym on top of Omniverse is going to be super, super successful for just about anybody who is doing these robotic systems. We've created the operating system for robots. I'm sure there's a corporate answer for all the questions you guys ask, but unfortunately, I only know how to answer the one geek way.
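The feedback loop described in this answer, where a simulator rather than a human says whether an attempt was better or worse, can be sketched in a few lines. This is a toy under loud assumptions: `simulate_grasp` is a hypothetical stand-in for a physics simulator, not an Omniverse or Isaac Sim API, and the update rule is plain random-search hill climbing, swapped in for real reinforcement learning to keep the example short.

```python
import random


def simulate_grasp(force: float) -> float:
    """Hypothetical physics feedback: score a grasp attempt (higher is better).

    Too little force fails to lift the cup and too much tips it over,
    so the score peaks at an assumed ideal force.
    """
    ideal_force = 2.0  # arbitrary units, assumed for illustration
    return -abs(force - ideal_force)


def improve_with_feedback(initial_force: float, rounds: int = 200,
                          step: float = 0.5, seed: int = 0) -> float:
    """Try variations and keep only the ones the simulator scores as better."""
    rng = random.Random(seed)
    best = initial_force
    best_score = simulate_grasp(best)
    for _ in range(rounds):
        candidate = best + rng.uniform(-step, step)
        score = simulate_grasp(candidate)
        if score > best_score:  # feedback: this attempt beat the previous best
            best, best_score = candidate, score
    return best


learned = improve_with_feedback(initial_force=0.0)
print(f"learned force: {learned:.3f}")
```

The structure mirrors the "try something, no, that's not as good as this" loop from the answer: only the source of the judgment changes, from a human preference to a simulated physical outcome.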

Atif Malik

Hi. I am Atif Malik from Citigroup. I have a question for Colette. Colette, in your slides, you talked about availability for the Blackwell platform later this year. Can you be more specific? Is that the October quarter or the January quarter? And then on supply chain readiness for the new products: is it the packaging, particularly CoWoS-L on the B200, and how are you getting your supply chain ready for the new products?

Colette Kress

Yeah, so let me start with the second part of your question, talking about supply chain readiness. That's something that we've been working on for well over a year, getting ready for these new products coming to market. We feel so privileged to have the partners that work with us in developing out our supply chain. We've continued to work on resiliency and redundancy. But also, you're right, we're moving into new areas, new areas of CoWoS, new areas of memory, and just the sheer volume of components and complexity of what we're building. So that's well on its way and will be here for when we are ready to launch our products. There is also a part of our supply chain, as we talked about earlier today, involving the partners that will help us with the liquid cooling and the additional partners that will be ready in terms of building out the full data center. So this work is a very important part of easing the planning and the processing to put in all of our different Blackwell configurations. Going back to the first part of your question, which is when do we think we're going to come to market? Late this year, you will start to see our products come to market. Many of the customers that we have already spoken with have talked about the designs, talked about the specs, and have provided us their demand desires. And that has been very helpful for us to begin our supply chain work, to begin our volumes and what we're going to do. It's very true, though, that at the onset of the very first one coming to market, there might be constraints until we can meet some of the demand that's put in front of us. Hope that answers the question.

Jensen Huang

Yeah, that's right. And just remember that Hopper and Blackwell are used for people's operations, and people need to operate today. And the demand is so great for Hoppers. Most of our customers have known about Blackwell now for some time, just so you know. Okay, so they've known about Blackwell. They've known about the schedule. They've known about the capabilities for some time. We try to let people know as soon as possible so they can plan their data centers, and notice the Hopper demand doesn't change. And the reason for that is they have operations they have to serve. They have customers today and they have to run the business today, not next year.

Atif Malik

Okay.

Pierre Ferragu

Pierre Ferragu, New Street Research. So, like a geeky question on Blackwell and --

Jensen Huang

Thank you.

Pierre Ferragu

The two dies and the 10 terabytes between the two dies, can you tell us about how you achieve that? How much work have you put in over the years to be able to achieve that technically, like from a manufacturing standpoint? And then how do you see the future in your roadmap, looking further out? Do you think we're going to see more and more dies getting together into a single package? So that's one side of my question, which is more on the chip and the architecture. And the other side is, you must be seeing all these models that are, like Sam Altman said, behind the veil of ignorance. And so can you tell us about what you see and how you see the next generation of models influencing your architecture? And so what's the direction of travel for GPU architecture for data center AI?

Jensen Huang

Yeah, I'll start with the second. This is one of the great things about being the platform where all AI research is done. And so we get the benefit of seeing everything that's coming down the pike. And, of course, all next-generation models are intended to push current-generation systems to their limits. And so large context windows, for example, insanely large context windows, state space vectors, synthetic data generation, essentially models talking to themselves, reinforcement learning, essentially AlphaGo of large language models, tree search. These models are going to have to learn how to reason and do multipath planning. And so instead of one shot, it's a little bit like us thinking: we have to work through our plan. And that planning system, that reasoning system, multistep reasoning systems could be quite abstract and the path could be quite long, just like playing Go. And so -- but the constraints are much, much more difficult to describe. And so this whole area of research is super, super exciting. The type of systems that we're going to see in the next several years, the next two or three years, is unimaginable compared to today for the reasons I described. There is some concern about the amount of Internet data that's available for training these models, but that's just not true. 10 trillion tokens is great, but don't forget, synthetic data generation, models talking to each other, reinforcement learning, the amount of data you're going to be generating, it's going to take two computers to train each other. Today we have one computer training on data. Tomorrow it's going to be two computers, just -- right? Don't forget. Remember AlphaGo. It's multiple systems competing against -- playing against each other, okay, so that we could do that as quickly as possible. So some really exciting ground-breaking work around the corner. All right.
The one thing that we're certain of is that the scale of these -- the scale of our GPUs, they want to be even bigger. The SerDes of our company is world class. NVIDIA's SerDes are absolutely the world's best. The data rate and the energy consumed, the picojoules per bit, in our company is unbelievably good. It is the reason why we were able to do NVLink. Remember, NVLink was because we could not make a chip big enough and so we connected eight of them together. This was in 2016. We're on NVLink Gen 5. The rest of the world doesn't even have NVLink Gen 1 yet. NVLink Gen 5 allows us to connect 576 chips together. They are together as far as I'm concerned. The data center is so big, does it have to be this close together? No, not at all. And so it's okay to split them up 576 ways. And the SerDes are so low energy anyways. Now we could make even closer chips. Now, the reason why we want that is because then the software cannot tell the difference. When you break up chips, the algorithm should be: build the largest chip that lithography can make and then put multiple of them together, using whatever technology is available to do so. But you start by building the largest chip ever. Otherwise, why didn't we do multichip back in the old days? We just kept pushing, right, monolithic as far as we could. And the reason for that is because the data rate on chip and the energy on chip allow for the programming model to be as uniform as possible. You don't have these things called, speaking of geeking out, NUMA, non-uniform memory access, right? So you don't have NUMA behavior. You don't have weird cache behavior. You don't have memory locality behavior, which causes programs to work differently depending on the nodes they run on. We want our software to run exactly the same wherever it is. And so you start with the biggest chip possible. That's the first Blackwell die. We connect two of them together.
The technology, 10 terabytes per second, is insane. Nobody's ever seen a 10 terabytes per second link before. That's 10 terabytes per second and it obviously consumes very little power, otherwise it would be nothing but that link. And so you had to solve that, number one. The second thing you had to solve, the question that came up before, was CoWoS. It's the largest CoWoS in the world, because the first generation CoWoS was already the largest CoWoS in the world. Now the second generation is even larger. The benefit that we have is we're not surprised this time. The volume ramp demand happened fairly sharply last time, but this time we've had plenty of visibility. And so Colette is absolutely right. We've worked with the supply chain, worked with TSMC very closely. We are geared up for an exciting ramp.

Aaron Rakers

This will be the last question then.

Jensen Huang

Bummer. Come on.

Aaron Rakers

Wow, thank you. Aaron Rakers at Wells Fargo. I really appreciate all this detail. I'm actually going to dovetail off this last comment, because today you started the conversation by talking a little bit about Ethernet and how Ethernet with Ultra.

Jensen Huang

I love Ethernet.

Aaron Rakers

Yeah. So I want to understand a little bit, NVLink, 576 GPUs now interconnected together. This idea of the fabric architecture, where does that play relative to the evolution of Ethernet, your Spectrum 4 product, this move to 800 gig? I'm just trying to understand the interplay between those and whether or not you see NVLink competing with Ethernet in those environments.

Jensen Huang

No. First, the algorithm is actually very simple. First, build the largest die you possibly can. So big that if you added one more transistor, it would literally fall on the ground. That's algorithm number one. And look at the chips that we build. They're literally the largest. They're at reticle limits. Number two, if possible, connect two of them together. You're not going to connect four of them together. That's not going to happen. But you can connect two of them together, and that's the Blackwell invention. We now know how to build dies that big. But beyond that, you're going to have all kinds of weird NUMA effects and locality effects. You might as well go to NVLink. And so once you get to NVLink, the question is -- and of course, we're in Gen 5. If you don't have NVLink, then you're kind of stuck. Okay, you can't build systems like this. But if you have NVLink, then the next part is build NVLink as large as you can, modulated by power and cost. And that's the reason why NVLink is direct connect. It's direct drive, not because optical transceivers are out of fashion. Optical? Are you kidding me? We love optical. We need optical. We're going to use tons of optical. But you should build the NVLink as large as you can using copper, because you could save a lot of power, you could save a lot of money. You can make it scalable, sufficiently scalable. Now, you've got one giant chip, a 576-GPU chip effectively. But that's only 576 GPU chips. That's not enough. And so we're going to have to connect multiple of them. The next click after that, the best thing you have is InfiniBand. The second best you have is Ethernet with an augmented computing layer on top of it we call Spectrum-X, so that we can control the traffic that's in the system, so that we don't have these long tails. Remember, as I said, the last one to finish determines the speed of the computer. This is not an average throughput.
This is not like all of us individually accessing hyperscale, where our average throughput is good enough. This is literally the last person who finishes that partial product, who finishes that tensor. Everybody else is waiting on them. I don't know who in this room is going to be the last, but we're going to hope that that person doesn't hold everyone up, right? And so we're going to make sure that that last one is -- we push everything to the middle. We only want one answer. It all shows up at the right time. Okay. And so that's the second best. And then you scale that out as much as you can, and that's going to need optics and so on and so forth. There's a place for all of it. I think if anybody's concerned about optics, don't be concerned. We're -- I think the demand for optics is very, very high. Demand for repeaters is very, very high. We didn't change anything about that. All we did was we made computers larger, we made GPUs larger. Can we take one more question? This is so much fun.
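
The point that the last worker to finish sets the speed of a synchronized step, rather than the average, can be illustrated with a minimal sketch. Everything numeric here is an assumption for illustration and not from the call: each worker takes a hypothetical base time of 1.0 plus random exponential jitter with mean 0.1, and the function names are made up. The worker counts of 8 and 576 simply echo the GPU counts mentioned above.

```python
import random


def step_time(worker_times):
    """A synchronized step finishes only when the slowest worker finishes."""
    return max(worker_times)


def mean_step_time(num_workers, trials=200, base=1.0, jitter=0.1, seed=0):
    """Average step time over many trials.

    base and jitter are hypothetical: every worker takes `base` time units
    plus an exponentially distributed delay with mean `jitter`.
    """
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    total = 0.0
    for _ in range(trials):
        times = [base + rng.expovariate(1.0 / jitter) for _ in range(num_workers)]
        total += step_time(times)
    return total / trials


if __name__ == "__main__":
    # The average worker takes about base + jitter regardless of scale,
    # but the step time tracks the slowest worker and grows with scale.
    for n in (8, 72, 576):
        print(n, round(mean_step_time(n), 3))
```

Under these assumptions the per-worker average stays near 1.1 while the step time keeps growing with the number of workers, which is why trimming the long tail matters more than improving average throughput.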

Will Stein

One last question from the buy-side, Jensen. You've talked a lot about -- oh, I'm sorry.

Jensen Huang

Where is he? Oh, there he is. Hey, Will.

Will Stein

Yeah, hey.

Jensen Huang

Hi, Will.

Will Stein

Sovereign AI. Is there a way to sort of understand what you're going to do for the United Arab Emirates? That would be one question. And I guess my second question is, like, I'm going to go home. I'm going to see my 91-year-old mother. How can I try to explain to a 91-year-old what accelerated computing is? I guess I've got a good answer to the first question. I'll figure out the second one. Thanks.

Jensen Huang

Okay. Yeah, I don't know what you were going to say on the second one, but on the second one, I would say use the right tools for the right job. And right now, with general purpose computing, you're using the same tool for every single job. Literally what you have is a screwdriver and you're using it from the moment you wake up to the moment you go to bed. And so you start by brushing your teeth with a screwdriver. It probably works. I haven't tried it, but it probably works. And so you just use that one tool the whole day. Now, of course, because you're going to use that one tool for the whole day, over time humans have gotten pretty smart. And so we made that general purpose tool. And so now the screwdriver has brushes on it, it's got hair on it. So then it becomes useful for all kinds of stuff. And you could also use it to clean the bathroom and all that kind of stuff. And so one tool. Was that the answer you were going to give? All right. So, we created basically two tools. We said that the CPU is incredibly good at sequential things and what it's not good at is parallel things. Now, the parallel things, the weird thing is this: for most applications, let's say Excel, the parallel part is not very much. That's the reason why CPUs are really the best processor for Excel. For your web browser, except for graphics, which came along later, most web browsers are largely single threaded. Java is largely single threaded. And so many applications of personal computing are largely single threaded, and there the CPU is really quite ideal. And then all of a sudden, there's this new application that came along, computer graphics, video games, where literally 1% of the code is 99% of the runtime. Do you guys understand what I'm saying? 1% of the code is 99% of the runtime. And the reason for that is because it's computing the pixels one at a time. So 1% of the code is 99% of the runtime. And we said, look at that. How interesting.
Why don't we go create something that's insanely good at 1% of the runtime, meaning it's bad at 99% of the runtime. Excuse me, bad at 99% of the code. It's good at 1% of the code. And we go create applications or find applications where that 1% of the code is 99% of the runtime. Molecular dynamics, medical imaging, seismic processing, artificial intelligence, makes sense. That's why accelerated computing, data processing, so on and so forth, where 1% of the code is 99% of the runtime. And that's the reason why we get such great speed up. All right.
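
The "1% of the code is 99% of the runtime" argument can be sketched with Amdahl's law, which bounds the overall speedup when only part of the runtime is accelerated. This is a minimal illustration rather than anything stated on the call: the 0.99 runtime fraction comes from the remarks above, while the accelerator speedup factors (10x, 100x, 1000x) and the function name are hypothetical.

```python
def amdahl_speedup(accelerated_fraction, accelerator_factor):
    """Overall speedup from Amdahl's law.

    accelerated_fraction: share of the original runtime that is offloaded.
    accelerator_factor: how much faster that share runs once offloaded.
    """
    serial_share = 1.0 - accelerated_fraction
    return 1.0 / (serial_share + accelerated_fraction / accelerator_factor)


if __name__ == "__main__":
    # 0.99 is the runtime share quoted in the remarks above;
    # the accelerator factors are hypothetical.
    for factor in (10, 100, 1000):
        print(factor, round(amdahl_speedup(0.99, factor), 2))
```

With 99% of the runtime offloaded, the overall speedup approaches but never exceeds 100x, because the remaining 1% still runs at its original speed. An application where only half the runtime is parallel tops out below 2x, which matches the point that largely single-threaded software is still best served by the CPU.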

Colette Kress

Sovereign AI

Jensen Huang

Sovereign AI. Every country has their own natural resource and that natural resource is called their intelligence. It's in their language. India has their own language. They have many of them, lots of different dialects. They have their own language, their sensibility, their culture, their history. It belongs to them. And a lot of it is in their national archives and is digitized. It's not actually on the Internet. It belongs to them. They ought to take that and go create their own sovereign AI. And they believe the same. Sweden is the same way. Japan is going to do the same. You name it. Companies -- countries all over the world realize that this is their natural resource and they shouldn't let it just be used by anybody to then import their natural resource back to them in an automated way by paying somebody else. Don't let their data go out for free and import AI. They now realize it ought to be the other way around, that they should keep their own data and then export AI. And so export the AI of Korea, export the AI of Malaysia, export the AI of, you name it, Middle East countries. So, we have export control limitations on our products. And in most of the areas, the answer is it's not export control. And if there's any export control, we can still work with the US government and make sure that the export is going to be fine. But we, number one, just make sure that we're compliant with export control, and in some countries we have to offer degraded products or -- I didn't say that right -- lower-specification products. But anyways, number one, just be compliant with export controls and help countries around the world to be able to do this. It's a very big market. Yeah, it's a very big market. There are going to be AIs that are going to be trained and continuously refined for just about every culture in the world.

Jensen Huang

Thank you. Do you guys? No, no. Thank you very, very much. I appreciate -- Colette and I appreciate all of your support and interest in the company. And this is really quite an extraordinary time. It's not usual that we get to live through a time like this where the single most important instrument of society is being reinvented after 60 years, that a new way of doing software has emerged. And you know that software is one of the most important technologies that humanity has ever created and that you're in the beginning of a new industrial revolution. And so the next 10 years, you definitely don't want to miss. All right. Thank you very much.