AI Use Cases and Deployments Today, and Problem Statement for Future

Good morning everyone. I’m really excited to be here today. I think John opened with a powerful statement around a 100-1000x efficiency improvement needed to support carbon-neutral AI. That is a very powerful statement to rally around, and hopefully through the sessions this morning we will discuss why it is important, and almost existential, to drive towards in order for us all to have a more sustainable AI-related footprint.

For today’s session, I’ll start by kicking off with some of the use case deployments and giving the next level of detail on why 100-1000x is going to be important; that will be the first part. Then I’ll introduce today’s panelists, who are a good representation across technology providers, end users, and academic collaborators, and the range of topics across the co-design spectrum that they’ll be covering today.

Then finally, we’ll wrap up the session around 11:30 with the panel, where we will discuss the opportunities and areas of prioritization and focus for IEEE going forward in the space of AI/ML. That is what we will cover through the course of this morning’s session, and I look forward to discussing it with you all today.

I also want to call out my colleague David, who’s on the call today. He is largely my partner in crime on these efforts and on the partnerships with UCSB, and he will also be joining us for the panel session later today. Turning to the use cases: the challenge with AI is not just that it is important, but that there is a variety of use cases ranging all the way from the data center to endpoints.

They continue to proliferate at a rapid pace. We largely depend on these workloads: we started with content understanding, and recommendation models are very important for us.

Natural language processing, video-based AI, and now, as we head into new focus areas around the metaverse, the key types of AI applications continue to segment, and that drives the need for optimization at different levels. That, I think, is one of the interesting areas for how we look at this going forward, not just at the data center level but all the way through to the edge. Let’s look at the next level of detail on what some of the opportunities may be for how we optimize this going forward as a community.

The first vector, which I think Catherine somewhat alluded to, is primarily around hardware complexity. It is a little bit more manageable if all the complexity goes in one direction, but as these AI workloads segment, there are different levels of optimization. Content understanding models are very different from recommendation models. If you look at the picture being shown here, DLRM, which is one of the focus areas for recommendation models, is not that compute intensive.

If you look at the left chart from a compute standpoint, in terms of petaflops, DLRM is not that high, an order of magnitude lower relative to something like a GPT-3 use case. But when you look at the right side, recommendation models are very different and more demanding in terms of the number of parameters and the need to communicate the results of these computations across the network, whether that is all-to-all or all-reduce. What this means from a data center standpoint is that as these workloads grow in scale and also segment, we need optimization points because, to the earlier question, these different types of use cases become large enough at some level to justify driving towards the high energy efficiencies we are talking about. This is clearly a workload-based model through which we can apply efficiency as the key angle.

From a Meta perspective, we have made many of these workloads available to the community, such as the DLRM-based models, and we continue to make more models available, whether for training or inference. This is clearly a vector for focus.
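To make the communication patterns mentioned above concrete, here is a minimal, self-contained sketch (an illustration, not code from the talk) of what the two collectives do. Ranks are simulated as plain Python lists so the example runs anywhere; in a real DLRM-style job these would be distributed collectives across accelerators, with all-reduce typically aggregating dense-parameter gradients and all-to-all exchanging sharded embedding lookups.

```python
# Simulated collectives: each inner list is the data held by one "rank".

def all_reduce(per_rank_grads):
    """Every rank ends up with the element-wise sum of all ranks' gradients
    (the pattern typically used for dense parameters)."""
    summed = [sum(vals) for vals in zip(*per_rank_grads)]
    return [list(summed) for _ in per_rank_grads]

def all_to_all(per_rank_chunks):
    """Rank i sends its j-th chunk to rank j and receives the i-th chunk from
    every rank (the pattern typically used for sharded embedding lookups)."""
    world = len(per_rank_chunks)
    return [[per_rank_chunks[src][dst] for src in range(world)]
            for dst in range(world)]

# Three simulated ranks, each holding a small gradient vector and three chunks.
grads = [[1.0, 2.0], [10.0, 20.0], [100.0, 200.0]]
chunks = [["a0", "a1", "a2"], ["b0", "b1", "b2"], ["c0", "c1", "c2"]]

print(all_reduce(grads))   # every rank now holds [111.0, 222.0]
print(all_to_all(chunks))  # rank 0 now holds ['a0', 'b0', 'c0'], and so on
```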

We talked about this a little bit earlier in this segment, around the fact that DLRM is different from the transformer, and I think I just covered this on the earlier slide as well, so I’ll move past it. Now, getting into the scale of it: even though we talk about AI in the context of training and inference, they are really different problem statements and they drive different types of optimizations. For instance, on the training side, there are lots of investigations happening around how we deploy liquid cooling and how we build dedicated infrastructure in order to drive more optimized cooling for training-based use cases.

But the same types of optimizations don’t naturally translate to the inference space, where we really focus on performance per watt as well as having the right supporting infrastructure in place.
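As a rough illustration of that performance-per-watt framing, here is a tiny helper; the throughput and power numbers are placeholders, not measurements from the talk.

```python
def perf_per_watt(inferences_per_second: float, avg_power_watts: float) -> float:
    """Inference efficiency expressed as inferences served per joule."""
    return inferences_per_second / avg_power_watts

# Hypothetical comparison: a 350 W part serving 20,000 inferences/s versus a
# 700 W part serving 45,000 inferences/s. More throughput only improves this
# metric if it grows faster than the power envelope does.
print(f"current:  {perf_per_watt(20_000, 350):.1f} inferences/J")
print(f"next-gen: {perf_per_watt(45_000, 700):.1f} inferences/J")
```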

If you look at this graph, both training and inference continue to grow at a pretty rapid pace; as Catherine mentioned, this slide is also, I would say, a year plus old. It shows an aggregate growth of 4x or so across training and inference, and that growth has, at this point in time, just continued to accelerate even further. It is currently not showing any signs of tapering off.

You see both inference and training; they require different types of optimizations, and that also presents an opportunity in terms of the areas of research and focus we will look at going forward. The other angle I wanted to point out is that when these deployments reach scale, from a data center perspective, the reliability of these machines becomes of paramount importance.

These workloads are now doing more mission-critical computations.

Things like silent data corruption errors and error handling come into play: how reliable are the systems?
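One common way to screen for this, sketched below purely as an illustration (this is not the fleet tooling referenced in the talk), is to re-run a deterministic computation and flag any host whose results disagree.

```python
import numpy as np

def sdc_screen(seed: int = 0, size: int = 512, repeats: int = 3) -> bool:
    """Run the same matrix multiply several times and check for bit-identical
    results; a mismatch on an otherwise deterministic kernel is a candidate
    silent-data-corruption signal for that host."""
    rng = np.random.default_rng(seed)
    a = rng.standard_normal((size, size))
    b = rng.standard_normal((size, size))
    reference = a @ b
    for _ in range(repeats):
        if not np.array_equal(a @ b, reference):
            return False
    return True

if __name__ == "__main__":
    print("pass" if sdc_screen() else "FLAG: inconsistent results, inspect host")
```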

Reliability becomes a very key metric for us, because many of these workloads run for long periods of time and the computations they perform end up getting used in things like selling ads and other kinds of use cases. The reliability of the systems, in terms of silent data errors and similar issues at the scale of deployment we are talking about, is a key angle for us going forward. Going back to an earlier question: we can’t share here what percentage of our overall compute infrastructure AI represents, but we can say that AI is large enough that it drives independent decisions in terms of how we design and deploy in the data center.

For instance, AI requires a separate back-end fabric; AI is big enough that we are willing to invest in optimizing for it.

Maybe my guidance for this discussion would be that AI is big enough that it drives independent optimization decisions, and that can be an assumption we all use as we look forward at the opportunities here.

Part of the [inaudible] slowdown curve that Catherine talked about, and we will probably hear more on this from the technology providers (we have presentations from [inaudible] and [inaudible] coming up), is that we see a trend where the accelerators continue to grow into higher power envelopes. What was 300 or 400 watts is now getting closer to a kilowatt in terms of the power profiles that AI accelerators are driving towards. What this requires is a different type of cooling solution relative to the traditional compute solutions that we deploy at scale.
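A rough back-of-envelope (the accelerator counts, host overhead, and nodes per rack below are illustrative assumptions, not figures from the talk) shows how quickly that shift adds up at the rack level and why it strains traditional air cooling.

```python
def rack_power_kw(accel_tdp_w: float, accels_per_node: int = 8,
                  host_overhead_w: float = 1500.0, nodes_per_rack: int = 4) -> float:
    """Total rack power in kW for a given accelerator TDP."""
    node_w = accel_tdp_w * accels_per_node + host_overhead_w
    return node_w * nodes_per_rack / 1000.0

for tdp in (300, 400, 700, 1000):
    print(f"{tdp:>4} W accelerators -> ~{rack_power_kw(tdp):.1f} kW per rack")
# Under these assumptions, 300 W parts land around 15-16 kW per rack, while
# 1 kW parts push towards 38 kW, well beyond typical air-cooled provisioning.
```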

This has driven investigations and investments in areas like air-assisted liquid cooling within a rack, and in how we design our next-generation data centers with more capability to support this AI infrastructure. Clearly, AI data center design is another area that is ripe for investigation as AI scale continues to proliferate. This is a good segue to the mission statement that was laid out: 100 to 1000x efficiency improvements to enable carbon-neutral AI. In the picture being shown here, the red line basically explains why the current trajectory is just unsustainable.

If you look on the left, you see the growth in the model complexity that Catherine alluded to; it is the green line, which is on the very exponential curve described earlier.

Today, some of our clusters are already in the hundreds of kilowatts, which is what the bottom dot on the red chart is plotting. If we continue with the kinds of model complexities we are talking about, you can see an extrapolation of what that could mean in terms of a power footprint. Clearly, building this class of data centers, and this number of data centers, is just not practical, so there is clearly a need for energy efficiency in this space. Some of the questions coming in during the earlier part were asking: is this a big enough problem?

Without going into specifics, we can clearly say that this is absolutely a big enough problem and the community needs to come together in order to collaborate and make this path forward much more sustainable.
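As a rough illustration of the kind of extrapolation behind that red line, here is a back-of-envelope sketch; the starting point and growth factor are illustrative assumptions, not the slide’s data.

```python
start_kw = 500.0       # assumed: a cluster in the hundreds of kilowatts today
growth_per_year = 2.0  # assumed: power demand roughly doubling each year

for year in range(0, 9, 2):
    kw = start_kw * growth_per_year ** year
    print(f"year {year}: ~{kw / 1000:.1f} MW")
# Compounding like this reaches triple-digit megawatts within a decade, which
# is why flattening the curve through efficiency, rather than building ever
# more and ever larger data centers, is the point of the 100-1000x goal.
```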

What is it that we need as we look forward in this space? This gets into the co-design theme that we will hear about today. In terms of opportunities, we are looking pretty much across the board: data center design; compute, storage, memory, and network optimizations; computation in memory; hardware design and how we optimize our hardware design infrastructure; cooling at various levels, from within the [inaudible] to enabling capabilities at the data center level; and all the algorithmic innovations that can come from the software layer.

All of these are areas that need to work together in order to flatten the curve.

On the left it still seems like there is a small [inaudible] between the red and the green line, but clearly it is an exponential view in terms of how we are looking at it. Clearly this is the focus for us as a community going forward. This again lays out a different view of what Catherine talked about a little bit earlier: the focus is not just on training and inference; the data ingestion pipeline, from a storage standpoint, is also a key consideration for us. There are opportunities across the board here as we move forward.

This is the piece that gets exciting. The challenge is so broad and so important that it is going to require not just one company, or two companies, or five companies to make it happen. It is going to require broad collaboration and effort across end users, technology providers, system providers, academia, data center designers, and standards bodies.

You name it. This is something that is going to require the effort of the entire industry in order to build this more sustainable future for green AI.

There is clearly a lot of work ahead of us, and industry participation is critical, and we look forward to partnering with the industry.
