Tesla Inference: Another Area It Will Lead

Jensen Huang, the CEO of NVIDIA, stated that inference is going to be a billion times larger than training compute in the future. Let that statement sink in.

This is coming from the head of the leading producer of GPUs. Most AI training today is done on Nvidia chips. Meta, xAI, and Tesla have the largest clusters in the world, either running or being brought online, built predominantly on Nvidia chips. xAI is already operating a 100K+ GPU cluster, with Tesla set to join it by the end of the year. Meta's cluster is also over 100K GPUs, although there is no word on when it will go live.

Huang said that 40% of Nvidia's business is now inference. This is also something to consider. Sam Altman, CEO of OpenAI, admitted that the company's delay in releasing applications is due to a lack of compute.

Here we have an eye opener, and the reason Tesla is forging a path that will put it in a very powerful position.


Image generated by Ideogram

Tesla: Inference Leader

Every Tesla sold these days can be an inference node. Each vehicle has a computer that is over-specced so it can perform this function. It was not discussed a great deal over the years, but it is garnering more attention.

Distributed computing is getting bigger. We have seen it used for protein folding, operating networks, and storage.

With the rapid advancement of AI, inference is going to be crucial. In fact, it is going to be a bottleneck going forward.

Perhaps we should explain what inference computing is.

When one goes to a chatbot, the model behind it is a neural network that was trained on clusters of GPUs. That compute was used for training. Since models are continually being updated, those clusters are kept busy.

Once a model is released, people start to use it. Each time someone submits a prompt, compute is required. This is inference compute.
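To make the distinction concrete, here is a minimal sketch in Python using PyTorch. The tiny model and random data are placeholders, not anything a real chatbot uses, but the split between the two kinds of compute is the same: training runs a forward pass plus a backward pass and weight updates, while inference runs a forward pass only, once per prompt.

```python
# Minimal sketch of training compute vs. inference compute (PyTorch).
# The tiny model and random data are placeholders for illustration only.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 4))
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.CrossEntropyLoss()

# Training: forward pass, backward pass, and weight update.
# This is the heavy, GPU-cluster work done before a model is released.
x = torch.randn(8, 16)
y = torch.randint(0, 4, (8,))
loss = loss_fn(model(x), y)
loss.backward()
optimizer.step()
optimizer.zero_grad()

# Inference: a forward pass only, run every time someone submits a prompt.
# Cheaper per request, but it happens for every user, every time.
with torch.no_grad():
    prompt = torch.randn(1, 16)
    answer = model(prompt).argmax(dim=-1)
print(answer)
```

Training is a one-time (or periodic) cost per model; inference scales with the number of users and prompts, which is why it becomes the larger share of compute over time.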

Meta, OpenAI, xAI, and Google have to provide this. A model is not very good if people cannot get information out of it. This requires processing.

It is no different from your personal computer. Utilizing an application requires processing. When one opens Excel, the system uses resources to bring it up. The same is true while the program is being used, including when saving data.

With an AI model, it is no different except that the processing is done in the cloud. Servers are constantly handling prompts from millions of people, requiring a great amount of compute.

This is something that is going to increase with time.

Inference Cluster

What Tesla is setting up is an inference cluster.

In the next 18 months, it is likely the number of vehicles on the road with computers capable of doing inference will surpass 10 million. This can be viewed as a distributed cluster.

Of course, the main job of these computers is running the onboard software. While the vehicle is being driven, inference services will not be available.

That said, vehicles spend a lot of time not operating. When at rest, the computer is basically idle. This is where the ability to perform other services, such as inference, comes in.
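A hypothetical sketch of how such a node might decide when to accept work is below. The VehicleState fields, the battery threshold, and the opt-in flag are all assumptions for illustration; Tesla has not published an API for this.

```python
# Hypothetical sketch of an idle-vehicle inference node deciding when to
# accept work. Fields and thresholds are assumptions, not Tesla's design.
from dataclasses import dataclass

@dataclass
class VehicleState:
    parked: bool          # vehicle is at rest, not being driven
    plugged_in: bool      # on a charger, so compute does not drain range
    battery_pct: float    # state of charge, 0-100
    owner_opted_in: bool  # owner agreed to share idle compute

def can_serve_inference(state: VehicleState, min_battery: float = 50.0) -> bool:
    """Only offer the onboard computer to the distributed cluster when
    driving, its primary job, cannot be affected."""
    return (
        state.owner_opted_in
        and state.parked
        and (state.plugged_in or state.battery_pct >= min_battery)
    )

# Example: a parked, plugged-in car with an opted-in owner joins the cluster.
state = VehicleState(parked=True, plugged_in=True, battery_pct=80.0, owner_opted_in=True)
print(can_serve_inference(state))  # True
```

The design point is simply that driving always takes priority; idle time, especially while charging, is what gets pooled into the distributed cluster.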

When someone like Jensen Huang brings the topic up, you know it is something to pay attention to. The idea that we are going to be constrained for inference compute should not surprise anyone. Think about all the areas where AI is being integrated. All of it is going to need inference.

Where are we going to get it? There are only so many servers available at AWS, Google, and Meta. Chip makers are pumping product out as quickly as they can. There could be short-term reprieves if demand falls off temporarily, but they will not last. The long-term trend is fairly certain.

The solution could reside in distributed computing. Each Tesla comes with a computer to run the vehicle software. In other words, each comes FSD ready. This is a computer designed for AI inference; that is the role it serves for the FSD software.

Each quarter, more vehicles hit the road, expanding the fleet. Instead of looking at this as vehicles, we can view it as the growth of a future inference network.

