How AI helps improve chips: Nvidia report
It has become a tradition at GTC Spring for Nvidia Chief Scientist and Senior Vice President of Research Bill Dally to talk about Nvidia R&D. He shared how the Nvidia research organization works and spoke a little about current priorities. This year, Dally focused mainly on the AI tools that Nvidia both develops and uses to improve its own products. For example, Nvidia has begun using AI to refine and speed up its GPU design process.
Bill Dally of Nvidia in his home “workshop”
“We are a group of about three hundred people trying to predict the future of Nvidia products,” Dally said in his talk. “We are like powerful searchlights, illuminating objects in the distance. The group is split into two parts. The ‘supply’ group creates technologies that go into making GPUs. It advances the GPUs themselves: VLSI design methodologies, architecture, networks, programming systems, and the storage systems used in GPUs and GPU-based systems.”
“The demand side of Nvidia Research tries to drive demand for Nvidia products by developing software systems and techniques that need GPUs to run well. We have three graphics research groups, because we are constantly pushing computer graphics forward. We have five AI groups, because using GPUs for AI work is very relevant now and becoming more so. We also have groups working on robotics and autonomous vehicles, and a number of geographically focused labs, for example in Toronto and Tel Aviv.”
From time to time, Nvidia launches a “moonshot” initiative that pools the efforts of many groups; one such initiative, for example, produced real-time ray tracing.
As usual, Dally’s talk partially recounted last year’s information, but there was also new material. The group has grown (in 2019 it consisted of 175 people), and work on autonomous driving and robotics has naturally expanded. According to Dally, about a year ago Nvidia hired Marco Pavone from Stanford University to lead a new autonomous vehicle research group. Bill did not say much about the CPU design work, which is no doubt also intensifying.
Below are excerpts from Dally’s talk on the growing use of AI in chip design.
1. Voltage drop map
“It is natural for us, as AI people, to try to use AI to design better chips. We do this in two ways. The first and most obvious is to take the existing computer-aided design tools [and embed AI into them]. For example, we have a tool that takes a map of where power is used in our GPUs and predicts how far the voltage drops; this is called IR drop. Running this tool in a conventional CAD flow takes three hours,” says Dally.
“Because this is an iterative process, that becomes very problematic for us. Instead, we want to train an AI model on the same data; after a few design iterations we can simply feed the power map to the model, and the calculation time drops to three seconds. Of course, if you include the feature-extraction time, the total is 18 minutes, but we still get results very quickly. In addition, instead of a convolutional neural network we use a graph neural network to estimate how often different nodes in the circuit switch, which is what the power consumption in the previous example depends on. We can get much more accurate power estimates than with conventional tools, in much less time,” says Dally.
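The idea of replacing a slow physics-based tool with a learned surrogate can be sketched with a toy example. The convolution below stands in for a trained model: it maps a power grid to a per-tile voltage-drop estimate. The kernel weights and the whole setup are illustrative assumptions, not Nvidia's actual model.

```python
# Toy surrogate for IR-drop prediction: a single hand-set 3x3 convolution
# over a power map stands in for a trained CNN. In a real flow the kernel
# (or a deeper network) would be learned from completed design iterations.

def predict_ir_drop(power_map, kernel=None):
    """Estimate per-tile voltage drop from a 2D power map (watts per tile)."""
    if kernel is None:
        # Assumption: neighbouring tiles contribute less than the tile itself.
        kernel = [[0.05, 0.10, 0.05],
                  [0.10, 0.40, 0.10],
                  [0.05, 0.10, 0.05]]
    h, w = len(power_map), len(power_map[0])
    drop = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    ni, nj = i + di, j + dj
                    if 0 <= ni < h and 0 <= nj < w:
                        acc += kernel[di + 1][dj + 1] * power_map[ni][nj]
            drop[i][j] = acc
    return drop

# A single hot tile in the middle of an otherwise idle 3x3 region.
power = [[0.0, 0.0, 0.0],
         [0.0, 1.0, 0.0],
         [0.0, 0.0, 0.0]]
drop = predict_ir_drop(power)
# The hot centre tile shows the largest predicted drop (0.40),
# tapering off toward the corners.
```

Evaluating such a surrogate is a handful of multiply-adds per tile, which is why inference takes seconds where the iterative solver takes hours.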
2. Prediction of parasitic characteristics
“I especially like this example, because I worked as a circuit designer for many years: using graph neural networks, we predict parasitic characteristics. In the past, circuit design was a very iterative process. You draw a schematic, like the one on the left with two transistors, but you do not know how it will perform until the layout designer produces a layout from that schematic and extracts the parasitics; only then can you simulate the circuit and find out whether it meets the specifications,” Dally notes.
“Then you change your circuit and hand it back to the layout designer, which is a very long and inhumanly labor-intensive process. Now we can train neural networks to predict the parasitics without creating layouts at all. That means the circuit designer can iterate very quickly, without looping through the manual layout process. And the graph shown in the figure shows that we get very accurate predictions of parasitic characteristics compared to the reference data.”
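A schematic is naturally a graph (devices and nets as nodes, connections as edges), which is why a graph neural network fits here. The sketch below shows the core mechanism under stated assumptions: a couple of rounds of neighbour averaging as the message-passing step, and a fixed linear readout where a real model would have learned weights. The circuit, features, and `scale` value are all invented for illustration.

```python
# Minimal message-passing sketch of predicting a parasitic capacitance
# per node from a schematic graph. Node features here are transistor
# widths (um); a trained GNN would use richer features and learned weights.

def message_pass(feats, adj, rounds=2):
    """Average each node's feature with its neighbours', `rounds` times."""
    for _ in range(rounds):
        feats = {n: (feats[n] + sum(feats[m] for m in adj[n])) / (1 + len(adj[n]))
                 for n in feats}
    return feats

def predict_parasitic_cap(feats, scale=0.8):
    """Linear readout: `scale` stands in for a learned regression head."""
    return {n: scale * v for n, v in feats.items()}

# Two transistors sharing one net, as in the two-transistor schematic
# the talk describes.
adj = {"M1": ["net"], "M2": ["net"], "net": ["M1", "M2"]}
widths = {"M1": 2.0, "M2": 1.0, "net": 0.0}

emb = message_pass(widths, adj)
caps = predict_parasitic_cap(emb)
# The net's estimate sits between its two devices', because message
# passing blends the widths of everything the net touches.
```

The point of the exercise is the data flow: schematic graph in, parasitic estimates out, with no layout step in the loop.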
3. Placement and routing issues
“We can also predict routing congestion; this is critical in laying out our chips. In the normal flow, a netlist goes through a placement-and-routing process, which can be quite long and often takes several days. Only after that can we identify bottlenecks and find out that the current placement of elements is not adequate. We then have to refactor the layout and place the macros differently to avoid the red areas (see slide below), where there are too many wires, a kind of traffic jam for the bits. Now we can skip placement and routing entirely: we take the netlist and use a graph neural network to predict where the traffic jams will form, with fairly accurate results. The predictions are not perfect, but they identify the problem areas, so we can react and iterate very quickly, without needing a full placement-and-routing pass,” says Bill.
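To make the "predict congestion straight from the netlist" idea concrete, here is a deliberately crude stand-in for the learned model: it scores each cell by the fan-out of the nets touching it and flags high-scoring cells as likely traffic jams. A real GNN would learn far subtler structure from placed-and-routed training data; the netlist and threshold below are made up.

```python
# Crude netlist-only congestion proxy: cells attached to high-fan-out
# nets are flagged as likely routing hotspots, with no placement at all.
# This only illustrates the input/output shape of the learned predictor.

from collections import defaultdict

def congestion_score(netlist):
    """Score each cell by the number of other pins on its nets."""
    score = defaultdict(int)
    for cells_on_net in netlist.values():
        for cell in cells_on_net:
            score[cell] += len(cells_on_net) - 1
    return dict(score)

def hotspots(netlist, threshold=4):
    """Cells whose score reaches the threshold, in sorted order."""
    return sorted(c for c, s in congestion_score(netlist).items()
                  if s >= threshold)

# Invented netlist: net name -> cells it connects.
netlist = {
    "clk": ["U1", "U2", "U3", "U4"],  # high fan-out net
    "n1":  ["U1", "U2"],
    "n2":  ["U3", "U5"],
}
jams = hotspots(netlist)
# Cells on both the clock net and another net score highest.
```

Even this toy version shows the payoff described in the talk: an imperfect but near-instant hotspot map that lets designers restructure the floorplan before paying for a multi-day placement-and-routing run.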
4. Automating the migration of standard cells
“All of the techniques described above use AI to check human design decisions. Even more exciting, however, is using AI to do the design itself. I will show this with two examples. The first is a system we call NVCell, which uses a combination of simulated annealing and reinforcement learning to design our standard-cell library. Every time we move to a new process, say from seven nanometers to five, we have a library of cells. A cell can be an AND gate, an OR gate, a full adder. We have many thousands of these cells that need to be redesigned for the new technology under a very complex set of design rules.”
“Essentially, we use reinforcement learning to place the transistors. But more importantly, after they are placed there are usually many design-rule violations, and the neural network works through them like a video game. This is exactly what reinforcement learning is particularly good at; a famous example is reinforcement learning on Atari video games. So this is like an Atari game, except the game is fixing design-rule violations in a standard cell. By eliminating these violations with reinforcement learning, we can essentially complete the standard-cell design. The slide below shows that 92% of the cells in the library were produced without any design-rule or circuit-rule violations, and 12% of them are smaller than the human-designed cells. Overall, in terms of cell complexity, this tool does as well as a human designer, or even better.”
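The "fixing violations as a video game" framing maps directly onto standard reinforcement learning: states are the remaining violations, actions are candidate fixes, and rewards favor moves that clear a violation. The toy environment below, with three independent violations and tabular Q-learning, is entirely invented to illustrate that framing; it says nothing about NVCell's actual agent or state encoding.

```python
# Toy "violation fixing as a game": tabular Q-learning on a made-up cell
# with three independent design-rule violations. Each action attempts to
# fix one violation; clearing all three ends the episode.

import random

ACTIONS = [0, 1, 2]  # action i = "try to fix violation i"

def step(state, action):
    """state: tuple of 0/1 flags, 1 = violation still present."""
    s = list(state)
    reward = 1.0 if s[action] else -0.1  # wasted moves are penalised
    s[action] = 0
    s = tuple(s)
    return s, reward, s == (0, 0, 0)

def train(episodes=500, alpha=0.5, gamma=0.9, eps=0.2, seed=1):
    rng = random.Random(seed)
    q = {}
    for _ in range(episodes):
        state, done = (1, 1, 1), False
        while not done:
            # Epsilon-greedy action selection over the Q-table.
            if rng.random() < eps:
                a = rng.choice(ACTIONS)
            else:
                a = max(ACTIONS, key=lambda x: q.get((state, x), 0.0))
            nxt, r, done = step(state, a)
            future = 0.0 if done else gamma * max(
                q.get((nxt, x), 0.0) for x in ACTIONS)
            q[(state, a)] = (1 - alpha) * q.get((state, a), 0.0) \
                + alpha * (r + future)
            state = nxt
    return q

q = train()
# After training, the greedy policy targets a real violation rather
# than wasting a move on an already-clean rule.
```

The real problem differs in scale, not in shape: NVCell's "game board" is a cell layout and its moves are layout edits, but the reward-driven loop is the same.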
“This solves two problems for us. First, it saves an enormous amount of human labor. Porting the library to a new process took a group of about ten people the better part of a year. Now we can do it with a couple of GPUs running for a few days, after which people finish the remaining 8% of the cells that could not be created automatically. And in many cases the design produced by the neural network is simply better. So we save labor and get results that are as good as or better than what people produce.”
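The other half of the NVCell-style flow mentioned above is simulated annealing for placement. As a toy stand-in, the sketch below anneals a one-dimensional ordering of transistors to minimize total connection length: random swaps, with worse solutions occasionally accepted at high temperature to escape local minima. The cell names, nets, and cost function are all illustrative assumptions.

```python
# Toy simulated-annealing placer: order transistors in a row so the
# total length of their connections is minimal. A stand-in for the
# annealing half of an NVCell-style flow; everything here is invented.

import math
import random

def wirelength(order, nets):
    """Total distance spanned by each two-pin net for a given ordering."""
    pos = {t: i for i, t in enumerate(order)}
    return sum(abs(pos[a] - pos[b]) for a, b in nets)

def anneal(transistors, nets, steps=2000, t0=2.0, seed=0):
    rng = random.Random(seed)
    order = list(transistors)
    cur = wirelength(order, nets)
    best, best_cost = order[:], cur
    for step in range(steps):
        temp = t0 * (1 - step / steps) + 1e-6  # linear cooling schedule
        i, j = rng.sample(range(len(order)), 2)
        order[i], order[j] = order[j], order[i]  # propose a random swap
        new = wirelength(order, nets)
        # Accept improvements always, regressions with Boltzmann probability.
        if new <= cur or rng.random() < math.exp((cur - new) / temp):
            cur = new
            if new < best_cost:
                best, best_cost = order[:], new
        else:
            order[i], order[j] = order[j], order[i]  # reject: undo the swap
    return best, best_cost

# A four-transistor chain: A-B, B-C, C-D. The optimum places them in
# chain order, for a total wirelength of 3.
order, cost = anneal(["A", "B", "C", "D"],
                     [("A", "B"), ("B", "C"), ("C", "D")])
```

In the flow the talk describes, a placement like this is only the starting point; the reinforcement-learning stage then plays its "game" of clearing the design-rule violations the placement leaves behind.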