The largest processor in the world: Cerebras CS-1. A breakdown

You probably think this is yet more clickbait. The largest processor in the world? Surely we are about to describe a processor that is 5 percent larger than the rest, and only if you look at it from a certain angle. And yes, we do want views and reads, but…

Today we are going to tell you about the processor from Cerebras at the heart of the Cerebras CS-1. And it really is huge!

For example, the largest GPU until now was the Nvidia V100, but the new Cerebras processor is almost 57 times larger! The area of the chip itself, 462 square centimeters, is almost the same as the area of an entire Nvidia 3090 card, including the cooling system and connectors.

What would you say if this monster could simulate some physical models faster than the laws of physics themselves play out? Intrigued? Then sit back and pour yourself some tea. Today we take apart a truly huge single-chip processor!

So, what is this monster and why is it needed? Let's answer the second question right away: this processor is built for machine learning and artificial intelligence. In addition, it will greatly expand what is possible in complex modeling and, in a sense, let us look into the future. In general, artificial intelligence is an incredibly interesting and relevant topic, and its main limitation is insufficient computing power. And if you want to learn about real projects that use artificial intelligence, Elon Musk co-founded one: OpenAI.

If you think Moore's Law, with the number of transistors in a processor doubling every 1.5 years, is fast, then look at the needs of AI: the demand for these calculations doubles every 3.5 months!
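To put those two doubling periods side by side, here is a minimal Python sketch that converts each into growth per year. The 18-month and 3.5-month figures come from the text above; the function name `growth_factor` is just an illustrative choice.

```python
def growth_factor(period_months: float, doubling_months: float) -> float:
    """How many times a quantity grows over period_months
    if it doubles every doubling_months."""
    return 2 ** (period_months / doubling_months)

MOORE_DOUBLING_MONTHS = 18.0  # "every 1.5 years", as quoted in the article
AI_DOUBLING_MONTHS = 3.5      # as quoted in the article

moore_per_year = growth_factor(12, MOORE_DOUBLING_MONTHS)
ai_per_year = growth_factor(12, AI_DOUBLING_MONTHS)

print(f"Moore's Law: x{moore_per_year:.2f} per year")
print(f"AI compute demand: x{ai_per_year:.1f} per year")
```

Under these assumptions, transistor counts grow by roughly 1.6 times a year, while demand for AI compute grows by almost 11 times over the same year.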

The classic approach is to cram a bunch of processors into server racks and bring cooling and power to each one. All of those processors still have to be connected to one another, which, by the way, inevitably adds latency.

Put it this way: if you take a Ferrari engine and stuff it into an old Zhiguli, the car will certainly go faster, but it still won't drive like a Ferrari. So a fundamentally different approach is needed, right? To get a real hypercar you also need good brakes and suspension, and you need to work out the aerodynamics. It is just the same with computers.

That is exactly what Cerebras did: they decided to develop their system from scratch, and that means everything, from the architecture of the processor itself to the cooling and power supply.

The result is a huge machine that consumes 20 kilowatts and occupies a third of a standard server rack, so three of these computers fit in one rack! The chip itself, in its essence and purpose, resembles NVIDIA's server GPUs, so let's compare them. Take the Nvidia Tesla V100.

There are a lot of numbers, so get ready! In addition to the sheer size of the die, the Cerebras processor has four hundred thousand cores, which is 78 times more than the NVIDIA Tesla V100! The number of transistors is mind-blowing: 1.2 trillion, versus 21 billion for NVIDIA.

How much memory is there? 18 gigabytes of on-chip memory, right on the die! That's three thousand times more than the V100 has on the chip. By the way, the 3090 from the same NVIDIA has 6 MB of memory on the chip, just like the V100. And the bandwidth is scary to even talk about: the V100 has 300 gigabits per second, while Cerebras has 100 petabits per second. That is, the difference is 33 thousand times!
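The headline ratios can be checked with a few lines of Python. This is only a sanity-check sketch: the Cerebras figures and the V100 transistor and memory figures are the ones quoted in the article, while the V100 die area (815 mm²) and CUDA core count (5,120) are not stated in the text and are taken here as assumptions from the publicly listed V100 specifications.

```python
# Figures quoted in the article, plus two V100 specs labelled as assumptions.
CEREBRAS = {
    "die_area_cm2": 462.0,
    "cores": 400_000,
    "transistors": 1.2e12,
    "on_chip_memory_mb": 18_000.0,  # 18 GB, decimal units
}
V100 = {
    "die_area_cm2": 8.15,      # assumption: 815 mm^2, not stated in the article
    "cores": 5_120,            # assumption: CUDA cores, not stated in the article
    "transistors": 21e9,
    "on_chip_memory_mb": 6.0,
}

def ratios(big: dict, small: dict) -> dict:
    """Return big/small for every key the two spec sheets share."""
    return {key: big[key] / small[key] for key in big if key in small}

for name, value in ratios(CEREBRAS, V100).items():
    print(f"{name}: about {value:,.0f}x")
```

With these inputs the area, core and memory ratios come out at about 57, 78 and 3,000, matching the numbers in the text.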

And to reach similar processing power, they claim you would need a thousand NVIDIA V100 cards, which in total would consume 50 times more power and take up 40 times more space. That is a very significant saving in energy and floor space.

That is certainly great, and the numbers are amazing. But how did they manage to achieve them?

It's all about size. The chip is big, no, huge. That is what makes it possible to fit so much onto a single die. And the main thing is that communication between the elements is almost instant, because there is no need to gather data from different chips.

However, size is also this chip's main disadvantage.

Let's take things in order. First and foremost is heat. The developers of this monster understood perfectly well what they were creating and what kind of cooling it would need, so the cooling system, like the processor itself, was developed from scratch. It is a liquid cooling system in which coolant is directed to chilled copper blocks! Driven by a powerful pump, the coolant flows into a radiator, where it is cooled by a fan, and the hot air is then blown out by four additional fans.

Of the 20 kW the system consumes, supplied through twelve power connectors, four kilowatts go just to powering the fans and pumps of the cooling system. But as a result, the chip runs at half the temperature of standard GPUs, which ultimately increases the reliability of the whole system.
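A quick back-of-the-envelope sketch of that power budget, in Python. The 20 kW, 4 kW and three-systems-per-rack figures are the ones given in the article; the variable names and the assumption that everything outside cooling goes to the rest of the system are purely illustrative.

```python
TOTAL_POWER_KW = 20.0    # whole CS-1 system, from the article
COOLING_POWER_KW = 4.0   # fans and pumps, from the article
SYSTEMS_PER_RACK = 3     # one system takes a third of a rack

def cooling_share(total_kw: float, cooling_kw: float) -> float:
    """Fraction of the total power budget spent on cooling."""
    return cooling_kw / total_kw

share = cooling_share(TOTAL_POWER_KW, COOLING_POWER_KW)
remaining_kw = TOTAL_POWER_KW - COOLING_POWER_KW  # assumed: everything else
rack_kw = TOTAL_POWER_KW * SYSTEMS_PER_RACK       # a fully populated rack

print(f"Cooling share: {share:.0%}")
print(f"Left for the rest of the system: {remaining_kw:.0f} kW")
print(f"Full rack of three systems: {rack_kw:.0f} kW")
```

In other words, a fifth of the power goes to cooling alone, and a rack filled with three of these machines would draw about 60 kW.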

It is also worth noting separately that the engineers designed the system so that almost any component can be swapped out quickly, which is very cool, since it reduces downtime in the event of a failure.

The chip itself is manufactured by TSMC using, you won't believe it, a 16 nanometer process. And here you may rightly be indignant. How come? Everyone is already making chips at 5 nm, so what's the point of using the ancient 16 nm?

This is where the second problem lies. In the production of conventional chips there are inevitably defects, so some of the chips on a wafer turn out unusable and are thrown away or used for other tasks, and the smaller the process, the higher the defect rate. But when the entire silicon wafer is a single chip, any manufacturing error can mean throwing away the whole wafer. And given that a single wafer can take several months to produce and costs about a million dollars, well…
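Why die size hurts so much can be illustrated with the textbook Poisson yield model, in which the chance that a die has no defects is exp(-area × defect density). This is a simplified sketch, not the model Cerebras or TSMC actually use, and the defect density of 0.1 defects per cm² is a purely hypothetical value chosen for illustration. The 8.15 cm² figure for a conventional large GPU die is likewise an assumption; the 462 cm² figure is from the article.

```python
import math

def poisson_yield(die_area_cm2: float, defects_per_cm2: float) -> float:
    """Probability that a die of the given area contains zero defects,
    under the simple Poisson yield model."""
    return math.exp(-die_area_cm2 * defects_per_cm2)

DEFECT_DENSITY = 0.1    # hypothetical defects per cm^2, for illustration only
GPU_DIE_CM2 = 8.15      # assumption: a large conventional GPU die
WAFER_CHIP_CM2 = 462.0  # wafer-scale chip area quoted in the article

gpu_yield = poisson_yield(GPU_DIE_CM2, DEFECT_DENSITY)
wafer_yield = poisson_yield(WAFER_CHIP_CM2, DEFECT_DENSITY)

print(f"Conventional die: {gpu_yield:.1%} defect-free")
print(f"Wafer-scale chip: {wafer_yield:.1e} defect-free")
```

Under these assumptions a conventional die comes out defect-free a bit less than half the time, while the chance of a perfectly clean wafer-scale chip is vanishingly small, which is why a mature, well-understood process matters so much here.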

The bottom line is that the team decided to play it safe. After all, the 16 nm process is almost seven years old, so the details and subtleties of manufacturing on it are well understood. In other words, they are reducing risk! It is worth adding that development and testing of a similar chip on a 7 nm process is already underway, though its release will of course depend on demand for the first generation. And the numbers there are simply huge, just look at the table.

And here you may rightly point out that we have not yet said a word about the results this monster can achieve. That is difficult, since most of the information is not public, but some details still leak into the media.

The US Department of Energy's National Energy Technology Laboratory said the CS-1 is the first system capable of simulating more than a million cells of a fluid dynamics model faster than real time.

This means that when the CS-1 is used to simulate, for example, a power plant based on data about its current state, it can tell you what will happen in the future before the real system gets there. Get it? With this machine you can look into the future with high accuracy and, if necessary, correct and change it. What's more, in a simulation with 500 million variables, the Cerebras CS-1 has already outrun the Joule supercomputer, which ranks 69th among the most powerful supercomputers in the world. So it looks like demand will not be a problem.

The Cerebras system is planned to be used to predict the weather or the temperature inside a nuclear reactor or, for example, to design aircraft wings. Undoubtedly, laboratories and research centers around the world will find applications for it. As you can imagine, the computer will be expensive, but the exact price is unknown.

In open sources we found only one figure: two Cerebras CS-1 computers for $5 million in 2020 at the Pittsburgh Supercomputing Center. But the system is built only to order, for each specific client, so the price may vary.


This is clearly a unique system. Nothing like it has been done before! Most manufacturers believe it is far more profitable and efficient to stamp out lots of small processors, since the likelihood of a defect or failure drops sharply and each mistake is much cheaper. The Cerebras developers chose a risky path and, judging by the fact that the Cerebras CS-2 is already being tested, it is paying off.

And if everything they have promised comes true, we are looking at a completely new era of server computing, incredible opportunities for building computer models, and new capabilities for artificial intelligence. There is no doubt that market giants such as Nvidia, Intel and Google, seeing the success of Cerebras, will develop huge single-chip systems of their own. Can you imagine what will happen if you combine this with quantum computing, which we recently covered? Wow!

We will keep following the development of these technologies and will continue to make interesting reviews like this one about the latest achievements!

P.S. By the way, hit like if you spotted the Easter egg in the CS-1: the radiator grille is made in the shape of a special mesh of the kind used for calculations in computer modeling. A nod to the machine's purpose!
