One More Machine Learning PC for the Price of an RTX4090

“Artificial intelligence may be the best or worst thing that ever happened to humanity.” Stephen Hawking

Hello everyone! The march of Artificial Intelligence (AI) across the planet has become a heated topic lately. Far from fading away, it attracts more attention every day, gives no one any peace, and provokes discussions and disputes ranging from the professional to the superstitious. LLMs such as ChatGPT, Gemini, YandexGPT and others keep improving, services built on them are becoming more widespread and accessible, and more and more interesting things can be done with them.

A large number of open pretrained LLMs are already available: run your own GPT on a leash, as they say, and work wonders. I have also long wanted to run Stable (and not so stable) Diffusion. On top of that, my work now involves programming and training simpler networks for time-series prediction. So there is no shortage of models to experiment with, but computing resources for training are a much bigger problem. Renting GPU servers is expensive, so it only suits small, short-term experiments. It can of course be an option for a business, but for a home experimental lab it is not the best choice.

In short, I decided to assemble my own PC to run large and small, smart and not so smart, but completely artificial models, and this is what came out of it.

“Porridge from an axe”

Just as in the famous fairy tale the porridge gradually appears around the axe, a machine learning PC appears around the video card. Choosing the card is the most important step. As I already said, I am interested in large language models, which means you can never have too much memory. Among the consumer gaming video cards on the market, the most memory you can count on is 24 GB.

The current leader is the Nvidia RTX 4090: it delivers unmatched performance, has the maximum amount of fast GDDR6X memory, is as fast as an electric broom and as beautiful as a new refrigerator on sale. The downside of all these outstanding properties is the inhumane price, starting at roughly 200K rubles. My inner toad strangled all thoughts of perfection at this stage and made me look for “a three-room flat, but in Butovo”. I considered different options, including AMD, by the way, but they are still much inferior in a number of respects. In the end I found that, although Nvidia is already shipping a whole 4xxx lineup, it has no other cards with the memory capacity I wanted. The previous 3xxx generation, however, includes the RTX 3090. This card is of course slower than the 4090, but it is much more affordable and has 24 GB on board.

And since I am not proud and was ready for a used workhorse, I turned to the latest ads on a well-known site. On the very first page, bam, the stars (and the cards) aligned: an MSI GeForce RTX 3090 Ti Gaming X Trio 24G for only 80K rubles! Not just an RTX 3090, but the Ti. I did not hesitate, and my axe for the porridge was in my hands that same day.

A little about the RTX 3090 Ti

The main star of my build is the MSI GeForce RTX 3090 Ti Gaming X Trio 24G. Yes, it is not the flagship RTX 4090, but it also costs significantly less. Although the RTX 3090 Ti is slower than the newer card, it has a number of advantages that make it an excellent choice for machine learning.

Why RTX 3090 Ti?

  • 24 GB of video memory: Allows you to work with large models and datasets without having to resort to quality-reducing optimizations.

  • Tensor cores: The Ti version is equipped with 3rd generation Tensor Cores, which significantly accelerate operations related to machine learning and deep learning.

  • Availability: Price 80,000 rubles (in my case) makes it more attractive compared to the RTX 4090, which starts at around 200,000 rubles.
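To make the 24 GB argument concrete, here is a rough back-of-the-envelope sketch of how much memory model weights alone occupy at different precisions. The function names, the list of model sizes and the 2 GiB reserve are my own illustrative assumptions; real usage is higher because of activations, KV cache and, during training, optimizer state.

```python
# Rough VRAM needed just to hold model weights
# (no activations, KV cache, or optimizer state).
BYTES_PER_PARAM = {"fp32": 4.0, "fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weights_gib(params_billion: float, dtype: str) -> float:
    """Size of the weights in GiB for a model with the given parameter count."""
    total_bytes = params_billion * 1e9 * BYTES_PER_PARAM[dtype]
    return total_bytes / 2**30


def fits(params_billion: float, dtype: str,
         vram_gib: float = 24.0, reserve_gib: float = 2.0) -> bool:
    """True if the weights fit while leaving reserve_gib (an assumed margin) free."""
    return weights_gib(params_billion, dtype) <= vram_gib - reserve_gib


# Hypothetical model sizes, in billions of parameters.
for size in (7, 13, 34, 70):
    row = ", ".join(f"{d}: {weights_gib(size, d):5.1f} GiB" for d in BYTES_PER_PARAM)
    print(f"{size:>3}B  {row}")
```

Under these assumptions a 7B model in fp16 needs about 13 GiB for its weights and fits comfortably, while a 13B model in fp16 (about 24 GiB) does not fit and has to be quantized to 8 or 4 bits.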

RTX 3090 Ti Capabilities

  • Machine learning: Tensor cores and large memory capacity allow training complex neural networks faster and more efficiently.

  • Image generation: Working with Stable Diffusion and other image generation models becomes more comfortable and faster.

  • Big data processing: High throughput and performance speed up analysis and processing of large volumes of information.

Comparison of the NVIDIA GeForce RTX 3090 Ti and the NVIDIA GeForce RTX 4090:

| Characteristic | NVIDIA GeForce RTX 3090 Ti | NVIDIA GeForce RTX 4090 |
|---|---|---|
| Architecture | Ampere | Ada Lovelace |
| Process node | 8 nm | 5 nm |
| CUDA cores | 10,752 | 16,384 |
| Tensor cores | 336 | 512 |
| RT cores | 84 | 128 |
| Base frequency | 1,560 MHz | 2,235 MHz |
| Boost frequency | 1,860 MHz | 2,520 MHz |
| Video memory (VRAM) capacity | 24 GB GDDR6X | 24 GB GDDR6X |
| Memory bus | 384-bit | 384-bit |
| Memory bandwidth | 1,008 GB/s | 1,008 GB/s |
| Power consumption (TDP) | 450 W | 450 W |
| Recommended PSU | 850 W | 850 W |
| PCIe version | PCIe 4.0 | PCIe 4.0 |
| Tensor cores generation | 3rd generation | 4th generation |
| Price (at time of release) | About $1,999 | About $1,599 |
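Two of the figures in the comparison above can be derived from the others, which is a handy sanity check. The sketch below recomputes memory bandwidth from the bus width and estimates theoretical peak FP32 throughput from the core count and boost clock. The 21 Gbps per-pin memory data rate is not in the comparison and is my assumption; the `Gpu` class and its method names are illustrative only. The FP32 number is a theoretical shader peak, not a Tensor Core or real-world training figure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gpu:
    name: str
    cuda_cores: int
    boost_mhz: int
    bus_bits: int
    mem_gbps_per_pin: float  # assumption: not listed in the comparison

    def bandwidth_gb_s(self) -> float:
        # GB/s = (bus width in bits / 8 bits per byte) * per-pin data rate in Gbps
        return self.bus_bits / 8 * self.mem_gbps_per_pin

    def peak_fp32_tflops(self) -> float:
        # 2 FLOPs per CUDA core per clock (one fused multiply-add)
        return 2 * self.cuda_cores * self.boost_mhz * 1e6 / 1e12


rtx_3090_ti = Gpu("RTX 3090 Ti", 10_752, 1_860, 384, 21.0)
rtx_4090 = Gpu("RTX 4090", 16_384, 2_520, 384, 21.0)

for gpu in (rtx_3090_ti, rtx_4090):
    print(f"{gpu.name}: {gpu.bandwidth_gb_s():.0f} GB/s, "
          f"{gpu.peak_fp32_tflops():.1f} TFLOPS FP32 (theoretical)")

ratio = rtx_4090.peak_fp32_tflops() / rtx_3090_ti.peak_fp32_tflops()
print(f"Theoretical FP32 ratio: {ratio:.2f}x")
```

With these inputs both cards come out at 1,008 GB/s, and the theoretical compute ratio is roughly 2x in favor of the RTX 4090. Since memory bandwidth is identical, bandwidth-bound workloads should be expected to gain less than that.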

  • Training neural networks: The RTX 4090 delivers between 50% and 100% performance gains over the RTX 3090 Ti depending on the specific model and framework.

  • Inference of models: Thanks to improved Tensor Cores, the RTX 4090 delivers faster inference, especially when using new data formats like FP8.

Tensor cores and their role:

  • RTX 3090 Ti is equipped with 336 3rd generation Tensor Cores, which deliver high performance in machine learning and deep learning operations.

  • RTX 4090 has 512 4th generation Tensor Cores, which offer even higher performance and power efficiency and add support for the FP8 format. Acceleration of sparse tensor operations is available in both generations.

Performance in teraflops (FP16):

Figure: RTX 3090 Ti vs RTX 4090 performance comparison, from https://technical.city/

RTX 3090 Ti Advantages

  • Availability: With the release of the RTX 4090, prices for the RTX 3090 Ti may drop, making it more affordable.

  • Memory capacity: Same as RTX 4090, 24GB VRAM, which is critical for larger models.

  • Compatibility: Time-tested Ampere architecture with broad support across various frameworks and drivers.

Advantages of RTX 4090

  • Increased core count: More CUDA, Tensor and RT cores.

  • New technologies: Improved 4th generation Tensor Cores and 3rd generation RT Cores.

  • Power Efficiency: Despite the same TDP, the new architecture offers better performance per watt.

Conclusions

  • RTX 3090 Ti is still a powerful graphics card for machine learning workloads, especially with its 24 GB of VRAM, which lets you work with large models and datasets.

  • RTX 4090 offers significant performance and energy efficiency gains, making it the preferred choice for those who want to maximize training and inference speed, have the budget, and can find a reason to splurge rather than save.

The main thing is not to oversalt

With the video card in hand, all that was left was to add the remaining components to the mix that could reveal its (the card’s) potential.

CPU

I chose the AMD Ryzen 9 7950X3D. I had wanted to try an AMD CPU instead of Intel for a long time, and now the time has come. Its 16 cores and 32 threads provide high performance in multi-threaded tasks, which is well suited to machine learning. The price of 59,406 rubles seemed justified to me and more attractive than Intel's, as did the TDP, by the way.

Motherboard

Such a powerful duo of processor and video card requires a reliable foundation. I chose the ASRock X670E STEEL LEGEND DDR5 for 30,293 rubles. The word “legend” in the name inspired confidence, and support for DDR5 and PCIe 5.0 leaves headroom for the future. I really liked that it takes four M.2 SSDs and that three of the slots have good heatsinks.

RAM

RAM is critical when working with large models. I settled on G.Skill Flare X5 DDR5 5200 MHz 2×32 GB for 18,722 rubles. 64 GB of memory lets you work comfortably with large datasets and models. Later it can be expanded to 128 GB, which the motherboard allows.

Cooling system

After several unsuccessful attempts with other coolers, I chose the Deepcool LS520 WH for 10,464 rubles. This liquid cooler not only cools the processor effectively, but also looks stylish in white and fits the case perfectly. I will say right away that the Deepcool LS720 will not fit into this case.

Storage

For storing data I chose the SmartBuy 1 TB SSD Stream P16 for 8,283 rubles. This NVMe SSD provides high read and write speeds, which is important when working with large amounts of data, and its price is quite modest.

Power supply

The power supply must deliver at least 850 W, otherwise the card will not even start. Taking into account the power consumption of the card and the other components, I chose a unit with a small reserve: the Deepcool PQ1000M 1000W 80+ Gold for 13,844 rubles. It will cope with its task 100% and ensure stable operation of the system.
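The reasoning about reserve can be written down as a simple power budget. This is only a sketch: the 450 W GPU figure is the TDP from the comparison earlier in the article, while the CPU and "everything else" figures are rough assumptions of mine, and short transient spikes of the GPU are not modeled at all.

```python
# Rated power draw in watts. Only the GPU figure comes from this article;
# the other two entries are rough assumptions for illustration.
COMPONENT_WATTS = {
    "GPU (RTX 3090 Ti, TDP)": 450,
    "CPU (assumed package power under load)": 160,
    "Motherboard, RAM, SSD, fans, pump (assumed)": 100,
}


def psu_headroom(psu_watts: int, loads: dict[str, int]) -> tuple[int, float]:
    """Return (total load in W, fraction of PSU capacity used)."""
    total = sum(loads.values())
    return total, total / psu_watts


for psu in (850, 1000):
    total, used = psu_headroom(psu, COMPONENT_WATTS)
    print(f"{psu} W PSU: load {total} W, {used:.0%} of capacity, {psu - total} W spare")
```

With these assumed numbers the sustained load is about 710 W, which leaves roughly 140 W spare on an 850 W unit and 290 W on a 1000 W unit.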

Case

I chose the be quiet! PURE BASE 500DX for 11,449 rubles. I had already bought this case for another PC: it looks good, is easy to build in, and offers good ventilation and quiet operation, which is just what a powerful system needs.

Final assembly and cost

Here is the final table of components:

| Component | Model | Price (rubles) |
|---|---|---|
| Video card | MSI GeForce RTX 3090 Ti Gaming X Trio 24G | 80,000 |
| CPU | AMD Ryzen 9 7950X3D | 59,406 |
| Motherboard | ASRock X670E STEEL LEGEND DDR5 | 30,293 |
| RAM | G.Skill Flare X5 DDR5 5200 MHz 2×32 GB | 18,722 |
| Cooling | Deepcool LS520 WH | 10,464 |
| Storage | SmartBuy 1 TB SSD Stream P16 | 8,283 |
| Power supply | Deepcool PQ1000M 1000W 80+ Gold | 13,844 |
| Case | be quiet! PURE BASE 500DX | 11,449 |
| Total | | 232,461 |
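For anyone who wants to adapt the budget to their own prices, the total can be recomputed with a few lines of Python. The component prices are the ones listed above; the 200,000 ruble figure for the RTX 4090 is the approximate starting price mentioned earlier in the article, and the variable names are mine.

```python
# Component prices in rubles, as listed in the article.
PRICES_RUB = {
    "Video card": 80_000,
    "CPU": 59_406,
    "Motherboard": 30_293,
    "RAM": 18_722,
    "Cooling": 10_464,
    "Storage": 8_283,
    "Power supply": 13_844,
    "Case": 11_449,
}
RTX_4090_RUB = 200_000  # approximate starting price quoted in the article

total = sum(PRICES_RUB.values())
gpu_share = PRICES_RUB["Video card"] / total

print(f"Total build cost: {total:,} rubles")
print(f"Video card share of the budget: {gpu_share:.0%}")
print(f"Whole build vs. RTX 4090 alone: {total / RTX_4090_RUB:.2f}x")
```

The sum matches the total of 232,461 rubles, with the video card accounting for about a third of the budget.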

Conclusion

In the end, I built a powerful PC that could handle machine learning, image generation, and other resource-intensive applications. The cost of the system ended up being comparable with the cost of the RTX 4090 alone.

Now I can experiment with large language models, train neural networks, and work with big data at home. I hope my experience will be useful to those who are also thinking about assembling their own PC for machine learning.

P.S. If you are also faced with choosing a video card, remember that it is not always worth chasing flagships. Sometimes compromise solutions can be a great option, especially when it comes to the balance between price and performance.

P.P.S. In a follow-up post I will provide a description and scripts for setting up Linux + CUDA.
