Storage Spaces performance in Windows Server 2025 – first approximation and methodology

For the League of Laziness: some whining and some dismal figures about software-defined storage and what is going on with it.

A bit of theory – software-defined storage, what it is

About 20 years ago, in 2004, when I was still going to school and reading comics about the Real Adventures of a Sexy Vampire, data storage systems lived separately (at a high, sometimes very high, cost) and servers lived separately. SQL Server 2000 Service Pack 2 (SP2) had just been released, SQL Server had clustering at the service level (anyone interested can find the article "Clustering Windows 2000 and SQL Server 2000" by Brian Knight, first published 2002-07-12), and Oracle, it seems, already had RAC.

Why was there such a split? Because calculating parity and moving data blocks back and forth is work that, on the one hand, is not mathematically trivial and, on the other, is utterly routine; spending cycles of a relatively slow general-purpose processor on it – be it x86, the Motorola 68060 (already dead by then), or the still-living UltraSPARC II – was not very rational. The sketch below shows what that parity math actually boils down to.
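In case "parity" sounds abstract: here is a toy Python sketch – purely illustrative, nothing like what a real controller or Storage Spaces actually runs – of what RAID-5-style parity boils down to. XOR the data blocks to get the parity block, XOR the survivors to rebuild a lost one.

```python
# Toy illustration of RAID-5-style parity: the parity block is the XOR of the
# data blocks in a stripe, and any single lost block is rebuilt by XOR-ing
# everything that survived. Block names and sizes here are made up.
def xor_blocks(blocks: list[bytes]) -> bytes:
    """XOR equal-sized byte blocks together."""
    result = bytearray(len(blocks[0]))
    for block in blocks:
        for i, b in enumerate(block):
            result[i] ^= b
    return bytes(result)

data = [b"AAAA", b"BBBB", b"CCCC"]            # three data blocks of one stripe
parity = xor_blocks(data)                     # parity block, written to a fourth disk

# "Lose" the second disk and rebuild its block from the rest plus parity:
rebuilt = xor_blocks([data[0], data[2], parity])
assert rebuilt == data[1]
```

Trivial per block – but a controller has to do it for every single write, which is exactly why it used to be offloaded to dedicated hardware.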

By the mid-2010s the situation had gradually changed: x86 performance grew and the cost per operation fell. On the classic storage side, as late as 2015 the same 3PAR still had a separate module for offloading some of the math, and for Huawei OceanStor V2 you could buy a separate LPU4ACCV3 Smart ACC module, but the main load was already being handled by x86 – the HPE 3PAR StoreServ 7000 already ran on Intel Xeon. By 2019 Huawei had switched to Arm, or more precisely to Kunpeng 920.

Around the same time – more precisely, in Microsoft Windows Server 2012 – support for dynamic RAID appeared in the form of Storage Spaces, along with SMB Direct and SMB Multichannel; local reconstruction codes were added in 2012 R2; and Windows Server 2016 brought a new feature, S2D (Storage Spaces Direct), though that is old news by now. ReFS arrived in time too, with its protection of data against almost everything except its own deduplication and its own New Year's patch (the January 2022 Patch Tuesday: KB5009624 for Windows Server 2012 R2, KB5009557 for Windows Server 2019, and KB5009555 for Windows Server 2022).

Everything would be fine both there and here, BUT.

But. S2D is supported only in the Datacenter edition, and it is not just expensive, it is very expensive. Covering even a small cluster of 20 servers with these licenses costs so much that it is easier to buy a classic storage system.

But. If you have at least 1 (one) virtual machine running Windows Server in your cluster, you are still required to license all cores of all cluster nodes with Windows Server licenses. So you have to work out what is more profitable: trying to cover all nodes with Standard (STD) licenses, with their limit of two virtual machines per fully licensed host, or going straight to Datacenter – a rough sketch of that arithmetic follows below.
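One possible way to sketch that arithmetic in Python. The rules encoded here are assumptions reflecting the usual per-core model (16-core minimum per host, Standard covering two Windows Server VMs per fully licensed host and "stacked" for every additional two, Datacenter covering an unlimited number), and the prices are placeholders, not quotes – check your own agreement.

```python
import math

# Rough comparison of Windows Server Standard vs Datacenter core licensing for a
# virtualization cluster. Assumptions: every physical core is licensed with a
# 16-core minimum per host; a full set of Standard core licenses covers 2 Windows
# Server VMs and is repeated ("stacked") for every additional 2 VMs; Datacenter
# covers unlimited VMs. Prices below are placeholders, not real list prices.
def std_cores_needed(cores_per_host: int, vms_per_host: int) -> int:
    base = max(cores_per_host, 16)
    stacks = max(1, math.ceil(vms_per_host / 2))
    return base * stacks

def dc_cores_needed(cores_per_host: int) -> int:
    return max(cores_per_host, 16)

hosts, cores_per_host, vms_per_host = 4, 32, 10
std_price_per_core, dc_price_per_core = 60.0, 400.0   # hypothetical prices

std_total = hosts * std_cores_needed(cores_per_host, vms_per_host) * std_price_per_core
dc_total = hosts * dc_cores_needed(cores_per_host) * dc_price_per_core
print(f"Standard: {std_total:,.0f}   Datacenter: {dc_total:,.0f}")
```

One more wrinkle: with live migration each host generally has to be licensed for the peak number of VMs that can land on it, which pushes the Standard "stacking" math up quickly.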

But. At the same time, on Datacenter you can still NOT have proper deduplication and compression (DECO), yet you CAN have eternal speed problems if your system is configured by clumsy integrators or by staff hired for five kopecks who test storage performance by copying a file, or by running CrystalDiskMark with default settings.

Along the way you will also get backup problems if you have enabled DECO and have not read the DECO Windows Server backup guide.

It's very simple: if we save on personnel, we buy a classic dedicated storage system – it has many times fewer settings and knobs for the user to turn in the GUI. Capacity scales by buying new shelves. Speed does not scale as easily: you have 2, 4, or 8 controllers and pay accordingly (up to 16 controllers, if you really need them).

This does not make maintenance any easier: on a classic storage system it is also highly advisable to update both the storage system firmware and the disk firmware. On some older systems (10-15 years ago) a firmware update could easily destroy the disk layout and the data with it (DS4800), and sometimes a simple disk replacement led to disasters, as on the DS3500. Then again, IBM once had a firmware version that worked for about 1.5 years, or something like that. But not updating the firmware is not an option either – see the recent story with HPE and other vendors' SSDs that died at exactly 40,000 hours of operation ("Dell & HPE Issue Updates to Fix 40K Hour Runtime Flaw"; "failure to update to SSD Firmware Version HPD8 will result in drive failure and data loss at 32,768 hours of operation"; FN70545 – "SSD Will Fail at 40,000 Power-On Hours"; and so on).

With the seemingly omnivorous MSA (older versions) you could hurt yourself in a different way: buy the array itself and stuff it with cheaper disks – even SATA, with their rated reliability (latent sector error, LSE, rate) of about 10^-16 – and then run into problems during a rebuild, even in RAID 6. Flashing firmware is scary, not flashing it is also scary, but using gray-market disks apparently is not scary at all. A rough estimate of what those error rates mean for a rebuild is sketched below.
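To put a number on "problems when rebuilding", here is a crude Poisson-style estimate – illustrative only, real failure modes are messier and the actual rates come from the drive datasheet – of hitting an unrecoverable read error while reading all the surviving disks during a rebuild.

```python
import math

# Crude estimate of hitting at least one unrecoverable read error (URE) while
# reading the surviving disks during a rebuild, given a rated bit error rate.
# Poisson approximation: P = 1 - exp(-UBER * bits_read).
def p_ure_during_rebuild(uber: float, disks_read: int, capacity_tb: float) -> float:
    bits_read = disks_read * capacity_tb * 1e12 * 8     # total bits read back
    return -math.expm1(-uber * bits_read)                # 1 - exp(-expected_errors)

# Rebuilding onto a spare while reading 7 surviving 8 TB disks:
print(p_ure_during_rebuild(1e-14, 7, 8.0))   # ~0.99  desktop-class SATA
print(p_ure_during_rebuild(1e-15, 7, 8.0))   # ~0.36  typical nearline SATA
print(p_ure_during_rebuild(1e-16, 7, 8.0))   # ~0.04  better enterprise class
```

RAID 6 buys you a second parity to absorb such an error, which is exactly why single-parity rebuilds on big cheap disks are a gamble – but a degraded RAID 6 with one disk already gone is back in roughly the same position.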

You can put together a combo – cheap personnel, cheap servers, cheap disks – and get problems with availability, speed, and fault tolerance, up to and including complete loss of data.
Choosing the price of a solution is a business decision; your choice as an employee is whether to work with a cutter, and then to bear responsibility for the screw-ups that come with working with a cutter, for the same small pay.

Continued on Pikabu and in my Telegram channel.
