A million RPS with a thimble in the ointment

We (the devhands.io R&D team) have finished testing the official release of Valkey and comparing it with Redis, the project it was forked from. For those not familiar with the story: Valkey was born after Redis changed its license, under the auspices of cloud providers, primarily AWS.

We focused on how throughput and response time depend on the io-threads parameter, which controls the “partial parallelism” in these products.
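To make the parameter concrete, here is a minimal sketch of how the server command line for one point of such a sweep could be assembled, with persistence switched off as in the tests described at the end of the article. It relies on the fact that redis-server and valkey-server accept config directives as --name value arguments. The helper name server_cmd, the port, and the list of thread counts are illustrative assumptions, not the actual test-bench scripts.

```python
import shlex


def server_cmd(binary: str, io_threads: int, port: int = 6379) -> list[str]:
    """Build an argv list for a server with persistence disabled."""
    return [
        binary,
        "--port", str(port),
        "--io-threads", str(io_threads),  # 1 means the main thread does all I/O
        "--save", "",                     # no RDB snapshots
        "--appendonly", "no",             # no AOF
    ]


if __name__ == "__main__":
    # Hypothetical sweep values; the article does not list the exact points.
    for n in (1, 2, 4, 8, 16):
        print(shlex.join(server_cmd("valkey-server", n)))
```

Passing "redis-server" as the binary gives the equivalent Redis command line, so the same sweep can be run against both servers.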

Details of the test bench and testing methodology are given at the end of the article. The following graphs present the results. First, let's look at Redis throughput and response time.

Figure: Redis throughput

Figure: Redis response time

Even with a single I/O thread, Redis throughput is respectable: approximately 160,000 RPS (requests per second). And contrary to the popular belief that Redis cannot scale across cores, it can grow by about 2-2.5 times. However, beyond 8 I/O threads further scaling across cores no longer makes sense, which is clearly visible in both graphs: throughput degrades and response time begins to grow noticeably.

Valkey showed noticeably better results:

Figure: Valkey throughput

Figure: Valkey response time

Although in single-thread mode Valkey starts at the same level as Redis, approximately 160,000 RPS, its throughput then soars to almost 900,000 RPS (we did “squeeze out” a million, but with fewer and smaller keys; see the testing methodology below). At the same time, the response time stays fantastically low: below 0.1 milliseconds (that is, just 100 microseconds).
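As a rough sanity check, the sub-0.1 ms figure is what simple queueing arithmetic (Little's law) predicts for a closed-loop benchmark. The sketch below assumes that each of the 64 benchmark connections keeps exactly one request in flight (no pipelining), which the article does not state explicitly; the function name is ours.

```python
def expected_latency_ms(connections: int, throughput_rps: float) -> float:
    """Little's law for a closed loop: latency = requests in flight / throughput."""
    return connections / throughput_rps * 1000.0


if __name__ == "__main__":
    # 64 connections at the peak throughput quoted in the text (~900,000 RPS).
    print(f"{expected_latency_ms(64, 900_000):.3f} ms")
```

Under these assumptions the average comes out at roughly 0.07 ms, which is in line with the measured response time.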

It is hard to miss that throughput stops growing once the number of I/O threads exceeds 8. Most likely, this is an inherent limitation of the single-main-thread architecture shared by Redis and Valkey. And 8 threads is exactly what the Valkey developers used in their headline marketing benchmark (“unlocking 1 million RPS”). Keep this in mind when planning capacity: on powerful machines you will have to either spread the keys across several instances or use cluster mode (which, by the way, works great in both projects).
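Spreading keys across several standalone instances can be as simple as a stable hash on the client side. The following is a minimal sketch of that idea, not a description of how any particular client library does it: the addresses are hypothetical, and CRC32 modulo the instance count is just one possible mapping.

```python
import zlib


def pick_instance(key: str, instances: list[str]) -> str:
    """Map a key to one instance using a stable hash (CRC32) modulo the instance count."""
    return instances[zlib.crc32(key.encode("utf-8")) % len(instances)]


if __name__ == "__main__":
    # Hypothetical addresses: three standalone servers on one large box.
    instances = ["127.0.0.1:6379", "127.0.0.1:6380", "127.0.0.1:6381"]
    for key in ("user:1", "user:2", "session:abc"):
        print(key, "->", pick_instance(key, instances))
```

The trade-off of plain modulo hashing is that most keys move to a different instance when the instance count changes. Cluster mode avoids this by mapping keys to a fixed set of 16384 hash slots and moving slots between nodes.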

So, a summary of the comparison:
(1) Redis scales across cores, but not very well.
(2) Valkey delivers about 2.5 times more RPS with much lower latency.
(3) Neither of them increases throughput beyond 8 I/O threads.
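The ratios behind this summary can be reproduced from the rounded figures quoted in the text. This is only arithmetic on those approximate numbers, not on the raw measurements, so treat the results as ballpark values.

```python
BASELINE_RPS = 160_000      # both servers with one I/O thread (from the text)
REDIS_GAIN = (2.0, 2.5)     # "about 2-2.5 times" (from the text)
VALKEY_PEAK_RPS = 900_000   # "almost 900,000 RPS" (from the text)

# Redis peak throughput implied by the quoted gain range.
redis_peak = tuple(BASELINE_RPS * g for g in REDIS_GAIN)

# How far Valkey scales relative to its own single-thread baseline.
valkey_gain = VALKEY_PEAK_RPS / BASELINE_RPS

# Valkey peak relative to the low and high ends of the Redis peak range.
valkey_vs_redis = tuple(VALKEY_PEAK_RPS / r for r in redis_peak)

if __name__ == "__main__":
    print(f"Redis peak: {redis_peak[0]:,.0f} to {redis_peak[1]:,.0f} RPS")
    print(f"Valkey gain over one thread: {valkey_gain:.1f}x")
    print(f"Valkey vs Redis at peak: {valkey_vs_redis[1]:.2f}x to {valkey_vs_redis[0]:.2f}x")
```

With these inputs Valkey's peak is between 2.25 and 2.8 times the Redis peak, which is where the "about 2.5 times" in point (2) comes from.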

Valkey shows excellent results! There is a small “fly in the ointment”, the lack of scaling on nodes with more than 8 cores, but it is so small that it is less a spoonful than a thimbleful. What's next? We continue to answer a question that is at least 20 years old: does a high-load project need a cache layer in 2024? PostgreSQL 17 is already on the test bench (it already delivers its million+ RPS there, but eats up all the CPU), and MySQL 8.4 (baby, we believe in you), Memcached and DragonFly will follow soon.

Author's Telegram channel: https://t.me/rybakalexey.


Testing methodology

  • bare metal: Xeon Gold 6312U, 24 cores / 48 vCPUs, 128 GB RAM

  • persistence disabled (no snapshotting, no AOF)

  • redis-benchmark, GET commands, -c 64 --threads 16

  • 10 million keys, 256 bytes each

  • 5 million requests to calculate latency / throughput
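The client side of this setup could look roughly like the sketch below. Only -c 64, --threads 16 and the GET test come directly from the list above. Mapping "10 million keys" to -r, "256 bytes" to -d and "5 million requests" to -n is our assumption about how those numbers were passed to redis-benchmark, and the host, port and function name are illustrative.

```python
import shlex


def benchmark_cmd(test: str = "get", host: str = "127.0.0.1", port: int = 6379) -> list[str]:
    """Build a redis-benchmark argv list approximating the described run."""
    return [
        "redis-benchmark",
        "-h", host, "-p", str(port),
        "-t", test,            # which command to benchmark
        "-c", "64",            # parallel connections (from the article)
        "--threads", "16",     # benchmark client threads (from the article)
        "-n", "5000000",       # total requests (assumed mapping)
        "-r", "10000000",      # random keyspace size (assumed mapping)
        "-d", "256",           # value size in bytes (assumed mapping)
    ]


if __name__ == "__main__":
    print(shlex.join(benchmark_cmd()))
```

For the GET requests to hit existing keys, the keyspace has to be populated first, for example by a run with test="set" and the same -r and -d values. The article does not describe how the data was loaded, so this is also an assumption.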

The results are in good agreement with the official benchmarks on an EC2 C7g.16xlarge instance:

“The data demonstrates a substantial performance improvement with the new I/O threads approach. Throughput increased by approximately 230%, rising from 360K to 1.19M requests per second compared to Valkey 7.2. Latency metrics improved across all percentiles, with average latency decreasing by 69.8% from 1.792 ms to 0.542 ms.
Tested with 8 I/O threads, 3M keys DB size, 512 bytes value size, and 650 clients running sequential SET commands using AWS EC2 C7g.16xlarge instance.”
https://valkey.io/blog/unlock-one-million-rps/
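The percentages in the quote follow directly from the absolute figures it gives, as this small check shows. The helper name is ours; the inputs are taken verbatim from the quote.

```python
def pct_change(old: float, new: float) -> float:
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0


if __name__ == "__main__":
    print(f"throughput:  {pct_change(360_000, 1_190_000):+.1f}%")   # +230.6%
    print(f"avg latency: {pct_change(1.792, 0.542):+.1f}%")         # -69.8%
```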
