So simple and so complex

The first version was written in Tcl and contained only 319 lines of code. It was then rewritten in C and presented on Hacker News in 2009. The system now has clients for various programming languages, including C#, Java, Go, Python, and Node.js.

By the way, the name of the system has nothing to do with the juicy, crunchy root vegetable: in this case, the word “Redis” is an abbreviation for REmote DIctionary Server.

What is it used for?

Redis can be useful for a variety of tasks.

Caching is the most common task solved with this DBMS. Caching is usually considered in the context of database queries: to avoid excessive load, the required record is first looked up in the cache. If the data is not in the cache, we query the database and save the result in the cache so that next time the data can be returned much faster. The cache content should always stay up to date; for this, we set a storage duration using a TTL (time to live).
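To make the pattern concrete, here is a minimal cache-aside sketch in Python. A plain dict with expiry timestamps stands in for Redis (with a real client you would use GET and SETEX); slow_db_query and the 60-second TTL are purely illustrative assumptions:

```python
import time

cache = {}  # key -> (value, expires_at); a dict stands in for Redis here
TTL = 60    # storage duration in seconds (example value)

def slow_db_query(key):
    # Stand-in for an expensive database lookup (hypothetical).
    return f"row-for-{key}"

def get_with_cache(key):
    entry = cache.get(key)
    if entry and entry[1] > time.time():
        return entry[0]                      # cache hit: return immediately
    value = slow_db_query(key)               # cache miss: go to the database
    cache[key] = (value, time.time() + TTL)  # save the result with a TTL
    return value
```

The second call for the same key is served from the cache until the TTL expires.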

Distributed lock. It is used when access to a shared mutable resource must be restricted across several distributed services.

Session management. Redis can be used as a centralized session store. It allows you to set the lifetime of a session and generally manage other metadata within a session in a very flexible way.

Limiting the load on a specific service (Rate Limiter). Most often we are talking about the number of API calls per unit of time.

There are different ways to limit the load:

  • The simplest way is to use a plain counter, stored in Redis and kept per user. But this method has many disadvantages: for example, it cannot smooth out peak loads.

  • The sliding window implementation is a more advanced method. We set the time or maximum number of sessions and discard all requests that do not fit into the set limits. This problem can be solved, for example, using the Sorted Set structure (we will discuss the solution algorithm in detail below).

  • Another approach is the “leaky bucket” (Leaky Bucket) method: requests accumulate in storage like water in a bucket, but since the bucket leaks, old requests gradually drain out of it, freeing space for new ones. If the rate of incoming requests exceeds the leak rate, new requests are rejected.
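The leaky bucket logic can be sketched in a few lines of Python. This is a self-contained illustration of the algorithm itself, not Redis-specific code; the capacity and leak rate are arbitrary example values:

```python
import time

class LeakyBucket:
    """Leaky-bucket sketch: requests fill the bucket, the bucket drains
    at a fixed rate, and requests that would overflow it are rejected."""

    def __init__(self, capacity, leak_rate_per_sec):
        self.capacity = capacity
        self.leak_rate = leak_rate_per_sec
        self.level = 0.0
        self.last = time.monotonic()

    def allow(self):
        now = time.monotonic()
        # Drain whatever has leaked out since the last call.
        self.level = max(0.0, self.level - (now - self.last) * self.leak_rate)
        self.last = now
        if self.level + 1 <= self.capacity:
            self.level += 1      # the request fits in the bucket
            return True
        return False             # bucket full: reject
```

In a distributed setup the bucket state would live in Redis so that all service instances share one limit.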

Redis can also be used for analytics and leaderboards, using the same Sorted Set.

Of course, we have only touched the tip of the iceberg and left a huge layer of tasks “under water” that can be easily solved using Redis.

What's inside

Essentially, Redis is a large hash table (distributed, in cluster mode), where the key is an arbitrary string and the value is one of the supported data structures (strings, lists, hashes, sets, sorted sets, etc.). Let's briefly go over its characteristics:

  • Full persistence: there are two storage modes – RDB and AOF.

  • Fault tolerance, with several replication and clustering topologies to choose from.

  • Information security: access control and data encryption.

  • Full-fledged configuration management and monitoring.

Persistence is an important aspect that deserves a closer look. Redis supports two storage modes: RDB and AOF.

AOF (append-only file) implements an operation log: every new operation is simply appended to the end of the file. The file is human-readable, meaning it can be inspected and even edited. Unlike RDB, AOF does not block Redis.

This mode also has drawbacks, such as the gradual growth of the log. To solve this problem, compaction is provided: AOF Rewrite. When the file grows too large, Redis launches AOF Rewrite and collapses groups of commands. For example, if we incremented a counter five times, only one command will remain after the rewrite instead of five.
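A toy illustration of what such compaction achieves (assuming counters start at zero; the real AOF Rewrite handles far more command types than this sketch):

```python
from collections import Counter

def compact_incrs(log):
    """Collapse a log of INCR commands into one SET per key with the
    final value, the way AOF Rewrite folds repeated increments."""
    counts = Counter()
    for cmd, key in log:
        assert cmd == "INCR"   # this sketch only handles INCR entries
        counts[key] += 1
    return [("SET", key, n) for key, n in counts.items()]
```

Five increments of the same counter compact into a single SET command.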

RDB (Redis Database) periodically saves a snapshot of the entire dataset. To do this, the main process is forked and the snapshot-writing procedure, BGSAVE, is launched. For this reason, to save a snapshot during intensive data changes you may, in the most general case, need up to twice the dataset's size in RAM.

Of course, the modern OS kernel has copy-on-write, but it only helps when data in RAM changes at low intensity. And when we are talking about hundreds of millions or billions of keys, these are very significant volumes.

RDB, unlike AOF, causes blocking and is not recommended for use on master nodes.

Redis supports various deployment topologies:

  1. One node, or stand-alone. The simplest topology. It has a right to exist, but only for development and testing environments, since it is not fault-tolerant.

  2. Master-Replica (Secondary). In this topology, asynchronous replication constantly runs between the master and the replica. To force synchronization, that is, to switch to synchronous mode, Redis has a special WAIT command.

  3. Sentinel. This deployment topology was widely used in early versions of Redis, before full cluster support. It consists of separate special nodes that monitor the operation of the main Redis nodes, track master failures, and initiate the recovery process. It works on top of the Master-Replica topology.

  4. A full-fledged cluster, consisting of a set of master nodes and a set of replicas (secondary nodes). The Gossip protocol is used to exchange cluster state: information spreads like an epidemic, each node passing it to its known “neighbors”. Clients can work with both masters and replicas, but replicas serve reads only.

The data in the cluster is sharded, i.e. divided into segments. The cluster does not use consistent hashing, instead it uses so-called hash slots.

The cluster has a total of 16384 slots. The formula used to calculate the hash slot for a key is crc16(key) % 16384.
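The formula is easy to reproduce. Below is a pure-Python sketch of the CRC16 variant Redis Cluster uses (CRC-16/XMODEM, polynomial 0x1021, initial value 0) and the slot calculation; note that real Redis also honors {hash tags} inside keys, which this sketch omits:

```python
def crc16(data: bytes) -> int:
    """CRC-16/XMODEM, the variant Redis Cluster uses to hash keys."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc

def hash_slot(key: bytes) -> int:
    # Real Redis first looks for a {hash tag} in the key; omitted here.
    return crc16(key) % 16384
```

The standard check value for this CRC variant is crc16(b"123456789") == 0x31C3, which makes the implementation easy to verify.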

Each Redis node is responsible for a specific subset of hash slots. For example:

  • Node A contains hash slots from 0 to 5500.

  • Node B contains hash slots from 5501 to 11000.

  • Node C contains hash slots from 11001 to 16383.

This allows you to easily add and remove cluster nodes, i.e. perform fast resharding.

Supported data types

Redis supports a huge number of data structures. The main ones are, of course, string, list, hash, and set. Each comes with its own set of operations: we work with strings using GET, MGET, SET, APPEND, and with lists using LPUSH, LPOP, LTRIM, LLEN. There are too many operations and data structures to describe them all in this article, so I suggest looking at just a few specific types: Sorted Set, Bitmap, HyperLogLog, and Stream.

Sorted Set. Based on this structure, we can build the already familiar “sliding window” algorithm (Rate Limiter):

  • For the score, you can use the timestamp of the incoming request.

  • You can remove obsolete elements that have moved beyond the window using the ZREMRANGEBYSCORE command.

  • The number of requests in a window is calculated using the ZCARD command.

  • If the number of requests does not exceed the limit, then a new request can be added to the window using the ZADD command.

Allowed rate: 2 requests per minute!
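The four steps above can be sketched in pure Python, with an ordered list of timestamps standing in for the Sorted Set (against real Redis these would be ZREMRANGEBYSCORE, ZCARD, and ZADD calls):

```python
import time

class SlidingWindowLimiter:
    """Sliding-window rate limiter mirroring the Sorted Set recipe."""

    def __init__(self, limit, window_sec):
        self.limit = limit
        self.window = window_sec
        self.hits = []  # request timestamps, ordered (the "sorted set")

    def allow(self, now=None):
        now = time.monotonic() if now is None else now
        # ZREMRANGEBYSCORE: drop entries that fell out of the window.
        self.hits = [t for t in self.hits if t > now - self.window]
        if len(self.hits) < self.limit:   # ZCARD: count requests in window
            self.hits.append(now)         # ZADD: record the new request
            return True
        return False
```

With a limit of 2 requests per 60 seconds, the third request inside the window is rejected, while a request after the window slides past the old ones is allowed again.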

Bitmap. This data structure is ideal for high-load services – it allows for the fastest and most compact analytics in terms of memory utilization. For example, to create a report on unique users per day.

In essence, this is a bit mask with the ability to write and read at any offset. To record the fact that one unique user visited our resource, we need only one bit of information. One bitmap can store 2³² bits.

We take the user ID: if it is an integer, nothing extra needs to be done; we use it directly as an offset into the bit mask. If it is of another type, we apply a hash function and take the remainder of the division to get an offset. Then we execute the SETBIT command to set the desired bit in the mask.
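Here is a pure-Python sketch of that logic, mimicking SETBIT/GETBIT/BITCOUNT semantics (Redis numbers bits from the most significant bit of the first byte); the SHA-1-based offset for non-integer IDs is just one possible choice of hash function:

```python
import hashlib

class Bitmap:
    """Sketch of Redis bitmap semantics on top of a bytearray."""

    def __init__(self):
        self.buf = bytearray()

    def setbit(self, offset, value=1):
        byte, bit = divmod(offset, 8)
        if byte >= len(self.buf):
            self.buf.extend(b"\x00" * (byte - len(self.buf) + 1))
        if value:
            self.buf[byte] |= 0x80 >> bit          # MSB-first, like Redis
        else:
            self.buf[byte] &= ~(0x80 >> bit) & 0xFF

    def getbit(self, offset):
        byte, bit = divmod(offset, 8)
        if byte >= len(self.buf):
            return 0
        return (self.buf[byte] >> (7 - bit)) & 1

    def bitcount(self):
        # Number of set bits = number of unique visitors recorded.
        return sum(bin(b).count("1") for b in self.buf)

def user_offset(user_id):
    # Integer IDs are used directly; anything else is hashed and reduced
    # modulo the bitmap size (2**32, as in a full-size Redis bitmap).
    if isinstance(user_id, int):
        return user_id
    digest = hashlib.sha1(str(user_id).encode()).digest()
    return int.from_bytes(digest[:8], "big") % 2**32
```

BITCOUNT over such a mask then gives the number of unique visitors for the day.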

Bitmap Visitor Statistics

Stream. Look at the picture. What does it remind you of?

Of course: it's just a copy of the components familiar from the Kafka broker landscape. It's immediately obvious what inspired the creators of Redis.

Here, too, we work with streams, and the terminology is very familiar to us: producer, consumer, consumer group. All commands for working with streams in Redis have the X prefix: for example, XADD. To read data within a consumer group, the XREADGROUP command is used, delivery is confirmed by the XACK command, and so on.

HyperLogLog is a probabilistic data structure that estimates the cardinality of a set, i.e. the number of unique elements in it. It uses only 12 KB of RAM, handles sets of up to 2⁶⁴ elements, and the standard error of the estimate is 0.81%, which is less than one percent! To work with this structure, you need the PFADD and PFCOUNT commands.
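These figures follow directly from HyperLogLog's parameters: Redis uses m = 2¹⁴ = 16384 six-bit registers, and the standard error of the estimator is 1.04/√m:

```python
import math

# Redis's HyperLogLog uses m = 2**14 = 16384 six-bit registers.
m = 2 ** 14
memory_kb = m * 6 / 8 / 1024      # registers packed into bytes -> 12.0 KB
std_error = 1.04 / math.sqrt(m)   # standard HLL error formula, ~0.81%

print(memory_kb, std_error)
```

So both the 12 KB footprint and the 0.81% error quoted above are consequences of the chosen register count, not independent tuning knobs.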

Performance

Let's evaluate Redis' performance in numbers and compare it with the performance of other components. For clarity, I have drawn a pyramid with Redis in the center. The higher the element in the pyramid, the more productive it is:

Only the first- and second-level CPU caches (L1 and L2) sit above Redis, and are therefore faster, with speeds of about 1 ns and 10 ns, respectively. Redis works at RAM speed, which is about 100 ns. For comparison: an SSD operates at about 100 µs, and an insert query in PostgreSQL takes about 10 ms on average.

However, Redis performance can be optimized!

Redis supports batch processing, or pipelining. The RESP (REdis Serialization Protocol) protocol works over a TCP connection in request-response mode. To minimize network round trips, you can group commands into one batch, which significantly increases the overall throughput of the system.
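At the protocol level, pipelining is nothing more than concatenating serialized commands into a single write. A minimal sketch of RESP serialization (the command names and keys here are arbitrary examples):

```python
def encode_command(*args) -> bytes:
    """Serialize one command as a RESP array of bulk strings,
    e.g. PING -> b'*1\r\n$4\r\nPING\r\n'."""
    out = [b"*%d\r\n" % len(args)]
    for a in args:
        a = a if isinstance(a, bytes) else str(a).encode()
        out.append(b"$%d\r\n%s\r\n" % (len(a), a))
    return b"".join(out)

def encode_pipeline(commands) -> bytes:
    # Pipelining: many commands go out in one socket write,
    # and the replies come back in a single read.
    return b"".join(encode_command(*cmd) for cmd in commands)
```

Client libraries (redis-py's pipeline(), for instance) do exactly this buffering for you under the hood.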

Pipelining (time in sec)

The graph shows that grouping 10,000 operations gives a fourfold increase, and with a batch size of 100,000 commands the increase is already sevenfold!

Conclusion: it is impossible to do without using pipelining in highly loaded systems.

Nuances: what to pay attention to when working with Redis

Unfortunately, nothing in this world is perfect: every system probably has pitfalls you should be aware of before diving into it, and Redis is no exception. Here are a few nuances I should warn you about up front.

  1. Redis has two commands that are very similar in purpose: KEYS and SCAN. Both are needed to obtain the set of keys matching a pattern, but you still shouldn't confuse them.

KEYS is a blocking command (and remember that Redis is single-threaded, so it is better not to block it).

SCAN is a cursor-based command that keeps no state on the server. The result set obtained with SCAN may contain duplicates, but this is not nearly as bad as running KEYS in a production environment, since duplicates are easy to eliminate algorithmically.

  2. Redis implements two strategies for removing expired keys:

  • lazy delete: on every read and write operation, the expireIfNeeded() function is called;

  • periodic deletion, in which Redis selects a random sample of keys inside the activeExpireCycle() loop. Since the sample is limited and arbitrary, several such iterations are required; the database to resume from is tracked in the current_db variable.

  3. When talking about using Redis in a high-load system landscape, we can't ignore the topic of tuning. There are a few key points here:

  • Disable transparent huge pages.

  • Set vm.overcommit_memory = 1.

  • Cap maxmemory at 75-85% of available RAM.

  • In the network stack, increase the tcp-backlog parameter, especially for highly loaded systems.

  • Control the maximum number of clients (maxclients).

  • I do not recommend using RDB on Redis master nodes, because it is a blocking operation.

Are there any alternatives to Redis?

Finally, I suggest looking at several DBMSs that position themselves as more performant alternatives to Redis: Dragonfly, KeyDB, and Garnet. Moreover, these are not just alternatives but drop-in replacements, meaning their use should not require changes to code or configuration. Let's figure out whether that is really the case!

DragonflyDB is one of the newest solutions on the market:

  • in-memory database written in C++;

  • fully compatible with Redis, but not a fork;

  • multi-threaded architecture;

  • eviction has been rewritten as a full-fledged LRU (a 2Q implementation) on top of the Dashtable structure, whereas Redis uses sampling.

The vendor's website contains graphs comparing Dragonfly and Redis to the disadvantage of the latter:

Vendor's Dragonfly Test (OPS)

Judging by this data, Dragonfly outperforms Redis by an order of magnitude in both reads and writes. In fact, the authors simply took a single-node Redis configuration and compared their system against it. But you and I know that such configurations are not suitable for production; a full-fledged fault-tolerant cluster is needed for a fair comparison. In other words, this graph is not entirely correct and was clearly created to promote the “dragonfly”.

What's the reality?

  1. If we try to test the declared drop-in replacement, we will find that Dragonfly does not support keyspace notifications. And they are extremely necessary for solving many problems.

  2. Dragonfly still has no horizontal scaling. Its creators claim this is an advantage: supposedly, vertical scaling is more efficient. But all our experience suggests the opposite: modern solutions are expected to scale horizontally.

In fact, the reaction of the creators of Redis was not long in coming: they ran their own comparison, taking a full-fledged Redis cluster and writing tests that use the pipelining mechanics. And, lo and behold, the results were completely different:

Alternative Test (OPS)

KeyDB has been actively developed since 2019 and is essentially a multi-threaded fork of Redis.

  • “under the hood” is the same multi-threaded architecture;

  • the declared performance increase is 5x (compared to a single-node Redis configuration);

  • a drop-in replacement for Redis;

  • own implementation of replication (Active Replica, Multi-master).

It looks great, so why hasn't everyone moved to KeyDB instead of continuing to use Redis? The thing is, both of its fault-tolerant topology implementations are, to put it mildly, not the most reliable.

If you look at other criteria, it becomes clear why KeyDB never became a full-fledged replacement for Redis:

Garnet. Microsoft's alternative to Redis has only just come out, but is already generating interest:

  • It is an open source cache storage under the MIT license;

  • written in C#;

  • own multithreaded engine Tsavorite (fork of Microsoft FASTER storage);

  • the storage is divided into two: main (for strings) and object (for complex objects);

  • all data is stored in the C# heap;

  • a drop-in replacement for Redis.

Microsoft also compared their product with similar systems and produced a chart. We must give them credit: they used pipelining in the analysis. The results are impressive: Garnet really does outperform its competitors in many respects and may in time even replace Redis.

We could talk about the possibilities of Redis for a very long time. In this article, I tried to give the basic information that will help you get acquainted with this DBMS and start using it for your own purposes. As experience operating it in many information systems, including highly loaded ones, shows, Redis is a reliable, scalable, efficient and, most importantly, very fast tool for many typical tasks, from caching to analytics. And today it compares favorably with similar systems.
