Implementation of chat with message caching

I'll start by describing the problem.

Let's say you want to create a chat and store its messages. You could add a simple database (DB) for this, such as MySQL or even a NoSQL store.

However, constantly retrieving messages from the database can be expensive and time-consuming, especially if the chat has a large number of unauthorized users, or users with certain roles for whom, just as for unauthorized ones, it is undesirable to spend the database server's resources. In addition, it is logical to cache all chat messages somewhere other than the main database, since this is the most frequently requested information. It makes sense to use Redis for caching. I liked the video that explains what Redis is in 100 seconds – Redis in 100 Seconds.

Typically, Redis is used as a key-value (dictionary) store. By the way, the video briefly explains that Redis is a little more than the key-value store many people think it is.

Our task is somewhat more complicated: it is not just a matter of getting messages from Redis by key, as usual. We also want to retrieve messages with flexible, customizable queries that depend on incoming parameters and conditions, to filter and to sort… In general, we want to query Redis in almost the same way as we are used to querying an SQL database. It is logical to duplicate the work of the MySQL server for the functionality above and add Redis as a cache for chat messages.

There’s just one problem: Redis is a NoSQL database cache and has rather limited functionality.

You might say: we can fetch the messages and filter them on the server side in code, using any logic we need. But what if there are tens or hundreds of thousands of messages?! That is extremely inefficient.

It would be much more efficient to make a request to Redis so that it filters, sorts and produces the result like a regular SQL database.

Are you surprised, or do you doubt that this is possible? It is!

Read on!

But first, let's look at a more complete description of the task.

Each time a message is sent, it makes sense to deliver it to the recipients, for example via a WebSocket, and also to write it to the database so that it is stored somewhere. This code is not in the example, but it is important for general understanding.

We intend to develop a server API that produces different results depending on several input parameters. Where necessary it will access the real MySQL database, and where necessary it will access the cache in Redis, simulating WHERE-like queries as in SQL. At the same time we will avoid brute force, whose complexity is O(n), which we really don't want.

Fortunately, Redis has an additional module for this – RediSearch. Originally it was intended for full-text search. But first things first.

Go…

At the end of the article there is an example of using transactions in Redis for batch updates. This is important for understanding the whole picture. The code used in the article to obtain data is available on GitHub.

A caveat: I am not trying to argue that it is good to use a NoSQL cache in an SQL, relational way without thinking about the data model that actually suits each case, or that this can be done fully. After all, why mix sour with sweet? But sometimes part of the SQL logic of the data model does need to be repeated in NoSQL, for convenience and for business efficiency, and the case described at the very beginning may require exactly that.

Here and below I will use JS pseudo-code syntax: everyone knows JS, and this code can easily be rewritten in any other language.

When creating a message on the server, we will write code to add a record to Redis.

Next is the pseudocode, where {value} is a placeholder for the value of the corresponding HSET field in Redis.

redisClient.hSet('messages:'+ {id}, {
  id: {id},
  chatId: {chatId},
  message: {message},
  userId: {userId},
  createdAt: {createdAtDt},
  data: {dataJson},
  read: 0
});

Each call creates a hash record via HSET, where the key 'messages:'+{id} contains the message id. The id can be the id of the message created in the database, because it is already available before the message is written to Redis.

The field values in the hash duplicate the values of the corresponding columns in the messages table in the SQL database.

Most of the fields in the object itself should be self-explanatory, but I'll explain the least obvious ones.

read is a boolean flag that indicates whether the message has already been read. By default, it is assumed that no one has read the message yet.
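A minimal sketch of how the placeholder values above might be turned into a flat hash, assuming a plain message object. The helper name toRedisHash and the sample message are hypothetical; the field names follow the hSet call above. Redis hashes hold flat values, so in this sketch the date becomes a numeric timestamp (my assumption), the data object becomes a JSON string, and the read flag is 0.

```javascript
// Hypothetical helper: converts a message object into a Redis key plus flat hash fields.
function toRedisHash(message) {
  return {
    key: 'messages:' + message.id,
    fields: {
      id: String(message.id),
      chatId: String(message.chatId),
      message: message.message,
      userId: String(message.userId),
      // Assumption: the date is stored as a Unix timestamp in milliseconds.
      createdAt: String(message.createdAt.getTime()),
      // Nested data cannot live in a hash field directly, so it is serialized.
      data: JSON.stringify(message.data),
      // A new message has not been read by anyone yet.
      read: '0'
    }
  };
}

const sample = {
  id: 42,
  chatId: 1,
  message: 'Hello',
  userId: 7,
  createdAt: new Date(0),
  data: { attachments: [] }
};

const { key, fields } = toRedisHash(sample);
console.log(key, fields);
// With a connected node-redis client this could be written as:
// await redisClient.hSet(key, fields);
```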

Next, we will write a function for preparing a universal search filter for both SQL and Redis. The data object that this function returns can be used as a template for a Where query for an SQL database and for preparing a Where query twin for Redis.

static prepareSearchCriteria = (id, limit, onlyOwn, userId, currentUser) => {
  ...
  let searchCriteria;
  let whereCriteria = {
    chatId: id  
  };
   
  if (onlyOwn) {      
    whereCriteria['userId'] = currentUser.id;
  }
  else if (userId !== null && userId !== undefined) {         
    whereCriteria['userId'] = userId;
  }
        
  searchCriteria = {
    where: whereCriteria,
    limit: limit, 
    order: [['createdAt', 'DESC']]  
  } 
  return searchCriteria;
}

I'll explain the non-obvious parts.

Line 8. If onlyOwn is true, it takes priority over the other selection parameters: only the current user's own messages are retrieved, and a passed userId is ignored.

Line 15. We form the searchCriteria object, from which the queries for both SQL and Redis are built.
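To see the priority rule in action, here is a standalone sketch of the same logic, written as a plain function rather than a static class member so that it can be run on its own. The sample arguments are made up for illustration.

```javascript
// Standalone variant of prepareSearchCriteria for experimentation.
function prepareSearchCriteria(id, limit, onlyOwn, userId, currentUser) {
  const where = { chatId: id };
  if (onlyOwn) {
    // onlyOwn wins: an explicitly passed userId is ignored.
    where.userId = currentUser.id;
  } else if (userId !== null && userId !== undefined) {
    where.userId = userId;
  }
  return { where, limit, order: [['createdAt', 'DESC']] };
}

const currentUser = { id: 3 };

// onlyOwn = true overrides userId = 7.
const own = prepareSearchCriteria(1, 5, true, 7, currentUser);
// onlyOwn = false: the requested userId is used.
const other = prepareSearchCriteria(1, 5, false, 7, currentUser);
// Neither the flag nor a userId: all messages of the chat.
const all = prepareSearchCriteria(1, 5, false, undefined, currentUser);

console.log(own.where, other.where, all.where);
```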

Before we get to retrieving messages, I highly recommend getting at least superficially familiar with RediSearch if you are not already – Here and Here.

Looking ahead: why RediSearch and not another module like RedisJSON? RediSearch can do full-text search, and it can also process JSON structures.

Retrieving data through RediSearch is noticeably faster than using the Redis SCAN command, let alone KEYS. The comparison graph with Redis SCAN from the examples above can be seen here. One search with SCAN took 10 seconds, while RediSearch took 40 ms.

Data is retrieved by index, so highly optimized output is possible and there is no O(n) :) With SCAN and a match pattern, on the other hand, for certain data structures (including our case) there is no guarantee that brute force will be avoided.

Now let's configure RediSearch so that it can handle the request we generated above. First, you need to create an index in RediSearch over the required fields and their types using the FT.CREATE command.

const { SchemaFieldTypes } = require('redis');

static async initialize() {    
  await redisClient.connect();
  try {
    await redisClient.ft.create('idx:messages', { 
      chatId: SchemaFieldTypes.NUMERIC,
      userId: SchemaFieldTypes.NUMERIC,
      message: SchemaFieldTypes.TEXT,
      ...
    },
    {
      ON: 'HASH',
      PREFIX: 'messages'
    });
  } catch (e) {
    if (e.message === 'Index already exists') {
      console.log('Index exists already, skipped creation.');
    } else {
      // Something went wrong, perhaps RediSearch isn't installed...
      console.error(e);
      process.exit(1);
    }
  }
}

The JS command redisClient.ft.create calls the FT.CREATE command from Redis, and the redisClient.ft.search command calls the FT.SEARCH command, respectively.
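For readers who want to see what reaches Redis, the sketch below builds the argument list of a raw FT.CREATE command from a schema description. The helper buildFtCreateArgs is hypothetical, and the createdAt field with SORTABLE is my assumption (the schema above is partly elided); a field used in SORTBY is normally declared SORTABLE.

```javascript
// Hypothetical helper: builds the arguments of a raw FT.CREATE command.
function buildFtCreateArgs(indexName, prefix, schema) {
  const args = ['FT.CREATE', indexName, 'ON', 'HASH', 'PREFIX', '1', prefix, 'SCHEMA'];
  for (const [field, definition] of Object.entries(schema)) {
    args.push(field, definition.type);
    if (definition.sortable) {
      args.push('SORTABLE');
    }
  }
  return args;
}

const args = buildFtCreateArgs('idx:messages', 'messages:', {
  chatId: { type: 'NUMERIC' },
  userId: { type: 'NUMERIC' },
  message: { type: 'TEXT' },
  // Assumption: createdAt is indexed as a sortable number for SORTBY.
  createdAt: { type: 'NUMERIC', sortable: true }
});

console.log(args.join(' '));
// With node-redis the same list could be sent via redisClient.sendCommand(args).
```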

Let's move on directly to receiving messages from chat rooms.

Below is a code branch that shows how close, almost identical, the retrieval is for the SQL database and for Redis.

Let's assume that the messages() function is tied to a REST request:

POST /chat/{id}/messages

Body
{
    limit: { message limit – optional },
    onlyOwn: { boolean flag: fetch only the current user's messages – if true, it takes priority in the selection over the other optional fields },
    userId: { id of the user whose messages to fetch – optional }
}

This is a POST rather than a GET because the request can change the state of the data in the database and in Redis; we will see this later.

The example shows a top-level function for receiving messages, which prepares the request for either the SQL database or Redis. Let's skip the details; you can see them in the example on GitHub. I'll just say that in the example Redis is used for all roles except one specific type.
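A sketch of how the optional body fields might be normalized before they are used to build the search criteria. The function parseMessagesBody and the default limit of 50 are assumptions, not part of the original example.

```javascript
// Assumption: a default page size used when the client sends no limit.
const DEFAULT_LIMIT = 50;

// Hypothetical helper: normalizes the optional fields of the request body.
function parseMessagesBody(body = {}) {
  const limit = Number.isInteger(body.limit) && body.limit > 0 ? body.limit : DEFAULT_LIMIT;
  // Only a strict boolean true enables the flag.
  const onlyOwn = body.onlyOwn === true;
  // userId stays undefined when it is absent, so no user filter is applied.
  const userId = body.userId === undefined || body.userId === null ? undefined : Number(body.userId);
  return { limit, onlyOwn, userId };
}

console.log(parseMessagesBody({ limit: 5, userId: '7' }));
console.log(parseMessagesBody({ onlyOwn: true }));
```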

static messages = async (req, res) => {
  ...  
  let searchCriteria = ChatController.prepareSearchCriteria(id, limit, onlyOwn, userId, currentUser);
  ...
    let messages;
    if (currentUser.typeId == RolesEnum.API) { 
        //code for fetching from SQL DB
        ...
        messages = ...;
    }
    else {
        //code for fetching from  Redis
        ...
        messages = ...;
    }
   ....
}

Now let's look at the Redis branch; its functionality is more interesting.

static handleRedisReadMessages = async (searchCriteria, currentUser) => {
  let whereCriteria = searchCriteria.where;
  let redisArrParams = [];
  let redisStrParams = "";
  redisArrParams = ChatController.prepareRedisSearchParams(whereCriteria);
  redisStrParams = redisArrParams.join(" ");
  let respMessages = await redisClient.ft.search('idx:messages', redisStrParams, {
    LIMIT: {
      from: 0,
      size: searchCriteria.limit
    },
    SORTBY: {
      BY: searchCriteria.order[0][0],
      DIRECTION: searchCriteria.order[0][1]
    } 
  });
  let filteredMessages = ChatController.filterAndMapMessages(respMessages);

  //transaction part
  let importMulti = redisClient.multi();
  let shouldRedisUpdate = ChatController.isShouldUpdateMessagesInTransaction(respMessages, importMulti, currentUser);
  ChatController.execTransactionMessagesUpdate(shouldRedisUpdate, importMulti);
  
  return filteredMessages; 
}

Look at line 5: ChatController.prepareRedisSearchParams(whereCriteria) generates the Redis query from the universal criteria object, which suits both SQL and Redis. The prepareRedisSearchParams function is shown below.

Line 6. redisStrParams = redisArrParams.join(" ") – we glue the resulting parts together into a query string for Redis.

Line 7. We call ft.search, passing it the query string built from the universal criteria, and at the same time set the sorting; in this example it is by the createdAt field.

The ChatController.isShouldUpdateMessagesInTransaction and ChatController.execTransactionMessagesUpdate methods relate to working with transactions; we will talk about them separately later.
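To make the search call less opaque, this sketch assembles the raw FT.SEARCH command that corresponds to the options above. The helper buildFtSearchArgs is hypothetical; the criteria object has the shape returned by prepareSearchCriteria.

```javascript
// Hypothetical helper: builds the raw FT.SEARCH arguments for a criteria object.
function buildFtSearchArgs(indexName, query, searchCriteria) {
  const [sortField, sortDirection] = searchCriteria.order[0];
  return [
    'FT.SEARCH', indexName, query,
    'SORTBY', sortField, sortDirection,
    // LIMIT takes an offset and a number of results.
    'LIMIT', '0', String(searchCriteria.limit)
  ];
}

const searchCriteria = {
  where: { chatId: 1, userId: 7 },
  limit: 5,
  order: [['createdAt', 'DESC']]
};

const searchArgs = buildFtSearchArgs('idx:messages', '@chatId: [1 1] @userId: [7 7]', searchCriteria);
console.log(searchArgs);
```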

Let's look directly at preparing the parameters for searching in Redis:

static prepareRedisSearchParams = (whereCriteria) => {
        // Each criterion becomes a RediSearch numeric filter clause.
        return Object.entries(whereCriteria).map(([key, value]) => {
            let resParam = null;
            if (typeof value === "boolean") {
                // Booleans are stored as 0 or 1 in the hash.
                let bval = value ? 1 : 0;
                resParam = `@${key}: [${bval} ${bval}]`;
            }
            else {
                resParam = `@${key}: [${value} ${value}]`;
            }

            return resParam;
        });
}

Line 1. whereCriteria is the universal object with search parameters that we have already seen.

Object.entries(whereCriteria).map(([key, value])

We go through the search parameters one by one and turn each into a string of the form @{key}: [{value} {value}].

Writing value twice is not a typo. In the query syntax a numeric filter is a range [min max], so an exact match is a range whose two bounds are equal.

A separate branch handles boolean values. If the request contained other types that require special processing, branches for them would have to be written as well.
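As an illustration of what one more branch could look like, the sketch below adds support for a range value. The { min, max } object shape and the function name toRedisClause are assumptions; the -inf and +inf markers for open bounds are part of the RediSearch numeric filter syntax.

```javascript
// Hypothetical extension: converts one where-criterion into a RediSearch clause.
function toRedisClause(key, value) {
  if (typeof value === 'boolean') {
    // Booleans are stored as 0 or 1.
    const flag = value ? 1 : 0;
    return `@${key}: [${flag} ${flag}]`;
  }
  if (value !== null && typeof value === 'object') {
    // Assumed shape { min, max }; a missing bound means "unbounded".
    const min = value.min ?? '-inf';
    const max = value.max ?? '+inf';
    return `@${key}: [${min} ${max}]`;
  }
  // Exact match: a range whose bounds coincide.
  return `@${key}: [${value} ${value}]`;
}

const clauses = Object.entries({
  chatId: 1,
  read: false,
  createdAt: { min: 1700000000000 }
}).map(([key, value]) => toRedisClause(key, value));

console.log(clauses.join(' '));
```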

The filterAndMapMessages call maps the returned documents into a form that is convenient to show to the user.
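The article does not show filterAndMapMessages itself, so as a rough idea of what such a mapping might look like, here is a sketch. It assumes the reply shape { total, documents: [{ id, value }] } that the code above already relies on, and that every hash value comes back as a string. The function name mapMessages and the output field names are my choice.

```javascript
// Hypothetical mapping of an FT.SEARCH reply into plain message objects.
function mapMessages(reply) {
  return reply.documents.map(({ value }) => ({
    id: Number(value.id),
    chatId: Number(value.chatId),
    userId: Number(value.userId),
    message: value.message,
    // Hash values are strings, so "0" and "1" are converted explicitly.
    read: value.read === '1'
  }));
}

const reply = {
  total: 1,
  documents: [
    { id: 'messages:42', value: { id: '42', chatId: '1', userId: '7', message: 'Hello', read: '0' } }
  ]
};

console.log(mapMessages(reply));
```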

An example of preparing a request for RediSearch.

When the function static messages = async (req, res) => {} was called, the following set of parameters was passed.

limit = 5, onlyOwn = false, userId = 7, chatId = 1.

Here userId is the id of the user whose messages are requested, and chatId is the chat id.

Result of the function prepareSearchCriteria: searchCriteria = {"where":{"chatId":"1","userId":7},"limit":5,"order":[["createdAt","DESC"]]}

Result of preparing the query string for RediSearch: redisStrParams = `@chatId: [1 1] @userId: [7 7]`
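The worked example can be reproduced with a standalone copy of prepareRedisSearchParams (a plain function here rather than a static member, so that it runs outside the controller class):

```javascript
// Standalone copy of prepareRedisSearchParams for experimentation.
function prepareRedisSearchParams(whereCriteria) {
  return Object.entries(whereCriteria).map(([key, value]) => {
    if (typeof value === 'boolean') {
      const bval = value ? 1 : 0;
      return `@${key}: [${bval} ${bval}]`;
    }
    return `@${key}: [${value} ${value}]`;
  });
}

// The where part of searchCriteria from the example above.
const whereCriteria = { chatId: '1', userId: 7 };
const redisStrParams = prepareRedisSearchParams(whereCriteria).join(' ');

console.log(redisStrParams); // @chatId: [1 1] @userId: [7 7]
```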

Transactions in Redis

It's also worth saying a few words about transactions in Redis.

For example, after a request to receive messages, we might also want to record which user fetched each message first, adding a separate column for this to the messages table and calling it, for example, firstFetchedByUserId. For simplicity, this parameter was not introduced; instead a column was added that records whether the message has been read or not – read. Although this is a poor, limited approach, it gives a semblance of working with relational data :).

For the SQL case this is easy to do: with the Sequelize ORM, run a mass update by criterion, using the ids of the messages that have already been read.

await Message.update({
    read: true
}, {
    where: {
        id: idsRead,
    },
});

In the case of Redis, such an update is also easy to do. But the first thing a Google search suggests for this is Lua scripts. One could think of them as distant relatives of the stored procedures and functions of an SQL database. However, this solution is a poor fit for setups with replication or with more than one database node, so it is not suitable here.

This is where Redis transactions come to the rescue.

First, we determine whether any messages need to be updated: this is the case if at least one unread message was fetched by a user other than the one who created it.

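static isShouldUpdateMessagesInTransaction = (respMessages, importMulti, currentUser) => {
        let isUpdate = false;
        respMessages.documents.forEach(mes => {
            // Only messages written by someone else can be marked as read.
            if ( parseInt(mes.value.userId, 10) != currentUser.id )  {
                // Hash values come back as strings, and "0" is truthy,
                // so the flag is compared as a number.
                if (parseInt(mes.value.read, 10) !== 1) {
                    // Queue the update; nothing is sent until exec().
                    importMulti.hSet(mes.id, {
                        "read": 1,
                    });
                    isUpdate = true;
                }
            }
        });
        return isUpdate;
}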

Next, we simply perform a transaction where the records are updated in bulk.

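static execTransactionMessagesUpdate = async (redisUpdate, importMulti) => {
        if (!redisUpdate) {
            return;
        }
        try {
            // exec() sends the queued HSET commands as one MULTI/EXEC transaction.
            const results = await importMulti.exec();
            console.log(results);
        } catch (err) {
            // The caller does not await this function, so log instead of rethrowing.
            console.error(err);
        }
}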
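To try the queue-then-execute flow without a Redis server, the sketch below uses a tiny in-memory stand-in for the object returned by redisClient.multi(). FakeMulti and queueReadUpdates are hypothetical names; only hSet and exec are imitated, whereas a real transaction would run the queued commands on the server.

```javascript
// Hypothetical stand-in for redisClient.multi(): it only records queued commands.
class FakeMulti {
  constructor() {
    this.queue = [];
  }
  hSet(key, fields) {
    this.queue.push({ key, fields });
    return this;
  }
  async exec() {
    // A real MULTI/EXEC would apply every queued command in one transaction;
    // here we just report which keys would have been updated.
    return this.queue.map((command) => command.key);
  }
}

// Queues a read update for every unread message written by someone else.
function queueReadUpdates(documents, multi, currentUser) {
  let queued = false;
  for (const doc of documents) {
    const isForeign = parseInt(doc.value.userId, 10) !== currentUser.id;
    const isUnread = doc.value.read !== '1';
    if (isForeign && isUnread) {
      multi.hSet(doc.id, { read: 1 });
      queued = true;
    }
  }
  return queued;
}

const documents = [
  { id: 'messages:1', value: { userId: '7', read: '0' } }, // foreign, unread: queued
  { id: 'messages:2', value: { userId: '3', read: '0' } }, // own message: skipped
  { id: 'messages:3', value: { userId: '7', read: '1' } }  // already read: skipped
];

const multi = new FakeMulti();
const shouldUpdate = queueReadUpdates(documents, multi, { id: 3 });
if (shouldUpdate) {
  multi.exec().then((results) => console.log(results));
}
```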

Conclusion.

This is the main thing I wanted to show. Perhaps the attentive reader will notice that there are other modules too, like RedisJSON, but I explained above why I didn't use it for this kind of task.

RediSearch also has plenty of other commands, such as FT.AGGREGATE.

The purpose of the article is to show how powerful the modules are, especially RediSearch, and that you can get a visible benefit from using them. For example, with the right approach you can significantly save machine resources, which is what was done in a specific business task. The resource savings were significant, but even more significant were the customer's savings in nerve cells compared to the traditional method of obtaining data from Redis (without using modules).

The official list of modules with brief descriptions can be found on the Redis website – Here.
