Can you really not do without Kafka and RabbitMQ when you receive 10,000 events per second?

I once hosted a webinar on how to receive 10,000 events per second. I showed this picture, the audience saw the lilac layer, and it began: “Guys, why do we need all these Kafkas and Rabbits, can we really not do without them?” We answered: “Why, why, to pass the job interview!”

Very funny, but let me explain it anyway.

We could accept events directly in the green area and have our applications write them straight to ClickHouse.

But ClickHouse prefers to receive messages in batches

In other words, it is better to push a million messages into it at once than to write them one at a time. Kafka, RabbitMQ, or Yandex.Q acts as a buffer that lets us control the incoming load.

Here is how it goes: 10 thousand events arrive in one second, a thousand in the next, 50 thousand in the one after that. This is normal: users open their mobile apps at random times. Without a buffer, ClickHouse receives 2 thousand messages one moment and 10 thousand the next. With a buffer, you can accumulate messages, then take a million events out of this piggy bank and send them to ClickHouse in one go. And there it is: the stable load on your cluster that you wanted.
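The buffering idea can be sketched in a few lines of Go. This is only an illustration, not the author's code: `Batcher`, `NewBatcher`, and the batch size of 1000 are hypothetical, and the flush callback stands in for a single bulk write to ClickHouse.

```go
package main

import "fmt"

// Batcher accumulates events and hands them to flush in fixed-size batches.
// It is a hypothetical helper, not part of the original application.
type Batcher struct {
	size  int
	buf   []string
	flush func(batch []string)
}

func NewBatcher(size int, flush func([]string)) *Batcher {
	return &Batcher{size: size, buf: make([]string, 0, size), flush: flush}
}

// Add buffers one event and flushes once the batch is full.
func (b *Batcher) Add(event string) {
	b.buf = append(b.buf, event)
	if len(b.buf) >= b.size {
		b.Flush()
	}
}

// Flush sends whatever is buffered, even if the batch is not full.
func (b *Batcher) Flush() {
	if len(b.buf) == 0 {
		return
	}
	batch := make([]string, len(b.buf))
	copy(batch, b.buf)
	b.buf = b.buf[:0]
	b.flush(batch)
}

func main() {
	var writes []int
	b := NewBatcher(1000, func(batch []string) {
		writes = append(writes, len(batch)) // stand-in for one bulk write
	})
	// Uneven traffic: 10,000 events, then 1,000, then 50,000.
	for _, burst := range []int{10000, 1000, 50000} {
		for i := 0; i < burst; i++ {
			b.Add("event")
		}
	}
	b.Flush()
	fmt.Println("bulk writes:", len(writes))
}
```

However uneven the bursts are, the database only ever sees batches of the same size. A production buffer would probably also flush on a timer so that a quiet period does not leave events stuck in memory.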

This is all a story about queues

– First, queues can be used to pass messages between different services.

For example, for background tasks. You go to the store's admin panel and generate a sales report for the year. The task is time-consuming: millions of rows have to be read from the database, which is tedious and slow. If the client sits with an open HTTP connection for 5 or 10 minutes, the connection may be cut off and the user will never receive the file.

It makes sense to perform this task asynchronously in the background. The user clicks the “generate a report” button and is told: “All good, the report is being generated and will arrive in your mailbox within an hour.” The task goes to the message queue and then to the “desk” of a worker, which executes it and e-mails the result to the user.

– The second case is a set of microservices that communicate via a bus.

For example, one service receives events from users and puts them into a queue. The next service pulls the events and normalizes them, for example by checking that they contain a valid e-mail address or phone number. If all is well, it passes the message on to the next queue, from which the data is written to the database.

– Another case is the failure of the data center that hosts the database.

That's it, nothing works. What will happen to the messages? If you write without a buffer, they will be lost: ClickHouse is unavailable and the clients have dropped off. Keeping messages in memory helps to some extent, but a buffer is safer: you simply write the messages to a queue (for example, to Kafka), and they are stored there until the disk runs out of space or until they are read and processed.
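The normalizing service in the middle might look roughly like this. It is a sketch under assumptions: the `Event` type and the `normalize` and `stage` functions are invented for the example, channels stand in for the two queues, and only the e-mail check is shown.

```go
package main

import (
	"fmt"
	"net/mail"
	"strings"
)

// Event is a hypothetical message shape; the real payload is not shown in the article.
type Event struct {
	Email string
}

// normalize trims and lowercases the address and reports whether it is a
// plain, syntactically valid e-mail address.
func normalize(e Event) (Event, bool) {
	e.Email = strings.ToLower(strings.TrimSpace(e.Email))
	addr, err := mail.ParseAddress(e.Email)
	if err != nil || addr.Address != e.Email {
		return e, false
	}
	return e, true
}

// stage reads from the incoming queue and forwards only valid events
// to the next queue, closing it when the input is exhausted.
func stage(in <-chan Event, out chan<- Event) {
	for e := range in {
		if n, ok := normalize(e); ok {
			out <- n
		}
	}
	close(out)
}

func main() {
	in := make(chan Event, 3)
	out := make(chan Event, 3)
	in <- Event{Email: " User@Example.com "}
	in <- Event{Email: "not-an-email"}
	close(in)

	go stage(in, out)
	for e := range out {
		fmt.Println("to database queue:", e.Email)
	}
}
```

Because each stage only knows about its input and output queue, any of them can be scaled or replaced without touching the others.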

How to automatically add new virtual machines when the load increases

To test this under load, I wrote an app and tried out auto-scaling instance groups.

We create an instance group. We give it a name and specify the service account that will be used to create virtual machines.

resource "yandex_compute_instance_group" "events-api-ig" {
  name               = "events-api-ig"
  service_account_id =

Then we specify the virtual machine template: CPU, memory, disk size, and so on.

  instance_template {
    platform_id = "standard-v2"
    resources {
      memory = 2
      cores  = 2
    }
    boot_disk {
      mode = "READ_WRITE"
      initialize_params {
        image_id =
        size     = 10
      }
    }

We specify which network and subnets to attach it to.

    network_interface {
      network_id =
      subnet_ids = [,,]
      nat        = true
    }
  }

The most interesting thing is the scale_policy.

You can set a fixed-size group of three instances, one in each of zones A, B, and C.

  scale_policy {
    fixed_scale {
      size = 3
    }
  }

  allocation_policy {
    zones = ["ru-central1-a", "ru-central1-b", "ru-central1-c"]
  }

Or use auto_scale: then the group scales automatically depending on the load and the parameters.

  scale_policy {
    auto_scale {
      initial_size           = 3
      measurement_duration   = 60
      cpu_utilization_target = 60
      min_zone_size          = 1
      max_size               = 6
      warmup_duration        = 60
      stabilization_duration = 180
    }
  }

The main parameter to pay attention to is cpu_utilization_target. It sets the CPU utilization threshold beyond which Yandex.Cloud automatically creates a new virtual machine for us.
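To get a feel for how a CPU target translates into a group size, here is a simplified proportional rule in Go. It is an assumption made purely for illustration and is not Yandex.Cloud's actual scaling algorithm; `desiredSize` and its parameters are hypothetical, with the limits taken from the configuration above.

```go
package main

import (
	"fmt"
	"math"
)

// desiredSize is a simplified proportional rule, NOT the cloud's real algorithm:
// grow or shrink the group so that average CPU lands near the target,
// then clamp the result to the configured limits.
func desiredSize(current int, avgCPU, target float64, minSize, maxSize int) int {
	n := int(math.Ceil(float64(current) * avgCPU / target))
	if n < minSize {
		n = minSize
	}
	if n > maxSize {
		n = maxSize
	}
	return n
}

func main() {
	// 3 instances at 90% average CPU with a 60% target, limits 3..6.
	fmt.Println(desiredSize(3, 90, 60, 3, 6))
}
```

In this model, three instances averaging 90% CPU against a 60% target call for five instances, and max_size caps the group at six no matter how high the load goes.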

Now let’s test auto-scaling with increasing load

Our application accepts various events, validates the JSON, and forwards them to Kafka.

There is a load balancer in front of our instance group. It accepts all requests that come to the address on port 80, and forwards them to our instance group – on port 8080.

I have a virtual machine that generates the test load using Yandex.Tank. For the test, I configured 20k requests over 5 minutes.

So, the load is on.

At first, the existing nodes in all three zones (A, B, and C) take the load, but once it exceeds the threshold, Yandex.Cloud should deploy additional instances.

The logs show that the load increased and the number of nodes in each zone grew to two. Another machine was added to the balancer, and the number of instances increased everywhere.

There was one interesting detail. An instance in zone C took 23 milliseconds to write data (measured from receiving it to writing it), while an instance in zone A took 12.8 milliseconds. This is because of where Kafka is located: it sits in zone A, so writes from that zone reach it faster.

It is better not to put all Kafka instances in one zone.

When another machine was added, the load per node subsided and the CPU indicator returned to normal. Full analytics on the test run can be viewed at the link:

How to write an application to work with queues and buffers

The app is written in Go. First, we import the standard library packages.

package main

import (


Then we import a library for working with Kafka.

We import the Prometheus client so that metrics are passed to the Metrics API.

We also import a library for working with RabbitMQ.

Next come the parameters of the backends we will write to.

var (
    // Config options
    addr        = flag.String("addr", ":8080", "TCP address to listen to")
    kafka       = flag.String("kafka", "", "Kafka endpoints")
    enableKafka = flag.Bool("enable-kafka", false, "Enable Kafka or not")
    amqp        = flag.String("amqp", "amqp://guest:guest@", "AMQP URI")
    enableAmqp  = flag.Bool("enable-amqp", false, "Enable AMQP or not")
    sqsUri      = flag.String("sqs-uri", "", "SQS URI")
    sqsId       = flag.String("sqs-id", "", "SQS Access id")
    sqsSecret   = flag.String("sqs-secret", "", "SQS Secret key")
    enableSqs   = flag.Bool("enable-sqs", false, "Enable SQS or not")

    // Declaring prometheus metrics
    apiDurations = prometheus.NewSummary(
        prometheus.SummaryOpts{
            Name:       "api_durations_seconds",
            Help:       "API duration seconds",
            Objectives: map[float64]float64{0.5: 0.05, 0.9: 0.01, 0.99: 0.001},
        })
)

kafka is the Kafka address (a string).

enable-kafka turns Kafka on or off, since the application can write to several different backends.

The application can work with three queues.

The first is Kafka.

The second is AMQP for RabbitMQ.

And the third is SQS for Yandex.Q.

Next, we declare the shared global variables for working with our backends and register the Prometheus metrics for display and visualization.

In main, we initialize Kafka and RabbitMQ and create a queue called Load.

And if SQS is enabled, we create a client for Yandex.Q.

Next, our application exposes several HTTP endpoints:

/status simply returns OK; this is the signal to the load balancer that our application is running.

If you send a request to /post/kafka, your JSON is forwarded to Kafka. /post/amqp and /post/sqs work the same way.

How Kafka works

Kafka is a simple, undemanding, and very effective tool. If you need to quickly receive and store many messages, Kafka is at your service.

Once, on one of our projects, it was important to stay within a small budget. Just imagine: we took the cheapest machines without SSDs (Kafka writes and reads sequentially, so there is no need to spend money on expensive disks) and installed Kafka and ZooKeeper. Our modest three-node setup easily withstood a load of 200 thousand messages per second! Kafka is a “set it and forget it” tool: in a couple of years of operation the cluster never once bothered us. And it cost 120 euros a month.

The only thing to remember is that Kafka is very demanding on the CPU and really does not like it when a neighbor eats up CPU time. If something else is competing for the processor next to it, Kafka starts to slow down.

Kafka is organized like this: you have a topic, which you can think of as the name of the queue. Each topic is split into several partitions (for example, 50). These partitions are hosted on different servers.

As you can see in the diagram, the topic Load is divided into 3 partitions. Partition 1 is on Kafka 1, partition 2 is on Kafka 2, and partition 3 is on Kafka 3. This spreads the load across all brokers. When the cluster starts accepting the load, messages are written into one topic, and Kafka scatters them across the partitions in round-robin fashion. As a result, all nodes are loaded evenly.

You can go all out and split the topic into 50 partitions, set up 50 servers, and place 1 partition on each server: the load will be distributed over 50 nodes. And this is very cool.

Partitions can be replicated thanks to ZooKeeper. For fault tolerance, ZooKeeper should run on at least 3 nodes. For example, you want each partition to be replicated to 2 nodes. Set the replication factor to 2, and each partition will be stored on 2 different hosts. And if a node goes down, Kafka will notice it thanks to ZooKeeper: “aha, the first node is down”, and Kafka 2 will take over the first partition.
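The round-robin distribution described above fits in a dozen lines. This is a toy model written for illustration, not Kafka's partitioner code; the `roundRobin` type is hypothetical.

```go
package main

import "fmt"

// roundRobin hands out partition numbers 0..partitions-1 in a cycle.
type roundRobin struct {
	partitions int
	next       int
}

// pick returns the partition for the next message.
func (r *roundRobin) pick() int {
	p := r.next
	r.next = (r.next + 1) % r.partitions
	return p
}

func main() {
	rr := &roundRobin{partitions: 3}
	counts := make([]int, 3)
	for i := 0; i < 9000; i++ {
		counts[rr.pick()]++
	}
	fmt.Println(counts) // prints [3000 3000 3000]
}
```

Every partition, and therefore every server that hosts one, ends up with the same share of messages.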
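Replication and failover can be modeled roughly as follows. The sketch makes simplifying assumptions: replicas are placed on consecutive brokers rather than by Kafka's real placement logic, brokers are numbered from 0 (so broker 0 is Kafka 1), and `assignReplicas` and `leader` are hypothetical helpers.

```go
package main

import "fmt"

// assignReplicas places each partition on rf consecutive brokers.
// This is a simplification; it assumes rf <= brokers.
func assignReplicas(partitions, brokers, rf int) [][]int {
	out := make([][]int, partitions)
	for p := 0; p < partitions; p++ {
		for r := 0; r < rf; r++ {
			out[p] = append(out[p], (p+r)%brokers)
		}
	}
	return out
}

// leader returns the first replica that is still alive, or -1 if none is.
func leader(replicas []int, down map[int]bool) int {
	for _, b := range replicas {
		if !down[b] {
			return b
		}
	}
	return -1
}

func main() {
	replicas := assignReplicas(3, 3, 2)
	fmt.Println("placement:", replicas)

	// Broker 0 (Kafka 1) goes down; the first partition moves to its other replica.
	down := map[int]bool{0: true}
	fmt.Println("partition 0 is now served by broker", leader(replicas[0], down))
}
```

With a replication factor of 2, losing one node leaves every partition with a live copy, which is why the failover in the text works.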

How I deployed Kafka with Terraform

In the repository we have a Terraform file called

First, we bring up 3 ZooKeeper nodes: resource "yandex_compute_instance" "zookeeper" with count = 3.

Then comes “zookeeper_deploy”, which deploys our ZooKeeper. It is best to run ZooKeeper on separate machines with nothing else on them. Next, we collect the node IDs and generate the file. Then we launch Ansible to configure ZooKeeper.

We bring up Kafka the same way as ZooKeeper and, importantly, after it.

How RabbitMQ works

Because Kafka essentially just saves messages to disk and, on request, serves them to the client from the right offset, it is very, very fast. RabbitMQ's performance is much lower, but it is packed to the brim with features! RabbitMQ tries to do a lot, which naturally costs resources.

RabbitMQ is not so simple: here you have exchanges with routing and a bunch of plugins for delayed messages, dead-letter queues, and other things. RabbitMQ tracks the messages itself. As soon as the consumer confirms that a message has been processed, the message is deleted. If the consumer drops off midway, RabbitMQ returns the message to the queue. All in all, it is a good all-in-one machine when you need to pass messages between services. The price of this is performance.

RabbitMQ does almost everything internally. It has many built-in tools, supports various plugins, and offers many settings for working with messages and queues.

If you need to pass a small volume of messages between services, your choice is definitely RabbitMQ. If you need to quickly store a lot of events (client metrics, logs, analytics, and so on), your choice is Kafka. You can read more about comparing the two tools in my article

And one more thing: RabbitMQ does not need ZooKeeper.
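The acknowledge-or-requeue behavior can be shown with a toy in-memory queue. This is a conceptual model only and not the RabbitMQ client API; `ackQueue` and all its methods are hypothetical.

```go
package main

import "fmt"

// ackQueue is an in-memory toy model of acknowledge-or-requeue delivery.
type ackQueue struct {
	ready   []string       // messages waiting for a consumer
	unacked map[int]string // delivered but not yet confirmed
	nextTag int
}

func newAckQueue() *ackQueue { return &ackQueue{unacked: map[int]string{}} }

func (q *ackQueue) Publish(msg string) { q.ready = append(q.ready, msg) }

// Get delivers the next message and remembers it until it is acked.
func (q *ackQueue) Get() (tag int, msg string, ok bool) {
	if len(q.ready) == 0 {
		return 0, "", false
	}
	msg, q.ready = q.ready[0], q.ready[1:]
	q.nextTag++
	q.unacked[q.nextTag] = msg
	return q.nextTag, msg, true
}

// Ack confirms processing; the message is deleted for good.
func (q *ackQueue) Ack(tag int) { delete(q.unacked, tag) }

// ConsumerLost models a consumer dropping off before acking:
// its message goes back to the front of the queue.
func (q *ackQueue) ConsumerLost(tag int) {
	if msg, ok := q.unacked[tag]; ok {
		delete(q.unacked, tag)
		q.ready = append([]string{msg}, q.ready...)
	}
}

func main() {
	q := newAckQueue()
	q.Publish("order-1")

	tag, _, _ := q.Get()
	q.ConsumerLost(tag) // the first consumer died before confirming

	tag, msg, _ := q.Get() // a second consumer receives the same message
	q.Ack(tag)
	fmt.Println(msg, len(q.ready), len(q.unacked))
}
```

The bookkeeping of unconfirmed messages is exactly the extra work that a log-style store like Kafka does not do, which is part of why the two differ so much in throughput.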
