Automatic scaling of Symfony consumers in Kubernetes [Practical Guide]

We at Debricked have been using Symfony on our web server for quite a while now. It has served us very well all this time, and when the Symfony developers announced the Messenger component in Symfony 4.1, we were eager to try it out. Since then, we have been using this component to send emails through an asynchronous queue.

Recently, however, we needed to move the handling of the events we receive from our GitHub integration off our web server and into a separate microservice to improve performance. We decided to use the producer/consumer pattern that the Messenger component provides, since it allows us to push incoming events onto a queue asynchronously and then immediately acknowledge their receipt to GitHub.

However, compared to sending emails, some GitHub events can take a long time to process. We also have no control over when these events occur, so the load can be completely unpredictable and irregular. We needed a solution that allowed our consumers to scale automatically.

Kubernetes Autoscaling to the rescue

Since most of our infrastructure was already deployed in Kubernetes on Google Cloud, it made sense to try it for our consumers as well. Kubernetes offers something called the Horizontal Pod Autoscaler, which allows you to automatically scale your pods based on some metric.

This autoscaling tool already has a built-in CPU metric. We can set a CPU target for our pods, and Kubernetes will automatically adjust the number of pods to match the target we set. We will use this metric to ensure that the number of pods running our consumers always matches the current load.
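The scaling rule behind such a metric is documented by Kubernetes as desiredReplicas = ceil(currentReplicas × currentMetricValue / desiredMetricValue). The shell sketch below only works through that arithmetic. The function name desired_replicas and its argument order are made up for illustration, and the sketch ignores the tolerance band and stabilization behaviour of the real controller.

```shell
#!/bin/sh
# desired_replicas CURRENT_REPLICAS CURRENT_CPU_PERCENT TARGET_CPU_PERCENT MIN MAX
# Hypothetical helper: applies the documented HPA formula and clamps the
# result to the [MIN, MAX] replica range.
desired_replicas() {
  current=$1
  usage=$2
  target=$3
  min=$4
  max=$5
  # Integer ceiling division: ceil(a / b) == (a + b - 1) / b for b > 0
  desired=$(( (current * usage + target - 1) / target ))
  if [ "$desired" -lt "$min" ]; then
    desired=$min
  fi
  if [ "$desired" -gt "$max" ]; then
    desired=$max
  fi
  echo "$desired"
}
```

For example, `desired_replicas 2 90 60 1 5` prints 3: two pods averaging 90% CPU against a 60% target call for a third pod. `desired_replicas 3 20 60 1 5` prints 1, so a quiet queue shrinks the deployment back to the minimum.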

Preparing a Docker Image to Run Consumers

Having made sure that Kubernetes can help us with our task, we now need to create a suitable Docker image to run our pods. We base our consumer image on our base image, which in turn is based on Debian and contains our backend logic, including the logic for the GitHub consumer/event handler.

To manage the consumer processes, Symfony recommends a tool called Supervisor, so we add it to our image and run it in a Docker CMD directive, as shown in the code example below:


USER root


RUN apt update && apt install -y supervisor

# Cleanup
RUN rm -rf /var/lib/apt/lists/* && apt clean

COPY ./ /
COPY ./supervisord_githubeventconsumer.conf /etc/supervisord.conf
COPY ./ /

CMD supervisord -c /etc/supervisord.conf

The configuration file

If you look closely at this code, you will notice that we also add two files that are related to running Supervisor(d). These files look like this:


command=bash /
# Restart on unexpected exit codes
# Expect exit code 37, returned when the stop file exists
# Your user
# Number of consumers to start initially. We have to use a high value because we are I/O-bound

bash script

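#!/bin/bash
# If a stop was requested, remove the stop file and signal it with exit code 37
if [ -f "/tmp/debricked-stop-work.txt" ]; then
  rm -rf /tmp/debricked-stop-work.txt
  exit 37
else
  php bin/console messenger:consume -m 100 --time-limit=3600 --memory-limit=150M githubevents --env=prod
fi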

This is a fairly standard Supervisor configuration, but there are a couple of things worth noting. We execute the bash script, which in turn either exits with code 37 (more on that in the next section) or executes the consume command of the Messenger component using our GitHub event consumer. We also configure Supervisor to automatically restart on unexpected failures, i.e. any status code other than 37.
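The restart rule described here appears to correspond to Supervisor's autorestart=unexpected setting combined with exitcodes=37: a finished process is restarted only if its exit code is not in the list of expected codes. A minimal shell sketch of that decision follows. should_restart is a hypothetical helper for illustration, not part of Supervisor.

```shell
#!/bin/sh
# should_restart EXIT_CODE EXPECTED_CODE...
# Hypothetical helper: prints "restart" when EXIT_CODE is not one of the
# expected codes, and "stay-stopped" when it is.
should_restart() {
  code=$1
  shift
  for expected in "$@"; do
    if [ "$code" -eq "$expected" ]; then
      echo "stay-stopped"
      return 0
    fi
  done
  echo "restart"
}
```

With 37 as the only expected code, `should_restart 37 37` prints stay-stopped, while `should_restart 0 37` and `should_restart 1 37` both print restart. In other words, even a clean exit with code 0, for example when the consume command stops after its time limit, brings the consumer back up.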

In our case, we run a large number of consumers (70) simultaneously, because the workload is heavily I/O-bound. Running 70 consumers at the same time lets us fully load the CPU. This is necessary for the Horizontal Pod Autoscaler CPU metric to work properly; otherwise the load would be too low, and scaling would stay stuck at the minimum number of replicas regardless of the queue length.

Graceful Pod/Consumer Reduction

When the Autoscaler decides that the load is too high, it starts new pods. Due to the asynchronous nature of the Messenger component, we don't have to worry about concurrency issues such as race conditions. Everything just works out of the box, so increasing the number of pods/consumers causes no problems. But what happens when the load gets too low and the Autoscaler decides to scale down?

By default, the Autoscaler abruptly terminates a running pod once it decides the pod is no longer needed. This, of course, is a problem for the consumer, which may be in the middle of processing a message. We need a way to shut down the pod gracefully: finish the message we are currently processing, and then exit.

In the Dockerfile from the previous section, you may have noticed that we also copy a stop script into our image. This file looks like this:

#!/bin/bash
# This script is executed when the pod is terminated

touch /tmp/debricked-stop-work.txt
chown www-data:www-data /tmp/debricked-stop-work.txt
# Tell the workers to stop
php bin/console messenger:stop-workers --env=prod
# Wait for the stop file to be removed
until [ ! -f /tmp/debricked-stop-work.txt ]
do
	echo "Stop file still exists"
	sleep 5
done

echo "Stop file removed, exiting"

When executed, this bash script creates the file /tmp/debricked-stop-work.txt. Because the script also calls php /app/bin/console messenger:stop-workers, it gracefully stops the current workers/consumers, causing Supervisor to restart the consumer script. When that script is restarted, it immediately exits with status code 37 because /tmp/debricked-stop-work.txt already exists. Supervisor does not start it again, because 37 is the exit code we expect.

Once Supervisor itself stops, so does the Docker container, since Supervisor is our CMD. The stop script terminates as well, because the consumer script deletes /tmp/debricked-stop-work.txt before exiting with code 37. That's how we achieved a graceful shutdown!
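The handshake between the two scripts can be tried out locally without Symfony or Kubernetes. The sketch below is a simplified stand-in, not the production scripts. STOP_FILE points at a throwaway path, and consumer_run, request_stop and wait_until_removed are hypothetical names for the consumer wrapper, the first half of the stop script, and its polling loop.

```shell
#!/bin/sh
# Throwaway stop-file path for the demo
# (the article itself uses /tmp/debricked-stop-work.txt).
STOP_FILE="${TMPDIR:-/tmp}/example-stop-work.$$"

# One run of the consumer wrapper.
# Returns 37 and removes the stop file when a stop was requested, otherwise 0.
consumer_run() {
  if [ -f "$STOP_FILE" ]; then
    rm -f "$STOP_FILE"
    return 37
  fi
  # Stand-in for the real `messenger:consume` call
  return 0
}

# First half of the stop script: create the stop file.
request_stop() {
  touch "$STOP_FILE"
}

# Second half: poll until the file is gone, giving up after MAX_POLLS tries
# so that the sketch can never hang.
wait_until_removed() {
  polls=0
  while [ -f "$STOP_FILE" ]; do
    if [ "$polls" -ge "${MAX_POLLS:-10}" ]; then
      return 1
    fi
    polls=$((polls + 1))
    sleep 1
  done
  return 0
}
```

Calling consumer_run before request_stop returns 0, as in normal operation. Calling it after request_stop returns 37 and removes the file, at which point wait_until_removed returns immediately.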

But you may wonder when the stop script is executed. We run it in the PreStop lifecycle event of the Kubernetes container.

This event fires every time the container needs to be stopped, such as when the Autoscaler terminates a pod. It is a blocking event, which means that the container will not be removed until this script completes, which is exactly what we want.

To set up a lifecycle event, we just need to add a few lines of code to our deployment configuration, as shown in the snippet below:

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: gheventconsumer
  namespace: default
  labels:
    app: gheventconsumer
    tier: backend
spec:
  replicas: 1
  selector:
    matchLabels:
      app: gheventconsumer
  template:
    metadata:
      labels:
        app: gheventconsumer
    spec:
      terminationGracePeriodSeconds: 240 # Consuming might be slow, allow for 4 minutes graceful shutdown
      containers:
        - name: gheventconsumer
          imagePullPolicy: Always
          lifecycle: # ← this lets us shut down gracefully
            preStop:
              exec:
                command: ["bash", "/"]
          resources:
            requests:
              cpu: 0.490m
              memory: 6500Mi
---
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: gheventconsumer-hpa
  namespace: default
  labels:
    app: gheventconsumer
    tier: backend
spec:
  scaleTargetRef:
    kind: Deployment
    name: gheventconsumer
    apiVersion: apps/v1
  minReplicas: 1
  maxReplicas: 5
  metrics:
    - type: Resource
      resource:
        name: cpu
        targetAverageUtilization: 60

Confused? Don't worry, here is a flow diagram of the shutdown:

[Figure: shutdown flow diagram]

In this article, we worked out how to scale Symfony Messenger consumers dynamically with the load, including shutting them down gracefully. The result is high message throughput at the lowest cost.
