sync vs async vs…

Any solution has a lifespan, even the coolest, most reliable and modern one.
/Jason Statham/

Today I will tell you how one of our decisions took its last breath, which led to a small fuck-up, and how a solid bit of research helped us buy time and avoid even bigger screw-ups – or did it?

Interlude

We are developing a service for working with self-employed people. It ensures uninterrupted payments to the self-employed, collects receipts from them and in every possible way simplifies our clients' routine work. Despite the fairly simple business process inside the service itself, together with its dependencies the picture looks something like this:

Service and its “friends”

This is a real diagram, but the names of the other services that are somehow involved in our process (or whose process we are part of) have been removed from it. Our service is marked in green.

The stack is quite trivial: FastAPI for receiving requests and rapid API prototyping, and Django for working with the database and the admin panel. Each technology brings its own advantages, and everyone is happy.

But there were disadvantages too. Friday evening. The problems started when one of the white squares began sending more requests than we could handle.

Indigestion

Since our stack works with the database synchronously, all API methods are synchronous too. FastAPI allows this: it delegates running synchronous handlers in threads to Starlette, which in turn relies on anyio.

The number of threads is strictly capped, and thanks to replication their total across all clusters let us process the flow of requests with a large margin. That holds right up until a method's execution time starts riding off into the sunset. Then the minimum RPS of a thread-based service can be expressed as follows:

RPS_{min} = \frac{Threads}{MethodExecutionTime_{max}}
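For illustration, with made-up numbers: with Threads = 40 worker threads and MethodExecutionTime_max = 30 seconds for the slowest method, RPS_min = 40 / 30 ≈ 1.3 requests per second, no matter how many workers sit idle on other methods.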

And with sufficient request intensity, the maximum RPS ends up very close to that minimum. Which is exactly what happened. The self-employed status check method hit the peak of its popularity, and when the flow of all incoming requests exceeded what we could digest, the requests serving the frontend began to fail. Indigestion set in.

First, second and third aid

The patch arrived as quickly as the problem. We started answering the aggressor with HTTP 429, limiting the method to half of the threads available to us.
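A minimal sketch of what such a patch can look like (hypothetical names and numbers, not our actual code): a semaphore caps the hot endpoint at half of the worker threads, and everything above that gets a 429.

```python
import threading

from fastapi import FastAPI, HTTPException

app = FastAPI()

# Assumption: the thread pool has 40 workers; give this method at most half.
STATUS_CHECK_SLOTS = threading.Semaphore(20)


def call_tax_service(inn: str) -> dict:
    # Stand-in for the real blocking call to the tax authority.
    return {"inn": inn, "status": "self-employed"}


@app.get("/status/{inn}")
def check_status(inn: str):
    # Non-blocking acquire: if all slots are busy, reject instead of queueing.
    if not STATUS_CHECK_SLOTS.acquire(blocking=False):
        raise HTTPException(status_code=429, detail="Too Many Requests")
    try:
        return call_tax_service(inn)
    finally:
        STATUS_CHECK_SLOTS.release()
```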

Of course, you can't live like this, and since we have two bottlenecks (one at the entrance and one at the exit), each of them deserves attention.

At the exit: when working with the upstream source, we now watch its behavior closely, and if it shows signs of unavailability, we remember that for a short while and do not call it again, signaling the unavailability with an error response.
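The idea in code, as a rough sketch (the class and timings are illustrative, not our implementation):

```python
import time


class CircuitBreaker:
    """Remember an upstream failure for a while and fail fast meanwhile."""

    def __init__(self, cooldown_seconds: float = 30.0):
        self.cooldown = cooldown_seconds
        self.opened_at = None  # None means the upstream is considered healthy

    def available(self) -> bool:
        if self.opened_at is None:
            return True
        if time.monotonic() - self.opened_at >= self.cooldown:
            self.opened_at = None  # cooldown elapsed, probe the upstream again
            return True
        return False

    def record_failure(self) -> None:
        self.opened_at = time.monotonic()
```

Before each call to the upstream we check available(); if it returns False, we answer with an error right away instead of burning a thread on a doomed request.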

At the entrance, the application needed something done about threads, and we did it.

Handshake with asynchrony

The popularity of the status check method lies in its simplicity. It is essentially a proxy to our tax authorities, which means it has no other dependencies and can be rewritten asynchronously without any trouble. The time had come for the chosen FastAPI❣️Django bundle to show its main advantage, flexibility: replacing def with async def.
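For a proxy-like method, the change is literally this (a schematic example with a made-up upstream URL, not our real handler):

```python
import httpx
from fastapi import FastAPI

app = FastAPI()

# Before: a sync handler, executed by starlette/anyio in a worker thread.
# @app.get("/status/{inn}")
# def check_status(inn: str):
#     return requests.get(f"https://tax-upstream/status/{inn}").json()

# After: a native coroutine; no worker thread is held while we wait.
@app.get("/status/{inn}")
async def check_status(inn: str):
    async with httpx.AsyncClient() as client:
        resp = await client.get(f"https://tax-upstream/status/{inn}")
        return resp.json()
```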

A couple of hours, including adapting the tests and copy-pasting the synchronous code, and the modern software is ready for production! However, to promise is not to marry. The modern library versions with modern settings stubbornly refused to make friends with our somewhat forgotten proxy servers: SSLV3_ALERT_HANDSHAKE_FAILURE

To figure out who doesn't love whom and why, and above all how to reconcile them, we would have to spend more time. We will definitely come back to that, but meanwhile my thoughts were racing far ahead in search of a solution for the entire service, because anything could happen at any moment, and we wouldn't get off that easily.

We cannot leave the method synchronous. We can rewrite it as asynchronous, but in the long run that approach leads to rewriting the entire project as soon as a significant part of the business logic is touched. Knowing all this, we need another way: a third way, a third side of the coin, whatever you call it. Time must be won at all costs.

Friday evening, the situation is a little more stable, but it is still burning brightly.

Green thread

Now, in order for us to understand each other, we will have to study the basics of a technology called greenlet and get acquainted with the concept of green threads in general.

Wikipedia actually gives a comprehensive description:

green threads are threads that are scheduled by a runtime library or virtual machine instead of natively by the underlying operating system. <…> They are managed in user space rather than kernel space, enabling them to work in environments that do not have native thread support.

If you think this is only relevant for Python, then I recommend taking a look at Wikipedia: the list of languages will surprise you in an interesting way.

Putting it in my own words: green threads let you implement cooperative multitasking and reduce the cost of creating threads, since the OS knows nothing about them, which means all the overhead of isolating them is no longer needed.

For CPython this is done by the greenlet library, and from here on, so as not to confuse them with native threads, I will simply say greenlets.

Greenlets

“We made centuries of discoveries by following an untrodden path. And ahead everything seems to be covered in dust, as if abandoned by you.”
/Rehabilitation engineer/

The technology is so old that few people have heard of it, and those who have heard of it heard it from their great-grandfathers, because the first version came out in 2006:

In fact, this is a very small library, despite its long life. And it already supports Python 3.12.

As for popularity, GitHub alone shows >300k repositories that use it.

Who are you? What are you?

Greenlets are manually controlled micro-threads. Micro means they consume several times fewer resources than regular (native) threads. Threads means they behave like threads and have all their properties.

Let's dig a little deeper into how they work. The library gives us two main primitives:

  • the greenlet class, to run a function in a greenlet (micro-thread);

  • the greenlet.switch(*args, **kwargs) method, to switch from one greenlet to another and transfer the flow of execution. The same method lets you pass data between greenlets, much like generator.send(val). If the greenlet has not started yet, the data is used as the arguments of the function to be run in the greenlet.

Let's see what this looks like with an example taken from the documentation. I changed it a little so that you can see how data is passed through switch.
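The original listing was a screenshot, so here is a close reconstruction based on the example from the greenlet documentation, with values passed through switch the way the commentary below describes:

```python
from greenlet import greenlet

def test1(msg):
    print(msg)                    # "from main": arguments of the first switch()
    y = gr2.switch("from test1")  # start gr2, handing it a value
    print(y)                      # "from test2": sent back via gr1.switch()
    return "test1 done"           # returning switches back to whoever started gr1

def test2(msg):
    print(msg)                    # "from test1"
    gr1.switch("from test2")      # resume test1; execution never returns here

gr1 = greenlet(test1)
gr2 = greenlet(test2)

print(gr1.switch("from main"))    # launches test1; finally prints "test1 done"
```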

We have created the greenlets but haven't launched anything yet. To transfer control and launch a greenlet, you need to call switch().

An important point to pay attention to is the line 'test1 done'. It is what came back from the switch call: return works similarly to switch, but brings us back to whoever launched the greenlet.

If you overlay this process on the timeline, it will look like this:

Manuscripts don't burn

How are you made?

It is built very simply, but understanding it is hard. An important feature of greenlets is that they are written entirely in C and C++.

Which side are you on? 🙂

Briefly, here is what happens. We save the Python thread state, namely frame references, exceptions and so on. Plus we save a pointer to the stack so that when we return, we can continue execution from the right place in the program. That stack pointer is, of course, the very one kept in the processor registers while the program is running.

Therefore, for each platform there is assembly code that does this. In the screenshot above, that is the call to slp_switch.

And every switch call makes the thread state switch inside Python. Essentially, the operating system does the same thing when it simulates multitasking.

For those who want to study in detail how and why this works, there is an article explaining the implementation of coroutines in different languages.

And what should I do with you?

There is a lot you can do. As you understand, greenlets alone do not solve the problem directly. They have about as much relation to asynchrony as this story has to cooking (very distant, or is it?!).

Let's remember where we left off. We have a synchronous function, inside of which another synchronous function is called. There can be N such functions and calls, and the call tree can go very deep. And we need to turn it all asynchronous.

An example of call nesting in pseudocode
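Since the figure did not survive extraction, here is a hypothetical sketch of what such nesting looks like:

```python
# A chain of synchronous calls; the blocking I/O sits N levels below the handler.
def handler(inn: str):
    return check_status(inn)

def check_status(inn: str):
    token = get_token()                  # another sync call inside
    return request_tax_service(inn, token)

def request_tax_service(inn: str, token: str):
    ...  # the actual blocking network call lives here
```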

An example of real call nesting

Asynchronous functions are those that give up control while they wait. No waiting means no asynchrony; no transfer of control means no asynchrony either. There are several projects that take a run at this problem, and the first one to catch my eye was greenletio.

greenletio says: “There is no need to rewrite all the code, just hide it behind a greenlet and shift the I/O onto asyncio's shoulders.” Simply put, any synchronous function can be turned into a native asynchronous one. Sounds good. And most importantly, it looks simple:

Decorator greenletio.async_

You decorate your function with greenletio.async_, and all it has to do is call switch during a blocking operation, forwarding through it an awaitable object that will then be processed in the standard way via await. But even that greenletio does for you: greenletio.patch_blocking replaces the implementations of the standard modules with ones that behave exactly this way.

greenletio/green/threading.py

Don't be confused by the await_ function: it is another primitive that ships with greenletio. It lets you wait on asynchronous functions; internally it is the usual greenlet.switch.
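A tiny sketch of both primitives together, modeled on the greenletio README:

```python
import asyncio

from greenletio import async_, await_


@async_
def sync_function():
    # A plain def, yet we can wait on an awaitable: await_ switches out of
    # the greenlet and hands the awaitable over to the event loop.
    await_(asyncio.sleep(1))
    return "done"


async def main():
    # The decorated function is awaited like any native coroutine.
    print(await sync_function())


asyncio.run(main())
```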

The same approach is used in SQLAlchemy to support asynchronous drivers.

sqlalchemy/util/concurrency.py

A kind of portal for awaitable objects, which carries them from the bottom to the top, bypassing the entire stack and the need to replace def with async def.

You betrayed me, greenletio… why?!

We managed to get the project running, although not right away. The project's author supports only the public API of modules, but crutches, even ones from the CPython authors themselves, do not stick to public APIs.

https://github.com/miguelgrinberg/greenletio/issues/12

However, due to implementation details, some packages simply cannot be used. The catch is that Lock.acquire() can be called outside the “portal”: for example, when a library is imported, or when you export metrics in an asynchronous handler (without a portal) using prometheus_client, which internally uses locks to control access.

The following libs ended up on the blacklist:

  • uvicorn

  • prometheus_client

  • sentry_sdk

And when Sentry was turned on, the code began to crash. I realized there was simply no point in fighting further: the tool lacked the functionality to keep such a complex project working.

The only way out of the situation is to somehow learn to wait, await-style, at any moment, whether we are in the context of a greenlet or an event loop, and to do it in any function, whether it is declared as def or async def.

And… it had already been done, 9 years earlier.

What is the strength in, brother?

“Have you seen eventloop? I screwed it up”
/Jason Statham/

asyncio-gevent is a fork and reincarnation of older analogues, rewritten in a modern way for modern versions of gevent and Python.

A little about what it is and why:

README.md

  • running asyncio on gevent (by using gevent as asyncio's event loop);

  • running gevent on asyncio (by using asyncio as gevent's event loop, still work in progress);

  • converting greenlets to asyncio futures;

  • converting asyncio futures to greenlets;

  • wrapping blocking or spawning functions in coroutines which spawn a greenlet and wait for its completion;

  • wrapping coroutines in spawning functions which block until the future is resolved.

The first implementation of this approach appeared in 2014, along with asyncio itself. It even worked on Python 2.7, since there was an asyncio backport for Python 2.

But why gevent?

gevent is a library for cooperative access to network resources that uses greenlets and libuv/libev to drive an event loop. Born in pre-asyncio times, gevent has two qualities that are important to us:

  • time-tested and used in large and complex projects (there are no serious problems with monkeypatching);

  • implements asynchrony for greenlets (we remember that greenlets themselves are not responsible for this).

Using asyncio-gevent in the asyncio-on-top-of-gevent mode is a production-ready solution according to the author, and it lets you switch from greenlet to coroutine and back in a way that is completely native to the current context, be it greenlet.join() (a wrapper from gevent) or await future.

To do this, you only need to meet a couple of conditions (a bootstrap sketch follows the list):

  • call gevent.monkey.patch_all() as soon as possible;

  • set EventLoopPolicy: asyncio.set_event_loop_policy(asyncio_gevent.EventLoopPolicy())
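Put into code, the bootstrap from the list above looks roughly like this (a sketch; the module and class names are real, where exactly it sits in your entrypoint is up to your project):

```python
# Must run before anything else imports socket, ssl, threading, etc.
from gevent import monkey
monkey.patch_all()

import asyncio

import asyncio_gevent

# Every event loop created from now on runs on top of the gevent hub.
asyncio.set_event_loop_policy(asyncio_gevent.EventLoopPolicy())
```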

Together with gevent's ability to patch the entire standard library, we automatically get honest asynchrony for our synchronous code without changing that code. Now it will compete with coroutines for processor time inside the event loop scheduler.

Implementing asyncio on top of gevent
asyncio_gevent/event_loop.py

Unfortunately, we won't manage to dive into how this works in more detail, namely the internals of the event loop, the await mechanism and the selectors, today. It's Friday evening, after all.

Now – what's the result?

The asyncio-gevent solution started right away and is running in our production. There are no problems with libraries; everything is compatible.

To integrate with FastAPI and Django, I had to pull a few tricks:

  • patch the sync_to_async / async_to_sync implementations from asgiref (Django aio) and anyio (FastAPI aio) over to functions from asyncio-gevent (see the sketch after this list);

  • take the implementation of asgiref.local.Local straight from master, since it had been rewritten to proper contextvars; otherwise there were problems with accessing the Django admin panel on a non-standard URL;

  • teach asyncio-gevent to forward the context (the patches have been tested and will be contributed upstream).
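To give a feel for the first trick, here is a rough sketch. The helper asyncio_gevent.greenlet_to_future is an assumption on my part, inferred from the README bullet about converting greenlets to asyncio futures; the real patches were project-specific:

```python
import functools

import asgiref.sync
import asyncio_gevent
import gevent


def gevent_sync_to_async(func=None, **_options):
    # Simplified sketch: ignores asgiref options such as thread_sensitive.
    if func is None:
        return lambda f: gevent_sync_to_async(f, **_options)

    @functools.wraps(func)
    async def wrapper(*args, **kwargs):
        # Run the sync function in a greenlet and await it as a future,
        # instead of asgiref's thread-executor implementation.
        glet = gevent.spawn(func, *args, **kwargs)
        return await asyncio_gevent.greenlet_to_future(glet)

    return wrapper


# Illustrative monkeypatch; the real integration covered async_to_sync and
# anyio's equivalents as well.
asgiref.sync.sync_to_async = gevent_sync_to_async
```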

The weekend was saved, and there were no sleepless nights.

Monday

Every Monday starts with news, and we are no exception. The DBAs cheerfully greeted us with the question: “What do you need so many database connections for?”

The FastAPI+Django combination carries an important patch for anyio that guarantees the connection is closed after the request completes, just as in regular Django.

That same patch had been applied incorrectly for asyncio-gevent and had to be rewritten. While we were at it, we added a connection pool; otherwise a connection would be opened for every request.
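For illustration, the guarantee can be expressed as a schematic Starlette middleware (our actual fix lived inside the anyio/asyncio-gevent patch, not in a middleware):

```python
from django.db import close_old_connections
from starlette.middleware.base import BaseHTTPMiddleware


class CloseDjangoConnectionsMiddleware(BaseHTTPMiddleware):
    async def dispatch(self, request, call_next):
        try:
            return await call_next(request)
        finally:
            # Same as regular Django: drop stale connections after a request.
            close_old_connections()


# Wiring it up: app.add_middleware(CloseDjangoConnectionsMiddleware)
```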

The rest can be expressed in numbers after a small load test:

|                                          | native-threads (x10) | asyncio-gevent | asyncio             |
|------------------------------------------|----------------------|----------------|---------------------|
| max(RPS)                                 | 12                   | 25             | SSL_HANDSHAKE_ERROR |
| Number of users before denial of service | 40                   | 800            | n/a                 |
| AVG(response time)                       | see below            | see below      | n/a                 |

The RPS is quite low in both cases, and there are reasons for that. First, we are limited by the external system and its response speed. Second, we use a connection pool in front of it to also limit our own appetite. Nevertheless, during testing we were still able to serve 25 requests per second for some time, and most interestingly, we did it at peak load.

Number of users before denial of service shows how many users it took to bring the service down (mainly in terms of memory).

AVG(response time) grew in proportion to the number of users, identically for each of the variants.

Conclusions

  • this is truly a production-ready solution, as the author of asyncio-gevent writes;

  • be prepared to pay in CPU/RAM; magic does not exist, and you will face the same problems as with asyncio as the event loop grows. Scale horizontally; don't forget about that method;

  • knowing a problem in all its details ≠ knowing how to solve it;

  • “Magic” As a Service:

    • gevent and greenlet turned out to be not such scary and magical technologies after all. Assigning them a role at the infrastructure level allowed us not to complicate the code, yet extend the project's lifetime, raise its throughput, and do so almost for free;

    • any sufficiently developed technology is indistinguishable from magic; that is not a reason to be afraid of it and/or not to use it.

  • when there is a working solution, no one cares about SSL_HANDSHAKE_ERROR.
