Ways to bypass the GIL to improve performance

At first glance, the GIL (Global Interpreter Lock) seems like a reasonable compromise that simplifies development. However, on multi-core processors, and especially when high-performance computing is needed, the GIL seriously limits scalability and parallelism, because only one thread can execute Python bytecode at a time.

In this article we will look at ways to bypass the GIL. The first is to use multiprocessing instead of multithreading.

Using multiprocessing instead of multithreading

Multithreading operates on threads within a single process, sharing memory and state, while multiprocessing runs separate processes, each with its own memory and state.

You can implement multiprocessing using the multiprocessing module from the standard library. Each process runs its own Python interpreter and therefore has its own GIL. That is, each process can fully use a separate processor core, bypassing the restrictions the GIL imposes on multi-threaded code.
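As a minimal sketch of this idea, the example below spreads a CPU-bound task over four worker processes with multiprocessing.Pool and compares the wall-clock time against a sequential run. The count_primes function and the workload size are hypothetical and chosen only for illustration; the actual speedup depends on the number of available cores.

```python
import math
import time
from multiprocessing import Pool


def count_primes(limit):
    """Count primes below `limit` by trial division (deliberately CPU-heavy)."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, math.isqrt(n) + 1)):
            count += 1
    return count


if __name__ == '__main__':
    limits = [50_000] * 4  # hypothetical workload: four equal chunks

    start = time.perf_counter()
    sequential = [count_primes(limit) for limit in limits]
    print(f'sequential:  {time.perf_counter() - start:.2f}s')

    start = time.perf_counter()
    with Pool(processes=4) as pool:  # each worker is a separate interpreter with its own GIL
        parallel = pool.map(count_primes, limits)
    print(f'4 processes: {time.perf_counter() - start:.2f}s')

    # Both approaches must produce the same answers
    assert sequential == parallel
```

Starting processes and passing arguments between them has a cost, so this approach tends to pay off only when each task does a substantial amount of computation.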

Basic features of multiprocessing

Locks are used to synchronize access to resources between different processes. For example, you can use Lock to ensure that only one process can execute a specific piece of code at a time:

from multiprocessing import Process, Lock

def printer(item, lock):
    # The context manager acquires the lock and always releases it
    with lock:
        print(item)

if __name__ == '__main__':
    lock = Lock()
    items = ['test1', 'test2', 'test3']
    processes = [Process(target=printer, args=(item, lock)) for item in items]
    for p in processes:
        p.start()
    for p in processes:
        p.join()  # wait for every child process to finish

Semaphores are similar to locks, but they let a limited number of processes, rather than just one, access a resource at the same time:

from multiprocessing import Semaphore, Process

def worker(semaphore):
    with semaphore:
        # work that requires synchronization
        print('Working')

if __name__ == '__main__':
    semaphore = Semaphore(2)
    for _ in range(4):
        p = Process(target=worker, args=(semaphore,))
        p.start()

Events allow processes to wait for a signal from other processes to begin performing certain actions:

from multiprocessing import Process, Event
import time

def waiter(event):
    print('Waiting for the event')
    event.wait()
    print('The event occurred')

if __name__ == '__main__':
    event = Event()

    for _ in range(3):
        p = Process(target=waiter, args=(event,))
        p.start()

    print('The main process is sleeping')
    time.sleep(3)
    event.set()

Queues in multiprocessing allow you to safely exchange data between processes:

from multiprocessing import Process, Queue

def worker(queue):
    queue.put('Item from the process')

if __name__ == '__main__':
    queue = Queue()
    p = Process(target=worker, args=(queue,))
    p.start()
    print(queue.get())  # read before join() so the child can flush its buffer
    p.join()

Asynchronous programming

asyncio is a library for writing concurrent code using the async/await syntax, which was introduced in Python 3.5. It serves as the basis for many asynchronous Python frameworks.

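Unlike multi-threaded execution, asyncio uses a single thread and an event loop to manage asynchronous operations. With only one thread there is no contention for the GIL, which makes this approach a good fit for I/O-bound work; CPU-bound code still runs on one core.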
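The single-thread model can be illustrated with a small sketch. Here fake_io is a hypothetical stand-in for a network call (asyncio.sleep replaces the real waiting), and the code records which thread each coroutine ran on and how long the three overlapping waits took in total.

```python
import asyncio
import threading
import time


async def fake_io(name, delay):
    """Stand-in for a network call: sleeping hands control back to the event loop."""
    await asyncio.sleep(delay)
    return name, threading.get_ident()


async def main():
    start = time.perf_counter()
    # Three 0.2-second waits are scheduled together and overlap
    results = await asyncio.gather(*(fake_io(f'task{i}', 0.2) for i in range(3)))
    elapsed = time.perf_counter() - start
    return results, elapsed


results, elapsed = asyncio.run(main())
thread_ids = {thread_id for _, thread_id in results}
print(f'{len(results)} waits finished in {elapsed:.2f}s on {len(thread_ids)} thread(s)')
```

All three coroutines report the same thread identifier, and the total time is close to one wait rather than the sum of three, because the event loop switches between them while each is idle.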

Let's say you need to collect the titles of several web pages. We will use aiohttp as an asynchronous HTTP client for sending requests:

import asyncio
import aiohttp

async def fetch_title(url):
    async with aiohttp.ClientSession() as session:
        async with session.get(url) as response:
            html = await response.text()
            return html.split('<title>')[1].split('</title>')[0]

async def main(urls):
    tasks = [fetch_title(url) for url in urls]
    titles = await asyncio.gather(*tasks)
    for title in titles:
        print(title)

urls = [
    'https://example.com',
    'https://example.org',
    'https://example.net',
    # suppose this is a list of a thousand URLs
]

asyncio.run(main(urls))

The fetch_title function asynchronously fetches the HTML for a given URL and returns the contents of the <title> tag. main creates a coroutine for each URL and runs them concurrently using asyncio.gather(). In this way, thousands of web requests can be in flight at once, so the time spent waiting for server responses overlaps instead of adding up.
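With thousands of URLs it is often sensible to cap how many requests are in flight at once. The following is a sketch of one way to do that with asyncio.Semaphore; fetch_limited, crawl, the limit of 10 and the placeholder URLs are assumptions made for illustration, and asyncio.sleep stands in for the real HTTP round trip so the example runs without network access.

```python
import asyncio


async def fetch_limited(url, semaphore, stats):
    """Hypothetical fetch: asyncio.sleep stands in for the HTTP round trip."""
    async with semaphore:  # at most `limit` coroutines are inside this block at once
        stats['active'] += 1
        stats['peak'] = max(stats['peak'], stats['active'])
        await asyncio.sleep(0.01)
        stats['active'] -= 1
        return f'title of {url}'


async def crawl(urls, limit=10):
    semaphore = asyncio.Semaphore(limit)
    stats = {'active': 0, 'peak': 0}  # tracks how many fetches overlap
    titles = await asyncio.gather(
        *(fetch_limited(url, semaphore, stats) for url in urls)
    )
    return titles, stats['peak']


urls = [f'https://example.com/page/{i}' for i in range(100)]  # placeholder URLs
titles, peak = asyncio.run(crawl(urls, limit=10))
print(len(titles), 'pages fetched, peak concurrency', peak)
```

In a real crawler the body of the async with block would hold the aiohttp request, ideally using one ClientSession shared by all tasks.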

Integration with external C/C++ modules

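Integration with C/C++ extension modules is a very useful way to work around GIL restrictions and improve performance for CPU-intensive tasks. Extensions can call system functions and C libraries directly, avoiding the overhead of the Python interpreter, and C code can release the GIL around sections that do not touch Python objects so that other threads keep running.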
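The effect can be observed from Python without writing any C, using a module that is already implemented in C. The sketch below relies on hashlib, which is documented to release the GIL while hashing larger buffers, so several threads can hash at the same time. The buffer size and thread count are arbitrary choices for illustration, and the timing difference depends on the machine.

```python
import hashlib
import os
import time
from concurrent.futures import ThreadPoolExecutor


def digest(data):
    """The C implementation behind hashlib releases the GIL for large inputs."""
    return hashlib.sha256(data).hexdigest()


# Four buffers of 16 MiB of random bytes each
buffers = [os.urandom(16 * 1024 * 1024) for _ in range(4)]

start = time.perf_counter()
sequential = [digest(buf) for buf in buffers]
print(f'sequential: {time.perf_counter() - start:.2f}s')

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=4) as pool:
    threaded = list(pool.map(digest, buffers))
print(f'4 threads:  {time.perf_counter() - start:.2f}s')
```

The same function written in pure Python would gain nothing from threads, because the interpreter would hold the GIL for the whole computation.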

A Python extension written in C or C++ is a shared library that exports an initialization function. The function returns a fully initialized module or a PyModuleDef instance. For modules with ASCII names, the initialization function must be called PyInit_<modulename>. For non-ASCII module names, punycode encoding and the PyInitU_ prefix are used.

To create a module in C, we start by defining the module's methods and method table, and then define the module itself:

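#define PY_SSIZE_T_CLEAN
#include <Python.h>
#include <stdio.h>

static PyObject *method_fputs(PyObject *self, PyObject *args) {
    const char *str, *filename = NULL;
    int bytes_copied = -1;

    /* Parse two string arguments: the text and the target file name */
    if (!PyArg_ParseTuple(args, "ss", &str, &filename)) {
        return NULL;
    }

    FILE *fp = fopen(filename, "w");
    if (fp == NULL) {
        /* Raise OSError built from errno instead of using a NULL handle */
        return PyErr_SetFromErrnoWithFilename(PyExc_OSError, filename);
    }
    bytes_copied = fputs(str, fp);
    fclose(fp);

    return PyLong_FromLong(bytes_copied);
}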

static PyMethodDef CustomMethods[] = {
    {"fputs", method_fputs, METH_VARARGS, "Python interface to fputs C library function"},
    {NULL, NULL, 0, NULL}
};

static struct PyModuleDef custommodule = {
    PyModuleDef_HEAD_INIT,
    "custom",
    "Python interface for the custom C library function",
    -1,
    CustomMethods
};

PyMODINIT_FUNC PyInit_custom(void) {
    return PyModule_Create(&custommodule);
}

After defining the initialization function and module methods, we create a setup.py file to compile the module:

from setuptools import setup, Extension  # distutils was removed in Python 3.12

setup(name="custom",
      version="1.0",
      description="Python interface for the custom C library function",
      ext_modules=[Extension("custom", ["custommodule.c"])])

Running python3 setup.py install in the terminal compiles and installs the module, making it available for import in Python. Recent setuptools releases deprecate this command in favor of running pip install . from the project directory.

Python and C/C++ have different exception systems. If you need to raise a Python exception from a C extension, you can use the Python C API. For example, to raise ValueError if the string is shorter than 10 characters:

if (strlen(str) < 10) {
    PyErr_SetString(PyExc_ValueError, "String length must be at least 10");
    return NULL;
}

This way you can use Python's predefined exceptions or create your own.


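Also, some libraries, for example NumPy, Numba and Cython, have built-in ways to work around the GIL: NumPy releases it inside many of its compiled routines, Numba can compile functions with nogil=True, and Cython provides nogil blocks.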

