stackful vs stackless

In this article I want to explain the difference between stackless And stackful coroutines: how they differ, what their pros and cons are, and also tell us in general terms how multithreading is implemented in some programming languages.

I try to write articles in simple and understandable language so that they can be understood by as wide a range of readers as possible. Therefore, simplifications will be made in the text.

Let's start:

The operating system has processes are instances of programs that have their own virtual address space and resources. For example, the process could be your browser in which you read articles, or an email application.

Flow is a lightweight process. Each process can have one or more threads – this is the minimum unit of execution within a process.

Basic flow elements:

  • Stack (the place where temporary data is stored),

  • Registers processor (for storing current calculations),

  • Thread ID (identifier),

  • Pointer to instructionwhich must be executed,

  • State thread (for example, “ready to run” or “waiting”).

Typically, threads are implemented at the operating system level (such threads are called kernel threads), but this approach has disadvantages:

  1. High cost of context switching:

    • Switching between threads requires complex operations and takes up a lot of resources, which slows down program execution.

  2. Kernel dependency:

    • If a thread is blocked (for example, waiting for I/O to complete), the kernel transfers control to another thread. Performance depends on how well the kernel manages threads.

  3. Limited scalability:

    • If a process creates many threads, this increases the load on the operating system and its scheduler, which can lead to poor performance.

To solve these problems, threads were created in user space.

In this approach, thread management is not performed by the operating system, but at the level of user libraries. This allows you to switch between threads without involving the OS kernel, which makes the process faster.

There are different kinds of threads in user space such as fibers, green threads And coroutines (they will be discussed in this article).

A closer look at the coroutine model

In general terms, coroutines are special functions that allow you to pause execution at one place and continue it from the same place later, making them very useful in asynchronous operations and for efficient use of resources. Ordinary functions cannot do this: they are executed to the end, without stopping. The main purpose of coroutines is to help manage multitasking and perform different tasks in turn, but without the complexity associated with threads.

Coroutines can pause their execution

Coroutines can pause their execution

Coroutines have several models multiplexing. This occurs when one or more threads are created within user space for a single operating system thread.

There are three main models for multiplexing threads within a user process: 1:1, 1:N, M:N. The main idea of ​​these models is that they allow a single operating system thread to be shared among multiple user space threads.

  • Model 1:1:

    For each operating system thread, one stack is created in user space. This means that the number of threads in user space matches the number of threads in the operating system.

  • Model 1:N:

    This model is characterized by having multiple stacks in user space with only one thread in the operating system. This allows multiple user space threads to use a single OS thread.

  • Model M:N:

    In this model, for N threads in the operating system, M stacks are created in user space. This allows for more flexible allocation of resources between threads.

Multiplexing coroutines

Multiplexing coroutines

There are two main approaches for implementing coroutines: stackful and stackless.

  1. Stackful coroutines:

    These coroutines have their own stack. This means that coroutines themselves manage their own execution rather than being dependent on the operating system. Due to the presence of their own stack, such coroutines can store more information: they remember not only the current state of the function, but also the hierarchy of nested calls to other functions that were called within this coroutine. This allows, for example, to suspend the execution of one coroutine at any level of function calls and transfer control to another coroutine.

  2. Stackless coroutines:

    Unlike stackful coroutines, stackless coroutines do not have their own stack. Their execution is controlled as state machine (state machine), which is generated by the compiler. It's as if the function's execution is simply switching between different states – for example, between waiting for data and continuing execution after receiving it.

    In practice, stackless coroutines are most often implemented through keywords like async/await. When a function meets awaitit pauses until the result is ready, and then continues executing.

Switching between coroutines

As already mentioned, coroutines save a significant amount of resources when switching compared to operating system threads. Switching between coroutines does not require the participation of the operating system kernel, which significantly reduces resource costs. Let's take a closer look at what happens during a context switch.

Switching between stackful coroutines

Let's start with stackful coroutine. They have a stack that stores all the necessary information. Before passing control to another coroutine, it is necessary to save the stack, as well as local variables, the return address, and other additional information. The coroutine then passes control to the scheduler, which chooses which coroutine to call next.

When switching between stackful coroutines, it is important to save and restore the entire stack and execution context (for example, processor registers). This increases the number of operations that must be performed before a context switch. When the coroutine is resumed, it loads its data from memory and begins execution from where it left off.

Switching between stackless coroutines

Stackless coroutines They work on a similar principle, but store only a minimal amount of data about their state. This could be information about local variables and where the coroutine paused, which is usually stored as a program counter.

Since stackless coroutines do not have a dedicated call stack, there is context switching overhead very small. When a coroutine wants to give up control, it passes it to the scheduler. The scheduler updates its state to “pending” and resumes execution of the other coroutine, which changes its state to “running”.

Stackful coroutines more cluttered due to the presence of a stack, but have greater functionality compared to stackless coroutines.

Stackful coroutines have their own stack, as mentioned earlier. This means that memory is allocated to the stack, which varies by implementation, but on average ranges from 64 KB to 1 MB per coroutine. In addition, additional costs for storing registers and return addresses must be taken into account. However, compared to the stack, this overhead is negligible.

For stackless coroutines, you only need to save local state, which is approximately 2 KB per coroutine. They have no stack restrictions, which makes them very lightweight.

For example, if you take a system consisting of 10,000 coroutines that consume 64 KB per stack, then the total memory consumption will be about 640 MB. At the same time, for stackless coroutines, the total memory consumption will be about 2 MB, assuming that they consume an average of 2 KB of memory.

A little about how the stack is implemented:

There are two main approaches: fixed and dynamic (segmented) stack.

Fixed and dynamic stack

Fixed and dynamic stack

1. Fixed stack

When using a fixed stack, each coroutine allocates a fixed amount of memory during its creation. This stack has a predetermined maximum size, which remains unchanged throughout the life of the coroutine.

Advantages:

Flaws:

  • If the stack size is chosen too large, the user may not use all of the allocated memory, wasting resources.

  • If the stack size is too small, there is a risk of overflow, which can cause the program to crash.

2. Dynamic stack

In coroutines with a dynamic (segmented) stack, its size starts small and increases as needed, allocating additional memory. New memory segments are allocated and associated when more stack space is needed.

Advantages:

Flaws:

  • The complexity of implementing memory allocation logic increases.

  • There is a small performance overhead for initializing new segments.

  • Fragmentation problem: The stack can become highly fragmented if it consists of many small segments scattered throughout memory.

A dynamic stack allows you to flexibly size it, shrinking or growing it as needed, making it more resource efficient.

Where are coroutines used?

Different programming languages ​​implement coroutines differently, providing developers with different mechanisms for dealing with asynchrony.

In this table we have collected top 10 programming languages and indicated which types of coroutines are used in each of them: stackfull or stackless. This will help you better understand how different languages ​​approach coroutines and what benefits they offer.

YAP

Type of coroutines

Description

Python

Stackless

Uses async/await and generators, which are controlled by a scheduler.

Go

Stackfull

Uses “goroutines” that grow dynamically as needed.

JavaScript

Stackless

Uses async/await and promises, without allocating a fixed stack.

Kotlin

Stackless

Supports coroutines using suspend functions for temporary suspension.

Rust

Stackless

Implements async/await through mechanisms similar to state machines.

Lua

Stackfull

Implements coroutines using coroutine.createwhich have their own stack.

C#

Stackless

Uses async/await using state, making coroutines lightweight.

Ruby

Stackless

Uses Fiberlightweight coroutines without a fixed stack.

Scala

Stackless

Implements Future And async/awaitstate-based ones.

Swift

Stackless

Uses async/awaitallowing you to write asynchronous code without a separate stack.

Conclusion

Coroutines are a powerful tool for managing multitasking and asynchronous execution in modern programming languages. They provide a more lightweight approach to executing parallel tasks than traditional threads, allowing developers to efficiently utilize resources and optimize application performance.

We looked at the two main types of coroutines – stackful and stackless – and learned about their characteristics, advantages and disadvantages. Stackful coroutines, such as Go's goroutines, offer rich functionality thanks to their dynamic stack, while stackless coroutines implemented in other languages ​​provide fast switches between tasks with minimal overhead.

Also subscribe to my TG channelthere I try to write about technology in understandable language, in short posts.

Similar Posts

Leave a Reply

Your email address will not be published. Required fields are marked *