C++20 coroutines and multitasking, using STM32 controllers as an example

No offense intended by the header image (primarily to @Saalur); indeed, it is far from the first time this topic has come up.

Introduction

One of the most striking innovations the language received in the C++20 standard is support for coroutines. Developers of microcontroller software will immediately notice that a coroutine resembles a task in an operating system. Habr already has material on this topic, for example, “Using C++20 coroutines in conjunction with NRF52832 and GTest” by @Firthermant and “CoroOS: an operating-system concept for microcontrollers based on C++20 coroutines” by @Saalur. At the same time, I cannot help noting that the presented materials and source code are not easy to grasp right away, especially for programmers not yet sufficiently familiar with coroutines in C++. In this article I will try to analyze, at a simpler level, how the new language standard can be applied to building a task scheduler. In a sense, this article can be considered preparation for reading the ones above.

So, let’s look at a few simple task scheduling options from the most primitive to something remotely resembling an operating system.

The simplest cooperative multitasking

Cooperative multitasking involves the “voluntary” transfer of control from one task to another. In operating-system terms, there is, as a rule, a “super-task” above all others: a scheduler that takes control from suspended code and transfers it to the next task. An example of the simplest system of two tasks is shown in Figure 1 (the second pass of the loop is marked with green arrows).

The proposed system consists of two tasks: the first contains three statements and is ready to return control (go into the waiting state) after executing statement 2; the second consists of four statements and is likewise ready to enter the waiting state after executing its statement 2.

Thus, multitasking is implemented by the following sequence of statement execution:

Order of execution of statements
  1. Task 1, statement 1.

  2. Task 1, statement 2.

  3. Task 2, statement 1.

  4. Task 2, statement 2.

  5. Task 1, statement 3.

  6. Task 2, statement 3.

  7. Task 2, statement 4.

  8. Go back to step 1.

Figure 1. Variant of cooperation of two tasks.

This execution order maps easily onto the coroutine mechanism:

  1. Coroutine objects are owned by the scheduler, which allows it to transfer control to tasks in turn.

  2. Resuming a task: the scheduler performs the resume operation on the next coroutine task.

  3. Suspending a task and transferring control back to the scheduler: the suspend operation on the current coroutine.

Thus, the scheduler code in such a system looks something like this:

auto tasks = { Task1(), Task2() }; // Create the tasks

for(;;)
{
  // Resume the tasks one after another
  for (auto& t : tasks)
    t.resume();
}

And the task itself is:

task TaskX()
{
  // One-time initialization
  for(;;)
  {
    // Statements 1 .. N1
    co_await std::suspend_always(); // suspend execution
    // Statements N1+1 .. N2
    co_await std::suspend_always(); // suspend execution
  }
}

In the most primitive case (the one we are considering so far), the coroutine’s wait object (the argument of the co_await operator) is the standard std::suspend_always, which simply transfers control back to the caller, i.e. the scheduler.

The task type in this case is also the simplest; its source code is given below. It is only worth noting that operator new inevitably has to be redefined; however, the memory manager merely needs to carve out a region of a buffer once, since this operation is performed a single time per task. Releasing memory is not implied at all, since a classic coroutine task contains an infinite loop. In fact, a separate memory manager is not strictly necessary: a buffer could be declared inside the task type or task::promise_type itself, but that approach wastes memory, since different tasks may need frames of different sizes, and the buffer would then have to be sized for the largest of all tasks.
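One subtlety worth noting: a raw byte buffer gives no alignment guarantee by itself, while coroutine frames may require stricter alignment. A minimal bump-allocator sketch that accounts for this (the name BumpAllocator and the 512-byte capacity are illustrative, mirroring the article’s MemoryManager; overflow checking is omitted for brevity) might look as follows:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Bump ("arena") allocator for coroutine frames: memory is handed out once
// per task and never reclaimed, since the tasks run forever
class BumpAllocator {
    static constexpr std::size_t Capacity = 512;  // illustrative size
    alignas(std::max_align_t) static inline std::uint8_t _data[Capacity];
    static inline std::size_t _size = 0;

public:
    static void* Allocate(std::size_t size) {
        // Round the request up so that the next frame also stays max-aligned
        constexpr std::size_t a = alignof(std::max_align_t);
        size = (size + a - 1) & ~(a - 1);
        void* ptr = &_data[_size];
        _size += size;
        return ptr;
    }

    static void Deallocate(void*) {
        // Intentionally empty: coroutine tasks are never destroyed
    }
};
```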

Of the Russian-language materials on coroutines, I liked Konstantin Vladimirov’s lecture the most; I advise watching both parts, since I tried to keep all further code as close as possible to how the lecturer presented the material.

The source code of the task structure
struct task {
  struct promise_type {
    using coro_handle = std::coroutine_handle<promise_type>;

    auto get_return_object() {
      return coro_handle::from_promise(*this);
    }

    auto initial_suspend() {
      // The task is initially suspended
      // (alternatively, suspend_never could be returned to let the task run its initialization first)
      return std::suspend_always();
    }

    auto final_suspend() noexcept {
      // The task body contains an infinite loop, so this method should never be reached
      return std::suspend_never();
    }

    void unhandled_exception() {
      // This method should not be reached either
    }

    void* operator new(std::size_t size) {
      return MemoryManager::Allocate(size);
    }

    void operator delete(void* ptr) {
      // For the same reason (tasks run forever) this method should never be reached
      MemoryManager::Deallocate(ptr);
    }
  };

  using coro_handle = promise_type::coro_handle;

  task(coro_handle handle) : _handle(handle) {}

  // The destructor can be omitted, since it will never be called

  void resume() const {
    _handle.resume();
  }
private:
  coro_handle _handle;
};

The code of the entire program (I used my own library for working with the peripherals) is presented under the cut. For simplicity, only one co_await statement is left in each task, and the payload waits for a byte from the UART and writes a response. The code was tested on an stm32f103c8t6 controller.

Full program code
#include <iopins.h>
#include <usart.h>

#include <coroutine>

class MemoryManager {
  static const uint32_t Capacity = 512;

public:
  static void* Allocate(std::size_t size) {
    _size += size;
    return &_data[_size - size];
  }

  static void Deallocate(void* ptr) {
    // Coroutine tasks live forever, so this method should never be reached
  }

private:
  static uint8_t _data[Capacity];
  static uint16_t _size;
};
uint8_t MemoryManager::_data[MemoryManager::Capacity];
uint16_t MemoryManager::_size = 0;

struct task {
  struct promise_type {
    using coro_handle = std::coroutine_handle<promise_type>;

    auto get_return_object() {
      return coro_handle::from_promise(*this);
    }

    auto initial_suspend() {
      // The task is initially suspended
      // (alternatively, suspend_never could be returned to let the task run its initialization first)
      return std::suspend_always();
    }

    auto final_suspend() noexcept {
      // The task body contains an infinite loop, so this method should never be reached
      return std::suspend_always();
    }

    void unhandled_exception() {
      // This method should not be reached either
    }

    void* operator new(std::size_t size) {
      return MemoryManager::Allocate(size);
    }

    void operator delete(void* ptr) {
      // For the same reason (tasks run forever) this method should never be reached
      MemoryManager::Deallocate(ptr);
    }
  };

  using coro_handle = promise_type::coro_handle;

  task(coro_handle handle) : _handle(handle) {}

  // The destructor can be omitted, since it will never be called

  void resume() const {
    _handle.resume();
  }
private:
  coro_handle _handle;
};

using usart1 = Zhele::Usart1;
using usart2 = Zhele::Usart2;

task Task1()
{
  usart1::Init(9600);
  usart1::SelectTxRxPins<Zhele::IO::Pa9, Zhele::IO::Pa10>();
  usart1::Write("Init task1\r\n", 12);

  for(;;) {
    if (usart1::ReadReady()) {
      auto s = usart1::Read();
      if (s == '1')
        usart1::Write("Task1: you write '1'\r\n", 22);
      if (s == '2')
        usart1::Write("Task1: you write '2'\r\n", 22);
    }
    co_await std::suspend_always();
  }
}

task Task2()
{
  usart2::Init(9600);
  usart2::SelectTxRxPins<Zhele::IO::Pa2, Zhele::IO::Pa3>();
  usart2::Write("Init task2\r\n", 12);

  for (;;) {
    if (usart2::ReadReady()) {
      auto s = usart2::Read();
      if(s == '1')
        usart2::Write("Task2: you write '1'\r\n", 22);
      if(s == '2')
        usart2::Write("Task2: you write '2'\r\n", 22);
    }
    co_await std::suspend_always();
  }
}

int main()
{
  auto tasks = {Task1(), Task2()};
  
  for(;;)
  {
    for (auto& t : tasks)
      t.resume();
  }
}

Preemptive multitasking on events

Although the unconditional suspension of tasks is not without meaning, suspension is more often associated with waiting for some external event without which further execution of the task is impossible in principle. In our example, this event is the arrival of the next byte over the UART. In that case it is reasonable to pass the co_await operator not the standard suspend_always but some custom awaitable object, which we will now try to describe.

There are several requirements for a wait object:

  1. It must provide an awaiter suitable for the co_await operator.

  2. It must have a status flag (whether the expected event has occurred, i.e. whether execution of the coroutine may continue).

  3. It must hold a list of consumers waiting for this event (in our case we will limit ourselves to a single one).

Generally speaking, the third point is debatable, because pending tasks can be notified in other ways, for example by storing them somewhere separately.

In this case, the task type can be simplified by removing the coro_handle descriptor from its fields, since, unlike the previously proposed variant, the coroutine has no dedicated owner; the awaited event temporarily owns it. When an event occurs, the corresponding consumer is activated. At the current stage this happens directly in the interrupt handler, which makes the system preemptive.

The simplified task type (in Vladimirov’s lecture it is the resumable_no_own type):

task structure code
struct task {
  struct promise_type {
    using coro_handle = std::coroutine_handle<promise_type>;

    auto get_return_object() {
      return coro_handle::from_promise(*this);
    }

    // The following two methods both return suspend_never
    auto initial_suspend() {
      return std::suspend_never();
    }

    auto final_suspend() noexcept {
      // The task body contains an infinite loop, so this method should never be reached
      return std::suspend_never();
    }

    void unhandled_exception() {
      // This method should not be reached either
    }

    void* operator new(std::size_t size) {
      return MemoryManager::Allocate(size);
    }

    void operator delete(void* ptr) {
      // For the same reason (tasks run forever) this method should never be reached
      MemoryManager::Deallocate(ptr);
    }
  };

  using coro_handle = promise_type::coro_handle;

  // The constructor does nothing
  task(coro_handle handle) {}
};

The most important element is the event type; its code is given below.

event class code
class event {
  using coro_handle = std::coroutine_handle<>;

  struct awaiter {
    event& _event;
    coro_handle _handle = nullptr;

    awaiter(event& event) noexcept : _event(event) {}

    bool await_ready() const noexcept { return _event.is_set(); }

    void await_resume() noexcept { _event.reset(); }

    // Register a consumer of the event
    void await_suspend(coro_handle handle) {
      _handle = handle;
      _event.set_awaiter(this);
    }
  };

public:
  // Set (activate) the event
  void set() {
    _set = true;
    // Resume the corresponding task
    _consumer->_handle.resume();
  }

  // Check the event state
  bool is_set() {
    return _set;
  }
  
  // Reset the event
  void reset() {
    _set = false;
  }

  // Consumer registration method
  // A whole list of consumers could be kept here
  void set_awaiter(awaiter* consumer) {
    _consumer = consumer;
  }

  // Overloaded co_await operator, since the event type is not itself awaitable
  // A detailed explanation of why this overload is needed can be found in Vladimirov's lecture
  awaiter operator co_await() noexcept {
    return awaiter(*this);
  }

private:
  awaiter* _consumer = nullptr;
  bool _set = false;
};

With the coroutine being activated preemptively right from the interrupt handler, a separate scheduler loses its meaning, so the main loop in the main function is empty.

Full program code
#include <iopins.h>
#include <usart.h>

#include <coroutine>

class MemoryManager {
  static const uint32_t Capacity = 512;

public:
  static void* Allocate(std::size_t size) {
    _size += size;
    return &_data[_size - size];
  }

  static void Deallocate(void* ptr) {
    // Coroutine tasks live forever, so this method should never be reached
  }

private:
  static uint8_t _data[Capacity];
  static uint16_t _size;
};
uint8_t MemoryManager::_data[MemoryManager::Capacity];
uint16_t MemoryManager::_size;

struct task {
  struct promise_type {
    using coro_handle = std::coroutine_handle<promise_type>;

    auto get_return_object() {
      return coro_handle::from_promise(*this);
    }

    // The only difference from the previous implementation is in the following two methods
    auto initial_suspend() {
      return std::suspend_never();
    }

    auto final_suspend() noexcept {
      // The task body contains an infinite loop, so this method should never be reached
      return std::suspend_never();
    }

    void unhandled_exception() {
      // This method should not be reached either
    }

    void* operator new(std::size_t size) {
      return MemoryManager::Allocate(size);
    }

    void operator delete(void* ptr) {
      // For the same reason (tasks run forever) this method should never be reached
      MemoryManager::Deallocate(ptr);
    }
  };

  using coro_handle = promise_type::coro_handle;

  // The constructor does nothing
  task(coro_handle handle) {}
};

class event {
  using coro_handle = std::coroutine_handle<>;

  struct awaiter {
    event& _event;
    coro_handle _handle = nullptr;

    awaiter(event& event) noexcept : _event(event) {}

    bool await_ready() const noexcept { return _event.is_set(); }

    void await_resume() noexcept { _event.reset(); }

    // Register a consumer of the event
    void await_suspend(coro_handle handle) {
      _handle = handle;
      _event.set_awaiter(this);
    }
  };

public:
  // Set (activate) the event
  void set() {
    _set = true;
    // Resume the corresponding task
    _consumer->_handle.resume();
  }

  // Check the event state
  bool is_set() {
    return _set;
  }
  
  // Reset the event
  void reset() {
    _set = false;
  }

  // Consumer registration method
  // A whole list of consumers could be kept here
  void set_awaiter(awaiter* consumer) {
    _consumer = consumer;
  }

  // Overloaded co_await operator, since the event type is not itself awaitable
  // A detailed explanation of why this overload is needed can be found in Vladimirov's lecture
  awaiter operator co_await() noexcept {
    return awaiter(*this);
  }

private:
  awaiter* _consumer = nullptr;
  bool _set = false;
};

using usart1 = Zhele::Usart1;
using usart2 = Zhele::Usart2;

event usart1Rx;
event usart2Rx;

char u1, u2;

task Task1()
{
  usart1::Init(9600);
  usart1::SelectTxRxPins<Zhele::IO::Pa9, Zhele::IO::Pa10>();
  usart1::Write("Init task1\r\n", 12);
  usart1::EnableInterrupt(usart1::InterruptFlags::RxNotEmptyInt);

  for(;;) {
    co_await usart1Rx;

    if (u1 == '1')
      usart1::Write("Task1: you write '1'\r\n", 22);
    if (u1 == '2')
      usart1::Write("Task1: you write '2'\r\n", 22);
  }
}

task Task2()
{
  usart2::Init(9600);
  usart2::SelectTxRxPins<Zhele::IO::Pa2, Zhele::IO::Pa3>();
  usart2::Write("Init task2\r\n", 12);
  usart2::EnableInterrupt(usart2::InterruptFlags::RxNotEmptyInt);

  for (;;) {
    co_await usart2Rx;

    if(u2 == '1')
      usart2::Write("Task2: you write '1'\r\n", 22);
    if(u2 == '2')
      usart2::Write("Task2: you write '2'\r\n", 22);
  }
}

int main()
{
  Task1();
  Task2();
  
  for(;;) {}
}

extern "C" {
    void USART1_IRQHandler() {
        u1 = usart1::Read();

        usart1Rx.set();
        usart1::ClearInterruptFlag(usart1::InterruptFlags::RxNotEmptyInt);
    }

    void USART2_IRQHandler() {
        u2 = usart2::Read();

        usart2Rx.set();
        usart2::ClearInterruptFlag(usart2::InterruptFlags::RxNotEmptyInt);
    }
}

The code above contains an objectively ugly solution with the global variables u1 and u2. It is easy to get rid of them by storing the received byte in the event object, which allows writing concisely inside the coroutine: char s = co_await usartNRx;

Cooperative multitasking on events

The previous code can be upgraded to provide event-based cooperative multitasking: in the interrupt handler we forgo setting the event flag immediately and instead add the event object to a queue, from which its consumer will later be notified. This requires very few changes.

You can add a global event queue to the program:

#include <queue>

std::queue<event*> TasksQueue;

In the main loop (the scheduler), all that remains is to activate the queued events in order:

for(;;) {
  if (!TasksQueue.empty()) {
    TasksQueue.front()->set();
    TasksQueue.pop();
  }
}

In the interrupt handler, instead of activating the event immediately, add the event to the queue:

void USART1_IRQHandler() {
  u1 = usart1::Read();

  TasksQueue.push(&usart1Rx);
  usart1::ClearInterruptFlag(usart1::InterruptFlags::RxNotEmptyInt);
}
// The UART2 handler code is analogous

In this modification, when an event occurs, the current task still gets to run as far as its next co_await statement; that is, no preemption occurs.

Important! This code did not work with the -Os option, which I attribute to the use of std::queue. Without optimization (with -O0) everything works correctly.
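If std::queue is indeed the suspect, one heap-free alternative is a fixed-capacity ring buffer of event pointers. A minimal sketch (RingQueue is an illustrative name; overflow handling is deliberately left to the caller):

```cpp
#include <cassert>
#include <cstddef>

// Fixed-capacity ring buffer: no heap, predictable timing, usable in place
// of std::queue in the scheduler loop and interrupt handlers
template <typename T, std::size_t N>
class RingQueue {
    T _data[N] = {};
    std::size_t _head = 0;  // next element to pop
    std::size_t _tail = 0;  // next free slot
    std::size_t _count = 0;

public:
    bool empty() const { return _count == 0; }

    bool push(T value) {
        if (_count == N)
            return false;  // full: the caller decides how to handle the overflow
        _data[_tail] = value;
        _tail = (_tail + 1) % N;
        ++_count;
        return true;
    }

    T pop() {
        T value = _data[_head];
        _head = (_head + 1) % N;
        --_count;
        return value;
    }
};
```

It would be used the same way as TasksQueue, e.g. a RingQueue<event*, 8> global, with push returning false when the queue is full.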

Cooperative multitasking with priorities

An obvious development of the system is task prioritization. This can be done by simply adding another queue (or one queue per priority level), and support for dynamic task priorities comes almost “for free” if the priority belongs not to the task itself but to the event it awaits (@Saalur covered this topic in more detail), so let us consider exactly such an example.

To support priorities, it is enough to replace the single queue of the previous implementation with several (one per priority level), or to keep the queue sorted on insertion. For demonstration, let us assume there are two levels, high and low, and walk through the changes to the source code.

First of all, the event type gains a priority:

// Nested priority enumeration
enum class Priority
{
  Low,
  High
};

// A constructor with a parameter has appeared...
event(Priority priority) : _priority(priority) {}
// ...as well as a method returning the event's priority
Priority priority() const { return _priority; }

Global event variables must be created with the right priorities:

event usart1Rx(event::Priority::High);
event usart2Rx(event::Priority::Low);

In the main loop (the scheduler), all high-priority tasks must be processed first, and only then the low-priority ones:

for(;;) {
  if (!HighPriorityTasksQueue.empty()){
    HighPriorityTasksQueue.front()->set();
    HighPriorityTasksQueue.pop();
    continue;
  }
  if (!LowPriorityTasksQueue.empty()) {
    LowPriorityTasksQueue.front()->set();
    LowPriorityTasksQueue.pop();
  }
}

In the interrupt handler, the choice of queue now depends on the event’s priority. The current implementation is not particularly elegant, but the code is written purely for demonstration, so I think it is permissible.

void USART1_IRQHandler() {
  u1 = usart1::Read();

  (usart1Rx.priority() == event::Priority::High ? HighPriorityTasksQueue : LowPriorityTasksQueue).push(&usart1Rx);
  usart1::ClearInterruptFlag(usart1::InterruptFlags::RxNotEmptyInt);
}
// The UART2 handler code is analogous

It is easy to see that the possibilities for prioritization are almost limitless. For example, one can combine the priority of the coroutine itself with that of the wait object and choose the higher (or lower) of the two.
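That combining rule can be sketched in a few lines (the function name and the use of enumerator values as ranks are illustrative): treat the numeric value of the priority as its rank and take the maximum.

```cpp
#include <algorithm>
#include <cassert>

// Same two levels as in the article; the underlying value doubles as the rank
enum class Priority { Low = 0, High = 1 };

// Effective priority of a suspended task: the higher of the task's own
// priority and that of the event it awaits
Priority effective_priority(Priority task, Priority event) {
    return static_cast<Priority>(
        std::max(static_cast<int>(task), static_cast<int>(event)));
}
```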

Conclusion

In this article, I tried to examine the use of C++ coroutines for task scheduling, starting with the most primitive system and developing it step by step. Further upgrades can already be considered a kind of operating system, which @Saalur has covered nicely. I would also like to thank the respected @lamerok, who advised me on context switching (I highly recommend reading his article), although I decided not to include that topic here, since it does not apply to coroutines.

I hope these simple examples have demonstrated, at least to some extent, how the latest C++ standard can be applied to task scheduling.
