main implementation options

In this article we will look at the main tools for implementing Elxir distributed computing.

Basics

Let's start with the main character – the virtual machine BEAM. BEAM is the heart of both Erlang and Elixir. Unlike most virtual machines, BEAM was originally designed with the needs of telecommunications systems in mind, where fault tolerance and low latency are important.

Main features of BEAM:

  1. Easy processes: BEAM manages millions of lightweight processes with minimal overhead. These processes are isolated.

  2. Multithreading and Parallelism: BEAM automatically distributes processes across available processor cores.

  3. Actor model: In Elixir and Erlang, processes do not share memory and communicate with each other exclusively through messages.

  4. Hot code swap: BEAM supports hot code update.

To understand how easy the process is on BEAM, let's look at a simple example:

spawn(fn -> IO.puts("Hello from a process!") end)

This code creates a new process that prints a message to the console. Creating and terminating the process requires virtually no resources.

Now let's look at the three main components on which distributed systems are built in Elixir: Node, Process And Message Passing.

Nodes

Node — is an independent instance of the BEAM virtual machine. In the context of distributed systems, a node is a single node that can run on a single server or on multiple different servers.

To connect one node to another, use the command:

Node.connect(:"node2@localhost")

You can run multiple nodes on the same or different hosts, and each node will communicate with the others via messages. All nodes in the network must use the same cookie for authentication.

Processes

Processes in Elixir are the fundamental building blocks of distributed systems. Each process is isolated and communicates with other processes only through messages.

An example of a simple process that can receive messages:

defmodule SimpleProcess do
  def start do
    spawn(fn -> loop() end)
  end

  defp loop do
    receive do
      {:msg, from, text} ->
        IO.puts("Received message: #{text}")
        send(from, {:ok, "Message processed"})
        loop()
    end
  end
end

# Запуск процесса
pid = SimpleProcess.start()

# Отправка сообщения
send(pid, {:msg, self(), "Hello, process!"})

# Получение ответа
receive do
  {:ok, reply} -> IO.puts("Reply: #{reply}")
end

A process can receive and process messages, and send responses.

Message transmission

Message passing in Elixir is done using operators send And receiveThis model of interaction is called actor model and allows you to create systems in which processes do not block each other.

As mentioned earlier, processes in Elixir are isolated from each other and do not share memory, which reduces the likelihood of multithreading errors.

Now let's look at how to set up and run a distributed cluster on Elixir.

Running multiple nodes is the first step towards creating a distributed cluster.

This is how you can run multiple nodes on your local PC:

# Запуск первой ноды
iex --sname node1 --cookie my_secret_cookie

# Запуск второй ноды в новом терминале
iex --sname node2 --cookie my_secret_cookie

Once the nodes are running, they can be connected to each other:

# Подключение node1 к node2
Node.connect(:"node2@localhost")

# Проверка подключения
Node.list() # должно вывести [:node2@localhost]

Once the nodes are connected, you can organize interaction between processes on different nodes. To do this, just use the same commands send And receivebut specifying the PID of the process on another node.

Example of sending a message to another node:

# На node1
pid = Node.spawn(:"node2@localhost", fn ->
  receive do
    {:msg, text} -> IO.puts("Message from node1: #{text}")
  end
end)

# На node2
send(pid, {:msg, "Hello from node1!"})

The process is created on node2but sending the message is done with node1.

It is important not only to organize interaction between processes, but also to ensure their reliability. This is achieved with the help of supervisors — OTP components that automatically restart processes if they fail.

An example of a simple supervisor that manages processes on different nodes:

defmodule ClusterSupervisor do
  use Supervisor

  def start_link(_) do
    Supervisor.start_link(__MODULE__, :ok, name: __MODULE__)
  end

  def init(:ok) do
    children = [
      {Task, fn -> Node.spawn(:"node1@localhost", MyWorker, :start_link, []) end},
      {Task, fn -> Node.spawn(:"node2@localhost", MyWorker, :start_link, []) end}
    ]

    Supervisor.init(children, strategy: :one_for_one)
  end
end

# Запуск супервизора
ClusterSupervisor.start_link(nil)

The supervisor manages two processes running on different nodes. If either process fails, the supervisor will automatically restart it.

Useful tools

Distributed task

Task — is a basic abstraction in Elixir for managing asynchronous tasks, which simplifies starting processes and processing results. In the context of distributed systems Task allows you to run tasks on different nodes, making the code more parallel and, therefore, faster.

Built-in module Task can be used to create distributed tasks that run on remote nodes.

Let's say there is a task that is time-consuming and resource-intensive for a single computer. You can distribute this task across multiple nodes in a cluster using Task:

defmodule DistributedTaskExample do
  def start(nodes) do
    nodes
    |> Enum.map(&Task.async(fn -> perform_task(&1) end))
    |> Enum.map(&Task.await/1)
  end

  defp perform_task(node) do
    Node.spawn(node, fn ->
      # симуляция длительной операции
      Process.sleep(1000)
      IO.puts("Task completed on node: #{Node.self()}")
    end)
  end
end

# Использование:
nodes = [:"node1@localhost", :"node2@localhost"]
DistributedTaskExample.start(nodes)

Create and run tasks on multiple nodes using Task.async/1 And Task.await/1Each node performs its task, and the results are synchronously collected on the node from which the launch was initiated.

Task good for relatively simple tasks that can be easily parallelized without requiring complex state management or error handling.

GenServer

GenServer — is a component in the OTP ecosystem on Elixir, which is a server process with the ability to handle requests from other processes. It has a set of abstractions for building robust servers that run on a single node or in a distributed system.

GenServer – a great choice when you need to maintain state between calls, handle messages, asynchronous requests, or manage complex computations.

To start using GenServerlet's create a simple example of a distributed server that will maintain a common state between all connected nodes:

defmodule DistributedCounter do
  use GenServer

  # API
  def start_link(initial_value) do
    GenServer.start_link(__MODULE__, initial_value, name: __MODULE__)
  end

  def increment() do
    GenServer.call(__MODULE__, :increment)
  end

  def get() do
    GenServer.call(__MODULE__, :get)
  end

  # Callbacks
  def init(initial_value) do
    {:ok, initial_value}
  end

  def handle_call(:increment, _from, state) do
    {:reply, state + 1, state + 1}
  end

  def handle_call(:get, _from, state) do
    {:reply, state, state}
  end
end

# запуск на одной ноде
{:ok, _pid} = DistributedCounter.start_link(0)

# увеличение счётчика с другой ноды
Node.connect(:"node2@localhost")
:rpc.call(:"node2@localhost", DistributedCounter, :increment, [])

# получение значения счётчика
DistributedCounter.get() # должно вернуть 1

DistributedCounter is a distributed counter that increments its value when the method is called increment/0. :rpc.call/4allows you to call functions on remote nodes.

Swarm

Swarm — is a library for dynamically distributing processes in an Elixir cluster. It automatically manages the distribution of processes between nodes, ensuring a uniform load and high fault tolerance. Swarm integrates well with GenServer and other OTP components.

Suppose you want to create a service that handles many tasks, and you want these tasks to be distributed across all available nodes. Using Swarmyou can easily achieve this:

defmodule MyWorker do
  use GenServer
  use Swarm, strategy: :gossip

  def start_link(_) do
    GenServer.start_link(__MODULE__, %{}, name: :my_worker)
  end

  def handle_call(:do_work, _from, state) do
    # Симуляция работы
    Process.sleep(1000)
    {:reply, :ok, state}
  end
end

# Запуск работы
Swarm.register(:my_worker, MyWorker, [])
GenServer.call(:my_worker, :do_work)

Swarm will automatically distribute the process MyWorker to available nodes.

Horde

Horde — is a library for creating distributed supervisors and registries.

Example of use Horde to create a distributed supervisor that manages a set of processes:

defmodule MyApp.DistributedSupervisor do
  use Horde.DynamicSupervisor

  def start_link(_) do
    Horde.DynamicSupervisor.start_link(
      name: __MODULE__,
      strategy: :one_for_one,
      members: :auto
    )
  end

  def start_worker(worker_module, args) do
    Horde.DynamicSupervisor.start_child(__MODULE__, {worker_module, args})
  end
end

defmodule MyWorker do
  use GenServer

  def start_link(_) do
    GenServer.start_link(__MODULE__, %{}, name: :my_worker)
  end

  def handle_call(:do_work, _from, state) do
    Process.sleep(1000)
    {:reply, :ok, state}
  end
end

# Запуск супервизора
{:ok, _pid} = MyApp.DistributedSupervisor.start_link(nil)

# Добавление worker в распределённый супервизор
MyApp.DistributedSupervisor.start_worker(MyWorker, [])

Horde controls the supervisor, which distributes processes across nodes. If one of the nodes fails, Horde will automatically move processes to other nodes.


In conclusion, we invite all system analysts and those interested in this area to an open lesson on August 20 – there we will consider the differences between user scenarios (Use Cases) and user stories (User Story). You can sign up for the lesson for free on the online course page.

Similar Posts

Leave a Reply

Your email address will not be published. Required fields are marked *