Asynchronous telegram bot on bash, through the eyes of a C# programmer

There are many articles on the Internet about creating simple bash Telegram bots that do things like alerting. Usually it comes down to an infinite loop that polls the Telegram API every few seconds. But what if I want more than such a solution can provide? My wish list:

  • The conversation is conducted asynchronously in several chats

  • There are more chats than processes in the application

  • The bot remembers what stage each conversation is at

  • The process of writing a bot should at least resemble working with popular OOP languages

  • The bot should be easily scalable to a larger number of users.

This is the story of an attempt to contort myself into building a small but convenient tool for writing bots that meet these requirements.

Why, and most importantly why?


I was driven by sporting interest: can a tool be designed to be as convenient as possible for me, a person who simply loves bash but does not work in it day-to-day, has a poor grasp of bash programming practices, and is generally used to C#?

Preparing to write

Features of the language

To start with, I would like to highlight those things in bash that, in my subjective opinion, can create difficulties during coding:

  • No multithreading support

  • No classes or structures

  • There are only two scopes of variables, global and local (within functions)

  • Incomplete support for dictionaries (associative arrays)
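The last point is worth a small illustration: bash 4+ does have associative arrays, but they cannot be returned from a function or passed by value; a nameref (bash 4.3+) is a common workaround. The names below (config, fill_defaults) are illustrative:

```shell
#!/usr/bin/env bash
declare -A config=([host]="localhost" [port]="8080")

# A function cannot return an associative array; a nameref (bash 4.3+)
# that mutates the caller's array is the usual workaround.
fill_defaults() {
    local -n target=$1        # nameref to the caller's array
    target[timeout]="30"
}

fill_defaults config
echo "${config[host]}:${config[port]}, timeout=${config[timeout]}"
# → localhost:8080, timeout=30
```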

Auxiliary features

Error processing

Before writing the bot itself, I decided to implement tools for convenient error handling.

Features I considered necessary:

  • Getting a stack trace in one function

# Global function for getting a stack trace.
trace_call() {
    local frame=0
    local info
    local stack_trace=""
    # Walk the whole depth of the call stack.
    while info=$(caller $frame); do
        # Join the stack into one line with "->".
        stack_trace="$stack_trace -> $info"
        ((frame++))
    done

    echo "$stack_trace"
}
export -f trace_call

# Usage example.
trace_call

  • Logging in one function

# Global logging function.
default_logger(){
    # The message and severity level come in as parameters.
    local message=$1
    local level=$2
    local current_time=$(date +"%Y-%m-%d %H:%M:%S")
    local stack_trace=$(trace_call)
    # Store the record as JSON.
    local json=$(cat <<EOF
{
  "time": "$current_time",
  "message": "$message",
  "level": "$level",
  "stack_trace": "$stack_trace"
}
EOF
)
    echo "$json" >> "$LOG_FILE"
}
export -f default_logger

# Usage example.
default_logger "Sum func error" "ERROR"
  • Checking whether any script succeeded and, if not, running a handler (an analogue of catch), also in one function

# Global error-handling function.
catch() {
    # Exit code of the previous command.
    local exit_code=$?
    local catch_function=$1

    # Check whether it succeeded.
    if [ $exit_code -ne 0 ]; then
        # Run the handler script that was passed in.
        eval $catch_function
    fi
}
export -f catch

# Usage example.
sum_func
catch '
  # Code that runs in case of an error.
  default_logger "Sum func error" "ERROR"
'

(Instead of the catch function, the || operator could probably have been used, but catch looked prettier.)
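For comparison, the same handling via ||, sketched here with stub versions of sum_func and default_logger so the example is self-contained:

```shell
#!/usr/bin/env bash
# Stubs standing in for the article's functions.
sum_func() { return 1; }                    # always fails, for the demo
default_logger() { echo "[$2] $1"; }        # simplified logger

# The right-hand side of || runs only when the left side fails.
sum_func || default_logger "Sum func error" "ERROR"
# → [ERROR] Sum func error
```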

Automatic initialization of all functions within the project

I wanted to guarantee that, while writing the project, any function from the rest of the project could always be used: write func1 and know for sure it is already initialized and no “func1: command not found” will appear. So I needed a mechanism that initializes all functions from all files before the bot itself starts.

It turned out to have a simple implementation. I made it a rule to write all code strictly inside functions, and divided the project into “packages”. A package is a directory containing .sh files with a set of functions. Then, to connect a “package”, it was enough to source each file inside the directory, in any order.

Thus, the entire tool became a set of functions that use each other. And to launch the entire bot, it became necessary to connect all the “packages”, and then simply launch several of the functions provided by the “packages”.

Below is a function that takes a directory path and recursively sources every .sh file inside it.

using() {
    local directory=$1

    for path in "$directory"/*
    do
        if [[ -f "$path" && "$path" == *.sh ]]; then
            source "$path"
        elif [[ -d "$path" ]]; then
            using "$path"
        fi
    done
}
export -f using 
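To see using() in action, here is a self-contained demo that builds a throwaway “package” in a temp directory and connects it. The greet function and the directory layout are illustrative:

```shell
#!/usr/bin/env bash
# using() from above, plus a throwaway "package" built in a temp directory.
using() {
    local directory=$1
    for path in "$directory"/*; do
        if [[ -f "$path" && "$path" == *.sh ]]; then
            source "$path"
        elif [[ -d "$path" ]]; then
            using "$path"
        fi
    done
}

pkg=$(mktemp -d)
echo 'greet() { echo "hello from package"; }' > "$pkg/greet.sh"

using "$pkg"          # connect the "package"
greet                 # → hello from package
rm -rf "$pkg"
```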

In the future, it turned out that the decision to write all the code only inside functions was correct for other reasons:

  • The global scope was no longer polluted with local variables

  • By reading a function's name, you could always tell what a given piece of code does

  • Code was reused more often

Simplified generation of background processes

I wanted to quickly add cyclic background processes and also automatically clean them up when the program is finished.

The add_job and job_runner_start functions were implemented. The first adds the function and parameters passed to it to the JOBS_LIST array; the second starts the functions from JOBS_LIST as separate processes.

add_new_pid_for_killing and process_killer_start were written to automatically clean up all child processes when the application exits. The add_new_pid_for_killing function appends a PID to the CHILD_PIDS array, and process_killer_start installs a termination-signal handler that kills the processes from that array.

JOBS_LIST=()

add_job(){
    # The function that should become a separate process.
    local function="$1"
    # How many concurrent processes are needed.
    local num_of_process=$2
    # Timeout in seconds between restarts of the function.
    local timeout=$3

    # Store everything passed in as one ":"-separated record.
    local combined_record="$function:$num_of_process:$timeout"
    JOBS_LIST+=("$combined_record")
}


job_runner_start(){
    for job in "${JOBS_LIST[@]}"
    do
        IFS=':' read -r func num_of_process timeout <<< "$job"

        for ((i=1; i<=num_of_process; i++))
        do
            # Start the function as a separate process.
            job_start "$func" $timeout &
            catch 'default_logger "error of run job: $func" "ERROR"'
            local pid=$!
            # Register the child process among those to be killed later.
            add_new_pid_for_killing $pid
            catch 'default_logger "error of write pid" "ERROR"'
        done
    done
}

job_start(){
    local func="$1"
    local timeout=$2

    while true; do
        $func
        catch 'default_logger "error of start of $func" "ERROR"'
        sleep $timeout
    done
}
CHILD_PIDS=()

add_new_pid_for_killing(){
    local pid=$1
    CHILD_PIDS+=("$pid")
}
export -f add_new_pid_for_killing

cleanup(){
    for pid in "${CHILD_PIDS[@]}"
    do
        kill "$pid" 2>/dev/null
    done
}

# Install the EXIT event handler.
process_killer_start(){
  trap cleanup EXIT
}
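The cleanup half of this machinery can be demonstrated standalone. A minimal self-contained sketch of the same pattern: background loops whose PIDs are collected and then killed by an EXIT trap (ticker is an illustrative job):

```shell
#!/usr/bin/env bash
# Minimal standalone version of the pattern: background loops whose
# PIDs are collected and killed by an EXIT trap.
CHILD_PIDS=()

cleanup() {
    for pid in "${CHILD_PIDS[@]}"; do
        kill "$pid" 2>/dev/null
    done
}
trap cleanup EXIT

ticker() { while true; do sleep 1; done; }     # illustrative background job

ticker & CHILD_PIDS+=("$!")
ticker & CHILD_PIDS+=("$!")

sleep 2      # the bot's real work would happen here
# when the script exits, the trap kills both tickers
```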

Solution architecture

For message processing, I set the following conditions.

  1. The order of message processing within a single chat must not be violated

  2. Each chat must be stored as two structures: a FIFO channel with messages and some current state of the chat.

  3. It should be possible to add an unlimited number of handler processes, so that each of them is equally loaded with message processing.

The best architecture I could come up with was to add a balancer process. The balancer issues commands to specific workers to process messages from specific chats.

In this architecture, load distribution falls on the balancer's shoulders, and the workers turn into something like state machines.

Interaction of processes


Message-Reader process algorithm:

  1. Get the list of new messages from the API

  2. Distribute the new messages into the FIFO files of their chats

  3. Notify the balancer about each new message
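The distribution step (2) and notification step (3) can be sketched in isolation. CHATS_FIFO_DIR and BALANCER_FIFO are illustrative names, a canned JSON payload stands in for a real getUpdates response, and jq is required:

```shell
#!/usr/bin/env bash
# Illustrative paths; a real deployment would make BALANCER_FIFO a mkfifo.
CHATS_FIFO_DIR=$(mktemp -d)
BALANCER_FIFO="$CHATS_FIFO_DIR/balancer"

# Steps 2-3: fan messages out per chat and notify the balancer.
# Step 1 (fetching) would be roughly:
#   curl -s "https://api.telegram.org/bot$TOKEN/getUpdates?offset=$offset"
distribute_updates() {
    local updates=$1
    echo "$updates" | jq -c '.result[].message' | while read -r msg; do
        local chat_id=$(echo "$msg" | jq -r '.chat.id')
        echo "$msg" | jq -r '.text' >> "$CHATS_FIFO_DIR/$chat_id"   # step 2
        echo "new:$chat_id" >> "$BALANCER_FIFO"                     # step 3
    done
}

# Canned payload standing in for a real API response.
sample='{"result":[{"update_id":1,"message":{"chat":{"id":42},"text":"hi"}}]}'
distribute_updates "$sample"
cat "$CHATS_FIFO_DIR/42"        # → hi
```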

Balancer algorithm:

  1. Receive a notification about a new message or about a handler becoming free

  2. Randomly pick a waiting chat (if there is one)

  3. Randomly pick a free handler (if there is one)

  4. Send the processing command

Worker algorithm:

  1. Wait for a command from the balancer

  2. Get the chat id

  3. Read the chat's current state and the last unprocessed message

  4. Execute the business logic, save the new state

  5. Notify the balancer that the handler is free
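One worker iteration can be sketched as follows. WORKERS_FIFO_DIR and BALANCER_FIFO are illustrative names, process_message is a stub for the real handler, and a plain file stands in for the command FIFO so the demo does not block:

```shell
#!/usr/bin/env bash
# Illustrative names; process_message is a stub for the real handler.
WORKERS_FIFO_DIR=$(mktemp -d)
BALANCER_FIFO="$WORKERS_FIFO_DIR/balancer"

process_message() { echo "processed chat $1"; }

# One cycle: wait for a command, handle the chat, report back.
worker_iteration() {
    local chat_id
    read -r chat_id < "$WORKERS_FIFO_DIR/$$"     # blocks on a real FIFO
    process_message "$chat_id"                   # steps 3-4 live here
    echo "free:$$" >> "$BALANCER_FIFO"           # step 5
}

# Demo: a plain file simulates the balancer's command, so read returns at once.
echo "42" > "$WORKERS_FIFO_DIR/$$"
worker_iteration                                 # → processed chat 42
```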

The writing process

Unfortunately, I will not paste all the code here and analyze every implementation detail, but I will mention the most significant points. There is also a link to the project on GitHub, with all the code and launch instructions.

Balancing method:

balance() {
# Iterate over the existing workers.
# WORKER_FIFO_LIST is a dictionary of the form worker_pid:current_chat_id.
  for i in "${!WORKER_FIFO_LIST[@]}"; do
      IFS=":" read -r worker_pid chat_id <<< "${WORKER_FIFO_LIST[i]}"

      # Check whether any chat is already assigned.
      if [[ "$chat_id" == "0" ]]; then

          # Iterate over the current chats.
          # CHAT_DICTIONARY is a dictionary of the form chat_id:num_of_new_messages.
          for current_chat_id in "${!CHAT_DICTIONARY[@]}"; do

              # Look for a chat with unprocessed messages
              # that is not being processed at the moment.
              if [[ "${CHAT_DICTIONARY[$current_chat_id]}" -gt 0
                        && ! "${WORKER_FIFO_LIST[*]}" =~ "$current_chat_id" ]]; then
                  echo "$worker_pid:$current_chat_id"
                  WORKER_FIFO_LIST[i]="$worker_pid:$current_chat_id"

                  ((CHAT_DICTIONARY[$current_chat_id]--))

                  if [[ "${CHAT_DICTIONARY[$current_chat_id]}" -le 0 ]]; then
                      unset CHAT_DICTIONARY[$current_chat_id]
                  fi

                  # Send the command.
                  local fifo="$WORKERS_FIFO_DIR/$worker_pid"
                  echo "$current_chat_id" > "$fifo"
                  echo "Assigned chat_id $current_chat_id to worker $worker_pid"
                  break
              fi
          done
      fi
  done
}

A trivial message handler with three states:

  1. Chat not started

  2. “SOME_STATE”

  3. “OTHER_STATE”

process_message_base() {
    local chat_id=$1
    local state_file="$CHAR_STATES_DIR/${chat_id}.json"
    local fifo_file="$BASE_FIFO_DIR/$chat_id"
    local state=()

    # Try to load the state.
    if [[ -f "$state_file" ]]; then
        readarray -t state < <(jq -r '.[]' "$state_file")
    else
        state=("SOME_STATE")
        # Save the new state.
        printf '%s\n' "${state[@]}" | jq -R . | jq -s . > "$state_file"
        : # Some logic
    fi

    if [[ "${state[0]}" == "SOME_STATE"* ]]; then
        : # Some logic
    fi

    if [[ "${state[0]}" == "OTHER_STATE"* ]]; then
        : # Other logic
    fi
}
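The state save/load round-trip used by the handler can be checked in isolation (requires jq; the state values are illustrative):

```shell
#!/usr/bin/env bash
state_file=$(mktemp)
state=("SOME_STATE" "user_name=alice")

# Save: turn the bash array into a JSON array of strings.
printf '%s\n' "${state[@]}" | jq -R . | jq -s . > "$state_file"

# Load: read the JSON array back into a bash array.
readarray -t loaded < <(jq -r '.[]' "$state_file")

echo "${loaded[0]}"     # → SOME_STATE
echo "${loaded[1]}"     # → user_name=alice
```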

Conclusions

As a result, I got a tool of about 300 lines of code that lets me quickly write scalable, asynchronous Telegram bots. In that respect, I achieved my goal.

This project also has room to grow. In the future, the files and FIFO channels could be replaced with Kafka or RabbitMQ, and the chat states could be stored in Redis. Parallelizing the balancer's work is another option.

But there is a nuance.

While writing it, I learned about many features of bash that genuinely make it hard to write code calmly (for example, a dictionary cannot be returned from a function). For this reason, even with the current tool, writing complex business logic would take an unforgivably long time. An asynchronous Telegram bot in bash has become a little more accessible to me, but it still belongs in the “perversions” section.
But it was exciting! =)
