Telegram bot using Yandex.Disk (Python)

In 2022 I started looking at what was happening with programming languages. Python looked promising: I read about it, went through all sorts of comparisons, and chose it. Judging by the reviews, what attracted me was its support for data processing, neural networks and genetic algorithms, which I wanted to use to revive my roulette program (what else?). I started reading Lutz, downloaded some other books, and leafed through a few of them. Then I studied the toolkit: Python versions, Jupyter notebooks, Anaconda, VS Code, PyCharm. I wish someone had laid out clearly how to use PyCharm and some techniques for working with it, but on the Internet there is nothing of the sort (or I did not find it); people just copy-paste the same thing from each other. I tried the GUI libraries: the standard ones did not impress me, and Qt might be better, but I may not have tried it properly. I found Delphi4Python and went to see what was going on in that camp. I tried Alexandria and liked that there is a port for Python; it even works somehow, and I am studying it as far as I can. I also found their free PyScripter, liked it, and am staying with it for now. My tasks are not that complex, so it is enough.

I write helper scripts for work: an archiver, or something to rename 200-300 files at once. Then in 2023 I learned about Telegram and its bots. At work we mainly used WhatsApp for group chats, so I had not come across Telegram before. Phew, that is probably enough; let's talk about the bot.

Task

There is a huge body of regulatory and technical documentation (NTD): GOSTs, SP, OR, RD, STO, SNiP, SanPiN, and so on. Over the course of their work, every PTO engineer accumulates a lot of it. Many people carry external drives for this.

While a facility is being built and maintained, documentation of all kinds accumulates or is created, as-built and otherwise: the design, correspondence, AVK, AOSR, other acts, contracts, approvals, orders, permits. So I thought: let's put it all in the cloud and have a bot give access on request. Many of the engineering staff may need a document, even when you are in the field or on vacation (otherwise you have to run to a computer and send it, because as always no one can find anything). So send them all to the bot and let it take the rap. Well, I am lazy, so I have to learn programming to make the computer do the work itself.

In general, construction is now moving towards BIM technologies, and a bot like this belongs somewhere in that area. Plans for the future: you send the bot an incoming inspection act, and it fills in the inspection log itself... oh, dreams 🙂

Ideas and implementation

For now I decided to build the bot for regulatory documents only; once it is broken in, I can think about production needs. The files are stored in the cloud, sorted into folders. The bot should send a standard in response to a request. Most files will be PDFs, as a universal format that does not depend on encodings and the like.

Functionality

  • On request, the bot returns the standard. I put the NTD into cloud folders; the bot goes through the current folder and replies if it finds a match. At first it downloaded the file from the cloud to the local disk, sent it to the user, and then deleted the file so as not to litter the disk. The bottleneck here is that a file the user does not actually need may get downloaded and sent. And once the bot is deployed on a VPS/VDS, there may not be enough disk space, so I later changed it to send a link to the file instead.

  • The search folder can be switched with a command. I made the commands in both Russian and English, in both upper and lower case. For example, /gost (or /GOST, or /гост) switches the search to the folder with GOSTs; /vsn switches to the folder with VSN, and so on.

  • The /start, /about and /help commands are standard.

  • The /sms command sends a message to the developer, i.e. to me.

  • The /status command makes the bot report whether it is working and what the current search folder is.

  • You can send the bot an NTD file. When it does not find a standard, it says so and adds that if you send the file, it will be available after a check. It accepts PDF and DOC files and puts them in a separate folder in the cloud. I check them manually, because you never know, someone might send pictures from adult sites.

  • It keeps a report that records the user's id and what was requested, plus any messages or files sent.

  • Service commands are separate; they are needed to control the bot. They are alphanumeric codes so that no one can guess them. With such a command the bot can send the report (something like /show report) or upload it to the cloud (something like /report to the cloud). I also had the idea of a command that stops the bot completely, but I removed it for now because it causes errors.

Implementation:

We register the bot in Telegram and get a token, same as everywhere else.

In Yandex, via Yandex ID, we get a token for our cloud disk; this is also documented in detail.

I took someone's example of an echo bot and ran it.

I store the tokens in a separate file, cfg_token.py, which contains two lines with tokens:

telebot_token = 'gibberish from TG goes here'
## Yandex.Disk token
ya_dsk_token = 'gibberish from Yandex.Disk goes here'
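
If you would rather not keep a tokens file next to the bot at all, the same two values could be read from environment variables. This is only a minimal sketch: the variable names TELEBOT_TOKEN and YA_DSK_TOKEN and the helper load_tokens are my own assumptions and are not used anywhere in the bot.

```python
import os


def load_tokens(env=None):
    """Return (telegram_token, yandex_token) read from environment variables.

    The variable names are hypothetical; any names would do.
    """
    env = os.environ if env is None else env
    missing = [name for name in ("TELEBOT_TOKEN", "YA_DSK_TOKEN") if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return env["TELEBOT_TOKEN"], env["YA_DSK_TOKEN"]
```

Passing a plain dict instead of os.environ makes the function easy to try out without touching the real environment.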

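At first I used the PyTelegramBotAPI (Telebot) module, then switched to Aiogram: with many users it is supposedly better to have asynchrony. We also need the module for Yandex.Disk. Note that the code below uses the aiogram 2.x API (aiogram.utils.executor, dp.message_handler); aiogram 3 changed these, so the version is pinned here.

pip install "aiogram<3"
pip install yadisk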

Import block, everything is standard:

from aiogram import Bot, types
from aiogram.dispatcher import Dispatcher
from aiogram.utils import executor

import cfg_token
import yadisk
import glob, os
import sys
import time
from datetime import datetime

import sqlite3 as sl

There may be some unused ones: while you are building it, you try one thing, then another, 100500 times.

Initial settings:

bot = Bot(token=cfg_token.telebot_token)
dp = Dispatcher(bot)
codirovk = 'utf-8'
# Yandex.Disk token
y = yadisk.YaDisk(token=cfg_token.ya_dsk_token)
# an uploaded file must contain one of these in its name
format_name_files = ['ГОСТ', 'Гост', 'гост', 'GOST', 'Gost', 'gost', 'SP', 'sp',
                     'СП', 'сп', 'VSN', 'vsn', 'ВСН', 'всн', 'STO', 'sto', 'СТО', 'сто']
# an uploaded file must have one of these extensions
format_ext_files = ['.pdf', '.doc', '.docx', '.rtf']
search_dir = "GOST"  # default starting folder for the search

The encoding had to be specified explicitly for the report text file, hence the codirovk variable. Otherwise, in one file even Asian hieroglyphs sometimes started to come through.

## ----write data to the report----
def report_to_txt(str15):
    try:
        with open('Report.txt', 'a', encoding=codirovk) as file4:
            file4.write(str15)
    except Exception as e:
        # str() is required: concatenating a str with an exception raises TypeError
        print('Error: ' + str(e))
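
The same idea can be split into two small pieces, one that builds the line and one that appends it with an explicit encoding. This is just a sketch under my own assumptions: the names format_report_line and append_report and the timestamp in each line are not part of the bot.

```python
from datetime import datetime


def format_report_line(user_id, action, now=None):
    """Build one report line: timestamp, user id, and what happened."""
    stamp = (now or datetime.now()).strftime('%d.%m.%Y %H:%M:%S')
    return '\n' + stamp + ' id' + str(user_id) + ': ' + action


def append_report(path, line, encoding='utf-8'):
    """Append a line to the report file using an explicit encoding."""
    with open(path, 'a', encoding=encoding) as report:
        report.write(line)
```

With the encoding fixed in one place, Cyrillic text in the report should read back the same on Windows and on Linux.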

When I started, I practiced on the GOST folder. A special folder was created in the cloud for uploaded files and reports; I will rearrange it when needed. Commands are handled the same way as in other examples. I have shortened them here; in the real bot the explanatory texts are longer.

@dp.message_handler(commands=['start', 'старт'])  # /start command
async def process_start_command(message: types.Message):
    set_base_bot(message.from_user.id, 'GOST')
    await bot.send_message(message.from_user.id, "Hi! I am a helper bot! I search for NTD and send it to you")
    await bot.send_message(message.from_user.id, "Current NTD search folder: " + get_base_bot(message.from_user.id) + '. You can switch it with a command (see /help)')
    await bot.send_message(message.from_user.id, "Enter your NTD request (just the number or part of the title is enough):")

@dp.message_handler(commands=['help', 'хелп'])  # /help command
async def process_help_command(message: types.Message):
    # do not set the search folder here; take it from the database
    await bot.send_message(message.from_user.id, "Enter your NTD request (just the number or part of the title is enough) and send it to me. I will look around, and if I find it, I will send you the file, one at a time.")
    await bot.send_message(message.from_user.id, "Search is currently enabled in folder: " + get_base_bot(message.from_user.id))

The command to switch the search folder, the rest are done similarly:

@dp.message_handler(commands=['GOST', 'gost', 'ГОСТ', 'гост'])  # /GOST command
async def process_gost_command(message: types.Message):
    set_base_bot(message.from_user.id, 'GOST')
    await bot.send_message(message.from_user.id, "Current NTD search folder set to: " + get_base_bot(message.from_user.id))
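
Instead of one handler per folder, the spellings could be kept in a single table and resolved by one function. A minimal sketch, assuming the cloud folders are named GOST, SP, VSN and STO (only GOST and SP are named in this article; the rest, and the names FOLDER_ALIASES and folder_for_command, are hypothetical):

```python
# Hypothetical alias table: every accepted spelling maps to a cloud folder name.
FOLDER_ALIASES = {
    'gost': 'GOST', 'гост': 'GOST',
    'sp': 'SP', 'сп': 'SP',
    'vsn': 'VSN', 'всн': 'VSN',
    'sto': 'STO', 'сто': 'STO',
}


def folder_for_command(text):
    """Return the folder for a command like '/ГОСТ', or None if it is unknown."""
    if not text.startswith('/'):
        return None
    # take the first word, drop the slash and an optional '@botname' suffix
    command = text.split()[0][1:].split('@')[0]
    # lower() makes upper-case and mixed-case spellings match the table
    return FOLDER_ALIASES.get(command.lower())
```

Lower-casing the command means the table only needs one Latin and one Cyrillic spelling per folder.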

The command for messaging the developer uses the report_to_txt function shown earlier:

@dp.message_handler(commands=['sms', 'смс'])  # /sms command: message to the developer
async def process_sms_command(message: types.Message):
    report_to_txt('\nUser id' + str(message.from_user.id) + ' sent a message: ' + message.text)
    await bot.send_message(message.from_user.id, "Your message has been sent to the developer")

User request. There may be more than one match for what the user requested, so I show up to 7 of the documents found and the total count; otherwise someone will enter a request like '5' and get 400 documents. The file itself is sent only when the request narrows the result down to exactly one document.

For a long time I struggled to get a short link to the file, and then somewhere in a corner of the Internet I caught a glimpse of a working variant. Many people ask about this, but there is no direct answer. HERE IT IS: y.publish(file) followed by y.get_meta(file).public_url (who knew that public_url had to be appended at the end). I process the request like this:

@dp.message_handler(content_types=['text'])  # receive a message from the user
async def get_text_messages(message: types.Message):
    search_dir = get_base_bot(message.from_user.id)
    report_to_txt('\nUser id' + str(message.from_user.id) + ' requested a search in folder ' + search_dir + ': ' + message.text)
    await bot.send_message(message.from_user.id, 'Starting the search in folder ' + search_dir + ': ' + message.text)
    if y.check_token():
        # look in the folder for a document whose name contains the request
        if not y.is_dir('/' + search_dir):
            await bot.send_message(message.from_user.id, 'Folder ' + search_dir + ' was not found. Something broke. Sorry.')
        else:
            Spis = []
            for item in y.listdir(search_dir):
                if message.text in item['name']:
                    if len(Spis) < 7:
                        await bot.send_message(message.from_user.id, 'Found document: ' + item['name'])
                    Spis.append(item['name'])
            if len(Spis) == 0:
                await bot.send_message(message.from_user.id, 'Sorry, no such document has been found yet.')
                await bot.send_message(message.from_user.id, 'But if you send it to me here, I will add it after a check.')
            if len(Spis) == 1:
                # ------------OPTION 2: send a link to the file-------------------
                y.publish('/' + search_dir + '/' + Spis[0])  # make the file public
                # send the link
                await bot.send_message(message.from_user.id, y.get_meta('/' + search_dir + '/' + Spis[0]).public_url)
                # ------------OPTION 1: send the file to Telegram via the local disk
                # await bot.send_message(message.from_user.id, 'Uploading. Please wait...')
                # download to the local disk
                # y.download('/' + search_dir + '/' + Spis[0], Spis[0])
                # send to Telegram
                # f = open(Spis[0], "rb")
                # await bot.send_document(message.from_user.id, f)
                # f.close()
            if len(Spis) > 1:
                await bot.send_message(message.from_user.id, 'Documents found: ' + str(len(Spis)) + '. Please refine your request:')
    else:
        await bot.send_message(message.from_user.id, 'Sorry, the disk is unavailable for some reason. Please try again later')
        # OPTION 1: when the file is downloaded from the cloud to the local disk
        # remove_files() - the function that deletes the downloaded file goes here
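
The "show up to 7, count them all" rule can be pulled out into a small function that is easy to check without Telegram or the cloud. A minimal sketch; find_matches is a hypothetical helper, and unlike the handler above it ignores case, which is my own assumption about what users would expect:

```python
def find_matches(names, query, limit=7):
    """Return (names to show, total number of matches) for a query.

    Matching ignores case, so 'гост 15' also finds 'ГОСТ 15-2010.pdf'.
    """
    needle = query.casefold()
    matches = [name for name in names if needle in name.casefold()]
    return matches[:limit], len(matches)
```

In the handler, names would be the item['name'] values collected from y.listdir, and the link would be sent only when the total equals 1.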

I process the uploaded file from the user like this:

@dp.message_handler(content_types=['document'])  # receive a file from the user
async def handle_file(message):
    try:
        pr1 = 0
        pr2 = 0
        # check whether the file name contains a required document type
        for item in format_name_files:
            if item in message.document.file_name:
                pr1 += 1
        # check whether the file has an allowed extension
        for item in format_ext_files:
            if message.document.file_name.lower().endswith(item):
                pr2 += 1
        # if the file is acceptable, download it
        if (pr1 > 0) and (pr2 > 0):
            file_id = message.document.file_id
            file = await bot.get_file(file_id)
            file_path = file.file_path
            # ------------Option 1: save files to the local disk-------------------
            ## await bot.download_file(file_path, os.path.join('Download', message.document.file_name))
            # ------------Option 2: upload files to the Yandex cloud------------
            # cloud path for files uploaded by users
            src = "/GOST/Download/" + message.document.file_name
            print(src)
            # upload the user's file to the cloud
            if y.is_file(src):  # Yandex raises an error if the file already exists, so rename it
                src = "/GOST/Download/Double-" + datetime.now().strftime("%d.%m.%Y-%H.%M.%S") + '-' + message.document.file_name
            y.upload(await bot.download_file(file_path), src)
            await bot.send_message(message.from_user.id, 'Uploaded. Thank you. I will add it to my database after a check :)')
            # record the upload in Report.txt
            report_to_txt('\nUser id=' + str(message.from_user.id) + ' sent a file: ' + message.document.file_name)
        else:
            await bot.send_message(message.from_user.id, 'Sorry, but the file you sent does not contain the NTD type (ГОСТ, СП, ВСН, etc.) in its name and/or has the wrong format; .pdf or .doc is required')
    except Exception as e:
        # str() is required: concatenating a str with an exception raises TypeError
        print('Error: ' + str(e))
        await bot.send_message(message.from_user.id, 'I probably cannot upload it, something broke and reports an error: ' + str(e))
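
The two checks and the duplicate-name rule can also be written as plain functions. This is a sketch, not the bot's code: is_acceptable and cloud_path are hypothetical names, and comparing the real extension via os.path.splitext is my own tightening (a plain substring check would also accept a name like 'gost.pdf.exe').

```python
import os
from datetime import datetime

NAME_MARKERS = ('гост', 'gost', 'сп', 'sp', 'всн', 'vsn', 'сто', 'sto')
ALLOWED_EXT = ('.pdf', '.doc', '.docx', '.rtf')


def is_acceptable(file_name):
    """True if the name mentions a document type and has an allowed extension."""
    stem, ext = os.path.splitext(file_name.lower())
    return ext in ALLOWED_EXT and any(marker in stem for marker in NAME_MARKERS)


def cloud_path(file_name, exists, now=None):
    """Build the upload path; prefix a timestamp if the name is already taken."""
    if not exists:
        return '/GOST/Download/' + file_name
    stamp = (now or datetime.now()).strftime('%d.%m.%Y-%H.%M.%S')
    return '/GOST/Download/Double-' + stamp + '-' + file_name
```

Lower-casing the name first means one spelling per marker is enough, instead of listing 'ГОСТ', 'Гост' and 'гост' separately.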

The bot is almost ready; add polling:

if __name__ == '__main__':
    executor.start_polling(dp)

##  executor.start_polling(dp, skip_updates=True)
## The skip_updates=True parameter skips incoming messages that accumulated while the bot was offline, if we do not need them

But as you can see from the code, something else unfamiliar shows up there:

get_base_bot(message.from_user.id)
set_base_bot(message.from_user.id, 'GOST')

The following problem arose: if Vanya selects the GOST folder and, while he is typing his request, Katya switches the search folder to SP, Vanya gets results from the SP folder. Cool. "What to do, what to do? Something must be done. Hmm."

I saw two options: a file (txt or similar) or a database. I chose Python's built-in sqlite3. The small database stores the user id and the selected folder. Two functions are needed: one sets the folder, the other reads it. In fact they are almost identical, and I am thinking of merging them into one: when the folder name is empty (''), it acts as get, and when a name is given, as set. Maybe I will fix that later. A program can be improved endlessly.

As a result, these functions are:

# store user data: set the search folder per user so users do not override each other
def set_base_bot(user_id, name_dir):
    con = sl.connect('databasebot.db')
    with con:  # commits on success, rolls back on an exception
        cur = con.cursor()
        cur.execute("CREATE TABLE IF NOT EXISTS user_seadir(id INTEGER NOT NULL PRIMARY KEY, seadir TEXT)")
    with con:
        cur = con.cursor()
        cur.execute("SELECT seadir FROM user_seadir WHERE id = ?", (user_id,))
        dat = cur.fetchone()
        if dat is None:
            cur.execute('INSERT INTO user_seadir (id, seadir) values(?, ?)', (user_id, name_dir))
        else:
            cur.execute('UPDATE user_seadir SET seadir = ? WHERE id = ?', (name_dir, user_id))
    con.close()

# read user data: get the search folder for the user
def get_base_bot(user_id):
    con = sl.connect('databasebot.db')
    with con:
        cur = con.cursor()
        cur.execute("CREATE TABLE IF NOT EXISTS user_seadir(id INTEGER NOT NULL PRIMARY KEY, seadir TEXT)")
    with con:
        cur = con.cursor()
        cur.execute("SELECT seadir FROM user_seadir WHERE id = ?", (user_id,))
        dat = cur.fetchone()
        if dat is not None:
            result = dat[0]
        else:
            # first visit: store the default folder before returning it
            cur.execute('INSERT INTO user_seadir (id, seadir) values(?, ?)', (user_id, 'GOST'))
            result = 'GOST'
    con.close()  # close only after the work is done, then return
    return result

I still need to figure out the details of these functions: where and how to commit, and whether the file gets overwritten. In general there are many questions about the nuances, but few people write about them, and the information on the Internet varies.
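
On the commit question, as far as the standard sqlite3 module goes: using the connection in a with block commits when the block finishes normally and rolls back if it raises, but it does not close the connection, and connecting to an existing file does not overwrite it. A minimal sketch of the same table with these rules applied; open_db, set_folder and get_folder are hypothetical names, and INSERT OR REPLACE is my substitute for the separate INSERT/UPDATE branches:

```python
import sqlite3


def open_db(path='databasebot.db'):
    """Open the database and make sure the table exists."""
    con = sqlite3.connect(path)
    with con:  # commits on success, rolls back if the block raises
        con.execute(
            "CREATE TABLE IF NOT EXISTS user_seadir("
            "id INTEGER NOT NULL PRIMARY KEY, seadir TEXT)"
        )
    return con


def set_folder(con, user_id, name_dir):
    """Store the search folder for a user, new or existing."""
    with con:
        con.execute(
            "INSERT OR REPLACE INTO user_seadir (id, seadir) VALUES (?, ?)",
            (user_id, name_dir),
        )


def get_folder(con, user_id, default='GOST'):
    """Return the stored folder, or the default for an unknown user."""
    row = con.execute(
        "SELECT seadir FROM user_seadir WHERE id = ?", (user_id,)
    ).fetchone()
    return row[0] if row else default
```

The connection is opened once and closed by the caller with con.close(), so no explicit commit() calls are needed inside the functions.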

Well, now you will laugh. Get worked as it should, but set produced an error (the code above is already corrected):

def set_base_bot(user_id, name_dir):
…
      cur.execute('UPDATE user_seadir SET seadir = ? WHERE id = ?', (user_id, name_dir))

I searched everywhere for the cause, and it turned out to be simple: the two arguments were in the wrong order, and Python cannot guess what I meant 🙂

UPD. While writing the article, I merged the two functions above into one:
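
A mistake like this can be hard to spot, because SQLite may accept the statement and simply match no rows. One way to notice is to look at rowcount after the UPDATE. A small self-contained sketch with an in-memory database (the user id 42 and the folder SP are made-up values):

```python
import sqlite3

con = sqlite3.connect(':memory:')
con.execute("CREATE TABLE user_seadir(id INTEGER NOT NULL PRIMARY KEY, seadir TEXT)")
con.execute("INSERT INTO user_seadir (id, seadir) VALUES (?, ?)", (42, 'GOST'))

# arguments swapped: this asks for "SET seadir = 42 WHERE id = 'SP'"
swapped = con.execute(
    "UPDATE user_seadir SET seadir = ? WHERE id = ?", (42, 'SP')
).rowcount

# correct order: folder first, id second
correct = con.execute(
    "UPDATE user_seadir SET seadir = ? WHERE id = ?", ('SP', 42)
).rowcount

folder = con.execute("SELECT seadir FROM user_seadir WHERE id = 42").fetchone()[0]
con.close()
```

Here swapped ends up as 0 and correct as 1, so a check like "if rowcount == 0" after an UPDATE would flag the wrong order right away.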

def sget_base_bot(user_id, name_dir):
    con = sl.connect('databasebot.db')
    cur = con.cursor()
    with con:
        cur.execute("CREATE TABLE IF NOT EXISTS user_seadir(id INTEGER NOT NULL PRIMARY KEY, seadir TEXT)")
    with con:
        cur.execute("SELECT seadir FROM user_seadir WHERE id = ?", (user_id,))
        dat = cur.fetchone()
        if name_dir == '':                    # get: return the stored folder
            if dat is not None:
                result = dat[0]
            else:
                # first visit: store the default folder before returning it
                cur.execute('INSERT INTO user_seadir (id, seadir) values(?, ?)', (user_id, 'GOST'))
                result = 'GOST'
        else:                                 # set: store the search folder
            if dat is None:
                cur.execute('INSERT INTO user_seadir (id, seadir) values(?, ?)', (user_id, name_dir))
            else:
                cur.execute('UPDATE user_seadir SET seadir = ? WHERE id = ?', (name_dir, user_id))
            result = name_dir
    cur.close()
    con.close()  # close before returning; code after a return never runs
    return result

The function is now two-in-one, set and get in a single bottle: if name_dir is a non-empty string, it works as set and stores the folder; if '' (an empty string) is passed, it returns the stored folder. The search folders themselves are listed in the reply to the /help command and in the menu. The range will expand over time.

end UPD

When a user first appears in the bot, I set the GOST folder for them. After that they choose for themselves, with /help and the menu to assist.

I also made a menu for the bot. I will describe that too, since there is confusion about it on the Internet. I prepared a text with the commands; not even a file, you can just type the text for copy-pasting in Notepad or in Word. Here is mine:

Then proceed as described everywhere until you reach the message, and then: 1) press the button, 2) a prompt appears, 3) paste the list of commands and send it. That's it.

Perhaps my trouble was with the multi-line message. But in the end the menu worked out:

Structure

The structure of the bot in the folder is like this:

1.__pycache__ – folder

2.bot.pyw

3.cfg_token.py

4.databasebot.db

5.report.txt

Folder 1 is created by Python itself. It stores the bytecode of cfg_token.py there. It is better not to show it to anyone, since the tokens can be extracted from it.

Files 4 and 5 are created by the program itself; if deleted, it creates them again.

The bot itself consists of item 2 (the bot) and item 3 (the tokens).

And this is how it looks on Yandex.Disk:

I changed the bot's extension from .py to .pyw so that no console window is shown (I work on Windows).

Phew, that is all. But no, we forgot about hosting.

A little about hosting

So, the hosting options:

1) On your own computer: it is started, there is an Internet connection, the bot works. So far mine runs this way, i.e. while I am at work.

2) On the Internet there are VPS/VDS options, mostly paid. Since I am making a free bot, I am not ready to spend money. For now PythonAnywhere seems to remain as a more or less viable option.

3) A smartphone, with two sub-options. It is essentially a computer that is online and running all the time, so why not a server? One downside is that it may drain the battery, but then what doesn't:

  • Pydroid3: it works, but crashes sometimes, if not often. Some of this app's cool modules/packages are paid, but we were lucky with aiogram and yadisk;

  • UserLAnd with Ubuntu CLI: works even better. You need to install Python (the built-in version seems to be 2.7), the packages, and also mc (something like Norton Commander under DOS) to pick up the bot files from the downloads folder. There, in the root, I made a bot folder and put the bot files into it.

Commands that I used:

cd – go to the home directory,

ls – directory listing,

cd bot – go to the bot folder,

python3 bot.py – start the bot (I change the extension back from .pyw to .py)

You could also set it up as a service so that it starts and runs by itself, but I have not got around to it. I am also thinking of using an old phone as a server for this.

(This is what I got when I ran the bot on my computer and on the smartphone at the same time)

(In the morning I turn off the bot on Ubuntu, and it gets started on my computer)

  • Termux refused to work at all for this; maybe it is because I am not a programmer or not a Linux user, but I could not beat it.

Plans

  1. Asynchronous access to the built-in database: I do not know how relevant it is yet. Maybe it can be skipped. Someone seems to have put together something like that on pip, but if I am not mistaken, it was for Postgres.

  2. I made a service command to stop the bot (something like /stop bot); doing it through sys.exit() produces a bunch of errors. I studied and tried many ways to interrupt the script, and they all produce errors. As I understand it, this is due to the bot's asynchrony; perhaps aiogram is the cause. Somewhere someone posted a version of a bot with a task list, a gradual shutdown of the bot's processes, and a clean, error-free stop. But that is not for me yet; it would be difficult. If I do not overcomplicate it, it might work. I am currently studying this issue.

  3. Ask Yandex for a separate disk for the bot, move everything there, make the folders, and fill in the regulations. To do this, it is enough to change the token in the bot. UPD: done.

  4. I do not like searching through the folder. One idea is to create a separate table in the database with links to the files, plus a service command for indexing. Whenever I add files to the cloud, I would tell the bot to reindex. Another is to fill such a table as requests come in: search the database first, and if nothing is found, search the disk.

  5. And polish the code to perfection.
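
The indexing idea from item 4 could look roughly like this. It is only a sketch under my own assumptions: the table name file_index and the functions rebuild_index and search_index are hypothetical, and the list of names is expected to come from the cloud listing (the item['name'] values).

```python
import sqlite3


def rebuild_index(con, folder, names):
    """Replace the stored file list for one folder (the 'indexing' command)."""
    with con:  # one transaction: either the whole folder is reindexed or nothing changes
        con.execute(
            "CREATE TABLE IF NOT EXISTS file_index("
            "folder TEXT NOT NULL, name TEXT NOT NULL, PRIMARY KEY (folder, name))"
        )
        con.execute("DELETE FROM file_index WHERE folder = ?", (folder,))
        con.executemany(
            "INSERT INTO file_index (folder, name) VALUES (?, ?)",
            [(folder, name) for name in names],
        )


def search_index(con, folder, query):
    """Return indexed names in a folder that contain the query text."""
    rows = con.execute(
        "SELECT name FROM file_index "
        "WHERE folder = ? AND instr(name, ?) > 0 ORDER BY name",
        (folder, query),
    ).fetchall()
    return [row[0] for row in rows]
```

I used instr() rather than LIKE so that characters such as % and _ in a request are treated as plain text; the match is case-sensitive, like the folder search in the bot.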

Conclusion

The result is a tool for accessing and filling the NTD database via a messenger.

Looks like that is it.

P.S. 1. Constructive criticism is emphatically welcome.

P.S. 2. I have not finished experimenting with the bot, so there may be glitches, but for whoever needs it: Normativkabot.

P.S. 3. If Telegram suddenly goes down, I think it will not be hard to sketch a GUI with similar functionality and spread it by word of mouth.
