Sending and processing media messages

Greetings! Once again I want to thank you for your support and interest in my work. It's nice to know that the information I share is useful to many of you and, in some cases, even causes a stir.

Today we will once again dive into the world of Telegram bots on Aiogram 3.x and look at sending and processing media messages.

If you have not yet read my previous article, where I covered in detail the basics of working with a Message handler, I highly recommend doing so. In that article I discussed such important aspects as:

  • Copying messages (copy method)

  • Replacing text in a message (edit_text method)

  • Replacing the keyboard in messages (edit_reply_markup method)

  • Sending messages using the forward, answer, reply and send_message methods

This knowledge is critical to understanding how to work with media messages, as they use similar techniques with slight differences depending on the type of message. For example, to work with photos, the methods answer_photo, reply_photo, send_photo are used, and for documents – answer_document, reply_document, send_document, and so on.

It is also important to remember that the rules that apply to text messages also apply to media messages, except that a media message has no text (message.text is empty). Where a description is supported (for example, for a photo or video), the caption is used instead (message.caption).

Methods for editing text in media are replaced with methods for editing captions (edit_caption), and the edit_media method is used to replace media content. Replacing keyboards for all media messages that support them is no different from replacing keyboards for text messages.

Today we will take a closer look at the different types of media content and their features:

  • Photo

  • Video

  • Video messages

  • Audio

  • Audio messages

  • Sending a media group

  • Sending animations, stickers and more

  • Simulating bot actions

  • A few tricks that let you work around some of Telegram's limitations and will save you a lot of time if you decide to work with bots seriously

To fully understand working with the message object and the features of media messages, I strongly recommend studying my previous article before delving into today's material.

Let's start studying media in Telegram bots on Aiogram 3.x!

Sending files

Regardless of the type of media file, there are three ways to send it:

  • Physically uploading the file (bytes) via FSInputFile

  • Via a file ID (it doesn't matter whether it's a photo, video, audio, etc.)

  • Via a URL (this works for most file types; it is important that the link leads directly to the media file, and not just to the page where the file is located).
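As a rough illustration of the three options, here is a minimal sketch that guesses which kind of reference a string is. The function name classify_media_source is hypothetical (it is not part of aiogram), and the logic is only a heuristic: anything that is neither an http(s) link nor an existing local file is assumed to be a file ID.

```python
import os
from urllib.parse import urlparse


def classify_media_source(value: str) -> str:
    """Return 'url', 'path' or 'file_id' for a media reference (heuristic)."""
    scheme = urlparse(value).scheme.lower()
    if scheme in ("http", "https"):
        return "url"      # can be passed to the send method as a plain string
    if os.path.isfile(value):
        return "path"     # should be wrapped in FSInputFile before sending
    return "file_id"      # assumed to be a Telegram file ID, passed as a plain string
```

Only the 'path' case needs FSInputFile; the URL and the file ID are passed to methods like answer_photo as ordinary strings.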

Let's get started. To make it easier to demonstrate sending files, I have prepared a folder called all_media in the root of the bot project. I placed the following file types in it:

  • Several regular videos

  • A small square video (for sending as a round video message)

  • A couple of small audio files

  • Several photos of different sizes

I also prepared several links to media content (images). I strongly recommend that you take a break now and prepare your own media for testing. At the very least, it will be more interesting.

In the root file create_bot.py I import os and write the path to the folder like this:

all_media_dir = os.path.join(os.path.dirname(os.path.abspath(__file__)), 'all_media')

Next, I will simply import this path (variable) in the bot handlers.

I strongly recommend that you also specify paths to folders and files this way. Then you will not have to worry about file paths when you move your bot, for example, from Windows to Ubuntu (where paths are written differently).

To point to a specific file, you can then write:

photo_file = os.path.join(all_media_dir, 'photo.jpg')

Agree, it's convenient.
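The same idea can be written with pathlib. This is only a sketch of an alternative, not what the article's project uses: the helper names media_dir and media_file are hypothetical, and the project root is passed in explicitly (in create_bot.py it could be Path(__file__).resolve().parent).

```python
from pathlib import Path


def media_dir(project_root: Path) -> Path:
    """Folder with test media, located in the project root."""
    return project_root / "all_media"


def media_file(project_root: Path, name: str) -> Path:
    """Full path to one media file; fails early if the file is missing."""
    path = media_dir(project_root) / name
    if not path.is_file():
        raise FileNotFoundError(f"No such media file: {path}")
    return path
```

Failing early with FileNotFoundError makes a typo in a file name visible at once, instead of at the moment the bot tries to upload the file. If a plain string is needed, the result can be converted with str().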

1. Physically sending files (bytes) via FSInputFile

This method allows you to upload files from your local device (server or local computer). To use it, you first need to import:

from aiogram.types import Message, FSInputFile

Let's now send a voice message and a regular audio file. To do this, I suggest linking each kind of sending to a special command: /send_audio, /send_voice and so on.

We write a handler that sends an audio file:

@start_router.message(Command('send_audio'))
async def cmd_start(message: Message, state: FSMContext):
    audio_file = FSInputFile(path=os.path.join(all_media_dir, 'new_message_tone.mp3'))
    await message.answer_audio(audio=audio_file)

If you do not want the bot to reveal the real file name, pass the filename argument to FSInputFile with the name you want to show.

Pay attention to how we passed the path to the file. Using the same principle, you can pass the path to any media file (document, photo, video, animation, etc.).

As I said above, the method for answering with media differs little from the one for text. The only thing you need to provide is the audio itself: a path to the file, a link to it (namely, a direct download link), or its file ID (more on this a little later).

Along with the audio, you can send a keyboard (inline or text, that is, a reply keyboard) and add a caption (plain or formatted) via the caption argument. Later you will learn why you need to be careful with text keyboards, but for now an example:

@start_router.message(Command('send_audio'))
async def cmd_start(message: Message, state: FSMContext):
    audio_file = FSInputFile(path=os.path.join(all_media_dir, 'new_message_tone.mp3'))
    msg_id = await message.answer_audio(audio=audio_file, reply_markup=main_kb(message.from_user.id),
                                        caption='My <u>formatted</u> caption for the <b>file</b>')
    print(msg_id.message_id)

In the example I printed msg_id.message_id, but this is not the only thing in this object that may interest us. After sending the file, we can capture its file_id. This is very useful and important.

I printed msg_id.audio.file_id to the console, and this is what I got:

CQACAgIAAxkDAAIBu2ZsgniQlznR1VxJqbHB2pwjKuq2AALmSAACGKhoS96YeMoflmQgNQQ

I hope you understand the principle. Whether media messages (photos, videos, audio, etc.) are sent to the bot or the bot sends them itself, you can capture the file_id of these media files.

Essentially, the file ID is a kind of reference to a file stored on Telegram's servers. Thanks to it, the bot that owns the file can resend it without physically uploading it again, simply by passing this reference.

Let's try sending this audio via ID.

@start_router.message(Command('send_audio'))
async def cmd_start(message: Message, state: FSMContext):
    # audio_file = FSInputFile(path=os.path.join(all_media_dir, 'new_message_tone.mp3'))
    audio_id = 'CQACAgIAAxkDAAIBu2ZsgniQlznR1VxJqbHB2pwjKuq2AALmSAACGKhoS96YeMoflmQgNQQ'
    msg_id = await message.answer_audio(audio=audio_id, reply_markup=main_kb(message.from_user.id),
                                        caption='My <u>formatted</u> caption for the <b>file</b>')

We look and see that the bot was able to send audio.

In addition to the obvious advantage of saving disk space, this approach lets you send files that weigh hundreds of megabytes almost instantly. For example, suppose someone sent you a large file on Telegram, say a whole movie. You liked it and decided to forward it to another person.

The transfer takes seconds, and you do not have to download the file first. Sending files by their identifiers (file IDs) works on the same principle.
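One way to use this in practice is to upload a file from disk only once and reuse its file ID afterwards. Below is a minimal sketch of such a cache. The class name FileIdCache and the JSON file are assumptions made for illustration; in a real project this would more likely live in a database.

```python
import json
from pathlib import Path


class FileIdCache:
    """Remembers which local file was already uploaded under which file_id."""

    def __init__(self, storage: Path):
        self._storage = storage
        # Load previously saved IDs, if the storage file already exists.
        if storage.exists():
            self._ids = json.loads(storage.read_text(encoding="utf-8"))
        else:
            self._ids = {}

    def get(self, local_path: str):
        """Return the saved file_id or None if the file was never uploaded."""
        return self._ids.get(local_path)

    def remember(self, local_path: str, file_id: str) -> None:
        """Save the file_id received from Telegram after the first upload."""
        self._ids[local_path] = file_id
        self._storage.write_text(json.dumps(self._ids, indent=2), encoding="utf-8")
```

In a handler the flow would be: ask the cache for an ID, send either that ID or an FSInputFile, and after the first real upload call remember() with the file_id from the returned message. Keep in mind that a file ID is valid only for the bot that received it, so the cache cannot be shared between different bots.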

Here's how to retrieve IDs for different content types:

  • Audio – message.audio.file_id

  • Document – message.document.file_id

  • Video – message.video.file_id

And so on. The only exception is photos, and we will focus on them now.
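The attribute names above can be folded into one small helper. This is a sketch with a hypothetical function name, extract_file_id; it relies only on attribute access, so it works with any object shaped like a Message. Photos are handled separately because they arrive as a list of sizes, which the next section explains.

```python
# Message attributes that hold a single media object with a file_id.
# "animation" is checked before "document" because Telegram also fills
# the document field for animations.
MEDIA_ATTRS = ("animation", "audio", "document", "video", "video_note", "voice", "sticker")


def extract_file_id(message):
    """Return the file_id of the media in a message, or None if there is none."""
    photo_sizes = getattr(message, "photo", None)
    if photo_sizes:
        return photo_sizes[-1].file_id  # the last size is the largest one
    for attr in MEDIA_ATTRS:
        media = getattr(message, attr, None)
        if media is not None:
            return media.file_id
    return None
```

Such a helper is handy in a catch-all handler that logs the file ID of whatever media the user sends.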

Sending and processing photos

You surely know that photos in Telegram can be sent with or without compression. In addition, photos have their own previews (smaller copies). Because of this, each photo always has several identifiers, and to get the best-quality version we need to do the following:

msg_id.photo[-1].file_id

That is, by sending one photo you are actually sending a whole list of its sizes, and the best-quality one is the last in this list (index -1). Let's look at an example:

@start_router.message(Command('send_photo'))
async def cmd_start(message: Message, state: FSMContext):
    photo_file = FSInputFile(path=os.path.join(all_media_dir, 'photo_2024-06-14_20-13-40.jpg'))
    msg_id = await message.answer_photo(photo=photo_file, reply_markup=main_kb(message.from_user.id),
                                        caption='My <u>formatted</u> caption for the <b>photo</b>')
    print(msg_id.photo[-1].file_id)

We see that the photo was sent, and in the console I received the photo identifier:

AgACAgIAAxkDAAIBwGZshp7dSSQi0VKxt6RKJgseyMHxAALM4DEbGKhoS4tvyaZWY29DAQADAgADeAADNQQ

We copy and try to send via the identifier.

@start_router.message(Command('send_photo'))
async def cmd_start(message: Message, state: FSMContext):
    # photo_file = FSInputFile(path=os.path.join(all_media_dir, 'photo_2024-06-14_20-13-40.jpg'))
    photo_id = 'AgACAgIAAxkDAAIBwGZshp7dSSQi0VKxt6RKJgseyMHxAALM4DEbGKhoS4tvyaZWY29DAQADAgADeAADNQQ'
    msg_id = await message.answer_photo(photo=photo_id, reply_markup=main_kb(message.from_user.id),
                                        caption='My <u>formatted</u> caption for the <b>photo</b>')
    print(msg_id.photo[-1].file_id)

Everything worked out. Great.

Now let's send the photo via a link (you can also send other types of content via a link, but it is important that the link leads to the file, and not just to the page where the media content is stored).

@start_router.message(Command('send_photo'))
async def cmd_start(message: Message, state: FSMContext):
    # photo_file = FSInputFile(path=os.path.join(all_media_dir, 'photo_2024-06-14_20-13-40.jpg'))
    # photo_id = 'AgACAgIAAxkDAAIBwGZshp7dSSQi0VKxt6RKJgseyMHxAALM4DEbGKhoS4tvyaZWY29DAQADAgADeAADNQQ'
    photo_url="https://indirimlerce.com/wp-content/uploads/2023/02/phyton-ile-neler-yapilabilir.jpg"
    msg_id = await message.answer_photo(photo=photo_url, reply_markup=main_kb(message.from_user.id),
                                        caption='My <u>formatted</u> caption for the <b>photo</b>')
    print(msg_id.photo[-1].file_id)

Everything worked out. Choose the file sending format that is convenient for you. Let's continue.
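Since the link has to point at the file itself, a quick sanity check before sending can save a failed request. The sketch below is only a heuristic based on the file extension in the URL path; the function name looks_like_direct_media_url is hypothetical, and a link without an extension can still serve a valid file.

```python
import mimetypes
from urllib.parse import urlparse


def looks_like_direct_media_url(url: str) -> bool:
    """Guess from the URL path whether the link points at a media file."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https"):
        return False
    # guess_type looks only at the extension; the file is not downloaded.
    mime_type, _ = mimetypes.guess_type(parsed.path)
    if mime_type is None:
        return False
    return mime_type.split("/")[0] in ("image", "video", "audio")
```

For the photo link used above the check passes, while a link to a gallery page without a file extension would be rejected.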

Now let's look at the methods edit_caption (rewriting the caption of a media message) and edit_media. Before we continue, I want to draw your attention to a very important point: editing the media or the caption will not be possible if you have attached a text keyboard to your media message.

Say you send a video with a caption and a text keyboard. Sending works without problems, but when you call edit_caption or edit_media you will get the error "message can't be edited".

In this situation, the following solutions are possible:

  1. Send the media either without any keyboard or with an inline keyboard. Then the caption, the keyboard and the media itself can be edited without problems.

  2. Save the object of the sent message, take the media ID and caption from it, delete the original message and send the media again with the new caption.

Let's try:

@start_router.message(Command('send_video'))
async def cmd_start(message: Message, state: FSMContext):
    video_file = FSInputFile(path=os.path.join(all_media_dir, 'IMG_3998.MP4'))
    msg = await message.answer_video(video=video_file, reply_markup=main_kb(message.from_user.id),
                                     caption='My formatted caption for the file')
    await asyncio.sleep(2)
    await msg.edit_caption(caption='New caption for my video.')

We get an error:

aiogram.exceptions.TelegramBadRequest: Telegram server says - Bad Request: message can't be edited

Just remove the keyboard and try again:

@start_router.message(Command('send_video'))
async def cmd_start(message: Message, state: FSMContext):
    video_file = FSInputFile(path=os.path.join(all_media_dir, 'IMG_3998.MP4'))
    msg = await message.answer_video(video=video_file, caption='My formatted caption for the file')
    await asyncio.sleep(2)
    await msg.edit_caption(caption='New caption for my video.')

And everything works fine (it will also work correctly with an inline keyboard).

Now I'll show you how to get around the inability to change the caption when a text keyboard is attached:

@start_router.message(Command('send_video'))
async def cmd_start(message: Message, state: FSMContext):
    video_file = FSInputFile(path=os.path.join(all_media_dir, 'IMG_3998.MP4'))
    msg = await message.answer_video(video=video_file, reply_markup=main_kb(message.from_user.id),
                                     caption='My formatted caption for the file')
    await asyncio.sleep(2)
    await message.answer_video(video=msg.video.file_id, caption='New caption for the same video',
                               reply_markup=main_kb(message.from_user.id))
    await msg.delete()

Everything worked out.

Please note: the edit_caption method also overwrites the keyboard. That is, if the media message had an inline keyboard and you do not pass reply_markup to edit_caption, the keyboard will be removed.
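To keep the inline keyboard, it can simply be passed back when editing. Here is a minimal sketch; the helper name edit_caption_keep_keyboard is hypothetical, and it assumes msg is the message object returned by a send method, where reply_markup holds the attached inline keyboard.

```python
async def edit_caption_keep_keyboard(msg, new_caption: str):
    """Edit the caption and pass the current inline keyboard back so it is not lost."""
    return await msg.edit_caption(caption=new_caption, reply_markup=msg.reply_markup)
```

If the message has no inline keyboard, msg.reply_markup is None and the call behaves like a plain edit_caption.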

edit_media method

This method takes one required argument: media. It must be an instance of one of the classes InputMediaAnimation, InputMediaDocument, InputMediaAudio, InputMediaPhoto or InputMediaVideo. They are all imported from aiogram.types. There is an interesting point here: you can replace one file type with another. For example, a photo with a caption can be replaced with a video with a caption. It all depends on your imagination.

Using video as an example, it is written like this:

new_video_file = FSInputFile(path=os.path.join(all_media_dir, 'IMG_4044.MP4'))
media = InputMediaVideo(media=new_video_file, caption='New video with a new caption.')

Please note that you can pass a new caption together with the media, but be careful: the caption is passed as an argument to InputMediaVideo, not to edit_media itself.

Let's look at a specific example to make it clear:

@start_router.message(Command('send_video'))
async def cmd_start(message: Message, state: FSMContext):
    video_file = FSInputFile(path=os.path.join(all_media_dir, 'IMG_3998.MP4'))
    msg_1 = await message.answer_video(video=video_file,
                                       caption='My <b>formatted caption</b> for the file')
    await asyncio.sleep(2)
    await msg_1.edit_caption(caption='New caption for video 1')

    await asyncio.sleep(2)
    new_video_file = FSInputFile(path=os.path.join(all_media_dir, 'IMG_4044.MP4'))
    await msg_1.edit_media(media=InputMediaVideo(media=new_video_file, caption='New video with a new caption.'),
                           reply_markup=inline_kb())

Here we have combined answer_video, edit_caption and edit_media. I advise you to save this code somewhere, for example by bookmarking the article. Information like this is hard to find in the context of aiogram 3.

I hope this is clear. If you have questions, write in the comments and I'll help.

An example of sending a voice message and a video message:

@start_router.message(Command('send_voice'))
async def cmd_start(message: Message, state: FSMContext):
    await message.answer_voice(voice=FSInputFile(
        path=os.path.join(all_media_dir, 'krasivyie-snyi-nevinnost-zvezdnyiy-fon-zvukovyie-effektyi-43378.mp3')))


@start_router.message(Command('send_video_note'))
async def cmd_start(message: Message, state: FSMContext):
    await message.answer_video_note(video_note=FSInputFile(path=os.path.join(all_media_dir, 'IMG_4044.MP4')))

For a video message to be sent in a round shape, the source video must be square. From personal experience, though, if you want to imitate video messages, it is better to record them separately in Telegram and save them through the admin panel.

Here's a small example. It has no admin panel, but you'll get the point.

@start_router.message(F.video_note)
async def cmd_start(message: Message, state: FSMContext):
    print(message.video_note.file_id)

Of course, a real project would need an FSM and a connected database, but we'll get to that. For now we simply catch the video message with the magic filter F.video_note and print its identifier to the console.

The result is the following ID:

DQACAgIAAxkBAAICKGZspGExG2ZPTe6cxgrHFgl9V8caAALvSgACGKhoS8XEd0xdU4AKNQQ

We send:

@start_router.message(Command('send_video_note'))
async def cmd_start(message: Message, state: FSMContext):
    await message.answer_video_note(video_note="DQACAgIAAxkBAAICKGZspGExG2ZPTe6cxgrHFgl9V8caAALvSgACGKhoS8XEd0xdU4AKNQQ")

We see that the video message was successfully sent.

Sending a media group

The media group is the strangest and, in my opinion, the most unfinished element of the Telegram API. Although a media group looks like a separate object, it is treated as a collection of individual messages.

There is no handler or filter that catches a media group as a whole. When you need to process files from a media group, you have to process each photo and video separately (a mixed media group can contain only photos and videos). That alone is not so bad, but aiogram 3 is asynchronous, so each item arrives as its own update and processing turns into a madhouse.
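A common way to tame this is to buffer incoming messages by their media_group_id and process the album once, after a short pause. The sketch below shows the idea with plain asyncio. The class name AlbumCollector is hypothetical, and the fixed delay is a heuristic, since Telegram does not tell the bot when the last item of a group has arrived.

```python
import asyncio
from collections import defaultdict


class AlbumCollector:
    """Gathers messages that share a media_group_id into one list."""

    def __init__(self, delay: float = 0.5):
        self.delay = delay  # how long to wait for the rest of the album
        self._albums = defaultdict(list)

    async def add(self, message):
        """Return the whole album for its first message and None for the others."""
        group_id = message.media_group_id
        if group_id is None:
            return [message]  # a single message, not part of a media group
        self._albums[group_id].append(message)
        if len(self._albums[group_id]) > 1:
            return None  # the call made for the first item will return the album
        await asyncio.sleep(self.delay)
        album = self._albums.pop(group_id)
        return sorted(album, key=lambda m: m.message_id)
```

In a handler you would call `album = await collector.add(message)` and return early when the result is None, so the album is processed exactly once. The same logic is often moved into a middleware so that handlers receive the complete list.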

I recently wrote a project where the task was to transfer posts from a Telegram channel to Odnoklassniki, and every post there was a media group, always video + photo + text. If there is interest, I'll tell you how I solved that task.

Now that I've shared my pain, I'll show you how to send a media group. This part is straightforward. To send a media group, we need to import InputMediaVideo (you already know how to work with it) and InputMediaPhoto from aiogram.types.

Next, we need to build a list of media objects. The list can include from 2 to 10 items. I'll assemble my list, show the code and the result, and then we'll discuss it.

@start_router.message(Command('send_media_group'))
async def cmd_start(message: Message, state: FSMContext):
    photo_1 = InputMediaPhoto(type="photo",
                              media=FSInputFile(path=os.path.join(all_media_dir, 'photo_2024-06-05_09-32-15.jpg')),
                              caption='Caption for the <b>WHOLE</b> media group')
    photo_2 = InputMediaPhoto(type="photo",
                              media=FSInputFile(path=os.path.join(all_media_dir, 'photo_2024-06-14_20-13-40.jpg')))
    photo_3 = InputMediaPhoto(type="photo",
                              media=FSInputFile(path=os.path.join(all_media_dir, 'photo_2024-06-05_09-32-15.jpg')))
    video_1 = InputMediaVideo(type="video",
                              media=FSInputFile(path=os.path.join(all_media_dir, 'IMG_4045.MP4')))
    photo_4 = InputMediaPhoto(type="photo",
                              media=FSInputFile(path=os.path.join(all_media_dir, 'photo_2024-06-14_20-16-27.jpg')))
    video_2 = InputMediaVideo(type="video",
                              media=FSInputFile(path=os.path.join(all_media_dir, 'IMG_3978.MP4')))

    media = [photo_1, photo_2, photo_3, video_1, photo_4, video_2]
    await message.answer_media_group(media=media)

Features to pay attention to:

  • You cannot attach a keyboard to a media group (the only workaround is to attach a text keyboard to some message sent before the media group).

  • If you add a caption only to the first item of the media group, it is shown as the caption for the whole group. If you add captions to several items, the user has to open each item to read its caption.

  • A mixed media group can combine only photos and videos. Documents and audio files can be grouped only with files of the same type.
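Because of the 10-item limit, a longer list of media has to be split into several groups before sending. A minimal sketch of such splitting follows; the function name split_into_groups is hypothetical.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")

MAX_GROUP_SIZE = 10  # the media group limit mentioned above


def split_into_groups(items: Sequence[T], size: int = MAX_GROUP_SIZE) -> List[List[T]]:
    """Split a long list of media objects into chunks of at most `size` items."""
    return [list(items[i:i + size]) for i in range(0, len(items), size)]
```

Each chunk can then be passed to answer_media_group in turn. Note that the last chunk may contain a single item; since a media group needs at least two items, such a leftover should be sent as an ordinary photo or video message instead.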

Imitation of bot actions

We have already looked at imitation of typing, but there is also:

  • Simulating a voice message recording

  • Simulating video message recording

  • Simulating video sending

To simulate actions by a bot, we need to import:

from aiogram.utils.chat_action import ChatActionSender

The general principle is as follows. We use ChatActionSender as an asynchronous context manager (async with). We pass it the bot object, the chat in which the action should be shown, and the type of action to simulate.

Another option is to use the special methods of ChatActionSender. In that case there is no need to pass the action parameter.

  • typing – typing

  • upload_video – video upload

  • record_video_note – record video message

  • record_voice – record a voice message

If you're using PyCharm, you can type a dot after ChatActionSender and see which actions are available.
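To make the principle more tangible, here is a rough sketch of what such a helper has to do. It is not aiogram's actual implementation. Telegram shows a chat action for about five seconds at most, so the action must be re-sent periodically while the real work is running. The name keep_chat_action and the send_action callback are assumptions made for illustration.

```python
import asyncio
from contextlib import asynccontextmanager, suppress


@asynccontextmanager
async def keep_chat_action(send_action, interval: float = 4.0):
    """Call `send_action()` repeatedly in the background while the body runs."""

    async def _loop():
        while True:
            await send_action()            # e.g. "record_voice" status
            await asyncio.sleep(interval)  # repeat before the status expires

    task = asyncio.create_task(_loop())
    try:
        yield
    finally:
        # Stop the background loop as soon as the body is finished.
        task.cancel()
        with suppress(asyncio.CancelledError):
            await task
```

With a real bot, send_action could be something like `lambda: bot.send_chat_action(chat_id, "upload_video")`. In practice ChatActionSender already covers this, so the sketch is useful mainly for understanding why the status stays visible during a long upload.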

If the bot performs the action very quickly (for example, sending a video via file_id or sending plain text), we can add an asynchronous pause of 2-3 seconds so that the user has time to see the simulation.

But there are times when a large media file (a video, for example) has to be sent directly from the local machine. In this case the simulation becomes indispensable. The user does not think that the bot is frozen, but sees that it is currently uploading a video, recording a voice message, recording a video message, etc.

It actually looks interesting.

Let's add simulation to the handlers that send voice and video messages.

@start_router.message(Command('send_voice'))
async def cmd_start(message: Message, state: FSMContext):
    async with ChatActionSender.record_voice(bot=bot, chat_id=message.from_user.id):
        await asyncio.sleep(3)
        await message.answer_voice(voice=FSInputFile(
            path=os.path.join(all_media_dir, 'krasivyie-snyi-nevinnost-zvezdnyiy-fon-zvukovyie-effektyi-43378.mp3')))


@start_router.message(Command('send_video_note'))
async def cmd_start(message: Message, state: FSMContext):
    async with ChatActionSender.record_video_note(bot=bot, chat_id=message.from_user.id):
        await asyncio.sleep(3)
        await message.answer_video_note(
            video_note="DQACAgIAAxkBAAICKGZspGExG2ZPTe6cxgrHFgl9V8caAALvSgACGKhoS8XEd0xdU4AKNQQ")

        
Conclusion

Friends, I understand that there is a lot of information in this and the previous article. You don't have to keep all this in your head. The most important thing is a general understanding of the principles, and all the details will come with experience.

From my own experience, the most important thing is not just absorbing content but practice. If you have been following my articles, I have by now covered all the basic topics of interacting with a bot, with the exception of FSM and the database (I will put these in one article) and maybe a couple more topics. Just repeat after me and you will succeed.

That's all for me. I hope for your positive response.
