Samurai robot part 2. Hokku-bot for posting in VK

Hi! Earlier, I described how I built a Telegram bot that writes haiku and picks a picture to match the topic. In this article I will explain how I learned to filter images by size and how to work with vk_api. The previous article came out a little rushed, so here we will go over the whole principle of operation again.

Project Goals

Strictly speaking, the bot does not write haiku. It simply assembles a three-line poem from existing lines, picking them at random. The result is not entirely meaningful, although if you look at translations of original haiku, ours are not that different. I would welcome any ideas on how to enforce the correct syllable structure of haiku (hokku building).
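One possible starting point for the syllable structure, offered only as a sketch: in Russian the number of syllables in a line equals the number of vowel letters, so a 5-7-5 check reduces to counting vowels. The helper names count_syllables and is_575 are hypothetical, and the sketch assumes the lines in the database are Russian text.

```python
# Russian vowel letters; each one forms exactly one syllable
VOWELS = set("аеёиоуыэюя")


def count_syllables(line):
    """Count syllables in a Russian line as the number of vowel letters."""
    return sum(1 for ch in line.lower() if ch in VOWELS)


def is_575(lines):
    """Return True if three lines follow the 5-7-5 syllable pattern."""
    return [count_syllables(line) for line in lines] == [5, 7, 5]
```

With such a check the bot could, for example, draw the first and third lines only from lines with 5 syllables and the second line only from lines with 7.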

For me, the most important thing here is not how the haiku are constructed (although I would like to improve the quality of the verses), but working with APIs and HTTP requests. So this bot exists mostly for my own education, development and entertainment (after all, the bot is a meme). Before I start, a reminder that you can see the source code on GitHub and see the result in the Telegram channel and the VK group.

How to compose haiku?

Hokku, or haiku, are traditional Japanese three-line poems composed according to a certain structure. Anticipating disputes in the comments (hokku or haiku), I will leave a link. There you can get fully acquainted with Japanese poetry.

To get a little closer to the style, we will simply put three lines together without following any rules. As a result, the haiku will have neither structure nor meaning, and the bot will generate meme posts with meme authors. Isn't that funny?

To do this, I assembled in advance a database of traditional Japanese haiku and author names, from which the bot takes its material. After all the filtering and collecting of popular Japanese names (I hope this is not racism), we get the following files:

Number of options

To calculate the number of options, we need to turn to combinatorics. Our algorithm picks a random line on every iteration, so it is even possible for all 3 lines to be the same:

Lone cricket.

Lone cricket.

Lone cricket.

– Dao Dao Dao 1111 AD

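Now we need to recall a little combinatorics: this case corresponds to an arrangement (an ordered selection), in our case with repetition. Choosing 3 lines out of n, where order matters and repeats are allowed, gives n^3 options.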

It turns out that for us it is:

6.8 billion different options. If the bot posts every 8-16 hours, this supply will last us about 9 million years on average. 🙂
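The arithmetic above can be double-checked with a few lines of Python. This is only a sketch: the size of the line database is not given in the article, so the function takes n as a parameter, and the 12-hour figure is simply the midpoint of the 8-16 hour posting interval.

```python
HOURS_PER_YEAR = 24 * 365.25


def arrangements_with_repetition(n, k=3):
    """Number of ordered selections of k items out of n, repeats allowed."""
    return n ** k


def years_to_exhaust(total_options, avg_interval_hours=12):
    """Years needed to post every option once at the given average interval."""
    return total_options * avg_interval_hours / HOURS_PER_YEAR


# 6.8 billion options, one post every 12 hours on average
years = years_to_exhaust(6.8e9)
print(f"{years / 1e6:.1f} million years")
```

The result is a little over 9 million years, in line with the estimate above.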

Telegram channel

Finally, now that it is clear what should work and how, it's time to write the bot. The project was originally conceived as a Telegram bot, so we start by creating a bot in BotFather. We then create an instance of the TeleBot class from the pyTelegramBotAPI library and, at the same time, save the channel name.

import telebot

bot = telebot.TeleBot(token = BOT_TOKEN)  # BOT_TOKEN is the token issued by BotFather
CHANNEL_NAME = '@hokky_t'

Now we need to create the haiku itself, which we will send to the channel. To do this, we write each line into a list, from where we will take elements for haiku. We do the same with the names of authors.

import time
from random import randint

def hokky_bot():
    # Read the haiku file: every line becomes a separate list element
    with open('hokky.txt', 'r', encoding='UTF-8') as f:
        all_hokky = f.read().split('\n')

    # Same for the file with author names
    with open('names.txt', 'r', encoding='UTF-8') as f:
        all_names = f.read().split('\n')

    print('Power on!')
    for _ in range(10000):
        # Randomly pick the era suffix: "до н.э." (BC) or "н.э" (AD)
        era = "до н.э." if randint(0, 1) == 1 else "н.э"
        # Pick 3 random names and 3 random haiku lines
        # (as in the original code, index 0 of each list is never chosen)
        name = [all_names[randint(1, len(all_names)-1)] for _ in range(3)]
        text = [all_hokky[randint(1, len(all_hokky)-1)] for _ in range(3)]
        # 'г.' is the Russian abbreviation for "year"
        message = (f'{text[0]}\n{text[1]}\n{text[2]}\n\n     - {name[0].title()} {name[1].title()} {name[2].title()}, {randint(0, 2022)} г. {era}')
        search = text[randint(0,2)]  # one of the three lines becomes the image query
        print(f'Японская живопись {search}')  # "Японская живопись" means "Japanese painting"
        picture(message, search)
        time.sleep(randint(28800, 57600))  # wait 8 to 16 hours before the next post

The number of randint() calls already tells you how well our algorithm writes poems. I'll explain the three lines just before the final time.sleep() call later, but for now, our code already knows how to create this:

At this point the text part was finished, and it has not been developed further. Now we can think about the visual component. Initially, the idea was to overlay the text on an image. But the text on a picture from the internet looked completely unreadable, and adding an outline or shadow didn't help.

search = text[randint(0,2)] 
print(f'Японская живопись {search}')
picture(message, search)

After trying all the options, the best one turned out to be searching for pictures with the query ‘Japanese painting + a line from the haiku’.

We pass the message text itself and the search query to the function picture().

def picture(message, search):
    # Disabled code for drawing the haiku onto the image from request_photo
    # im = requests.get(request_photo('японcкая живопись', search))  
    # out = open("img.jpg", "wb")
    # out.write(im.content)
    # out.close()
    # image = Image.open('img.jpg')

    # # Create the font object
    # font = ImageFont.truetype('font.name', size= int(image.width/15))
    # draw_text = ImageDraw.Draw(image)
    # draw_text.text(
    #     (int(image.width/50), int(image.height/4)),
    #     message,
    #     # Apply the font to the text
    #     font=font,
    #     fill="#d60000") # Text color
    url = request_photo(f'Японская живопись {search}')  # Find an image for the query
    bot.send_photo(CHANNEL_NAME, photo = url, caption = message)  # Send to Telegram
    vk_post(url=url, message=message)  # Publish the post to VK

Here we call the request_photo() function. It returns a link to a random image from a Yandex search query.

The vk_post() function is a bit of a spoiler, more on that later.

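import re
from random import randint

import requests

def request_photo(message):
    # Fetch the Yandex Images results page; requests URL-encodes the query
    req = requests.get("https://yandex.ru/images/search", params={"text": message})
    # Collect every quoted string in the page that contains '.jpg'
    ph_links = list(filter(lambda x: '.jpg' in x, re.findall('''(?<=["'])[^"']+''', req.text)))
    ph_list = []
    # Check at most 9 links (indices 1..9) so that we do not trigger the captcha;
    # slicing also avoids an IndexError when the page has fewer links
    for link in ph_links[1:10]:
        if link[0:4] == "http":
            size = ph_size(link)[0]  # image width in pixels
            print(size)
            if size > 500:
                ph_list.append(link)
                print(ph_list)

    # Note: this raises ValueError if no link passed the filter
    return ph_list[randint(0, len(ph_list) - 1)]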

Everything is simple here. First, we fetch the HTML of the Yandex Images results page for the query. Then we build a list of all quoted strings in the document that contain .jpg, and in the loop we keep only those that start with http. The selected links are passed to the ph_size() function, which returns the size of the image. We take the width and add to a new list only the images wider than 500 pixels. This filters out blurry pictures. Finally, the function returns a random image from the resulting list.
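The selection logic described above can be separated from the network calls, which makes it easier to test and lets us handle the case when no link passes the filter. The sketch below is an assumption about how this could be arranged, not the bot's actual code: pick_photo and get_size are hypothetical names, and get_size stands in for any function that returns a (width, height) tuple.

```python
import random


def pick_photo(links, get_size, min_width=500, limit=9, rng=random):
    """Return a random link whose image is wider than min_width, or None."""
    good = []
    for link in links[:limit]:           # inspect only the first few links
        if not link.startswith("http"):
            continue                     # skip relative paths and other debris
        width = get_size(link)[0]
        if width > min_width:
            good.append(link)
    # An empty list means nothing passed the filter; the caller decides what to do
    return rng.choice(good) if good else None
```

Returning None instead of raising lets the caller decide whether to retry with another haiku line or skip the picture altogether.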

The ph_size function sends a request to the URL of the chosen image and returns p.image.size. Introducing this filter slowed the bot down considerably. We also had to limit the number of links taken from the search page (the loop checks at most 9) because the system started showing a captcha.

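import requests
from PIL import ImageFile

def ph_size(url):
    # Ask for only the first ~2 MB of the file via the Range header
    resume_header = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:66.0) Gecko/20100101 Firefox/66.0",
    "Accept-Encoding": "*",
    "Connection": "keep-alive", 
    'Range': 'bytes=0-2000000'}  
    data = requests.get(url, stream = True, headers = resume_header).content
    # Pillow's incremental parser reads the size from the image header
    p = ImageFile.Parser()
    p.feed(data)   
    if p.image:
        return p.image.size  # (width, height), e.g. (1400, 1536)
    else: 
        return (0, 0)  # the data could not be recognized as an image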
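Since the size check is what slows the bot down, one possible optimization is to feed the parser in small chunks and stop as soon as the image header has been recognized, instead of downloading up to 2 MB first. The sketch below assumes Pillow is installed; size_from_chunks is a hypothetical helper that accepts any iterable of byte chunks, for example the result of response.iter_content(chunk_size=1024) on a requests response obtained with stream=True.

```python
from PIL import ImageFile


def size_from_chunks(chunks):
    """Feed byte chunks to Pillow's incremental parser until the size is known."""
    parser = ImageFile.Parser()
    for chunk in chunks:
        parser.feed(chunk)
        if parser.image:                  # header parsed, no need to read further
            return parser.image.size      # (width, height)
    return (0, 0)                         # not recognized as an image
```

For most formats the dimensions are stored near the start of the file, so the loop normally ends after the first chunk or two.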

Another problem I ran into was a certificate verification error:

[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:997)

I never found out how to deal with this error, but it magically disappears when you turn on a VPN. So, after polishing the details, the Telegram part is done. The bot sends us a haiku with a picture on the haiku's theme.

An example of a picture for the query “Japanese painting lay down to sleep at night.”

This result suits me very well. I am no longer ashamed to show such a Telegram channel to friends and laugh together. But what about making the bot manage several social networks at once? Let’s start with VK.

VK group

The easiest way to post to a community is the VK API. The vk_api Python library is available for making API requests. Posting text is simple: call the wall.post method and pass the community id with a minus sign. The main thing is that the user must be an admin of the community, or you can use the token from the community settings. It is done like this:

import vk_api

vk_session = vk_api.VkApi('LOGIN', 'PASSWORD')
vk_session.auth()
vk = vk_session.get_api()
vk.wall.post(message=message, owner_id = '-213199160')  # the minus sign marks a community wall

Sending photos is a bit more difficult. You must first upload the photo to an album, then get the photo id and pass it in the attachments parameter. The VK documentation describes in detail how to use this method. To upload the image to the album, we first have to save it to a local folder. As a result, we get the following function:

import requests
import vk_api

def vk_post(url, message):
    vk_session = vk_api.VkApi('LOGIN', 'PASSWORD')
    vk_upload = vk_api.upload.VkUpload(vk_session)
    vk_session.auth()  # Log in to the account
    vk = vk_session.get_api()  # Returns VkApiMethod(self)

    im = requests.get(url=url)  # Download the image
    with open("img.jpg", "wb") as f:
        f.write(im.content)

    with open ('img.jpg', 'rb') as f:
        ph = vk_upload.photo(photos=f, album_id=284394723)  # Upload the photo to the album
        ph_id = ph[0]['id']  # Get the photo id

    # Publish the post on the group wall
    print(vk.wall.post(message=message, owner_id = '-213199160', attachments= f'photo223988241_{ph_id}', copyright="https://t.me/hokky_t"))
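The owner_id and attachments values in the call above follow VK's conventions: a community wall is addressed by the community id with a minus sign, and an attachment is written as <type><owner_id>_<media_id>. A small sketch with hypothetical helper names (community_owner_id, photo_attachment) makes these rules explicit. The group and user ids are the ones used in the article, while the photo id is made up for illustration.

```python
def community_owner_id(group_id):
    """VK addresses a community wall by its id with a minus sign."""
    return -abs(int(group_id))


def photo_attachment(owner_id, photo_id):
    """Build the attachment string for a photo: photo<owner_id>_<photo_id>."""
    return f"photo{owner_id}_{photo_id}"


print(community_owner_id(213199160))           # owner_id for wall.post
print(photo_attachment(223988241, 457239017))  # 457239017 is a made-up photo id
```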

Hooray! Now our bot publishes the same posts to the Telegram channel and the VK group. Although it looks quite simple at first glance, the project gave me a lot of new knowledge and skills, and I got a lot of pleasure from it. There are some ideas for new features, such as voicing the text, or generating images with the ruDALL-E neural network instead of taking them from search. In general, there is room for improvement.

Publication in

That’s basically it. I look forward to your feedback and ideas for the project in the comments.

Once again, here are the links to the Telegram channel and the VK group, as well as GitHub. Subscribe, I will be very pleased!

See you in the next articles.
