How we made a conversational platform for creating bots

For example, many retailers rely on call centers for routine communications. Operators may deviate from the script, make mistakes when logging call results in the CRM, sound irritable with clients because of fatigue, or even turn rude when they can't keep their emotions in check. Choosing between a call center and a bot, the bot ultimately comes out cheaper and more manageable, especially with contact-center costs on the rise: operator and manager salaries grew by a record 28-30% in 2023.

Used wisely, bots benefit both the business and its clients: on the one hand, they reduce the load on employees and shorten request-handling times; on the other, clients get the information they need immediately. If a bot is configured correctly, its responses are not only fast but also accurate. Especially valuable is bots' ability to work 24/7, with no breaks or weekends: our country spans 11 time zones.

And everything would be fine, but… where do you get such a bot? Who will manage it? How do you make sure it is working correctly? And what do you do with the useful information the bot collects from clients, and where do you store it?

The tool that answers all these questions is a conversational platform: essentially, an application for creating bots with a clear, user-friendly visual interface that greatly simplifies bot development and makes it accessible even to small companies without their own IT department. That is how we moved from custom development to building a platform for creating omnichannel, AI-based dialogue systems.

Difficulties we encountered while creating the platform

A conversational platform is a system made up of many services, so its development was not without obstacles. One of the key tasks was building a convenient visual editor that lets users without any programming knowledge create fully functional bots on their own.

We had been working with voice bots for a long time and understood what was happening on the Russian market, what our competitors were planning, and what their products lacked. So we bet on visualizing the process of creating and editing a script: for Robovoice we came up with a simple diagram in the form of a tree of blocks connected by arrows.

Challenge 1: UX

Convenient UX was goal No. 1 for us. At the start, we studied other low-code and no-code bot builders and realized there were practically no convenient script-editing tools in Russia. Existing solutions went to one of two extremes. Some were scattered across many services: NLU is configured in one interface, bot responses in another, and the script's business-logic editor is separate again, so you have to keep everything in your head. Others went the opposite way: the visual part is overloaded with blocks for every conceivable use case, and complex scenarios, such as booking a doctor's appointment or consulting on banking products, turn into a huge canvas of cubes, like in Minecraft. You simply get lost in such tools.

We went through many UX prototypes, forming hypotheses and testing them with users. As a result, we found a balance between a simple, understandable UX and sufficient flexibility and capability. Many later improvements to Robovoice came precisely from that user feedback.

Challenge 2: Nested and Background Scripts

We focused on implementing the basic setup functionality and started watching what users would ask us to "tweak." It was the users who helped us set priorities. The first feedback led to background scenarios, which can be triggered from any point in the dialogue to handle counter-questions. It was important to users that the robot could answer off-topic questions like "Who are you?" or "Where did you get my number?"; when the bot pressed on with the main line of dialogue instead of answering, it sounded clueless and sharply reduced callers' goodwill.

Challenge 3: BI Analytics

Providing detailed visibility into the bots' work and building analytics tools was not easy. We understood that it's not enough for clients to just configure a bot: it needs continuous improvement. So we needed a tool that would, first, let us view (and listen to) any dialogue between a person and a robot, and second, build aggregated statistics on dialogues so that conversions and the business efficiency of bots could be calculated.

To our deep surprise, at that moment there was not a single solution on the market with built-in analytics as we envisioned it. So we built a BI tool into Robovoice that lets you set goals at specific points in a conversation, calculates how many dialogues reach those points, and presents it all in easy-to-read charts.
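The reach metric at the heart of such analytics boils down to a simple aggregation over dialogue logs. The log format below (`visited_blocks` per dialogue) is an assumption made for the example, not the platform's real schema.

```python
# Illustrative sketch: computing the "reach" of a goal point
# from per-dialogue logs of visited script blocks.
dialogues = [
    {"id": 1, "visited_blocks": ["greeting", "offer", "appointment_booked"]},
    {"id": 2, "visited_blocks": ["greeting", "offer"]},
    {"id": 3, "visited_blocks": ["greeting", "offer", "appointment_booked"]},
]

def goal_reach(dialogues: list[dict], goal_block: str) -> float:
    """Share of dialogues that passed through the goal block."""
    if not dialogues:
        return 0.0
    hits = sum(goal_block in d["visited_blocks"] for d in dialogues)
    return hits / len(dialogues)
```

With a metric like this per goal block, conversion charts are just the same number computed over time windows or script versions.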

Challenge 4: Data Exchange

During dialogues, bots collect a lot of useful data from their interlocutors. Since clients wanted to automate not only the standard communication itself but also the complex business processes tied to it, the platform had to interact with CRMs and exchange data with the client's external infrastructure. So we built a simple but functional API and several ready-made integrations for the most popular CRMs, including Bitrix24.

Now our platform lets you capture information from customers into variables and share it with company systems. All dialogues are also stored in a common section, where you can apply basic and complex semantic filters and download them, so managers can later analyze them in depth and draw conclusions useful for the business.
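A minimal sketch of such a data handoff might look like this: map the variables collected in the dialogue into a payload and post it to a CRM webhook. The endpoint, field names, and payload shape here are invented for illustration; a real integration would follow the target CRM's own REST API (e.g. Bitrix24's).

```python
# Hedged sketch: shipping dialogue variables to a CRM over a webhook.
# Field names and payload structure are assumptions, not a real CRM schema.
import json
from urllib import request

def build_crm_payload(dialogue_vars: dict) -> bytes:
    """Map collected dialogue variables to a hypothetical lead payload."""
    lead = {
        "TITLE": f"Bot lead: {dialogue_vars.get('name', 'unknown')}",
        "PHONE": dialogue_vars.get("phone"),
        "COMMENTS": dialogue_vars.get("summary", ""),
    }
    return json.dumps({"fields": lead}).encode("utf-8")

def send_to_crm(dialogue_vars: dict, webhook_url: str) -> None:
    """POST the payload; error handling omitted for brevity."""
    req = request.Request(
        webhook_url,
        data=build_crm_payload(dialogue_vars),
        headers={"Content-Type": "application/json"},
    )
    request.urlopen(req)
```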

Challenge 5: Implementing NLU

Robovoice's core is powered by our own conversation engine, which controls the flow of the conversation. To let it move through the script, we, like everyone else, started with conditions based on regular expressions. This was convenient for small scenarios: you set up a few regular expressions and the bot can already tell "yes" from "no," recognize simple answers, pick the next block of the script, and move to it. But as scenarios grew more complex, the number of ways a person could answer the bot's question exploded. We needed a tool that would let the bot understand natural human speech without thousands of hand-written conditions.
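The early regex-based branching described above can be sketched like this; the block names and patterns are illustrative. It shows both why the approach works for yes/no and why it breaks down: anything unanticipated falls through to a catch-all.

```python
# Sketch of regex-based script branching: fine for yes/no,
# unmanageable once answers become open-ended.
import re

YES = re.compile(r"\b(yes|yeah|sure|ok(ay)?)\b", re.IGNORECASE)
NO = re.compile(r"\b(no|nope|not now)\b", re.IGNORECASE)

def next_block(answer: str) -> str:
    """Pick the next script block from the caller's answer."""
    if YES.search(answer):
        return "confirm_block"
    if NO.search(answer):
        return "decline_block"
    return "clarify_block"  # every unanticipated answer lands here
```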

There was only one solution: neural networks. Otherwise you end up with a huge, branched script flowchart that is awkward to administer and still imperfect, since it is impossible to predict every answer. So we went to the trouble of building an embedded mechanism for additional training of an NLU model. How it works: an analyst feeds training examples into the neural network; then, within a specific task, the model determines a person's intention (intent) from the meaning of the statement, no matter what words they used to express it.
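The example-driven intent detection described above can be illustrated with a deliberately simplified stand-in. A real NLU model uses neural embeddings; here, bag-of-words cosine similarity keeps the sketch self-contained, and the intent names and training phrases are made up.

```python
# Toy intent detector: pick the intent whose labeled examples are
# most similar to the utterance. Bag-of-words cosine stands in for
# the neural embeddings a real NLU model would use.
import math
from collections import Counter

TRAINING = {
    "agree": ["yes that works", "sure let's do it", "sounds good to me"],
    "reschedule": ["can we pick another time", "that day does not work", "move it please"],
}

def _vec(text: str) -> Counter:
    return Counter(text.lower().split())

def _cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def detect_intent(utterance: str) -> str:
    """Return the intent with the closest training example."""
    u = _vec(utterance)
    scores = {
        intent: max(_cosine(u, _vec(ex)) for ex in examples)
        for intent, examples in TRAINING.items()
    }
    return max(scores, key=scores.get)
```

The workflow matches the text: the analyst only supplies example phrases per intent; the matching itself needs no hand-written conditions.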

As a result, connecting blocks became simpler and more convenient: instead of a pile of regular-expression conditions, there are a handful of intents added with a click. And the bot got much better at understanding people.

Challenge 6: Natural Sound and GPT

The next challenge was making the robot talk like a human. For the most natural sound, we added the ability to upload audio files that the bot plays within blocks. To support different languages, we also upgraded synthesis by integrating with all the main speech services (Yandex, Google, Tinkoff, and others).

What's next

We are now making our bots sound even more natural, so our conversational AI is being complemented with generative AI. We have already released a beta with a built-in GPT block that generates the robot's answers to various questions on the fly from a single instruction; in effect, it takes over a large chunk of the script that previously needed several ordinary blocks and talks to the person on its own, without extra configuration.
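One plausible reading of "one instruction" is that the block folds the operator's instruction and the dialogue so far into a single prompt for the language model. The sketch below is purely an assumption about such a block's internals; the message structure and function name are ours, not Robovoice's, and the actual model call is omitted.

```python
# Hedged sketch of a hypothetical GPT block's prompt assembly:
# one operator-written instruction plus the dialogue history
# become the messages sent to a chat-style language model.
def build_gpt_prompt(instruction: str, history: list[tuple[str, str]]) -> list[dict]:
    """Turn one instruction and the dialogue so far into chat messages."""
    messages = [{"role": "system", "content": instruction}]
    for speaker, text in history:
        role = "assistant" if speaker == "bot" else "user"
        messages.append({"role": role, "content": text})
    return messages
```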

We are also actively working on our own synthesis and recognition services, so we can generate our own voice that works in tandem with the smart GPT block and sounds more natural than the voices built into third-party speech services.
