A conversational interface is a variant of the user interface (UI).
Let us look at how the conversational interface evolved, from pressing buttons on a push-button telephone to dialogue in natural language that is indistinguishable from talking to a human.
Level 1. Press 1.
A long time ago, when push-button telephones had just appeared, some automatic telephone exchanges offered a level 1 conversational interface: a voice menu.
The caller heard something like this:
Hello, you have called such and such a company.
Your call is very important to us.
If you need the sales department, press 1.
If you need the purchasing department, press 2.
Press 3 to listen again.
Level 2. Say “One”.
Thanks to the earliest advances in speech recognition, some PBXs replaced the word PRESS with the word SAY. Everything else stayed the same out of habit; only one word changed.
If you need the sales department, say “One”.
If you need the purchasing department, say “Two”.
To listen again, say “Three”.
The caller had to speak very clearly and say only these numbers. Sometimes it worked.
Level 3. Say “Order”.
At the next stage, the numbers began to be replaced with familiar words. Instead of “One” or “Two”, the caller could say “Order”, “Operator” and so on. It sounded more human, but the caller still had to hit the expected words exactly.
If you want to order products, say “Order”.
To connect to the operator, stay on the line.
Level 4. I’d like to place an order.
Growth in computing power, the development of neural networks and the resulting improvement in speech recognition brought a significant leap forward. The robot’s own phrases sounded the same as before, but robots began to understand not only individual words such as “Order” but also natural phrases with a similar meaning: “Order”, “I want to order”, “Hello, I’d like to place an order” and so on. At the same time, it became possible to interrupt the robot, and the robot learned to ask again when it was unsure of the recognition result or found no matching behavior scenario.
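The behavior described for level 4 (several phrasings mapped to one meaning, with a repeat request when the robot is unsure) can be illustrated with a minimal sketch. It is not how a production system works: real robots rely on trained speech and language models, while this toy version uses keyword sets. The names `INTENT_KEYWORDS`, `recognize_intent` and `respond` are hypothetical and chosen only for illustration.

```python
import re

# Hypothetical intents and trigger words (assumption for illustration only).
INTENT_KEYWORDS = {
    "order": {"order", "buy", "purchase"},
    "operator": {"operator", "human", "person"},
}


def recognize_intent(utterance):
    """Return the single matching intent, or None when the robot is unsure.

    The robot is "unsure" when no intent matches or when several match.
    """
    words = set(re.findall(r"[a-z']+", utterance.lower()))
    matched = [
        intent
        for intent, keywords in INTENT_KEYWORDS.items()
        if words & keywords
    ]
    return matched[0] if len(matched) == 1 else None


def respond(utterance):
    """Confirm the recognized intent, or ask the caller to repeat."""
    intent = recognize_intent(utterance)
    if intent is None:
        return "Sorry, I did not catch that. Could you repeat?"
    return f"Understood: {intent}"
```

With these assumptions, “Order”, “I want to order” and “Hello, I'd like to place an order” all resolve to the same intent, while an unrelated or ambiguous phrase triggers a repeat request.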
Most voice robots today are at level 5. In practice, this is level 4 with a complex request broken down into a sequential chain of simple questions. So, when taking an order for water delivery, the bot asks in turn for the name of the water, the container volume, the quantity, the delivery day, the delivery interval and the address.
This is already very similar to a human dialogue, only it feels sluggish and drawn-out.
Name the water – Shishkin Les.
State the container volume – 19 liters.
And how many – 2 pieces.
Specify the delivery day – tomorrow.
Specify the delivery interval – after lunch.
People do not usually talk like this, except perhaps two very lazy people, or an examiner with a failing student.
Taking an order this way can take more than a minute, and that is when everything goes well. If the user strays even slightly from the algorithm, the robot stoically steers the user back to its questions and can do so indefinitely. The robot feels no frustration, but a user who gets nowhere for long enough hangs up. As a rule, it is reasonable to transfer the call to an operator after 1-2 misunderstandings.
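The question-by-question flow and the rule of handing the call to an operator after a couple of misunderstandings can be sketched as follows. This is a simplified illustration under stated assumptions: the slot names, the toy `validate` check and the limit of 2 misunderstandings per call (counted across all questions) are hypothetical, and caller answers are passed in as a list instead of coming from speech recognition.

```python
# Questions asked in order, following the water delivery example.
SLOTS = ["water", "volume", "quantity", "day", "interval"]

# Assumption: misunderstandings are counted per call, not per question.
MAX_MISUNDERSTANDINGS = 2


def validate(slot, answer):
    """Toy validator (assumption): any non-empty answer is accepted,
    except that a quantity must contain at least one digit."""
    answer = answer.strip()
    if not answer:
        return None
    if slot == "quantity" and not any(ch.isdigit() for ch in answer):
        return None
    return answer


def take_order(answers):
    """Fill the slots one by one from an iterable of caller answers.

    Returns ("order", slots) when every slot is filled, or
    ("operator", partial_slots) once the misunderstanding limit is hit.
    """
    answers = iter(answers)
    order = {}
    misses = 0
    for slot in SLOTS:
        while slot not in order:
            # Silence (no more answers) is treated as an empty answer.
            value = validate(slot, next(answers, ""))
            if value is not None:
                order[slot] = value
                continue
            misses += 1
            if misses >= MAX_MISUNDERSTANDINGS:
                return "operator", order
    return "order", order
```

Capping the number of misunderstandings keeps the robot from looping on the same question indefinitely, and the partially filled order can be passed along so the operator does not have to start from scratch.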
Let’s compare this with how a live operator takes an order.
– Good day. I am ready to take the order.
– Shishkin Les 2 large bottles tomorrow afternoon Riga 23 78.
All in one phrase. 10 seconds for the whole order.
Taking an order this way, in a single phrase, is level 6.
At level 6, the voice robot accepts the entire order in fully natural language, as a live operator would, uses built-in algorithms to extract all the entities, and sends the request to the system.
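Pulling all the entities out of one phrase can be illustrated with a small rule-based sketch. It is only a stand-in for the "built-in algorithms" mentioned above, which the article does not describe: the vocabularies, the field names and the crude rule that leftover text containing a digit is the address are all assumptions made for this example.

```python
import re

# Hypothetical vocabularies (assumptions for illustration only).
BRANDS = ["shishkin les"]
DAYS = ["today", "tomorrow"]
INTERVALS = ["morning", "afternoon", "evening"]
REQUIRED = ["water", "quantity", "day", "address"]


def extract_order(utterance):
    """Extract order entities from a single free-form phrase."""
    text = utterance.lower()
    order = {}

    # Brand: simple substring lookup against a known list.
    for brand in BRANDS:
        if brand in text:
            order["water"] = brand
            text = text.replace(brand, " ")
            break

    # Quantity and optional size, e.g. "2 large bottles".
    match = re.search(r"\b(\d+)\s+(large|small)?\s*bottles?\b", text)
    if match:
        order["quantity"] = int(match.group(1))
        if match.group(2):
            order["size"] = match.group(2)
        text = text.replace(match.group(0), " ")

    # Delivery day and interval: first vocabulary word found wins.
    for key, vocabulary in (("day", DAYS), ("interval", INTERVALS)):
        for word in vocabulary:
            pattern = rf"\b{word}\b"
            if re.search(pattern, text):
                order[key] = word
                text = re.sub(pattern, " ", text)
                break

    # Crude assumption: leftover text that contains a digit is the address.
    leftover = " ".join(text.split())
    if any(ch.isdigit() for ch in leftover):
        order["address"] = leftover
    return order


def missing_slots(order):
    """List required fields the robot would still have to ask about."""
    return [slot for slot in REQUIRED if slot not in order]
```

If `missing_slots` returns a non-empty list, the robot can fall back to asking only for those fields, which combines the single-phrase approach with the sequential questions of level 5.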
Of course, how well the interaction goes also depends on the user. The message still has to roughly fit the expected template and meaning. If the user says “Okay, girl, yesterday the neighbors told me that you have good prices on Thursdays. Tell me honestly, is this so or not?”, the robot “freezes”. If no small-talk (“chatter”) module is provided, it is better to transfer such a user to an operator immediately.
As a result, one multichannel robot and one live operator as a safety net are now enough to take orders. Whether there were 5 operators before or 50 does not matter: in both cases 1 robot and 1 live operator are enough now.
Near future. Level 7.
– Is this water delivery?
– Good afternoon, Ivan Timofeevich. As usual, two big ones home by Friday?