GigaSearch or Search Engine on GigaChat

Example of a factually accurate GigaSearch answer

Hallucinations are a phenomenon that, until recently, was the exclusive domain of human consciousness. However, with the development of generative text models such as GigaChat and ChatGPT, similar “illusions” can now be observed in the world of artificial intelligence.

There are cases where a generative model’s hallucinations are quite appropriate. For example, if you ask the model to write a children’s fairy tale, fictional characters and events are exactly what is wanted, and the child will enjoy them.

The problem of hallucinations in generative models

The situation is completely different with questions about, for example, real people or events. Here the user wants an answer based, first and foremost, on facts. Unfortunately, modern generative models often distort facts or invent non-existent ones in their answers. Most strikingly, they do this very convincingly: from the answer alone it is often impossible to tell whether the model invented an event or it really took place.

For example, ChatGPT recently invented a sexual harassment scandal and named a real professor as the offender, citing a non-existent article in a well-known newspaper. All of it turned out to be pure fiction, and the professor in question had never been accused of anything of the sort.

Headline of The Washington Post article about the ChatGPT hallucination

Where do the hallucinations of generative models come from?

So why do large generative models invent facts that do not exist? The point is that in conversation a person bases their reasoning on specific items in memory, whose reliability they can assess themselves. If a person mentions a fact they are unsure of or remember imprecisely, they know it and will signal that uncertainty to the listener with words like “possibly” or “most likely”.

The model, on the other hand, has no way to assess the reliability of the facts stored in its weights. Each subsequent token is generated as the statistically most probable one, and researchers have not yet learned how to check the factuality of the resulting statement without consulting external sources of knowledge.
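To make this concrete, here is a toy sketch of greedy next-token selection; the vocabulary and logits are invented for illustration and have nothing to do with GigaChat’s real internals:

```python
import math

# Toy next-token step: the model produces scores (logits) over a
# vocabulary, softmax turns them into probabilities, and greedy
# decoding picks the most probable token. Nothing in this step
# checks whether the resulting statement is factually true.
vocab = ["Paris", "London", "Berlin"]
logits = [2.1, 0.3, -1.0]  # invented scores for "The capital of France is ..."

probs = [math.exp(x) for x in logits]
total = sum(probs)
probs = [p / total for p in probs]

next_token = vocab[probs.index(max(probs))]
print(next_token)
```

The model picks the highest-probability continuation whether or not it happens to be true; that is the root of the hallucination problem.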

The knowledge cutoff problem

Another well-known problem with chatbot models is their lack of awareness of recent world events. In GigaChat and ChatGPT, knowledge is currently limited to roughly the middle to second half of 2023. The knowledge cutoff is the date when the model’s pre-training was performed: the model knows nothing about anything that happened after it. Pre-training is very resource-intensive and time-consuming; it can take several months on a large GPU cluster. Completely retraining a large model is therefore expensive and slow, so all large language models lag behind in the freshness of their information.

For users, this means the model may produce information that is already out of date. For example, when asked about the latest iPhone model, a chatbot can answer about the previous one.

Solution using RAG

Researchers have long been struggling with the problems of hallucinations and stale knowledge in generative models, and in 2020 an approach called RAG (Retrieval-Augmented Generation) was proposed: generation augmented with search results.

RAG operation diagram. MIPS – Maximum Inner Product Search, that is, search for the maximum inner product of vectors.

In this article we will not go deep into the implementation details of the original approach. The key idea is that the authors proposed passing the top-k documents from search results to the model along with the user’s query, so that the answer is generated based on them.
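As a rough illustration of the retrieval step (the MIPS block in the diagram), here is a self-contained sketch; the bag-of-words hashing embedding is a stand-in for the trained dense encoder a real RAG system would use:

```python
import numpy as np

def embed(text: str, dim: int = 64) -> np.ndarray:
    """Toy embedding: hash each word into a fixed-size count vector.
    A real RAG system uses a trained dense encoder instead."""
    v = np.zeros(dim)
    for word in text.lower().split():
        v[hash(word.strip(".,!?")) % dim] += 1.0
    return v

def mips_top_k(query: str, docs: list[str], k: int = 3) -> list[str]:
    """Maximum Inner Product Search: rank documents by the inner
    product of their embedding with the query embedding."""
    q = embed(query)
    scores = [float(embed(d) @ q) for d in docs]
    order = np.argsort(scores)[::-1][:k]
    return [docs[i] for i in order]

docs = [
    "Photosynthesis converts light into chemical energy.",
    "The Eiffel Tower is located in Paris.",
    "Paris is the capital of France.",
]
top = mips_top_k("Where is the Eiffel Tower?", docs, k=2)
print(top)
```

With dense encoders the same inner-product ranking is what approximate MIPS indexes (such as FAISS) accelerate over millions of documents.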

This solves two problems at once: hallucinations and freshness of knowledge. Hallucinations can be avoided thanks to the reliable facts present in the search results, while freshness of knowledge comes from the freshness of the indexes on which the search system is built; and search engines are something the community has long known how to update quickly and efficiently.

Our implementation – GigaSearch

We at SberDevices were inspired by this approach and implemented our own version – GigaSearch.

How GigaSearch works

The system works as follows. Optionally, depending on which product GigaSearch is connected to, the user’s query can first be passed through a classifier trained to identify factual questions; in that case we use GigaSearch only for them. All other, non-factual queries can be answered by GigaChat directly, without search.

The classifier is based on an ELECTRA encoder that our colleagues from R&D recently trained. Incidentally, this model has been published and is publicly available.
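The routing step can be sketched as follows; the keyword heuristic in `is_factual` is only a stub standing in for the trained ELECTRA classifier, and the function names are illustrative, not the production API:

```python
def is_factual(query: str) -> bool:
    """Stub for the ELECTRA-based classifier: flags queries that look
    like factual questions. The real system uses a trained encoder,
    not a keyword heuristic."""
    markers = ("who", "what", "when", "where", "which", "how many")
    q = query.lower().strip()
    return q.endswith("?") and any(q.startswith(m) for m in markers)

def route(query: str) -> str:
    """Send factual questions through GigaSearch (retrieval + GigaChat);
    everything else goes straight to GigaChat."""
    return "gigasearch" if is_factual(query) else "gigachat"

print(route("Who won the 2018 FIFA World Cup?"))
print(route("Write me a bedtime story about a dragon."))
```

The point of the split is cost and quality: creative queries need no retrieval, while factual ones benefit from grounding.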

Next, we add the top 3 documents from our search engine to the GigaChat prompt. The search is built on top of a proprietary knowledge base.
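Injecting the retrieved documents into the prompt can look roughly like this; the template wording is invented for illustration, and the actual GigaChat prompt format is internal:

```python
def build_prompt(question: str, docs: list[str]) -> str:
    """Prepend the top retrieved documents to the user's question so
    the model can ground its answer in them (illustrative template)."""
    context = "\n".join(f"[{i + 1}] {doc}" for i, doc in enumerate(docs[:3]))
    return (
        "Answer the question using the documents below. "
        "If they are not relevant, say you are not sure.\n\n"
        f"Documents:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )

prompt = build_prompt(
    "What is the latest iPhone model?",
    ["Apple announced the iPhone 15 in September 2023.",
     "The iPhone 14 was released in 2022."],
)
print(prompt)
```

Numbering the documents also lets the model cite which snippet supported each claim, if the template asks for it.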

Scheme for adding search results to prompts

Thus, GigaChat “sees” not only the dialogue with the user but also relevant documents with facts that help it answer accurately and take the latest world events into account.

GigaSearch is already built into GigaChat Web, so anyone can try the system for themselves: follow the link https://developers.sber.ru/gigachat/ and log into your account (registration is required on first login).

GigaSearch will also soon appear in our Telegram bot @gigachat_bot and on VK at vk.me/gigachat. You can message them directly or add them to a chat with friends and colleagues.

As we mentioned, GigaSearch is enabled when answering open-ended factual questions.

For example:

GigaSearch in Salyut

GigaSearch has also recently started answering factual questions in our Salyut assistant; you can already try it on all Sber smart devices.

What’s next?

The quality of GigaSearch’s answers depends on the quality of several components:

  • Search engine quality;

  • Quantity, correctness and relevance of data in the knowledge base;

  • GigaChat’s ability to use relevant material;

  • GigaChat’s ability to answer from its parametric memory when the search results are irrelevant.
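The last point, falling back to parametric memory when retrieval comes up empty, can be sketched with a simple relevance threshold; the threshold value and function names here are assumptions, not the production logic:

```python
def choose_mode(scored_docs, relevance_threshold=0.5):
    """Decide how to answer: ground the answer in retrieved documents
    if at least one clears the relevance threshold, otherwise fall
    back to the model's parametric memory (its own weights)."""
    relevant = [doc for doc, score in scored_docs if score >= relevance_threshold]
    if relevant:
        return "grounded", relevant
    return "parametric", []

mode, kept = choose_mode([("iPhone 15 announcement", 0.9), ("unrelated page", 0.2)])
print(mode, kept)
mode2, _ = choose_mode([("unrelated page", 0.1)])
print(mode2)
```

A fallback like this keeps a noisy search result from dragging the answer off topic, at the cost of losing grounding for that query.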

We keep improving each of these components, including expanding the knowledge base using user feedback. So we look forward to your thumbs up (and down): it is very important to us!

Acknowledgments

I would like to say a big thank you to my colleagues who participated in this project. Everything worked out only thanks to your positive charge and endless energy!

Stay in touch

I also invite you to our Telegram channel, Salute AI, where my colleagues and I have started sharing our work in machine learning and related topics. And in the companion chat, Salute AI Community, you can ask us about all of this directly or just talk.
