New fuzzy search and autocompletion in Manticore Search

TL;DR

We're excited to introduce two important new features in Manticore Search: simple fuzzy search and query suggestions (autocompletion). These features make searching more convenient and efficient for users. You can try them in our GitHub issue search demos.

The new functionality is available in the development version. Check the documentation to find out how to install it.

Introduction

You may already be familiar with how semantic search works from the issue search demo on GitHub. In this article, we want to talk about two new features: fuzzy search and autocompletion.

  • Fuzzy search helps to find results, even with small errors in the query.

  • Autocompletion suggests relevant queries as you type, speeding up your search.

These features not only make searching GitHub issues more convenient, but also demonstrate the capabilities of Manticore Search in practice: they improve the interface and, in many cases, return more accurate results.

Implementation of fuzzy search

What's the problem with typos?

Even minor errors in a query can result in missing results. Users expect the search engine to understand their query, even if there are minor typos. For semantic search this can be solved through context, but for traditional keyword search the ideal solution is fuzzy search.

Old method of fuzzy search via CALL QSUGGEST

Previously, for lack of an alternative, fuzzy search in Manticore Search relied on CALL QSUGGEST and CALL SUGGEST. They are quite effective but have limitations. CALL QSUGGEST uses Levenshtein distance to suggest corrections for the last word of a query, which makes the search more robust to typos, but only for that last word. CALL SUGGEST works only for the first word of the query. Here is an example using the Manticore PHP client:

$params = [
    'index' => 'issue',
    'body' => [
        'query'=> 'fzzy',
        'options' => [
            'limit' => 10,
            'max_edits' => 2,
        ],
    ],
];
$response = $client->suggest($params);

In this example, the client uses CALL SUGGEST to find suggestions for the misspelled word “fzzy”. The max_edits parameter controls the maximum allowed Levenshtein distance between the entered word and the suggested candidates. The Levenshtein distance is the minimum number of single-character operations (insertions, deletions, or substitutions) required to transform one word into another. With max_edits set to 2, we allow up to two errors per word, which lets us catch typos and similar words.
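To make the role of max_edits concrete, here is a minimal sketch that filters a word list by edit distance using PHP's built-in levenshtein() function. The function name suggestByEditDistance and the sample dictionary are hypothetical, and this is only an illustration of the idea, not how CALL SUGGEST is implemented inside Manticore.

```php
<?php
// Illustrative only: the dictionary and function name are assumptions,
// not part of Manticore Search or its PHP client.

/**
 * Return the dictionary words within $maxEdits of $word, closest first.
 *
 * @param string[] $dictionary
 * @return array<string,int> candidate => Levenshtein distance
 */
function suggestByEditDistance(string $word, array $dictionary, int $maxEdits): array
{
    $matches = [];
    foreach ($dictionary as $candidate) {
        // levenshtein() counts insertions, deletions and substitutions.
        $distance = levenshtein($word, $candidate);
        if ($distance <= $maxEdits) {
            $matches[$candidate] = $distance;
        }
    }
    asort($matches); // smallest distance first, keys preserved

    return $matches;
}

$dictionary = ['fuzzy', 'fizzy', 'buzz', 'search', 'index'];

// "fuzzy" and "fizzy" are both one edit away from "fzzy";
// "buzz" needs three edits, so max_edits = 2 filters it out.
print_r(suggestByEditDistance('fzzy', $dictionary, 2));
```

Lowering the limit to 0 in this sketch returns nothing for “fzzy”, which is the behavior of an exact keyword match.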

Although the CALL SUGGEST and CALL QSUGGEST approach works, it requires additional manual processing. Besides, CALL QSUGGEST does not account for errors caused by typing in the wrong keyboard layout (we'll talk about this later). To make fuzzy search more convenient, we added a new search query option to Manticore Search that enables fuzzy search with the chosen parameters without any additional code.

New simplified fuzzy search

Now we have a simpler option. Just add fuzzy=1 to the query options to enable fuzzy search.

Here's an example:

$client = new Client();
$index = $client->index('issues');

$query = 'fzzy serch';
$result = $index->search($query)->option('fuzzy', 1)->get();

foreach ($result as $hit) {
    echo $hit->getTitle() . "\n";
}

This code enables fuzzy search for the misspelled phrase “fzzy serch”, allowing Manticore to find “fuzzy search”. See what it looks like in the demo on GitHub:

More information about fuzzy search can be found in the documentation.

Implementing query suggestions (autocompletion)

The power of predictive search

Query suggestions, or autocomplete, make searches faster and help users find relevant queries that they might not have considered.

Old autocompletion with CALL KEYWORDS

Just as CALL QSUGGEST served fuzzy search, Manticore Search has historically offered CALL KEYWORDS for autocompletion. This method performs prefix and infix searches over the keywords in a table, which is an efficient way to implement autocompletion. However, it does not tolerate typos: if the user makes a mistake, the system will not suggest any keywords. Here is an example of using this method with the Manticore PHP client:

$params = [
    'index' => 'myindex',
    'body' => [
        'query' => 'pref*',
        'options' => [
            'stats' => 1, // include docs and hits counts in the response
        ],
    ],
];

$response = $client->keywords($params);

foreach ($response as $keyword) {
    echo $keyword['normalized'] . ' (docs: ' . $keyword['docs'] . ', hits: ' . $keyword['hits'] . ")\n";
}

In this example, we use CALL KEYWORDS to find all keywords in the table myindex that start with pref. The asterisk (*) at the end indicates a prefix match. You can also use an infix match by putting asterisks on both sides of the term, for example *some*.
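The difference between prefix and infix patterns can be illustrated without a running server. The sketch below emulates both against a small in-memory keyword list; the matchKeywords function and the sample keywords are hypothetical and only mimic the wildcard semantics described above.

```php
<?php
// Illustrative only: emulates pref* (prefix) and *some* (infix) matching
// on a hypothetical keyword list. Not part of the Manticore PHP client.

/**
 * Return the keywords that match a pattern where '*' stands for any characters.
 *
 * @param string[] $keywords
 * @return string[]
 */
function matchKeywords(string $pattern, array $keywords): array
{
    // Escape regex metacharacters, then turn each escaped '*' into '.*'.
    $regex = '/^' . str_replace('\*', '.*', preg_quote($pattern, '/')) . '$/u';

    return array_values(array_filter(
        $keywords,
        fn (string $keyword): bool => preg_match($regex, $keyword) === 1
    ));
}

$keywords = ['prefix', 'preference', 'awesome', 'something', 'index'];

print_r(matchKeywords('pref*', $keywords));  // prefix match
print_r(matchKeywords('*some*', $keywords)); // infix match
print_r(matchKeywords('perf*', $keywords));  // a typo in the prefix matches nothing
```

The last call shows the limitation discussed in this section: a single transposed letter in the prefix yields no suggestions at all.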

Although CALL KEYWORDS is a fast and easy way to implement autocompletion, it is not as flexible as fuzzy search. It requires an exact prefix or infix match, which suits cases where you need to suggest specific keywords but fails as soon as the input contains a typo. The new autocompletion method solves this problem too.

New autocompletion method CALL AUTOCOMPLETE

We've introduced a new method that generates suggestions from a few entered words and tolerates typos. It combines the capabilities of CALL KEYWORDS and CALL QSUGGEST.

An example of using the new autocompletion in the PHP client:

$client = new Client();
$result = $client->autocomplete([
    'body' => [
        'table' => 'issues',
        'query' => 'hllo wor',
        'options' => [
            'fuzziness' => 1,
            'layouts' => ['us', 'ru'],
        ],
    ],
]);

foreach ($result[0]['data'] as $suggestion) {
    echo $suggestion . "\n";
}

This code suggests “hello world” for the entered “hllo wor”, taking into account typos and keyboard layout errors.

How it works internally: we use fuzzy search logic to generate relevant suggestions. The method relies on the low-level functions CALL KEYWORDS and CALL QSUGGEST to generate variants from your data set.

To keep suggestions accurate, we calculate the distance based on the word length and the number of available documents. For example, if the user enters “hllo wor” and the data contains “hello world”, the system will suggest exactly “hello world” thanks to fuzzy matching. This approach lets users receive accurate suggestions even with typos.
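The combination of typo correction and prefix completion can be sketched in plain PHP. Everything in the sketch below is an assumption made for illustration: the function names, the vocabulary with its document counts, the length-based edit threshold, and the ranking by document count. It is not Manticore's actual algorithm, only one way the described idea could look.

```php
<?php
// Illustrative sketch only. All names, thresholds and data are hypothetical.

// Assumption: shorter words tolerate fewer edits than longer ones.
function allowedEdits(string $word): int
{
    $length = strlen($word);
    return $length <= 3 ? 0 : ($length <= 7 ? 1 : 2);
}

// Correct a complete word to the closest vocabulary keyword, if close enough.
function nearestKeyword(string $word, array $vocabulary): string
{
    $best = $word;
    $bestDistance = allowedEdits($word) + 1;
    foreach (array_keys($vocabulary) as $keyword) {
        $distance = levenshtein($word, (string) $keyword);
        if ($distance < $bestDistance) {
            $best = (string) $keyword;
            $bestDistance = $distance;
        }
    }
    return $best;
}

// Expand the unfinished last word: compare it with the head of each keyword.
function completeLastWord(string $prefix, array $vocabulary): array
{
    $completions = [];
    foreach ($vocabulary as $keyword => $docs) {
        $keyword = (string) $keyword;
        if (strlen($keyword) < strlen($prefix)) {
            continue;
        }
        $head = substr($keyword, 0, strlen($prefix));
        if (levenshtein($prefix, $head) <= allowedEdits($prefix)) {
            $completions[$keyword] = $docs;
        }
    }
    arsort($completions); // keywords found in more documents come first
    return array_keys($completions);
}

function autocompleteSketch(string $query, array $vocabulary): array
{
    $words = preg_split('/\s+/', trim($query), -1, PREG_SPLIT_NO_EMPTY);
    if ($words === []) {
        return [];
    }
    $last = array_pop($words);
    $corrected = array_map(fn (string $w): string => nearestKeyword($w, $vocabulary), $words);

    $suggestions = [];
    foreach (completeLastWord($last, $vocabulary) as $completion) {
        $suggestions[] = implode(' ', array_merge($corrected, [$completion]));
    }
    return $suggestions;
}

// Hypothetical vocabulary: keyword => number of documents containing it.
$vocabulary = ['hello' => 120, 'help' => 40, 'world' => 90, 'word' => 30, 'work' => 60];

print_r(autocompleteSketch('hllo wor', $vocabulary));
```

With this sample data, “hllo” is corrected to “hello” and “wor” is expanded to “world”, “work” and “word”, so “hello world” comes out as the first suggestion.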

More information about autocompletion can be found in the documentation.

Keyboard layout detection

Manticore can now detect the keyboard layout. This is especially useful for users who accidentally switch layouts while typing. The system supports popular layouts, both language-specific ones and QWERTY, which helps minimize input errors.

Advantages:

  • Improved search accuracy for multilingual users

  • Correction of errors caused by switching layouts

  • Seamless integration with fuzzy search and autocompletion
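To show what a layout error looks like, here is a minimal sketch that maps text typed on a Russian ЙЦУКЕН layout back to the QWERTY keys in the same physical positions. The RU_TO_US table (letters only) and the convertLayout function are hand-written assumptions for this illustration and say nothing about how Manticore implements its layouts option.

```php
<?php
// Illustrative only: a hand-written, letters-only key map between the
// Russian (ЙЦУКЕН) and US (QWERTY) layouts. Not Manticore's implementation.

const RU_TO_US = [
    'й' => 'q', 'ц' => 'w', 'у' => 'e', 'к' => 'r', 'е' => 't', 'н' => 'y',
    'г' => 'u', 'ш' => 'i', 'щ' => 'o', 'з' => 'p',
    'ф' => 'a', 'ы' => 's', 'в' => 'd', 'а' => 'f', 'п' => 'g', 'р' => 'h',
    'о' => 'j', 'л' => 'k', 'д' => 'l',
    'я' => 'z', 'ч' => 'x', 'с' => 'c', 'м' => 'v', 'и' => 'b', 'т' => 'n', 'ь' => 'm',
];

/**
 * Replace every character found in $map with the character on the same key
 * in the other layout. Characters missing from the map are left unchanged.
 */
function convertLayout(string $text, array $map): string
{
    // strtr() with an array replaces whole keys, so it is safe for UTF-8 input.
    return strtr($text, $map);
}

// The user meant to type "hello" and "search" but had the Russian layout on.
echo convertLayout('руддщ', RU_TO_US), "\n";  // hello
echo convertLayout('ыуфкср', RU_TO_US), "\n"; // search

// The reverse direction only needs the flipped map.
echo convertLayout('hello', array_flip(RU_TO_US)), "\n"; // руддщ
```

A search engine can try such conversions for each configured layout and keep the variant that actually matches keywords in the data.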

Using new features in the GitHub issue search demo

With these new features, we're excited to improve our issue search demo on GitHub. It turned out to be easy to do:

To enable fuzzy search, we added the following lines to the request:

$search->option('fuzzy', 1);
$search->option('layouts', ['ru', 'us', 'ua']);

For query suggestions we used the new autocomplete function:

$result = $client->autocomplete([
    'body' => [
        'table' => $table,
        'query' => $query,
        'options' => [
            'fuzziness' => 1,
            'layouts' => ['ru', 'ua', 'us'],
        ],
    ],
]);

These changes significantly improved the search accuracy in our demo, delivering more useful results to users.

Here's what it looks like in the issue search demo on GitHub

Integrating autocomplete and keyboard layout-aware fuzzy search with Manticore Search improves the user experience of the search tool. Because the search accounts for typos and offers suggestions based on input patterns, users can quickly find the results they need.

Here is an example of how autocomplete works, reminiscent of popular search interfaces:

Additionally, fuzzy search improves the user experience by correcting common errors such as typos. For example, if a user searches for a product but makes a mistake, fuzzy search will still find matching results. Here's what it looks like in a demo of issue search on GitHub:

Conclusion

The integration of fuzzy search and query suggestions into Manticore Search has greatly expanded its capabilities. These features make searching more intuitive, flexible, and efficient.

These features are available in the development version. Check the documentation to install it and take advantage of the new features.

We also invite you to try these features online in our GitHub issue search demos and share your opinion. Your feedback helps us improve Manticore Search.

Stay tuned, and if you like the project, support our repository on GitHub!
