How to improve the quality of data when filling out electronic forms

A few words about how address directories developed

Until 2018, the address classifier of the Russian Federation was maintained in the KLADR format. KLADR is based on a 6-level address structure: subject of the Russian Federation, district, city/settlement, street, house, apartment. The addresses in the directory are not particularly accurate: it contains errors, and an address may be out of date, duplicated, or not exist at all. Identifiers (KLADR codes) may change from version to version for the same objects, and the address structure itself is often hard to read.

Until mid-2021, the current, actively developed address database was that of the Federal Information Address System, in the FIAS format. Compared with KLADR, FIAS improved the way information is stored. Districts within the subjects of the Russian Federation appeared in the address directory, along with flags showing whether an address is current and the start and end dates of each entry. FIAS also introduced a global identifier, the GUID, which made it possible to link the parts of the address structure together. To this day, this identifier remains fixed and serves as the key by which external systems identify an address element.

Alongside these improvements, the FIAS address directories have some notable problems:

  • display of irrelevant information (for example, outdated OKTMO);

  • lack of data required in the system (for example, districts of federal cities);

  • incorrect data format (for example, “g City”);

  • hard-to-read data format for some addresses.

FIAS was replaced by the address base of the state address register in the GAR format. The register contains information about addresses in accordance with municipal and administrative-territorial divisions. The address information is better structured, the address storage format has been modernized, and a unique identifier of the address components, OBJECTID, has been introduced, which makes it possible to optimize the search for data in registry directories. The advantage for users is that address lookups are faster.

Why DaData’s Hints service was chosen as the solution

Address hints let the user quickly enter a correct address in a web form or application. The address base of the DaData service has been maintained since 2014, and the main address directory is updated once a week. Addresses in the service directories are stored in all three formats: GAR, FIAS, and KLADR. Overall, using the DaData service significantly improves data quality and makes filling in addresses easier.

Among the main advantages of the service:

  • more precise matches as the user types an address;

  • more complete data;

  • an address directory that contains more addresses;

  • a list of historical names for the lowest-level object, such as the previous names of a city or street;

  • automatic correction of typos (text standardization).

Let’s use an example to compare an address search in FIAS with an address search using “Hints” from DaData, so that the strengths and weaknesses of the two solutions are clearly visible. We will enter the address “Moscow region, Naro-Fominsky, Aprelevka”, whose OKTMO was changed to 46750000006 on March 1, 2018.

First, let’s enter the address using the FIAS directory. Entering “Aprelevka” returns only one address option, and it belongs to the Kaliningrad region.

After expanding the address details, we see that the data comes back in the incorrect format “g Urban settlement”, and “district” is shown where the city should be.

The extended response also returns the obsolete OKTMO 46638102 for the address.

For comparison, we send the same request to DaData.

The difference is noticeable. DaData returns up-to-date information and offers more similar options to choose from, which is a big advantage for users. The OKTMO returned is the updated one.

Up-to-date data in the response was the main criterion behind the choice of the DaData service.

What other advantages does DaData have in comparison with FIAS?

Conveniently, address search in “Hints” is flexible: it works on any part of the address, from the region down to the apartment, and the service offers a large number of options.

Here are some illustrative examples:

Another notable advantage is that an address can be found even when it is typed in the wrong keyboard layout. The service converts the text to the intended layout and returns results for the converted text. Typos are also corrected automatically by the “Hints” service, so there is no need to set up complex validation or design intricate forms.
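How a layout correction of this kind might work can be sketched as follows. This is a minimal illustration, not DaData’s actual implementation: it assumes the text was typed on a standard QWERTY keyboard while the Russian ЙЦУКЕН layout was intended, and the function name fix_layout is hypothetical.

```python
# Hypothetical sketch: replace each character typed in the Latin (QWERTY)
# layout with the character on the same key in the Russian (ЙЦУКЕН) layout.
LATIN_LOWER = "qwertyuiop[]asdfghjkl;'zxcvbnm,."
LATIN_UPPER = 'QWERTYUIOP{}ASDFGHJKL:"ZXCVBNM<>'
CYRILLIC_LOWER = "йцукенгшщзхъфывапролджэячсмитьбю"

# One translation table covering both unshifted and shifted keys.
_LAYOUT_TABLE = str.maketrans(
    LATIN_LOWER + LATIN_UPPER,
    CYRILLIC_LOWER + CYRILLIC_LOWER.upper(),
)


def fix_layout(text: str) -> str:
    """Convert text typed in the Latin layout to the Cyrillic layout."""
    return text.translate(_LAYOUT_TABLE)
```

For instance, fix_layout("fghtktdrf") yields "апрелевка". In practice such a conversion would be applied only when the input contains no Cyrillic characters, since commas and periods are also remapped.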

Of course, the DaData directories are not free of errors either; for some addresses, data is partially missing or out of date (OKTMO, districts of federal cities). For example, the response for the city of Vidnoye, Moscow region, does not include the Leninsky urban district unless auto-completion is enabled during entry. By default, the address is returned according to the administrative division.

The address “Hints” service has one more disadvantage: not all fields are filled in unambiguously. The block, letter, building, and property are all passed in the single data.block parameter, and the apartment, room, premises, and office are all passed in data.flat. They can only be told apart by the type passed in data.block_type(_full) and data.flat_type(_full) respectively. For example, if the service response contains “block_type_full” = “Building”, the value of data.block goes into the “Building” form field. Additional checks are therefore needed to parse the address into its components.
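The additional check described above could look roughly like this. The sketch assumes the response’s data object is available as a dictionary with the block, block_type_full, flat, and flat_type_full keys named in the text; the form field names and the type labels in the mappings are hypothetical placeholders and would need to be replaced with the labels the service actually returns.

```python
# Hypothetical mappings from the type label in the response
# to the name of the form field that should receive the value.
BLOCK_FIELDS = {
    "block": "block",
    "building": "building",
    "letter": "letter",
    "property": "property",
}
FLAT_FIELDS = {
    "apartment": "apartment",
    "room": "room",
    "office": "office",
}


def split_address_parts(data: dict) -> dict:
    """Route data.block and data.flat to form fields by their declared type."""
    fields = {}
    for value_key, type_key, mapping in (
        ("block", "block_type_full", BLOCK_FIELDS),
        ("flat", "flat_type_full", FLAT_FIELDS),
    ):
        value = data.get(value_key)
        if not value:
            continue  # nothing was returned for this part of the address
        type_label = (data.get(type_key) or "").strip().lower()
        # Unknown types go to a catch-all field instead of being dropped.
        fields[mapping.get(type_label, f"other_{value_key}")] = value
    return fields
```

Sending unrecognized types to a catch-all field keeps the value visible to the user rather than silently losing part of the address.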

About “Hints” when entering organization data

“Hints” for organizations are another useful tool. They help to quickly and accurately fill in information about organizations in a project declaration filed for individual residential buildings.

Where information about organizations is required, for example the bodies that approved the documents, the organizations that issued the conclusions of the examination of project documentation and (or) of the results of engineering surveys, or the organizations performing work as general contractors, the fields are filled in using “Hints” for organizations.

The user starts typing the organization’s name, the full name of the individual entrepreneur or head of the organization, or the TIN / OGRN in free form, and the service automatically suggests matching options and returns the results.

The search for an organization is carried out by:

The service searches only among active or liquidated companies. Importantly, “Hints” for organizations are based on the official database of the tax service. Data on organizations is updated and supplemented quickly; the maximum possible lag behind the tax service’s website is 3 days.

How to get information from DaData, and how it differs from connecting FIAS

Readers have probably already noticed that the interface for filling in the address and information about the organization looks simple. The screen first displays a single data entry field, which is filled using the “Hints”. By activating the address clarification flag, the user can enter the missing information manually.

The code that calls the address service is attached to the input field. To connect directly to DaData, you need an API key, which acts as a password for the service, and, if necessary, a paid subscription. In our case, we obtained an API key and a license that allows the necessary service directories to be deployed inside our own system perimeter. DaData data can also be obtained by calling the “Hints” API method directly (this also requires an API key) or by integrating with DaData through a jQuery plugin.

For FIAS/GAR, it is enough to load the directories into your own database, then deploy an internal API and connect plugins.

Like FIAS, DaData works through an API according to the following scheme: when the user starts entering an address, the system sends a REST request and receives a real-time response from the information provider in JSON/XML format.

Request to DaData:

curl --location --request POST 'http://domain/suggestions/api/4_1/rs/suggest/address' \
--header 'Accept: application/json' \
--header 'Content-Type: application/json' \
--data-raw '{
   "query": "115612, г Москва, ул Алма-Атинская"
}'

Request to FIAS:

curl --location --request POST 'https://domain/api/fias/search' \
--header 'Accept: application/json' \
--header 'Content-Type: application/json' \
--data-raw '{
   "full_address": "г Москва, ул Алма-Атинская"
}'
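The DaData request can also be issued from application code. Below is a minimal sketch using only the Python standard library. The host is the same placeholder as in the curl example, the function names are hypothetical, and passing the API key as a token in the Authorization header is an assumption that should be checked against the service documentation.

```python
import json
import urllib.request

# Placeholder host, as in the curl example; replace with the real one.
SUGGEST_URL = "http://domain/suggestions/api/4_1/rs/suggest/address"


def build_suggest_request(query, api_key=None, url=SUGGEST_URL):
    """Build (but do not send) a POST request for address hints."""
    headers = {
        "Accept": "application/json",
        "Content-Type": "application/json",
    }
    if api_key:
        # Assumption: the key is sent as a token in the Authorization header.
        headers["Authorization"] = f"Token {api_key}"
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


def fetch_suggestions(query, api_key=None, timeout=5):
    """Send the request and return the decoded JSON response."""
    request = build_suggest_request(query, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Building the request separately from sending it makes the payload easy to inspect and test without network access.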

After a detailed response is received, the necessary data is taken from it and displayed in the on-screen form. With normal input, this is the address on a single line. If the form splits the address into parts, the granular address fields from the service response are substituted into the corresponding form fields.
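A sketch of this step is shown below. It assumes the response JSON has a top-level "suggestions" list whose items carry a one-line "value" and a "data" object with granular fields; the granular field names used here (region, city, street, house) and the function name are assumptions to be checked against the actual response.

```python
def to_form_fields(response, split=False):
    """Return form values for the first suggestion in a service response."""
    suggestions = response.get("suggestions") or []
    if not suggestions:
        return {}
    first = suggestions[0]
    if not split:
        # Normal input: the whole address goes into a single line.
        return {"address": first.get("value", "")}
    data = first.get("data") or {}
    # Split input: copy only the non-empty granular fields the form knows about.
    wanted = ("region", "city", "street", "house")
    return {name: data[name] for name in wanted if data.get(name)}
```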

Results

After the trial connection of “Hints” from DaData, and based on our analysis and experience of using the service, it became clear that “Hints” was the most suitable solution for filling in the fields of the document form. To conclude, here is a brief comparison of “Hints” and FIAS on the main parameters.

| Parameter | “Hints” on addresses | FIAS |
|---|---|---|
| Search | Fast, flexible, accurate. Built-in text standardization and correction of wrong keyboard layouts | Slow, not accurate enough. Works well only for full-match search |
| Search query response | Detailed; the response contains several suitable options. Municipal division works when an additional module is connected | Several options are returned. The correct address is not always returned in the response. Lots of outdated data |
| Database | The address base is wider, regularly updated and supplemented with new addresses | Address directory is not updated |
| Data structure | Structured data. The service eliminates duplication of data. The downside is that houses, buildings, possessions, etc. are passed in one parameter | Compared to KLADR, the information is more structured, but for many addresses the integrity of the structure is broken. There are duplicate addresses |
