Ads were fraudulently clicked for 23,100 rubles in 8 days. How does this happen in practice?

A number of studies confirm that the most obvious ways of changing a browser fingerprint do not help much:

1. Cookies: Fingerprint generation does not rely on cookies to create a unique identifier. So while disabling them may give the bot some degree of privacy, it does not affect the ability to obtain “fingerprints”.

2. Incognito mode: This does not prevent fingerprinting, because many signals other than browsing history can be used for identification.

Anyone curious can read this study (Detecting incognito mode in Chrome 76 with a timing attack), which demonstrates how incognito mode can be detected (a Russian translation is available at this link). You will agree that it is very strange when a user clicks on an advertisement and arrives at the site in Incognito mode.

3. VPN: A VPN connection is effective for hiding the visitor’s browsing history or the user’s real IP address and geolocation. However, it does little to prevent fingerprinting, since many fingerprinting scripts do not use the IP address as their primary source.

There are several mechanisms that make fingerprinting more difficult, but they are not reliable either. Some browsers offer built-in functionality; for example, Firefox lets users block third-party requests to sites that generate fingerprints. Another way to reduce the number of available sources is to completely disable functionality such as HTML canvas or audio, which are commonly used to generate fingerprints. However, many sites will not load correctly without these technologies, and a browser with them disabled makes the bot easier to detect.

Returning to the analysis: what do we see? That day the bot clicked three times on an advertisement in Yandex.Direct. Each time it had the same IP address, the same browser “fingerprint” and … a different Yandex.ClientID (the YandexClientID field in the image above). Note the reCaptchaOdd field with a value of 0.1. I will return to it a little later, because this value confirms with about 99% certainty our suspicion that this is a bot.

What is Yandex.ClientID?

ClientID is an identifier that Yandex.Metrica automatically assigns to each unique site visitor. The ID is essentially randomly generated and identifies the browser in which the visitor is viewing your site. If a bot visited the same site using, for example, Google Chrome and Opera, Yandex.Metrica would record two different ClientIDs. The ClientID lets you distinguish and recognize unique visitors, collect their actions on the site into a session, and link data about sessions that occurred at different times.

Yandex.Metrica ClientID looks like this: _ym_uid = 1573226534123620835

_ym_uid is the name of the cookie, the first ten digits are the UNIX timestamp of the cookie's creation, and the remaining digits are a randomly generated number.
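To make the format concrete, here is a minimal sketch that splits the value above into its two parts. The function name `parse_ym_uid` is my own, not part of any Yandex API, and the sketch assumes the layout described in this article (ten timestamp digits followed by a random suffix).

```python
from datetime import datetime, timezone


def parse_ym_uid(value: str) -> tuple[datetime, str]:
    """Split a _ym_uid value into (creation time in UTC, random suffix).

    Assumes the layout described in the article: the first ten digits
    are a UNIX timestamp, everything after them is random.
    """
    if not value.isdigit() or len(value) <= 10:
        raise ValueError(f"unexpected _ym_uid format: {value!r}")
    created = datetime.fromtimestamp(int(value[:10]), tz=timezone.utc)
    return created, value[10:]


# The example value from the article.
created, suffix = parse_ym_uid("1573226534123620835")
print(created.isoformat(), suffix)
```

For the example value, the timestamp part decodes to a moment on 8 November 2019 (UTC), which is a quick sanity check that the first ten digits really are a creation time.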

In our case, Yandex.Metrica assigned a different ClientID each time to a single bot user. Why? To answer this question, we ran a simple experiment that you can easily repeat yourself. Our employee visited the same site directly three times from one work PC. The first two visits were made from the same browser an hour apart; for the third, he opened the site in the browser's “Incognito” mode (note that the “fingerprints” are the same, as they should be: the user is, in fact, the same person) …

Yandex.Metrica correctly recognized the second visit (we deliberately waited an hour so that the first session would close) and assigned the same ClientID. The third time (13:43), because of “Incognito” mode, a new ClientID was generated, so for Yandex.Metrica this is essentially a new user.

By the way, Yandex.Metrica generates a global ClientID tied to the browser, not to the site’s domain, and this (in theory; we did not check) lets you see the history of a user's visits to your different sites if they share one Yandex.Metrica counter. Another nuance is how long the Yandex.Metrica cookie is stored. Previously, it was stored for two years from the last visit, or for the period set in the user’s browser settings. Now some browsers have begun to limit the lifetime of cookies; Apple Safari, for example, stores them for only two weeks, regardless of the site settings. As a result, if a visitor returns to the site after more than two weeks, the analytics system sees two different users, since by the second visit the old cookie is gone and the browser has written a new one. In our experiment above, the cookie was not saved in Incognito mode, so Yandex.Metrica generated a new ClientID.

Returning to the case above. With very high probability, we can assume that the browser cookies were cleared before every click, so that Yandex.Metrica would identify the user as new each time and assign a new ClientID accordingly. Why? So that these clicks are counted as valid and the funds are debited from the client’s Yandex.Direct budget. Simple click-fraud programs work on the same principle (video above).

Let’s see how often a user with the specified “fingerprint” visited this site in general (via ads and otherwise). Take a look at the image below. This is data for one day. You can see that visits occur constantly.

What do all these visits have in common? Always one action (a click), 3-5 seconds on the site, and different IP addresses. Note that in some cases Yandex did not assign a ClientID to the client. This can happen for two reasons. First, the bot left the site so quickly that the Yandex.Metrica counter simply did not start. Second, Yandex itself identified the visit as a bot and considered the click “invalid”. Unfortunately, there are not many such visits. For the most part, every visit receives a new ClientID while the fingerprint stays the same.
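The pattern described above (one fingerprint, many ClientIDs, very short visits) can be checked with a few lines of code. This is only a sketch under my own assumptions: the record layout, the sample data, the reserved documentation IP ranges and the thresholds are all hypothetical, not our production rules.

```python
from collections import defaultdict

# Hypothetical visit log: (fingerprint, client_id or None, ip, seconds_on_site).
visits = [
    ("fp-a", "1622600000111111111", "198.51.100.10", 4),
    ("fp-a", "1622600500222222222", "203.0.113.20", 3),
    ("fp-a", None, "198.51.100.77", 5),  # counter did not assign a ClientID
    ("fp-a", "1622601200333333333", "203.0.113.91", 4),
    ("fp-b", "1622500000444444444", "192.0.2.5", 95),
    ("fp-b", "1622500000444444444", "192.0.2.5", 140),
]


def suspicious_fingerprints(visits, min_client_ids=3, max_seconds=5):
    """Return fingerprints seen with many ClientIDs and only short visits."""
    client_ids = defaultdict(set)
    durations = defaultdict(list)
    for fingerprint, client_id, _ip, seconds in visits:
        if client_id is not None:
            client_ids[fingerprint].add(client_id)
        durations[fingerprint].append(seconds)
    return sorted(
        fp
        for fp in durations
        if len(client_ids[fp]) >= min_client_ids
        and max(durations[fp]) <= max_seconds
    )


print(suspicious_fingerprints(visits))
```

A real visitor such as `fp-b` keeps one ClientID across visits and stays on the site much longer, so it is not flagged.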

Please do not write in the comments that Yandex clicks the ads itself :), or that Yandex does not fight bots at all because, as they say, that would be “bees against honey”. It is simply that Yandex’s main task is to sell ads, not to fight click fraud. Look at Google: despite its colossal capabilities, third-party services that provide additional protection thrive. By the way, Yandex recently announced a new algorithm for detecting invalid clicks.

How did the bot change its IP address? Most likely it used a proxy server. I will not explain in detail what a proxy is; there is plenty of information available. What matters here is that all the clicks were on advertising (the source type is campaign, which means any “paid source” in the Matomo classification). Clearly, active click fraud is taking place.

How often did this bot change IP addresses when clicking on the site's ads? We made a simple selection; see the image below. You can see that each IP address was used several times (Visitcount), and it is easy to see that some IP addresses were clearly taken from the same pool (for example, 176.222.*.* or 176.214.*.*). It is not that hard to buy a pool of addresses.
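Spotting addresses from a common pool is easy to automate by grouping them by network prefix. The sketch below groups by /16, matching the 176.222.*.* style of pattern mentioned above; the individual addresses are invented for illustration and are not taken from our logs.

```python
from collections import Counter
import ipaddress


def count_by_slash16(ips):
    """Count how many addresses fall into each /16 network."""
    counts = Counter()
    for ip in ips:
        # strict=False lets us pass a host address and get its network back.
        network = ipaddress.ip_network(f"{ip}/16", strict=False)
        counts[str(network)] += 1
    return counts


# Hypothetical sample addresses.
sample_ips = ["176.222.10.1", "176.222.45.9", "176.214.3.77", "192.0.2.15"]
pools = count_by_slash16(sample_ips)
for network, hits in pools.most_common():
    print(network, hits)
```

Prefixes that collect many distinct addresses with the same fingerprint are good candidates for a closer look.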

Why did we decide it was a bot at all? Our protection uses several algorithms that work sequentially, identifying bots step by step. There are no secrets in the algorithms themselves; you can repeat them on your own site if you have programmers. For example, we check whether an IP address appears on blacklists (here, for example, is one of the possible sources).

Take the address 178.214.248.145 (see the image above; this is the bot’s address). You can easily check its reputation using any service, or do it yourself by following the link: https://dnschecker.org/ip-blacklist-checker.php?query=178.214.248.145 (note that after a while the address may drop off the blacklists, so it is important to check promptly). This address was found on several lists.
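If you would rather check from code, many blacklists are published as DNS blocklists (DNSBL): you reverse the octets of the IP address, append the list's zone, and look the name up in DNS. The sketch below shows that general convention; it is not necessarily how our checks or the service above work, and the zone `dnsbl.example.org` is a placeholder you would replace with a real list.

```python
import ipaddress
import socket


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the DNSBL lookup name: reversed IPv4 octets plus the zone."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone


def is_listed(ip: str, zone: str) -> bool:
    """Return True if the DNSBL zone resolves a record for this address.

    Simplified: any successful answer is treated as "listed". Real lists
    may use specific return codes, so check the list's documentation.
    """
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
        return True
    except socket.gaierror:
        return False


# Placeholder zone, for illustration only.
print(dnsbl_query_name("178.214.248.145", "dnsbl.example.org"))
```

Calling `is_listed` requires network access and a real zone, so only the name construction is shown being run here.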

What else did we do? We checked each visit with Google reCAPTCHA v3 (invisible) to get an additional assessment of session quality. Going back to the images above, you will see that for all the bot visits mentioned, the reCAPTCHA score was 0.1 (in Google's terminology, very likely a bot), while the visit of our employee, made while testing how Yandex.ClientID is assigned, scored 0.9 (very likely a person, which it was :). Next, we have a machine learning algorithm (more specifically, the K-nearest neighbors algorithm) that clusters all visits in order to find the bots that try to mimic ordinary users.
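reCAPTCHA v3 returns a score between 0.0 and 1.0, where values near 1.0 mean a likely legitimate interaction and values near 0.0 a likely bot. A minimal sketch of turning that score into a decision follows. The function name and the 0.5 cut-off are my assumptions for illustration; choose a threshold that fits your own traffic.

```python
def classify_recaptcha_score(score: float, threshold: float = 0.5) -> str:
    """Label a reCAPTCHA v3 score as likely human or likely bot.

    The threshold is an illustrative assumption, not a fixed rule.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score out of range: {score}")
    return "likely human" if score >= threshold else "likely bot"


# The two scores mentioned in the article.
print(classify_recaptcha_score(0.1))
print(classify_recaptcha_score(0.9))
```

In our pipeline the score is only one signal among several, so a single threshold like this should not be the only basis for blocking.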

Coming back to the behavior of this click bot. We counted how many times it changed its environment so that Yandex.Metrica treated it as a new user each time: 462 times.

Why did we decide that this is definitely a bot, when it is impossible to say for sure? That is true: we can never be absolutely certain that what we blocked was a bot. But we can judge by the substitution of the ClientID and by the identical digital “fingerprint” (which, together with other data, only confirms our hypothesis). This is not how a real user behaves. To be sure, you can also look through Webvisor in Yandex.Metrica: the actions of bots are often distinguished by monotonous, linear mouse movements, while a real user's movements are more chaotic but still make sense for the specific site and its navigation. Not every client is ready to give our analysts access to Webvisor, so we have to rely on the data we collect ourselves, shown in the tables in this review. By the way, analysis of mouse movement is a very interesting approach to identifying bots, and I recommend that those interested read the work at this link (“CLUSTERING WEB USERS BY MOUSE MOVEMENT TO DETECT BOTS AND BOTNET ATTACKS”, San Luis Obispo).
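One simple way to put a number on the “linearity” of mouse movement is the ratio of straight-line distance to total path length. This is my own illustrative heuristic, not the method from the paper above and not what Webvisor computes; the coordinates are made up.

```python
import math


def straightness(points):
    """Ratio of start-to-end distance to total path length (0.0 to 1.0).

    Values close to 1.0 mean the cursor moved in an almost straight line.
    """
    if len(points) < 2:
        raise ValueError("need at least two points")
    path_length = sum(math.dist(a, b) for a, b in zip(points, points[1:]))
    if path_length == 0:
        return 0.0  # no movement recorded; treated as not straight by convention
    return math.dist(points[0], points[-1]) / path_length


# Hypothetical cursor tracks as (x, y) pixel coordinates.
bot_like = [(0, 0), (50, 50), (100, 100)]
human_like = [(0, 0), (100, 0), (100, 100), (0, 100)]
print(straightness(bot_like), straightness(human_like))
```

On its own this ratio proves nothing, since a person can also move the cursor in a straight line, but across hundreds of visits with the same fingerprint it is a useful extra signal.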

This client came to us on June 2, 2021 with a request to check whether his ads were being fraudulently clicked (advertising agencies usually argue that Yandex / Google refund money for invalid clicks, so there is no need to worry). During the first 8 days of June, about 462 ad clicks were made. We do not know the cost of one click, so we will assume it is about 50 rubles (not that much).

Unfortunately, we do not have access to Yandex.Metrica / Yandex.Direct to check exactly how many clicks Yandex recognized as invalid, but from our experience analyzing other clients, we know that when the Yandex.ClientID is substituted, the clicks are counted as valid and the money is debited in 80-90% of cases. Simple math gives us an estimate of the losses over 8 days: 462 clicks * 50 rubles = 23,100 rubles. (I did not apply the 80% correction for the share of clicks counted as valid, since the assumed cost per click of 50 rubles does not seem very high for this line of business.)
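For anyone who wants to plug in their own numbers, the estimate looks like this in code. The 50-ruble cost per click is the assumption stated above, and the 80% and 90% adjustments are the range from our experience, not measured values for this client.

```python
clicks = 462              # ad clicks recorded in the first 8 days of June
assumed_cpc_rub = 50      # assumed cost per click, in rubles

# Upper estimate: every click is charged.
upper_estimate = clicks * assumed_cpc_rub


def adjusted_loss(total_rub: int, valid_percent: int) -> int:
    """Loss if only valid_percent of the clicks are counted as valid."""
    return total_rub * valid_percent // 100


print(upper_estimate)                      # 23100
print(adjusted_loss(upper_estimate, 80))   # 18480
print(adjusted_loss(upper_estimate, 90))   # 20790
```

Even with the lower 80% adjustment, the loss stays above 18,000 rubles for eight days.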

Well, something like this … I do not even know what to write in conclusion. The struggle with click bots is endless: they try to outwit the Yandex / Google protection mechanisms and mimic ordinary users, and we try to catch them using different methods. It is the eternal contest of “armor” and “projectile”. This bot did not return after being blocked. Although, of course, it could have undergone a “reincarnation” by changing its “fingerprint”.
