Web analytics, or where does the collection of user data begin?

Greetings to all!

From the title you already understand what I want to talk about (with the community, novice specialists in the field and other interested parties). I am sure some will say that user data is collected not only from the Internet but also from other sources. You would be right, but I still want to talk about collecting data from the Internet and about the requirements today's market places on Middle+ level specialists. We won't talk about advertising and website metrics, because that goes without saying.

I propose to break these requirements down into:

  1. Analytics systems and additional tools

  2. SQL and DBMS

  3. BI (Data Visualization Systems)

  4. Soft Skills

Analytics systems

In my work I constantly use data loaded into the database via API, but I can't do without setting up and defining events, so below I'll tell you a little about analytics systems. We'll also talk about the additional tools that web analysts use.
As you know, there are quite a lot of analytics systems on the market, but I would highlight a few leaders:

  1. Google Analytics (GA4)

  2. Yandex Metrica

  3. Matomo

Google Analytics (GA4). There used to be the excellent Universal Analytics, but GA4 is also interesting and offers plenty of useful features. Setup comes down to installing the tracking code on the site, either through the Google tag (gtag.js) or through Google Tag Manager (hereinafter GTM), which is also a script embedded in the site. I know only a few colleagues who send events with gtag.js directly, calling it from the corresponding functions in the site code; most often we deal with GTM, which we will discuss a little later. In the data streams section you can open the advanced settings and see that streams can be created not only for the web but also for mobile applications (Android and iOS). I'll cover mobile application settings in another article. In the admin section you can configure links with other products of the Google ecosystem, user permissions, filters, conversions, custom definitions, etc. We will not dwell on each point, because many Western authors have already covered them, and even more authors from the CIS have translated that material for their videos and blogs. Among the big advantages are good integration with services outside the ecosystem, including A/B testing and data visualization systems, and the ability to build visualizations such as tables and funnels. The biggest disadvantage I have encountered is the (other) value in reports. As a rule, it appears when a dimension has a very large number of unique values. Google offers several solutions, one of which is exporting the data to BigQuery (the cheaper option). In GA4, a user with any level of knowledge can view the main metrics for their web resource and make decisions based on them.
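The (other) value is easier to anticipate if you count unique values per dimension before the data reaches your reports. Below is a minimal sketch in Python. It assumes events are already exported as a list of dictionaries; the `ASSUMED_LIMIT` threshold and the `high_cardinality_dimensions` helper are hypothetical and do not reflect GA4's actual limits.

```python
from collections import defaultdict

# Hypothetical threshold for illustration only; check the GA4 documentation
# for the real cardinality limits that apply to your property.
ASSUMED_LIMIT = 3


def high_cardinality_dimensions(rows, limit=ASSUMED_LIMIT):
    """Return {dimension: unique_value_count} for dimensions above `limit`."""
    seen = defaultdict(set)
    for row in rows:
        for dimension, value in row.items():
            seen[dimension].add(value)
    return {dim: len(vals) for dim, vals in seen.items() if len(vals) > limit}


# Sample rows: page_path carries a query parameter, so every value is unique.
sample_rows = [
    {"page_path": "/cart?id=1", "device": "mobile"},
    {"page_path": "/cart?id=2", "device": "desktop"},
    {"page_path": "/cart?id=3", "device": "mobile"},
    {"page_path": "/cart?id=4", "device": "mobile"},
]

print(high_cardinality_dimensions(sample_rows))  # {'page_path': 4}
```

In this sample the query parameter inflates `page_path`, which is a typical reason for a dimension to collect too many unique values.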

Yandex Metrica. This is also quite an interesting analytics system, which in the CIS is used almost on a par with GA4. The interface is very different from GA4. Metrica has many standard reports, which lets even a novice user navigate the data well. Setting up events (goals) offers good functionality, but it is still a long way from the Google ecosystem. To be fair, there is a good additional feature in the form of so-called event markup, and combined with GTM the system gains serious weight in the eyes of users. There are also quite interesting add-ons: Webvisor, scroll and click maps, Varioqub for A/B testing, and CRM integration.

Matomo. The system is rather similar to Universal Analytics in how it collects data, and it has a built-in Tag Manager. The interface is certainly not great, but what we need is functionality. Quite a lot of plugins can be added to the system (both paid and free), and you can create your own (hardcore specialists will like that). There is a built-in A/B testing system in the form of a separate plugin, which costs no more than $300 per year. Of course, there are also disadvantages. Matomo does not understand the usual data layer in the form of dataLayer, so you need to use window._mtm.push or plugins. And then there is the price: while Yandex and Google provide a free solution (we won't talk about the 360 systems), Matomo has two options. The first is the Cloud solution; the second is the open-source version on your own servers or with an internal provider. Neither option is very cheap, although for some "money is no problem." Also, when working with the self-hosted solution, you should take into account the developers' recommendations regarding hardware and the database.

Google Tag Manager. GTM is quite simple and easy to manage, and it spares you from constantly bothering the developers with small requests, since they never have the resources for them. GTM ships with many built-in tags, triggers and variables, which at the initial stage are enough to set up the main events on your web resource. There is also a template gallery that you will definitely like. From GTM you can send data anywhere (third-party analytics systems, advertising accounts, databases, etc.). Variables deserve special attention, because they offer very useful functionality: custom JS code, regular expression tables, and so on. You can also create your own (custom) tag in GTM; it is written in JS, so AI can help you. You are limited only by your imagination and the capabilities of the system. I want to pay special attention to Server-Side GTM. I would call it a cool extra, because it improves the accuracy of data collection, improves page performance (events are processed on the server rather than on the client), and gives better control over user privacy settings. I will write a separate detailed article about Server-Side, but I want to draw specialists' attention not only to simple event setup within the system, but also to deploying containers in Cloud Run on Google Cloud Platform, with subsequent logging and error detection. Cloud Run is quite simple; one of its characteristic features is automatic scaling of container instances as the load grows, so you don't have to worry about rising load on the server. Given all these recommendations, you will need to get somewhat familiar with the Google Cloud ecosystem; it may come in handy later for international projects.
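Returning to the variables mentioned above: a regular expression table is essentially an ordered list of "pattern → output" rows, where the first matching row wins and a default is returned otherwise. The sketch below imitates that idea in Python. It is not GTM's implementation; the `regex_table` function, its parameters and the sample rows are assumptions made for illustration.

```python
import re


def regex_table(value, rows, default=None, ignore_case=True):
    """Return the output of the first row whose pattern fully matches `value`.

    `rows` is an ordered list of (pattern, output) pairs; `default` is
    returned when nothing matches.
    """
    flags = re.IGNORECASE if ignore_case else 0
    for pattern, output in rows:
        if re.fullmatch(pattern, value, flags):
            return output
    return default


# Hypothetical mapping from page path to page type.
PAGE_TYPE_ROWS = [
    (r"/", "home"),
    (r"/catalog/.*", "catalog"),
    (r"/cart/?", "cart"),
]

print(regex_table("/catalog/shoes", PAGE_TYPE_ROWS, default="other"))  # catalog
print(regex_table("/Cart", PAGE_TYPE_ROWS, default="other"))           # cart
print(regex_table("/about", PAGE_TYPE_ROWS, default="other"))          # other
```

Row order matters: if a broad pattern is placed above a narrow one, the narrow row is never reached.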

A/B testing systems. Google used to have the excellent Google Optimize, but unfortunately it was shut down without a replacement, so some people use Varioqub, VWO and other systems. Larger companies write their own systems for splitting traffic or swapping content, and then analyze the results in depth, but that's another story.
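To give an idea of what an in-house splitter can look like, here is a minimal sketch of deterministic hash-based assignment. It is one common approach, not a description of any particular company's system; the `assign_variant` function, the experiment name and the user IDs are hypothetical.

```python
import hashlib


def assign_variant(user_id, experiment, variants=("control", "test")):
    """Deterministically map a user to a variant of a given experiment."""
    # Including the experiment name keeps splits independent across experiments.
    key = f"{experiment}:{user_id}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    # Use the first 8 bytes of the hash as an integer bucket number.
    bucket = int.from_bytes(digest[:8], "big") % len(variants)
    return variants[bucket]


# The same user always lands in the same variant of the same experiment.
first = assign_variant("user-42", "new_checkout")
second = assign_variant("user-42", "new_checkout")
print(first == second)  # True
```

Because the assignment depends only on the user ID and the experiment name, no lookup table is needed and the result can be reproduced later during analysis.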

SQL and DBMS

Everything here is quite simple: after loading data via API from various advertising accounts and analytics systems, you can start doing analytics. SQL is used to query the database, but the list of possible databases is quite varied. They can be either cloud-managed or self-hosted solutions. The most popular are BigQuery, ClickHouse, PostgreSQL and Greenplum. In your work you will definitely encounter tasks ranging from simple SELECTs to CTEs and window functions, so it is worth devoting time to learning SQL. Also worth paying attention to:

  1. working with Cloud solutions; for web analysts, Yandex Cloud and Google Cloud Platform are sufficient;

  2. setting up tables and raising the database in the cloud;

  3. working with logs when transferring data to a cloud-managed database (you never know when something will break);

  4. keeping track of costs from time to time; this is probably not your responsibility, although for small and medium-sized companies it is an important criterion.
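As an illustration of the "CTE and window functions" level mentioned above, here is a small self-contained sketch. It runs the SQL through Python's built-in sqlite3 module, purely so that the example works without a server; it assumes SQLite 3.25 or newer, which added window functions. The `events` table and its columns are hypothetical, and the syntax of the listed DBMSs may differ in details.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events (user_id TEXT, event_name TEXT, event_ts INTEGER)"
)
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?)",
    [
        ("u1", "page_view", 100),
        ("u1", "add_to_cart", 160),
        ("u1", "purchase", 400),
        ("u2", "page_view", 120),
        ("u2", "page_view", 300),
    ],
)

# The CTE numbers each user's events and measures the gap to the previous one.
QUERY = """
WITH ordered AS (
    SELECT
        user_id,
        event_name,
        ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY event_ts) AS event_no,
        event_ts - LAG(event_ts) OVER (
            PARTITION BY user_id ORDER BY event_ts
        ) AS secs_since_prev
    FROM events
)
SELECT user_id, event_name, event_no, secs_since_prev
FROM ordered
ORDER BY user_id, event_no
"""

result = conn.execute(QUERY).fetchall()
for row in result:
    print(row)
```

For the first event of each user `secs_since_prev` is NULL (None in Python), because LAG has no previous row to look at.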

BI (Data Visualization Systems)

In most cases, the functionality of Looker Studio and DataLens is enough. But at a more serious level, it is worth looking at Tableau and Power BI. Large companies most often run them on their own servers, where the reports are stored. Keep in mind that external BI systems are unlikely to be allowed to connect to internal databases, so you will use whatever the employer provides inside the security perimeter.

Soft Skills

In your work you will need to communicate with colleagues at different levels:

  1. With developers – when implementing the next technical specification, checking the correctness of the work performed, or just talking about technical topics.

  2. With managers – when presenting a report or study.

  3. With customers of dashboards and ad-hoc requests – when explaining what a certain metric means, how to work with a dashboard, how a pageview differs from a session (this also happens), etc.

Don’t forget about creativity, because in analytics you are limited only by your imagination and the capabilities of the systems.

I hope this article helps novice specialists understand the needs of the market and helps practicing specialists improve their current skills.
