Shadow traffic: why 20% of traffic goes unaccounted for

Getting complete traffic information has always been a challenge. Dark traffic first came to light in 2012, when it became apparent that traffic whose referrer information had been stripped was being categorized as “direct”. Dark traffic volumes have grown with the spread of HTTPS and the rising popularity of private messaging systems (Slack, WhatsApp, etc.).

Campaign tracking helps uncover some dark traffic, but recently another obstacle for analytics has appeared: shadow traffic.

What is shadow traffic?

Shadow traffic consists of site visits that a conventional analytics vendor does not track.

Shadow traffic is real traffic from real people, but you will never see them or their behavior. It is real traffic that your analytics are missing.

Before dealing with the concept of shadow traffic, you need to understand how web, content, and product analytics tools work. Every serious analytics tool requires adding code to your website or application. Once you add the code, your analytics vendor starts providing you with traffic and event reports.

When someone visits your website or application, this code sends “events” to the analytics provider. Events are small packets of information that briefly describe activity on the site. You are probably familiar with the most common web analytics events: page views, video starts, and email address captures. If events cannot be transmitted for some reason, for example because a browser or an ad blocker stops them, your analytics provider registers nothing: no user, no session, no page views. Such visits are still real traffic from real people, but you will learn nothing about them or their behavior. That is what shadow traffic is: real traffic that your analytics are missing.

Shadow traffic occurs when analytics events cannot reach analytics servers.
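
The mechanism described above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the `AnalyticsServer` class, the `send_event` function, and the three sample visitors are hypothetical and do not represent any vendor's actual tracker.

```python
from dataclasses import dataclass, field


@dataclass
class AnalyticsServer:
    """Stands in for the vendor's collection endpoint (hypothetical)."""
    received: list = field(default_factory=list)

    def collect(self, event: dict) -> None:
        self.received.append(event)


def send_event(event: dict, server: AnalyticsServer, blocked: bool) -> bool:
    """Deliver an event unless the browser or an ad blocker stops the request."""
    if blocked:
        return False  # the request never reaches the analytics server
    server.collect(event)
    return True


server = AnalyticsServer()
visits = [
    {"visitor": "a", "blocked": False},
    {"visitor": "b", "blocked": True},  # e.g. an ad blocker is installed
    {"visitor": "c", "blocked": False},
]
for visit in visits:
    event = {"type": "pageview", "visitor": visit["visitor"], "url": "/pricing"}
    send_event(event, server, visit["blocked"])

real_visitors = len(visits)
tracked_visitors = len({e["visitor"] for e in server.received})
shadow_visitors = real_visitors - tracked_visitors
```

In this toy run, three people visit the site but only two ever appear in the vendor's data. The third visitor is shadow traffic.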

What are the causes of shadow traffic?

Shadow traffic is generated by ad blockers, browser privacy settings, and other tools that prevent events from reaching the analytics provider. The most important causes, both on the web and in mobile apps, are ad blockers and the new privacy protections built into browsers. Below we discuss what motivates users to adopt ad blockers and browser privacy features; in short, users have a natural desire to protect their privacy.

Despite their name, ad blockers often block analytics providers as well. Enhanced Tracking Protection in Firefox, Intelligent Tracking Prevention (ITP) in Safari, and Tracking Prevention in Microsoft Edge act as built-in blockers for a portion of analytics services. Even Google, the advertising giant, has built an ad blocker into Chrome. The problem of shadow traffic is becoming more serious and cannot be ignored, as it affects almost all popular browsers and platforms.

The most important causes of shadow traffic are ad blockers and the new privacy features built into browsers. Less common tools also generate shadow traffic.

Some advanced Internet users adopt more sophisticated privacy protection tools, which are less common causes of shadow traffic. These include:

Network-level blocking, for example Pi-hole
VPN-level blocking, for example NordVPN
Device-level DNS blocking, for example AdGuard
App-based blocking on the device, for example Wipr
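
Many of these tools work by comparing the hostname of each outgoing request against a blocklist. The following is a minimal sketch of that idea, not the implementation of any tool named above: the `is_blocked` function, the blocklist entries, and the hostnames are all hypothetical, and real products use their own list formats and matching rules.

```python
def is_blocked(hostname: str, blocklist: set[str]) -> bool:
    """Return True if the hostname or any of its parent domains is listed."""
    labels = hostname.lower().rstrip(".").split(".")
    # Check "a.b.example", then "b.example", then "example".
    for i in range(len(labels)):
        if ".".join(labels[i:]) in blocklist:
            return True
    return False


BLOCKLIST = {"analytics.example", "ads.example"}  # hypothetical entries

requests = ["www.mysite.example", "cdn.analytics.example", "ads.example"]
allowed = [host for host in requests if not is_blocked(host, BLOCKLIST)]
```

Because the check runs outside the page, at the network, DNS, or device level, the site itself still loads while the analytics request silently disappears.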

Why do users use ad blockers?

While every user is unique, there are three main reasons for using ad blockers:

1. Preventing leaks of personal information to third parties: Many ad networks and ad sellers take advantage of their connections with site operators to create cross-site device graphs of users. These device graphs associate personal information from one site with behavioral information on another site. Users rightly consider such commercial practices to be an infringement of privacy and install ad blockers in response, which reduces the impact of these practices on their browser. Regulations such as the GDPR and CCPA were created in part to address this issue on a large scale.

2. Avoiding creepy behavioral retargeting: Many ad networks allow their clients to “retarget” users based on their browsing history. For example, they might show you an advertisement for a product you viewed (but did not buy) on Amazon.com in the ad slots of a news site that you visit a few days later. Users find these advertisements unsettling and install ad blockers to disrupt the technology behind them.

3. Improving web page performance: This is probably the most pragmatic reason for the rise in popularity of ad blockers, since they improve the end-user experience on most websites. Many sites can have ten to twelve ad networks installed, and sometimes these ad networks load other ad networks through cascades of ad technology sellers. All of this significantly slows down the website and sometimes makes browsing unbearable.

Why are blocking technologies built into browsers?

The reasons listed above may seem like mere annoyances, but preventing personal information leaks, eliminating user inconvenience, and increasing page speed are also core goals of the popular browsers and of the companies or non-profit organizations that create them (for example, Apple, Google, Microsoft, and Mozilla). In addition, alternative browsers like Brave and Vivaldi have begun to gain popularity thanks to their built-in blocking technologies. In other words, this technology and its benefits for the end user are an integral part of the “browser wars” that we have seen over the past few decades: browser developers build on one another's work while competing for market share.

Unlike third-party ad blockers, browser-based blocking technologies actively try not to significantly disrupt web technologies. They are implemented differently and are usually more conservative in how they work. However, in the last couple of years, systems like Safari ITP and Mozilla ETP have begun blocking some classes of technologies. The list of blocked technologies includes social media trackers, cross-site tracking cookies, fingerprinters, and cryptominers. This usually results in a better experience for the average user.

Growth of shadow traffic

According to a 2019 international study by GlobalWebIndex, cited by IMPACT, 47% of today’s internet users use some kind of blocker. In 2020, Parse.ly began its own study of the percentage of untracked visitors. Early access participants found that, compared to traditional web analytics data, between 20% and 40% of their traffic was shadow traffic.

Missing data on 20% to 40% of your visitors leads to poorly grounded decisions, lost profits, and poor financial results. The first step to improving data quality is to tackle the problem of shadow traffic while maintaining the privacy, usability, and performance that your audience cares about.
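
To get a feel for what a 20% to 40% gap means in absolute numbers, here is a short worked calculation. It assumes that the shadow share is a fraction of all real visits (tracked plus untracked); the `estimate_total_visits` function and the figure of 80,000 tracked visits are hypothetical and chosen only for illustration.

```python
def estimate_total_visits(tracked: int, shadow_share: float) -> float:
    """Estimate real visits when shadow_share of ALL visits goes untracked."""
    if not 0 <= shadow_share < 1:
        raise ValueError("shadow_share must be in the range [0, 1)")
    # tracked = total * (1 - shadow_share), solved for total
    return tracked / (1 - shadow_share)


tracked = 80_000  # hypothetical visits reported by the analytics tool

low_total = estimate_total_visits(tracked, 0.20)
high_total = estimate_total_visits(tracked, 0.40)

missing_low = low_total - tracked
missing_high = high_total - tracked
```

Under these assumptions, a report showing 80,000 visits corresponds to roughly 100,000 to 133,000 real visits, so somewhere between about 20,000 and 53,000 visits never appear in the reports.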

How to measure shadow traffic

Regular website or application analytics tags are not capable of detecting shadow traffic. There are ways to make shadow traffic visible, but they are difficult to implement and have drawbacks; moreover, they all require extensive engineering knowledge and the work of programmers.

Option 1 – Consolidated edge server logs

When users access your site or application, they interact with your servers. These servers can log user interactions, and you can then visualize this data to understand your users. In practice this strategy fails, because reconciling logs from different servers, CDNs, or even multiple sites is nearly impossible, even with server log management software. Those tools are designed for developers and collect the information developers care about, such as load balancing and server health checks. Marketers and content editors care about engagement and conversion instead.
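
To show how much filtering raw logs need before they say anything about readers, here is a minimal sketch that counts page views from access log lines. It assumes logs in the Common Log Format; the ignored path prefixes and the sample lines are hypothetical, and a real deployment would also have to merge logs from several servers and CDNs, which this sketch does not attempt.

```python
import re
from collections import Counter

# Matches the request line and status code of a Common Log Format entry.
LOG_PATTERN = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" (?P<status>\d{3})'
)

# Paths that matter to operations, not to editors (hypothetical examples).
IGNORED_PREFIXES = ("/healthz", "/static/", "/favicon.ico")


def count_page_views(lines):
    """Count successful GET requests for content pages, grouped by path."""
    views = Counter()
    for line in lines:
        match = LOG_PATTERN.search(line)
        if not match:
            continue  # skip malformed lines
        if match["method"] != "GET" or match["status"] != "200":
            continue
        path = match["path"].split("?", 1)[0]  # drop the query string
        if path.startswith(IGNORED_PREFIXES):
            continue
        views[path] += 1
    return views


sample_log = [
    '203.0.113.5 - - [10/Oct/2020:13:55:36 +0000] "GET /articles/shadow-traffic HTTP/1.1" 200 5120',
    '203.0.113.9 - - [10/Oct/2020:13:55:40 +0000] "GET /healthz HTTP/1.1" 200 2',
    '203.0.113.5 - - [10/Oct/2020:13:56:01 +0000] "GET /static/app.js HTTP/1.1" 200 900',
    '198.51.100.7 - - [10/Oct/2020:13:57:12 +0000] "GET /articles/shadow-traffic?utm_source=x HTTP/1.1" 200 5120',
    '198.51.100.7 - - [10/Oct/2020:13:57:30 +0000] "POST /login HTTP/1.1" 302 0',
]
page_views = count_page_views(sample_log)
```

Even this small example has to discard three of five lines, and it still says nothing about sessions, engagement, or conversion.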

Option 2 – Server-side tracking

Blocking usually occurs in the web browser. Ultimately, however, users must make requests to your servers to access content. By sending analytics events to the analytics vendor from your own servers, you can continue to track the user even if the browser blocks requests to the analytics service. This option is technologically complex, and it is practically impossible to recreate all the intricacies of reliable client-side analytics tools. Most likely, you will miss out on data that would otherwise be collected automatically.
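
A minimal sketch of the server-side approach might look like the following. Everything here is an assumption made for illustration: the collection URL, the event fields, and the `build_pageview_event` and `track_server_side` functions are hypothetical, and the transport is a stand-in that records the request instead of issuing a real HTTP POST.

```python
import json
import time
from typing import Callable

COLLECT_URL = "https://analytics.example/collect"  # hypothetical endpoint


def build_pageview_event(path: str, user_agent: str, referrer: str = "") -> dict:
    """Assemble a page view event from data the server already sees."""
    return {
        "type": "pageview",
        "url": path,
        "referrer": referrer,
        "user_agent": user_agent,
        "timestamp": int(time.time()),
    }


def track_server_side(event: dict, send: Callable[[str, bytes], None]) -> None:
    """Forward the event from your own server instead of from the browser."""
    send(COLLECT_URL, json.dumps(event).encode("utf-8"))


# Usage with a stand-in transport; a real one would issue an HTTP POST.
outbox = []
event = build_pageview_event(
    "/pricing", "Mozilla/5.0", referrer="https://news.example/"
)
track_server_side(event, lambda url, body: outbox.append((url, body)))
```

Note that the event contains only what the request itself reveals. Anything that a client-side tag would observe inside the page after it loads has to be rebuilt by hand, which is where most of the complexity comes from.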

Option 3 – Use existing analytics services

The best option is to use a service from your analytics provider that can reliably capture aggregate analytics events while maintaining confidentiality. This solution covers both visible and shadow traffic without compromising the privacy, convenience, and performance that users have chosen. An example of such a service is Parse.ly.
