How was Twitter slowed down? What is DPI? A breakdown

You’ve probably heard that the so-called Twitter slowdown kicked off this week.

Since March 10, 100% of mobile and 50% of fixed-line Twitter traffic in Russia has officially been slowed down. All this is made possible by DPI technology. We decided to figure out how it works and how the throttling mechanism operates.

Why does it matter? It is reasonable to assume that Twitter is a rehearsal for slowing down or blocking Facebook and then YouTube.

Therefore, today we will figure out what DPI is, how it works and what its capabilities are.

Will it end up here the way it is in China? And how can you protect yourself?

Let’s start from the beginning. In November 2019 a law came into force requiring telecom operators to start installing special “technical means of countering threats” (TSPU).

Later it turned out that the abbreviation TSPU hides a technology known to any system administrator: DPI, or Deep Packet Inspection. Not to be confused with pixel density: that is also DPI, but it stands for Dots Per Inch.

And now it is being tested on a national scale, with Twitter as the test case.

I’ll say right away that I condemn this kind of pressure on social networks, because it threatens freedom of speech, which here survives only on the Internet. But let’s move on to the technology.

How does DPI work?

Sysadmins know that data transmission over a network is divided into layers: from the physical layer, where bits are transmitted, to the application layer, where, say, a messenger packs up your message. At each layer the data packet is wrapped in that layer’s own metadata, for example which application is sending the information and to which IP address. The result is a kind of nesting doll.
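The nesting-doll idea can be sketched in a few lines of Python. This is a toy illustration rather than a real network stack: the two header layouts, the function names and the addresses are invented for the example, and real TCP/IP headers carry many more fields.

```python
import struct

# Toy headers, invented for illustration (real TCP/IP headers are larger).
TRANSPORT = struct.Struct("!HH")   # source port, destination port
NETWORK = struct.Struct("!4s4s")   # source IP, destination IP (4 bytes each)


def wrap(payload: bytes, src_port: int, dst_port: int,
         src_ip: bytes, dst_ip: bytes) -> bytes:
    """Wrap application data in a transport header, then a network header."""
    segment = TRANSPORT.pack(src_port, dst_port) + payload
    return NETWORK.pack(src_ip, dst_ip) + segment


def unwrap(packet: bytes) -> dict:
    """Peel the layers off again, the way an inspecting device would."""
    src_ip, dst_ip = NETWORK.unpack_from(packet, 0)
    src_port, dst_port = TRANSPORT.unpack_from(packet, NETWORK.size)
    payload = packet[NETWORK.size + TRANSPORT.size:]
    return {
        "src_ip": ".".join(str(b) for b in src_ip),
        "dst_ip": ".".join(str(b) for b in dst_ip),
        "src_port": src_port,
        "dst_port": dst_port,
        "payload": payload,
    }


packet = wrap(b"hello", 50000, 443, bytes([10, 0, 0, 2]), bytes([203, 0, 113, 7]))
layers = unwrap(packet)
print(layers)
```

Each layer only adds its own header in front of whatever it was handed, which is why a device sitting in the middle can read the outer layers without any help from the application.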

So DPI can look at the data of the different layers and at the packet contents themselves. It can work out not only where the traffic is coming from and where it is going, but also what kind of traffic it is: a text message, pictures or video, Skype voice traffic, or maybe even a torrent. And then it can do whatever it wants with it:

  • Prioritize it
  • Limit its speed
  • Redirect it
  • Block it
  • Or, of course, simply deliver it to the recipient.
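How the TSPU equipment actually limits speed is not public. One common general-purpose technique for the "limit by speed" action is a token bucket, sketched below. The class name and all the numbers are illustrative assumptions, not details of any real deployment.

```python
class TokenBucket:
    """Toy rate limiter: allows `rate` bytes per second, with bursts up to `burst`."""

    def __init__(self, rate: float, burst: float) -> None:
        self.rate = rate
        self.burst = burst
        self.tokens = burst   # the bucket starts full
        self.last = 0.0       # time of the previous decision, in seconds

    def allow(self, size: int, now: float) -> bool:
        """Return True if a packet of `size` bytes may pass at time `now`."""
        # Refill tokens for the time that has passed, capped at the burst size.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if size <= self.tokens:
            self.tokens -= size
            return True
        return False  # over the limit: the packet would be dropped or delayed


bucket = TokenBucket(rate=1000, burst=1500)
decisions = [bucket.allow(1000, now=t) for t in (0.0, 0.1, 0.2, 1.2)]
print(decisions)
```

In this run the first packet passes, the next two arrive too soon and are refused, and the fourth passes once the bucket has refilled. From the user's side this looks like a service that still works, only slowly.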

By the way, it is unclear exactly how much Twitter is being slowed down right now. Personally, I haven’t noticed any problems yet. We will get to why later.

DPI greatly simplifies the task of blocking sites and services. Why?

Previously, blocking was done by IP address. That meant the service was cut off entirely, and other sites hosted on the same IP address suffered as well.

Now you can carefully examine the packets and block, for example, only the download of images.

But how does this wonder technology actually work?

DPI analyzes two things. The first is the packets themselves: their headers and metadata as well as their internal content. The second is the behavior of the packets, so to speak.

Let’s take a look at an example: the header of a packet the browser receives over plain HTTP. Everything in it is out in the open. The protocol, HTTP, is stated right at the top, so this is most likely a web page. We can see the address of the request, and even what kind of data is being sent.

In this case it is a small image, and its size in bytes is stated as well. You don’t even need to look inside the data itself to understand what you are dealing with.
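To make this concrete, here is a minimal sketch of reading those fields from an unencrypted HTTP response. The sample header below is made up for illustration, and `describe_http` is a hypothetical helper, not part of any DPI product.

```python
# A made-up HTTP response header for a small PNG image.
RAW_RESPONSE = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: image/png\r\n"
    b"Content-Length: 2048\r\n"
    b"\r\n"
)


def describe_http(raw: bytes) -> dict:
    """Extract protocol, content type and size from a plain-text HTTP header."""
    head = raw.split(b"\r\n\r\n", 1)[0].decode("iso-8859-1")
    status_line, *header_lines = head.split("\r\n")
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return {
        "protocol": status_line.split(" ", 1)[0],
        "content_type": headers.get("content-type"),
        "size_bytes": int(headers.get("content-length", 0)),
    }


info = describe_http(RAW_RESPONSE)
print(info)
```

Nothing here touches the image bytes: the type and size come straight from the header text.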

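Encrypted traffic is more complicated. There is much less information here. The system will see which port is in use (this gives a hint of the type of application), the IP addresses and the encryption type. Almost everything else is encrypted, including the URL path, although the site’s domain name is usually still visible in the TLS handshake (the SNI field).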

There is also a second part to DPI algorithms. DPI uses so-called heuristic analysis: small pieces of data, samples so to speak, are extracted from the packet, and the system then compares them with a huge database of known samples. These data samples are called traffic signatures.

This is very similar to the way Shazam works: it compares a short snippet of a song that you recorded with a huge database of song signatures stored on a server. By and large, DPI is Shazam for traffic.

By the way, antivirus software works the same way: it compares the signatures of suspicious files with a huge database of known viruses.

In addition, DPI looks at the frequency and size of packets, because each application has its own characteristic pattern. Torrents, for example, send packets in a very distinctive way.
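Signature matching can be sketched as a lookup of known byte patterns at the start of a payload. The table below is a tiny hand-picked set of well-known protocol prefixes. Real DPI databases are far larger and vendor-specific, and the function name and sample payloads are assumptions made for this example.

```python
# (label, byte prefix) pairs: a deliberately tiny signature table.
SIGNATURES = [
    ("bittorrent", b"\x13BitTorrent protocol"),  # BitTorrent handshake
    ("tls", b"\x16\x03"),                        # TLS handshake record
    ("http", b"GET "),                           # HTTP request
    ("http", b"HTTP/"),                          # HTTP response
]


def classify(payload: bytes) -> str:
    """Return the label of the first signature the payload starts with."""
    for label, prefix in SIGNATURES:
        if payload.startswith(prefix):
            return label
    return "unknown"


samples = [
    b"GET /index.html HTTP/1.1\r\n",
    b"\x16\x03\x01\x02\x00\x01",
    b"\x13BitTorrent protocol" + bytes(8),
    b"\x00\x01\x02\x03",
]
labels = [classify(s) for s in samples]
print(labels)
```

The last sample matches nothing in the table, which is exactly the weakness of the approach: traffic that is not in the database goes unrecognized.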
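A behavioral check of this kind can be sketched with two simple features, average packet size and packets per second. The thresholds and labels below are invented purely for illustration. Real classifiers use many more features and are tuned on real traffic.

```python
from statistics import mean


def flow_features(packets):
    """packets: list of (timestamp_seconds, size_bytes) tuples for one flow."""
    times = [t for t, _ in packets]
    sizes = [s for _, s in packets]
    duration = (max(times) - min(times)) or 1.0  # avoid dividing by zero
    return {"mean_size": mean(sizes), "rate": len(packets) / duration}


def guess_kind(packets):
    """Very rough guess based on made-up thresholds."""
    f = flow_features(packets)
    if f["mean_size"] < 300 and f["rate"] >= 20:
        return "voice-like"      # many small packets at a steady pace
    if f["mean_size"] > 1000:
        return "bulk-transfer"   # large packets, typical of downloads
    return "unknown"


voice = [(i * 0.02, 160) for i in range(50)]   # a small packet every 20 ms
bulk = [(i * 0.01, 1400) for i in range(50)]   # near-full-size packets
print(guess_kind(voice), guess_kind(bulk))
```

Note that neither check reads the payload, so this kind of analysis keeps working even when the content is encrypted.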

And most importantly, since everything happens in real time, this approach does not significantly slow down the Internet.

How can DPI be used?

Having classified a packet, DPI can do a lot with it, and in general the technology can be useful. DPI can improve the quality and speed of your connection by prioritizing one type of traffic over another. For example, if everyone on your LAN is downloading torrents, people will not be able to make proper work calls on Zoom. DPI makes it possible to give Zoom traffic priority.
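Prioritization of already classified traffic can be sketched with a priority queue. The traffic classes, their ranking and the class name `PriorityScheduler` are assumptions made for this example.

```python
import heapq
from itertools import count

# Lower number = sent first. The ranking is an illustrative assumption.
PRIORITY = {"video-call": 0, "web": 1, "torrent": 2}


class PriorityScheduler:
    """Sends higher-priority packets first, keeping arrival order within a class."""

    def __init__(self) -> None:
        self._heap = []
        self._order = count()  # tie-breaker: preserves arrival order

    def enqueue(self, kind: str, packet: str) -> None:
        rank = PRIORITY.get(kind, 1)  # unknown traffic is treated like web
        heapq.heappush(self._heap, (rank, next(self._order), kind, packet))

    def dequeue(self):
        _, _, kind, packet = heapq.heappop(self._heap)
        return kind, packet


sched = PriorityScheduler()
for kind, packet in [("torrent", "t1"), ("video-call", "z1"),
                     ("torrent", "t2"), ("video-call", "z2")]:
    sched.enqueue(kind, packet)

sent = [sched.dequeue()[1] for _ in range(4)]
print(sent)
```

Both call packets leave before the torrent packets even though a torrent packet arrived first. Changing the numbers in the ranking table is all it takes to favor a different service.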

But if you are Roskomnadzor, you probably have different priorities.

But the question of who decides which traffic gets priority and which does not remains open. With DPI it is quite possible to promote some services and slow down others, for example by giving priority to domestic counterparts: you could slow down YouTube while speeding up Rutube. In general, there is plenty of room for imagination.

Architecture, or why did things go wrong again?

But if there is such a powerful tool, why is Twitter still doing pretty well?

An important issue is architecture and implementation, that is, where exactly the DPI equipment is installed. For all traffic in a country to pass through inspection, operators have to install DPI on all of their border gateways. That makes sense.

Did you notice what was announced: a slowdown for 100% of mobile traffic, but only 50% of fixed-line traffic? Is distributing content that Roskomnadzor considers dangerous somehow less dangerous when it is done from a computer? No.

First, not everyone has deployed the technology yet. An agreement was reached with the handful of mobile operators, but there is a huge number of small local Internet providers. And that is only half the problem.

Secondly, just as with Telegram, blocking Twitter is not that easy. Twitter relies on a CDN provider to keep the service fast all over the world. The provider is called Akamai.

CDN stands for Content Delivery Network.

The CDN provider provides Twitter with a globally distributed infrastructure. As a rule, this is necessary so that the user, when accessing a site, receives information not from its main server, which may be located on the other side of the world, but from the node nearest to it.

Such a distributed content delivery system makes it significantly harder to block or restrict access to a resource: if the application cannot get data from one address, it will try another, and the operator on that route may not have DPI enabled.

To reliably restrict access to an Internet service that uses a CDN, you need to restrict access to almost the entire network of the CDN provider, which may have tens of thousands of servers around the world. In other words, to block Twitter you would have to block the CDN provider, and that would once again take down half of the Internet.

And I will not even dwell on the curious episode in which bandwidth was limited not only for the twitter.com domain but for every domain whose name contains the string “t.co”, the short domain that belongs to Twitter. As a result, other sites were throttled too, for example reddit.com, microsoft.com and even Russia Today’s site, rt.com. Presumably for the same reason, Rostelecom’s servers went down.
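The effect is easy to reproduce. The exact rule used by the equipment has not been published, so the sketch below assumes a plain substring check and contrasts it with a stricter domain comparison. Both function names are hypothetical.

```python
def naive_match(hostname: str, pattern: str = "t.co") -> bool:
    """Substring check: matches any hostname that merely contains the pattern."""
    return pattern in hostname


def strict_match(hostname: str, domain: str = "t.co") -> bool:
    """Matches only the domain itself or its subdomains."""
    hostname = hostname.lower().rstrip(".")
    return hostname == domain or hostname.endswith("." + domain)


hosts = ["t.co", "twitter.com", "reddit.com", "microsoft.com", "rt.com"]
naive_hits = [h for h in hosts if naive_match(h)]
strict_hits = [h for h in hosts if strict_match(h)]
print(naive_hits)
print(strict_hits)
```

The substring check catches reddit.com, microsoft.com and rt.com because each of them happens to contain the characters "t.co", while the stricter comparison matches only t.co itself.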

However, over time it seems to have been fixed. Another thing is interesting.

DPI limits. What to do?

But the most curious thing is this: since DPI is essentially educated guesswork, there is no single DPI standard. Each equipment vendor has its own algorithms and technologies.

The quality and efficiency of DPI depend heavily on the quality of the signature database, which the vendor must constantly update. In other words, if Twitter changes something in the makeup of its traffic, there is no guarantee that existing DPI systems will still be able to classify it using the old signatures.

Finally, DPI is fairly easy to bypass with a VPN: a VPN encrypts all your traffic and hides your real IP address, so DPI poses little threat to it.

But a simpler solution is also on the way. Encryption protocols do not stand still either. With the arrival of TLS 1.3 and encrypted DNS (DNS over TLS and DNS over HTTPS), even more application and user data will be hidden from DPI. DNSSEC, often mentioned alongside them, only signs DNS answers and does not encrypt them. It will become even harder for the system to understand what is inside a packet. So it goes.
