DLP: preventing leaks

Data leaks are one of the main problems in modern IT. Personal data, confidential information, trade secrets, and sometimes even more sensitive material regularly leak and surface on the darknet, in Telegram channels that sell personal-data lookups, and on other ~~useful~~ dubious resources.

Leak prevention systems (DLP), meanwhile, have existed for decades. But before we go further and talk about the problems of implementing DLP, let's expand the abbreviation correctly. DLP stands for Data Leak Prevention, that is, preventing information leaks. You will also find the expansion Data Loss Prevention, that is, preventing the loss of information. However, loss and leakage are different things. Lost information does not necessarily become available to outsiders, whereas a leak by definition means that outsiders gain access to it.

For example, ransomware may only encrypt files, in which case the information is lost but not stolen. Conversely, a leak through copying does not cause the data to be lost.

But that was only a brief digression. Let's return to DLP systems.

What is DLP essentially?

Previously, the focus was on protecting physical documents, which could be stolen by breaching the physical perimeter or by intercepting couriers. While these tactics are still in use, the rise of the Internet has increased both the scope and the likelihood of data theft. In short, the proliferation of data and communication channels has made the criminal's job easier.

A classic data leak prevention system must control every channel through which data can be copied. First of all, this means everything network-related: email, web, instant messengers, FTP, and so on. If the traffic is encrypted, the DLP system must be able to sit inline in the traffic path (in effect, in a man-in-the-middle position) so that it can decrypt and inspect it. The second group of leak channels is the user's computer itself. Here, data can leave through USB ports in all their forms: flash drives, printers, and other devices. Do not forget the other interfaces that can also be used to copy files.

It is also important to control file storage, since the presence of confidential data in public resources can also lead to data leakage.

In fact, the only channel that DLP cannot cover is output to the monitor. If an unscrupulous user has the right to read a file, he can display its contents on the screen and film it with his phone. It is unlikely that anyone could capture the entire customer base of a large company this way, but photographing a few tables of purchase prices is entirely feasible. Combating such leaks requires organizational measures, such as cameras and proper employee training.

Modern DLP systems have long gone beyond these boundaries; DLP now includes EDR/UBA/SIEM functionality and ungodly features for monitoring users through a built-in camera and microphone. However, in this article we will only talk about the classic DLP functionality.

Main problems

You cannot simply roll out a DLP system the way you would an antivirus or a firewall (although there are plenty of problems there too). Implementing DLP involves a number of basic steps. We'll look at the general guidelines your implementation strategy should follow. The same guidelines can also help you select the right DLP solution for your organization.

Prioritize it

A typical story when implementing DLP systems is the customer’s lack of understanding of what data needs to be protected from leakage. That is, there are regulations on what is considered confidential information, but what confidential documents actually look like, what criteria should be used to filter them out – all this is often a problem.

The first step before implementing DLP is to determine which data would cause the most damage if it were stolen. You can, of course, tackle this question after implementation, but then you risk discovering either that your system does not cover the whole perimeter of possible leaks or, conversely, that confidential data is processed on fewer nodes than expected and you bought too many licenses.

In any case, depending on the type of business, priority may go to intellectual property, such as design documentation and drawings. Retailers and financial services companies should obviously prioritize data that falls under banking standards, Federal Law No. 152 (the Russian law on personal data), and similar regulations. Healthcare companies would prioritize medical records, since they are often stored on laptop computers. While it may seem obvious, data loss prevention should start with the most valuable or sensitive data, which is also the most likely to be targeted by attackers.

Thus, it is important to properly prioritize when identifying sensitive information that is subject to DLP control.

Classify data

Data classification is often considered the most challenging task in DLP implementation. In practice, however, we can rely on markers that appear in confidential documents of a given type. For example, documents containing trade secrets must carry the appropriate markings.

Applying persistent classification tags to data allows organizations to track its usage. For example, regular expressions can pick out credit card numbers or keywords (such as “confidential”) in documents.
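As a rough illustration of this kind of rule, here is a minimal sketch in Python. The patterns and the function name `find_markers` are assumptions made for this example, not the rule syntax of any particular DLP product; real rules are usually far more elaborate.

```python
import re

# Hypothetical pattern: 13 to 19 digits, optionally separated by single
# spaces or hyphens. This finds card-like candidates, not confirmed cards.
CARD_RE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")

# Hypothetical keyword list used as classification markers.
KEYWORD_RE = re.compile(r"\b(confidential|trade secret)\b", re.IGNORECASE)


def find_markers(text):
    """Return card-like digit runs and classification keywords found in text."""
    cards = [re.sub(r"[ -]", "", m.group()) for m in CARD_RE.finditer(text)]
    keywords = sorted({m.group().lower() for m in KEYWORD_RE.finditer(text)})
    return {"cards": cards, "keywords": keywords}


if __name__ == "__main__":
    sample = "CONFIDENTIAL: card 4111 1111 1111 1111 on file"
    print(find_markers(sample))
    # {'cards': ['4111111111111111'], 'keywords': ['confidential']}
```

A pattern this loose will also match any other long digit sequence, which is exactly why such rules need tuning before they are trusted.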

In fact, reliable tagging allows you to better configure the DLP system to detect data leaks.

Out of the box, a DLP system already contains a basic set of rules, for example for identifying credit card numbers. In practice, however, these built-in rules can generate a large number of false positives, flagging leaks where there are none. One example is a rule that triggers on passport details in contracts that are not confidential. As a rule, the best approach is to disable the built-in rules and then enable only the ones you need, one at a time.

File formats, application protocols, and other characteristics also help identify the data that DLP should control. For example, if confidential design documentation is developed in a specialized application, we can configure the DLP system to analyze files of that format. But problems may arise here too. Many applications can export data to PDF or image formats, and an insider can use the export function to bypass a DLP system that does not expect to find sensitive data in files of a different format.
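One common way to cut false positives on card numbers is to validate each candidate with the Luhn checksum, which payment card numbers satisfy and most arbitrary digit sequences (contract numbers, order IDs) do not. The sketch below is illustrative only; the function names are assumptions, and a passing checksum still does not prove that a number is a real card.

```python
def luhn_valid(number):
    """Check a digit string against the Luhn checksum used by card numbers."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    if not 13 <= len(digits) <= 19:
        return False
    total = 0
    # Walk from the rightmost digit; double every second digit.
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


def filter_card_hits(candidates):
    """Keep only the candidates that pass the checksum."""
    return [c for c in candidates if luhn_valid(c)]


if __name__ == "__main__":
    # The first value is a widely used test card number; the second is just
    # a 16-digit sequence such as a contract or order number.
    print(filter_card_hits(["4111111111111111", "1234567890123456"]))
    # ['4111111111111111']
```

A filter like this only reduces noise for one rule type; triggers such as passport data in ordinary contracts still have to be handled by choosing which rules to enable.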
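One partial countermeasure is to identify a file by its content signature rather than by its name, so that exported or renamed files are still routed to content inspection. The following is a minimal sketch under that assumption; the signature table is deliberately tiny, and the names `sniff_format` and `is_disguised` are hypothetical.

```python
import os

# Leading byte signatures ("magic numbers") for a few common formats.
MAGIC_SIGNATURES = {
    b"%PDF-": "pdf",
    b"\x89PNG\r\n\x1a\n": "png",
    b"\xff\xd8\xff": "jpeg",
    b"PK\x03\x04": "zip-container",  # ZIP, and ZIP-based formats such as DOCX
}

# What each extension claims the file to be.
EXTENSION_FORMATS = {
    ".pdf": "pdf",
    ".png": "png",
    ".jpg": "jpeg",
    ".jpeg": "jpeg",
    ".zip": "zip-container",
    ".docx": "zip-container",
}


def sniff_format(data):
    """Guess the format from the first bytes of the file content."""
    for signature, name in MAGIC_SIGNATURES.items():
        if data.startswith(signature):
            return name
    return "unknown"


def is_disguised(filename, data):
    """True if the content is a known format that the extension does not declare."""
    extension = os.path.splitext(filename)[1].lower()
    actual = sniff_format(data)
    return actual != "unknown" and EXTENSION_FORMATS.get(extension) != actual


if __name__ == "__main__":
    print(is_disguised("notes.txt", b"%PDF-1.7 ..."))   # True
    print(is_disguised("report.pdf", b"%PDF-1.7 ..."))  # False
```

Signature checks only tell you what kind of file you are looking at. Finding confidential content inside an exported PDF or image still requires text extraction or OCR.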

Therefore, it is important to understand what functionality the target application has that works with confidential information and what methods an attacker can try to bypass control. And here an important role is played by the threat model, which should be compiled in the organization at the time of implementation of the DLP system. Information security is a process that functions effectively when we use a set of security measures. That is, we must implement both network protection (firewalls, IDS) and host protection (antivirus, EDR, restriction of rights), encryption, monitoring and other means. And the weakness of one link can lead to a gap in the entire information security system.

For example, suppose all users have local administrator rights and the organization does not maintain an inventory of installed applications. In this case, our cunning insider can install a video recording application (we will assume that PDF and image files are covered by OCR) and simply make a film of himself reading a confidential document, which he will then copy to a flash drive unhindered.

Here is another workaround for savvy industrial espionage geniuses. An attacker with administrative rights can dump the RAM of a machine on which a confidential document is open and then copy the dump file to his flash drive unhindered. After that, at home, he can use Volatility and a debugger to extract the relevant process from the dump and recover the confidential document.

And simply disabling the DLP agent, given administrator rights, hardly needs mentioning.

The moral of all these cases is that the problem of information leakage must be solved in a comprehensive manner using various means of protection. Thus, lack of rights will not allow you to make a memory dump, and control of launched applications will not allow you to start a screen recorder. But this is not the functionality of a DLP system, since other security tools are responsible for these functions.

Track all data movements

As you know, the best-protected computer is one that is switched off and sits behind two meters of concrete. But in practice such a computer is useless. It is the same with confidential information: as a rule, it has to move across the organization's network. Designers prepare drawings, which are then used by engineers, and managers prepare contracts, which are then handled by lawyers. Because sensitive information has to travel across the network legitimately, it is critical to understand how the data is used and to identify existing behaviors that could put it at risk.

Without understanding information flows, security teams will be unable to develop appropriate policies that reduce the risk of data loss while ensuring the proper use of data.

Remember that not all data movement constitutes data loss. However, many actions can increase the risk of data loss. Organizations should track all data movements to gain insight into what is happening with their sensitive data and determine the scope of the problems their DLP strategy needs to address.

For example, the ability to connect any flash drive to a USB port significantly increases the risk that an insider will manage to get around DLP and copy confidential data onto the drive. Nodes on the network that have no DLP agent installed but can still process confidential data (for example, the laptops of employees on business trips) are also a serious security hole.
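Both gaps can be checked mechanically if you keep an inventory. The sketch below assumes you can export a list of hosts from your asset inventory, a list of hosts reporting a DLP agent, and a list of approved USB devices; all names and the data layout are hypothetical.

```python
def hosts_without_agent(inventory, agent_hosts):
    """Hosts present in the inventory that have no reporting DLP agent."""
    return sorted(set(inventory) - set(agent_hosts))


def usb_allowed(device, allowlist):
    """Check a (vendor_id, product_id, serial) tuple against approved devices."""
    return device in allowlist


if __name__ == "__main__":
    # Illustrative data only.
    inventory = ["ws-001", "ws-002", "laptop-017"]
    agent_hosts = ["ws-001", "ws-002"]
    print(hosts_without_agent(inventory, agent_hosts))  # ['laptop-017']

    approved = {("0951", "1666", "A1B2C3")}
    print(usb_allowed(("0951", "1666", "A1B2C3"), approved))  # True
    print(usb_allowed(("0951", "1666", "ZZZZZZ"), approved))  # False
```

Enforcing the USB allowlist is a job for the endpoint agent or the operating system's device control; the check above only expresses the policy.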

The same goes for network monitoring. You must monitor all VLANs and network segments over which sensitive information may travel. There should be no routes through which files can be sent to the Internet without verification.
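A simple coverage check can reveal segments that bypass monitoring. The sketch below uses Python's standard `ipaddress` module and assumes you have two lists: the segments where sensitive data may travel and the networks your DLP sensors actually inspect. The function name and the addresses are illustrative.

```python
import ipaddress


def unmonitored_segments(segments, monitored):
    """Return the segments that are not contained in any monitored network."""
    monitored_nets = [ipaddress.ip_network(m) for m in monitored]
    gaps = []
    for segment in segments:
        net = ipaddress.ip_network(segment)
        covered = any(
            net.version == m.version and net.subnet_of(m)
            for m in monitored_nets
        )
        if not covered:
            gaps.append(segment)
    return gaps


if __name__ == "__main__":
    segments = ["10.0.1.0/24", "10.0.2.0/24", "192.168.5.0/24"]
    monitored = ["10.0.0.0/16"]
    print(unmonitored_segments(segments, monitored))  # ['192.168.5.0/24']
```

This only compares address plans. Whether traffic from a segment really passes through an inspection point depends on routing, which has to be verified separately.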

Monitoring DLP

Monitoring provides indicators of how data is being put at risk. The next step toward effective data loss prevention is to work with business leaders to understand why this is happening and to set up controls that reduce the risk. At the start of a DLP implementation, monitoring of data use can be simple: target the most common risky behaviors and secure the support of line managers. As the data loss prevention program matures, the organization can develop more granular, fine-tuned controls to mitigate specific risks.

For example, in the beginning we can only detect leaks without blocking. But once we understand that DLP is working as it should, we will need to start blocking leaks.
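The two phases can be modeled as a single policy switch. This is a minimal sketch of the idea, not the configuration model of any real DLP product; the `Mode`, `Decision`, and `decide` names are assumptions made for the example.

```python
from dataclasses import dataclass
from enum import Enum


class Mode(Enum):
    MONITOR = "monitor"  # detect and log, but let the transfer through
    BLOCK = "block"      # detect, log, and stop the transfer


@dataclass
class Decision:
    allowed: bool
    logged: bool


def decide(is_violation, mode):
    """Decide what to do with a transfer under the current policy mode."""
    if not is_violation:
        return Decision(allowed=True, logged=False)
    # Violations are always logged; only BLOCK mode stops them.
    return Decision(allowed=(mode is Mode.MONITOR), logged=True)


if __name__ == "__main__":
    print(decide(True, Mode.MONITOR))  # Decision(allowed=True, logged=True)
    print(decide(True, Mode.BLOCK))    # Decision(allowed=False, logged=True)
```

Keeping the detection logic identical in both modes means the logs gathered during the monitoring phase show what would be blocked once the switch is flipped.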

Train your employees

User training can often reduce the risk of accidental data breaches caused by employees themselves. Employees are often unaware that their actions may result in data loss and will be more aware when properly instructed. Advanced DLP solutions offer administrators the ability to inform employees about data usage that may violate company policy or simply increase risk (in addition to controls to directly block risky data activities).

Conclusion

Of course, data loss prevention is an ongoing process, not a series of discrete steps or an end point to a project. If you start with a focused effort to protect a subset of your most critical data, DLP is easier to implement and manage. However, the operation of DLP systems needs to be constantly monitored and improved, since systems working with confidential data can change over time, and new data formats appear that also need to be controlled.

In conclusion, I would like to recommend a free webinar on domestic and foreign solutions for protecting information from leaks.
