Chasing the “Wrong” Incidents, or How We Build Threat Hunting

Incidents covered by an SLA must be detected with a minimum of False Positives and must have a clear processing workflow. With this setup, the SOC can guarantee that an incident will be processed within a set time, so that the response is timely. This approach has held for many years in any SOC (including ours). But at some point we realized that we were partially losing the complete picture of what is happening. The reason lies in the objective limitations imposed on detection scenarios. In the multitude of irrelevant triggers (which cannot be processed on the monitoring lines with reasonable resources and within the SLA), the very essence of an incident is sometimes hidden.


The pursuit of eliminating False Positives periodically leads to False Negatives, which, as everyone knows, happen at the most inopportune moment. Out of this realization, Threat Hunting came to the SOC world; it is designed to strengthen the classic monitoring process and close the gaps described above.

The term Threat Hunting now appears in every marketing brochure, but there are still many questions and disputes about what it is and how to organize proactive threat search during SOC operation. Even within our team we periodically argue about this, and our JSOC camp is divided into two groups:

1. Some believe that the Threat Hunting process is based on hypotheses that have already been put forward by various researchers, or that were formed by analyzing the behavior of a malicious file or the activity of a hacker group.

2. Others believe that the Threat Hunting process is based on hypotheses that a specialist forms and tests independently, as the need arises.

Since we could not decide, we began to use both approaches – and, by the way, each of them has advantages and disadvantages. Let’s take a closer look at what we do within each option.

Option 1

Here we include the correlation rules we have written for various atomic events: simple detections that may indirectly indicate an ongoing incident. Taken out of context, these detections can be completely insignificant. It is logically difficult to filter out their False Positives, and it is not possible to build an unambiguous workflow around them for the engineers on the monitoring line.

What does this look like in practice? Here are two example rules that illustrate the specifics of “wrong incidents”.

In this category we have, for example, a rule that detects an IP address or FQDN in the command-line parameters of a launched process. In practice, it is very difficult to pass this rule through the False Positive sieve, but it is quite effective at detecting suspicious activity.
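As an illustration of the first rule, here is a minimal Python sketch that looks for an IPv4 address or an FQDN in a command line. The regular expressions and the short TLD list are assumptions made for the example, not the logic of the actual correlation rule, and they will produce False Positives by design (version strings, file names and so on).

```python
import re

# Assumption: a plain IPv4 pattern; it does not cover IPv6 or obfuscated forms.
IPV4_RE = re.compile(
    r"(?<![\d.])(?:(?:25[0-5]|2[0-4]\d|1?\d?\d)\.){3}"
    r"(?:25[0-5]|2[0-4]\d|1?\d?\d)(?![\d.])"
)

# Assumption: a rough FQDN pattern limited to a small, hypothetical TLD list
# so that file names such as "notepad.exe" are not reported as domains.
FQDN_RE = re.compile(
    r"(?<![\w.-])(?:[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?\.)+"
    r"(?:com|net|org|ru|io|info|biz|xyz|top)(?![\w-])",
    re.IGNORECASE,
)


def network_indicators(command_line: str) -> dict:
    """Return the IP addresses and FQDNs found in a process command line."""
    return {
        "ips": IPV4_RE.findall(command_line),
        "fqdns": FQDN_RE.findall(command_line),
    }


if __name__ == "__main__":
    sample = "certutil -urlcache -f https://cdn.example.net/payload.bin out.exe"
    print(network_indicators(sample))
```

A trigger here only says that a process was started with a network address in its parameters; whether that is suspicious still depends on context, which is why such a rule goes to the hunting analysts and not to the monitoring line.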

Or, for example, a rule that detects the launch of a macro when a Word document is opened (a value under the so-called TrustRecords registry key whose data contains the byte sequence FF FF FF 7F). In a large infrastructure this rule will fire very often, but given the current volume of macro-based phishing, it cannot be ignored.
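The check behind the second rule can be sketched as follows. The function takes the registry key path and the value data as a hex string; this input shape is an assumption about how a registry-change event might look after normalization in a SIEM, and the function name is hypothetical. The sketch follows the description above and only tests whether the data contains the marker bytes.

```python
# Byte sequence that, per the rule described above, marks an enabled macro.
MACRO_ENABLED_MARKER = bytes.fromhex("FFFFFF7F")

# Assumption: the key path ends with "...\Trusted Documents\TrustRecords".
TRUST_RECORDS_SUFFIX = "\\trusted documents\\trustrecords"


def macro_enabled(registry_key: str, value_hex: str) -> bool:
    """Return True if a TrustRecords value contains the FF FF FF 7F marker.

    value_hex is assumed to be hex bytes separated by spaces or dashes,
    for example "A0 B1 ... FF FF FF 7F".
    """
    if TRUST_RECORDS_SUFFIX not in registry_key.lower():
        return False
    data = bytes.fromhex(value_hex.replace("-", " "))
    return MACRO_ENABLED_MARKER in data


if __name__ == "__main__":
    key = (r"HKU\S-1-5-21-1\Software\Microsoft\Office\16.0\Word"
           r"\Security\Trusted Documents\TrustRecords")
    print(macro_enabled(key, "A0 B1 C2 D3 00 00 00 00 FF FF FF 7F"))
```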

Clearly, these rules differ in how “wrong” they are. So for each of them we assign an internal score (and adjust it dynamically per customer after launch) which, based on retrospective analysis, reflects the likelihood that a trigger is a real, “combat” detection. High-scoring rules are surfaced in ServiceDesk so that their activity is highlighted alongside “typical” incidents.
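The post does not describe the scoring formula, so the following is only one possible sketch: the score of a rule is taken as the share of its past triggers that retrospective analysis later tied to a confirmed incident. The function name, the input format and the 0.3 threshold are all hypothetical.

```python
from collections import Counter


def rule_scores(history):
    """Score each rule by the fraction of its triggers that were confirmed.

    history is an iterable of (rule_name, was_confirmed) pairs collected
    over some lookback period (assumed input format).
    """
    total = Counter()
    confirmed = Counter()
    for rule, was_confirmed in history:
        total[rule] += 1
        if was_confirmed:
            confirmed[rule] += 1
    return {rule: confirmed[rule] / total[rule] for rule in total}


def high_scoring(scores, threshold=0.3):
    """Rules whose score reaches the (hypothetical) threshold."""
    return sorted(rule for rule, score in scores.items() if score >= threshold)


if __name__ == "__main__":
    history = (
        [("ip_in_cmdline", True)] * 4 + [("ip_in_cmdline", False)] * 6
        + [("word_macro", True)] + [("word_macro", False)] * 99
    )
    scores = rule_scores(history)
    print(scores, high_scoring(scores))
```

Recomputing the scores per customer over a sliding window is one simple way to get the dynamic adjustment mentioned above.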

The workflow around such rules looks different. The detections do not go to the monitoring line, but to the analysts engaged in proactive threat search. They, in turn, look for relationships between the triggers of these rules and the incidents that did go to the line (by key parameters – host, account, process), while also applying the retrospective analysis described above. Note that there is no SLA here, because these triggers do not imply that an incident has necessarily occurred or that an immediate response is needed. With this approach, we get an expanded picture of what is happening and minimize the likelihood of missing suspicious activity.
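The search for relationships by key parameters can be sketched as a simple join between one hunting trigger and the incidents already handled by the line. The `Event` structure, its field names and the `min_shared` parameter are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """A hunting trigger or a line incident (hypothetical, simplified shape)."""
    name: str
    host: str
    account: str
    process: str


# The key parameters named in the text.
KEYS = ("host", "account", "process")


def related_incidents(trigger, incidents, min_shared=1):
    """Return (incident name, shared keys) for incidents linked to a trigger."""
    result = []
    for incident in incidents:
        shared = [
            key for key in KEYS
            if getattr(trigger, key)
            and getattr(trigger, key).lower() == getattr(incident, key).lower()
        ]
        if len(shared) >= min_shared:
            result.append((incident.name, shared))
    return result


if __name__ == "__main__":
    trigger = Event("ip_in_cmdline", "WS-017", "corp\\jdoe", "powershell.exe")
    incidents = [
        Event("INC-1 malware detected", "ws-017", "corp\\jdoe", "winword.exe"),
        Event("INC-2 brute force", "SRV-DC1", "corp\\admin", "lsass.exe"),
    ]
    print(related_incidents(trigger, incidents))
```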

Option 2

In this variant, the analyst engaged in proactive threat search does not receive the triggers of any detection rules as input, but works with the so-called raw events, around which hypotheses can be built. The outcome of this activity is a task to develop new rules, provided the analyst managed to find something really “interesting” that is not covered by current detections. Again, here are two examples.

Process creation event – Event ID 4688 (Sysmon Event ID 1)

By processing data on all processes launched in the infrastructure, the threat hunting analyst looks for suspicious events from several angles. For instance:

– parameters of the processes being launched: collect statistics, pay attention to the rarest command lines, search them for a key set of words / phrases, look for base64-encoded content;

– path to the executable file: pay attention to launches from special directories (for example, C:\Users\Public, C:\Windows\Temp, C:\Windows\Debug\wia, C:\Windows\Tracing – executable files can be written to and run from these directories without local administrator rights);

– Parent -> Process relationships: look for interesting pairs that are not covered by the current detection rules.
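The first two checks from the list above can be sketched in Python over process creation events that have already been parsed into dictionaries. The field names (`command_line`, `image`), the 24-character base64 threshold and the helper names are assumptions for the example; a real hunt would run similar logic as queries in the SIEM.

```python
import base64
import re
from collections import Counter

# Directories from the text, lowercased for case-insensitive comparison.
WRITABLE_DIRS = (
    "c:\\users\\public\\",
    "c:\\windows\\temp\\",
    "c:\\windows\\debug\\wia\\",
    "c:\\windows\\tracing\\",
)

# Assumption: treat runs of 24+ base64 characters as candidates.
B64_RE = re.compile(r"[A-Za-z0-9+/]{24,}={0,2}")


def rare_command_lines(events, max_count=1):
    """Command lines seen at most max_count times across all events."""
    counts = Counter(e["command_line"].lower() for e in events)
    return [cl for cl, n in counts.items() if n <= max_count]


def looks_like_base64(command_line):
    """True if some long token in the command line decodes as valid base64."""
    for token in B64_RE.findall(command_line):
        if len(token) % 4:
            continue
        try:
            base64.b64decode(token, validate=True)
            return True
        except ValueError:  # binascii.Error is a subclass of ValueError
            continue
    return False


def runs_from_writable_dir(image_path):
    """True if the executable is located in one of the listed directories."""
    return image_path.lower().startswith(WRITABLE_DIRS)


if __name__ == "__main__":
    events = [
        {"image": r"C:\Windows\System32\svchost.exe", "command_line": "svchost.exe -k netsvcs"},
        {"image": r"C:\Windows\System32\svchost.exe", "command_line": "svchost.exe -k netsvcs"},
        {"image": r"C:\Users\Public\upd.exe", "command_line": r"C:\Users\Public\upd.exe /q"},
    ]
    print(rare_command_lines(events))
    print([e["image"] for e in events if runs_from_writable_dir(e["image"])])
```

Each helper is deliberately noisy when used alone; the value comes from combining them and reviewing the intersection by hand.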

Named pipe creation event (Sysmon Event ID 17)

As you know, malicious software very often creates a named pipe for its own interprocess communication. Often the pipe name consists of a fixed mask and a generated parameter, for example, Evil_090920 (Evil is the mask, 090920 is the generated parameter). If the pipe name is not among the indicators of compromise, the creation of this pipe alone will not raise suspicion. Nevertheless, the analyst can notice that at a certain point in time (or within some time interval) such unknown named pipes were created on several systems, which may indirectly indicate the spread of malicious code.
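This hypothesis can be sketched as follows: reduce each pipe name to a mask by replacing the generated digits, drop the masks already known from a baseline, and report the masks that appeared on several hosts. The event field names, the digit-only masking and the `min_hosts` threshold are assumptions; the events are assumed to be already limited to the time interval of interest.

```python
import re
from collections import defaultdict


def pipe_mask(pipe_name):
    """Replace every run of digits with '#', e.g. Evil_090920 -> evil_#."""
    return re.sub(r"\d+", "#", pipe_name.lower())


def widespread_unknown_pipes(events, known_masks, min_hosts=2):
    """Map each unknown pipe mask to the hosts where it was created.

    Only masks seen on at least min_hosts distinct hosts are returned.
    """
    hosts_by_mask = defaultdict(set)
    for event in events:
        mask = pipe_mask(event["pipe_name"])
        if mask not in known_masks:
            hosts_by_mask[mask].add(event["host"])
    return {
        mask: sorted(hosts)
        for mask, hosts in hosts_by_mask.items()
        if len(hosts) >= min_hosts
    }


if __name__ == "__main__":
    events = [
        {"host": "ws1", "pipe_name": "\\Evil_090920"},
        {"host": "ws2", "pipe_name": "\\Evil_090921"},
        {"host": "ws1", "pipe_name": "\\lsass"},
    ]
    print(widespread_unknown_pipes(events, known_masks={"\\lsass"}))
```

Masking only digits is the simplest choice that fits the Evil_090920 example; names with hex or random letter suffixes would need a different normalization.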

_________
In this post, describing how we build the Threat Hunting process, we relied on Windows and Sysmon events, which act as a source for the SIEM system. In reality, neither the event source nor the end system (as long as it allows such work) matters to the analyst doing proactive threat search – exactly the same philosophy can be applied, for example, to EDR or NTA.

Alexey Krivonogov, Deputy Director of Solar JSOC for Regional Network Development
