Creating SIEM Rules Using Category Models

Introduction

In the documentation for the Exabeam SIEM, I came across an idea for building rules on category models: essentially, automatically updated tables that describe user activity.

Such models are simple to apply, can be implemented in (I assume) any SIEM, and expand the set of rules for detecting abnormal user activity (UEBA).

How to Use Category Models

The main thing I currently use category models for is creating “First Time” SIEM rules:

  • The first time user X did action Y;

  • User X performed action Y for the first time among members of group U;

In addition, you can:

  • Register the correlation event “User X performed a rare action Y” (for example, when the category has been seen for the user on fewer than 5 days);

  • Use aggregated data from the model table for 1-2 months to display reports in the user account, as well as during investigations.

Structure of models and their life cycle

Example and what is a “category”

To understand the basic use of category models, let's start with an example and use it as we go along.

Our users can connect to the VPN from their home workstations. Some Windows authentication events include the name of the computer the connection is made from (for example, DESKTOP-123456, MY_PC11). These are RDP authentication events (4624, Logon type 10) and sometimes network logon events (4624, Logon type 3).

I hypothesize that an attacker who has taken over a user's credentials may not know the name of that user's home computer. Once they log in as that user from a computer whose name is not in the table, we have a chance to notice them.

For the example above we need to have a table with at least 2 fields:

  • username (this will be the entity ID, entity_id)

  • computer name (this will be our category; we store it in lower case)

The table for 2 employees will be as follows (ivanov uses 1 computer, petrov – 2):

| entity_id | category |
| --- | --- |
| ivanov | desktop-12345 |
| petrov | my-pc |
| petrov | homepc |

Model fields

To be able to store different models in a single table and maintain statistics, we will expand the list of fields to the following:

| Field | Description |
| --- | --- |
| model_id | Model ID. Home computer names are described under one ID, something else under another, and so on. For example, win-acc-homepc |
| entity_type | Entity type. We can create models for accounts, account_groups, etc. |
| entity_id | Unique ID of the entity. For example, an account name or an account group name. |
| category | Category. A value that describes the behavior of an entity within the model: a computer name, the path to a launched program, etc. Data is generally stored in lower case. |
| last_seen_day | Date the category was last seen for the entity. For example, 02/19/2024 |
| first_seen_day | Date the category was first seen for the entity within the record's lifetime. For example, 01/24/2024 |
| days_seen | List of days within the record's lifetime on which the category was seen. I use a comma-separated list of dates: 2024-01-24,2024-01-26,2024-01-27,2024-02-18,2024-02-19 |
| days_seen_cnt | Number of days in days_seen. For the dates above, it is 5. |

In principle, the last 3 fields can be omitted if statistics on the dates of appearance of categories for entities are not needed.

Life cycle of records in models

For example, let's say today is 02/23/2024.

Processing events received by SIEM

When an event that fits the model is received, the SIEM checks whether the category is already present in the model for that entity and adds or updates the day fields.

If the entity does not yet have this category within the model, a correlation event is generated.

The logic block diagram is shown below:

Flowchart of the procedure “Processing events received by SIEM”
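
The processing step can be sketched in a few lines of Python. This is only an illustration under assumptions of mine, not the production implementation: the model table is an in-memory dictionary keyed by (model_id, entity_type, entity_id, category), and the names `Record` and `process_event` are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Record:
    """One row of the model table (hypothetical in-memory form)."""
    first_seen_day: date
    last_seen_day: date
    days_seen: list = field(default_factory=list)

    @property
    def days_seen_cnt(self) -> int:
        return len(self.days_seen)


def process_event(table: dict, model_id: str, entity_type: str,
                  entity_id: str, category: str, today: date) -> bool:
    """Update the model and return True if a "first time" correlation
    event should be generated for this entity and category."""
    key = (model_id, entity_type, entity_id, category.lower())
    record = table.get(key)
    if record is None:
        # The category is new for this entity: store it and signal.
        table[key] = Record(first_seen_day=today, last_seen_day=today,
                            days_seen=[today])
        return True
    if record.last_seen_day != today:
        # Known category, first sighting today: update the day fields.
        record.last_seen_day = today
        record.days_seen.append(today)
    return False
```

A repeated login from a known computer returns False and only refreshes the dates; the first login from a new computer name returns True.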

Lifetime of records in models

I keep a category in the model for 30 days. If the category has not appeared in events for more than 30 days (for example, the account ivanov has not logged in from the computer desktop-12345), it is removed from the model table.

This ensures that the model remains up-to-date.

Clearing old values from models

The cleanup runs once a day. Since the lifetime of records in the models is set to 30 days, the boundary date will be 01/24/2024.

On 02/23/2024 the following will be done:

  1. Entries with first_seen_day == last_seen_day == 01/24/2024 are deleted: such categories were last encountered 30 days ago.

  2. For other entries where first_seen_day == 01/24/2024:

    1. The value “2024-01-24,” is removed from the beginning of days_seen

    2. The first remaining date from days_seen is written to first_seen_day: 01/26/2024

    3. days_seen_cnt is recalculated
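
A minimal sketch of this daily cleanup, assuming records are plain dictionaries, days_seen is sorted in ascending order, and it always contains last_seen_day. The function name `cleanup` is hypothetical. The sketch compares with `<=` rather than `==`, so that it also catches up if a daily run was skipped; that generalization is mine, not part of the procedure above.

```python
from datetime import date, timedelta

LIFETIME_DAYS = 30  # record lifetime used in the article


def cleanup(records: list, today: date) -> list:
    """Return the records that survive the daily cleanup, with days that
    fell out of the 30-day window trimmed from days_seen."""
    boundary = today - timedelta(days=LIFETIME_DAYS)
    kept = []
    for rec in records:
        if rec["last_seen_day"] <= boundary:
            continue  # last seen 30 or more days ago: delete the record
        # Drop expired days; the list is non-empty because last_seen_day
        # is assumed to be in days_seen and is newer than the boundary.
        days = [d for d in rec["days_seen"] if d > boundary]
        kept.append({**rec,
                     "days_seen": days,
                     "first_seen_day": days[0],
                     "days_seen_cnt": len(days)})
    return kept
```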

Removing malicious values from the model

Incident response procedures should include deleting from the model table the values attributed to the incident (in our example, the names of the attacker's computers). Otherwise the model treats these values as normal, and the attacker's next login will no longer be a “first time”.

Other examples of “First Time” anomaly detection

First connection to VPN with this client version

When a user connects, our VPN logs the name of the VPN client (which mentions the OS type) and its version. The client is stable: it is updated only manually. Suppose an attacker who has obtained the user's login and password tries to connect to the VPN.

I hypothesize that they will most likely download the VPN client from the official website. There is a good chance that it will not match either the version or the OS that the employee usually uses.

As a result, having received the event “Employee X connected to VPN with client Y for the first time”, we can start to react.

The entries in the model table will be as follows:

  • model_id = acc-vpn-useragent

  • entity_type = account

  • entity_id = (account)

  • category = (VPN client name)

First access to the network from this provider

Enriching the login logs of the VPN and of external sites with autonomous system information (AS/ASN, i.e. the provider) from the MaxMind GeoIP ASN database, combined with category models, lets us respond to a user's first login from a given provider.

I hypothesize that an attacker who has stolen credentials will not be able to predict, and imitate, the provider from which the user usually logs into external systems.

As a result, we can get the event “Employee X logged into the network for the first time from provider Y”.

The entries in the model table will be as follows:

  • model_id = acc-remote-asn

  • entity_type = account

  • entity_id = (account)

  • category = (AS name)

First RDP login under a user from this internal IP

Users and admins often log in via RDP from a small number of IP addresses within the network: their own workstation, an admin jump server, a virtual VPN IP, and that is all.

I hypothesize that if an attacker compromises an application server and finds users logged in to it via RDP, they can recover those users' passwords from RAM and then begin exploring the network (logging in via RDP) from the IP of that application server.

So, with a model of “which IPs an employee usually logs in from via RDP”, we can spot logins from such anomalous IPs.

The entries in the model table will be as follows:

  • model_id = acc-remote-rdp-internal

  • entity_type = account

  • entity_id = (account)

  • category = (IP/IP group)

First launch of the program among users of the Accountants group

To identify anomalies in user actions, you can segment them, for example by selecting the Accountants group and creating a model of the programs that its members run:

  • we look at Windows event 4688 (and Sysmon event 1), i.e. process creation;

  • we convert the path to the application to lower case;

  • we ignore some program launches and do not send them to the model (for example, system ones);

  • we transform the path to the application using templates (change c:\users\ivanov\program.exe to “USERPROFILE\program.exe”, etc.);

  • the category will be the final templated path to the application.
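
The lower-casing and templating steps might look like the sketch below. Only the USERPROFILE rule comes from the example above; the function name `to_category` and the way rules are stored are hypothetical, and a real rule set would need more templates.

```python
import re

# Template rules as (pattern, placeholder). Patterns are written for
# already lower-cased paths. Only the USERPROFILE rule is taken from
# the article; any further rules would be added here.
TEMPLATES = [
    (re.compile(r"^[a-z]:\\users\\[^\\]+\\"), "USERPROFILE\\"),
]


def to_category(image_path: str) -> str:
    """Turn a process image path into a model category."""
    path = image_path.lower()
    for pattern, placeholder in TEMPLATES:
        # A function is used as the replacement so that the backslash in
        # the placeholder is inserted literally, not parsed as an escape.
        path = pattern.sub(lambda _m, p=placeholder: p, path)
    return path
```

With this rule, the same program launched from the profiles of two different accountants maps to a single category, so only the first launch in the group triggers the rule.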

As a result, we can get the correlation event “Employee X was the first in the Accountants group to launch software Y”.

The entries in the model table will be as follows:

  • model_id = group

  • entity_type = account_group

  • entity_id = Accountants

  • category = (template path to software)

Reacting to First Time Events

There may be many correlation events across models. Each model requires its own approach(es) to respond.

SOC responds to every event as if it were a suspicion

It may be worth singling out groups of accounts for which the response is immediate: for example, responding right away only when a model triggers for domain admins.

SOC analysis once a day

For some models or groups of entities, the triggers can be reviewed not immediately but, for example, once a day.

Personalized automatic mailings

Users themselves can be notified about “first time” events directly, as many services do (Google, Telegram, etc.), with messages like “You logged in for the first time with XXX”. The SIEM can send these automatically: to a Telegram bot, a corporate messenger, or corporate mail.

In my opinion, this is a good way to respond to events that are numerous but that SOC employees physically cannot process, given the number of accounts and the relatively low significance of each event for information security.

For example, for a given model you can notify everyone via the bot and additionally handle domain admins through the SOC.

Accumulation of risk

The general idea of the approach is that certain rules do not register an alert in the suspicion tracking system. Instead, at most once a day for each rule (module), risk points are added to the account: 20, 30, 40, etc.

If an account accumulates more than 100 points during the current day (several different rules have triggered for it), a suspicion is registered for the SOC (“Account X has accumulated more than 100 risk points today”).
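
As a rough illustration of this scheme, here is a small in-memory sketch. The class name `DailyRisk`, the per-rule scores, and the use of model IDs as rule names are all assumptions of mine; the only parts taken from the text are the once-a-day counting per rule and the threshold of 100.

```python
from collections import defaultdict

RISK_THRESHOLD = 100  # "more than 100 points" from the article


class DailyRisk:
    """Per-account risk points for the current day (hypothetical sketch)."""

    def __init__(self, rule_scores: dict):
        self.rule_scores = rule_scores
        self._fired = defaultdict(set)  # account -> rules counted today
        self._alerted = set()           # accounts already reported today

    def score(self, account: str) -> int:
        rules = self._fired.get(account, set())
        return sum(self.rule_scores[r] for r in rules)

    def add(self, account: str, rule: str) -> bool:
        """Count a rule hit (each rule once per day) and return True when
        the account crosses the threshold for the first time today."""
        self._fired[account].add(rule)
        if self.score(account) > RISK_THRESHOLD and account not in self._alerted:
            self._alerted.add(account)
            return True
        return False

    def reset(self) -> None:
        """Call at the start of each day."""
        self._fired.clear()
        self._alerted.clear()
```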

I will discuss the risk accumulation approach in more detail in the next article.

Advantages of category models

Below are the advantages of category models compared to conventional profile tables with permanent records:

  • you can also respond to rare categories (for example, assign risk points to categories seen on fewer than 5 days a month (days_seen_cnt < 5));

  • statistics and the specific days on which a category appeared are kept for each account: you can use them in monthly reports (without analyzing billions of information security events), show them in the user account, and run analytics;

  • the models are self-updating: old irrelevant categories are deleted automatically, and the last dates of use are always up-to-date;

  • creating “First Time” rules becomes simpler, since there is no need to create a separate table for each rule.

Conclusion

We implemented such models in our SIEM, built on Graylog + WSO2 Streaming Integrator + MySQL, relatively easily. Other systems usually offer table lists with a record lifetime, and something similar can be built on top of them.

We launched the first personal notifications to users from SIEM using the “First Time” rules.
