DNS Tunnel Detection

In a world where digital technologies reach into every area of our lives, keeping data secure has become a critical task.

In this article, we will discuss what DNS tunnels are, how they are created, and how machine learning methods can be used to effectively detect them.

DNS and Tunneling Methods

The DNS (Domain Name System) protocol [1,2] was originally designed as a mechanism for translating domain names into the corresponding IP addresses and back. However, attackers soon found a way to use this protocol not only to identify hosts on the network, but also as a data transmission channel. Such channels came to be called DNS tunnels.

DNS tunnels create “covert channels” that carry data inside standard DNS traffic by embedding information into protocol fields that are not normally used for data transfer. Their main purpose is to bypass ordinary detection and blocking mechanisms. Why DNS? Because name resolution is needed for almost every network interaction, DNS traffic is ubiquitous and is rarely blocked by information security tools.

A DNS tunnel typically consists of a special DNS client that embeds information into queries, and a special DNS server that understands those queries and generates special responses. This may involve generating a large number of subdomains with encoded data, changing the order of queries and responses to disguise the tunnel, and embedding information into DNS packet parameters.
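As a rough illustration of the encoding step (a minimal sketch, not the implementation of any particular tool), a tunnel client could pack a chunk of data into a DNS-safe subdomain like this; the domain `t.example.com` is a placeholder:

```python
import base64

TUNNEL_DOMAIN = "t.example.com"  # placeholder tunnel domain, for illustration only


def encode_chunk_as_qname(chunk: bytes) -> str:
    """Pack a chunk of data into a DNS-safe subdomain label using base32."""
    label = base64.b32encode(chunk).decode().rstrip("=").lower()
    # A single DNS label may be at most 63 characters, so real tools split
    # the payload across several labels and several queries.
    return f"{label[:63]}.{TUNNEL_DOMAIN}"


print(encode_chunk_as_qname(b"secret payload"))
# prints something like "onswg4tf....t.example.com"
```

The tunnel server, being authoritative for the domain, receives these queries, decodes the labels, and sends data back in the same way inside its responses.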

DNS tunnel diagram [3]

A DNS packet consists of several main components that can be analyzed to detect DNS tunnels:

  • DNS packet header: includes the query identifier, control flags (QR – query/response flag, Opcode – operation code, AA – authoritative answer, TC – truncation flag, RD – recursion desired, RA – recursion available), the number of question, answer, authority, and additional records, and other control parameters.

  • DNS Queries: This section contains information about DNS queries, including domain names and query types (e.g. A – IPv4 address query, MX – mail server query, CNAME – canonical name query, etc.).

  • DNS Responses: This section contains the records that match the DNS query. Each response record may include the record type (e.g. A, CNAME, MX), the record data such as an IP address, and the record's time to live (TTL).

  • Authoritative DNS Records: This section may contain information about the servers that have authority for a given domain name.

  • Additional DNS records: This section may contain extra records that help interpret the response.

Example of a typical DNS query

Analysis of DNS packets to detect DNS tunnels may include examining the length of DNS queries and responses, the frequency of queries, the use of unusual subdomains, analyzing query types (such as TXT queries), detecting abnormal query and response sequences, and comparing with known DNS tunnel signatures. These analysis methods can help identify potential DNS tunnels and other malicious activity that uses the DNS protocol.
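For instance, the header fields and query parameters listed above can be pulled out of a capture with scapy; a minimal sketch (the file name `traffic.pcap` is a placeholder):

```python
from scapy.all import rdpcap
from scapy.layers.dns import DNS, DNSQR

# Walk a capture and print the DNS header fields and query parameters
# described above. "traffic.pcap" is a placeholder file name.
for pkt in rdpcap("traffic.pcap"):
    if not pkt.haslayer(DNS):
        continue
    dns = pkt[DNS]
    qr = pkt[DNSQR] if pkt.haslayer(DNSQR) else None
    qname = qr.qname.decode(errors="replace") if qr else ""
    print(f"id={dns.id} qr={dns.qr} opcode={dns.opcode} "
          f"qtype={qr.qtype if qr else None} "
          f"qname_len={len(qname)} ancount={dns.ancount} nscount={dns.nscount}")
```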

Example of a tunneled DNS request

Review of existing tunnel detection methods

The problem of detecting DNS tunnels has attracted considerable attention from researchers and cybersecurity practitioners, and many articles, studies, and tools are devoted to it [4,5,6,7,8,9,10,11,12]. Existing detection methods range from statistical traffic analysis and network behavior monitoring to signatures that identify the characteristic features of DNS tunnels.

Translated from the article [4]

The figure above shows some ways to solve this problem. I will briefly describe each of them:

  • Signature method: matching traffic against signature features that are characteristic of known objects or phenomena in order to classify or detect illegitimate traffic.

  • Thresholding method: classification or detection based on a threshold value for selected characteristics.

  • Unsupervised learning: analyzing unlabeled data to discover structure, cluster, or reduce dimensionality.

  • Supervised learning: training a model on labeled data, where each example has a known label, to predict the labels of new data.

  • Deep Learning: Using deep neural networks with many layers to extract high-level patterns from data.

When choosing a method, we studied the existing approaches thoroughly. Below, I briefly describe the most interesting ones and provide references for further reading:

1) [7] DNS Tunneling Detection by Cache-Property-Aware Features: the authors focus on the “footprint” left by DNS tunneling, which cannot easily be hidden. When DNS tunneling is used for data leakage, the malware connects directly to a caching DNS server, and the generated tunneling queries inevitably cause cache misses. Since normal DNS transactions are most often accompanied by successful cache hits, the cache hit/miss ratio can serve as an indicator of abnormal activity on the network.
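The paper's own pipeline is more involved, but the core idea can be illustrated with a toy estimator: treat a query for a name already seen within its TTL as a cache hit, everything else as a miss, and watch the miss ratio (a sketch of the idea only, not the authors' implementation):

```python
import time


class CacheRatioEstimator:
    """Toy illustration of the cache hit/miss ratio idea from [7]:
    a query for a name still "cached" (seen within its TTL) counts as a hit."""

    def __init__(self):
        self.expires = {}   # qname -> expiry timestamp
        self.hits = 0
        self.misses = 0

    def observe(self, qname, ttl, now=None):
        now = time.time() if now is None else now
        if self.expires.get(qname, 0.0) > now:
            self.hits += 1
        else:
            self.misses += 1
        self.expires[qname] = now + ttl

    def miss_ratio(self):
        total = self.hits + self.misses
        return self.misses / total if total else 0.0
```

A stream of never-repeating tunnel subdomains pushes this ratio towards 1, while ordinary traffic, dominated by repeated lookups of popular names, keeps it low.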

2) [8] A Hybrid Method of Genetic Algorithm and Support Vector Machine for DNS Tunneling Detection: this paper proposes a hybrid method that combines a genetic algorithm for selecting the best features with a support vector machine (SVM) classifier. The authors achieved an F-score of 0.946.

3) [10] DNS Tunneling: A Deep Learning based Lexicographical Detection Approach: this paper proposes a detection method based on a convolutional neural network (CNN) with minimal architectural complexity to improve performance. Despite its simple architecture, the CNN model correctly detects more than 92% of DNS tunneling domains with a false positive rate close to 0.8%.

After reviewing the scientific literature on DNS tunneling, it should be noted that in practice various commercial solutions are actively used to detect and prevent these threats. They offer comprehensive approaches that combine modern traffic analysis and filtering methods, providing a higher level of security for corporate networks. Examples of such commercial tools are ZenArmor and Cloudflare DNS. ZenArmor uses deep packet inspection (DPI) and machine learning to monitor network traffic and detect anomalies such as DNS tunnels by analyzing behavioral models and data patterns. Cloudflare DNS ensures the security and performance of DNS queries by actively monitoring and filtering suspicious activity, which allows DNS tunneling attempts to be detected and prevented effectively. Suricata can also detect tunnels when equipped with appropriate rule sets, such as Emerging Threats. Later in the article, our solution is compared with Suricata.

Overview of datasets

Developing and evaluating DNS tunnel detection methods with machine learning, like any other machine learning task, requires high-quality datasets. In this study, we relied on a variety of data sources to create the training and test sets. The training data was generated on our own setup: we created two virtual machines (a client and a server), registered a domain name, and set up an NS record so that the client VM could reach the server VM. We generated both clean traffic and traffic passing through DNS tunnels using various programs such as tcp2dns [13], iodine [14], and tuns [15]. This allowed us to create a variety of tunnel traffic scenarios and train our models more effectively.

To test and evaluate the effectiveness of our model, we used additional datasets. In particular, we used pcap files from open repositories on GitHub [16,17,18], as well as data from published articles, including the open dataset from CIC (Canadian Institute for Cybersecurity) [19]. These datasets provided additional traffic cases that our model could encounter in the real world, allowing us to evaluate its performance more comprehensively.
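For reference, assembling such sets from pcap files can look roughly like this (a sketch only; the directory layout under `data/`, the reduced feature set, and the choice of 0 for clean and 1 for tunnel traffic are assumptions made here for illustration):

```python
import glob

import pandas as pd
from scapy.all import rdpcap
from scapy.layers.dns import DNS, DNSQR


def load_dns_rows(pattern, label):
    """Read every pcap matching `pattern` and turn its DNS packets into feature rows."""
    rows = []
    for path in glob.glob(pattern):
        for pkt in rdpcap(path):
            if not pkt.haslayer(DNS):
                continue
            dns = pkt[DNS]
            qname = pkt[DNSQR].qname.decode(errors="replace") if pkt.haslayer(DNSQR) else ""
            rows.append({
                "qd_qname_len": len(qname),
                "ancount": dns.ancount,
                "nscount": dns.nscount,
                "is_response": int(dns.qr),
                "label": label,
            })
    return rows


# Placeholder layout: clean captures in data/clean, tunnel captures in data/tunnel.
df = pd.DataFrame(load_dns_rows("data/clean/*.pcap", label=0)
                  + load_dns_rows("data/tunnel/*.pcap", label=1))
```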

This approach to constructing training and test datasets provided a broader coverage of tunnel traffic scenarios, which in turn improved the generalization ability of our models and their accuracy in detecting DNS tunnels.

Using LightGBM

To detect DNS tunnels effectively, the supervised machine learning algorithm LightGBM (Light Gradient Boosting Machine) was chosen for this work [20]. The algorithm is based on gradient boosting and is characterized by high performance and accuracy. LightGBM can efficiently process large amounts of data and capture complex dependencies between features, making it a suitable choice for detecting tunnels in DNS traffic.
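A minimal training sketch with the scikit-learn-style LightGBM interface (the feature table `df` is assumed to look like the one assembled above; the hyperparameters are illustrative, not the ones used in the article):

```python
import lightgbm as lgb
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

X = df.drop(columns=["label"])
y = df["label"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Illustrative hyperparameters; the article's actual settings are not given.
model = lgb.LGBMClassifier(n_estimators=300, learning_rate=0.05)
model.fit(X_train, y_train)

print("macro F1:", f1_score(y_test, model.predict(X_test), average="macro"))
```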

During the experiments, it was important to achieve high accuracy while still processing a sufficient number of packets per minute, since some enterprise networks are very busy. The table below compares several machine learning models in terms of throughput and accuracy:

| Machine learning model | Packets per minute | Macro F1 |
| --- | --- | --- |
| LogReg | 32343750 | 0.69 |
| SVM (poly kernel) | 205472 | 0.64 |
| SVM (rbf kernel) | 183904 | 0.71 |
| KNN | 36751 | 0.72 |
| Decision tree | 646950 | 0.72 |
| LightGBM | 53181 | 0.96 |

An important strategy chosen for detecting DNS tunnels was to split the traffic into DNS requests and DNS responses, and then create two separate models (Request and Response) to analyze each type of message. This simplified the task of the model working with one specific type of traffic:
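In code, this split is just a routing step on the QR flag; a sketch, assuming the feature frame and the two trained classifiers from the earlier sketches:

```python
import pandas as pd


def score_traffic(df: pd.DataFrame, request_model, response_model,
                  request_cols, response_cols) -> pd.Series:
    """Route each packet to the request or response model by its QR flag."""
    preds = pd.Series(0, index=df.index)
    is_resp = df["is_response"] == 1
    preds[~is_resp] = request_model.predict(df.loc[~is_resp, request_cols])
    preds[is_resp] = response_model.predict(df.loc[is_resp, response_cols])
    return preds
```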

DNS traffic processing scheme

It is important to note that, to keep the models universal and generalizable, we used only statistical characteristics of the traffic. In other words, the features can be extracted from the traffic without exposing the actual content of the requests or responses to the models. The features are:

Request:

  • 'qd_qname_len' / 'qd_qname_shannon' – length / entropy of the requested domain

  • 'ancount' / 'nscount' – number of records in the answer / authority (name server) sections

  • 'chars' – the number of letters in the domain name

  • 'digits' – the number of digits in the domain name

  • 'ratio' – the ratio of the number of vowels to consonants in a domain name

It is also possible to determine graphically why the model has flagged a particular packet as anomalous. Two plots are produced for each packet (for class 0 and class 1, respectively). Blue features reduce the model's confidence in the given class, while red features increase it.
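The article does not say which tool produces these plots; SHAP values are one common way to obtain such per-feature contributions from a gradient-boosting model, so, purely as an assumption, a comparable picture could be produced like this:

```python
import shap

# Assumption: SHAP is shown only as an example of how such feature-influence
# plots can be produced; `model` and `X_test` come from the training sketch.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X_test)

# Global view of how each feature pushes predictions towards either class;
# per-packet views are available via shap.force_plot / shap.plots.waterfall.
shap.summary_plot(shap_values, X_test)
```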

Graph of the influence of features on the confidence of the request model in the normality of the packet

Graph of the influence of features on the confidence of the request model in the packet anomaly

Response:

  • 'qd_qname_len' / 'qd_qname_shannon' – length / entropy of the requested domain

  • 'an_rdata_len' / 'an_rdata_shannon' – length / entropy of the response data

  • 'an_rrname_len' / 'an_rrname_shannon' – length / entropy of the domain name in the response

  • 'ancount' / 'nscount' – number of records in the answer / authority (name server) sections

Graph of the influence of features on the confidence of the response model in the normality of this packet

Graph of the influence of features on the confidence of the response model in the abnormality of this packet

These statistical features allow the models to detect patterns characteristic of DNS tunnels with minimal dependence on the specific content of the information being transmitted in the packet.
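A sketch of how the listed features can be computed for a single query name (the helper below follows the field names above; the article's exact implementation may differ):

```python
import math
import re


def shannon_entropy(s: str) -> float:
    """Shannon entropy of a string, in bits per character."""
    if not s:
        return 0.0
    probs = [s.count(c) / len(s) for c in set(s)]
    return -sum(p * math.log2(p) for p in probs)


def request_features(qname: str, ancount: int, nscount: int) -> dict:
    """Statistical features for the request model; a sketch, not the article's exact code."""
    vowels = len(re.findall(r"[aeiou]", qname, re.I))
    consonants = len(re.findall(r"[bcdfghjklmnpqrstvwxyz]", qname, re.I))
    return {
        "qd_qname_len": len(qname),
        "qd_qname_shannon": shannon_entropy(qname),
        "ancount": ancount,
        "nscount": nscount,
        "chars": sum(c.isalpha() for c in qname),
        "digits": sum(c.isdigit() for c in qname),
        "ratio": vowels / consonants if consonants else 0.0,
    }
```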

Obtained metrics and results

For the model to work successfully, it is important to avoid false positives: the vast majority of traffic is benign (“white”), and frequent false alarms can lead to the model being switched off. When assessing performance, it is therefore recommended to focus on the weighted F1-score, which takes class imbalance into account and gives a more balanced view of precision and recall, especially with a heterogeneous data distribution.
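With scikit-learn this is a one-liner; the labels below are placeholders just to make the snippet runnable:

```python
from sklearn.metrics import classification_report, f1_score

# Placeholder labels and predictions, only to make the snippet runnable.
y_true = [0, 0, 1, 1, 1, 1, 1, 1]
y_pred = [0, 1, 1, 1, 1, 1, 1, 1]

print(classification_report(y_true, y_pred, digits=2))
print("weighted F1:", f1_score(y_true, y_pred, average="weighted"))
```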

In the process of evaluating the effectiveness of our models, we used three different test datasets. Let's look at the results on each of them:

GitHub dataset:

The datasets from open repositories on GitHub [16,17,18] provided us with a large number of test cases created by the research community. We used them to test the generalization ability of the model on a wide range of tunnel traffic variations.

Request model:

| | precision | recall | F1-score | support |
| --- | --- | --- | --- | --- |
| 0 | 0.87 | 1.00 | 0.93 | 2477 |
| 1 | 1.00 | 0.99 | 1.00 | 74108 |
| accuracy | | | 1.00 | 76585 |
| macro avg | 0.93 | 1.00 | 0.96 | 76585 |
| weighted avg | 1.00 | 1.00 | 1.00 | 76585 |

Response model:

| | precision | recall | F1-score | support |
| --- | --- | --- | --- | --- |
| 0 | 1.00 | 1.00 | 1.00 | 2477 |
| 1 | 1.00 | 1.00 | 1.00 | 59575 |
| accuracy | | | 1.00 | 62052 |
| macro avg | 1.00 | 1.00 | 1.00 | 62052 |
| weighted avg | 1.00 | 1.00 | 1.00 | 62052 |

The results on this dataset show that the model has learned to detect typical tunnels well.

CIC Dataset:

The CIC dataset [19] provided more complex and realistic scenarios based on current cyber threats. It allowed us to test how the model copes with more advanced DNS tunneling and evasion techniques, and to evaluate its ability to detect new and unknown patterns.

Request model:

| | precision | recall | F1-score | support |
| --- | --- | --- | --- | --- |
| 0 | 0.96 | 0.94 | 0.95 | 362006 |
| 1 | 0.90 | 0.93 | 0.92 | 199612 |
| accuracy | | | 0.94 | 561618 |
| macro avg | 0.93 | 0.94 | 0.93 | 561618 |
| weighted avg | 0.94 | 0.94 | 0.94 | 561618 |

Response model:

| | precision | recall | F1-score | support |
| --- | --- | --- | --- | --- |
| 0 | 1.00 | 1.00 | 1.00 | 362004 |
| 1 | 0.00 | 0.00 | 0.00 | 6 |
| accuracy | | | 1.00 | 362010 |
| macro avg | 0.93 | 0.94 | 0.93 | 362010 |
| weighted avg | 1.00 | 1.00 | 1.00 | 362010 |

On the CIC dataset the model also performed well, which shows that it has not been overfitted and can handle new situations. However, 6 class 1 packets were not detected. Since the number of class 1 examples is very small, an additional blocking method can be used in such cases; for example, blocking can be based on the request/response pair within a single connection (see the sketch below).
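A sketch of that pair-based idea: group packets by DNS transaction id (a real pairing would also use the IP/port 5-tuple) and flag the whole exchange if either side is flagged; the field names here are assumptions made for illustration:

```python
def pair_verdicts(packets):
    """packets: iterable of dicts with 'txid' and 'pred' keys,
    where pred == 1 means the corresponding model flagged the packet."""
    verdict_by_txid = {}
    for p in packets:
        prev = verdict_by_txid.get(p["txid"], 0)
        verdict_by_txid[p["txid"]] = max(prev, p["pred"])
    return verdict_by_txid  # txid -> 1 if either the request or the response was flagged
```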

The trained models were successfully integrated into UDV NTA (Network Traffic Analyzer), a network traffic analysis solution that provides effective detection of anomalies and malicious activity.

Example of an incident created by a model trigger in the UDV NTA interface

Comparison with Suricata

Suricata is a powerful and flexible intrusion detection system (IDS), intrusion prevention system (IPS), and network security monitoring tool created and maintained by the Open Information Security Foundation (OISF). Suricata analyzes network traffic in real time and identifies suspicious activity using a combination of signature-based and anomaly-based detection. Thanks to its versatility and support for a wide range of protocols, Suricata is widely used in corporate networks to protect against cyberattacks and keep data secure.

In this section, we compare the performance of Suricata with that of our models. The evaluation is conducted on the datasets provided on GitHub [16,17]. In the analysis, we consider the detection of each packet separately, as well as pairs of packets representing a single connection. This allows us to evaluate in detail the detection accuracy, data processing speed, and overall effectiveness of both systems.

Suricata Configuration:

Suricata 4.1.2, Ruleset: Emerging Threats 2024-07-19T20:49:10Z

Request:

| Suricata / Model | precision | recall | F1-score | support |
| --- | --- | --- | --- | --- |
| 0 | 0.82 / 0.99 | 1.00 / 0.99 | 0.90 / 0.99 | 64677 |
| 1 | 0.85 / 0.96 | 0.00 / 0.95 | 0.00 / 0.95 | 13982 |
| accuracy | | | 0.82 / 0.98 | 78659 |
| macro avg | 0.84 / 0.97 | 0.50 / 0.97 | 0.45 / 0.97 | 78659 |
| weighted avg | 0.83 / 0.98 | 0.82 / 0.98 | 0.74 / 0.98 | 78659 |

Response:

| Suricata / Model | precision | recall | F1-score | support |
| --- | --- | --- | --- | --- |
| 0 | 0.99 / 1.00 | 1.00 / 0.99 | 0.99 / 1.00 | 61915 |
| 1 | 0.96 / 0.70 | 0.23 / 1.00 | 0.37 / 0.82 | 875 |
| accuracy | | | 0.94 / 0.99 | 62790 |
| macro avg | 0.97 / 0.85 | 0.62 / 0.99 | 0.68 / 0.91 | 62790 |
| weighted avg | 0.99 / 1.00 | 0.99 / 0.99 | 0.99 / 0.99 | 62790 |

Request/response pairs (per connection):

| Suricata / Model | precision | recall | F1-score | support |
| --- | --- | --- | --- | --- |
| 0 | 0.90 / 1.00 | 1.00 / 0.99 | 0.95 / 0.99 | 126592 |
| 1 | 0.95 / 0.90 | 0.03 / 1.00 | 0.06 / 0.94 | 14857 |
| accuracy | | | 0.94 / 0.99 | 141449 |
| macro avg | 0.92 / 0.95 | 0.51 / 0.99 | 0.50 / 0.97 | 141449 |
| weighted avg | 0.90 / 0.99 | 0.90 / 0.99 | 0.85 / 0.99 | 141449 |

Suricata produced detections on only one file: 222 of them were 'Misc activity' and only 38 were 'ET TROJAN Suspicious Long NULL DNS Request – Possible DNS Tunneling'. In addition, the experiment showed that Suricata does not detect tunnels generated by the dnscat2 utility [21], which are detected by our model.

Conclusion

In this paper, we presented a study on DNS tunnel detection using the LightGBM machine learning algorithm. We demonstrated that our models achieve high accuracy and can detect covert data transmission channels even in complex and diverse scenarios.

However, cybersecurity is a continuous process and new threats appear every day. In the future, it is planned to develop models to detect other types of attacks.

With cyber threats constantly changing and technology evolving rapidly, our work on this path is far from over. We are committed to finding new methods, adapting to new threats, and improving our models and technologies. Combating cyber threats requires continuous development and cooperation on a global scale, and we are ready to rise to this challenge.


Author of the article: Nikita Bykov, Junior Researcher at the UDV Group Research Center


Links:
