Abstract
Background: The Domain Name System (DNS) is often described as the phone book of the Internet. Its main purpose is to translate a domain name into an IP address that computers can use. However, DNS is vulnerable to various kinds of attacks, such as DNS poisoning and DNS tunneling.
Objective: The main objective of this paper was to allow researchers to identify DNS tunnel traffic using machine-learning algorithms. Training machine-learning algorithms to detect DNS tunnel traffic and to determine which protocol was tunneled will help the community detect such attacks faster.
Methods: In this paper, we considered the DNS tunneling attack and discussed how attackers can exploit the protocol to exfiltrate data from a network. The attack starts by encoding data inside DNS queries that are sent outside the network. The malicious DNS server receives the data in small chunks, decodes each payload, and reassembles the chunks on the server. The main concern is that DNS is a fundamental service that is not usually blocked by firewalls and that receives less attention from system administrators because of the vast amount of traffic it carries.
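The chunk-and-reassemble mechanism described above can be illustrated with a minimal, offline sketch. It performs no network activity and is not Iodine's wire format; the function names, the `label.sequence.domain` layout, the Base32 encoding, and the domain `t.example.com` are all assumptions chosen for illustration. It only shows why tunneled query names tend to be long and random-looking.

```python
import base64

MAX_LABEL = 63  # RFC 1035 limits each DNS label to 63 octets


def encode_chunks(payload: bytes, domain: str, chunk_size: int = 30) -> list[str]:
    """Split payload into chunks and embed each one in a query name (hypothetical layout)."""
    names = []
    for seq, start in enumerate(range(0, len(payload), chunk_size)):
        chunk = payload[start:start + chunk_size]
        # Base32 uses only letters and digits, which are valid hostname characters.
        label = base64.b32encode(chunk).decode("ascii").rstrip("=").lower()
        if len(label) > MAX_LABEL:
            raise ValueError("chunk_size too large for a single DNS label")
        names.append(f"{label}.{seq}.{domain}")
    return names


def decode_chunks(names: list[str], domain: str) -> bytes:
    """Server-side view: decode every chunk and reassemble them by sequence number."""
    suffix = "." + domain
    parts = {}
    for name in names:
        label, seq = name[:-len(suffix)].split(".")
        # Restore the Base32 padding that was stripped during encoding.
        padded = label.upper() + "=" * (-len(label) % 8)
        parts[int(seq)] = base64.b32decode(padded)
    return b"".join(parts[i] for i in sorted(parts))


if __name__ == "__main__":
    secret = b"example payload that leaves the network in small pieces"
    queries = encode_chunks(secret, "t.example.com")
    for q in queries:
        print(q)
    # Chunks may arrive out of order; sequence numbers restore the original order.
    assert decode_chunks(list(reversed(queries)), "t.example.com") == secret
```

Each generated name carries one chunk in its leftmost label, which is the property that detection methods typically look for.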
Results: We investigated how this type of attack happens by setting up an environment consisting of compromised DNS servers and compromised hosts, with the Iodine DNS tunneling tool installed on both sides. The generated dataset contains HTTP, HTTPS, SSH, SFTP, and POP3 traffic tunneled over DNS. No features were removed from the dataset, so researchers can use all of them.
Conclusion: DNS tunneling remains a critical attack that deserves more attention. A DNS-tunneled environment allowed us to understand how such an attack happens. We built the dataset by simulating various attack scenarios using different protocols. The dataset contains PCAP, JSON, and CSV files so that researchers can apply different methods to detect tunnel traffic.
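As one hedged illustration of how such data might feed a detector, the sketch below derives a few simple lexical features from a DNS query name. The feature set and the function names are assumptions made for this example; they are not the dataset's actual columns or the method used in this paper.

```python
import math
from collections import Counter


def shannon_entropy(text: str) -> float:
    """Shannon entropy of the character distribution, in bits per character."""
    if not text:
        return 0.0
    total = len(text)
    counts = Counter(text)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def query_features(qname: str) -> dict[str, float]:
    """Illustrative lexical features of a query name (hypothetical feature set)."""
    name = qname.rstrip(".").lower()
    labels = [label for label in name.split(".") if label]
    return {
        "name_length": len(name),
        "label_count": len(labels),
        "max_label_length": max((len(label) for label in labels), default=0),
        "entropy": shannon_entropy(name),
    }


if __name__ == "__main__":
    # The second name is a made-up, encoded-looking example.
    for qname in ("www.example.com", "abcdefghijklmnopqrstuvwxyz234567.t.example.com"):
        print(qname, query_features(qname))
```

Encoded payloads usually produce longer labels and higher character entropy than ordinary hostnames, so features of this kind are a common starting point before training a classifier on the full feature set.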
Keywords: Domain name system, DNS tunneling, dataset, iodine, DNS traffic, machine learning.
Graphical Abstract