International Conference on Innovations in Power and Advanced Computing Technologies [i-PACT2017]
MALWARE CAPTURING AND DETECTION IN
          DIONAEA HONEYPOT
                       Dilsheer Ali. P                                                      Gireesh Kumar T.
            TIFAC-CORE in Cyber Security                                           TIFAC-CORE in Cyber Security
       Amrita School of Engineering, Coimbatore                               Amrita School of Engineering, Coimbatore
     Amrita Vishwa Vidyapeetham, Amrita University                          Amrita Vishwa Vidyapeetham, Amrita University
                          India                                                                 India
              dilsheerparavath@gmail.com                                              gireeshkumart@gmail.com
978-1-5090-5682-8/17/$31.00 ©2017 IEEE

    Abstract - This paper proposes a software-based method for capturing and
detecting malware in a honeypot environment. The method collects logs from the
network using a honeypot system and creates an incident table from those logs. In
the test case, we use the Metasploit Framework to attack the honeypot system manually
through LibEmu. Metasploit offers thousands of malicious payloads, which can be used
to exploit vulnerabilities in the services running on the honeypot.
Keywords - Malware, Honeypot, LibEmu, Incidents, Metasploit.

                      I. INTRODUCTION

Malware, or malicious software, is a computer program that deliberately fulfils the
harmful intent of an attacker. Worms, viruses, and Trojan horses are some types of
malware, and they exhibit similar malicious behavior. The ultimate aim of malware is
to take sensitive information from a computer system, display unwanted
advertisements, and so on. Today malware is mainly used by black hats and governments
to steal information from Internet users. In general, a honeypot means a container of
honey. In computer terminology, however, a honeypot is a security system designed to
detect unauthorized access to, or use of, a computer system [6][12]. The aim of the
honeypot is to divert attackers from the real servers [11]. All traffic through the
honeypot is unauthorized, because no real server is running on it. Honeypots can be
categorized by their deployment (use/action) and by their level of interaction. By
deployment, they are classified into production and research honeypots. Production
honeypots are easy to use but capture only a limited amount of information [4][8];
they are mainly used by corporations. Research honeypots are run to gain information
about the motives and mode of operation of the attackers. Research honeypots are
complex to deploy and maintain; their main advantage is that they collect information
all the way up to the operating system level. The logs they produce are very large.
This type of honeypot is mainly used by research, military, or government
organizations. By level of interaction, honeypots are classified into two types:
high-interaction and low-interaction honeypots [4][10]. High-interaction honeypots
provide many services to the attacker, up to the operating system level [5], and the
attacker tries to attack the system through these services. Low-interaction honeypots
provide only the limited number of services that attackers request most frequently.

Malware can be classified into different types, some of which are listed below. A
worm is a malicious computer program that replicates itself to other machines and can
run independently. The initial element of a worm is called malcode [2]. The malcode
acts like a penetration testing tool: it locates vulnerabilities in the system, scans
for unsecured servers, and replicates itself to each one. A virus cannot run
independently; it requires a host program to be run in order to activate it. The
first step is invasion: the virus enters the computer system and infects it. Once
inside, viruses move towards hidden areas of the system. When a file associated with
a virus is opened, the virus can infect the system and destroy its major components.
A Trojan horse is malicious software that pretends to be useful but performs
malicious actions in the background. Trojan horses may pose as useful browser
plugins, screen savers, or downloadable games. Once installed, their malicious part
may download additional malware from the Internet, which in turn tries to modify
system settings or infect files on the computer system. Spyware is mainly used for
tracking and storing Internet users' movements on the web and serving pop-up
advertisements to them. Software that
retrieves sensitive information from the user's system and transfers it to the
attacker also falls into this category. Information that may be useful to the
attacker includes credit card details, account credentials, confidential data, and
email contents [2]. A botnet is a group of infected end systems under the control of
a botmaster. Every bot has at least one command and control (C&C) channel. Botnets
mainly use the IRC protocol because of its high availability; other commonly used
protocols are HTTP and peer-to-peer. The botmaster sends commands to the bots through
the C&C channel [1].

                         II. RELATED WORKS

         The rate of cyber attacks increases tremendously every day, which makes
security mechanisms very important. Honeypots are among the most useful mechanisms
currently available for collecting and analyzing malware. Here we use the Dionaea
honeypot. Dionaea is a low-interaction honeypot that provides services such as SMB,
FTP, TFTP, and VoIP. LibEmu is a component used by Dionaea; it provides a shell to
the attacker using port binding, and the attacker tries to execute a malware payload
on that shell [8][9]. Dionaea logs all the activities of the attacker. Its main aim
is to obtain a copy of the malware. Dionaea collects the API calls and their
arguments and uses these features to download a copy of the malware over HTTP.

         Kumar, Sehgal, Singh, and Chaudhary [1] describe a project based on the
Nepenthes honeypot, a low-interaction honeypot used there for botnet detection. By
deploying the honeypots, botnets can be detected on public as well as private
networks. The architecture automates malware collection with the honeypot and
performs the analysis with anti-virus scans. The Nepenthes honeypot system has three
major components: the malware collector, the virus scan server, and the log server.
Nepenthes contains various modules:

    •    Shellcode handlers and emulators: provide a path for interaction between
         the malware and the honeypot.
    •    Download modules: try to download the binary of the file (http, curl, ftp,
         tftp, etc.).
    •    Submission modules: submit the binary of a malware sample for analysis
         (Norman Sandbox, Cuckoo sandbox, postgres, etc.).

Collected malware and the whole data set, including network traces and captured
data, were stored in the log server for further analysis. The log server is the main
database server and keeps the metadata of the collected information. It also keeps
the following records:

    •    MD5 hash values of the collected samples.
    •    Binaries of the collected malware.
    •    Captured data and network traces.
    •    Analysis results, including anti-virus labels.
    •    Internet Protocol (IP) address information.
    •    Download logs and the binaries that were submitted.

         The fetched malware binaries are stored in the log server and submitted to
the analysis server. The binaries were fetched from the log server and submitted to
the anti-virus scan server, which analyzes them based on signatures. Three anti-virus
products from different companies (Symantec, Microsoft, and McAfee) were used for the
analysis, and the MD5 hash values of the corresponding binaries were also submitted
to VirusTotal for scanning with 42 anti-virus products.

         Bhanu, Khilari, and Kumar [3] studied Kippo in 2014. Kippo is an SSH-based
medium-interaction honeypot written in Python. It is mainly used to log brute force
attacks and the complete shell interaction performed by an attacker. SSH (Secure
Shell) is an encrypted mechanism for connecting to remote systems, commonly used on
Linux-based operating systems. It provides a secure connection between computer
systems. The protocol was defined by Ylonen and Lonvick in the Internet Engineering
Task Force's RFC 4254 and allows users to access the shell of a remote system only
after authentication. SSH uses port 22; at login it asks for a username and password,
and it also supports more secure methods such as public key authentication. The
authors first deployed a Kippo honeypot on a virtual private server (VPS) and
assigned a static IP address to the server. To monitor all attacker activity, they
used tools such as an OpenSSH server for collecting passwords and syslog for remotely
logging important system events. The Sebek tool was used to secretly collect all
keystrokes on incoming SSH connections [3].

         Egele et al. [2] survey techniques and tools for dynamic malware analysis.
Dynamic analysis focuses on executing a malware sample and analyzing the actions it
takes. In static analysis, the sample is examined without executing it. Static
analysis has some drawbacks; the main one is that disassembling the sample does not
always give a proper result. Function call monitoring, function parameter analysis,
information flow tracking, and instruction traces are useful for dynamic analysis.

    •    Function call monitoring: A function in a program deals with a specific
         task, so analyzing a function call yields information about that task.
         Intercepting a function call is termed hooking.
    •    Function parameter analysis: Function parameters are treated differently
         in static and dynamic analysis. In static analysis we consider only the
         possible
         parameters, but in dynamic analysis we take the real values passed when
         the function is invoked.
    •    Information flow tracking: This deals with how the malicious program
         processes data. Important information is marked with a taint label, and
         whenever tainted data is used a warning is raised.
    •    Instruction trace: This gives a view of the program's behavior at the
         level of machine instructions while it is being executed. Analysis of
         system calls and function calls are examples of this type.

         Podhradsky, Casey, and Ceretti [7] describe the Bluetooth honeypot project.
The Bluetooth honeypot (Bluepot) was developed by Andrew Michael Smith, whose goal
was to create software that collects malware and logs Bluetooth attacks. The software
is written in Java and is compatible with the Linux operating system. Bluepot handles
three Bluetooth protocols: OBEX, L2CAP, and RFCOMM. The Bluetooth settings can be
configured and enabled through Bluepot. A randomizer can also be enabled; it changes
the Bluetooth name at regular intervals, so that at every interval the attacker sees
what appears to be a new Bluetooth device. The attacker tries to connect to that
device and, if the connection is established, sends malicious payloads to it through
the OBEX protocol. When a payload is received, Bluepot records the MAC address of the
attacker's device and stores the malicious payload in the local repository. Bluepot
also provides a graphical representation of the attack rate.

              III. DESIGN AND IMPLEMENTATION

Fig. 1 shows the complete system architecture of the Dionaea honeypot. All network
traffic entering the internal network passes through the Dionaea honeypot, which
collects logs from the network and stores them in the local repository. These logs
are later used for analysis.

                  Fig. 1. Dionaea System Architecture

Dionaea collects all network logs in the local repository. Dionaea is a
low-interaction honeypot and the successor to the Nepenthes honeypot. Its ultimate
aim is to obtain a copy of the malware. The Metasploit Framework (MSF) provides a
large number of exploits for known vulnerabilities. It is an infrastructure on which
you can build your own payloads and adapt them to custom needs. Dionaea runs
different kinds of services, such as SMB, FTP, TFTP, and VoIP. SMB is one of the most
vulnerable services running in Dionaea [13]. To attack the honeypot, open the
Metasploit Framework on another computer system. Search for payloads for SMB and FTP,
choose one, and set it. Then set the local IP address and port number, set the remote
IP address and remote port number in the same way, and run the exploit to deliver the
payload to the remote shell.

A. LibEmu

          Dionaea uses a library called LibEmu to detect and evaluate the payload
sent by an attacker. LibEmu provides a virtual machine (the LibEmu VM) in which
payloads are executed. After executing a payload, it records features such as API
calls and their arguments; this is called profiling. Once the profile and the payload
are available, Dionaea has to act on them in order to obtain a copy of the malware.
Attackers use the following techniques to attack a system:

    •    Shell binding: Dionaea offers shell emulation for payloads that provide a
         shell to the attacker. Here the honeypot uses the IP address
         172.17.128.46, and the attacker tries to make a connection to this
         address. After the connection is established, Dionaea provides a shell to
         the attacker, and the attacker injects malicious payloads into the shell.
    •    URLDownloadToFile API: This is used to download, via HTTP, a copy of the
         malware sent by an attacker; the copy is stored in the local repository.

B. Logging

    Figure 2 shows the logs in text format. Dionaea collects logs in text format as
well as in incident format. Text logs are not a scalable solution, so Dionaea
provides another facility called incidents. An incident contains information about
the attacker, and this information is passed to the SQL log database through an
ihandler (incident handler). The main advantage of incidents is the ability to group
information based on the initial attack.
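The grouping of incident information by initial attack can be illustrated with a
small sketch. It is not Dionaea's own code: the table name, the column names, and the
sample rows (including which connection is the root of which) are assumptions made
for illustration, and an in-memory SQLite database stands in for the SQL log
database.

```python
import sqlite3

# Hypothetical, simplified schema: one row per connection, with a pointer to
# the connection that started the attack (its "root"). Not Dionaea's schema.
SCHEMA = """
CREATE TABLE connections (
    connection      INTEGER PRIMARY KEY,
    connection_root INTEGER,
    protocol        TEXT,
    local_host      TEXT,
    local_port      INTEGER,
    remote_host     TEXT,
    remote_port     INTEGER
)
"""

# Illustrative rows only; the root links are invented for this example.
SAMPLE_ROWS = [
    (1, 1, "TCP", "172.17.128.154", 993, "172.17.128.47", 64592),
    (2, 1, "TCP", "172.17.128.154", 1723, "172.17.128.47", 64592),
    (3, 3, "UDP", "172.17.128.46", 5060, "142.0.41.190", 5122),
]


def group_by_initial_attack(db):
    """Return {root connection id: [(protocol, local port, remote host), ...]}."""
    groups = {}
    cursor = db.execute(
        "SELECT connection_root, protocol, local_port, remote_host "
        "FROM connections ORDER BY connection"
    )
    for root, protocol, local_port, remote_host in cursor:
        groups.setdefault(root, []).append((protocol, local_port, remote_host))
    return groups


db = sqlite3.connect(":memory:")
db.execute(SCHEMA)
db.executemany("INSERT INTO connections VALUES (?, ?, ?, ?, ?, ?, ?)", SAMPLE_ROWS)

incident_groups = group_by_initial_attack(db)
for root, rows in incident_groups.items():
    print(root, rows)
```

Keeping a root identifier on every row is one simple way to make "everything that
followed from this first connection" a single query, which is hard to do with flat
text logs.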
                    Fig. 2. Logs in Text Format

C. Experimental Setup
   •   Install Windows 8 as the host operating system.
   •   Create a virtual machine with Ubuntu 14.04 LTS as the guest operating system.
   •   Configure Dionaea in the Ubuntu guest OS.
   •   Dionaea comes with a shell script called runDionaea.sh inside the
       dionaea-vargent folder.
   •   When the shell script (runDionaea.sh) is run, Dionaea runs its services in
       the background.
   •   Dionaea collects the logs, which are stored in /opt/dionaea/var/log

D. Configure Ubuntu Guest Network Adapter Settings

   •   Assign the static IP address 103.5.112.94 to the honeypot.
   •   Configure the local IP addresses 172.17.128.46 and 172.17.128.154 on the
       Ubuntu OS.
   •   Run the Dionaea honeypot using the runDionaea.sh script.

E. Capturing Task

   •   Open a terminal, navigate to the Dionaea folder, and execute the
       runDionaea.sh script.
   •   After the script is run, Dionaea runs in the background. The services can be
       verified with nmap 127.0.0.1, which shows the services currently running on
       the system.
   •   Dionaea collects logs in text format. Text logs are not a scalable solution,
       so Dionaea provides another facility called incidents, which collects and
       clusters all the information based on the initial attack.

        Table I. Complete information about the attacker

Protocol used   Local IP         Local Port   Remote IP        Remote Port
TCP             172.17.128.154   993          172.17.128.47    64592
TCP             172.17.128.154   1723         172.17.128.47    64592
UDP             172.17.128.46    47882        184.105.139.94   41392
UDP             172.17.128.46    5060         142.0.41.190     5122

Table I shows the details of an attacker in incident format. The first column gives
the connection protocol used by the attacker; the remaining columns give the IP
address and port number of the host system and of the remote system.

The attacker tries to attack the honeypot system through services such as SMB, FTP,
TFTP, and VoIP. Dionaea uses LibEmu to provide a shell emulator to the attacker, who
tries to execute a payload on the shell. After the payload is executed, Dionaea
collects the logs and stores the API calls and arguments. The logs are in text
format, which is not a scalable solution, so Dionaea provides another facility called
incidents, which clusters the information based on the initial attack. Storing the
API calls and arguments is called profiling. Using the profile, Dionaea downloads a
copy of the malware over HTTP.

Table II shows some of the malware downloaded by the Dionaea honeypot. Dionaea
calculates the MD5 hash value of each sample, and this hash value is used for
analysis. The hash value can be submitted either to VirusTotal or to any anti-virus
software. The hash value can be regarded as a feature of the malware; features of
this kind are mainly used to decide whether a given sample is malware or not.
Anti-virus companies mainly use the hash value for this purpose.
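How an MD5 hash value of a captured sample can be computed is sketched below with
Python's standard hashlib module. This is an illustration, not the routine Dionaea
itself uses; the function names are hypothetical, and the file is read in chunks so
that large binaries need not fit in memory.

```python
import hashlib
import string


def md5_of_bytes(data: bytes) -> str:
    """Return the MD5 digest of a byte string as 32 hexadecimal characters."""
    return hashlib.md5(data).hexdigest()


def md5_of_file(path: str, chunk_size: int = 65536) -> str:
    """Return the MD5 digest of a file, reading it chunk by chunk."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"" (EOF).
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def looks_like_md5(value: str) -> bool:
    """Check that a string has the shape of an MD5 digest (32 hex characters)."""
    return len(value) == 32 and all(c in string.hexdigits for c in value)


print(md5_of_bytes(b"abc"))  # 900150983cd24fb0d6963f7d28e17f72
```

A shape check such as looks_like_md5 is a cheap guard before a hash is submitted to
a scanning service. MD5 works as a lookup key for known samples, but it is not
collision resistant, so it should not be relied on as the only identifier.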
             Table II. Malware downloaded by the Dionaea honeypot

Connection number   Download URL             MD5 hash value of the malware
14296               smb://1.34.68.135        939b1bbfd367b0f6ef45144ce0516be
14296               smb://1.34.68.135        64b4345a946bc9388412fedd53fb2
14300               smb://1.34.68.135        6f14fbd4368fd67d5fa1d8b92cfd2a9f
132619              spoolss://103.5.112      7878277b316e802761d4e3f8705c4221
132622              spoolss://103.5.112.94   ce9a7d0d23b3238ff379aa9a313b4e90
132627              spoolss://103.5.112.94   f2c55e756009e81c109369c1f9068d30
132630              spoolss://103.5.112      113e9ae0b05d7aea1ce423b3013c23491

                              IV. FUTURE WORK

 Malware capturing and detection is a very important task in a secure
 infrastructure. Different kinds of malware capturing systems are available today,
 and the honeypot is one of the best methods among them. Malware detection systems
 are used together with other security systems, such as IDSs and firewalls, to make
 the entire network more secure. This technology has played a significant role in
 the past few years. This paper proposed an efficient malware capturing mechanism
 using the Dionaea honeypot. The ultimate aim of Dionaea is to obtain a copy of the
 malware. Dionaea uses LibEmu to provide a shell to the attacker, who tries to
 execute a malicious payload on the shell. Dionaea extracts features such as API
 calls and arguments, which is called profiling, and uses the profile to download a
 copy of the malware over HTTP. Future work includes implementing Cuckoo sandboxing
 for dynamic analysis, collecting the logs from the sandbox, and generating a report
 for each malware sample. Features such as API calls, arguments, and permissions
 will then be extracted to create a data set, after which feature reduction and a
 machine learning algorithm will be applied to the data set.

                                 REFERENCES

[1]    Kumar, S., Sehgal, R., Singh, P., & Chaudhary, A. (2013). Nepenthes Honeypots
       Based Botnet Detection. arXiv preprint arXiv:1303.3071.
[2]    Egele, M., Scholte, T., Kirda, E., & Kruegel, C. (2012). A survey on automated
       dynamic malware-analysis techniques and tools. ACM Computing Surveys (CSUR),
       44(2), 6.
[3]    Bhanu, S., Khilari, G., & Kumar, V. (2014). Analysis of SSH attacks of Darknet
       using Honeypots. International Journal of Engineering Development and
       Research, ISSN 2321-9939.
[4]    Sachan, A., & Panchagavi, R. (2016). Honeypots: Sweet OR Sour spot in Network
       Security.
[5]    Kumar, S., & Pant, D. (2009). Detection and prevention of new and unknown
       malware using honeypots. arXiv preprint arXiv:0912.2293.
[6]    Kambow, N., & Passi, L. K. (2014). Honeypots: The Need of Network Security.
       International Journal of Computer Science and Information Technologies, 5(5).
[7]    Podhradsky, A. L., Casey, C., & Ceretti, P. (2012, April). The Bluetooth
       honeypot project. In Wireless Telecommunications Symposium (WTS), 2012
       (pp. 1-10). IEEE.
[8]    Aathira, K. S., Hiran V. Nath, Thulasi N. Kutty, & Gireesh Kumar T. Low Budget
       Honeynet Creation and Implementation for Nids and Nips. International Journal
       of Computer and Network Security, Vol. 2, No. 8, pp. 27-32, August 2010.
[9]    www.edgis-security.org/honeypot/dionaea
[10]   https://github.com/rep/dionaea
[11]   https://www.honeynet.org/project
[12]   https://en.wikipedia.org/wiki/Honeypot_(computing)
[13]   https://github.com/DinoTools/dionaea