Module 4. Cyber Threat Intelligence
MIS 689 Cyber Warfare Capstone
Module Objectives
 Fundamental CTI
 Exploring & Collecting Hacker Community Data
 Exploring AZSecure Hacker Assets Portal: Identifying Threats, Actors,
  and Targets
 CTI Visualization via Tableau (Your Own Analysis)
Fundamental CTI
    CTI process
 Phase 1: Intel Planning/Strategy
    Description: Identify intelligence needs of organization, critical assets, and their vulnerabilities
    Approaches: threat trending, vulnerability assessments, asset discovery, diamond modelling
 Phase 2: Data Collection and Aggregation
    Description: Identify and collect relevant data for threat analytics
    Data sources: internal network data, external threat feeds, OSINT, human intelligence
 Phase 3: Threat Analytics
    Description: Analyze collected data to develop relevant, timely, and actionable intelligence
    Approaches: malware analysis, event correlation, visualizations, machine learning
 Phase 4: Intel Usage and Dissemination
    Description: Mitigate threats and disseminate intelligence
    Approaches: manual and automated threat responses, intelligence communication standards (e.g., STIX)
Figure: Four-phase CTI lifecycle (the slide marks the current phase with "We are here")
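To make the Phase 4 mention of STIX concrete, the sketch below builds a minimal STIX 2.1-style Indicator as a plain Python dictionary and serializes it to JSON. The helper name `make_indicator`, the indicator name, and the IP address (taken from a documentation-only range) are assumptions for illustration. A real pipeline would more likely use a dedicated STIX library and validate the object before sharing it.

```python
import json
import uuid
from datetime import datetime, timezone


def make_indicator(ip_address, name):
    """Build a minimal STIX 2.1-style Indicator for one IPv4 address (illustrative helper)."""
    now = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S.000Z")
    return {
        "type": "indicator",
        "spec_version": "2.1",
        "id": f"indicator--{uuid.uuid4()}",
        "created": now,
        "modified": now,
        "name": name,
        # STIX patterning expression matching a single IPv4 address
        "pattern": f"[ipv4-addr:value = '{ip_address}']",
        "pattern_type": "stix",
        "valid_from": now,
    }


# Hypothetical values; 198.51.100.0/24 is reserved for documentation.
indicator = make_indicator("198.51.100.7", "Hypothetical scanning host")
print(json.dumps(indicator, indent=2))
```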
    Popular CTI Analytics
Analytical Approach          Description                    Examples                          Value             Major Companies Using
Summary Statistics    High level summary of        Number of blocked IPs,         Good overview for            All
                      collected data               locations of attacks,          executives
                                                   counts over time
Event Correlation     Analyzes relationships       Identifying machine            Integrates multiple          All
                      between events               sending malicious traffic      sources of data together
                                                   by checking firewall log       (usually internal network)
Reputation Services   Identifying the quality of   IP “X” has a poor              Identify which IP            Akamai, NSFOCUS,
                      an IP                        reputation                     addresses to block           FireEye, AlienVault
Malware Analysis      Analyzing malicious files    Decompiling ransomware         Bolster technical cyber-     FireEye, AlienVault
                      on a network                                                defenses
Anomaly Detection     Detecting abnormal           Unusual user logins            Detect malicious activity    Splunk
                      behaviors
Forensics             Identifying and preserving   Examining RAM from a           Identifying how an attack    LIFARS, Blue Coat, FireEye
                      digital evidence             malicious system               occurred
Machine Learning*     Algorithms that can learn    Classifying malware            Automated analysis           Splunk, FireEye, Cylance
                      from and make
                      predictions on data
                                    *We will have lectures dedicated to machine learning/data mining
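As a rough illustration of the first two rows of the table, the sketch below computes a summary statistic (blocked connections per source IP) and a simple event correlation (hosts that appear both in blocked firewall traffic and in IDS alerts). The record layout, field names, and sample values are all hypothetical. Real logs would be parsed from files or a SIEM export.

```python
from collections import Counter

# Hypothetical, hand-made records standing in for parsed log lines.
firewall_log = [
    {"src_ip": "10.0.0.5", "dst_ip": "203.0.113.9", "action": "blocked"},
    {"src_ip": "10.0.0.5", "dst_ip": "203.0.113.9", "action": "blocked"},
    {"src_ip": "10.0.0.8", "dst_ip": "198.51.100.2", "action": "blocked"},
    {"src_ip": "10.0.0.9", "dst_ip": "192.0.2.44", "action": "allowed"},
]
ids_alerts = [
    {"src_ip": "10.0.0.5", "signature": "possible C2 beacon"},
    {"src_ip": "10.0.0.9", "signature": "port scan"},
]


def blocked_counts(log):
    """Summary statistic: number of blocked connections per source IP."""
    return Counter(r["src_ip"] for r in log if r["action"] == "blocked")


def correlate(log, alerts):
    """Event correlation: source IPs that were blocked AND raised an IDS alert."""
    alerted = {a["src_ip"] for a in alerts}
    return sorted(ip for ip in blocked_counts(log) if ip in alerted)


counts = blocked_counts(firewall_log)
suspects = correlate(firewall_log, ids_alerts)
print(counts.most_common())
print(suspects)
```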
Malware Analysis – Types of Malware
                Type                                         Description
Backdoor                       Allows an attacker to control the system
Botnet                         Infected computers receive instructions from the same
                               Command-and-Control server
Downloader                     Malicious code that exists only to download other malicious code
Information-stealing malware   Sniffers, keyloggers, password hash grabbers
Launcher                       Malicious program used to launch other malicious programs
Rootkit                        Malware that conceals the existence of other code, usually paired with
                               a backdoor
Scareware                      Frightens a user into buying something
Spam-sending malware           Attacker rents machine to spammers
Worms or Viruses               Malicious code that can copy itself and infect additional computers
Malware Analysis – Static vs Dynamic
 Static Analysis – examines malware without running it
    Quick and easy, but fails for advanced malware and can miss important
     behavior
    Tools: VirusTotal, strings, disassemblers
 Dynamic Analysis – run malware and monitor its effect
    Easy, but requires a safe test environment. Not effective on all malware
    Tools: RegShot, Process Monitor, Process Hacker, CaptureBAT
    RAM Analysis: Mandiant Redline and Volatility
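Two common first steps of static analysis can be sketched in a few lines of Python: hashing a file (the hash can then be searched on a service such as VirusTotal) and extracting printable strings, similar in spirit to the `strings` tool. The byte sequence below is an inert, made-up sample, not real malware, and the function names are illustrative.

```python
import hashlib
import re


def sha256_of(data):
    """Return the SHA-256 hex digest of a byte string."""
    return hashlib.sha256(data).hexdigest()


def printable_strings(data, min_len=4):
    """Return runs of at least min_len printable ASCII characters."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    return [m.decode("ascii") for m in re.findall(pattern, data)]


# Inert, hypothetical bytes standing in for a file read with open(path, "rb").
sample = (
    b"\x00\x01MZ\x90\x00http://example.invalid/payload"
    b"\x00\xff\xfehi\x00cmd.exe /c whoami\x00"
)

digest = sha256_of(sample)
found = printable_strings(sample)
print(digest)
print(found)
```

Short runs such as "MZ" and "hi" fall below the four-character minimum and are dropped. Packed or encrypted samples often expose few useful strings, which is one reason static analysis can miss important behavior.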
 Exploring & Collecting
 Hacker Community Data
(Du et al., IEEE ISI 2018)
What is in the underground economy?
Figure labels: POS skimmer; ATM skimmer; accessories; EMV encoder; target POS device: Verifone vx510/vx670; tutorials; features; YouTube tutorials; sold in batch; method of payment: Liberty Reserve; blank credit/debit cards (plastics)
Collection Challenges
 Anti-crawling measures
      IP address blacklisting
      User-agent check
      User/password authentication & CAPTCHA validation
      Denial of service for too many requests
 Potential risks of retaliation
    Constantly probing underground economy platforms may alert platform owners.
    Owners can trace the activity back to the collector through network traffic logs.
Need for secure, intelligent collection capabilities
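One small piece of such a capability is pacing requests so that a collector does not trigger rate limiting. The sketch below retries with exponential backoff and jitter when a server answers 429 or 503. Here `fetch` is a hypothetical callable returning a `(status, body)` pair, and the base delay, cap, and status codes are assumptions. The sketch does not address authentication, CAPTCHAs, or protecting the collector's identity.

```python
import random
import time


def backoff_delay(attempt, base=2.0, cap=60.0):
    """Delay in seconds before retry number `attempt` (0-based), capped at `cap`."""
    return min(cap, base * (2 ** attempt))


def fetch_with_backoff(fetch, url, max_attempts=4, sleep=time.sleep, rng=random.random):
    """Call fetch(url) until it succeeds, waiting longer after each throttled reply."""
    for attempt in range(max_attempts):
        status, body = fetch(url)
        if status == 200:
            return body
        if status not in (429, 503):
            raise RuntimeError(f"unexpected status {status} for {url}")
        # Jitter scales the wait to between 50% and 100% of the backoff delay.
        sleep(backoff_delay(attempt) * (0.5 + rng() / 2))
    return None


# Demonstration with a fake fetch function and a recorded (not real) sleep.
responses = iter([(429, ""), (503, ""), (200, "page")])
waits = []
result = fetch_with_backoff(
    lambda url: next(responses),
    "http://example.invalid/forum",
    sleep=waits.append,
    rng=lambda: 1.0,
)
print(result, waits)
```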
Hacker Community Platforms Overview
Platform        Description                      CTI Value
Hacker Forums   Message board allowing           Key threat actor identification; sharing of
                members to post messages         hacking tools; indication of access to other
                (archived)                       hacker communities
IRC             Plain-text, instant messaging    Sharing of hacking knowledge and potential
                communication (not archived)     targets; indication of access to other hacker
                                                 communities
DNMs            Markets on Tor that sell         Early indicator for breached companies;
                illicit goods via                in-depth understanding of underground
                cryptocurrency                   economy
Carding Shops   Shops selling stolen             Monitoring trafficking of internet fraud
                credit/debit cards and           industry; early warning of breaches before
                sensitive data                   they happen
Underlying Mechanism:
• Hackers use forums and/or IRC to freely discuss and share Tools, Techniques, and Processes (TTP).
• Hackers download tools or navigate to DNMs to purchase exploits.
• These tools help hackers conduct cyber-attacks to attain sensitive data such as credit card numbers and SSNs.
• Finally, hackers load stolen data to DNMs and/or carding shops for financial gain.
         Table 1. Hacker Community Platform Summary
Data Collection Overview: Hacker Forums
Figure callouts: poster information; ransomware description; ransomware code
         Figure 1. An example of a hacker forum member sharing ransomware code
Data Collection Overview: IRC
      Figure 2. An example of hackers sharing links containing illegal contents
      Figure 3. An example of an IRC user demanding hacking service
Data Collection Overview: DNM
         Figure 4. An example of a product listing page on DNM
         Data Collection Overview: Carding Shop
Figure callouts: card type; information of one card for carders
                     Figure 5. An example of listing page on carding shop
  AZSecure Data Collection Overview
Platform        # of Platforms   # of Records                      Languages
Forums          51 forums        32,266,852 posts                  English/Russian/Arabic
IRC             13 channels      2,791,120 lines of conversation   English
DNM             12 markets       249,597 listings                  English/Russian/French
Carding Shops   26 shops         8,674,078 listings                English
In our hacker community data collection, we successfully collected 102 platforms for a
total of 43,981,647 records.
    51 hacker forums
    13 IRC channels
    12 DNMs
    26 carding shops
Table 2. Hacker Community Data Collection Summary
       Data Integration and Visualization
Figure 6. (a) scorecard of active and expired cards, (b) locations, (c) search, sort, and filter functions, and (d) frequency of cards based on zip code
Figure 7. (a) frequency of cards per shop, (b) banks of stolen cards, (c) average card prices, (d) filter capabilities, and (e) card issuers with most stolen cards
Exploring AZSecure Hacker Assets Portal:
 Identifying Threats, Actors, and Targets
   (Samtani, et al., JMIS, 34(4), 2017)
Hacker Asset Examples
      Hackers and Hacker Assets
Forum post with source code to exploit Mozilla Firefox 3.5.3
Tutorial on how to create malicious documents
Forum post with BlackPOS malware attachment
        Introduction – Hacker Asset Examples
Figure 1. Forum post with source code to create botnets
Figure 2. Forum post with BlackPOS malware attachment
Figure 3. Tutorial on how to create malicious documents
  AZSecure Hacker Assets Portal System Design and Features
 Data Collection and Analytics
    Latent Dirichlet Allocation (LDA) and Support Vector Machine (SVM) Analytics
    987 tutorials, 15,576 source code, and 14,851 attachments
 Web Hosting and Access
 System Functionalities: Browsing, Searching, Downloading
 System Analytics: Cyber Threat Intelligence Dashboard, VirusTotal Malware Analysis
                       Figure 5. AZSecure Hacker Assets Portal System Design and Features
AZSecure Hacker Assets Portal – Data Testbed
Forum      Language   Date Range                # of Posts   # of Members   # of Source Code   # of Attachments   # of Tutorials
OpenSC     English    02/07/2005 - 02/21/2016   124,993      6,796          2,590              2,349              628
Xeksec     Russian    07/07/2007 - 09/15/2015   62,316       18,462         2,456              -                  40
Ashiyane   Arabic     05/30/2003 - 09/24/2016   34,247       6,406          5,958              10,086             80
tuts4you   English    06/10/2006 - 10/31/2016   40,666       2,539          -                  2,206              38
exelab     Russian    08/25/2008 - 10/27/2016   328,477      13,289         4,572              -                  628
Total:     -          02/07/2005 - 10/31/2016   590,699      47,492         15,576             14,851             987
AZSecure Hacker Assets Portal – Data Mining Approach
 Data Collection and Pre-Processing
    Forum Identification
    Obfuscated Crawling and Parsing
    Subset creation and data pre-processing
 Asset Analysis and Evaluations
    Inputs: Cleaned Code Posts, Cleaned Attachment Posts, Cleaned Tutorial Posts
    Analysis: Support Vector Machine (SVM), Latent Dirichlet Allocation (LDA)
    Evaluations: Benchmark Classifiers, Perplexity and Inter-rater Reliability
Algorithm            Accuracy   Precision   Recall   F1
SVM                  98.20      96.36       98.20    98.28
k-Nearest Neighbor   64.00      83.47       64.00    72.24
Naïve Bayes          86.00      88.57       86.00    87.26
Decision Tree        82.60      86.41       82.60    84.42
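For readers unfamiliar with the evaluation columns, the sketch below computes accuracy, precision, recall, and F1 for a binary labeling task. The labels and predictions are made up for illustration and are unrelated to the portal's data. The slide does not state how its multi-class scores were averaged, so this binary version is only an assumption-level illustration of the definitions.

```python
def classification_metrics(y_true, y_pred, positive):
    """Accuracy, precision, recall, and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical gold labels and classifier output for eight forum posts.
y_true = ["code", "code", "code", "other", "other", "other", "other", "code"]
y_pred = ["code", "code", "other", "other", "other", "code", "other", "code"]

metrics = classification_metrics(y_true, y_pred, positive="code")
print(metrics)
```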
                               Hacker Assets Portal V2.0 – Overview
(a) Home page, linking to (b & c) Assets, (d) Dashboard, and (e) Malware Families:
(b) Assets page, linking to Source Code and Attachments
(c) Source Code page; sortable by asset name, exploit type, date, etc.
(d) Dashboard for drill-down analysis of hackers & assets over time
(e) Malware Families, for depicting relationships among assets over time (Crypter Family shown)
               Searching, Sorting & Browsing Hacker Assets
(a) Searching
(b) Sorting
(c) Browsing
(d) Browsing: Asset metadata and forum link
(e) Browsing: Raw Code
Cyber Threat Intelligence (CTI) Dashboard
Cyber Threat Intelligence (CTI) Example – Bank Exploits
1.   Filtering on 2014, when BlackPOS was posted, shows assets and threat actors at that time.
2.   Filtering on the actor who posted BlackPOS reveals that he also posts other bank exploits (e.g., Zeus).
     • Provides intelligence on which hacker to monitor.
Cyber Threat Intelligence (CTI) Example – Crypters
 1. Filtering on a specific time point (the highest peak).
 2. Filtering on a specific asset (crypters, a key technology for
    ransomware).
 3. Filtering on a specific crypter author (Cracksman) shows the trends and
    types of assets he posted.
Cyber Threat Intelligence (CTI) Example – Mobile Malware
1. Filtering for 2016 mobile malware shows assets and threat actors at that time.
2. Filtering on a specific actor (BH-HACKER) allows us to see the assets posted.
CTI Data Exploration &
Visualization via Tableau
  (Your Own Analysis)
Tableau Background
 Tableau is data visualization software.
 It can create a variety of interactive visualizations from many
  data sources.
 Tableau is commercial software, but it is available to students for free.
    Download from http://www.tableau.com/academic/students
 Tableau is primarily drag-and-drop software.
Data Sources and Types of Visualizations
 Tableau can connect to a variety of data sources, including:
       Local files – Excel, text, Access
       Traditional databases – SQL Server, MySQL, Oracle, PostgreSQL, DB2
       Cloud technologies – Amazon Aurora, EMR, Redshift, BigQuery
       Big Data Technologies – Hadoop, Hive, Spark SQL
 Tableau can create a variety of visualizations including:
       Basic bar and line charts (e.g., temporal, box plots, etc.)
       Geospatial analysis
       Word clouds
       Treemaps
       Network analysis, although there are better tools for this (e.g., Gephi)!
 These visualizations can be combined into interactive dashboards.
     Can later be published online or shared easily.
Tableau Interface
 Dimensions
    Data fields that cannot be aggregated
    Qualitative values (such as names, dates, or geographical data)
 Measures
    Data fields that can be measured, aggregated, or used for math operations
    Numeric, quantitative values
• Blue: discrete data
• Green: continuous data
Interface areas labeled on the screenshot: Data, Drag-n-drop, Format/Encode, Worksheet, Plot types, Tabs
 https://onlinehelp.tableau.com/current/pro/desktop/en-us/datafields_typesandroles.htm
Walkthrough Example: NFL Sports Analytics
 The data used in this example is an Excel spreadsheet about NFL
  Offensive players from 1999-2013. It contains:
      ~40,000 rows of data
      Player information (physically measurable traits, birthplace, college attended)
      Positions played
      Wins achieved in career
    Connecting to a Data Source
     We will have to connect to a data source to start making visualizations.
       1. Since our data is in an Excel workbook, we will select that.
       2. Second, we will join two of the sheets in the workbook such that we can get access to a
          larger set of data. Drag the “Unique players” and “Zip codes” sheets to the right. Select the
          “Inner” join option.
       3. We will join the sheets based on zip code.
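What the inner join does can be sketched outside Tableau in plain Python: only rows whose zip code appears in both sheets survive, and each surviving row carries the columns of both. The column names, player names, and coordinates below are hypothetical stand-ins for the workbook's actual fields.

```python
def inner_join(left, right, key):
    """Return merged rows for every left/right pair sharing the same key value."""
    index = {}
    for row in right:
        index.setdefault(row[key], []).append(row)
    joined = []
    for left_row in left:
        for right_row in index.get(left_row[key], []):
            joined.append({**right_row, **left_row})
    return joined


# Hypothetical rows standing in for the "Unique players" and "Zip codes" sheets.
players = [
    {"player": "Player A", "zip": "85721"},
    {"player": "Player B", "zip": "10001"},
    {"player": "Player C", "zip": "99999"},  # no matching zip: dropped by the inner join
]
zip_codes = [
    {"zip": "85721", "latitude": 32.23, "longitude": -110.95},
    {"zip": "10001", "latitude": 40.75, "longitude": -73.99},
]

joined = inner_join(players, zip_codes, "zip")
for row in joined:
    print(row)
```

An inner join trades completeness for consistency: players without a matching zip code disappear from every later visualization, so it is worth checking the row count after joining.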
Creating a Bar Chart
 Suppose we want to know which major college conferences have the most combined wins since 1999.
1. First, drag the “Conference” dimension into the “Rows” bar, and the “College Wins” measure into the
    “Columns” bar. Open the drop-down on “College Wins” and select “Sum.”
2. Second, select the bar chart on the right-hand side.
3. To add color, drag the “Conference” dimension into the “Color” mark.
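The aggregation behind this bar chart is a group-by on the dimension followed by a sum of the measure. A minimal Python sketch of that idea follows. The rows and win counts are invented for illustration and do not come from the NFL workbook.

```python
from collections import defaultdict


def sum_by(rows, dimension, measure):
    """Sum `measure` for each distinct value of `dimension`, largest total first."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[dimension]] += row[measure]
    return dict(sorted(totals.items(), key=lambda item: item[1], reverse=True))


# Hypothetical player rows; values are made up.
rows = [
    {"Conference": "SEC", "College Wins": 10},
    {"Conference": "Big Ten", "College Wins": 7},
    {"Conference": "SEC", "College Wins": 5},
    {"Conference": "Pac-12", "College Wins": 9},
]

totals = sum_by(rows, "Conference", "College Wins")
for conference, wins in totals.items():
    print(f"{conference:10s} {'#' * wins} {wins}")
```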
Creating a Word Cloud
 Suppose now we want a general sense of the most popular conferences in
  terms of player enrollment. A word cloud is a good way to represent this
  visually.
1. First, switch the “Marks” option to “Text”.
2. Second, drag the “Conference” dimension into the “Text” marks box.
   1.   Then drag the “Conference” dimension into the “Size” marks box.
   2.   Adjust the measurement on this by opening the drop-down and selecting “Measure (Count)”.
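Conceptually, the word cloud counts how often each conference occurs and maps the count to a text size. The sketch below shows one possible linear mapping. The point sizes, the function name `word_sizes`, and the sample values are assumptions, and Tableau's own size scaling may differ.

```python
from collections import Counter


def word_sizes(values, min_pt=12, max_pt=48):
    """Map each distinct value to a font size proportional to its count."""
    counts = Counter(values)
    if not counts:
        return {}
    largest = max(counts.values())
    return {
        word: min_pt + (max_pt - min_pt) * count / largest
        for word, count in counts.items()
    }


# Hypothetical "Conference" column, one entry per player.
conferences = ["SEC", "SEC", "SEC", "Big Ten", "Big Ten", "ACC"]

sizes = word_sizes(conferences)
print(sizes)
```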
 Creating a Geospatial Visualization
 Consider now that we are interested in the birthplaces of all of the NFL players.
 We can easily create a map representation.
1. Drag the “Longitude” dimension to columns, and “Latitude” dimension to the rows. Select
     the map visualization.
2. Add in some color by dragging the “Birth Zip Code” into the “Color” Marks.
Combining Visualizations into a Dashboard
 To tell a more comprehensive
  story, we can create a
  dashboard combining all of the
  visualizations.
 Simply open a dashboard view
  and start dragging sheets into
  the dashboard.
 You can format and add filters
  into the dashboard as you
  wish.
Further Examples
 It is useful to explore other Tableau visualizations to get ideas.
    https://public.tableau.com/s/gallery contains many great visualizations.
Examples shown: Endangered Safari; US Flights Delayed by Precipitation; Domestic Violence in Spain
Tableau Resources
 Gallery of Tableau visualizations:
    https://public.tableau.com/s/gallery
 Tableau training videos:
    http://www.tableau.com/learn/training
 Sample Tableau data sources:
    https://public.tableau.com/s/resources
 Reference book:
    Tableau Your Data!: Fast and Easy Visual Analysis with Tableau Software. Daniel Murray, 2nd edition, 2015.
    Available online through UA Library
    Companion materials: http://tableauyourdata.com/downloads/