CS 495/595
Lecture 1: Introduction to
Software Reverse Engineering
Cong Wang
Center of Cybersecurity Education and Research
Department of Computer Science
http://www.lions.odu.edu/~c1wang/cs495.html
Syllabus
Cyber Center Desktop Logins:
User ID: Your Midas ID
Password: cre-midasid-cre
Or you can bring your own laptop to class, which makes it easier to save your work
Syllabus
Textbook:
Practical Malware Analysis – Michael Sikorski et al.
Supplemental Textbooks (not required):
Reversing: Secrets of Reverse Engineering, Eldad Eilam
The IDA Pro Book, Chris Eagle
Practical Reverse Engineering, Bruce Dang et al.
[More advanced]: Hacker Disassembling Uncovered, Kris Kaspersky et al.
Course Schedule
See: http://www.lions.odu.edu/~c1wang/cs495.html
Let us go through it
Grading
In-Class Homework: 30%
Homework: 40%
Final Project: 30%
Final Project: Analysis of WannaCry Ransomware
In-Class Homework: you can work in groups of at most 2 students
Homework: should be completed independently (no plagiarism: if found, both parties receive zero and a lower final grade as well)
Grading
If you scored    you earn
90-100           A
85-90            A-
80-85            B+
75-80            B
70-75            B-
65-70            C+
and so on
Homework Submission
Submission: Blackboard
Homework submission format: docx/pdf file
Write names/student IDs on the first page of the homework
Submissions are timestamped
Grades will normally be released a week after submission
Policy
Final Project: can be done individually or in a group of at most 2 students.
Late submission:
In-class homework should be submitted before Friday
Monday class: some in-class homework
Wednesday class: some in-class homework
Make sure all the in-class homework is submitted before Friday (cutoff time Thursday 23:59:59)
Take-home assignments are due before Sunday night
Prerequisites
Course Requirements:
Programming languages (C/C++/Python): basic programming skills
Computer architecture: basic understanding of operating systems
Assembly language
CS150, CS170, CS250, CS270 or equivalents.
Little Test
What is this function doing?
Assembly (hello world)
What is Reverse Engineering
Definition: the process of extracting knowledge or design information from anything man-made and reproducing it.
Examples (pictured):
Reversed: German STG44 and Soviet AK‐47; US McDonnell Douglas AV‐8 Harrier and Soviet Yak 38
Body design: Ford Fusion and Aston Martin
Clone: assemble an iPhone in 15 mins (Shenzhen, China)
Legal
• Software reverse engineering: the practice of analyzing a software system, either in whole or in part, to extract design and implementation information.
• Risk of business disputes and lawsuits
• Is reversing legal? Seek legal counsel.
– Copyright law (decompilation is legal, intermediate copying is illegal)
– Copyright law: in order to decompile a program, that program must be duplicated at least once, either in memory, on disk, or both
– Digital Millennium Copyright Act (applies to Digital Rights Management products)
• Felten vs. RIAA
• US vs. Sklyarov
Felten vs. RIAA
In 2000, SDMI (Secure Digital Music Initiative) announced the Hack SDMI challenge, a public test of its technologies for protecting audio recordings
The SDMI challenge offered a $10,000 reward in return for giving up ownership of the information
Princeton Prof. Felten's team found weaknesses and wrote a paper
[Wu et al.] Analysis of Attacks on SDMI Audio Watermarks, ICASSP, 2001.
[Craver et al.] Reading Between the Lines: Lessons from the SDMI Challenge, USENIX Security Symposium, 2001.
Felten's team chose to forgo this reward and retain ownership of the information so that they could publish their findings.
They received legal threats from SDMI and the RIAA (the Recording Industry Association of America) claiming liability under the DMCA
They first withdrew their original submission, but the paper was published later
Felten vs. RIAA
Classic case
The DMCA could actually reduce the level of security by preventing security researchers from publishing their findings.
US vs. Sklyarov
In 2001, Dmitry Sklyarov, a Russian programmer, was arrested by
the FBI for what was claimed to be a violation of the DMCA.
Sklyarov had reverse engineered the Adobe eBook file format while
working for ElcomSoft, a software company from Moscow.
The information gathered using reverse engineering was used in the
creation of a program called Advanced eBook Processor that could
decrypt such eBook files so that they become readable by any PDF
reader.
Adobe filed a complaint stating that the creation and distribution of
the Advanced eBook Processor is a violation of the DMCA, and both
Sklyarov and ElcomSoft were sued by the government.
Why?
Related to computer security
Used by hackers to defeat copy protection (crack games/software)
Reverse encryption products to assess their security level
Malware analysis (our focus)
Motivation Example 1
1. The FBI used an exploit in the Tor Browser to implant a cookie that fingerprints users' geographical location via an external Firefox browser.
2. Tor is anonymous, with several layers of encryption, so it is hard to trace
3. The Tor Browser is based on Firefox
4. The exploit was used to inject malware called "Magneto"
Cookies
What are cookies?
Cookies are small text files stored by your web browser after you visit a web page, used to personalize your next visit, collect demographic information about the visitors to the page, or monitor banner clicks.
Not malicious in themselves, but they can be used by malicious code to compromise user privacy and send user profiles to third parties
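To make the idea concrete, here is a minimal Python sketch that parses a cookie header into name/value pairs using the standard-library `http.cookies` module. The header contents (`visitor_id`, `last_banner`) are made up for illustration and do not come from any real site.

```python
from http.cookies import SimpleCookie

def parse_cookie_header(header: str) -> dict:
    """Return the name/value pairs carried in a Cookie header string."""
    jar = SimpleCookie()
    jar.load(header)
    return {name: morsel.value for name, morsel in jar.items()}

# Hypothetical header: a visitor identifier plus the last banner clicked.
example = "visitor_id=abc123; last_banner=summer_sale"
print(parse_cookie_header(example))
```

Anything that can read these pairs, including injected code, learns whatever the site chose to store in them.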
Motivation Example 1
Sends the user's hostname and MAC address via an HTTP request to 65.222.202.54
Example 2: WannaCry Ransomware
Example 2: WannaCry
Began on May 12, 2017
Infected over 200,000 computers across 150 countries (disrupted hospitals across the British National Health Service)
The Shadow Brokers made the exploit public
EternalBlue exploits the Windows SMB (Server Message Block) protocol (Microsoft patched it in March 2017, but many machines were still unpatched)
Malware analysis found a kill switch, the domain http://www.iuqerfsodp9ifjaposdfjhgosurijfaewrwergwea.com ; registering this domain name stopped the spread (the researcher who registered it was later arrested by the FBI in Las Vegas)
Ransom is paid in Bitcoin, which is hard to trace
Disputes and criticism directed at government agencies and Microsoft
SMB has been around for 20 years; upgrade to Windows 10
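The kill-switch logic can be sketched in a few lines of Python. This is a simplified model of the widely reported behavior (the sample stops if the domain answers), not the actual WannaCry code; `should_halt` and the `domain_is_live` callback are hypothetical names, and no network traffic is generated.

```python
from typing import Callable

KILL_SWITCH_DOMAIN = "www.iuqerfsodp9ifjaposdfjhgosurijfaewrwergwea.com"

def should_halt(domain_is_live: Callable[[str], bool]) -> bool:
    """Return True when the kill-switch domain answers, i.e. the sample stops."""
    return domain_is_live(KILL_SWITCH_DOMAIN)

# Before the domain was registered: the lookup fails, so the sample keeps running.
print(should_halt(lambda domain: False))  # False
# After registration: the lookup succeeds, so the sample exits.
print(should_halt(lambda domain: True))   # True
```

Passing the lookup in as a function keeps the sketch testable offline; an analyst finds the real check by looking for the domain string and the network call that uses it.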
Example 3: IoT Botnet/DDoS
Massive number of IoT devices now and in the future (home cameras/Alexa/fridges/lamps)
IoT firmware is not updated; devices are sold as is
BusyBox system: tiny Unix utilities on IoT devices
Last year, the Mirai malware outbreak brought down Internet service along the entire East Coast
Brute-forces BusyBox logins with a list of default passwords
IoT devices are turned into a botnet
Launches DDoS attacks (blackmailing)
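From the defender's side, the same idea becomes an audit: check whether a device still uses a factory-default login. The sketch below assumes a tiny made-up credential list (`DEFAULT_CREDENTIALS` is illustrative, not Mirai's actual list) and a hypothetical helper name.

```python
# Hypothetical sample of factory-default credential pairs (illustrative only).
DEFAULT_CREDENTIALS = {("admin", "admin"), ("root", "root"), ("root", "12345")}

def uses_default_credentials(username: str, password: str) -> bool:
    """Return True if the login pair appears in the known-default list."""
    return (username, password) in DEFAULT_CREDENTIALS

# A device still using a factory login is flagged; a changed password is not.
print(uses_default_credentials("root", "12345"))
print(uses_default_credentials("root", "a-long-unique-passphrase"))
```

A device that fails this check is exactly what a dictionary attack finds first, so changing default passwords removes the easiest way in.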
Example 4: Privacy Leaks during PowerBank Sharing
New start-up companies rent out power banks
The Android system has a USB debugging setting
Some smartphone vendors allow software to be installed automatically when the phone is connected to a desktop
Malware obtains privileges via USB debugging to extract personal photos and contacts and to install backdoors and adware on Android
Example 5: OPM and Anthem Breach
The FBI arrested Yu Pingan, aka GoldSun, a malware broker (a teacher from Shanghai), at LAX. How?
Accused of using Sakula to attack the US Office of Personnel Management and to steal over 80M medical records from Anthem; the Chinese government also made two related arrests in 2015.
Sakula was only used once; it exploits 4 zero-days: CVE-2014-0322 (affecting IE10), CVE-2012-4969 (affecting IE6), CVE-2012-4792 (affecting IE6), and an unidentified Flash Player zero-day
Uses a watering hole attack: infect a popular website with malware and wait for prey
How? See next
How?
Victim companies found they were connecting to a malware server; they had been infected by Sakula or a variant, which logs keystrokes and uploads files.
Found malware named capstone.exe, injected into a DNS server: capstoneturbine.cechire.com
The person who controlled that DNS server claimed to work for Capstone Turbine (a company that makes turbines in CA); this person controlled hundreds of such DNS servers, including update.microsft.kr/hacked.asp (fake updates for phishing)
Yu contacted this person, said he had some zero-day exploits, and provided him with the malware
What is Malware ?
A set of instructions that runs on your computer and makes your system do something that an attacker wants it to do
Malware Classification
Viruses and worms
• Self-replicating code that infects other systems manually or automatically
Botnets
• Software that puts your computer under the remote control of an adversary to
send spam or attack other systems (DDoS)
Backdoors
• Code that bypasses normal security authentications to provide continued,
unauthorized access to an adversary
Trojans
• Code that appears legitimate, but performs an unauthorized action
Malware Classification
Rootkits
• Tools to hide the presence of an adversary, stay concealed, avoid detection
Information theft
• Collects credentials (e.g. keystroke loggers)
• Steal files (credit card data exfiltration)
• Gather information on you, your habits, web sites you visit (e.g. spyware)
• Monitor activity (webcams)
Ransomware
• Code that renders your computer or data inaccessible until payment is received (WannaCry); payment in cryptocurrency
• CryptoMiner
• JavaScript-based in-browser malware [INFOCOM '19]
[INFOCOM '19] Rui Ning, Cong Wang, Chunsheng Xin, Jiang Li, Liuwan Zhu and Hongyi Wu, CapJack: Capture In-Browser Crypto-jacking by Deep Capsule Network through Behavioral Analysis, IEEE International Conference on Computer Communications, Paris, France, 2019. (Acceptance Rate: 19.7%)
Malware Classification
Resource or identity theft
• Store illicit files (copyrighted material)
• Stepping stone to launder activity (frame you for a crime)
Scareware
• Tricks users into buying products they do not need (window pop-up: your
system is infected)
Adware
• Code that tricks users into clicking illegitimate advertisements
Drive-by downloads
• Code automatically downloaded via the web
Course Objective
Learn tools and techniques to analyze what malicious software
does
How to detect malware
Understand the countermeasures from malware authors to
evade detection
Ethics
Do not run malware files locally on the classroom PCs or on your own computers; run them only in the VM
Explore only on your own systems or on virtual machines you have permission to use
Do not break or break into other people's machines
VirusTotal
• Upload a file, website URL, hash for analysis
• Pros: Free
• Cons: may miss zero‐day exploits, and more?
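Submitting a hash instead of the file itself avoids uploading the sample. A minimal sketch of computing a SHA-256 digest with Python's standard `hashlib` module follows; the function name `sha256_of_file` and the chunk size are illustrative choices.

```python
import hashlib

def sha256_of_file(path: str, chunk_size: int = 65536) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"" at EOF.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

The resulting hex string can be pasted into the search box to see whether the sample has already been analyzed.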
Sandbox
[Figure: Cuckoo framework on Oracle VirtualBox with four WinXP guest VMs]
Sandbox: a special environment that logs the behavior of programs
Logs API function calls, their parameters, files created/deleted, websites and ports accessed
Results are saved in a text file
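Once the results are in a text file, a first pass is often just counting which API functions were called. The sketch below assumes a made-up line format, `ApiName(arguments)`; real sandbox output differs between tools and versions, so the pattern would need adapting.

```python
import re
from collections import Counter

# Assumed line format: "<ApiName>(<arguments>)"; real sandbox logs will differ.
CALL_PATTERN = re.compile(r"^\s*([A-Za-z_]\w*)\(")

def count_api_calls(log_text: str) -> Counter:
    """Count how often each API name appears at the start of a log line."""
    counts = Counter()
    for line in log_text.splitlines():
        match = CALL_PATTERN.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts

# Hypothetical log excerpt for illustration.
sample_log = """\
CreateFileW(C:\\temp\\a.txt)
RegSetValueExW(HKCU\\Software\\Run, updater)
CreateFileW(C:\\temp\\b.txt)
"""
print(count_api_calls(sample_log).most_common())
```

Frequent file-creation or registry-write calls are a cue for where to look first in the full log.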
Identify Malware
Identify Signatures:
Host-based signatures:
Malware PE File - Entropy
Malware behavior: changes registry, API calls, file access/creation/modification
Network signatures: monitoring network traffic to understand the propagation of worms; provides a high detection rate and fewer false positives
Port Scanning
Protocols used (e.g. SMB, IRC)
Packet Payload
Finding needle-in-haystack (packet inspection)
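The PE file entropy listed under host-based signatures is the Shannon entropy of the file's bytes. The sketch below computes it in bits per byte; the function name is an illustrative choice, and how high a value counts as suspicious is a heuristic this sketch does not fix.

```python
import math
from collections import Counter

def shannon_entropy(data: bytes) -> float:
    """Entropy in bits per byte: 0.0 for constant data, 8.0 for uniformly spread bytes."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

print(shannon_entropy(b"AAAAAAAA"))        # constant data, lowest entropy
print(shannon_entropy(bytes(range(256))))  # every byte value once, highest entropy
```

Values close to 8 often indicate compressed or encrypted content, which is why packed malware tends to stand out on this measure.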
Virtual Machine
Download and install VirtualBox:
https://www.virtualbox.org/wiki/Downloads
You can either:
Load the image file (2+ GB) here: (Box Link; the link will be valid until the end of the semester)
In-class work
Download the VM .ova image
Set up the VM on either your own laptop or the classroom desktops
No need to submit this one
Tools already installed on the VM
WinRAR
Sysinternals tools (Process Explorer, Process Monitor) (https://docs.microsoft.com/en-us/sysinternals/downloads/)
PEView (wjradburn.com)
Resource Hacker (angusj.com)
Dependency Walker (dependencywalker.com)
IDA Pro 5.0 Freeware (hex-rays.com) – IDA Pro 6.8
Wireshark (wireshark.org) – v. 1.10 works on XP
ApateDNS (mandiant.com) – needs .NET Framework 3.5 (if you install it yourself)
OllyDbg 1.10 (ollydbg.de)
WinHex (winhex.com)
PEiD (softpedia.com)
UPX (upx.sourceforge.net)
Regshot (code.google.com/p/regshot/)
Google Chrome
You can customize your VM, of course
Tools already installed on the VM
Make sure you "Insert Guest Additions CD image" so you can drag files from the host to the VM
Tools already installed on the VM
IDA Pro 5.0 Freeware does not support plugins
We will use IDA Pro 6.5 (some Python plugins do not work)
Advice when analyzing code
Since we are reversing:
Pay attention to the main flow rather than details
Pay attention to the keywords (function calls/names/strings
rather than memory operation from the assembly code)
Some code is generated by the compiler – difficult to analyze
(avoid the rabbit hole)
Make guesses and trust your hunches
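One quick way to surface keywords before opening a disassembler is to pull printable strings out of the binary. The sketch below is a minimal ASCII-only version of that idea; the function name and the minimum length of 4 are illustrative choices, and wide-character (UTF-16) strings are not handled.

```python
import re

def extract_strings(data: bytes, min_length: int = 4) -> list:
    """Return runs of printable ASCII characters at least min_length long."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_length)
    return [match.decode("ascii") for match in pattern.findall(data)]

# Hypothetical byte blob mixing binary noise with readable names.
blob = b"\x00\x01kernel32.dll\x00ab\x00CreateFileA\xff"
print(extract_strings(blob))
```

Imported DLL names, API names, URLs, and file paths found this way often point to the main flow faster than reading the instructions one by one.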
Personal experience with IDA Pro and OllyDbg
IDA Pro is fantastic
OllyDbg gets the job done, but the text is too small (it hurts your eyes) and it can only trace forward, not back, which makes analysis time-consuming.