0% found this document useful (0 votes)

47 views15 pages

Oakland 10

This document formally defines the algorithms for dynamic taint analysis and forward symbolic execution as extensions to the operational semantics of a simple intermediate language called SIMPIL. It also highlights important implementation choices, common pitfalls, and considerations when using these techniques for security applications like malware analysis, vulnerability discovery, and test case generation.

Uploaded by

metatron13

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

47 views15 pages

Oakland 10

Uploaded by

metatron13

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 15

All You Ever Wanted to Know About

Dynamic Taint Analysis and Forward Symbolic Execution

(but might have been afraid to ask)
Edward J. Schwartz, Thanassis Avgerinos, David Brumley
Carnegie Mellon University
Pittsburgh, PA
edmcman, thanassis, dbrumley@cmu.edu
AbstractDynamic taint analysis and forward symbolic
execution are quickly becoming staple techniques in security
analyses. Example applications of dynamic taint analysis and
forward symbolic execution include malware analysis, input
lter generation, test case generation, and vulnerability dis-
covery. Despite the widespread usage of these two techniques,
there has been little effort to formally dene the algorithms and
summarize the critical issues that arise when these techniques
are used in typical security contexts.
The contributions of this paper are two-fold. First, we
precisely describe the algorithms for dynamic taint analysis and
forward symbolic execution as extensions to the run-time se-
mantics of a general language. Second, we highlight important
implementation choices, common pitfalls, and considerations
when using these techniques in a security context.
Keywords-taint analysis, symbolic execution, dynamic
analysis
I. INTRODUCTION
Dynamic analysis the ability to monitor code as it
executes has become a fundamental tool in computer
security research. Dynamic analysis is attractive because
it allows us to reason about actual executions, and thus
can perform precise security analysis based upon run-time
information. Further, dynamic analysis is simple: we need
only consider facts about a single execution at a time.
Two of the most commonly employed dynamic analysis
techniques in security research are dynamic taint analysis
and forward symbolic execution. Dynamic taint analysis runs
a program and observes which computations are affected
by predened taint sources such as user input. Dynamic
forward symbolic execution automatically builds a logical
formula describing a program execution path, which reduces
the problem of reasoning about the execution to the domain
of logic. The two analyses can be used in conjunction to
build formulas representing only the parts of an execution
that depend upon tainted values.
The number of security applications utilizing these two
techniques is enormous. Example security research areas
employing either dynamic taint analysis, forward symbolic
execution, or a mix of the two, are:
1) Unknown Vulnerability Detection. Dynamic taint
analysis can look for misuses of user input during an
execution. For example, dynamic taint analysis can be
used to prevent code injection attacks by monitoring
whether user input is executed [2325, 50, 59].
2) Automatic Input Filter Generation. Forward sym-
bolic execution can be used to automatically generate
input lters that detect and remove exploits from the
input stream [14, 22, 23]. Filters generated in response
to actual executions are attractive because they provide
strong accuracy guarantees [14].
3) Malware Analysis. Taint analysis and forward sym-
bolic execution are used to analyze how information
ows through a malware binary [7, 8, 65], explore
trigger-based behavior [12, 45], and detect emula-
tors [58].
4) Test Case Generation. Taint analysis and forward
symbolic execution are used to automatically generate
inputs to test programs [17, 19, 36, 57], and can
generate inputs that cause two implementations of the
same protocol to behave differently [10, 17].
Given the large number and variety of application do-
mains, one would imagine that implementing dynamic taint
analysis and forward symbolic execution would be a text-
book problem. Unfortunately this is not the case. Previous
work has focused on how these techniques can be applied
to solve security problems, but has left it as out of scope to
give exact algorithms, implementation choices and pitfalls.
As a result, researchers seeking to use these techniques often
rediscover the same limitations, implementation tricks, and
trade-offs.
The goals and contributions of this paper are two-fold.
First, we formalize dynamic taint analysis and forward
symbolic execution as found in the security domain. Our
formalization rests on the intuition that run-time analyses
can precisely and naturally be described in terms of the
formal run-time semantics of the language. This formal-
ization provides a concise and precise way to dene each
analysis, and suggests a straightforward implementation. We
program ::= stmt*
stmt s ::= var := exp [ store(exp, exp)
[ goto exp [ assert exp
[ if exp then goto exp
else goto exp
exp e ::= load(exp) [ exp
b
exp [
u
exp
[ var [ get input(src) [ v

b
::= typical binary operators

u
::= typical unary operators
value v ::= 32-bit unsigned integer
Table I: A simple intermediate language (SIMPIL).
then show how our formalization can be used to tease out
and describe common implementation details, caveats, and
choices as found in various security applications.
II. FIRST STEPS: A GENERAL LANGUAGE
A. Overview
A precise denition of dynamic taint analysis or forward
symbolic execution must target a specic language. For
the purposes of this paper, we use SIMPIL: a Simple
Intermediate Language. The grammar of SIMPIL is pre-
sented in Table I. Although the language is simple, it is
powerful enough to express typical languages as varied as
Java [31] and assembly code [1, 2]. Indeed, the language is
representative of internal representations used by compilers
for a variety of programming languages [3].
A program in our language consists of a sequence of
numbered statements. Statements in our language consist
of assignments, assertions, jumps, and conditional jumps.
Expressions in SIMPIL are side-effect free (i.e., they do
not change the program state). We use
b
to represent
typical binary operators, e.g., you can ll in the box with
operators such as addition, subtraction, etc. Similarly,
u
represents unary operators such as logical negation. The
statement get input(src) returns input from source src. We
use a dot () to denote an argument that is ignored, e.g.,
we will write get input() when the exact input source is
not relevant. For simplicity, we consider only expressions
(constants, variables, etc.) that evaluate to 32-bit integer
values; extending the language and rules to additional types
is straightforward.
For the sake of simplicity, we omit the type-checking
semantics of our language and assume things are well-typed
in the obvious way, e.g., that binary operands are integers
or variables, not memories, and so on.
B. Operational Semantics
The operational semantics of a language specify unam-
biguously how to execute a program written in that language.
Context Meaning
Maps a statement number to a statement
Maps a memory address to the current value
at that address
Maps a variable name to its value
pc The program counter
The next instruction
Figure 2: The meta-syntactic variables used in the execution
context.
Because dynamic program analyses are dened in terms
of actual program executions, operational semantics also
provide a natural way to dene a dynamic analysis. However,
before we can specify program analyses, we must rst dene
the base operational semantics.
The complete operational semantics for SIMPIL are
shown in Figure 1. Each statement rule is of the form:
computation
current state), stmt end state), stmt
Rules are read bottom to top, left to right. Given a statement,
we pattern-match the statement to nd the applicable rule,
e.g., given the statement x := e, we match to the ASSIGN
rule. We then apply the computation given in the top of
the rule, and if successful, transition to the end state. If
no rule matches (or the computation in the premise fails),
then the machine halts abnormally. For instance, jumping to
an address not in the domain of would cause abnormal
termination.
The execution context is described by ve parameters: the
list of program statements (), the current memory state (),
the current value for variables (), the program counter (pc),
and the current statement (). The , , and contexts are
maps, e.g., [x] denotes the current value of variable x. We
denote updating a context variable x with value v as x v,
e.g., [x 10] denotes setting the value of variable x to the
value 10 in context . A summary of the ve meta-syntactic
variables is shown in Figure 2.
In our evaluation rules, the program context does not
change between transitions. The implication is that our oper-
ational semantics do not allow programs with dynamically
generated code. However, adding support for dynamically
generated code is straightforward. We discuss how SIMPIL
can be augmented to support dynamically generated code
and other higher-level language features in Section II-C.
The evaluation rules for expressions use a similar notation.
We denote by , e v evaluating an expression e
to a value v in the current state given by and . The
expression e is evaluated by matching e to an expression
evaluation rule and performing the attached computation.
v is input from src
, get input(src) v
INPUT
, e v
1
v = [v
1
]
, load e v
LOAD
, var [var]
VAR
, e v v

=
u
v
,
u
e v

UNOP
, e
1
v
1
, e
2
v
2
v

= v
1

b
v
2
, e
1

b
e
2
v

BINOP
, v v
CONST
, e v

= [var v] = [pc + 1]
, , , pc, var := e , ,

, pc + 1,
ASSIGN
, e v
1
= [v
1
]
, , , pc, goto e , , , v
1
,
GOTO
, e 1 e
1
v
1
= [v
1
]
, , , pc, if e then goto e
1
else goto e
2
, , , v
1
,
TCOND
, , e 0 e
2
v
2
= [v
2
]
, , , pc, if e then goto e
1
else goto e
2
, , , v
2
,
FCOND
, e
1
v
1
, e
2
v
2
= [pc + 1]

= [v
1
v
2
]
, , , pc, store(e
1
, e
2
) ,

, , pc + 1,
STORE
, e 1 = [pc + 1]
, , , pc, assert(e) , , , pc + 1,
ASSERT
Figure 1: Operational semantics of SIMPIL.
Most of the evaluation rules break the expression down into
simpler expressions, evaluate the subexpressions, and then
combine the resulting evaluations.
Example 1. Consider evaluating the following program:
1 x : = 2 ge t i nput ( )
The evaluation for this program is shown in Figure 3 for
the input of 20. Notice that since the ASSIGN rule requires
the expression e in var := e to be evaluated, we had to
recurse to other rules (BINOP, INPUT, CONST) to evaluate
the expression 2get input() to the value 40.
C. Language Discussion
We have designed our language to demonstrate the critical
aspects of dynamic taint analysis and forward symbolic
execution. We do not include some high-level language
constructs such as functions or scopes for simplicity and
space reasons. This omission does not fundamentally limit
the capability of our language or our results. Adding such
constructs is straightforward. For example, two approaches
are:
1) Compile missing high-level language constructs down
to our language. For instance, functions, buffers and
user-level abstractions can be compiled down to
SIMPIL statements instead of assembly-level instruc-
tions. Tools such as BAP [1] and BitBlaze [2] already
use a variant of SIMPIL to perform analyses. BAP is
freely available at http://bap.ece.cmu.edu.
Example 2. Function calls in high-level code can
be compiled down to SIMPIL by storing the return
address and transferring control ow. The following
code calls and returns from the function at line 9.
1 / Cal l e r f u n c t i o n /
2 esp : = esp + 4
3 s t or e ( esp , 6) / r e t a d d r i s 6 /
4 got o 9
5 / The c a l l wi l l r e t u r n her e /
6 hal t
7
8 / Cal l e e f u n c t i o n /
9 . . .
10 got o l oad ( esp )
We assume this choice throughout the paper since
previous dynamic analysis work has already demon-
strated that such languages can be used to reason about
programs written in any language.
2) Add higher-level constructs to SIMPIL. For instance,
it might be useful for our language to provide di-
rect support for functions or dynamically generated
code. This could slightly enhance our analyses (e.g.,
allowing us to reason about function arguments), while
requiring only small changes to our semantics and
analyses. Figure 4 presents the CALL and RET rules
that need to be added to the semantics of SIMPIL to
provide support for call-by-value function calls. Note
that several new contexts were introduced to support
functions, including a stack context () to store return
addresses, a scope context () to store function-local
variable contexts and a map from function names to
addresses ().
In a similar manner we can enhance SIMPIL to support
, 2 2
CONST
20 is input
, get input() 20
INPUT
v

= 2 20
, 2*get input() 40
BINOP

= [x 40] = [pc + 1]
, , , pc, x := 2*get input() , ,

, pc + 1,
ASSIGN
Figure 3: Evaluation of the program in Listing 1.
dynamically generated code. We redene the abstract
machine transition to allow updates to the program
context (

) and provide the rules for adding

generated code to . An example GENCODE rule is
shown in Figure 4.
III. DYNAMIC TAINT ANALYSIS
The purpose of dynamic taint analysis is to track in-
formation ow between sources and sinks. Any program
value whose computation depends on data derived from a
taint source is considered tainted (denoted T). Any other
value is considered untainted (denoted F). A taint policy
P determines exactly how taint ows as a program ex-
ecutes, what sorts of operations introduce new taint, and
what checks are performed on tainted values. While the
specics of the taint policy may differ depending upon the
taint analysis application, e.g., taint tracking policies for
unpacking malware may be different than attack detection,
the fundamental concepts stay the same.
Two types of errors can occur in dynamic taint analysis.
First, dynamic taint analysis can mark a value as tainted
when it is not derived from a taint source. We say that such
a value is overtainted. For example, in an attack detection
application overtainting will typically result in reporting
an attack when no attack occurred. Second, dynamic taint
analysis can miss the information ow from a source to a
sink, which we call undertainting. In the attack detection
scenario, undertainting means the system missed a real
attack. A dynamic taint analysis system is precise if no
undertainting or overtainting occurs.
In this section we rst describe how dynamic taint analysis
is implemented by monitoring the execution of a program.
We then describe various taint analysis policies and trade-
offs. Finally, we describe important issues and caveats that
often result in dynamic taint analysis systems that overtaint,
undertaint, or both.
A. Dynamic Taint Analysis Semantics
Since dynamic taint analysis is performed on code at
runtime, it is natural to express dynamic taint analysis in
terms of the operational semantics of the language. Taint
policy actions, whether it be taint propagation, introduction,
or checking, are added to the operational semantics rules.
To keep track of the taint status of each program value, we
redene values in our language to be tuples of the form
v, ), where v is a value in the initial language, and is
taint t ::= T [ F
value ::= v, t)

::= Maps variables to taint status

::= Maps addresses to taint status

Table II: Additional changes to SIMPIL to enable dynamic
taint analysis.
the taint status of v. A summary of the necessary changes
to SIMPIL is provided in Table II.
Figure 5 shows how a taint analysis policy P is added to
SIMPIL. The semantics show where the taint policy is used;
the semantics are independent of the policy itself. In order
to support taint policies, the semantics introduce two new
contexts:

and

keeps track of the taint status of

scalar variables.

keeps track of the taint status of memory

cells.

and

are initialized so that all values are marked

untainted. Together,

and

keep the taint status for all

variables and memory cells, and are used to derive the taint
status for all values during execution.
B. Dynamic Taint Policies
A taint policy species three properties: how new taint is
introduced to a program, how taint propagates as instructions
execute, and how taint is checked during execution.
Taint Introduction. Taint introduction rules specify how
taint is introduced into a system. The typical convention is
to initialize all variables, memory cells, etc. as untainted.
In SIMPIL, we only have a single source of user input:
the get input() call. In a real implementation, get input()
represents values returned from a system call, return values
from a library call, etc. A taint policy will also typically
distinguish between different input sources. For example, an
internet-facing network input source may always introduce
taint, while a le descriptor that reads from a trusted
conguration le may not [2, 50, 65]. Further, specic taint
sources can be tracked independently, e.g.,

can map not

just the bit indicating taint status, but also the source.
Taint Propagation. Taint propagation rules specify the taint
status for data derived from tainted or untainted operands.
Since taint is a bit, propositional logic is usually used to
express the propagation policy, e.g., t
1
t
2
indicates the
result is tainted if t
1
is tainted or t
2
is tainted.
, e
1
v
1
. . . , e
i
v
i

= [x
1
v
1
, . . . , x
i
v
i
] pc

= [f] = [pc

]
, , , , , , pc, call f(e
1
,. . . ,e
i
) (pc + 1) :: , , , ,

, :: , pc

,
CALL
= [pc

]
pc

, , , , ,

, pc, return

, , , ,

, pc

,
RET
, e v v , dom() s = disassemble([v])

= [v s] =

[v]
, , , pc, jmp e

, , , v,
GENCODE
Figure 4: Example operational semantics for adding support for call-by-value function calls and dynamically generated code.
Component Policy Check
P
input
(), P
bincheck
(), P
memcheck
() T
P
const
() F
P
unop
(t), P
assign
(t) t
P
binop
(t
1
, t
2
) t
1
t
2
P
mem
(t
a
, t
v
) t
v
P
condcheck
(t
e
, t
a
) t
a
P
gotocheck
(t
a
) t
a
Table III: A typical tainted jump target policy for detecting
attacks. A dot () denotes an argument that is ignored. A
taint status is converted to a boolean value in the natural
way, e.g., T maps to true, and F maps to false.
Taint Checking. Taint status values are often used to
determine the runtime behavior of a program, e.g., an attack
detector may halt execution if a jump target address is
tainted. In SIMPIL, we perform checking by adding the
policy to the premise of the operational semantics. For
instance, the T-GOTO rule uses the P
gotocheck
(t) policy.
P
gotocheck
(t) returns T if it is safe to perform a jump
operation when the target address has taint value t, and
returns F otherwise. If F is returned, the premise for the
rule is not met and the machine terminates abnormally
(signifying an exception).
C. A Typical Taint Policy
A prototypical application of dynamic taint analysis is
attack detection. Table III shows a typical attack detection
policy which we call the tainted jump policy. In order to be
concrete when discussing the challenges and opportunities in
taint analysis, we often contrast implementation choices with
respect to this policy. We stress that although the policy is
designed to detect attacks, other applications of taint analysis
are typically very similar.
The goal of the tainted jump policy is to protect a
potentially vulnerable program from control ow hijacking
attacks. The main idea in the policy is that an input-derived
value will never overwrite a control-ow value such as a
return address or function pointer. A control ow exploit,
however, will overwrite jump targets (e.g., return addresses)
with input-derived values. The tainted jump policy ensures
safety against such attacks by making sure tainted jump
targets are never used.
The policy introduces taint into the system by marking
all values returned by get input() as tainted. Taint is then
propagated through the program in a straightforward manner,
e.g., the result of a binary operation is tainted if either
operand is tainted, an assigned variable is tainted if the right-
hand side value is tainted, and so on.
Example 3. Table IV shows the taint calculations at each
step of the execution for the following program:
1 x : = 2 ge t i nput ( )
2 y : = 5 + x
3 got o y
On line 1, the executing program receives input, assumed
to be 20, and multiplies by 2. Since all input is marked as
tainted, 2 get input() is also tainted via T-BINOP, and
x is marked in

as tainted via T-ASSIGN. On line 2,

x (tainted) is added to y (untainted). Since one operand is
tainted, y is marked as tainted in

. On line 3, the program

jumps to y. Since y is tainted, the T-GOTO premise for P
is not satised, and the machine halts abnormally.
Different Policies for Different Applications. Different
applications of taint analysis can use different policy de-
cisions. As we will see in the next section, the typical
taint policy described in Table III is not appropriate for
all application domains, since it does not consider whether
memory addresses are tainted. Thus, it may miss some
attacks. We discuss alternatives to this policy in the next
section.
D. Dynamic Taint Analysis Challenges and Opportunities
There are several challenges to using dynamic taint
analysis correctly, including:
Tainted Addresses. Distinguishing between memory
addresses and cells is not always appropriate.
Undertainting. Dynamic taint analysis does not prop-
erly handle some types of information ow.
v is input from src

, , get input(src) v, P
input
(src))
T-INPUT

, , v v, P
const
())
T-CONST

, , var [var],

[var])
T-VAR

, , e v, t)

, , load e [v], P
mem
(t,

[v]))
T-LOAD

, , e v, t)

, ,
u
e
u
v, P
unop
(t))
T-UNOP

, , e
1
v
1
, t
1
)

, , e
2
v
2
, t
2
) P
bincheck
(t
1
, t
2
, v
1
, v
2
,
b
) = T

, , e
1

b
e
2
v
1

b
v
2
, P
binop
(t
1
, t
2
))
T-BINOP

, , e v, t)

= [var v]

[var P
assign
(t)] = [pc + 1]

, , , , pc, var := e

, , ,

, pc + 1,
T-ASSIGN
= [pc + 1] P
memcheck
(t
1
, t
2
) = T

, , e
1
v
1
, t
1
)

, , e
2
v
2
, t
2
)

= [v
1
v
2
]

[v
1
P
mem
(t
1
, t
2
)]

, , , , pc, store(e
1
, e
2
)

, ,

, , pc + 1,
T-STORE

, , e 1, t) = [pc + 1]

, , , , pc, assert(e)

, , , , pc + 1,
T-ASSERT

, , e 1, t
1
)

, , e
1
v
1
, t
2
) P
condcheck
(t
1
, t
2
) = T = [v
1
]

, , , , pc, if e then goto e

1
else goto e
2

, , , , v
1
,
T-TCOND

, , e 0, t
1
)

, , e
2
v
2
, t
2
) P
condcheck
(t
1
, t
2
) = T = [v
2
]

, , , , pc, if e then goto e

1
else goto e
2

, , , , v
2
,
T-FCOND

, , e v
1
, t) P
gotocheck
(t) = T = [v
1
]

, , , , pc, goto e

, , , , v
1
,
T-GOTO
Figure 5: Modied operational semantics of SIMPIL that enforce a taint policy P. T denotes true.
Line # Statement

Rule pc
start 1
1 x := 2*get input() x 40 x T T-ASSIGN 2
2 y := 5 + x x 40, y 45 x T, y T T-ASSIGN 3
3 goto y x 40, y 45 x T, y T T-GOTO error
Table IV: Taint calculations for example program. T denotes tainted.
Overtainting. Deciding when to introduce taint is often
easier than deciding when to remove taint.
Time of Detection vs. Time of Attack. When used
for attack detection, dynamic taint analysis may raise
an alert too late.
Table V summarizes the alternate policies proposed for
addressing some of these challenges in particular scenarios.
In the remainder of the section we discuss the advantages
and disadvantages of these policy choices, and detail com-
mon implementation details and pitfalls.
Tainted Addresses. Memory operations involve two values:
the address of the memory cell being referenced, and the
value stored in that cell. The tainted jump policy in Ta-
ble III independently tracks the taint status of addresses and
memory cells separately. This policy is akin to the idea that
the taint status of a pointer (in this case, an address) and
the object pointed to (in this case, the memory cell) are
independent [32].
Example 4. Given the tainted jump policy, consider the
Policy Substitutions
Tainted Value Pmem(ta, tv) tv
Tainted Addresses Pmem(ta, tv) ta tv
Control Dependent Not possible
Tainted Overow P
bincheck
(t1, t2, v1, v2,
b
) (t1 t2) overows(v1
b
v2)
Table V: Alternate taint analysis policy choices.
following program:
1 x : = ge t i nput ( )
2 y : = l oad ( z + x )
3 got o y
The user provides input to the program that is used as
a table index. The result of the table lookup is then used
as the target address for a jump. Assuming addresses are
of some xed-width (say 32-bits), the attacker can pick an
appropriate value of x to address any memory cell she
wishes. As a result, the attacker can jump to any value
in memory that is untainted. In many programs this would
allow the user to violate the intended control ow of the
program, thus creating a security violation.
The tainted jump policy applied to the above program
still allows an attacker to jump to untainted, yet attacker-
determined locations. This is an example of undertaint by
the policy. This means that the tainted jump policy may miss
an attack.
One possible x is to use the tainted addresses policy
shown in Table V. Using this policy, a memory cell is tainted
if either the memory cell value or the memory address is
tainted. TaintCheck [50], a dynamic taint analysis engine
for binary code, offers such an option.
The tainted address policy, however, also has issues. For
example, the tcpdump program has legitimate code similar
to the program above. In tcpdump, a network packet is rst
read in. The rst byte of the packet is used as an index into
a function pointer table to print the packet type, e.g., if byte
0 of the packet is 4, the IPv4 printer is selected and then
called. In the above code z represents the base address of
the function call table, and x is the rst byte of the packet.
Thus, the tainted address modication would cause every
non-trivial run of tcpdump to raise a taint error. Other code
constructs, such as switch statements, can cause similar table
lookup problems.
The tainted address policy may nd additional taint ows,
but may also overtaint. On the other hand, the tainted jump
policy can lead to undertaint. In security applications, such
as attack detection, this dichotomy means that the attack
detector either misses some exploits (i.e., false negatives) or
reports safe executions as bad (i.e., false positives).
Control-ow taint. Dynamic taint analysis tracks data ow
taint. However, information ow can also occur through
control dependencies.
Informally, a statement s
2
is control-dependent on state-
ment s
1
if s
1
controls whether or not s
2
will execute. A
more precise denition of control-dependency that uses post-
dominators can be found in [30]. In SIMPIL, only indirect
and conditional jumps can cause control dependencies.
Example 5. Consider the following program:
1 x : = ge t i nput ( )
2 i f x = 1 t hen got o 3 e l s e got o 4
3 y : = 1
4 z : = 42
The assignment to y is control-dependent on line 2, since
the branching outcome determines whether or not line 3 is
executed. The assignment to z is not control-dependent on
line 2, since z will be assigned the value 42 regardless of
which branch is taken.
If you do not compute control dependencies, you cannot
determine control-ow based taint, and the overall analysis
may undertaint. Unfortunately, pure dynamic taint analysis
cannot compute control dependencies, thus cannot accu-
rately determine control-ow-based taint. The reason is sim-
ple: reasoning about control dependencies requires reasoning
about multiple paths, and dynamic analysis executes on a
single path at a time. In the above example, any single
execution will not be able to tell that the value of y is
control-dependent and z is not.
There are several possible approaches to detecting control-
dependent taint:
1) Supplement dynamic analysis with static analysis.
Static analysis can compute control dependencies,
and thus can be used to compute control-dependent
taint [3, 21, 53]. Static analysis can be applied over
the entire program, or over a collection of dynamic
analysis runs.
2) Use heuristics, making an application-specic choice
whether to overtaint or undertaint depending upon the
scenario [21, 50, 64].
Sanitization. Dynamic taint analysis as described only adds
taint; it never removes it. This leads to the problem of taint
spread: as the program executes, more and more values
become tainted, often with less and less taint precision.
A signicant challenge in taint analysis is to identify when
taint can be removed from a value. We call this the taint
sanitization problem. One common example where we wish
to sanitize is when the program computes constant functions.
A typical example in x86 code is b = a a. Since b will
always equal zero, the value of b does not depend upon a.
x86 programs often use this construct to zero out registers.
A default taint analysis policy, however, will identify b as
tainted whenever a is tainted. Some taint analysis engines
check for well-known constant functions, e.g., TEMU [2]
and TaintCheck [50] can recognize the above xor case.
The output of a constant function is completely indepen-
dent of user input. However, some functions allow users
to affect their output without allowing them to choose an
arbitrary output value. For example, it is computationally
hard to nd inputs that will cause a cryptographically secure
hash function to output an arbitrary value. Thus, in some
application domains, we can treat the output of functions
like cryptographic hash functions as untainted. Newsome
et al. have explored how to automatically recognize such
cases by quantifying how much control users can exert on
a functions output [49].
Finally, there may be application-dependent sanitization.
For example, an attack detector may want to untaint values if
the program logic performs sanitization itself. For example,
if the application logic checks that an index to an array is
within the array size, the result of the table lookup could be
considered untainted.
Time of Detection vs Time of Attack. Dynamic taint
analysis be used to ag an alert when tainted values are
used in an unsafe way. However, there is no guarantee that
the program integrity has not been violated before this point.
One example of this problem is the time of detection/time
of attack gap that occurs when taint analysis is used for
attack detection. Consider a typical return address overwrite
exploit. In such attacks, the user can provide an exploit
that overwrites a function return address with the address
of attacker-supplied shellcode. The tainted jump policy
will catch such attacks because the return address will
become tainted during overwrite. The tainted jump policy
is frequently used to detect such attacks against potentially
unknown vulnerabilities. [2123, 50, 64]
Note, however, that the tainted jump policy does not raise
an error when the return address is rst overwritten; only
when it is later used as a jump target. Thus, the exploit will
not be reported until the function returns. Arbitrary effects
could happen between the time when the return address is
rst overwritten and when the attack is detected, e.g., any
calls made by the vulnerable function will still be made
before an alarm is raised. If these calls have side effects,
e.g., include le manipulation or networking functions, the
effects can persist even after the program is aborted.
The problem is that dynamic taint analysis alone keeps
track of too little information. In a return overwrite attack,
the abstract machine would need to keep track of where
return addresses are and verify that they are not overwritten.
In binary code settings, this is difcult.
value v ::= 32-bit unsigned integer [ exp
::= Contains the current constraints on
symbolic variables due to path choices
Table VI: Changes to SIMPIL to allow forward symbolic
execution.
Another example of the time of detection/time of attack
gap is detecting integer overow attacks. Taint analysis alone
does not check for overow: it just marks which values are
derived from taint sources. An attack detector would need
to add additional logic beyond taint analysis to nd such
problems. For example, the tainted integer overow policy
shown in Table V is the composition of a taint analysis check
and an integer overow policy.
Current taint-based attack detectors [2, 21, 50, 64] typ-
ically exhibit time of detection to time of attack gaps.
BitBlaze [2] provides a set of tools for performing a post
hoc instruction trace analysis on execution traces produced
with their taint infrastructure for post hoc analysis. Post hoc
trace analysis, however, negates some advantages of having
a purely dynamic analysis environment.
IV. FORWARD SYMBOLIC EXECUTION
Forward symbolic execution allows us to reason about
the behavior of a program on many different inputs at
one time by building a logical formula that represents a
program execution. Thus, reasoning about the behavior of
the program can be reduced to the domain of logic.
A. Applications and Advantages
Multiple inputs. One of the advantages of forward symbolic
execution is that it can be used to reason about more than one
input at once. For instance, consider the program in Example
6 only one out of 2
32
possible inputs will cause the
program to take the true branch. Forward symbolic execution
can reason about the program by considering two different
input classes inputs that take the true branch, and those
that take the false branch.
Example 6. Consider the following program:
1 x : = 2 ge t i nput ( )
2 i f x5 == 14 t hen got o 3 e l s e got o 4
3 / / c a t a s t r o p h i c f a i l u r e
4 / / normal be hav i or
Only one input will trigger the failure.
B. Semantics of Forward Symbolic Execution
The primary difference between forward symbolic execu-
tion and regular execution is that when get input() is eval-
uated symbolically, it returns a symbol instead of a concrete
value. When a new symbol is rst returned, there are no
constraints on its value; it represents any possible value.
As a result, expressions involving symbols cannot be fully
evaluated to a concrete value (e.g., s+5 can not be reduced
further). Thus, our language must be modied, allowing a
value to be a partially evaluated symbolic expression. The
changes to SIMPIL to allow forward symbolic execution are
shown in Table VI.
Branches constrain the values of symbolic variables to
the set of values that would execute the path. The up-
dated rules for branch statements are given as S-TCOND
and S-FCOND in Figure 6. For example, if the execu-
tion of the program follows the true branch of if x >
2 then goto e
1
else goto e
2
, then x must contain a value
greater than 2. If execution instead takes the false branch,
then x must contain a value that is not greater than 2.
Similarly, after an assertion statement, the values of symbols
must be constrained such that they satisfy the asserted
expression.
We represent these constraints on symbol assignments
in our operational semantics with the path predicate .
We show how is updated by the language constructs in
Figure 6. At every symbolic execution step, contains the
constraints on the symbolic variables.
C. Forward Symbolic Execution Example
The symbolic execution of Example 6 is shown in Ta-
ble VII. On Line 1, get input() evaluates to a fresh sym-
bol s, which initially represents any possible user input. s
is doubled and then assigned to x. This is reected in the
updated .
When forward symbolic execution reaches a branch, as
in Line 2, it must choose which path to take. The strategy
used for choosing paths can signicantly impact the quality
of the analysis; we discuss this later in this section. Table VII
shows the program contexts after symbolic execution takes
both paths (denoted by the use of the S-TCOND and
S-FCOND rules). Notice that the path predicate depends
on the path taken through the program.
D. Forward Symbolic Execution Challenges and Opportu-
nities
Creating a forward symbolic execution engine is concep-
tually a very simple process: take the operational semantics
of the language and change the denition of a value to
include symbolic expressions. However, by examining our
formal denition of this intuition, we can nd several
instances where our analysis breaks down. For instance:
Symbolic Memory. What should we do when the
analysis uses the context whose index must be
a non-negative integer with a symbolic index?
System Calls. How should our analysis deal with
external interfaces such as system calls?
Path Selection. Each conditional represents a branch
in the program execution space. How should we decide
which branches to take?
We address these issues and more below.
Symbolic Memory Addresses. The LOAD and STORE rules
evaluate the expression representing the memory address
to a value, and then get or set the corresponding value
at that address in the memory context . When executing
concretely, that value will be an integer that references a
particular memory cell.
When executing symbolically, however, we must decide
what to do when a memory reference is an expression
instead of a concrete number. The symbolic memory address
problem arises whenever an address referenced in a load
or store operation is an expression derived from user input
instead of a concrete value.
When we load from a symbolic expression, a sound
strategy is to consider it a load from any possible sat-
isfying assignment for the expression. Similarly, a store
to a symbolic address could overwrite any value for a
satisfying assignment to the expression. Symbolic addresses
are common in real programs, e.g., in the form of table
lookups dependent on user input.
Symbolic memory addresses can lead to aliasing issues
even along a single execution path. A potential address
alias occurs when two memory operations refer to the same
address.
Example 7. Consider the following program:
1 s t or e ( addr1 , v )
2 z = l oad ( addr2 )
If addr1 = addr2, then addr1 and addr2 are aliased and
the value loaded will be the value v. If addr1 ,= addr2, then
v will not be loaded. In the worst case, addr1 and addr2
are expressions that are sometimes aliased and sometimes
not.
There are several approaches to dealing with symbolic
references:
One approach is to make unsound assumptions for
removing symbolic addresses from programs. For ex-
ample, Vine [2] can optionally rewrite all memory
addresses as scalars based on name, e.g., Example 7
would be rewritten as:
1 mem addr1 = v
2 z = mem addr2
The appropriateness of such unsound assumptions
varies depending on the overall application domain.
Let subsequent analysis steps deal with them. For
example, many application domains pass the generated
formulas to a SMT solver [6, 33]. In such domains
we can let the SMT solver reason about all possible
v is a fresh symbol
, get input() v
S-INPUT
, e e

= e

= [pc + 1]
, , , , pc, assert(e)

, , , , pc + 1,
S-ASSERT
, e e

e
1
v
1

= (e

= 1) = [v
1
]
, , , , pc, if e then goto e
1
else goto e
2

, , , , v
1
,
S-TCOND
, , e e

e
2
v
2

= (e

= 0) = [v
2
]
, , , , pc, if e then goto e
1
else goto e
2

, , , , v
2
,
S-FCOND
Figure 6: Operational semantics of the language for forward symbolic execution.
Statement Rule pc
start {} true 1
x := 2*get input() {x 2 s} true S-ASSIGN 2
if x-5 == 14 goto 3 else goto 4 {x 2 s} [(2 s) 5 == 14] S-TCOND 3
if x-5 == 14 goto 3 else goto 4 {x 2 s} [(2 s) 5 == 14] S-FCOND 4
Table VII: Simulation of forward symbolic execution.
aliasing relationships. In order to logically encode sym-
bolic addresses, we must explicitly name each memory
update. Example 7 can be encoded as:
mem
1
= (mem
0
with mem
0
[addr
1
] = v) z =
mem
1
[addr
2
]
The above formula should be read as mem
1
is the same
as mem
0
except at index addr
1
, where the value is v.
Subsequent reads are performed on mem
1
.
Perform alias analysis. One could try to reason about
whether two references are pointing to the same address
by performing alias analysis. Alias analysis, however,
is a static or ofine analysis. In many application
domains, such as recent work in automated test-case
generation [9, 1719, 29, 34, 35, 57], fuzzing [36], and
malware analysis [11, 45], part of the allure of forward
symbolic execution is that it can be done at run-time.
In such scenarios, adding a static analysis component
may be unattractive.
Unfortunately, most previous work does not specically
address the problem of symbolic addresses. KLEE and its
predecessors [17, 19] perform a mix of alias analyses and
letting the SMT solver worry about aliasing. DART [36] and
CUTE [57] only handle formulas that are linear constraints
and therefore cannot handle general symbolic references.
However, when a symbolic memory access is a linear ad-
dress, they can solve the system of linear equations to see if
they may be aliased. To the best of our knowledge, previous
work in malware analysis has not addressed the issue. Thus,
malware authors could intentionally create malware that
includes symbolic memory operations to thwart analysis.
Path Selection. When forward symbolic execution encoun-
ters a branch, it must decide which branch to follow rst.
We call this the path selection problem.
We can think of a forward symbolic execution of an entire
program as a tree in which every node represents a particular
instance of the abstract machine (e.g., , , , , pc, ). The
analysis begins with only a root node in the tree. However,
every time the analysis must fork, such as when a conditional
jump is encountered, it adds as children all possible forked
states to the current node. We can further explore any leaf
node in the tree that has not terminated. Thus, forward
symbolic execution needs a strategy for choosing which state
to explore next. This choice is important, because loops with
symbolic conditions may never terminate. If an analysis tries
to explore such a loop in a nave manner, it might never
explore other branches in the state tree.
Loops can cause trees of innite depth. Thus, the handling
of loops are an integral component in the path-selection
strategy. For example, suppose n is input in:
whi l e ( 3
n
+ 4
n
== 5
n
) { n++; . . . }
Exploring all paths in this program is infeasible. Although
we know mathematically there is no satisfying answer to the
branch guard other than 2, the forward symbolic execution
algorithm does not. The formula for one loop iteration will
include the branch guard 3
n
+4
n
= 5
n
, the second iteration
will have the branch guard 3
n+1
+4
n+1
= 5
n+1
, and so on.
Typically, forward symbolic execution will provide an upper
bound on loop iterations to consider in order to keep it from
getting stuck in such potentially innite or long-running
loops.
Approaches to the path selection problem include:
1) Depth-First Search. DFS employs the standard depth-
rst search algorithm on the state tree. The primary
disadvantage of DFS is that it can get stuck in non-
terminating loops with symbolic conditions if no max-
imum depth is specied. If this happens, then no other
branches will be explored and code coverage will be
low. KLEE [17] and EXE [19] can implement a DFS
search with a congurable maximum depth for cyclic
paths to prevent innite loops.
2) Concolic Testing. Concolic testing [29, 37, 57] uses
concrete execution to produce a trace of a program
execution. Forward symbolic execution then follows
the same path as the concrete execution. The analysis
can optionally generate concrete inputs that will force
the execution down another path by choosing a con-
ditional and negating the constraints corresponding to
that conditional statement.
Since forward symbolic execution can be magnitudes
slower than concrete execution, one variant of concolic
testing uses a single symbolic execution to generate
many concrete testing inputs. This search strategy is
called generational search [37].
3) Random Paths. A random path strategy is also im-
plemented by KLEE [17] where the forward symbolic
execution engine selects states by randomly traversing
the state tree from the root until it reaches a leaf node.
The random path strategy gives a higher weight to
shallow states. This prevents executions from getting
stuck in loops with symbolic conditions.
4) Heuristics. Additional heuristics can help select states
that are likely to reach uncovered code. Sample heuris-
tics include the distance from the current point of ex-
ecution to an uncovered instruction, and how recently
the state reached uncovered code in the past.
Symbolic Jumps. The premise of the GOTO rule requires the
address expression to evaluate to a concrete value, similar
to the LOAD and STORE rules. However, during forward
symbolic execution the jump target may be an expression
instead of a concrete location. We call this the symbolic
jump problem. One common cause of symbolic jumps are
jump tables, which are commonly used to implement switch
statements.
A signicant amount of previous work in forward sym-
bolic execution does not directly address the symbolic jump
problem [9, 1719, 29, 36, 37, 57]. In some domains, such
as automated test-case generation, leaving symbolic jumps
out-of-scope simply means a lower success rate. In other
domains, such as in malware analysis, widespread use of
symbolic jumps would pose a challenge to current automated
malware reverse engineering [11, 12, 45].
Three standard ways to handle symbolic jumps are:
1) Use concrete and symbolic (concolic) analysis [57]
to run the program and observe an indirect jump
target. Once the jump target is taken in the concrete
execution, we can perform symbolic execution of the
concrete path. One drawback is that it becomes more
difcult to explore the full-state space of the program
because we only explore known jump targets. Thus,
code coverage can suffer.
2) Use a SMT solver. When we reach a symbolic jump
to e with path predicate , we can ask the SMT solver
for a satisfying answer to e. A satisfying answer
includes an assignment of values to variables in e,
which is a concrete jump target. If we are interested
in more satisfying answers, we add to the query to
return values different from those previously seen. For
example, if the rst satisfying answer is n, we query
for e n. Although querying a SMT solver is a
perfectly valid solution, it may not be as efcient as
other options that take advantage of program structure,
such as static analysis.
3) Use static analysis. Static analysis can reason about
the entire program to locate possible jump targets. In
practice, source-level indirect jump analyses typically
take the form of pointer analyses. Binary-level jump
static analyses reason about what values may be ref-
erenced in jump target expressions [4]. For example,
function pointer tables are typically implemented as a
table of possible jump targets.
Example 8. Consider the following program:
1 b y t e s : = ge t i nput ( )
2 p : = l oad ( f u n c t a b l e + b y t e s )
3 got o p
Since functable is statically known, and the size
of the table is xed, a static analysis can determine
that the range of targets is load(functable+x) where
x[ 0 x k and k is the size of the table.
Handling System and Library Calls. In concrete execu-
tion, system calls introduce input values to a program. Our
language models such calls as get input(). We refer to calls
that are used as input sources as system-level calls. For
example, in a C program system-level calls may correspond
to calling library functions such as read. In a binary
program, system-level calls may correspond to issuing an
interrupt.
Some system-level calls introduce fresh symbolic vari-
ables. However, they can also have additional side effects.
For example, read returns fresh symbolic input and updates
an internal pointer to the current read le position. A
subsequent call to read should not return the same input.
One approach to handling system-level calls is to create
summaries of their side effects [13, 17, 19]. The summaries
are models that describe the side effects that occur whenever
the respective code is called concretely. The advantage
of summaries is that they can abstract only those details
necessary for the application domain at hand. However, they
typically need to be generated manually.
A different approach when using concolic execution [57]
is to use values returned from system calls on previous con-
crete executions in symbolic execution. For example, if dur-
ing a concrete execution sys_call() returns 10, we use
10 during forward symbolic execution of the corresponding
sys_call(). The central advantages of a concolic-based
approach is it is simple, easy to implement, and sidesteps
the problem of reasoning about how a program interacts with
its environment. Any analysis that uses concrete values will
not, by denition, provide a complete analysis with respect
to system calls. In addition, the analysis may not be sound,
as some calls do not always return the same result even
when given the same input. For example, gettimeofday
returns a different time for each call.
Performance. A straightforward implementation of forward
symbolic execution will lead to a) a running time exponential
in the number of program branches, b) an exponential
number of formulas, and c) an exponentially-sized formula
per branch.
The running time is exponential in the number of branches
because a new interpreter is forked off at each branch point.
The exponential number of formulas directly follows, as
there is a separate formula at each branch point.
Example 9. Consider the following program:
1 x : = ge t i nput ( )
2 x : = x + x
3 x : = x + x
4 x : = x + x
5 i f e t hen S1 e l s e S2
6 i f e2 t hen S3 e l s e S4
7 i f e3 t hen S5 e l s e S6
8 a s s e r t ( x < 10) ;
S
i
are statements executed in the branches. There are 8
paths through this program, so there will be 8 runs of the
interpreter and 8 path predicates.
The size of a formula even for a single program path
may be exponential in size due to substitution. During
both concrete and symbolic evaluation of an expression e,
we substitute all variables in e with their value. However,
unlike concrete evaluation, the result of evaluating e is
not of constant size. Example 9 demonstrates the problem
with x. If during forward symbolic execution get input()
returns s, after executing the three assignments will map
x s + s + s + s + s + s + s + s.
In practice, we can mitigate these problems in a number
of ways:
Use more and faster hardware. Exploring multiple
paths and solving formulas for each path is inherently
parallelizable.
Exponential blowup due to substitution can be handled
by giving each variable assignment a unique name, and
then using the name instead of performing substitution.
For example, the assignments to x can be written as:
x
1
= x
0
+ x
0
x
2
= x
1
+ x
1
x
3
= x
2
+ x
2
Identify redundancies between formulas and make them
more compact. In the above example, the path predi-
cates for all formulas will include the rst four state-
ments. Bouncer [22] uses heuristics to identify com-
monalities in the formulas during signature generation.
Godefroid et al. [37] perform post hoc optimizations of
formulas to reduce their size.
Identify independent subformulas. Cadar et al. identify
logically independent subformulas, and query each sub-
formula separately in EXE and KLEE [17, 19]. They
also implement caching on the SMT solver such that
if the same formula is queried multiple times they can
use the cached value instead of solving it again. For
example, all path predicates for Example 9 contain as
a prex the assignments to x. If these assignments
are independent of other parts of the path predicate,
KLEEs cache will solve the subformula once, and then
use the same returned value on the other 8 paths. Cadar
et al. found caching instrumental in scaling forward
symbolic execution [19].
One alternative to forward symbolic execution is to use
the weakest precondition [27] to calculate the formula.
Formulas generated with weakest preconditions require
only O(n
2
) time and will be at most O(n
2
) in size,
for a program of size n [15, 31, 43]. Unlike forward
symbolic execution, weakest preconditions normally
process statements from last to rst. Thus, weakest
preconditions are implemented as a static analysis.
However, a recent algorithm for efciently computing
the weakest precondition in any direction can be used
as a replacement for applications that build formulas
using symbolic execution [41]. The program must be
converted to dynamic single assignment form before
using this new algorithm.
Mixed Execution. Depending on the application domain
and the type of program, it may be appropriate to limit
symbolic input to only certain forms of input. For instance,
in automated test generation of a network daemon, it may
not make sense to consider the server conguration le
symbolically in many cases, a potential attacker will
not have access to this le. Instead, it is more important
to handle network packets symbolically, since these are the
primary interface of the program. Allowing some inputs to
be concrete and others symbolic is called mixed execution.
Our language can be extended to allow mixed execution
by concretizing the argument of the get input() expression,
e.g., get input(le), get input(network), etc.
Besides appropriately limiting the scope of the analysis,
mixed execution enables calculations involving concrete
values to be done on the processor. This allows portions
of the program that do not rely on user input to potentially
run at the speed of concrete execution.
V. RELATED WORK
A. Formalization and Systematization
The use of operational semantics to dene dynamic
security mechanisms is not new [38, 46]. Other formal
mechanisms for dening such policies exist as well [55].
Despite these tools, prior work has largely avoided formaliz-
ing dynamic taint analysis and forward symbolic execution.
Some analysis descriptions dene a programming language
similar to ours, but only informally discuss the semantics
of the analyses [29, 36, 64]. Such informal descriptions of
semantics can lead to ambiguities in subtle corner cases.
B. Applications
In the remainder of this section, we discuss applications of
dynamic taint analysis and forward symbolic execution. Due
to the scope of related work, we cite the most representative
work.
Automatic Test-case Generation. Forward symbolic execu-
tion has been used extensively to achieve high code-coverage
in automatic test-case generation [1719, 29, 36, 37, 57].
Many of these tools also automatically nd well-dened
bugs, such as assertion errors, divisions by zero, NULL
pointer dereferences, etc.
Automatic Filter Generation. Intrusion prevention/detec-
tion systems use input lters to block inputs that trigger
known bugs and vulnerabilities. Recent work has shown
that forward symbolic execution path predicates can serve
as accurate input lters for such systems [1315, 22, 23, 44,
47, 48].
Automatic Network Protocol Understanding. Dynamic
taint analysis has been used to automatically understand
the behavior of network protocols [16, 63] when given an
implementation of the protocol.
Malware Analysis. Automatic reverse-engineering tech-
niques for malware have used forward symbolic execu-
tion [11, 12, 45] and dynamic taint analysis [7, 8, 28, 58, 65]
to analyze malware behavior. Taint analysis has been used
to track when code unpacking is used in malware [65].
Web Applications. Many analyses of Web applications
utilize dynamic taint analysis to detect common attacks such
as SQL injections [5, 39, 40, 51, 56, 62] and cross-site
scripting attacks [54, 56, 61]. Some researchers have also
combined dynamic taint analysis with static analysis to nd
bugs in Web applications [5, 62]. Sekar [56], introduced taint
inference, a technique that applies syntax and taint-aware
policies to block injection attacks.
Taint Performance & Frameworks. The ever-growing need
for more efcient dynamic taint analyses was initially met
by binary instrumentation frameworks [21, 52]. Due to the
high overhead of binary instrumentation techniques, more
efcient compiler-based [42, 64] and hardware-based [25,
26, 59, 60] approaches were later proposed. Recent results
show that a dynamic software-based approach, augmented
by static analysis introduce minimal overhead, and thus can
be practical [20].
Extensions to Taint Analysis. Our rules assume data is
either tainted or not. For example, Newsome et al. have
proposed a generalization of taint analysis that quanties the
inuence that an input has on a particular program statement
based on channel capacity [49].
VI. CONCLUSION
Dynamic program analyses have become increasingly
popular in security. The two most common dynamic
taint analysis and forward symbolic execution are used
in a variety of application domains. However, despite their
widespread usage, there has been little effort to formally
dene these analyses and summarize the critical issues that
arise when implementing them in a security context.
In this paper, we introduced a language for demonstrating
the critical aspects of dynamic taint analysis and forward
symbolic execution. We dened the operational semantics
for our language, and leveraged these semantics to formally
dene dynamic taint analysis and forward symbolic exe-
cution. We used our formalisms to highlight challenges,
techniques and tradeoffs when using these techniques in a
security setting.
VII. ACKNOWLEDGEMENTS
We would like to thank Dawn Song and the BitBlaze team
for their useful ideas and advice on dynamic taint analysis
and forward symbolic execution. We would also like to thank
our shepherd Andrei Sabelfeld, JongHyup Lee, Ivan Jager,
and our anonymous reviewers for their useful comments and
suggestions. This work is supported in part by CyLab at
Carnegie Mellon under grant DAAD19-02-1-0389 from the
Army Research Ofce. The views expressed herein are those
of the authors and do not necessarily represent the views of
our sponsors.
REFERENCES
[1] Binary analysis platform (BAP). http://bap.ece.cmu.edu.
[2] BitBlaze binary analysis project. http://bitblaze.cs.berkeley.
edu, 2007.
[3] Andrew Appel. Modern Compiler Implementation in ML.
Cambridge University Press, 1998.
[4] Gogul Balakrishnan. WYSINWYX: What You See Is Not What
You eXecute. PhD thesis, Computer Science Department,
University of Wisconsin at Madison, August 2007.
[5] D. Balzarotti, M. Cova, V. Felmetsger, N. Jovanovic, E. Kirda,
C. Kruegel, and G. Vigna. Saner: Composing static and
dynamic analysis to validate sanitization in web applications.
In Proceedings of the IEEE Symposium on Security and
Privacy, 2008.
[6] Clark Barrett and Sergey Berezin. CVC Lite: A new imple-
mentation of the cooperating validity checker. In Proceedings
of the Conference on Computer Aided Verication, 2004.
[7] Ulrich Bayer, Paolo Milani Comparetti, Clemens Hlauschek,
Christopher Kruegel, and Engin Kirda. Scalable, behavior-
based malware clustering. In Proceedings of the Network
and Distributed System Security Symposium, 2009.
[8] Ulrich Bayer, Andreas Moser, Christopher Kruegel, and Engin
Kirda. Dynamic analysis of malicious code. Journal in
Computer Virology, 2(1):6677, 2006.
[9] P. Boonstoppel, C. Cadar, and D. Engler. RWset: Attack-
ing path explosion in constraint-based test generation. In
Proceedings of the International Conference on Tools and
Algorithms for Construction and Analysis of Systems, 2008.
[10] David Brumley, Juan Caballero, Zhenkai Liang, James New-
some, and Dawn Song. Towards automatic discovery of
deviations in binary implementations with applications to
error detection and ngerprint generation. In Proceedings
of the USENIX Security Symposium, August 2007.
[11] David Brumley, Cody Hartwig, Min Gyung Kang, Zhenkai
Liang, James Newsome, Pongsin Poosankam, and Dawn
Song. Bitscope: Automatically dissecting malicious binaries.
Technical Report CS-07-133, School of Computer Science,
Carnegie Mellon University, March 2007.
[12] David Brumley, Cody Hartwig, Zhenkai Liang, James New-
some, Pongsin Poosankam, Dawn Song, and Heng Yin.
Automatically identifying trigger-based behavior in malware.
In Botnet Detection, volume 36 of Countering the Largest
Security Threat Series: Advances in Information Security.
Springer-Verlag, 2008.
[13] David Brumley, James Newsome, Dawn Song, Hao Wang,
and Somesh Jha. Towards automatic generation of
vulnerability-based signatures. In Proceedings of the IEEE
Symposium on Security and Privacy, pages 216, 2006.
[14] David Brumley, James Newsome, Dawn Song, Hao Wang,
and Somesh Jha. Theory and techniques for automatic gener-
ation of vulnerability-based signatures. IEEE Transactions on
Dependable and Secure Computing, 5(4):224241, October
2008.
[15] David Brumley, Hao Wang, Somesh Jha, and Dawn
Song. Creating vulnerability signatures using weakest pre-
conditions. In Proceedings of the IEEE Computer Security
Foundations Symposium, 2007.
[16] Juan Caballero, Heng Yin, Zhenkai Liang, and Dawn Song.
Polyglot: Automatic extraction of protocol message format
using dynamic binary analysis. In Proceedings of the ACM
Conference on Computer and Communications Security, Oc-
tober 2007.
[17] Cristian Cadar, Daniel Dunbar, and Dawson Engler. Klee:
Unassisted and automatic generation of high-coverage tests
for complex systems programs. In Proceedings of the
USENIX Symposium on Operating System Design and Im-
plementation, 2008.
[18] Cristian Cadar and Dawson Engler. Execution generated test
cases: How to make systems code crash itself. In Proceedings
of the International SPIN Workshop on Model Checking of
Software, 2005.
[19] Cristian Cadar, Vijay Ganesh, Peter Pawlowski, David Dill,
and Dawson Engler. EXE: A system for automatically
generating inputs of death using symbolic execution. In
Proceedings of the ACM Conference on Computer and Com-
munications Security, October 2006.
[20] Walter Chang and Calvin Lin. Efcient and extensible
security enforcement using dynamic data ow analysis. In
Proceedings of the ACM Conference on Computer and Com-
munications Security, pages 3950, 2008.
[21] James Clause, Wanchun Li, and Alessandro Orso. Dytan: a
generic dynamic taint analysis framework. In International
Symposium on Software Testing and Analysis, 2007.
[22] Manuel Costa, Miguel Castro, Lidong Zhou, Lintao Zhang,
and Marcus Peinado. Bouncer: Securing software by blocking
bad input. In Proceedings of the ACM Symposium on
Operating System Principles, October 2007.
[23] Manuel Costa, Jon Crowcroft, Miguel Castro, Antony Row-
stron, Lidong Zhou, Lintao Zhang, and Paul Barham. Vigi-
lante: End-to-end containment of internet worms. In Proceed-
ings of the ACM Symposium on Operating System Principles,
2005.
[24] Jedidiah Crandall, Zhendong Su, S. Felix Wu, and Frederic
Chong. On deriving unknown vulnerabilities from zero-day
polymorphic and metamorphic worm exploits. In Proceedings
of the ACM Conference on Computer and Communications
Security, 2005.
[25] Jedidiah R. Crandall and Fred Chong. Minos: Architectural
support for software security through control data integrity.
In Proceedings of the International Symposium on Microar-
chitecture, December 2004.
[26] Michael Dalton, Hari Kannan, and Christos Kozyrakis. Rak-
sha: a exible information ow architecture for software
security. In Proceedings of the 34th International Symposium
on Computer Architecture, 2007.
[27] E.W. Dijkstra. A Discipline of Programming. Prentice Hall,
Englewood Cliffs, NJ, 1976.
[28] Manuel Egele, Christopher Kruegel, Engin Kirda, Heng Yin,
and Dawn Song. Dynamic spyware analysis. In Proceedings
of the USENIX Annual Technical Conference, June 2007.
[29] Michael Emmi, Rupak Majumdar, and Koushik Sen. Dynamic
test input generation for database applications. In Interna-
tional Symposium on Software Testing and Analysis, 2007.
[30] Jeanne Ferrante, Karl J. Ottenstein, and Joe D. Warren. The
program dependence graph and its use in optimization. ACM
Transactions on Programming Languages and Systems, 1987.
[31] C. Flanagan and J.B. Saxe. Avoiding exponential explosion:
Generating compact verication conditions. In Proceedings
of the Symposium on Principles of Programming Languages,
2001.
[32] Jeffrey Foster, Manuel Fahndrich, and Alexander Aiken. A
theory of type qualiers. In Proceedings of the ACM Confer-
ence on Programming Language Design and Implementation,
1999.
[33] Vijay Ganesh and David L. Dill. A decision procedure for
bit-vectors and arrays. In Proceedings on the Conference on
Computer Aided Verication, 2007.
[34] Patrice Godefroid. Compositional dynamic test generation. In
Proceedings of the Symposium on Principles of Programming
Languages, 2007.
[35] Patrice Godefroid, Adam Kiezun, and Michael Levin.
Grammar-based whitebox fuzzing. In Proceedings of the
ACM Conference on Programming Language Design and
Implementation, 2008.
[36] Patrice Godefroid, Nils Klarlund, and Koushik Sen. DART:
Directed automated random testing. In Proceedings of the
ACM Conference on Programming Language Design and
Implementation, 2005.
[37] Patrice Godefroid, Michael Levin, and David Molnar. Auto-
mated whitebox fuzz testing. In Proceedings of the Network
and Distributed System Security Symposium, February 2008.
[38] Patrice Godefroid, Michael Levin, and David A. Molnar.
Active property checking. In Proceedings of the ACM
international conference on Embedded software, 2008.
[39] Vivek Haldar, Deepak Chandra, and Michael Franz. Dynamic
taint propagation for java. In Proceedings of the Annual
Computer Security Applications Conference, pages 303311,
2005.
[40] William G. J. Halfond, Ro Orso, and Panagiotis Manolios.
Using positive tainting and syntax-aware evaluation to counter
SQL injection attacks. In Proceedings of the ACM SIGSOFT
Symposium on Foundations of Software Engineering. ACM
Press, 2006.
[41] Ivan Jager and David Brumley. Efcient directionless weak-
est preconditions. Technical Report CMU-CyLab-10-002,
Carnegie Mellon University CyLab, February 2010.
[42] Lap Chung Lam and Tzi-cker Chiueh. A general dynamic in-
formation ow tracking framework for security applications.
In Proceedings of the Annual Computer Security Applications
Conference, 2006.
[43] K. Rustan M. Leino. Efcient weakest preconditions. Infor-
mation Processing Letters, 93(6):281288, 2005.
[44] Zhenkai Liang and R. Sekar. Fast and automated generation of
attack signatures: A basis for building self-protecting servers.
In Proceedings of the ACM Conference on Computer and
Communications Security, 2005.
[45] Andreas Moser, Christopher Kruegel, and Engin Kirda. Ex-
ploring multiple execution paths for malware analysis. In
Proceedings of the USENIX Security Symposium, 2007.
[46] George C. Necula, Scott McPeak, and Westley Weimer.
CCured: type-safe retrotting of legacy code. In Proceedings
of the Symposium on Principles of Programming Languages,
2002.
[47] James Newsome, David Brumley, Dawn Song, Jad Cham-
cham, and Xeno Kovah. Vulnerability-specic execution
ltering for exploit prevention on commodity software. In
Proceedings of the Network and Distributed System Security
Symposium, 2006.
[48] James Newsome, Brad Karp, and Dawn Song. Polygraph:
Automatically generating signatures for polymorphic worms.
In Proceedings of the IEEE Symposium on Security and
Privacy, May 2005.
[49] James Newsome, Stephen McCamant, and Dawn Song. Mea-
suring channel capacity to distinguish undue inuence. In
Proceedings of the ACM Workshop on Programming Lan-
guages and Analysis for Security, 2009.
[50] James Newsome and Dawn Song. Dynamic taint analysis
for automatic detection, analysis, and signature generation
of exploits on commodity software. In Proceedings of
the Network and Distributed System Security Symposium,
February 2005.
[51] Anh Nguyen-tuong, Salvatore Guarnieri, Doug Greene, Jeff
Shirley, and David Evans. Automatically hardening web
applications using precise tainting. In Proceedings of IFIP
International Information Security Conference, 2005.
[52] Feng Qin, Cheng Wang, Zhenmin Li, Ho-Seop Kim,
Yuanyuan Zhou, and Youfeng Wu. Lift: A low-overhead
practical information ow tracking system for detecting se-
curity attacks. In Proceedings of the 39th Annual IEEE/ACM
Symposium on Microarchitecture, 2006.
[53] Andrei Sabelfeld and Andrew C. Myers. Language-based
information-ow security. IEEE Journal on Selected Areas
in Communications, 21:2003, 2003.
[54] Prateek Saxena, Steve Hanna, Pongsin Poosankam, and Dawn
Song. FLAX: systematic discovery of client-side validation
vulnerabilities in rich web applications. Proceedings of the
Network and Distributed System Security Symposium, 2010.
[55] Fred B. Schneider. Enforceable security policies. ACM
Transactions on Information and System Security, 3(1):30
50, February 2000.
[56] R. Sekar. An efcient black-box technique for defeating
web application attacks. In Proceedings of the Network and
Distributed System Security Symposium, 2009.
[57] K. Sen, D. Marinov, and G. Agha. CUTE: A concolic unit
testing engine for C. In Proceedings of the joint meeting of
the European Software Engineering Conference and the ACM
Symposium on the Foundations of Software Engineering,
2005.
[58] Monirul Sharif, Andrea Lanzi, Jonathon Gifn, and Wenke
Lee. Automatic reverse engineering of malware emulators. In
Proceedings of the IEEE Symposium on Security and Privacy,
2009.
[59] G. Edward Suh, Jaewook Lee, and Srinivas Devadas. Secure
program execution via dynamic information ow tracking. In
Proceedings of the International Conference on Architectural
Support for Programming Languages and Operating Systems,
2004.
[60] Shobha Venkataraman, Avrim Blum, and Dawn Song. Limits
of learning-based signature generation with adversaries. In
Proceedings of the Network and Distributed System Security
Symposium, February 2008.
[61] Philipp Vogt, Florian Nentwich, Nenad Jovanovic, Engin
Kirda, Christopher Kruegel, and Giovanni Vigna. Cross-site
scripting prevention with dynamic data tainting and static
analysis. In Proceedings of the Network and Distributed
System Security Symposium, 2007.
[62] Yao wen Huang, Fang Yu, Christian Hang, Chung hung Tsai,
D. T. Lee, and Sy yen Kuo. Securing web application code
by static analysis and runtime protection. In Proceedings of
the 13th conference on World Wide Web, 2004.
[63] Gilbert Wondracek, Paolo Milani Comparetti, Christopher
Kruegel, and Engin Kirda. Automatic network protocol
analysis. In Proceedings of the Network and Distributed
System Security Symposium, 2008.
[64] Wei Xu, Eep Bhatkar, and R. Sekar. Taint-enhanced policy
enforcement: A practical approach to defeat a wide range of
attacks. In Proceedings of the USENIX Security Symposium,
2006.
[65] Heng Yin, Dawn Song, Manuel Egele, Christopher Kruegel,
and Engin Kirda. Panorama: Capturing system-wide informa-
tion ow for malware detection and analysis. In Proceedings
of the ACM Conference on Computer and Communications
Security, October 2007.

Oakland 10
No ratings yet
Oakland 10
15 pages
Sym Bion
No ratings yet
Sym Bion
10 pages
Malware Analysis Using Machine Learning and Deep Learning Techniques
No ratings yet
Malware Analysis Using Machine Learning and Deep Learning Techniques
7 pages
Program Analysis ThuTrangNguyen Day 2
No ratings yet
Program Analysis ThuTrangNguyen Day 2
108 pages
A Survey of Static Program Analysis Techniques
No ratings yet
A Survey of Static Program Analysis Techniques
16 pages
CSAW ESC Final
No ratings yet
CSAW ESC Final
5 pages
An Introduction To Dynamic Analysis For R.E. (2020) PDF
No ratings yet
An Introduction To Dynamic Analysis For R.E. (2020) PDF
30 pages
ADORICOS IT-421 Activity2
No ratings yet
ADORICOS IT-421 Activity2
3 pages
Static Analysis of Binary Exe
No ratings yet
Static Analysis of Binary Exe
9 pages
Advanced Fuzzing and Symbolic Execution
No ratings yet
Advanced Fuzzing and Symbolic Execution
61 pages
Paper Ok
No ratings yet
Paper Ok
18 pages
5474-Article Text-8699-1-10-20200511
No ratings yet
5474-Article Text-8699-1-10-20200511
8 pages
SSof 2425 LBS 05 Dynamic Analysis For Security
No ratings yet
SSof 2425 LBS 05 Dynamic Analysis For Security
88 pages
Data-Flow Analysis - Part 2: Y.N. Srikant
No ratings yet
Data-Flow Analysis - Part 2: Y.N. Srikant
26 pages
Dfa Part 2 PDF
No ratings yet
Dfa Part 2 PDF
26 pages
Bertraneetal Uml FN Ieee Nov 2010
No ratings yet
Bertraneetal Uml FN Ieee Nov 2010
8 pages
Lesson 3 - Static & Dynamic Analysis
No ratings yet
Lesson 3 - Static & Dynamic Analysis
33 pages
Dynamic
No ratings yet
Dynamic
21 pages
Binja Symbexec SENinja
No ratings yet
Binja Symbexec SENinja
7 pages
Static Program Analysis: Anders Møller and Michael I. Schwartzbach
No ratings yet
Static Program Analysis: Anders Møller and Michael I. Schwartzbach
82 pages
Secure Programming With Static Analysis
No ratings yet
Secure Programming With Static Analysis
56 pages
Dynamic Malware Analysis Guide
No ratings yet
Dynamic Malware Analysis Guide
56 pages
Bounouh
No ratings yet
Bounouh
13 pages
Software Security Analysis From Automation To Intelligence
No ratings yet
Software Security Analysis From Automation To Intelligence
157 pages
Type-Based Analysis and Applications
No ratings yet
Type-Based Analysis and Applications
8 pages
Machine Learning For Analyzing Malware: Zhenyan Liu, Yifei Zeng, Yida Yan, Pengfei Zhang and Yong Wang
No ratings yet
Machine Learning For Analyzing Malware: Zhenyan Liu, Yifei Zeng, Yida Yan, Pengfei Zhang and Yong Wang
18 pages
Unit-5-2
No ratings yet
Unit-5-2
48 pages
Deep Learning for Malware Detection
No ratings yet
Deep Learning for Malware Detection
28 pages
1 Introduction
No ratings yet
1 Introduction
65 pages
Lecture Notes On Semantic Analysis and Specifications: 15-411: Compiler Design Andr e Platzer
No ratings yet
Lecture Notes On Semantic Analysis and Specifications: 15-411: Compiler Design Andr e Platzer
18 pages
A Contemporary Survey of Large Language Model Assisted Program Analysis
No ratings yet
A Contemporary Survey of Large Language Model Assisted Program Analysis
25 pages
Taint Analysis
No ratings yet
Taint Analysis
93 pages
25-Automatic Static Analysis-14!03!2024
No ratings yet
25-Automatic Static Analysis-14!03!2024
185 pages
Program Analysis
No ratings yet
Program Analysis
73 pages
PHD Thesis 2004 - Dynamic Binary Analysis and Instrumentation (Building Tools Is Easy) phd2004
No ratings yet
PHD Thesis 2004 - Dynamic Binary Analysis and Instrumentation (Building Tools Is Easy) phd2004
190 pages
Profiling (Computer Programming)
No ratings yet
Profiling (Computer Programming)
7 pages
Static and Dynamic Analysis: Synergy and Duality: Michael Ernst
No ratings yet
Static and Dynamic Analysis: Synergy and Duality: Michael Ernst
35 pages
Static and Dynamic Analysis: Synergy and Duality: Michael Ernst
No ratings yet
Static and Dynamic Analysis: Synergy and Duality: Michael Ernst
35 pages
Static and Dynamic Analysis: Synergy and Duality: Michael Ernst
No ratings yet
Static and Dynamic Analysis: Synergy and Duality: Michael Ernst
35 pages
Syntia: Synthesizing The Semantics of Obfuscated Code
No ratings yet
Syntia: Synthesizing The Semantics of Obfuscated Code
19 pages
Final Research Paper Malware
No ratings yet
Final Research Paper Malware
5 pages
Python Malware Evasion Techniques
No ratings yet
Python Malware Evasion Techniques
12 pages
Challenges of Malware Analysis: Obfuscation Techniques
No ratings yet
Challenges of Malware Analysis: Obfuscation Techniques
11 pages
Alz Arooni
No ratings yet
Alz Arooni
212 pages
Static Program Analysis
No ratings yet
Static Program Analysis
210 pages
Techniques For Malware Analysis Based On Symbolic Execution
No ratings yet
Techniques For Malware Analysis Based On Symbolic Execution
94 pages
Chapter6 Part2
No ratings yet
Chapter6 Part2
53 pages
FF 44
No ratings yet
FF 44
11 pages
Dynamic Anaylsis
No ratings yet
Dynamic Anaylsis
1 page
Analysis and Machine Learning Assistance
No ratings yet
Analysis and Machine Learning Assistance
2 pages
Screenshot 2024-05-11 at 11.38.19 AM
No ratings yet
Screenshot 2024-05-11 at 11.38.19 AM
28 pages
Mal Wares
No ratings yet
Mal Wares
48 pages
Paprer CJ Usenix03
No ratings yet
Paprer CJ Usenix03
18 pages
Available Expression Analysis
No ratings yet
Available Expression Analysis
9 pages
0936 Static Program Analysis
No ratings yet
0936 Static Program Analysis
162 pages
Quist 2009
No ratings yet
Quist 2009
6 pages
Malware Detection via API Sequences
No ratings yet
Malware Detection via API Sequences
6 pages
The Basic Building Blocks of Malware
No ratings yet
The Basic Building Blocks of Malware
16 pages
Towards An Operational Semantics For Alloy: Abstract
No ratings yet
Towards An Operational Semantics For Alloy: Abstract
16 pages
Christian View of Jewish Temple
No ratings yet
Christian View of Jewish Temple
66 pages
SR 71 Blackbird Owners Manual PDF
No ratings yet
SR 71 Blackbird Owners Manual PDF
4 pages
DigitalRoadmap Web PDF
No ratings yet
DigitalRoadmap Web PDF
11 pages
Subha Shaam Lun
No ratings yet
Subha Shaam Lun
3 pages
Stateful MapReduce PDF
No ratings yet
Stateful MapReduce PDF
6 pages
Android Reverse Engineering Guide
No ratings yet
Android Reverse Engineering Guide
69 pages
16 401
No ratings yet
16 401
2 pages
Affiliate Marketing Action Plan
No ratings yet
Affiliate Marketing Action Plan
7 pages
Actividades de Nivelacion 06
No ratings yet
Actividades de Nivelacion 06
1 page
Wifi Varminpool Heater
No ratings yet
Wifi Varminpool Heater
10 pages
Cambridge IGCSE: Information and Communication Technology 0417/21
No ratings yet
Cambridge IGCSE: Information and Communication Technology 0417/21
12 pages
Picmg - Comdg - 2.0 Released 2013 12 061 PDF
100% (1)
Picmg - Comdg - 2.0 Released 2013 12 061 PDF
218 pages
AE CC2019 Release
No ratings yet
AE CC2019 Release
42 pages
Ms Word: Parts of The Microsoft Word Window
No ratings yet
Ms Word: Parts of The Microsoft Word Window
2 pages
Neoway N706 AT Commands Manual V1.0
No ratings yet
Neoway N706 AT Commands Manual V1.0
183 pages
Coursework For Computer Science Degree
100% (2)
Coursework For Computer Science Degree
5 pages
Google Apps Deployment Guide
No ratings yet
Google Apps Deployment Guide
16 pages
Quarter1 ICF MOd1 Lesson 2 Wk3
No ratings yet
Quarter1 ICF MOd1 Lesson 2 Wk3
14 pages
Text Crawler
No ratings yet
Text Crawler
36 pages
IPP Module 5
No ratings yet
IPP Module 5
134 pages
T Rec G.719 200806 I!!msw e
No ratings yet
T Rec G.719 200806 I!!msw e
59 pages
Office 2016 Activation
No ratings yet
Office 2016 Activation
2 pages
Opengatev9 2
No ratings yet
Opengatev9 2
339 pages
DH XVR4x08-I-V2 MultiLang V4.002.0000000.2.R.240806 Release Notes
No ratings yet
DH XVR4x08-I-V2 MultiLang V4.002.0000000.2.R.240806 Release Notes
11 pages
EnerluxM Software Module Manual
No ratings yet
EnerluxM Software Module Manual
35 pages
Engineering Configuration Guide
100% (1)
Engineering Configuration Guide
14 pages
Introduction To Multimedia Big Data Computing For Iot
No ratings yet
Introduction To Multimedia Big Data Computing For Iot
35 pages
Nov Dec 2023
No ratings yet
Nov Dec 2023
2 pages
Paper1ga Final
No ratings yet
Paper1ga Final
5 pages
ECIE 3313 - 1 - 2017-18 CH - 02
No ratings yet
ECIE 3313 - 1 - 2017-18 CH - 02
59 pages
Aricent Uni-SON Feature Details
No ratings yet
Aricent Uni-SON Feature Details
13 pages
E020-020X, E020-0081, E020-101X Error Codes
No ratings yet
E020-020X, E020-0081, E020-101X Error Codes
5 pages
Intro to Industrial Robotics
0% (1)
Intro to Industrial Robotics
22 pages
LAND BANK OF THE PHILIPPINES-Assistant Vice President
No ratings yet
LAND BANK OF THE PHILIPPINES-Assistant Vice President
2 pages
Simulation of Heat Recovery Steam Generator in A
No ratings yet
Simulation of Heat Recovery Steam Generator in A
107 pages
Value-Added Services (VAS) in SAP EWM - SAP Blogs
100% (1)
Value-Added Services (VAS) in SAP EWM - SAP Blogs
15 pages

Oakland 10

Uploaded by

Oakland 10

Uploaded by

All You Ever Wanted to Know About

Dynamic Taint Analysis and Forward Symbolic Execution

) and provide the rules for adding

::= Maps variables to taint status

::= Maps addresses to taint status

keeps track of the taint status of

keeps track of the taint status of memory

are initialized so that all values are marked

keep the taint status for all

can map not

as tainted via T-ASSIGN. On line 2,

. On line 3, the program

, , , , pc, if e then goto e

, , , , pc, if e then goto e

You might also like