Flayer: Exposing Application Internals
Will Drewry and Tavis Ormandy
Google, Inc.
{wad,taviso}@google.com
Abstract
Flayer is a tool for dynamically exposing application innards
for security testing and analysis. It is implemented on the dynamic binary
instrumentation framework Valgrind [17] and its memory
error detection plug-in, Memcheck [21]. This paper focuses
on the implementation of Flayer, its supporting libraries, and their
application to software security.
Flayer provides tainted, or marked, data flow analysis and instrumentation
mechanisms for arbitrarily altering that flow. Flayer improves upon prior taint
tracing tools with bit-precision. Taint propagation calculations are performed
for each value-creating memory or register operation. These calculations are
embedded in the target application's running code using dynamic
instrumentation. The same technique has been employed to allow the user to
control the outcome of conditional jumps and step over function calls.
Flayer's functionality provides a robust foundation for the
implementation of security tools and techniques. In particular, this paper
presents an effective fault injection testing technique
and an automation library, LibFlayer. Alongside these contributions,
it explores techniques for vulnerability patch analysis and guided source code
auditing.
Flayer finds errors in real software. In the past year, its use has
yielded the expedient discovery of flaws in security critical software
including OpenSSH and OpenSSL.
1 Introduction
Vulnerabilities often lie undiscovered in software due to the complexity of the
code paths leading to them. Recent tools attempt to understand these paths and
modify running application code, detecting flaws ranging from undefined memory
use [21] to signedness conversion errors [15] to
unbounded memory access [32]. In addition, symbolic evaluation and
analysis frameworks, like EXE [8] and
SAGE [12], and other multiple execution path analysis
tools [16], have begun to augment this effort through the automated
generation of dangerous input. While execution path, or flow, analysis
techniques have been in use for over three decades [7],
practical analysis tools for white box testing and auditing scenarios have only
recently become commonplace
[15] [12] [8] [32] [19].
This paper presents Flayer, an execution flow analysis and
modification tool, and a complementary fuzz testing [14]
technique. Flayer is implemented as a plug-in to the dynamic binary
instrumentation framework Valgrind [17] using core
functionality from its memory error detection plug-in,
Memcheck [21]. It traces the flow of tainted, or marked,
input data through an application during execution and logs the traversal of
conditional jumps and system calls. Recent works, such as
autodafé [32] and
Byakugan [19], also rely on understanding input
flow through a process. However, these tools use input pattern matching
techniques for taint tracing which lack the accuracy of Flayer's dynamic binary
instrumentation based approach. Flayer improves on existing taint tracing
software, like TaintCheck [18] and
Catchconv [15], through the addition of bit-precise taint
propagation. This precision allows for taintedness to propagate into bitfields
and bit arrays creating a more accurate view of the impact input has on an
application's execution. Furthermore, Flayer is not solely a taint tracing
tool. It also provides the ability to redirect the flow irrespective of
input. Flayer can instrument the outcome of conditional jumps and
function calls in the execution path based on user-supplied arguments. In
addition, a library for automated execution and output processing,
LibFlayer, is available for use along with an interactive shell
interface, FlayerSh, for easy human interaction.
The application of Flayer's flow tracing and alteration functionality,
flaying, provides a means to directly expose code obscured behind
complex code paths for direct testing. This approach combined with random fuzz
testing results in a lightweight, yet effective testing technique.
1.1 Paper structure
The remainder of this paper discusses Flayer, its implementation and
applications. Section 2 covers the detailed implementation of Flayer. Section
3 introduces a new fuzz testing technique. Section 4 discusses other
techniques enabled through the use of Flayer and its supporting libraries.
Section 5 provides real world experiences where the presented software and
techniques have successfully discovered security-related application flaws.
Section 6 details the possibilities for future work, and Section 7 gives the
conclusions drawn.
2 Flayer
2.1 Foundation
Flayer is implemented as a plug-in to Valgrind, a framework for instrumenting
machine code at runtime. In particular, it is based upon functionality from
Memcheck. Memcheck is a Valgrind plug-in that provides four types of memory
error detection: byte-level addressability, heap allocations, memory block
argument overlapping, and definedness checking. Of these, definedness checking
was the basis for Flayer's taint propagation feature. Other functionality
provided directly by Valgrind was leveraged for implementing taint sources and
control flow alteration. In addition, Valgrind's default error output and
robust command line argument handling mechanisms enabled easy automation
with a simple wrapper library, LibFlayer.
2.2 Bit-precision taint tracing
Tainting is the process of tagging data with metadata that is
propagated when that data is involved in a value-creating operation. The
implementation of bit-precision taint tracing may be divided into three
logical pieces: initial taint assignment, taint propagation and notification,
and taint removal.
Taint is assigned to data based on the data sources specified on the command
line. The following sources are supported: network, file, and stdin. All data
originating from the network, the file system, or standard input are tainted
through the instrumentation of system calls made by the target application. In
most cases, this is handled by the read system call. As data enters the
application via this kernel interface, the instrumented call checks if the
source file descriptor is tainted and appropriately marks the destination
memory addresses. In addition, recvmsg and recvfrom are
instrumented in the same manner. File descriptor-based tainting is managed in two
ways. If standard input tainting is specified, data originating from file
descriptor 0 is tainted. For network and file tainting, file descriptor
tracking is handled through the instrumentation of the following system calls:
open, socket, connect, accept, socketpair,
and close. When the data sourced from the file system is to be tainted,
open controls whether a file descriptor is marked as providing tainted
data. By default, if file tainting is enabled, all file descriptors opened
with open will be marked. When a file descriptor is closed with
close, it is unmarked as providing tainted data. However, tainting all
input from open file descriptors may taint a large amount of data as shared
libraries are loaded and files are read by the target application. The command
line argument --file-filter exists to mitigate this problem. The
argument takes a string which specifies a path prefix to the desired file, or
files, to be tainted. This allows for targeted tainting of file input data.
Unfortunately, there are no such filters for network tainting. If enabled, all
network file descriptors are assumed to produce tainted data. Usually, this is
not a burden given that network operations are not fundamental to process
initialization. Along with system call instrumentation, taint may be assigned
through one other mechanism: client calls. Valgrind provides a mechanism where
special machine instructions may be inserted into an application, or library,
at compile time through the use of C macros. Usually used from preloaded
shared objects, these client calls may taint, untaint, or examine chunks of
application memory.
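The descriptor bookkeeping described above can be pictured with a small model. The following Python sketch is illustrative only: the class and method names (TaintSources, on_open, and so on) are hypothetical, and the real tool performs this work inside Valgrind's system call instrumentation rather than in Python.

```python
class TaintSources:
    """Toy model of file-descriptor-based taint source tracking."""

    def __init__(self, taint_file=False, taint_network=False,
                 taint_stdin=False, file_filter=""):
        self.taint_file = taint_file
        self.taint_network = taint_network
        self.taint_stdin = taint_stdin
        self.file_filter = file_filter   # path prefix; "" matches every path
        self.tainted_fds = set()

    def on_open(self, path, fd):
        # Mark descriptors returned by open() when the path matches the prefix.
        if self.taint_file and path.startswith(self.file_filter):
            self.tainted_fds.add(fd)

    def on_socket(self, fd):
        # No filter exists for network descriptors: all of them are marked.
        if self.taint_network:
            self.tainted_fds.add(fd)

    def on_close(self, fd):
        # A closed descriptor no longer provides tainted data.
        self.tainted_fds.discard(fd)

    def on_read(self, fd, nbytes):
        # Return one taint flag per byte written to the destination buffer.
        tainted = fd in self.tainted_fds or (self.taint_stdin and fd == 0)
        return [tainted] * nbytes


sources = TaintSources(taint_file=True, file_filter="/demo/")
sources.on_open("/lib/libc.so.6", 3)    # shared library: not under the prefix
sources.on_open("/demo/rnd.tiff", 4)    # target input: matches the prefix
print(sources.on_read(3, 2), sources.on_read(4, 2))
# [False, False] [True, True]
```

The prefix filter keeps data read while loading shared libraries out of the taint set, which is the problem --file-filter is described as mitigating.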
The propagation of taintedness, whether data is tainted or not, is largely
implemented using the undefinedness propagation technique implemented in Memcheck.
In this technique, all bits in memory and registers have associated bits of
metadata, shadow bits, which track taintedness. Furthermore, each value-creating
memory operation has a shadow operation which
calculates the taintedness of the result. This direct memory propagation
approach performs the majority of the taintedness propagation. Flayer also
implements an indirect technique to further expand coverage. Flayer preloads a
shared library that replaces several functions in the target application which
operate on strings and raw memory: strnlen, strlen,
strncmp, strcmp, memcmp, and bcmp. In practice,
these functions operate on memory that may be tainted but will not
propagate taintedness to their return value because that value is not the direct
result of a memory operation. For example, x = y + 1 results in
x being tainted if y is tainted. However, in the following
example len will not be tainted even if s is:
size_t length(const char *s) {
    const char *c = s; size_t len = 0;
    for ( ; *c; c++) { len++; }   /* len is incremented, never computed from *c */
    return len;
}
While it is clear to a human that the final value stored in len is based
completely on the contents of s, direct memory-to-memory propagation
cannot address the situation. To work around this, the replacement functions
listed make use of client calls to determine if the source memory
is tainted and taint the return value appropriately. If these functions have been
inlined, or custom equivalents are used, the preloaded versions will not
be used and taintedness will not propagate indirectly.
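To make the idea of shadow operations concrete, the following Python sketch models bit-precise propagation for two operations and the indirect, replacement-function case. It is written in the spirit of published descriptions of Memcheck's definedness propagation; the function names and the fixed 32-bit width are assumptions for illustration and this is not Flayer's code.

```python
MASK32 = 0xFFFFFFFF

def taint_and(a, ta, b, tb):
    """Shadow operation for a & b; ta/tb hold a 1 for every tainted bit."""
    # A result bit is clean if both input bits are clean, or if either
    # input bit is an untainted 0 (which forces the result bit to 0).
    return (ta | tb) & (a | ta) & (b | tb) & MASK32

def taint_add(ta, tb):
    """Conservative shadow operation for a + b."""
    # Carries only move left, so every bit at or above the lowest
    # tainted input bit may be affected.
    t = (ta | tb) & MASK32
    return (t | -t) & MASK32

def tainted_strlen(s, taint):
    """Replacement-style strlen: the length is tainted if any scanned byte is."""
    n = s.index(0)                      # position of the NUL terminator
    return n, any(taint[:n + 1])

# Masking a fully tainted byte with an untainted 0x0F leaves 4 tainted bits.
print(hex(taint_and(0x0F, 0x00, 0xAB, 0xFF)))   # 0xf
# Taint in bit 4 of an addend spreads to bit 4 and everything above it.
print(hex(taint_add(0x00, 0x10)))               # 0xfffffff0
# The length depends on tainted bytes, so it is reported as tainted.
print(tainted_strlen(b"abc\0", [False, True, False, False]))   # (3, True)
```

The first result shows why bit precision matters for bitfields: a byte-granular tracker would have to mark the whole result as tainted or the whole result as clean.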
Taintedness propagation functions generate external notification messages.
Given that Memcheck already reports on traversed conditional jumps, system call
argument usage, memory access, and SIMD or FP register memory loads, Flayer
inherited output that is sufficiently rich without the addition of further
messages.
Memory must be untainted when it no longer contains a tainted value to
avoid false positives. In most cases, memory is untainted through the taint
propagation code. If an untainted value is written directly to a tainted
memory location, that location will become untainted. Memory is also untainted
when it is allocated or freed on the heap through malloc/free
wrapper functions. All other cases are handled through Valgrind callbacks:
stack creation, stack destruction, and client calls.
2.3 Execution path alteration
Flayer alters a target program's execution path through direct instrumentation
of its machine code, a practice classically used in software cracking. In
particular, two types of alterations are possible: forcing conditional jumps
and stepping over function calls. The instrumentation occurs after machine
code is translated to Valgrind's intermediate representation (IR) and before it
is translated back to machine code.
Conditional jump alteration is controlled by the --alter-branch command
line argument. This argument takes a comma-separated list of instruction
pointer and value pairs joined by colons, e.g.
--alter-branch=0x8080:1,0x9090:0. The value specified after the
instruction pointer is that of the guard of the conditional jump. A value of
0 indicates that the branch should not be followed while a value of
1 will result in the branch being followed. This behavior occurs
irrespective of the values involved in the conditional itself. Any conditional
jump may be altered using this technique regardless of whether it is visible
during taint analysis.
In addition to forcing conditional jump outcomes, Flayer allows function
calls to be stepped over using the --alter-fn command line argument.
This argument takes a similar format to --alter-branch except
that the value may be any 32-bit integer. The address supplied is not that of
the function to be skipped, but instead, the address where the function is
called. At this address, Flayer adds two instructions. The first sets the
value of the EAX register to the 32-bit value supplied in the command line
argument. The second is a jump to the next physical instruction after the call
site. This forces the function call to be bypassed while still providing a
controllable return value.
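The argument format shared by --alter-branch and --alter-fn can be illustrated with a short parser. This Python sketch is a hypothetical reading of the format described above, not Flayer's own option handling; in particular, accepting either decimal or 0x-prefixed values is an assumption.

```python
def parse_alterations(spec, branch=True):
    """Parse 'ADDR:VALUE,ADDR:VALUE' into a {address: value} mapping."""
    result = {}
    for pair in spec.split(","):
        addr_text, value_text = pair.strip().split(":")
        addr = int(addr_text, 16)        # instruction pointer or call site
        value = int(value_text, 0)       # assumption: decimal or 0x-prefixed
        if branch and value not in (0, 1):
            raise ValueError(f"branch guard must be 0 or 1: {pair!r}")
        if not branch and not 0 <= value <= 0xFFFFFFFF:
            raise ValueError(f"return value must fit in 32 bits: {pair!r}")
        result[addr] = value
    return result


branches = parse_alterations("0x8080:1,0x9090:0")
calls = parse_alterations("0x8A2E:3", branch=False)
print(branches == {0x8080: 1, 0x9090: 0}, calls == {0x8A2E: 3})   # True True
```

For a branch, the value is the forced guard; for a function alteration, the address is the call site and the value is the 32-bit result placed in EAX.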
2.4 LibFlayer
LibFlayer is a Python library which provides a programmatic interface to
Flayer. It comprises several components, the most important of which is
the Flayer class.
The Flayer class is the core interface of the library. It supplies the getters
and setters for managing Flayer command line arguments and provides interfaces
for interacting with parsed output. Through these interfaces it is possible
to specify what input type to taint, what file paths to filter, and what
conditional jump addresses to modify. The interface can be used directly or
wrapped further for higher levels of abstraction. One such wrapper provides
the interactive shell interface used by FlayerSh. In addition, some effort has
been invested in the automated exploration of execution path trees using
LibFlayer.
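As a rough picture of what a programmatic wrapper around the command line involves, the Python sketch below assembles a Flayer invocation from keyword arguments. It does not reproduce LibFlayer's interface: build_flayer_command and its parameters are hypothetical names, and only command line options that appear in this paper are emitted.

```python
def build_flayer_command(target_argv, taint_file=False, taint_network=False,
                         file_filter=None, alter_branch=None, alter_fn=None):
    """Return an argv list that runs target_argv under Flayer."""
    command = ["valgrind", "--tool=flayer"]
    if taint_file:
        command.append("--taint-file=yes")
    if taint_network:
        command.append("--taint-network=yes")
    if file_filter:
        command.append(f"--file-filter={file_filter}")
    if alter_branch:
        # {address: guard} becomes ADDR:GUARD,ADDR:GUARD
        pairs = ",".join(f"0x{addr:X}:{guard}"
                         for addr, guard in alter_branch.items())
        command.append(f"--alter-branch={pairs}")
    if alter_fn:
        # {call site address: return value}
        pairs = ",".join(f"0x{addr:X}:{value}"
                         for addr, value in alter_fn.items())
        command.append(f"--alter-fn={pairs}")
    return command + list(target_argv)


print(" ".join(build_flayer_command(
    ["./tiffinfo", "/tmp/test.tiff"],
    taint_file=True,
    file_filter="/tmp/test.tiff",
    alter_branch={0x4049E66: 1},
)))
```

Returning an argv list rather than a shell string keeps the result directly usable with process-spawning functions and avoids quoting problems.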
3 A new fuzz testing technique
3.1 Background
Random fault injection-based testing, or fuzz testing, is
the technique of supplying random input to an application with the intent of
discovering an unseen, and potentially dangerous, code path. Traditional fuzz
testing is often underutilized due to its inherent limitations. In
particular, exhaustive testing of an application's input space quickly becomes
infeasible. Fuzz testing one or two bytes may not be prohibitive, but
testing even a small set of 500 bytes requires 2^(8*500) combinations
to completely exercise the input space.
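The size of the input space follows from each byte holding 8 bits, so n bytes admit 2^(8n) distinct values. The short Python calculation below, added here purely as a worked illustration, shows how quickly that number grows.

```python
def input_space(num_bytes):
    """Number of distinct inputs of exactly num_bytes bytes: 2^(8 * num_bytes)."""
    return 2 ** (8 * num_bytes)

for n in (1, 2, 500):
    digits = len(str(input_space(n)))
    print(f"{n} byte(s): a number with {digits} decimal digits")
```

One byte yields 256 inputs and two bytes yield 65,536, while 500 bytes yield a count with more than 1,200 decimal digits.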
While there are many specialized techniques to mitigate this exponential
explosion of combinations, two generalized practices have arisen. The first
is block-based [4], or format aware, fuzz testing.
Spike [5], PROTOS [20], and
Peach [11], among others, use this approach to limit the
randomness in the data to just the mutation of format-specific components.
This approach has shown its efficacy [4] but requires a
substantial initial investment in the form of extensive format specification.
Even in systems where this specification is generated
automatically [32] [6], fuzz testing based on a protocol
definition may not exercise code from undocumented features or proprietary
vendor extensions and may waste significant resources testing unimplemented
specification features. For example, consider testing an HTTP server.
WebDAV [13] alone adds nine new HTTP methods in
addition to multiple new HTTP headers. The combination of these HTTP methods,
headers, and their arguments takes a substantial time to explore regardless of
whether the server supports the functionality.
The second technique is exemplified in the work by Vuagnoux called
autodafé [32], as well as Pusscat's Byakugan [19].
The approach focuses on the use of recognizable patterns in the
input stream which are detected through function hijacking or frequent memory
scanning. This technique is useful for detecting which pieces of input
reach specific locations, but it is limited by design. Not only is it possible
for the marker text to be modified beyond recognition during execution, but the
method itself introduces uncertainties in measurement. The values in the
marker text will dictate which code paths are taken and intrinsically limit the
coverage.
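The first limitation can be seen with a few lines of Python. In this illustrative sketch, the marker string and the two transformations are arbitrary choices made for the example and are not taken from any of the cited tools.

```python
MARKER = b"FUZZMARK"

def marker_visible(buffer):
    """Pattern-matching style check: is the marker present verbatim?"""
    return MARKER in buffer

original = b"GET /" + MARKER
lowered = original.lower()                    # e.g. case normalization
xored = bytes(b ^ 0x20 for b in original)     # e.g. a trivial decoding step

print(marker_visible(original))   # True
print(marker_visible(lowered))    # False
print(marker_visible(xored))      # False
```

Both transformed buffers are still derived entirely from the input, yet a scan for the literal marker no longer finds it. Propagating taint through each value-creating operation does not depend on the bytes remaining recognizable.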
Recently, variations on directed fuzz testing have been introduced in parallel with
the work presented in this paper. Jared DeMott's Evolutionary Fuzzing
System [10] uses genetic algorithms to construct viable input sets
based on reproductive criteria driven by the amount of code coverage of each
successive run. It eliminates the risks of wasting effort on unimplemented
functionality and of failing to exercise undocumented features. Like
fuzz [14], it still must overcome basic protocol input
validation tests. Usually, these tests are used in software to determine the
format of incoming user input. This might be a version check similar to the
protocol banner in OpenSSH [3] or a file format type
indicator like the magic check in LibTIFF [2]. While this
limitation may not affect the approach dramatically, other
techniques, inspired by fuzz testing, address this issue through application
flow analysis. Catchconv [15], EXE [8],
and SAGE [12] leverage symbolic execution to guide input error
detection and generation. Constraints are extracted by tracing the execution
of an application on fixed input, such as a known good file. The extracted
constraints are then explored through virtualized execution and, in some cases,
through repeated execution on input mutated based on code coverage heuristics.
These approaches have shown promising results but are limited by
approximation errors in symbolic execution and the potential of poor initial
input selection.
3.2 Fuzzing flayed applications
Fuzzing flayed applications is a lightweight testing approach which minimizes
the initial time investment required from the auditor. The only initial work
required is flaying. It does not require a protocol aware input generator, a
large testing harness, or any input selection work. Instead, a time investment
is required when a crash condition is uncovered. The auditor must spend time
creating viable input or determining if the bug is unreachable in normal
circumstances.
Flaying is an iterative process for increasing the reachability of complex
application code by removing the outer layers of application defenses.
Initially, an auditor must supply random input to a target application and
analyze the resulting taint tracing output. As uninteresting, or
non-state building, sanity and error checks are traversed, they must be
forcibly followed or bypassed using Flayer's flow alteration commands. This
process is repeated until the desired code is directly exposed for testing.
Once exposed, traditional random fuzz testing is used to uncover
vulnerabilities. Upon the discovery of a vulnerability, the malicious input
must be crafted by the auditor such that it will bypass the removed checks in
an unaltered version of the software. The success of this technique is
discussed in Section 5.
$ valgrind \
--tool=flayer \
--taint-network=yes \
--trace-children=yes \
--alter-fn=0x8A2E:3 \
/usr/sbin/sshd -ddd -f \
$PWD/sshd_config -p 2222 -D
Figure 1: Bypassing the "Protocol Mismatch" error check on an
Ubuntu Feisty OpenSSH 4.3p2-8ubuntu1 binary
Flayer may be used on an application regardless of the availability of
the source code or debugging symbols. While the availability of this data will
speed the flaying and creation of valid input, simple heuristics work in
many cases, making them unnecessary. For instance, if testing of OpenSSH's
cipher suite negotiation is desirable, then it would be useful to bypass the
SSH protocol version check. This is done in Figure 1 by
stepping over a sscanf call. Address 0x8A2E was identified as
the call site to the offending check as it preceded the first tainted call to
the logging function which generated the bad protocol version error message.
Only the libc symbols were used to infer this. With the check
removed, it becomes possible to build a simple test harness that copies data
from /dev/urandom and sends it to the flayed sshd. In addition,
it is trivial to introduce the required data into any payload by prepending a
proper version value. While this is a simplistic example, it captures the
essence of the technique.
It is worth noting that the fuzz testing of flayed applications does not
require Flayer. This technique was first performed manually through
the removal of error and sanity checks using interactive debugging and source
code modification. However, the automation of the iterative discovery and
modification process greatly speeds its use. The primary benefit of manual
flaying is the ability to bypass state building statements through code
addition.
4 Further uses
The Flayer tool suite provides a useful feature set for software
auditors, developers, and maintainers. The ability to comprehend and interact
with the flow of data through an application provides unique insight into that
application's operation and makes other useful security auditing and testing
techniques possible.
4.1 Guided source code auditing
Many of the more dangerous vulnerabilities, such as remote execution of code,
result from malicious user input. Therefore, it is quite useful to determine
input entry points and input-tainted functions when auditing
an application. This is where Flayer proves useful.
By running a given application, compiled with debugging symbols, through
Flayer with an arbitrary input set, the auditor can see which
conditional jumps are traversed by the data along with the containing
functions. Given that the direct output from Flayer is not always
immediately comprehensible to a human auditor, this technique is augmented by
the use of FlayerSh.
$ dd if=/dev/urandom of=rnd.tiff \
bs=1k count=1
$ FlayerSh ./tiffinfo /demo/rnd.tiff
>>> filter(file="/demo/rnd.tiff")
>>> run();summary()
==> UninitCondition
id frame information
0x0 0x4051CC0 TIFFClientOpen
/demo/libtiff/tif_open.c:359
0x1 0x4051CD0 TIFFClientOpen
/demo/libtiff/tif_open.c:359
0x2 0x4051CE0 TIFFClientOpen
/demo/libtiff/tif_open.c:359
0x4 0x413F6A3 _itoa_word
0xd 0x41413B2 vfprintf
0xf 0x413F6BD _itoa_word
==> UninitValue
id frame information
0x3 0x413F69B _itoa_word
0xe 0x413F6B7 _itoa_word
>>> snippet(0x1, 2)
* Setup the byte order handling.
*/
| if (tif->tif_header.tiff_magic !=
TIFF_BIGENDIAN &&
tif->tif_header.tiff_magic !=
TIFF_LITTLEENDIAN
>>> alter(0x0, 1)
...
Figure 2: A snippet of a guided auditing session in FlayerSh
reviewing a magic check in tiffinfo (LibTIFF-3.8.2).
FlayerSh parses the output of Flayer providing error summaries, branch
alteration, and source code snippet listing. Figure 2 provides
an example session which shows a run of tiffinfo on random input,
locations where tainted values were used, and the source code from one such use
in a magic value check. Using this shell, it is possible to rapidly follow the
data flow as well as review snippets of source code surrounding locations where
tainted data was used. This allows for quick insight into the operation of the
target application and immediately displays error checking locations without
the need for additional tools or software.
FlayerSh does not replace interactive debuggers or disassemblers,
such as GDB [1] or IDA Pro [9], but it does
provide a compromise between single stepping through code execution and
manually locating application error checking code.
>>> # LibTIFF 3.8.2 unpatched | >>> # LibTIFF 3.8.2 patched
>>> snippet(0x2) | >>> snippet(0x2)
* Read offset to next directory for sequential |
* scans. | /*
*/ | * Check for integer overflow when
(void) ReadOK(tif, &nextdiroff, | * validating the dir_off, otherwise
sizeof (uint32)); | * a very high offset may cause an
} else { | * OOB read and crash the client.
toff_t off = tif->tif_diroff; | * -- taviso@google.com, 14 Jun 2006.
| */
|if (off + sizeof (uint16) > tif->tif_size) { | |if (off + sizeof (uint16) > tif->tif_size ||
TIFFErrorExt(tif->tif_clientdata, module, | off > (UINT_MAX - sizeof(uint16))) {
"%s: Can not read TIFF directory count", | TIFFErrorExt(tif->tif_clientdata, module,
tif->tif_name); | "%s: Can not read TIFF directory count",
return (0); | tif->tif_name);
>>> | >>>
Figure 3: Patch analysis of LibTIFF version 3.8.2
using two FlayerSh instances.
4.2 Patch and vulnerability analysis
In complement to auditing and testing, Flayer and FlayerSh, in particular,
prove useful when analyzing input data flow through variants of the same piece
of software. This scenario occurs quite frequently in both the commercial and
open source worlds: projects fork, operating system distributions apply
different patches to the same original application, and systems become
dependent on old versions of software. When vulnerabilities are announced,
patches to the original source code will often not be useful to the maintainers
of modified source.
It is possible to run two instances of FlayerSh, one on the patched original
application and one on an unpatched variant, with a known bad input. This
approach allows one to review the code snippet of each of the conditional jumps
along the code path of both versions, and, if needed, to force specific
behavior to locate any vulnerable code. Performing this simultaneous analysis
results in a quick assessment of the variant's behavior.
Figure 3 provides an example of this. It shows a small piece of a
FlayerSh session for a version of LibTIFF patched for the directory offset
overflow and one that is not. In particular, it is displaying the affected
tainted conditional where a safety check has been added in one version but is
missing in the original.
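The effect of the added condition can be reproduced with explicit 32-bit arithmetic. The Python sketch below mirrors the two conditionals from Figure 3 under the assumption that the offset and the addition are 32-bit unsigned, as on the 32-bit builds discussed here; the function names are invented for the example.

```python
UINT32_MAX = 0xFFFFFFFF
SIZEOF_UINT16 = 2

def unpatched_check_rejects(off, tif_size):
    # Mirrors `off + sizeof (uint16) > tif->tif_size` with 32-bit wraparound.
    return ((off + SIZEOF_UINT16) & UINT32_MAX) > tif_size

def patched_check_rejects(off, tif_size):
    # Adds the guard `off > (UINT_MAX - sizeof(uint16))` from the patch.
    return (unpatched_check_rejects(off, tif_size)
            or off > UINT32_MAX - SIZEOF_UINT16)

off, tif_size = 0xFFFFFFFF, 1024
print(unpatched_check_rejects(off, tif_size))   # False: the sum wraps to 1
print(patched_check_rejects(off, tif_size))     # True: the offset is rejected
```

With a very high offset the sum wraps around to a small number, so the unpatched comparison passes and the out-of-range offset is used for the read.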
5 Real world experience
Fuzz testing of flayed applications has been used with some success since the
summer of 2006. This work resulted in the discovery of multiple vulnerabilities
in well known open source applications:
- Seven vulnerabilities in LibTIFF version 3.8.2 were disclosed [22] [23] [24] [25] [26] [27] [28].
- A remote denial of service vulnerability was discovered [30] in OpenSSH which affected all versions before 4.4.
- An out-of-bounds read was discovered [31] in libPNG which affected versions 1.0.6 through 1.2.12.
- A NULL pointer dereference was disclosed [29] in OpenSSL which affected all current clients.
In addition, FlayerSh has been used to determine if variants of LibTIFF and
OpenSSH were affected by these vulnerabilities.
5.1 Finding a LibTIFF overflow
One of the recently reported vulnerabilities in LibTIFF resulted from an
unchecked integer value which had previously gone unnoticed. The value was
that of the TIFF directory entry offset read directly from a supplied TIFF
image file. This section provides a simple procedure for finding this
vulnerability with Flayer.
The first step is identifying a good test application. For the purposes
of this vulnerability, tiffinfo is used. LibTIFF version 3.8.2 was
downloaded and compiled with debugging symbols. With this completed, the
compiled tool is run under Flayer with some random input as seen in
Figure 4.
$ dd if=/dev/urandom of=test.tiff \
bs=1k count=1
$ valgrind --tool=flayer \
--taint-file=yes \
--file-filter=$PWD/test.tiff \
./tiffinfo $PWD/test.tiff
Figure 4: Tracing random input through tiffinfo
The first run will result in an error message about the TIFF header magic,
e.g., "Not a TIFF or MDI file, ...". In the Flayer output, there are
three tainted conditional jump events which occur prior to the first
printf call. It is assumed that this call issues the error message.
Each of these identified conditional jumps is tested by supplying the
instruction pointer address at which the event occurs to Flayer. One such test
is shown in Figure 5.
$ valgrind --tool=flayer \
--taint-file=yes \
--file-filter=$PWD/test.tiff \
--alter-branch=0x4049E66:1 \
./tiffinfo $PWD/test.tiff
Figure 5: Testing a tainted conditional jump in tiffinfo
After some trial and error, it is possible to circumvent the BigTIFF and
version error checking resulting in a different error message:
"Can not read TIFF directory count". With the version checks cleared,
the directory count code may be exercised by the test harness
provided in Figure 6.
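#!/bin/bash
# Feed fresh random input to the flayed tiffinfo until it exits abnormally.
while true; do
  dd if=/dev/urandom \
     of=test.tiff bs=1k \
     count=1
  valgrind --tool=flayer \
    --taint-file=yes \
    --file-filter=$PWD/test.tiff \
    --alter-branch=0x4049E6C:1,0x4049EA6:1 \
    ./tiffinfo ./test.tiff
  status=$?
  # Stop on any exit status other than 0 or 1 (for example, a crash).
  if [[ $status -ne 0 && $status -ne 1 ]]; then
    break
  fi
done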
Figure 6: An example Flayer test harness
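The same loop can be expressed in Python, the language used by LibFlayer. The sketch below is a generic harness and does not use LibFlayer; the function name, its parameters, and the treatment of exit statuses 0 and 1 as uninteresting are assumptions carried over from the shell harness in Figure 6.

```python
import os
import subprocess

def fuzz_until_abnormal(command, input_path, size=1024,
                        normal_codes=(0, 1), max_runs=None):
    """Rewrite input_path with random bytes and rerun command until it
    exits with a status outside normal_codes.

    Returns (runs, status), or None if max_runs is reached first.
    """
    runs = 0
    while max_runs is None or runs < max_runs:
        with open(input_path, "wb") as handle:
            handle.write(os.urandom(size))      # 1 KiB by default, as with dd
        status = subprocess.run(command).returncode
        runs += 1
        if status not in normal_codes:
            # On POSIX a negative status means the child died from a signal.
            return runs, status
    return None
```

A call such as fuzz_until_abnormal(["valgrind", "--tool=flayer", ..., "./tiffinfo", "./test.tiff"], "test.tiff") would stand in for the shell loop, with the elided arguments filled in from Figure 6. Because the last input file is left on disk when the loop stops, the failing case can be replayed.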
The test harness is simple but has proved effective with LibTIFF and several
other tested applications. However, for this vulnerability, once the directory
count error message is triggered, a quick review of the source code
at the specified line number reveals an integer overflow. In addition, if the
auditor had forced the conditional jump with a guard value of 0
at that location, the result would have been an immediate segmentation fault.
5.2 The good and the bad
Flayer and flaying have been used extensively for real world application
auditing and fuzz testing. With use, the strengths and weaknesses of this tool
and related techniques are clear.
For patch analysis and guided auditing, Flayer has worked well for the authors'
needs, but auditing style is largely personal preference. With debugging symbols
and available source code, however, it has proved a straightforward means for
discovering input entry points to an application. This allowed for
targeted audits which follow the data flow through the audited application
without any initial analysis of the source code. In addition, the ability to
step over functions and force conditionals was useful in analyzing foreign
binary behavior. It is possible to guide binary analysis by indicating the
addresses where interesting behavior occurs and forcing that behavior to
continue. In many cases, if the target application crashes, it is possible to
infer the data primitives expected by examining the resulting logs.
Fuzzing flayed applications is a highly effective technique for
testing binary input such as image files and some network protocols. The
values supplied by generating random data from /dev/urandom will fully
exercise the handlers for the incoming binary code once the blocking checks are
removed. However, when the input format is highly structured, such as the
ASCII protocol HTTP, this coverage drops off significantly. The likelihood of
data originating from /dev/urandom generating valid HTTP messages is
extremely low. This does not completely discount the use of flaying and Flayer
from these scenarios, though. Instead, the fully random data source may be
replaced with a somewhat protocol aware payload generator. While a fully
protocol aware payload generator may yield the most thorough protocol coverage,
merging Flayer with a partially protocol aware generator allows for the
execution path taken to be targeted. For example, Flayer may be used to bypass
the HTTP version check in order to allow for an HTTP BNF-based fuzzer to
generate acceptable data without forcing it to be aware of which versions of
the protocol are normally implemented.
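A partially protocol aware generator can be very small. The Python sketch below is a hypothetical example, far simpler than a BNF-based fuzzer: it keeps the outer shape of an HTTP request valid while randomizing the path, the version digits, and the body. The method list and size limits are arbitrary choices for illustration.

```python
import random

METHODS = ["GET", "POST", "HEAD"]    # illustrative subset, not a full grammar

def random_request(rng, max_path=16, max_body=64):
    """Build a request with a valid outer shape and random inner fields."""
    method = rng.choice(METHODS)
    path = "/" + "".join(chr(rng.randint(0x21, 0x7E))
                         for _ in range(rng.randint(0, max_path)))
    # The version digits are random on purpose: the version check is
    # assumed to have been bypassed in the flayed server.
    version = "HTTP/%d.%d" % (rng.randint(0, 9), rng.randint(0, 9))
    body = bytes(rng.randint(0, 255) for _ in range(rng.randint(0, max_body)))
    head = "%s %s %s\r\nContent-Length: %d\r\n\r\n" % (
        method, path, version, len(body))
    return head.encode("ascii") + body


request = random_request(random.Random(0))
print(request.split(b"\r\n", 1)[0])
```

Passing an explicit random.Random instance makes runs reproducible, which helps when a crashing request needs to be regenerated.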
Flayer has its own limitations. The largest of these is that skipping sections
of code, conditional jump branches or entire functions, may result in missing
required runtime state. While this is often not a problem, in some cases
a value derived from the source data needs to fall within a small
range, and that value is used in subsequent calculations or even memory
allocations. When this occurs, Flayer is less useful and manual
code modification is required to force correct state. Flayer suffers from
another limitation. If a conditional jump is forced, it is forced
every time. When that conditional jump determines whether a loop should
continue, it is possible to lock the application in a never ending loop.
Flayer provides no mechanism yet to alter the outcome of a conditional a
specific number of times.
A practical limitation of Flayer is that it does not yet provide full
coverage of all useful taint source system calls. One notable example is
mmap. This system call is used to map a file on the file system
directly into process memory. Surprisingly, instrumenting this system call has
not been necessary in testing and analysis done so far. Given that
instrumentation has been added as needed, this is only a minor limitation.
6 Future Work
There are many avenues left to explore with Flayer. Most immediately, Flayer's
implementation limitations should be removed. This includes expanding the
coverage of tainting input vectors, adding support for conditional jump
alteration a controllable number of times, adding network taint filtering, as
well as adding an assignment operator to conditional jumps. In the case of an
assignment operator, instead of forcing a jump by replacing the guard value,
the actual tainted value would be reassigned to the value it is being tested
against. This would address state-building challenges in a simple but
effective way.
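The difference between forcing a jump and the proposed assignment operator can be sketched with a toy parser. This is an illustration only, under stated assumptions: the one-byte version field, `EXPECTED_VERSION`, `HANDLERS`, and the `mode` argument are all hypothetical and do not correspond to any Flayer interface. In "force" mode only the outcome of the comparison is replaced, so later code still sees the original tainted value. In "assign" mode the tainted value itself is set to the value it is compared against, so later code sees consistent state.

```python
EXPECTED_VERSION = 2

# Later stages depend on the version value itself, not just on the check.
HANDLERS = {EXPECTED_VERSION: lambda payload: payload.upper()}


def parse(data, mode="none"):
    """Parse a toy message: one version byte followed by a payload.

    mode "none":   run unmodified.
    mode "force":  replace only the outcome of the version check.
    mode "assign": overwrite the tainted version with the tested value.
    """
    version, payload = data[0], data[1:]

    if mode == "assign":
        version = EXPECTED_VERSION  # tainted value reassigned before the test
    matches = version == EXPECTED_VERSION
    if mode == "force":
        matches = True  # guard outcome replaced, version left untouched

    if not matches:
        return "rejected"

    handler = HANDLERS.get(version)
    if handler is None:
        return "inconsistent state"  # check was bypassed but state is wrong
    return handler(payload)


if __name__ == "__main__":
    message = b"\x09abc"  # version 9 is not the expected version
    for mode in ("none", "force", "assign"):
        print(mode, parse(message, mode))
```

In this sketch the forced run passes the check but fails later because the version-dependent lookup still uses the original value, which is the state-building problem the assignment operator is meant to avoid.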
Other, more challenging, work is possible. One example is the addition of
origin tracking of tainted memory. There is a Memcheck code branch which
supports this concept, but it does not do so in a way compatible with Flayer.
Adding this feature to the existing tool would allow further automated analysis
and, potentially, the automatic generation of input for interesting code paths.
An alternate approach for reaching the same goal would be integrating Flayer's
output with a program slicing [33] system. This approach would
remove the need for origin tracking while still automatically generating input.
Additional work on automating the comprehension of program control flow is another
viable direction. It is possible to automate the process of flaying through
brute force flow alteration testing or through integration with more
sophisticated systems. For instance, integration with a code coverage tool
would allow automated runs of Flayer with randomly selected conditional
jumps to be optimized. This integration would enable a tree view of the code
paths and allow dead-end code paths to be pruned from the analysis, enhancing the
quality of testing.
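A coverage-guided version of brute force flow alteration might look like the following sketch. It is a toy model, not an interface to Flayer or to any real coverage tool: `target`, `explore`, the branch names, and the block labels are hypothetical, and the search is exhaustive rather than random so that the result is reproducible. Each run forces a combination of branch outcomes, records which blocks were reached, and keeps only the combinations that reach something new; the rest are pruned as dead ends.

```python
from itertools import product

BRANCHES = ("magic", "version", "length")


def target(data, forced, coverage):
    """Toy parser whose conditionals can be overridden through `forced`."""
    def cond(name, natural):
        # Use the forced outcome when one is given, else the natural result.
        return forced.get(name, natural)

    coverage.add("entry")
    if not cond("magic", data[:2] == b"FL"):
        coverage.add("bad_magic")
        return
    coverage.add("magic_ok")
    if not cond("version", data[2:3] == b"\x01"):
        coverage.add("bad_version")
        return
    coverage.add("version_ok")
    if cond("length", len(data) > 8):
        coverage.add("long_body")
    else:
        coverage.add("short_body")


def explore(data):
    """Try every combination of forced outcomes, keeping those that add coverage."""
    seen = set()
    useful = []
    for outcomes in product((None, True, False), repeat=len(BRANCHES)):
        forced = {n: o for n, o in zip(BRANCHES, outcomes) if o is not None}
        coverage = set()
        target(data, forced, coverage)
        new_blocks = coverage - seen
        if new_blocks:  # otherwise this combination is pruned
            useful.append((forced, sorted(new_blocks)))
            seen |= coverage
    return seen, useful


if __name__ == "__main__":
    blocks, runs = explore(b"xx")
    for forced, new_blocks in runs:
        print(forced, "->", new_blocks)
```

The list of retained combinations forms a simple tree of forced decisions, each annotated with the blocks it newly exposes. A real integration would replace the exhaustive product with random or prioritized selection, as the text suggests.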
Along with these extensions, further integration of Flayer with other fuzz
testing techniques should yield useful results. Flayer may be used to
force other fuzz testing software to test more targeted areas of code than it
was previously able to. The compatibility and benefits of such combinations
remain to be investigated.
7 Conclusions
The Flayer tool suite, built on the Valgrind framework using core concepts from
Memcheck, should be added to the toolkit of anyone who regularly performs
application auditing or vulnerability patch analysis.
Flayer provides mechanisms to trace input flow through an application and to
arbitrarily modify that flow. LibFlayer layers a convenient interface on
Flayer. FlayerSh provides a reference tool implemented on LibFlayer. This
suite enables multiple security auditing and testing techniques, such as
flaying. In concert, these tools and techniques allow one to more effectively
audit software.
The Flayer tool suite is a starting point for application auditing and analysis
that requires very little initial investment while yielding solid results.
Even though Flayer is still at an early stage, its techniques have proved their
efficacy through the discovery of vulnerabilities in security-critical Internet
applications, such as OpenSSH and OpenSSL. This software is available for
public use and enhancement.
7.1 Availability
The entire tool suite is publicly available under the GPL. It can be
downloaded at http://code.google.com/p/flayer. Contributions are
encouraged.
8 Acknowledgments
Thanks to Google and the Google Security Team for supporting this work and
to Chris Evans whose encouragement motivated the creation of this paper. In
addition, the authors would like to thank Julian Seward, David Molnar, and Chad
Dougherty for kind words and useful guidance.
References
[1] GNU gdb.
http://www.gnu.org/software/gdb/.
[2] Libtiff.
http://www.remotesensing.org/libtiff/.
[3] OpenSSH.
http://www.openssh.org.
[4] D. Aitel.
The advantages of block-based protocol analysis for security testing.
http://www.immunitysec.com/resources-papers.shtml, 2002.
[5] D. Aitel.
SPIKE.
http://www.immunitysec.com/resources-freesoftware.shtml, 2003.
[6] Beyond Security, Inc.
beSTORM 2.0 whitepaper.
http://www.beyondsecurity.com/bestorm_whitepaper.html, September 2006.
[7] R. S. Boyer, B. Elspas, and K. N. Levitt.
SELECT: a formal system for testing and debugging programs by symbolic execution.
In Proceedings of the International Conference on Reliable Software, pages 234-245, New York, NY, USA, 1975. ACM Press.
[8] C. Cadar, V. Ganesh, P. M. Pawlowski, D. L. Dill, and D. R. Engler.
EXE: automatically generating inputs of death.
In Proceedings of the 13th ACM Conference on Computer and Communications Security, pages 322-335, Alexandria, Virginia, USA, 2006.
[9] DataRescue sa/nv.
IDA Pro.
http://www.gnu.org/software/gdb/.
[10] J. DeMott and Applied Security, Inc.
Evolutionary fuzzing system.
http://appliedsec.com/resources.html, May 2007.
[11] M. Eddington.
Peach.
http://peachfuzz.sourceforge.net/README.txt, May 2004.
[12] P. Godefroid, M. Levin, and D. Molnar.
Automated whitebox fuzz testing.
Technical Report MSR-TR-2007-58, Microsoft, May 2007.
[13] Y. Goland, E. Whitehead, A. Faizi, S. Carter, and D. Jensen.
HTTP extensions for distributed authoring - WebDAV.
http://www.ietf.org/rfc/rfc2518.txt, February 1999.
[14] B. Miller, L. Fredriksen, and B. So.
An empirical study of the reliability of UNIX utilities.
Communications of the ACM, 33(12):32-44, December 1990.
[15] D. A. Molnar and D. Wagner.
Catchconv: Symbolic execution and run-time type inference for integer conversion errors.
Technical Report UCB/EECS-2007-23, EECS Department, University of California, Berkeley, February 4 2007.
[16] A. Moser, C. Kruegel, and E. Kirda.
Exploring multiple execution paths for malware analysis.
In Proceedings of the 2007 IEEE Symposium on Security and Privacy, pages 231-245, 2007.
[17] N. Nethercote and J. Seward.
Valgrind: A framework for heavyweight dynamic binary instrumentation.
In Proceedings of PLDI 2007, San Diego, California, USA, June 2007.
[18] J. Newsome and D. Song.
Dynamic taint analysis for automatic detection, analysis, and signature generation of exploits on commodity software.
In Proceedings of the Network and Distributed System Security Symposium (NDSS 2005), 2005.
[19] L. "pusscat" Grenier and Lin0xx.
Byakugan: Automating exploitation.
In ToorCon Seattle, Pioneer Square, Seattle, Washington, USA, May 2007.
[20] J. Röning, M. Lasko, A. Takanen, and R. Kaksonen.
PROTOS - systematic approach to eliminate software vulnerabilities.
Invited presentation at Microsoft Research, Seattle, USA, May 2002.
[21] J. Seward and N. Nethercote.
Using Valgrind to detect undefined value errors with bit-precision.
In Proceedings of the USENIX'05 Annual Technical Conference, Anaheim, California, USA, April 2005.
[22] The MITRE Corporation.
CVE-2006-3459.
http://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2006-3459, July 2006.
[23] The MITRE Corporation.
CVE-2006-3460.
http://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2006-3460, July 2006.
[24] The MITRE Corporation.
CVE-2006-3461.
http://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2006-3461, July 2006.
[25] The MITRE Corporation.
CVE-2006-3462.
http://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2006-3462, July 2006.
[26] The MITRE Corporation.
CVE-2006-3463.
http://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2006-3463, July 2006.
[27] The MITRE Corporation.
CVE-2006-3464.
http://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2006-3464, July 2006.
[28] The MITRE Corporation.
CVE-2006-3465.
http://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2006-3465, July 2006.
[29] The MITRE Corporation.
CVE-2006-4343.
http://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2006-4343, August 2006.
[30] The MITRE Corporation.
CVE-2006-4924.
http://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2006-4924, September 2006.
[31] The MITRE Corporation.
CVE-2006-5793.
http://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2006-5793, November 2006.
[32] M. Vuagnoux.
Autodafé: an act of software torture.
Technical report, Swiss Federal Institute of Technology (EPFL), Cryptography and Security Laboratory (LASEC), August 2006.
[33] M. Weiser.
Program Slices: Formal, Psychological, and Practical Investigations of an Automatic Program Abstraction Method.
PhD thesis, 1979.
Footnotes:
1 First presented at the WOOT'07 First USENIX Workshop on Offensive Technologies.