A
SEMINAR REPORT ON

“BUFFER OVERFLOW ATTACKS”


SUBMITTED

TO

PUNE UNIVERSITY, PUNE


FOR THE DEGREE

OF

BACHELOR OF COMPUTER ENGINEERING


BY

GHANSHYAM SATYANARAYAN SHARMA


T.E. COMP (B)
Roll No. 42

UNDER THE GUIDANCE

OF

Mrs. MAYURA KINIKAR


DEPARTMENT OF COMPUTER ENGINEERING
MAHARASHTRA ACADEMY OF ENGINEERING
ALANDI (D), PUNE-412105
2008-2009

Certificate
This is to certify that Mr. GHANSHYAM SATYANARAYAN SHARMA has
successfully submitted his seminar report on

“BUFFER OVERFLOW ATTACKS”

during the academic year 2008-2009 in partial fulfillment of the requirements for the
Bachelor's Degree Program in Computer Engineering under Pune University, Pune.

Prof. Mrs. MAYURA KINIKAR                    Prof. Mrs. Uma Nagraj
Guide                                        Head of Department,
                                             Computer Engineering

Dr. J P George
Principal


Acknowledgement
I have great pleasure in presenting this report on “BUFFER OVERFLOW
ATTACKS”. I take this opportunity to thank all those who have contributed
to the successful completion of this report.

My special thanks go to Prof. Mrs. MAYURA KINIKAR for her suggestions
on the presentation of this report. I am also grateful to her for resolving all my
technical difficulties.

Once again, I would like to thank all those who directly & indirectly made a
contribution to this report.

-Ghanshyam Sharma

Table Of Contents

Abstract

1. Introduction

1.1 Brief History

1.2 What is a Buffer?

2. Anatomy of a Buffer Overflow Attack

2.1 What is a Stack?

2.2 What is a Return Address?

3. Smashing the Stack

4. Heap

5. Screen Shots Showing Buffer Overflow Attacks

6. Defenses against Buffer Overflow Attacks

6.1 Program Modification

6.2 Modifying the Language and/or Compiler

6.3 OS Kernel Stack Execution Privilege

6.4 Safer C Library Support

7. Future Scope

8. Conclusion

9. References

Abstract:
Buffer overflows are serious security bugs that are regularly reported in computer
security bulletins. The design philosophy of the C language, in which flexibility and
efficiency are given greater priority than safety, is one of the main reasons for buffer
overflow attacks. Because C makes extensive use of pointers, arrays and pointer arithmetic
without any bounds checking, programs sometimes access data that they are not supposed
to. By combining the C programming language's permissive approach to memory
handling with specific UNIX file system permissions, UNIX and its flavors can be
manipulated to grant unrestricted privileges to unknown and unregistered users. In this
paper we examine this vulnerability and discuss approaches to mitigate it.

1. Introduction:

1.1 Brief History:


Buffer overflows have been used successfully to penetrate system security since the
late 1980s. One of the first buffer overflow attacks to attract widespread attention,
because of its spectacular success, was Robert Morris's Internet Worm. In
1988 Morris released a program which succeeded in infecting thousands of Unix hosts
on the Internet. One of the methods the worm used to gain access to a vulnerable system
was a buffer overflow bug in the fingerd daemon. Once it gained access to a vulnerable
system, Morris's program installed itself on the machine and used several methods to
attempt to spread itself to other machines. Morris's original intent was for the worm to
spread to other systems relatively slowly and undetected, without causing a significant
disruption on any of the affected machines. In this it failed completely. Morris
made a programming error which caused his worm to spread at a much higher rate than
originally intended. Because of this error, machines were infected and reinfected so
rapidly that the worm ended up overwhelming the attacked systems. This
caused his program to be detected immediately, and transformed it into the most
devastating denial of service attack up to that time. Morris's program usually did not
gain administrative root access, did not destroy any information on the penetrated
system, and did not leave time bombs or other malicious code behind.
From 1988 to 1996 the number of buffer overflow attacks remained relatively low. The
known vulnerabilities were fixed, and because the attack method was little known and
thought to be difficult to execute, few new vulnerabilities were discovered. This
changed dramatically in 1996, when Levy published a very well written paper, "Smashing
the Stack for Fun and Profit", which showed that many programs were very likely to harbor
buffer overflow vulnerabilities and also demonstrated techniques for constructing buffer
overflow attacks likely to succeed against a target program suspected of
being vulnerable, even if the attacker had no access to the actual source code of the
target program. The combination of these two factors stimulated attackers to a flurry of
research activity which led to the discovery of many new vulnerabilities. In addition,
many of the attacks were automated, which permitted them to be carried out even
by people with little or no knowledge. People who are relatively unsophisticated but
interested in such attacks are often called script kiddies. Unfortunately, there are far
too many script kiddies, who seem to have plenty of time on their hands, and also the
energy, patience and persistence to keep hacking systems this way. The unhappy result
is that these automated attacks have become a serious nuisance to the overworked system
administrators responsible for maintaining the integrity of their systems under
continuous attack.

1.2 What is a Buffer?


A buffer is a temporary storage area in memory. It can be statically or dynamically
allocated. A buffer is said to overflow if a program or routine tries
to store more data in it than it can hold. The data that exceeds the buffer's capacity
is written into adjacent memory. This can overwrite valid
information and is the basis of what are known as buffer
overrun attacks. Although an overflow may occur accidentally through programming mistakes
and improper use of pointers, deliberately triggering one is a common and extremely
dangerous attack on system security and private data. The attacker can write data that
contains code designed to carry out certain actions which could, for example, damage the
user's files, crash the system, change data, or disclose confidential information. Buffer
overflow attacks are possible largely because of the design of the C programming language.
Poor programming practices have only worsened the issue.

[Figure 1: Computer Memory Organization. A buffer of variables (int z, char y, float x),
with a single variable labeled.]

The C language is a structured programming language that uses the function call as its
unit of organization. Each time a function is called, its arguments are copied
to an area of memory called the stack. In assembly, values are stored on the stack by
pushing them and retrieved by popping them off the stack. Most CPU architectures
currently in use support function calls through a stack mechanism and have a special
register called the stack pointer. There are dedicated instructions for the PUSH and POP
operations. There is also an instruction that takes an address off the stack and copies it
into the program counter (also called the instruction pointer), the register that holds
the address of the next instruction to execute. This return mechanism is what makes the
structured nature of C possible. Calling a function always pushes the return address onto
the stack.

The problem with this design shows up within the called function. Any variables defined
within this function are also stored in space allocated on the stack. For example, if a
string, such as the name of a file to open, needs to be defined in the function, a number of
bytes will be allocated on the stack. The function can then use this memory, but it is
automatically deallocated when the function returns. This is no doubt a very efficient
process and fully in keeping with the design philosophy of C. But C does no bounds
checking when data is stored in this area, a loophole that can be exploited by the attacker.

C functions that copy data without bounds checking are the main cause of these
vulnerabilities. Calls to functions like strcat(), strcpy(), sprintf(), vsprintf(), bcopy(),
gets(), and scanf() can be exploited because these functions do not check whether the
destination buffer, allocated on the stack, is large enough for the data copied into it.
Some of these functions have bounded replacements (such as strncpy() and strncat() for
strcpy() and strcat() respectively), while others do not. When such functions are called,
data can be written to locations outside the space allocated for it. This
results in an overflow.
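
As a minimal sketch of the bounded alternative, the function below copies a string into a fixed-size buffer and reports truncation instead of overflowing. The name copy_bounded is hypothetical and is not part of the standard C library. It relies only on snprintf(), which never writes more than the size it is given.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper: copy src into dst, which has room for dst_size bytes.
 * Returns 0 if the whole string fit, -1 if it was truncated or the
 * arguments were unusable. When dst and src are valid and dst_size > 0,
 * dst is null terminated on return. */
int copy_bounded(char *dst, size_t dst_size, const char *src)
{
    if (dst == NULL || src == NULL || dst_size == 0)
        return -1;

    /* snprintf writes at most dst_size bytes, terminator included, and
     * returns the length the full result would have had. */
    int needed = snprintf(dst, dst_size, "%s", src);
    if (needed < 0 || (size_t)needed >= dst_size)
        return -1;   /* output error or truncation */
    return 0;
}
```

A caller would pass sizeof on the destination array, for example copy_bounded(buffer, sizeof buffer, input), and treat a return value of -1 as invalid input rather than continuing with a truncated string.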

2. Anatomy of a Buffer Overflow Attack:


2.1 What’s a Stack?
The stack is the place where the software stores almost all temporary information.
Examples of temporary information are the return addresses of function calls and all the
local variables. What is really important is that functions can write to this space and
modify any data on it.

2.2 What’s a return address?

When a function is called, the system saves the address from which it was called. When the
function exits, the system reads this address and lets the program return to what it was
doing before the function was called. If this address is maliciously altered, the program
will not behave as it was programmed to.

It is worth noting that the biggest problem is the attacker's ability to modify the
return address. This is what makes it possible to make the code behave unexpectedly. In
an important program like the Unix command su, simply being able to make the program
jump into another part of itself could be enough to compromise the system. A stack
smashing attack usually has two mutually dependent goals:

[Figure: Normal Stack. Labels: Bottom of Memory, Fill Direction, Buffer 2 (Local
Variable 2), Buffer 1 (Local Variable 1), Return Pointer, Function Call, Top of Memory.]

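Example:

#include <string.h>

void function(char *str);

int main(void){
    char large_string[256];
    int i;
    for (i = 0; i < 255; i++){
        large_string[i] = 'A';
    }
    large_string[255] = '\0';   /* terminate the string so strcpy() knows where it ends */
    function(large_string);
    return 0;
}

void function(char *str){
    char buffer[16];
    strcpy(buffer, str);        /* no bounds check: 256 bytes are copied into 16 */
}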
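[Figure: Stack layout inside function(), from the top of the stack (bottom of memory) to
the bottom of the stack (top of memory): buffer[16], sfp, ret address, *str. Labels:
"Buffer grows downward", "Stack grows upward".]
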
1) Insert attack code:

The attacker supplies, as the input string, executable binary code for
the machine being attacked.

2) Change the return address:


There is space on the stack above every buffer for the return address of the function. The
attacker fills the buffer with arbitrary (and dangerous!) code, continues writing up to the
return address, and alters the return address to point to that code. When the function
returns, it jumps to the code that has been placed in the buffer.
The programs most likely to fall victim to buffer overflow attacks are those which
read in data using unsafe functions like gets() or which move data using functions like
strcat(). Unfortunately, the local array and the function's return address are both stored
on the stack.
This is extremely dangerous because the attacker can feed the program hostile
code instead of data and, with a simple trick, make the program execute
it. This vulnerability is known as a "buffer overflow", and is a special case of the
more general class of overflow problems.

Writing a Buffer Overflow:

Exploiting the vulnerabilities of C to carry out a buffer overflow requires knowledge of
Intel x86 assembly programming and experience with debuggers and disassemblers
such as gdb. The section below explains how buffers allocated on the stack
are ‘smashed’. To follow it, it is essential to know how a function call
in C is executed and the basics of structured programming.
Procedure in C:
C organizes its instructions into procedures, which it calls functions. From one point
of view, a procedure call alters the flow of control just as a goto jump does, but unlike a
jump, when a procedure has finished its task it returns control to the statement or
instruction following the call. This high-level abstraction is implemented through a stack.
The stack is also used to dynamically allocate memory for the local variables of a function
and for the parameters that are passed to it as arguments.

Stack region:
A stack is a contiguous block of memory containing data. A register called the stack
pointer (SP) points to the top of the stack. The bottom of the stack is at a fixed address.
The kernel dynamically adjusts its size at run time. The central processing unit provides
the stack instructions PUSH and POP, which are executed as the program runs.
The stack consists of logical stack frames that are pushed when calling a function and
popped when returning. A stack frame contains the parameters to a function, its local
variables, and the data necessary to recover the previous stack frame, including the value
of the instruction pointer (IP) at the time of the function call. The stack pointer
points to the top of the stack. In principle, local variables could be referenced by
giving their offsets from the stack pointer. But as data is pushed onto and popped off the
stack these offsets change, and it is difficult for the compiler to keep track of all
these changes. As a result
many compilers use a second register called the frame pointer (FP). The frame pointer
points to a fixed address within a stack frame. Thanks to this property, the distance of
local variables from the frame pointer does not change, and hence FP can be used to
reference local variables and parameters.
The first thing that a procedure does is save the previous FP (so that it can be restored
at procedure exit). Then it copies SP (stack pointer) into FP and advances SP to reserve
space for the local variables. This is called the procedure prolog. Upon procedure exit,
the stack is cleaned up again. This is called the procedure epilog.
A simple example of how a stack frame is formed by a function call is shown in the example
below:
void sample(int, int, int);
int main(void)
{
    sample(5, 6, 7);
    return 0;
}

void sample(int a, int b, int c)
{
    char arr1[5];
    char arr[10];
}
The assembly code generated for the function call is as follows (as it
appears on an Intel x86 CPU with UNIX as the operating system):
    pushl $7
    pushl $6
    pushl $5
    call sample

When main() reaches the function call, the three arguments are first
pushed onto the stack in reverse order and then the function is called. The call
instruction pushes the instruction pointer, which at that moment holds the address of the
instruction following the call, onto the stack. In other words, the return address of the
function is stored on the stack. Next the called function
pushes the frame pointer onto the stack. It then copies the current SP into the frame
pointer. Memory space is allocated for the local variables by subtracting their size from
the stack pointer. This procedure prolog can be represented in assembly
code as follows:
    pushl %ebp
    movl %esp,%ebp
    subl $N,%esp
where ‘ebp’ and ‘esp’ are the registers holding the current frame pointer and the current
stack pointer, and N stands for the number of bytes reserved for the local variables. The
formation of the stack when the above C code is called is shown in Figure 1.
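
The call sequence and prolog described above can also be modeled without touching a real machine stack. The following is a toy sketch under stated assumptions: the type ToyStack and the functions toy_init, toy_push, toy_pop, toy_enter and toy_leave are hypothetical names, the stack is an array of 32-bit words that grows toward lower indexes, and sp and fp are array indexes rather than registers.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_WORDS 64

/* Toy model of a downward-growing stack of 32-bit words. */
typedef struct {
    uint32_t mem[STACK_WORDS];
    int sp;   /* index of the word currently on top of the stack */
    int fp;   /* index of the saved frame pointer of the current frame */
} ToyStack;

void toy_init(ToyStack *s) { s->sp = STACK_WORDS; s->fp = STACK_WORDS; }

void toy_push(ToyStack *s, uint32_t v)
{
    assert(s->sp > 0);              /* the model itself must not overflow */
    s->mem[--s->sp] = v;
}

uint32_t toy_pop(ToyStack *s)
{
    assert(s->sp < STACK_WORDS);
    return s->mem[s->sp++];
}

/* call + prolog: push return address, save old FP, FP = SP, reserve locals */
void toy_enter(ToyStack *s, uint32_t return_addr, int local_words)
{
    toy_push(s, return_addr);
    toy_push(s, (uint32_t)s->fp);
    s->fp = s->sp;
    assert(local_words >= 0 && s->sp >= local_words);
    s->sp -= local_words;
}

/* epilog + return: drop locals, restore old FP, pop the return address */
uint32_t toy_leave(ToyStack *s)
{
    s->sp = s->fp;
    s->fp = (int)toy_pop(s);
    return toy_pop(s);
}
```

To mirror the sample(5,6,7) call, one would push 7, 6 and 5, then call toy_enter with a made-up return address and five local words (20 bytes, assuming each array is padded to a multiple of the 4-byte word). The return address then sits one word above the saved frame pointer, and the first argument one word above that.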

With this much information about how a function call is executed in C by use of a stack,
we can now go into the details of how a buffer overflow attack is coded and how it can be
used to execute arbitrary code. Let us consider another code snippet in which a buffer
overflow results.
#include <string.h>

void concat(char *);
int main(void)
{
    char string[200];
    int ctr;
    for (ctr = 0; ctr < 199; ctr++)
        string[ctr] = 'C';
    string[199] = '\0';     /* terminate the source string */
    concat(string);
    return 0;
}
void concat(char *str)
{
    char buff[16];
    buff[0] = '\0';         /* strcat() needs a valid (here empty) string to append to */
    strcat(buff, str);      /* no bounds check: overflows buff */
}
This program uses a typically unsafe C string function, strcat(). Here the function
appends the contents of the string ‘str’ to the end of ‘buff’ without any bounds checking.
When run, this code typically ends in a segmentation fault. Taking a closer look at the
stack formation in this case will make it clear how the contents of the return address are
undesirably overwritten.

The reason we get a segmentation fault is that we store 200 bytes
in buff, far more than the 16 bytes it can hold. The 184
bytes after the allocated buffer get overwritten. This includes the saved FP, the return
address, and even the parameter str. We have initialized every element of the string to
the character ‘C’, whose hex value is 0x43. Thus the return address is overwritten with
0x43434343, which normally lies outside the process address space. When the function
returns, execution jumps to that address, and a segmentation fault results.
The same mechanism can be used to carry out arbitrary instructions, as is shown next with
regard to our first example.
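
The overwrite can be reproduced safely in a simulation. The sketch below is an illustration under assumptions, not the real stack: it models a frame as a byte array holding a 16-byte buffer, a 4-byte saved frame pointer and a 4-byte return address, and the name simulate_overflow and the original address 0x08048123 are made up. The copy is clamped to the size of the model so that the demonstration itself stays within its own array.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { BUF_SIZE = 16, FRAME_SIZE = BUF_SIZE + 4 + 4 };  /* buffer, saved FP, return address */

/* Copy n bytes of input into a simulated frame the way an unchecked copy
 * would, and return the simulated return address afterwards. */
uint32_t simulate_overflow(const unsigned char *input, size_t n)
{
    unsigned char frame[FRAME_SIZE];
    const uint32_t original_ret = 0x08048123u;   /* made-up address */
    uint32_t ret;

    memset(frame, 0, sizeof frame);
    memcpy(frame + BUF_SIZE + 4, &original_ret, sizeof original_ret);

    if (n > sizeof frame)
        n = sizeof frame;          /* keep the simulation inside its own array */
    memcpy(frame, input, n);       /* bytes past BUF_SIZE spill into FP and ret */

    memcpy(&ret, frame + BUF_SIZE + 4, sizeof ret);
    return ret;
}
```

With 16 bytes of input the simulated return address is unchanged. With 200 bytes of ‘C’ it reads back as 0x43434343, matching the value discussed above.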
Now, as we know, the saved frame pointer FP lies just past ‘arr1’ on the stack, and the
return address lies just past FP. Space on the stack is allocated
in multiples of the word size, which in this case is 4 bytes or 32 bits. Thus our 5-byte
array arr1 actually occupies 8 bytes, or 2 words. The saved FP occupies another 4 bytes.
Hence the return address is 12 bytes after the start of the array.
#include <stdio.h>

void sample(int a, int b, int c)
{
    char arr1[5];
    char arr[10];
    int *ret;
    ret = (int *)(arr1 + 12);   /* assumed location of the return address */
    (*ret) += 8;                /* skip ahead in the caller; offset found with gdb */
}

int main(void)
{
    int x;
    x = 0;
    sample(5, 6, 7);
    x = 1;
    printf("%d", x);
    return 0;
}
What we have done is add 12 to the address of arr1. The result is the address at which the
return address is stored. To find out how much to add to the return address, first use a
test value, compile the program, and disassemble it with gdb. The disassembly shows how
many bytes separate the instruction after the call from the instruction after the
assignment, which is the amount needed to jump over the statement ‘x=1’. The trick relies
on undefined behavior and on one particular compiler's stack layout, so the offsets must
be checked for each compiler and platform.
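
The offset arithmetic used above can be written down as a small calculation. This is only a sketch of the reasoning in the text, assuming the layout it describes: the buffer padded up to a whole number of words, followed by a one-word saved frame pointer, followed by the return address. The function names round_up and ret_offset are hypothetical, and a real compiler may insert padding or reorder variables.

```c
#include <stddef.h>

/* Round size up to the next multiple of word (word must be non-zero). */
size_t round_up(size_t size, size_t word)
{
    return (size + word - 1) / word * word;
}

/* Offset from the start of a buffer to the saved return address, assuming
 * the layout described in the text: padded buffer, then a one-word saved
 * frame pointer, then the return address. */
size_t ret_offset(size_t buffer_size, size_t word)
{
    return round_up(buffer_size, word) + word;
}
```

For the 5-byte arr1 and a 4-byte word, round_up gives 8 and ret_offset gives 12, the value added to arr1 in the example.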
3. Smashing the Stack

One classification of buffer overflow attacks depends on where the buffer is allocated. If
the buffer is a local variable of a function, the buffer resides on the run-time stack. This
is the type of attack examined in Levy's article and it is by far the most prevalent form of
buffer overflow attack.
When a function is called in a C program, before execution jumps to the actual code
of the called function, the activation record of the function must be pushed onto the
run-time stack. In a C program the activation record consists of the following fields:

space allocated for each parameter of the function;
the return address;
the dynamic link;
space allocated to each local variable of the function.
For convenience we will consider the address of the dynamic link field to be the base
address of the activation record. The function must be able to access its parameters and
local variables. This requires that during the execution of the function a register hold
the base address of the activation record of the function, i.e. the address of the dynamic
link field. Parameters are below this address on the stack, and local variables above.
When the function returns, this register must be restored to its previous value, to point to
the activation record of the calling function. To be able to do this, when the function is
called the value of this register is saved in the dynamic link field. Thus the dynamic
link field of each activation record points to the dynamic link field of the previous
activation record on the stack, which in turn points to the dynamic link field of the
previous activation record, and so on, all the way to the bottom of the stack. The first
activation record on the stack is that of main(). This chain of pointers is called the
dynamic chain.
With many C compilers a buffer is filled towards the bottom of the stack, which is where
the return address lies. Thus if the buffer
overflows and the overflow is long enough, the return address will be corrupted (as will
everything else in between, including the dynamic link). If the return address is
overwritten by the buffer overflow so as to point to the attack code, that code will be
executed when the function returns. Thus, in this type of attack, the return address on
the stack is used to hijack the control of the program.
Overwriting the return address, as explained above, gives the attacker the means of
hijacking the control of the program, but where should the attack code be stored? Most
commonly it is stored in the buffer itself. Thus the payload string which is copied into
the buffer will contain both the binary machine language attack code as well as the
address of this code which will overwrite the return address.
There are a few difficulties that the attacker must overcome to carry out this plan. If the
attacker has the source code of the attacked program it may be possible to determine
exactly how big the buffer is and how far it is from the return address, determining how
big the payload string must be. Also, the payload string cannot contain the null character
since this would abort the copying of the payload into the buffer. Some copying routines
of the C library use carriage returns and new lines as a delimiter instead, so these
characters should also be similarly avoided in the payload string.
Access to the source code is nowadays quite common for many operating systems, e.g.
Linux, OpenBSD, FreeBSD, and even Solaris. Levy shows, however, that there is no
need to have access to the source, or even knowledge of the exact details of how the
attacked program works. The address of the attack code can be guessed, and with
various techniques an approximate guess will do. For example, the attack code could
start with a long run of no-operation (NOP) instructions, so that control can be passed to
any of them and execution slides into the crucial part of the attack code, which spawns the
shell and comes after the NOPs. This technique was already used in the Morris worm.
Similarly, the tail of the payload string could consist of many repetitions of the guessed
address of the attack code, so that one of the copies overwrites the return address. These
techniques considerably increase the chances of guessing the address of the attack code
closely enough for the attack to work. For more details see Levy's article.
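
Why a run of no-operation instructions makes an approximate guess sufficient can be expressed as a simple range check. The sketch below is illustrative only: the function name guess_reaches_code and all addresses are made up, and it assumes one-byte NOP instructions, as on the x86, so that every address inside the run is a valid place to start executing.

```c
#include <stdbool.h>
#include <stdint.h>

/* Does a guessed address lead to the code? The NOP run occupies
 * [sled_start, sled_start + sled_len) and the code begins right after it.
 * Landing anywhere in the run, or exactly on the first byte of the code,
 * succeeds; any other address fails. */
bool guess_reaches_code(uint32_t guess, uint32_t sled_start, uint32_t sled_len)
{
    return guess >= sled_start && guess - sled_start <= sled_len;
}
```

Without any NOPs (sled_len of 0) only one exact address works. With a run of 100 NOPs, 101 different guesses work, which is the increase in tolerance the text describes.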
We now examine why buffer overflows are so common. Suppose that the buffer is a
character array used to store strings. Most programs have string inputs or environment
variables which can be used by the attacker to deliver the attack. The program must read
this input and parse it in order to make the appropriate response to the input. Often, to
parse the input, the program will first copy it into a local variable of a function and then
parse it. To do this the programmer reserves a buffer large enough for any reasonable
input. To copy the input into the buffer the program will typically use a string copying
function of the standard C library such as strcpy(). If done carelessly, this introduces a
buffer overflow vulnerability. This pattern is so well established in the C
programmer's repertoire that it makes it very likely that many programs will contain
buffer overflow vulnerabilities.
The problem arises partly because C represents strings in a dangerous way. The length
of a string is determined by terminating the sequence of characters with a null character.
This representation is convenient, because strings can have arbitrary length and yet it
allows for efficient processing of strings. But at the same time it is also dangerous,
because the scheme breaks down if a string is not null terminated, and because there is
no way of knowing the length of the string prior to processing all its characters. The
typical C culture emphasizes efficiency over correctness, prudence or safety, which
compounds the problem. It would require a massive amount of education to change this
well entrenched programming practice. A consequence of this is that it is unlikely that
buffer overflow vulnerabilities can be eradicated at the source by not introducing them
into a program in the first place. Not only will it be difficult to eliminate the
vulnerability from the enormous quantity of software already deployed, but it seems
likely that programmers will continue to write new vulnerable software.
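
The danger of a missing terminator can be contained by never scanning further than the known capacity of the buffer. As a minimal sketch, the function below (bounded_length is a hypothetical name) uses the standard memchr() to look for the null character within the first cap bytes only, and returns cap itself as a signal that no terminator was found.

```c
#include <stddef.h>
#include <string.h>

/* Returns the length of the string in buf if a terminator is found within
 * the first cap bytes; otherwise returns cap, meaning "not terminated
 * inside the buffer". Never reads past buf[cap - 1]. */
size_t bounded_length(const char *buf, size_t cap)
{
    const char *end = memchr(buf, '\0', cap);
    return end ? (size_t)(end - buf) : cap;
}
```

A caller can compare the result with the capacity and reject the input when the two are equal, instead of handing an unterminated array to strlen() or strcpy().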
Miller studied the behavior of UNIX utilities when given random input in many
distributions, both commercial and open source. His study is important and relevant to
our discussion because, while unexpected input is not necessarily directly related to
buffer overflows, the inability of programs to handle unexpected input comes from the
same tendency of programmers to concentrate only on reasonable input that also leads to
buffer overflow flaws. Attackers are not reasonable. On the contrary, they wish to exploit
this blind spot of programmers for unreasonable input, to find a hole in the program's
logic that they can use for their own purposes. So Miller's study provides some evidence of
how common buffer overflow problems are likely to be. Unfortunately, a substantial
fraction of the utilities crashed under Miller's experiment; the original study reported
crashing roughly a quarter to a third of the utilities on each version of UNIX tested.
Miller also gives us some insight into the speed with which vendors are making progress
in improving the quality of their software, if at all, because he repeated the study five
years later. His results show that progress is being made, but that it has been
very modest.
Miller's results also lend support to the widely held anecdotal belief
that Open Source provides significantly higher quality software than commercial
offerings. This seems to suggest the power of somewhat chaotic large-scale parallelism
over better organization of small-scale parallelism. The former is prevalent in the Open
Source model, in which many pairs of eyes scrutinize the software in a relatively
uncoordinated way. The latter is characteristic of commercial organizations, with fewer
pairs of eyes scrutinizing the software but in a much more systematic and organized fashion.
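
Random-input testing of the kind described above can be sketched in a few lines. This is not Miller's tool, only a toy illustration under assumptions: parse_field is a hypothetical routine under test that keeps at most 15 bytes of its input, and fuzz_parse_field feeds it random bytes of random length and counts the results that violate that contract.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical routine under test: copies at most 15 bytes of a line into
 * a 16-byte field and returns the number of bytes kept. */
size_t parse_field(const unsigned char *line, size_t len)
{
    char field[16];
    size_t keep = len < sizeof field - 1 ? len : sizeof field - 1;
    memcpy(field, line, keep);
    field[keep] = '\0';
    return keep;
}

/* Feed `rounds` random inputs of random length (0..max_len) to the routine
 * and count how many results violate its contract. Returns -1 if the test
 * buffer cannot be allocated. */
int fuzz_parse_field(unsigned seed, int rounds, size_t max_len)
{
    unsigned char *input = malloc(max_len + 1);
    int failures = 0;

    if (input == NULL)
        return -1;

    srand(seed);
    for (int r = 0; r < rounds; r++) {
        size_t len = (size_t)rand() % (max_len + 1);
        for (size_t i = 0; i < len; i++)
            input[i] = (unsigned char)(rand() % 256);
        if (parse_field(input, len) > 15)
            failures++;
    }
    free(input);
    return failures;
}
```

A routine that clamps its copy, as parse_field does, should report no failures however long the random input is. Replacing the clamp with an unchecked copy is exactly the kind of flaw such a harness is meant to expose, ideally when run under a memory checker.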

Often the shell code that the attacker wants to execute is not already present in the
binaries shipped with a UNIX distribution. Thus the smasher has to find ways to feed his
shell code into the runtime environment. Stack smashers have devised creative ways to
accomplish this.
In order to inject the shell code into the running process, stack smashers supply the
necessary shell code sequence through command line arguments, shell environment variables,
and interactive input functions. Most stack smashing attacks depend upon shell
code instructions to accomplish their task. These types of exploits depend on knowing at
what address in memory the shell code will reside. Taking this into consideration, many
stack smashers pad their shell code with NOP (no-operation) instructions, which
gives the shell code a ‘wider space’ in memory and makes it easier to guess where the
shell code may be when manipulating the return address. This approach, combined with
following the shell code with many instances of the ‘guessed’ return
address, is a common strategy used in constructing stack smashing exploits.
An additional approach, when small programs with memory restrictions are exploited, is
to store the shell code in an environment variable.

Small Buffer Overflows:


Often the declared buffer is small, i.e. the program allocates very little
memory to it. In this case the entire desired shell code might not
fit into the buffer. Furthermore, there is a chance that the return address might be
overwritten by the shell code instructions themselves and not by the address of the shell
code. Also, the number of NOPs that can be padded at the front of the string might be so
small that the chance of guessing their address is minuscule. A different approach has to
be adopted to obtain an interactive shell from these programs. This approach requires
control over the shell’s environment variables. The shell code is placed
in an environment variable, and the buffer is then overflowed with the address of
this variable in memory. This method also increases the chances of the exploit working,
because the environment variable holding the shell code can be made as large as needed.
The environment variables are stored at the high-address end of the stack region when the
program is started.

4. Heap
The final memory segment we need to cover is the heap. The heap is the memory area
from which a program can allocate memory during its execution, by means of a library
function called malloc() (memory allocation). The programmer can simply say "I
now need 5,000 bytes of memory" and, if the operating system can satisfy the request,
there it is. This is particularly helpful if you cannot predict how much space you
will actually need, since this will depend on the input to the program (recall the
discussion of fixed buffer sizes in the "What is a Buffer?" section).
The counterpart to malloc() is the
free() function, which releases the memory so that it can be reused.
The heap is closely related to the already mentioned concept of a pointer in the C
language: a variable that holds no "real" data, but a memory address.
malloc() returns the lowest address of the memory
region that has been granted, since otherwise the program could not access it. Variables
holding memory addresses are of pointer type, hence the address returned by malloc() is
stored in such a pointer for future use. This can be very useful, but it has also been
a constant source of problems in C programs, in particular
if pointers are not properly initialized or operations on pointers are done incorrectly.
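
As a minimal sketch of careful heap use, the function below sizes the allocation from the input instead of using a fixed buffer, checks the pointer returned by malloc() before using it, and leaves it to the caller to release the memory with free(). The name copy_to_heap is hypothetical.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Copy the first len bytes of input into a newly allocated heap buffer and
 * add a terminator. Returns NULL if input is NULL, if len + 1 would
 * overflow, or if the allocation fails. The caller must free() the result. */
char *copy_to_heap(const char *input, size_t len)
{
    if (input == NULL || len == SIZE_MAX)
        return NULL;

    char *p = malloc(len + 1);      /* room for the data and the terminator */
    if (p == NULL)
        return NULL;                /* never use an unchecked pointer */

    memcpy(p, input, len);
    p[len] = '\0';
    return p;
}
```

The caller is expected to pass a len no larger than the number of bytes actually available in input. The function cannot verify this on its own, which is the same limitation that makes unchecked copies dangerous.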

5. Screen Shots Showing the Buffer Overflow Attack:

Screen 1

Screen 2

6. Defenses against Buffer Overflow Attacks:
A centralized or decentralized approach can be taken to avoid stack smashing security
vulnerabilities. To do so, changes must be implemented in the targeted programs
themselves, in the operating system kernel, or in the framework of the C language. A
centralized approach involves modification of system libraries or an operating system
kernel while a decentralized approach involves the modification of privileged programs
and/or C programming language compilers. We take a look at some of the decentralized
and centralized approaches along with their pros and cons.

6.1 Program Modification:


To fix defective SUID root programs, a number of modifications can be made to the
program's source code to avoid stack-smashing vulnerabilities. Most buffer overflow
exploits rely on the standard C byte copy or concatenation functions. Functions that
return a pointer to a result in static storage can also be used in stack smashing
exploits. In other words, standard C function calls that copy strings without checking their
length are insecure. Since many string functions perform no bounds checking on the
buffer being appended to or copied into (whichever may be the case), they are susceptible to
stack smashing attacks. In addition to replacing vulnerable functions, it is also essential to
check environment variables and command line arguments for excessive length and invalid data.
Stack smashers are creative and often hide shell code and other crucial exploit
information in oversized command line arguments or environment variables. Thus,
securing source code must be a comprehensive process to be effective: all avenues of
untrusted input must be inspected, and invalid input must be rejected.
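As an illustration of the kind of source-level change described above, the following is a minimal sketch of a length-checked replacement for an unchecked strcpy() call, together with a check that rejects oversized arguments. The function names copy_checked() and input_is_acceptable() are our own assumptions for this example and are not part of any standard library.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy src into dst, whose capacity is dst_size bytes
 * (including the terminating '\0'). Returns 0 on success and -1 if src does
 * not fit; on failure dst is set to the empty string rather than being
 * overflowed or silently truncated. */
int copy_checked(char *dst, size_t dst_size, const char *src)
{
    assert(dst != NULL && src != NULL && dst_size > 0);

    /* Look for the terminator within the first dst_size bytes only. */
    const char *end = memchr(src, '\0', dst_size);
    if (end == NULL) {              /* src plus '\0' does not fit */
        dst[0] = '\0';
        return -1;
    }
    memcpy(dst, src, (size_t)(end - src) + 1);   /* +1 copies the '\0' */
    return 0;
}

/* Hypothetical validation of one command line argument or environment
 * value: returns 1 if value is present and at most max_len characters
 * long, 0 otherwise. */
int input_is_acceptable(const char *value, size_t max_len)
{
    return value != NULL && memchr(value, '\0', max_len + 1) != NULL;
}
```

A program would call input_is_acceptable() on each argument and environment value it uses before passing it on, and replace calls such as strcpy(buf, arg) by copy_checked(buf, sizeof buf, arg), treating a return value of -1 as invalid input.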

6.2 Modifying the language and/or compiler:


Many different language based and compiler based approaches have been adopted to
guard against stack smashing vulnerabilities.
• A decentralized approach to preventing stack-smashing vulnerabilities is to
redesign or modify the way the C compiler of a given UNIX operating system
handles vulnerable functions. However, it is important to
note that, in most cases, such modifications to the C programming language are
not trivial and involve fundamental changes to the concepts and methodologies
behind the C programming language.
• A simple approach of this nature involves changes to the C compiler only. An
advantage of this approach is that it does not redesign the language itself. That is,
it encourages secure programming without requiring changes to existing code or
affecting its performance.
• A middle-of-the-road approach of this nature involves slight modifications to the
compiler, which would modify only the “dangerous” functions in the C library
so that they perform a stack integrity check before using the return address.
If the integrity check fails, the function would simply print a warning message and
exit the affected program. The main disadvantage of this approach is that all
dangerous functions would suffer a significant performance penalty and, like the
previous approach, it does not consider possible bugs in programmer-defined
functions, since it is confined to the system libraries. In addition, the code required
to implement these changes must be written in assembly language, which is
architecture dependent. It follows that the solution cannot be ported to different
CPU architectures without being rewritten.

• An extreme approach to solving the problem of buffer overflows is to build
bounds checking into the language. While this sounds the most foolproof and
convenient, it is potentially the most damaging to the language: mandatory bounds
checking would reduce C’s flexibility, simplicity and efficiency.
• To avoid this, another approach is to modify the way pointers are defined and
manipulated in C. According to this approach, a pointer would be declared
by giving three parameters: the pointer itself and the upper and lower bounds of
the address space which can be accessed using it. A separate bounds checking
mechanism would then be unnecessary. In spite of this advantage, giving the
compiler the additional information about the upper and lower bounds of the
pointer address space generates a sizeable overhead. Execution time would
increase by approximately a factor of 10, and the time for register allocation
by a factor of 3.
• A unique approach to modifying the compiler in this manner was taken by
Richard Jones and Paul Kelly at Imperial College in July 1995. Their approach
involved modifying the compiler to perform the same type of bounds checking.
However, its uniqueness lay in the fact that it involved no changes to
the design of C or to the representation of pointers in the language. Furthermore,
there was an option to turn the bounds checking mode on or off in a given
program, so not all programs suffered the overhead generated by the extra
code added.
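To make the three-parameter pointer idea from the list above more concrete, the following is a minimal sketch of such a "bounded pointer" written in ordinary C. The type bounded_ptr and the functions bp_make(), bp_add() and bp_store() are hypothetical names chosen for this illustration; a real implementation would live inside the compiler rather than in library code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bounded pointer: the pointer itself plus the lower and
 * upper limits of the region it may access (upper is one past the end). */
typedef struct {
    char *ptr;
    char *lower;
    char *upper;
} bounded_ptr;

/* Create a bounded pointer to the start of a region of size bytes. */
bounded_ptr bp_make(char *base, size_t size)
{
    assert(base != NULL);
    bounded_ptr bp = { base, base, base + size };
    return bp;
}

/* Pointer arithmetic that respects the bounds. Returns 0 on success and
 * -1 (leaving the pointer unchanged) if the result would leave the
 * range [lower, upper]. */
int bp_add(bounded_ptr *bp, ptrdiff_t offset)
{
    assert(bp != NULL);
    ptrdiff_t pos = bp->ptr - bp->lower;
    ptrdiff_t size = bp->upper - bp->lower;
    if (offset < -pos || offset > size - pos)
        return -1;
    bp->ptr += offset;
    return 0;
}

/* Store one byte through the pointer. Returns 0 on success and -1 if the
 * pointer does not designate a byte inside the region. */
int bp_store(bounded_ptr bp, char value)
{
    if (bp.ptr < bp.lower || bp.ptr >= bp.upper)
        return -1;
    *bp.ptr = value;
    return 0;
}
```

Every pointer now occupies three machine words and every operation performs extra comparisons, which gives some intuition for the overhead mentioned above.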
In their scheme, every pointer expression is derived from exactly one original (base)
pointer; in the expression ‘p+2*k+1’, for example, the base pointer is p. Only one storage
region is valid for a given base pointer, so one can check whether a pointer arithmetic
expression is valid by finding its base pointer's storage region and then checking that the
expression's result points to the same storage region. In their implementation Jones and
Kelly modified the front end of the GNU project's C compiler, gcc. Code was added to check
pointer arithmetic and use, and to maintain a table of known allocated storage regions using
splay trees for efficiency. Although this leaves the C language itself unchanged, deploying
the modification involves patching and recompiling the existing C compiler and its libraries.
Furthermore, all previously compiled binaries must be recompiled with the new libraries
before they execute under the protection of this patch. The performance penalty varies
widely, as is shown in the statistics for 2 typical algorithms: a
recursive Fibonacci generation and a pointer intensive matrix multiplication.

• nfib (dumb doubly-recursive Fibonacci): no slowdown.

o Execution time: same.

o Compile-time: slowdown of 3 (very small)

o Executable size: much larger due to inclusion of library.

• Matrix multiply (ikj, using array subscripting):

o Execution time: slowdown of around 30 compared to unoptimised.

o Compile-time: slowdown of around 2.

o Executable size: roughly the same.

In short, modifying the C language and its compiler involves making decidedly
non-trivial changes. As we have seen, this can lead to performance penalties of
varying degrees, but considering the security threats that buffer overflows pose, some
penalty should be tolerable.
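The region check used by Jones and Kelly can be sketched as follows. This is only an illustration under simplifying assumptions: the table of storage regions is a small fixed array searched linearly instead of the splay tree of the real implementation, regions are registered by hand instead of by compiler-inserted code, and all names (region_register(), pointer_add_is_valid()) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_REGIONS 16

/* One known allocated storage region. */
typedef struct {
    uintptr_t start;
    size_t size;
} region;

static region region_table[MAX_REGIONS];
static size_t region_count;

/* Record a storage region. Returns 0 on success, -1 if the table is full. */
int region_register(const void *start, size_t size)
{
    assert(start != NULL && size > 0);
    if (region_count == MAX_REGIONS)
        return -1;
    region_table[region_count].start = (uintptr_t)start;
    region_table[region_count].size = size;
    region_count++;
    return 0;
}

/* Find the region that contains addr, or NULL if there is none. */
static const region *region_find(uintptr_t addr)
{
    for (size_t i = 0; i < region_count; i++) {
        const region *r = &region_table[i];
        if (addr >= r->start && addr - r->start < r->size)
            return r;
    }
    return NULL;
}

/* Check the expression base + offset (in bytes). It is valid only if base
 * lies in a known region and the result stays in that same region, where
 * pointing one past the end is allowed. Returns 1 if valid, 0 otherwise. */
int pointer_add_is_valid(const void *base, ptrdiff_t offset)
{
    const region *r = region_find((uintptr_t)base);
    if (r == NULL)
        return 0;
    ptrdiff_t pos = (ptrdiff_t)((uintptr_t)base - r->start);
    ptrdiff_t size = (ptrdiff_t)r->size;
    return offset >= -pos && offset <= size - pos;
}
```

Note that the pointer itself remains an ordinary C pointer; all bookkeeping is held in the separate table, which is what allows checked and unchecked code to be mixed.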

6.3 OS Kernel Stack execution privilege:


The most centralized approach to preventing some stack smashing vulnerabilities
involves modifying the operating system kernel so that the code segment limit does not cover
the actual stack space. This approach effectively removes execution permission from the
stack. It has a fundamental advantage over other countermeasures: as the most
centralized method of limiting stack smashing vulnerabilities, it requires no recompilation
of the C libraries or of the compiler itself; only the operating system kernel need
be recompiled. A practical implementation of this concept on the Linux operating system is
described below; this description touches on the details of implementation as well as
some of the problems.
To remove stack execution privilege in UNIX, the memory region that the operating
system allocates for the stack is marked as non-executable. Stack smashing exploits depend
on an executable stack, because they return into a memory address on the stack that holds
code which executes an interactive shell. By removing this
functionality from the system, some stack smashing vulnerabilities can be stopped. A
patch removing stack execution permission was written for the Linux operating system.
This patch involved changing the code segment limit in the kernel using a new descriptor, so
that the segment does not cover the actual stack space, effectively removing stack execution
privilege. Although the patch is not difficult to compile into a kernel and test, one must be
aware of the potential difficulties with this method. First, nested functions, which gcc
implements using trampoline code placed on the stack, do not work properly with patched
kernels.
Furthermore, signal handler returns in the Linux operating system require an executable
stack, and signal handlers are absolutely crucial in an operating system. A system with a
non-executable stack also hinders Objective C development efforts, and functional
languages might be affected as well. Moreover, every program already contains code that
performs fundamental operations such as saving and restoring values from CPU registers
and performing system calls. An attacker can overwrite the return address so that it points
at such existing code rather than at code injected onto the stack. In contrast to the
stack smashing exploits available today, an
attack such as this would be impossible to prevent by changing the stack execution
privilege. In other words, removing the stack execution permission only prevents today's
stack smashing exploits from working properly. As exploits become more sophisticated,
stack execution bits may have little or no relevance to the exploit. As an aside,
this type of protection can also be implemented in CPU hardware. New system
architectures could simply have multiple stacks: one for call frames, and one for
automatic storage.
In conclusion, by removing stack execution in the system kernel,
one can attempt to stop the stack-smashing problem at the source. However, this
approach suffers in implementation because the necessary code is non-portable, and because
the behavior of standard compiler features and of operating system signal handling is
modified and may become unpredictable. In addition to these points, this approach is not
proven to stop more sophisticated stack smashing exploits.
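On a Linux system one can observe whether the stack of a process is currently mapped as executable. The following sketch assumes the usual layout of the /proc/self/maps file, in which each line starts with an address range followed by a four-character permission field such as rw-p or rwxp, and in which the stack mapping is labelled [stack]. The function names are our own, and the sketch only inspects the mapping; it does not implement the kernel patch discussed above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Inspect one line of /proc/<pid>/maps. Returns 1 if it describes a stack
 * mapping with execute permission, 0 if it describes a stack mapping
 * without it, and -1 if the line is not the stack line or cannot be
 * parsed. */
int stack_line_is_executable(const char *line)
{
    char perms[5];

    assert(line != NULL);
    if (strstr(line, "[stack]") == NULL)
        return -1;
    /* Skip the address range, then read the permission field. */
    if (sscanf(line, "%*s %4s", perms) != 1 || strlen(perms) != 4)
        return -1;
    return perms[2] == 'x';
}

/* Scan the mappings of the current process. Returns 1 or 0 as above, or
 * -1 if /proc/self/maps is unavailable or has no [stack] entry. */
int current_stack_is_executable(void)
{
    char line[512];
    int result = -1;
    FILE *maps = fopen("/proc/self/maps", "r");

    if (maps == NULL)
        return -1;
    while (fgets(line, sizeof line, maps) != NULL) {
        int r = stack_line_is_executable(line);
        if (r != -1) {
            result = r;
            break;
        }
    }
    fclose(maps);
    return result;
}
```

A result of 0 from current_stack_is_executable() means that code injected into a stack buffer could not be executed directly, although, as noted above, attacks that return into existing code are not affected.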

6.4 Safer C library support

A much more robust alternative would be to provide safe versions of the C
library functions on which the attack relies to overwrite the return address. This idea
seems to have occurred independently to several people. Alexander Snarskii seems to
have been the first to think of it. He implemented it for the FreeBSD version of Unix
and offered it to the development group of FreeBSD. His explanation of the method was
unfortunately a little obscure, and either he did not fully realize the true power of
his method or, if he did, he did not elaborate on it in his note. Thus Snarskii's
idea had less impact than it should have had. Baratloo, Tsai, and Singh from Bell Labs
independently rediscovered the idea, and wrote a much more substantial white paper
about it. This author also rediscovered this defense independently. The Bell Labs group
implemented safe versions of the vulnerable functions in a library called LibSafe, which
can be freely downloaded from their site.
Can we replace a vulnerable function in the C library by a safer version? We will discuss
the idea in terms of strcpy(), but it will become readily apparent that the method
generalizes to any of the other vulnerable string manipulation functions. At first sight a
safer version of strcpy() appears impossible because strcpy() does not know the size of
the buffer that it is copying into. So complete avoidance of overflowing the buffer is not
possible. Nonetheless, strcpy() has access to the dynamic chain on the stack, and
successive dynamic links are like bright markers delimiting the activation records of all
the currently active functions. The idea is to use this information to prevent strcpy() from
corrupting the return address or the dynamic link fields.

Table 1: Some of the Problematic C Functions

Using these markers and the address of the buffer itself, strcpy() can first determine
which activation record contains the buffer, or else that the buffer is not on the stack at
all. To do this strcpy() finds the interval [a,b] of consecutive dynamic links which
contains the buffer. The cases in which the buffer is either below the first activation
record on the stack, or above the last activation record, can be handled as special cases
with appropriate values of either a or b. Once the values of a and b are determined, we
can compute an upper bound on the size of the buffer. For example, if the buffer grows
towards the bottom of the stack then |buffer - a| is an upper bound on the size of the
buffer. This can be used by strcpy() to limit the length of the copied string so that
neither the dynamic link nor the return address is overwritten. Furthermore, strcpy()
can detect an attempt to do so, report the problem to syslog, and safely terminate the
application.
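The bound computation described above can be sketched on a model of the stack. In this sketch the addresses of the dynamic links are passed in as an array instead of being read from the real stack, the stack is assumed to grow towards lower addresses, and the names max_buffer_size() and strcpy_bounded_by_frame() are hypothetical; the actual LibSafe code differs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* links[0..n-1] are the addresses of the dynamic links (saved frame
 * pointers) of the active functions, innermost first, so they are in
 * increasing order on a downward-growing stack. stack_low is the lowest
 * address currently in use by the stack.
 * Returns the number of bytes between buf and the first dynamic link above
 * it, which is an upper bound on the size of the buffer, or SIZE_MAX if
 * buf is not on the modeled stack and no bound is known. */
size_t max_buffer_size(uintptr_t buf, uintptr_t stack_low,
                       const uintptr_t *links, size_t n)
{
    assert(links != NULL || n == 0);
    if (buf < stack_low)
        return SIZE_MAX;            /* heap or static data */
    for (size_t i = 0; i < n; i++) {
        if (buf < links[i])
            return (size_t)(links[i] - buf);
    }
    return SIZE_MAX;                /* above the outermost frame */
}

/* Copy src to dst only if the string and its terminator fit within limit
 * bytes. Returns 0 on success and -1 if the copy would reach the dynamic
 * link; the library described in the text would log the attempt and
 * terminate the program at this point. */
int strcpy_bounded_by_frame(char *dst, const char *src, size_t limit)
{
    size_t len = strlen(src);

    if (limit != SIZE_MAX && len >= limit)
        return -1;
    memcpy(dst, src, len + 1);
    return 0;
}
```

For instance, with dynamic links modeled at 0x7000, 0x7040 and 0x7100, a buffer at 0x7010 lies in the second activation record and can hold at most 0x30 bytes before the dynamic link at 0x7040 would be overwritten.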
LibSafe does not replace the standard C library. The method relies instead on the loader
searching LibSafe before the standard C library, so that the safe functions are used
instead of the standard library functions. This scheme is more flexible than replacing the
functions in the C library itself. For example, it is possible to have one program use the
C library functions and another use the LibSafe versions. By setting appropriate
environment variables LibSafe can be installed as the default library. But from a security
perspective, there seems to be little reason to keep the vulnerable functions installed on
the system, so the usefulness of this extra flexibility is somewhat questionable.
This defense has several advantages. It is effective against all stack smashing buffer
overflow attacks in which the target program uses one of the vulnerable C
library functions to copy into the buffer. The method does not totally prevent buffer
overflows. It can't, because it does not know the true size of the buffer. It is still possible
to overflow areas between the buffer and the dynamic link. But the critical return address
and the dynamic link fields are protected from being overwritten.
The method fails to provide any protection against heap based buffer overflow attacks
(see below), or attacks which do not need to hijack control by overwriting the return
address. Both of these kinds of attack, however, are much harder to pull off, and
consequently much rarer. The method would also fail to protect a program that does not
use the standard C library functions to copy into the buffer. For example, if the target
program contains custom code to copy the string into the buffer it will not be protected.
However, it seems clear that few programs will have such custom code. Generally
speaking it is considered to be bad programming practice to "reinvent the wheel", so
programmers are encouraged to use the standard libraries.
Though programs that rely on custom code may contain buffer overflow vulnerabilities
just as much as those that use the standard C library, they will be less likely to be
detected. Because of this they will enjoy some immunity from attack. This is security
through obscurity, which in general is not a good way to secure a system. Nonetheless it
is of some security value.
The overhead of the safe functions is negligible, and the cost of installing the library and
configuring the system to use it is very low. Another advantage is that it works with the
binaries of the target programs, and does not require access to their source code. Finally, it
can be deployed without having to wait for the vendor to react to security threats, which
is a very desirable feature. It is a much more robust defense than disabling stack
execution. Though we have discussed variants of attacks against which it will offer no
protection, it is very effective against the class of attacks that it is designed for, and it
cannot be easily circumvented. The attacker has no way of interfering with the detection
of the buffer overflow attack, because this occurs before the attacker has a chance to
hijack control. We conclude that overall, this defense offers a very significant
improvement in the security of a system at very low cost. In our opinion it is a sure
winner.
We also mention Andrey Kolishak's BOWall protection. This is available for Windows
NT systems, with full source. This solution has some similarities both to the safer library
approach and to the methods to be presented in the next Section.
Kolishak's approach is similar to the others in this Section, because it works by replacing
the DLLs that contain the vulnerable library functions with a safer library version.
However, unlike LibSafe or Snarskii's method, it seems to be a buffer overflow detection
system, which is more similar to the methods of the next Section. It works by saving the
return address when the function is entered, and checking it before actually returning. If
corruption of the return address is detected the function does not return, so hijacking of
control is prevented. Kolishak also has a second component of BOWall which relies on some
specific Windows NT security features.
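The save-and-check idea can be sketched as follows. This is not BOWall's code: the stack frame is modeled by a hypothetical structure instead of the real machine stack, and guard_enter(), guard_leave() and model_unchecked_copy() are names invented for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SHADOW_DEPTH 64

/* Private copies of return addresses, kept away from the buffers. */
static uintptr_t shadow[SHADOW_DEPTH];
static size_t shadow_top;

/* Hypothetical model of a stack frame: a local buffer followed by the
 * saved return address. Real frames are laid out by the compiler. */
struct model_frame {
    char buffer[8];
    uintptr_t return_address;
};

/* On function entry: remember the return address. Returns 0 on success,
 * -1 if the shadow storage is full. */
int guard_enter(uintptr_t return_address)
{
    if (shadow_top == SHADOW_DEPTH)
        return -1;
    shadow[shadow_top++] = return_address;
    return 0;
}

/* Just before returning: returns 1 if the return address is unchanged and
 * 0 if it was modified (or no entry was recorded), in which case the
 * function must not return through it. */
int guard_leave(uintptr_t return_address_now)
{
    if (shadow_top == 0)
        return 0;
    return shadow[--shadow_top] == return_address_now;
}

/* Models an unchecked copy into frame->buffer: bytes beyond the buffer
 * spill into the rest of the modeled frame. The copy is clamped to the
 * size of the model so that the sketch itself stays well defined. */
void model_unchecked_copy(struct model_frame *frame, const void *data, size_t n)
{
    assert(frame != NULL && data != NULL);
    if (n > sizeof *frame)
        n = sizeof *frame;
    memcpy(frame, data, n);
}
```

Unlike the bound computation used by LibSafe, this check does not stop the overflow itself; it only notices the damage before control is transferred to the corrupted address.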

7. Future Scope
• None of the countermeasures is perfect.
• The earlier stack overruns are addressed in the design process, the better.
• Systematic work on removing security relevant buffer overflows is a relatively
recent effort.
• Further research on formal methods for software security is needed.

8. Conclusion:
Stack smashing attacks are among the most common ways to gain privileged access to a
UNIX system. Prevention of these attacks is one of the primary concerns of the
OS and networking community. The expertise of programmers who write privileged code,
as well as that of the UNIX gurus, will be crucial in building operating systems and software
that are resistant to buffer overflow attacks. With the combined efforts of these different
groups, stack smashing and indeed all other buffer overflow vulnerabilities can be defeated.

9. References:
1. Stefan Axelsson, A Comparison of the Security of Windows NT and UNIX, 1998
http://www.securityfocus.com/data/library/nt-vs-unix.pdf

2. Arash Baratloo, Timothy Tsai, and Navjot Singh, Libsafe: Protecting Critical
Elements of Stacks
http://www.securityfocus.com/library/2267
http://www.bell-labs.com/org/11356/libsafe.html

3. Bulba and Kil3r, Bypassing StackGuard and Stackshield, Phrack Magazine 56 No
5, 1999.
http://phrack.infonexus.com/search.phtml?view&article=p56-5

4. Crispin Cowan, Perry Wagle, Calton Pu, Steve Beattie, and Jonathan Walpole,
Buffer Overflows: Attacks and Defenses for the Vulnerability of the Decade, in
DARPA Information Survivability Conference and Expo 2000.
http://www.cse.ogi.edu/DISC/projects/immunix/publications.html
http://www.securityfocus.com/library/1674

5. David Curry, Improving the Security of your Unix System, 1990
http://www.securityfocus.com/library/1913

6. Drew Dean, Edward W. Felten, and Dan S. Wallach, Java Security: From HotJava
to Netscape and Beyond, in Proc. of the IEEE Symp. on Security and Privacy,
1996
http://www.cs.princeton.edu/sip/pub/secure96.html

7. Casper Dik, Non-Executable Stack for Solaris, posted to comp.security.unix
January 2, 1997.
http://x10.dejanews.com/

8. DilDog, The TAO of Windows Buffer Overflow, 1998
http://www.cultdeadcow.com/cDc_files/cDc-351/
