Exploring Buffer Overflows In C, Part One: Theory
Cybersecurity is one of the fastest-evolving tech fields, and the stakes are high. Mistakes can cost on the order of millions of dollars. Computers have invaded all aspects of our everyday lives. Although this means I can access millions of cat pictures with the touch of a button, it is dangerous to assume that everyone using a computer is in it for the fuzzy felines. Credit cards, passwords, and social security numbers are moving across the internet just as quickly as cat pictures, but with a lucrative black market behind them. There is a lot to gain from a successful hack, and hackers will be doing their best to break into the systems we rely on and use daily. Attacks range from high-tech exploits such as 2018’s Spectre and Meltdown to low-tech ones like phishing and social engineering, so it is important for us as developers to be aware of the ways malicious actors can gain access to our systems.
Our daily development work is done in a world of abstraction. The software we create today relies on existing software and technology. We aren’t starting from scratch every time we begin building an application. We don’t often have to worry about the fine points of memory management or the individual packets we send with every HTTP request, but because these are the foundations of modern computing, a vulnerability in one level makes everything above it vulnerable. In this post, we’re going to leave our comfy world of IDEs and automatic garbage collection and take a deep dive into the world of buffer overflows.
First documented in 1972, buffer overflows have been used in several high-profile computer security incidents, such as the Morris Worm (1988) and Code Red (2001). The Morris Worm was one of the first computer worms to receive worldwide media attention, and one of the ways it spread was a buffer overflow in a common Unix program, the finger daemon (fingerd). Reports say about 6,000 computers were infected (which in 1988 was roughly 10% of the entire internet). Code Red also spread by exploiting a buffer overflow vulnerability; this worm, however, targeted unpatched versions of IIS, Microsoft’s web server. At its peak, 359,000 computers were infected. Both of these worms severely disrupted the internet at the time and required thousands of hours to quarantine and repair the damage.
A buffer overflow occurs when a program tries to put too much data in a reserved area of memory. These reserved areas are called buffers. Much like when you over-fill a coffee cup, anything that can’t fit has to go somewhere. The coffee “overflows” onto the counter, covering anything nearby. The same idea applies to computer memory – memory next to the buffer will be overwritten by whatever couldn’t fit! But what are the implications of this? How do we go from a messy counter top to a computer worm? As I mentioned, anything we can’t fit into this buffer will overwrite the memory next to it. What if something next to the buffer was important? What if we could replace some data that already exists in the program with our own custom data? We could change variables to whatever we want! This is where the danger (and fun) hides. A malicious user has the power to modify what our program does. To understand how that could be achieved we have to put our Computer Science 101 hats on and review what we’ve learned about memory.
Computer memory is where any currently running processes store information such as variables, as well as where our code sits, broken down into the smallest individual steps that the computer understands (referred to as instructions). Picture memory as a tower of Jenga blocks. At the bottom, you have block number 0, the block above that is block number 1, and so on. Each of these numbers is referred to as the address for that block of memory. With a memory address we can quickly go to that location in memory and see what is stored there at any time (provided that the memory we are accessing does not belong to a different program). Being able to jump straight to a given address lets memory be organized without us having to worry about where information is physically stored. Almost like jotting down directions to your friend’s house, we only have to remember the address we want to visit in the future.
Memory is organized into different sections each with a specific purpose. For the purpose of this series, we are going to talk about the text section and the stack. The text section is where the computer will store any executable instructions (our code). The program counter (PC in Diagram 1) is the memory address of the next instruction to be executed. Each time a process executes an instruction, the program counter is incremented or modified to point to the next instruction. Our code doesn’t necessarily run top to bottom – we have loops, conditional statements, and functions we need to jump away to. The program counter is what makes this possible by keeping track of where we are and what we are currently doing.
The stack is effectively the “working memory” of the computer. This is where the computer keeps track of the functions that are currently being executed and the values they are working with. The stack starts at the highest memory address and grows downwards towards lower memory addresses. Each time we enter a function we place data on the top of the stack and each time we return from a function we remove that data from the top of the stack. The stack pointer (SP in Diagram 1) always points to the top of the stack, that is, to the most recently added data, and it adjusts each time we add or remove data.
The data added to the stack is actually a group of data related to the function that is currently being executed. This is where any local variables, function arguments, etc. will be stored for the duration of the function. This group of data is called a stack frame. At a minimum, each frame will store the return address: the address of the instruction that should be executed when the current function is complete. When our code jumps to another function we need some way to hold our place. This is the job of the return address in the stack frame, to temporarily bookmark our original position. When we finish a function and remove the relevant stack frame, the program counter is set to the return address stored in the frame we just removed. This way, the subsequent instruction in our code is executed no matter where it is physically stored in memory.
Diagram 1: Memory layout
That’s fine and dandy but how does this let me hack a computer?
So we know we can overflow a buffer and overwrite adjacent data, we know where buffers are stored in memory, and we know what that memory looks like. But what is the exploit? Since the buffer we are going to overflow is stored on the stack in the stack frame of its function, the scope of what we can manipulate is somewhat limited. Our coffee will only spill onto the counter near the mug. Whatever is near our mug will be the first to be covered with coffee. Stored near our buffer is the return address: the place from which the computer will read its next instructions. We’ll be able to overflow the buffer and change what the program tries to do!
Earlier I mentioned that we cannot access memory that doesn’t belong to our program. There is a layer of security between programs and the physical memory that prevents us from snooping into what other programs are doing. We also cannot modify the text section – it is usually read-only to prevent currently running code from being modified. At first glance, it seems our options for what we can do with the return address are pretty limited. Any memory that belongs to the currently running program is still fair game though. If only we could insert our own instructions into our program! If we knew that somewhere in memory there were instructions we wanted to run next, we could replace the return address with the address of the desired instructions.
As it turns out, this is what we are going to fill the buffer with! We can fill this buffer with instructions for the computer to open a new shell (or terminal, or command prompt; the demo in Part Two will be on Linux, so we will focus on the shell). We would have almost full control of the computer if we could open a shell whenever we wanted. All programs run under a set of permissions and any new program spawned by an existing program will have those same permissions. If the program we are attacking is running with root or admin privileges, our new shell would also have those privileges. Having admin access on the target computer would let us create accounts, install new software (such as a virus), or just delete everything!
When the program prompts the user for some sort of input without putting a limit on how much data can be received, we can overflow the buffer and exploit this vulnerability with our carefully crafted input data. We need the contents of our buffer to look just like the text section of memory, so we will have to store the instructions of whatever code we want to run. We don’t have to worry about the specifics of this until Part Two, however. Just know that the injected code will be the instructions understood by the computer to open a shell. If we change the return address to be the memory address of our buffer, which is where our injected code is stored, then the computer will assume that whatever code we injected is what needs to be executed next (see Diagram 2). We trick the computer into thinking it is running the next portion of the program but instead we provide our own instructions!
Diagram 2: Overflowing a buffer with injected code and replacing the RA (return address) with the injected code’s address.
Once a hacker gains access to a shell with root access, they have free rein until they are detected. According to Nuix’s 2018 Black Report, a survey of known hackers, it takes just 15 hours for the majority of hackers to break in, identify valuable data, and steal it. That is an incredibly small window of time to detect the attack, let alone isolate and resolve the issue. Buffer overflows require some technical knowledge but, compared to other exploits, are not very sophisticated and are comparatively easy to pull off. But hope is not lost! They are easy to prevent with a combination of operating system level security and safe coding practices. Developers need to be aware of how these vulnerabilities work and what we can do to prevent them. In Part Two I will talk about some of the ways the operating system prevents buffer overflow attacks, how developers can safeguard their own code, and most importantly we will walk through how to actually exploit this vulnerability!