In the name of God, the Most Gracious, the Most Merciful

What is a Buffer Overflow

A buffer overflow is a software vulnerability that occurs when a program writes more data into a fixed-size memory buffer than it was designed to hold. The excess data spills into adjacent memory regions, potentially overwriting critical program state.

This can result in application crashes, data corruption, or, most critically, arbitrary code execution through redirection of the program’s control flow.

Attackers exploit buffer overflows by supplying more data than a program's buffer can hold. The overflow alters how the program runs, which can damage files or expose sensitive information. For example, an attacker might inject code that gives them control over the system.

If attackers understand a program's memory layout, they can craft input that is too large for the buffer and overwrite memory that holds control data, replacing it with values of their choosing. For example, they might overwrite a pointer, a value that holds the address of something else in memory, so that it points to their own malicious code, taking control of the program.

// Character buffer
char buffer[64];   // 63 chars + null terminator

// Integer buffer
int data_buffer[256];
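
To make "spills into adjacent memory" concrete, here is a minimal sketch in C. It is a well-defined simulation rather than a real overflow: the hypothetical struct record places a flag directly after an 8-byte name field, and the hypothetical helper raw_fill copies bytes over the struct as a whole, so any byte beyond the eighth lands in the neighboring field. The struct, its field names, and the helper are assumptions made for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical record: an 8-byte buffer followed directly by a flag. */
struct record {
    char name[8];
    unsigned char is_admin;
};

/* Copies n raw bytes over the start of the record. The copy stays inside
 * the struct as a whole, so it is well defined, yet bytes past index 7
 * no longer land in `name`: they land in `is_admin`. */
void raw_fill(struct record *r, const unsigned char *src, size_t n)
{
    assert(r != NULL && src != NULL);
    assert(n <= sizeof *r);   /* never leave the enclosing object */
    memcpy(r, src, n);
}
```

Filling a zeroed record with nine 'A' bytes leaves is_admin holding 0x41 even though the caller only meant to set the name. A real overflow does the same thing to whatever happens to sit next to the buffer, without the safety net of an enclosing struct.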

Types of Buffer Overflows

1. Stack-Based Buffer Overflow

This is the most common and historically significant buffer overflow type. It occurs in stack memory and allows attackers to overwrite local variables, saved frame pointers, or return addresses.

Occurs in the call stack
Overwrites return addresses and control data
Directly enables control-flow hijacking

#include <string.h>   // for strcpy

void vulnerable_function(char *input)
{
  char buffer[64];          // Stack buffer
  strcpy(buffer, input);    // No bounds checking: overflows if input exceeds 63 characters
}

High Memory Addresses
┌────────────────────┐
│  Function A Frame  │
├────────────────────┤
│  Return Address    │ ← Hijack target
├────────────────────┤
│  Saved EBP         │
├────────────────────┤
│  Local Variables   │
├────────────────────┤
│  buffer[64]        │ ← Overflow starts here
└────────────────────┘
Low Memory Addresses
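
For contrast with the unchecked strcpy pattern above, the following is a minimal sketch of a length-checked copy. The function name copy_checked and its return convention are assumptions chosen for this illustration, not part of any standard API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copies `input` into `dst` only if it fits, terminator included.
 * Returns 0 on success, or -1 (leaving dst as an empty string) when
 * the input is too long for a destination of `cap` bytes. */
int copy_checked(char *dst, size_t cap, const char *input)
{
    assert(dst != NULL && input != NULL && cap > 0);

    size_t len = strlen(input);
    if (len >= cap) {          /* no room for len characters plus '\0' */
        dst[0] = '\0';
        return -1;
    }
    memcpy(dst, input, len + 1);   /* +1 copies the terminator too */
    return 0;
}
```

A caller would pass sizeof(buffer) as cap, for example copy_checked(buffer, sizeof buffer, input), and treat -1 as a rejected input. Rejecting rather than silently truncating is one possible design choice; truncation can be acceptable when the data is purely for display.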

2. Heap-Based Buffer Overflow

Heap-based buffer overflows occur in dynamically allocated memory regions. Instead of overwriting return addresses directly, they corrupt heap metadata or adjacent objects, enabling advanced exploitation techniques.

Affects memory allocated via malloc, new, etc.
Targets heap structures and object pointers
Common in long-running services and daemons

char *buffer = (char *)malloc(64);   // 64-byte heap allocation
strcpy(buffer, large_input);   // Overflows into adjacent heap memory when large_input exceeds 63 characters
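
One way to avoid the fixed-size heap copy shown above is to size the allocation from the input itself and to cap the accepted length. The sketch below assumes a hypothetical helper dup_bounded; the name and the max_len policy are illustrative.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Returns a heap copy of `input`, or NULL if the input is longer than
 * max_len characters or the allocation fails. The caller must free()
 * the result. */
char *dup_bounded(const char *input, size_t max_len)
{
    assert(input != NULL);

    size_t len = strlen(input);
    if (len > max_len) {
        return NULL;                 /* refuse oversized input */
    }
    char *copy = malloc(len + 1);    /* exactly enough, terminator included */
    if (copy == NULL) {
        return NULL;
    }
    memcpy(copy, input, len + 1);
    return copy;
}
```

Because the allocation is derived from the measured length, the copy cannot run past the end of the block, and the cap keeps a hostile input from forcing an arbitrarily large allocation.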

Understanding the Runtime Stack

Call Stack, Stack Frames, Control Flow

To understand stack-based buffer overflow attacks, you first need a clear picture of how a running process uses the stack to manage execution.

What “the stack” means

A call stack is a region of memory that a running program uses to keep track of its active function calls. It is organized as a sequence of stack frames, each holding a function's local variables together with the bookkeeping needed to resume the caller, such as the saved frame pointer and the return address. Each time a function is called, a new frame is pushed onto the stack, and the frame of the function currently executing is at the top of the stack.

Stack Pointer (SP): Points to the top of the process call stack.
Instruction Pointer (IP): Points to the address of the next CPU instruction to be executed.
Base Pointer (BP): Points to the base of the current stack frame.
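
The idea that every active call owns its own frame can be observed from C itself. The sketch below uses a hypothetical function record_frames that recurses a few levels and records the address of one local variable per level; the names and the depth limit are assumptions for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_DEPTH 4

/* Records, for each nesting depth, the address of a local variable.
 * Every active call has its own frame, so every address is distinct. */
void record_frames(uintptr_t addrs[MAX_DEPTH], int depth)
{
    char marker = 0;                       /* lives in this call's frame */

    assert(depth >= 0);
    if (depth >= MAX_DEPTH) {
        return;
    }
    addrs[depth] = (uintptr_t)(void *)&marker;
    record_frames(addrs, depth + 1);       /* the nested call gets a new frame */
}
```

On most mainstream platforms the recorded addresses decrease as the depth increases, because the stack grows toward lower addresses. The C standard does not promise that, so the sketch relies only on the addresses being distinct.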

Why it matters

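Stack frames hold both data (local variables) and control information (the saved frame pointer and the return address) side by side. When a local buffer overflows, the excess bytes can reach that control information and redirect execution.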

High Addresses
┌───────────────┐
│ Return Addr   │ ← control
├───────────────┤   flow target
│ Saved Frame   │
├───────────────┤
│ Locals        │
├───────────────┤
│ buffer[...]   │ ← overflow
└───────────────┘   starts here
Low Addresses

Vulnerable C Example: Stack Buffer Overflow

This example shows a classic mistake: copying user-controlled data into a fixed-size stack buffer without checking the input length. If the input exceeds the buffer size, it overwrites adjacent stack memory and may corrupt control data (depending on build settings and mitigations).

Vulnerable

No bounds check

              
#include <stdio.h>

static void hi(void);
int main(void)
{
  for (;;)
  {
    hi();
  }
}

static void hi(void)
{
  char buffer[5];            // tiny stack buffer: only 5 bytes total
  int i = 0;
  int ch;

  puts("Say something:");

  /* Read until newline/EOF, but never check i against buffer size */
  while ((ch = getchar()) != '\n' && ch != EOF) 
  {
    buffer[i++] = (char)ch;  // ❌ writes past buffer when i >= 5
  }

  buffer[i] = '\0';          // ❌ also out-of-bounds if i >= 5

  printf("You said: %s\n", buffer);
  printf("\n");
}

Why this code is vulnerable

The buffer has a fixed size of 5 bytes, which means it can store at most 4 characters plus a null terminator.
The input loop continuously writes characters into buffer using buffer[i++] without checking whether i is still within the valid range of the buffer.
If the user types more than 4 characters before pressing Enter, subsequent writes occur past the end of the buffer, overwriting adjacent stack memory.
The null terminator assignment (buffer[i] = '\0') can also write out of bounds when i >= sizeof(buffer), causing additional memory corruption.
Because the buffer lives on the stack, this overflow can corrupt nearby local variables, saved frame pointers, or even the function’s return address.

Safer approach

Bounds enforced, Length checked, Overflow prevented

              
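#include <stdio.h>

static int hi(void);
int main(void)
{
  /* Keep prompting until input ends; stopping at EOF avoids looping forever. */
  while (hi())
  {
  }
  return 0;
}

/* Reads one line into a 5-byte buffer; returns 0 once EOF is reached. */
static int hi(void)
{
  char buffer[5];
  int i = 0;
  int ch;
  printf("Say something: ");
  while ((ch = getchar()) != '\n' && ch != EOF) 
  {
    if (i < (int)sizeof(buffer) - 1) 
    {
      buffer[i++] = (char)ch;   // room left: store the character
    }
    // Otherwise the buffer is full: keep reading to discard extra input
    // until newline/EOF so the next call starts clean.
  }
  buffer[i] = '\0';             // i is at most 4 here, so this index is valid
  printf("You said: %s\n", buffer);
  printf("\n");
  return ch != EOF;
}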

Why this approach is safe

The buffer has a fixed, known size (char buffer[5]), which allows the program to strictly control how many bytes may be written.
Every write into the buffer is guarded by a bounds check (i < (int)sizeof(buffer) - 1), ensuring there is always space reserved for the null terminator.
Once the buffer is full, additional input characters are deliberately discarded instead of being written to memory, preventing any overwrite of adjacent stack data.
The string is always null-terminated at a valid index, guaranteeing that string operations such as printf("%s") remain well-defined.
Because no writes occur outside the buffer boundary, nearby stack variables, saved frame pointers, and return addresses remain intact.
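
The same guarantees can also be obtained from the standard library instead of a hand-written loop. The following sketch uses fgets, which never writes more than the given number of bytes, and then drains whatever part of the line did not fit. The function name read_line_bounded and its return convention are assumptions for this illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>

/* Reads one line from `in` into buf (capacity `cap`, terminator included).
 * Characters that do not fit are discarded up to the end of the line.
 * Returns 1 if a line was read, 0 if EOF was reached with nothing read. */
int read_line_bounded(FILE *in, char *buf, size_t cap)
{
    assert(in != NULL && buf != NULL);
    assert(cap >= 2 && cap <= INT_MAX);

    if (fgets(buf, (int)cap, in) == NULL) {
        buf[0] = '\0';
        return 0;
    }
    size_t len = strlen(buf);
    if (len > 0 && buf[len - 1] == '\n') {
        buf[len - 1] = '\0';           /* whole line fit: strip the newline */
    } else {
        int ch;                        /* line was longer: drain the rest */
        while ((ch = getc(in)) != '\n' && ch != EOF) {
            /* discard */
        }
    }
    return 1;
}
```

With a 5-byte buffer, the input line "hello world" yields "hell", and the next call starts cleanly at the following line, mirroring the behavior of the loop-based version.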

Common C/C++ Functions Vulnerable to Buffer Overflow

The following C/C++ standard library functions are inherently dangerous when used with attacker-controlled or unbounded input. They perform little to no bounds checking and can easily overwrite adjacent memory, leading to stack or heap corruption.

Direct copy / input

gets
strcpy
strcat
sprintf

Formatted input

scanf
fscanf
vscanf
vfscanf
vsscanf

String read / copy

vsprintf
streadd
strecpy

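Important: These functions are not dangerous only when misused; they are dangerous by design. None of them takes the size of the destination buffer as a parameter, so nothing stops a write past its end unless the programmer enforces strict external checks (for the scanf family, an explicit field width such as %63s).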
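
As a rough illustration of the bounded counterparts, the sketch below replaces sprintf with snprintf, which takes the destination size, and gives sscanf an explicit field width. The helper names format_user and parse_word, and the "user=" format, are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Formats "user=<name>" into dst without ever writing more than cap bytes.
 * Returns 0 if the whole string fit, -1 if it was truncated or failed. */
int format_user(char *dst, size_t cap, const char *name)
{
    assert(dst != NULL && name != NULL && cap > 0);

    int n = snprintf(dst, cap, "user=%s", name);
    return (n >= 0 && (size_t)n < cap) ? 0 : -1;
}

/* Extracts the first whitespace-delimited word from `line`.
 * The width 7 in "%7s" leaves room for the terminator in out[8]. */
int parse_word(const char *line, char out[8])
{
    assert(line != NULL && out != NULL);

    return sscanf(line, "%7s", out) == 1 ? 0 : -1;
}
```

snprintf returns the length the full output would have had, which is what makes truncation detectable. The field width passed to the scanf family must always be one less than the buffer size, since the terminator is written in addition to the matched characters.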

Typical Stack Buffer Overflow Exploitation Workflow

This section outlines the general methodology attackers historically followed when exploiting stack-based buffer overflows. The goal is to explain the reasoning behind each phase, not to provide an exploit recipe.

1. Fuzzing Input Size

Attackers begin by supplying inputs of increasing size to determine when the application crashes. A crash confirms that user-controlled data has exceeded a memory boundary and that the vulnerable code path is reachable.
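
The same length-sweep idea is useful defensively, as a test that a bounded routine rejects input exactly where it should. The sketch below assumes a hypothetical routine store_input with a 16-byte destination and a hypothetical driver first_rejected_length; both names and the CAP value are illustrative.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define CAP 16   /* hypothetical destination size */

/* Hypothetical routine under test: accepts input only if it fits,
 * terminator included, in a CAP-byte buffer. */
int store_input(const char *input)
{
    char dst[CAP];
    size_t len = strlen(input);

    if (len >= sizeof dst) {
        return -1;
    }
    memcpy(dst, input, len + 1);
    return 0;
}

/* Feeds inputs of length 0..max_len to store_input and returns the
 * first length that is rejected, or max_len + 1 if none is. */
size_t first_rejected_length(size_t max_len)
{
    char input[256];

    assert(max_len < sizeof input);
    for (size_t len = 0; len <= max_len; len++) {
        memset(input, 'A', len);
        input[len] = '\0';
        if (store_input(input) != 0) {
            return len;
        }
    }
    return max_len + 1;
}
```

For a correctly bounded routine the first rejected length equals the buffer capacity. A sweep that instead ends in a crash points at a missing or off-by-one check.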

2. Identifying Control Over Execution

After a crash is reproducible, attackers analyze whether critical control data, such as the instruction pointer, has been overwritten with user-controlled values.

3. Calculating the Exact Offset

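Attackers then determine the exact offset, that is, how many bytes of input lie between the start of the buffer and the critical control data. On 32-bit x86, for example, this is the number of bytes needed to reach the saved return address, whose value is loaded into the instruction pointer (EIP) when the function returns.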

4. Locating Usable Memory Space

At this stage, attackers evaluate where their input resides in memory and how much contiguous space is available. This determines whether additional data can be stored reliably after the overflow occurs.

5. Identifying Bad Characters

Many programs modify or reject certain characters during input handling. Attackers test how different byte values behave to learn which ones terminate or alter the input (a null byte, for example, ends any copy made with strcpy), and they avoid those bytes when creating a payload.
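
The effect of a terminating byte is easy to show for string functions. The sketch below uses a hypothetical helper, bytes_surviving_strcpy, that reports how many bytes of a raw input would be copied by a strcpy-style routine, which stops at the first zero byte.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns how many of the n raw bytes a strcpy-style copy would carry
 * over: everything before the first zero byte, or all n bytes if the
 * data contains no zero byte. */
size_t bytes_surviving_strcpy(const unsigned char *data, size_t n)
{
    assert(data != NULL);

    const unsigned char *nul = memchr(data, 0, n);
    return nul != NULL ? (size_t)(nul - data) : n;
}
```

For the four bytes 0x41 0x42 0x00 0x43 the result is 2: everything from the zero byte onward is dropped. Other input paths have their own terminators, such as the newline for line-based readers.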

6. Evaluating Security Protections

Modern binaries include mitigations such as stack cookies, DEP/NX, ASLR, SafeSEH, and CFG. Attackers assess which protections are enabled and how they impact the feasibility of exploitation.
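
The principle behind a stack cookie can be modeled in a few lines. The sketch below is a toy model under stated assumptions, not how a compiler implements it: a hypothetical struct guarded places a known value right after the buffer, a deliberately careless write is simulated within the struct, and a check detects whether the value changed. Real cookies are random per process and are verified by compiler-generated code before a function returns.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define CANARY_VALUE 0xC0FFEE42u   /* hypothetical fixed value for the model */

struct guarded {
    char buf[8];
    uint32_t canary;               /* sits directly after buf */
};

void guarded_init(struct guarded *g)
{
    assert(g != NULL);
    memset(g->buf, 0, sizeof g->buf);
    g->canary = CANARY_VALUE;
}

/* Models an overflowing write: n is checked only against the whole
 * struct, not against buf, so bytes past index 7 hit the canary. */
void guarded_raw_write(struct guarded *g, const unsigned char *src, size_t n)
{
    assert(g != NULL && src != NULL && n <= sizeof *g);
    memcpy(g, src, n);
}

/* Returns 1 if the canary still holds its original value, 0 otherwise. */
int guarded_intact(const struct guarded *g)
{
    assert(g != NULL);
    return g->canary == CANARY_VALUE;
}
```

Writing 8 bytes leaves the canary intact, while writing 9 or more changes it, so the corruption is detected before any data beyond the canary is trusted. This is why a linear overflow that reaches the return address normally has to pass through the cookie first.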

7. Identifying Redirection Opportunities

If execution control is theoretically possible, attackers look for existing instruction sequences or stable code paths that could be reused to redirect execution without injecting new code.

8. Payload Strategy (High-Level)

In the end, attackers decide how to use controlled execution based on how memory is organized, what inputs are allowed, and what protections are in place. Whether a payload works, and how much damage it can do, depends heavily on the targeted system and its defenses. The main goal, however, is usually to take full control of that system.

Key takeaway: Exploiting a stack buffer overflow is a structured, multi-stage process. Modern mitigations often break this chain early, which is why many buffer overflows today result in crashes rather than reliable code execution.

Hands-On Practice Labs

Understanding buffer overflows conceptually is important, but real learning happens through controlled, ethical, and well-designed practice environments. The following labs are intended for educational and defensive learning purposes, allowing students to observe how memory corruption occurs and how modern protections affect exploitability.

Labs

Labs focusing on basic stack overflows, crashes, and understanding memory layout without bypassing advanced protections.

References & Further Reading

The following references provide authoritative explanations, historical context, and defensive insights related to buffer overflows, memory corruption, and modern exploitation mitigations. They are recommended for readers who want to deepen their understanding beyond this article.

Smashing the Stack for Fun and Profit (Aleph One, Phrack issue 49, 1996)

Hacking: The Art of Exploitation (Jon Erickson)

Tip: When studying memory corruption topics, prioritize sources that explain both the vulnerability mechanics and the defensive controls that mitigate them. This balanced approach leads to better secure coding practices.