SSRF Explained — How Server-Side Request Forgery Works, Attack Techniques, and Defenses

SSRF (Server-Side Request Forgery) is a vulnerability in which a web application's server fetches an attacker-supplied URL on the attacker's behalf. The essence is that the attacker uses the server as a pivot to reach internal resources they could not touch from outside. The server sits inside the trust boundary, so internal APIs, cloud metadata endpoints, corporate networks, and localhost all become reachable. The OWASP Top 10 added it as the standalone category A10:2021, and it continues to be the entry point for large-scale incidents such as Capital One's 100M-record breach (2019). This article covers the three types of SSRF, attack techniques such as URL parser bypasses and DNS rebinding, famous incidents, and layered defenses built on allowlists, IMDSv2, and egress filtering.

What SSRF is — using the server as a pivot #

Web applications often have a feature like "take a URL and fetch its contents on the server side". Preview generation, external API integration, webhooks, file ingestion, PDF rendering, image resizing, OG image fetching — common everywhere.

When the URL is passed straight to an HTTP client without validation, the attacker can force the server to hit any destination. Not just outside addresses like https://evil.example/, but also addresses that only the server itself should be able to reach from the inside — http://127.0.0.1:8080/admin, http://169.254.169.254/, and so on.

▸ What's possible the moment SSRF lands

Internal networks, cloud control planes, and adjacent services often treat the server as a trusted peer. SSRF borrows that trust wholesale. The attacker can now reach internal APIs, cloud metadata, Redis / MySQL admin ports, the corporate Confluence, devices on 192.168.x.x, etc., via the server — even if those endpoints were unreachable from their own IP. If XSS is "breaching the browser's trust boundary", SSRF is "breaching the server's trust boundary".

Don't confuse it with CSRF — the structure is reversed #

CSRF (Cross-Site Request Forgery) makes the victim's browser send a request to the target site. The attacker borrows the victim's cookies and authentication. SSRF, by contrast, makes the target site's server send a request to some other server. The attacker borrows the server's IP / IAM role / internal reachability. The victim's session has nothing to do with it. Both "forge a request", but the attack surface and defenses are completely different.

Placement in the OWASP Top 10 #

The 2021 edition added it as a standalone category, A10:2021 (Server-Side Request Forgery). Until then it had been scattered across "Injection" and "Broken Access Control", but as IAM theft via SSRF became common in the cloud era, it ranked first in the OWASP community survey and was separated out. It is now treated as a representative modern web vulnerability.

Why SSRF is so devastating in the cloud era #

SSRF has been a classic vulnerability for 20+ years, but only in the late 2010s — after cloud went mainstream — did it become widely seen as "critical". The reason is clear.

Cloud metadata endpoints #

AWS / GCP / Azure and other IaaS providers let each instance fetch short-lived credentials tied to its IAM role from a special local IP.

Cloud	Endpoint	What you can get
AWS	`http://169.254.169.254/latest/meta-data/`	IAM role name + temporary credentials (AccessKey/Secret/Token)
GCP	`http://metadata.google.internal/computeMetadata/v1/`	Service account token (requires `Metadata-Flavor: Google` header)
Azure	`http://169.254.169.254/metadata/instance`	Managed Identity token (requires `Metadata: true` header)
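Alibaba Cloud	`http://100.100.100.200/latest/meta-data/`	Equivalent metadata
Oracle Cloud	`http://169.254.169.254/opc/v2/`	Equivalent metadata (v2 requires an `Authorization: Bearer Oracle` header)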

These endpoints respond without authentication to HTTP requests from the instance itself. By design that's "instance-local traffic" and considered safe — but SSRF breaks that assumption: external attackers force the server to hit its own metadata, and the response (including IAM credentials) is returned to them.

▸ SSRF → IAM theft → full S3 / RDS privileges

The extracted temporary credentials (AccessKey/SecretKey/SessionToken) carry all the privileges of the IAM role attached to that instance. If the instance has s3:* or rds:*, the attacker can wield those privileges from their own laptop via AWS CLI. The Capital One incident is exactly this pattern.

Internal networks and service meshes #

Even outside the cloud — within corporate networks, inside Kubernetes clusters, inside service meshes — the design assumption is often "if you're inside the same network, mutual auth is unnecessary or loose". SSRF is the ticket that makes the attacker a "person on the inside".

Hit http://kube-apiserver/api/v1/... from inside a Kubernetes Pod
Internal admin dashboards (Jenkins / Grafana / Kibana / Consul) listening without auth
Internal URLs for LDAP / Active Directory / Confluence / GitLab
localhost:6379 Redis, localhost:9200 Elasticsearch (Redis has no password by default; Elasticsearch had no authentication by default before 8.0)

So SSRF is "an attack entry point", not just a single vulnerability #

Unlike XSS or SQLi, where the vulnerability alone causes the damage, SSRF is the starting point of lateral movement. A single SSRF can open up tens of additional attack surfaces beyond it. That's why it's "devastating in the cloud era".

The three types of SSRF #

Classified by how the response is visible to the attacker. Detection difficulty and ease of exploitation differ dramatically.

Basic SSRF (full response visible) #

The server returns the fetched response as-is back to the requester. The most exploitable type.

Typical scenarios: "Paste a URL, get a preview" APIs, PDF generators, URL-based image converters. The attacker submits ?url=http://169.254.169.254/latest/meta-data/iam/security-credentials/ and the response lists the IAM role name; appending that name to the path returns the temporary credentials as JSON.

Blind SSRF (response invisible, but side effects possible) #

The server fires the request but does not return the response body to the attacker. Common in "ping this webhook" / "store something based on this URL" flows. Direct information exfiltration is harder, but you can still:

Timing differential — infer reachability of internal hosts from response time (192.168.0.1 returns in milliseconds, unused IPs time out after seconds)
Internal APIs with side effects — even without reading the response, the attacker can trigger state-changing operations on internal APIs (delete, transfer, grant)
OOB (Out-of-Band) detection — observe via a separate channel whether the server made a request to an attacker-controlled DNS / HTTP server. Burp Collaborator is the standard tool

Semi-Blind SSRF (partially visible) #

The full body is not returned, but fragments — HTTP status code / Content-Length / error messages — are observable. Enough for port scanning. "200 OK = open, 500 = closed or invalid response" gives you a map of internal services.

▸ "Blind" doesn't mean safe

Even with no visible response, POST /admin/users/delete to an internal API still executes. It's easy to forget that SSRF can fire "writes with side effects", not just "reads". Even if you're limited to GET only, an internal API that accepts destructive operations via GET (and plenty exist, in violation of REST conventions) is game over.

Typical attack code and bypass techniques #

Minimal vulnerable code #

PHP — passing the URL straight to file_get_contents

<?php
// preview.php — fetch a URL and return the body
$url = $_GET['url'];
header('Content-Type: text/plain');
echo file_get_contents($url);  // ★ the hole is here

# Attack (fetch AWS IAM credentials)
https://victim.example/preview.php?url=http://169.254.169.254/latest/meta-data/iam/security-credentials/web-role

file_get_contents / curl / requests.get / fetch — in any language, passing user-supplied URLs through the HTTP client unchanged results in SSRF.

The scheme-restriction trap #

It's tempting to think "just allow http:// and https://", but some HTTP clients support many other schemes.

Scheme	Impact
`file://`	Read local files on the server (`file:///etc/passwd`)
`gopher://`	Send arbitrary byte sequences to a port. Classic for driving Redis / SMTP directly via SSRF
`dict://`	DICT protocol. Port scanning and pulling responses from some services
`ftp://`	Operate FTP servers. Stings if internal FTP is around
`ldap://`	LDAP queries. Used to fire Log4Shell-style payloads
`jar:` (Java)	Java-specific. `jar:http://...!/...` for internal ZIP extraction

PHP's curl_init allows many schemes by default. You must explicitly constrain via CURLOPT_PROTOCOLS.
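
The same constraint can be applied before the URL ever reaches the HTTP client. The following is a minimal Python sketch, assuming the feature only ever needs web URLs; the function name `require_web_scheme` is illustrative and not part of any library.

```python
from urllib.parse import urlsplit

# Assumption: this feature only needs plain web URLs.
ALLOWED_SCHEMES = {"http", "https"}


def require_web_scheme(url: str) -> str:
    """Return the URL's scheme, or raise ValueError if it is not allowed."""
    scheme = urlsplit(url).scheme.lower()
    if scheme not in ALLOWED_SCHEMES:
        raise ValueError(f"scheme not allowed: {scheme or '(none)'}")
    return scheme
```

This check is only the first gate. The client itself should still be configured to refuse other schemes, since redirects can switch scheme after validation.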

URL parser / normalization pitfalls #

Defenders think "just block 127.0.0.1 and localhost", but attackers can express the same host in countless representations.

A partial list of variations pointing to 127.0.0.1

http://127.0.0.1/           # normal notation
http://localhost/           # via hosts file
http://127.1/               # short form (equivalent to 127.0.0.1)
http://0/                   # 0 → 0.0.0.0 → reaches the local host on Linux and some other systems
http://0177.0.0.1/          # octal (0177 = 127)
http://2130706433/          # decimal (32-bit integer)
http://0x7f.0.0.1/          # hex
http://[::1]/               # IPv6 loopback
http://[::ffff:127.0.0.1]/   # IPv4-mapped IPv6
http://attacker.example/     # attacker domain with A record → 127.0.0.1

Blacklists are always bypassed. The defenses below boil down to "resolve DNS first, then check the final IP against private/loopback/link-local ranges".
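
To see why string matching fails, it helps to normalize the numeric variants above the way the classic `inet_aton` routine does. The sketch below is a pure-Python approximation of those parsing rules, written for illustration; `parse_legacy_ipv4` and `is_internal` are hypothetical helper names, and hostnames such as `localhost` still need DNS resolution before classification.

```python
import ipaddress
import re

# One dotted component: hex (0x7f), octal (0177) or decimal (127).
_PART = re.compile(r"0[xX][0-9a-fA-F]+|[0-9]+")


def parse_legacy_ipv4(host: str) -> ipaddress.IPv4Address:
    """Parse inet_aton-style IPv4 text: short, octal, hex and integer forms."""
    parts = host.split(".")
    if not 1 <= len(parts) <= 4 or not all(_PART.fullmatch(p) for p in parts):
        raise ValueError("not an IPv4 literal")
    nums = []
    for p in parts:
        if p[:2].lower() == "0x":
            nums.append(int(p, 16))      # hex component
        elif len(p) > 1 and p[0] == "0":
            nums.append(int(p, 8))       # octal component (raises on 8 or 9)
        else:
            nums.append(int(p))          # decimal component
    *leading, last = nums
    # Leading components are single bytes; the last one fills the remaining bytes.
    if any(n > 255 for n in leading) or last >= 256 ** (4 - len(leading)):
        raise ValueError("IPv4 component out of range")
    value = last
    for i, n in enumerate(leading):
        value += n << (8 * (3 - i))
    return ipaddress.IPv4Address(value)


def is_internal(ip: ipaddress.IPv4Address) -> bool:
    """True for addresses a URL fetcher should refuse to contact."""
    return ip.is_loopback or ip.is_private or ip.is_link_local or ip.is_unspecified
```

With this helper, `127.1`, `0177.0.0.1`, `2130706433` and `0x7f.0.0.1` all normalize to the same loopback address, which is the form the range check has to run against.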

DNS Rebinding #

A particularly nasty technique. The attacker controls a DNS server that returns a public IP on the first lookup and 127.0.0.1 on the second. If the server's flow is "① validate URL → DNS lookup, then ② send HTTP request → DNS lookup again" (two separate lookups), the IP gets "rebound" between validation and the actual request, defeating the check.

The fix is simple in principle: "resolve DNS once, pin the resolved IP, then send the HTTP request directly to that IP (with the Host header set to the original domain)". For HTTPS, SNI and certificate verification must still use the original hostname, not the IP. Many SSRF defense libraries (e.g., ssrfFilter, safe-curl) adopt this pinning pattern.
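
The pinning pattern can be sketched with the Python standard library as follows. This is an illustration under stated assumptions rather than a hardened implementation: it covers plain HTTP only, the names `resolve_and_pin` and `fetch_pinned` are made up for this example, and the `resolver` parameter exists only so the lookup can be replaced in tests.

```python
import http.client
import ipaddress
import socket


def _is_forbidden(ip) -> bool:
    """True for loopback, private, link-local and similar non-public addresses."""
    if ip.version == 6 and ip.ipv4_mapped is not None:
        ip = ip.ipv4_mapped  # classify ::ffff:127.0.0.1 as 127.0.0.1
    return (ip.is_private or ip.is_loopback or ip.is_link_local
            or ip.is_reserved or ip.is_multicast or ip.is_unspecified)


def resolve_and_pin(hostname: str, port: int = 80, resolver=socket.getaddrinfo) -> str:
    """Resolve once, reject the name if ANY answer is internal, return one IP."""
    infos = resolver(hostname, port, type=socket.SOCK_STREAM)
    addrs = [ipaddress.ip_address(info[4][0]) for info in infos]
    if not addrs or any(_is_forbidden(a) for a in addrs):
        raise ValueError("hostname resolves to a forbidden address")
    return str(addrs[0])


def fetch_pinned(hostname: str, path: str = "/", port: int = 80, timeout: float = 5.0):
    """Connect to the validated IP; only the Host header carries the domain."""
    ip = resolve_and_pin(hostname, port)
    conn = http.client.HTTPConnection(ip, port, timeout=timeout)
    try:
        conn.request("GET", path, headers={"Host": hostname})
        resp = conn.getresponse()
        return resp.status, resp.read()
    finally:
        conn.close()
```

Because the connection is opened to the IP that was just validated, no second lookup happens between the check and the request, so a rebinding DNS server has nothing to swap.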

Redirect chains #

Even allowlist-based defenses (only allow specific domains) can be defeated by automatic HTTP redirect following. The attacker sets up an endpoint on their allowed domain that returns 301 Location: http://169.254.169.254/.... The server blindly follows the redirect from the initial allowed-domain response and ends up at the internal IP.

Auto-follow redirects (curl -L equivalent) should be disabled, or each redirect target should be re-validated with the same checks.
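
Re-validating each hop can look like the loop below. It is a sketch, assuming a hypothetical `fetch_once(url)` callable that performs exactly one request without following redirects and returns `(status, location, body)`; the allowlist values are the placeholder hosts used elsewhere in this article.

```python
from urllib.parse import urljoin, urlsplit

ALLOWED_HOSTS = {"api.partner.example", "cdn.partner.example"}
REDIRECT_CODES = {301, 302, 303, 307, 308}


def check_url(url: str) -> None:
    """Apply the same scheme and host checks to every URL, including redirect targets."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or parts.hostname not in ALLOWED_HOSTS:
        raise ValueError(f"destination not allowed: {url}")


def fetch_with_revalidation(url: str, fetch_once, max_hops: int = 3) -> bytes:
    """Follow redirects manually, validating each hop before it is requested."""
    for _ in range(max_hops + 1):
        check_url(url)
        status, location, body = fetch_once(url)
        if status in REDIRECT_CODES and location:
            url = urljoin(url, location)  # resolve relative Location headers
            continue
        return body
    raise ValueError("too many redirects")
```

The important property is ordering: the redirect target is checked before any request is sent to it, so a `Location` header pointing at 169.254.169.254 is rejected without a connection being made.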

URL parser interpretation differences (Parser Confusion) #

Attacks that exploit differences in interpretation between the language's URL parser and the HTTP client's URL parser.

http://allowed.example#@evil.example/
http://allowed.example@evil.example/
http://allowed.example\\@evil.example/

One parser treats allowed.example as the host (validation passes), while a different HTTP client treats evil.example as the host (the actual connection goes there). URL-parsing discrepancies of this kind have produced multiple CVEs, including in Python's urllib and Go's net/url. The defensive rule: host validation and the HTTP request must use the same library.
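
One way to reduce the room for disagreement is to reject ambiguous input outright and hand the client a URL rebuilt from the parsed components. The sketch below assumes Python's `urllib.parse` is the single parser in use; `canonicalize` is an illustrative name, and the allowlist contains only the placeholder host from the examples above.

```python
from urllib.parse import urlsplit, urlunsplit

ALLOWED_HOSTS = {"allowed.example"}


def canonicalize(url: str) -> str:
    """Reject ambiguous URLs and return a rebuilt URL for the HTTP client."""
    # Backslashes and whitespace are interpreted differently by different parsers.
    if any(ch in url for ch in "\\ \t\r\n"):
        raise ValueError("backslash or whitespace in URL")
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https"):
        raise ValueError("bad scheme")
    # user:pass@host syntax is the usual vehicle for host confusion.
    if parts.username is not None or parts.password is not None:
        raise ValueError("userinfo not allowed")
    host = parts.hostname
    if host not in ALLOWED_HOSTS:
        raise ValueError("host not allowed")
    netloc = host if parts.port is None else f"{host}:{parts.port}"
    # Drop the fragment: it is never sent to the server and only adds ambiguity.
    return urlunsplit((parts.scheme, netloc, parts.path or "/", parts.query, ""))
```

The client then receives only the rebuilt string, so whatever a second parser does with `@`, `#` or `\` in the original input no longer matters.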

Famous SSRF incidents #

Capital One 100M-record breach (2019) #

The inflection point that put SSRF in mainstream news. A misconfigured WAF (reportedly ModSecurity-based) at Capital One running on AWS had an SSRF: the attacker ① made it request http://169.254.169.254/latest/meta-data/iam/security-credentials/... and return the temporary IAM credentials, then ② used those credentials with the AWS CLI to list the S3 buckets and download their objects, exfiltrating about 100 million credit card application records.

One root cause was excessive S3 permissions on the WAF instance. The damage wasn't from SSRF alone; IAM least-privilege violations massively amplified it. A few months later, in November 2019, AWS released IMDSv2 (described later), which requires a session token to access the metadata endpoint.

GitLab (CVE-2021-22214 and others) #

GitLab has had several SSRF issues in CI/CD and repository import features. CVE-2021-22214 allowed unauthenticated access to internal metadata endpoints; for self-hosted GitLab deployments running in the cloud, this could lead to IAM theft. GitLab responded with stronger allowlists and explicit loopback / link-local rejection.

Shopify Bug Bounty (2018–2022) #

Shopify has received many SSRF reports via HackerOne, paying up to $25,000 per finding. "Internal microservices that interpret ?url=", "URL validation bypasses on OAuth callbacks", "via image-fetching workers" — a famous example of how large-scale e-commerce platforms continuously expose SSRF crevices because the attack surface is so wide.

Microsoft Exchange (ProxyShell / SSRF + auth bypass, 2021) #

A series of Exchange vulnerabilities (CVE-2021-34473 and others) chained SSRF + authentication bypass + arbitrary file write to reach RCE. The SSRF component abused the /autodiscover/autodiscover.json path so that the front end proxied attacker-chosen requests to arbitrary backend URLs. The Exchange attacks of 2021 reportedly compromised tens of thousands of servers worldwide, with webshells dropped indiscriminately.

Lessons #

Cloud metadata is the attacker's top-priority target
Restricting IAM role privileges alone cuts damage by orders of magnitude (the heart of Capital One)
SSRF is dangerous alone, but chained with other vulnerabilities it reaches RCE-level severity (the heart of Exchange)
When running self-hosted OSS internally (GitLab, Confluence, Jira, etc.), their SSRF is just as dangerous under the same assumptions

Defenses — defense in depth #

Like XSS, no single SSRF defense holds. Combine URL validation, IP pinning, scheme restrictions, egress filtering, IMDSv2, and IAM least privilege.

Allowlist-based URL validation — highest priority #

Blacklists (block 127.0.0.1) are bypassed. Switch to a scheme that explicitly enumerates legitimate destinations.

Python — allowlist pattern that validates the IP after DNS resolution

import ipaddress
import socket
from urllib.parse import urlparse

ALLOWED_SCHEMES = {'http', 'https'}
ALLOWED_HOSTS   = {'api.partner.example', 'cdn.partner.example'}

def validate_url(url: str) -> str:
    u = urlparse(url)
    if u.scheme not in ALLOWED_SCHEMES:
        raise ValueError('bad scheme')
    if u.hostname not in ALLOWED_HOSTS:
        raise ValueError('host not allowed')
    # resolve DNS and reject private/loopback/link-local final IPs (rebinding defense)
    addr = socket.gethostbyname(u.hostname)
    ip = ipaddress.ip_address(addr)
    if ip.is_private or ip.is_loopback or ip.is_link_local or ip.is_reserved:
        raise ValueError('private/loopback ip')
    return addr  # return the resolved IP, caller connects to this IP directly

Key points:

Explicitly limit schemes
Allowlist hostnames
Validate the final IP after DNS resolution with ipaddress (rebinding defense)
Connect directly to the IP from validation, with only the Host: header carrying the domain (no rebind between validation and actual call)
Forbid redirects (or re-validate redirect targets if allowed)

IMDSv2 (AWS) — authentication on the metadata endpoint #

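Widely seen as AWS's response to Capital One. Metadata access becomes a two-step session: PUT /latest/api/token to fetch a short-lived token, then send that token in the X-aws-ec2-metadata-token header on every metadata request. When IMDSv2 is required, requests without a valid token are rejected with 401 Unauthorized.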

IMDSv2 request (hard to perform via SSRF)

TOKEN=$(curl -X PUT "http://169.254.169.254/latest/api/token" \
    -H "X-aws-ec2-metadata-token-ttl-seconds: 21600")
curl -H "X-aws-ec2-metadata-token: $TOKEN" \
     "http://169.254.169.254/latest/meta-data/iam/security-credentials/"

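An SSRF primitive that can only issue GET requests and cannot set custom headers cannot obtain the token in the first place, because the token request must be a PUT carrying the TTL header. In addition, token requests that include an X-Forwarded-For header are rejected, and the token response is sent with an IP hop limit of 1 by default, so it does not survive being forwarded across another network hop. That's IMDSv2's defensive value. On EC2, disable IMDSv1 in the instance metadata options (MetadataOptions: HttpTokens=required); this is current best practice.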

Egress filtering — stop it at the way out #

Restrict the destinations your web app server can reach via firewall / NAT GW / VPC Endpoint Policy. Even if SSRF succeeds, if the network layer blocks it, internal IPs and metadata are unreachable.

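Allow outbound traffic only to the destinations the app needs. VPC security groups are allow-only, so express this as a narrow outbound allowlist, and use network ACLs or a firewall where explicit denies for 10.0.0.0/8 / 192.168.0.0/16 / 172.16.0.0/12 are required
AWS VPC Endpoint Policy restricts "S3 from this instance is allowed only to this bucket"
Security groups and network ACLs do not filter traffic to the metadata address 169.254.169.254, so block it on the host (for example with an iptables rule scoped to the app's user or container network). For ECS tasks, place the app in a dedicated subnet + dedicated SG and block task access to instance metadata in the container agent configuration (for example ECS_AWSVPC_BLOCK_IMDS)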

Scheme / port / size limits #

In the HTTP client configuration:

Only http / https schemes (block file://, gopher://, dict://, ldap://, jar:, etc.)
Only ports 80 / 443 (so internal service ports can't be probed)
Disable redirect following (or re-validate the redirect target)
Short timeouts (makes Blind SSRF timing analysis harder)
Response size limit (also blocks resource-exhaustion via hitting heavy internal APIs)

For PHP cURL:

curl_setopt($ch, CURLOPT_PROTOCOLS, CURLPROTO_HTTP | CURLPROTO_HTTPS);
curl_setopt($ch, CURLOPT_REDIR_PROTOCOLS, CURLPROTO_HTTP | CURLPROTO_HTTPS);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, false);
curl_setopt($ch, CURLOPT_TIMEOUT, 5);
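
The port and size limits from the list above are not covered by those cURL options, so they usually need a little code around the client. The following Python sketch shows one way to express them; the 1 MB cap and the helper names `effective_port` and `read_capped` are assumptions chosen for illustration.

```python
import io
from urllib.parse import urlsplit

ALLOWED_PORTS = {80, 443}
MAX_BODY_BYTES = 1_000_000  # assumption: 1 MB is enough for previews


def effective_port(url: str) -> int:
    """Return the port the client would connect to, or raise if it is not allowed."""
    parts = urlsplit(url)
    default = {"http": 80, "https": 443}.get(parts.scheme)
    port = parts.port if parts.port is not None else default
    if port not in ALLOWED_PORTS:
        raise ValueError(f"port not allowed: {port}")
    return port


def read_capped(stream, limit: int = MAX_BODY_BYTES, chunk_size: int = 65536) -> bytes:
    """Read a response body in chunks and abort once it exceeds the limit."""
    chunks, total = [], 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        total += len(chunk)
        if total > limit:
            raise ValueError("response too large")
        chunks.append(chunk)
    return b"".join(chunks)


# Works with any file-like object, such as an http.client response.
body = read_capped(io.BytesIO(b"hello"), limit=1024)
```

Reading in chunks matters here: checking only a declared Content-Length would trust a value the remote side controls.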

IAM least privilege — the last line of defense for limiting damage #

Even if SSRF lands and the IAM credentials are stolen, if the role can do little, the damage is small. The heart of the Capital One incident was that the role on the WAF instance could list buckets (s3:ListAllMyBuckets) and read objects (s3:GetObject). Asking "does the WAF actually need to read S3?" would likely have prevented the disaster.

Create dedicated IAM roles per instance / Pod
Restrict resources (s3:GetObject only on arn:aws:s3:::myapp-uploads/*)
Restrict actions (s3:GetObject only, not s3:*)
For buckets holding sensitive data, use VPC Endpoint Policy + Bucket Policy as double lock
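
As an illustration of the action and resource restrictions above, the snippet below builds a read-only policy document for a single bucket and prints it as JSON. It is a sketch: the bucket name comes from the example in the list, while the `Sid` value and the function name `read_only_policy` are made up for this example.

```python
import json

BUCKET = "myapp-uploads"  # bucket name from the example above


def read_only_policy(bucket: str) -> dict:
    """Build an IAM policy that allows reading objects from one bucket only."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "ReadUploadsOnly",                   # illustrative label
                "Effect": "Allow",
                "Action": ["s3:GetObject"],                 # one action, not s3:*
                "Resource": [f"arn:aws:s3:::{bucket}/*"],   # one bucket, not *
            }
        ],
    }


print(json.dumps(read_only_policy(BUCKET), indent=2))
```

Credentials stolen from a role carrying only this policy can read objects in that one bucket and cannot list or touch anything else.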

Sanitizer libraries #

Writing SSRF defenses by hand is full of traps. Use battle-tested libraries:

Node.js: ssrf-req-filter, request-filtering-agent
Python: safeurl-python, advocate
Go: safehttp, safedialer
Java: safe-url, OWASP ESAPI

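Most of these implement some or all of "DNS resolution → IP classification → IP pinning → redirect re-validation" internally; check which steps a given library actually covers before relying on it.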

Testing and detection #

Manual testing #

Hammer every feature that takes a URL (preview / webhook / external import / OG image / PDF render / image transform) with SSRF payloads.

Representative test payloads

http://127.0.0.1/
http://169.254.169.254/latest/meta-data/
http://metadata.google.internal/computeMetadata/v1/
file:///etc/passwd
gopher://127.0.0.1:6379/_*1%0D%0A%248%0D%0Aflushall%0D%0A
http://collaborator-id.burpcollaborator.net/  # for Blind detection
# If the above are blocked, retry with notation variants (decimal IP / 0177.0.0.1, etc.)

Burp Suite Collaborator (Blind SSRF detection) #

For Blind SSRF, the response is invisible, so OOB detection is the standard. Burp Collaborator provides an attacker-owned DNS / HTTP server and records whether the target tried to resolve / fetch a Collaborator-controlled domain.

DNS interaction only → the server resolved the name but no HTTP request arrived (the fetch may be blocked at egress); SSRF is likely but limited
HTTP interaction too → SSRF succeeds, arbitrary URLs can be fetched
Neither → no SSRF observed at that location

Free alternatives include interactsh (ProjectDiscovery) and canarytokens.

Static analysis #

Trace "user input → HTTP client call" data flow in source code. Semgrep / CodeQL / SonarQube have SSRF rules. Even a simple grep of framework-specific sinks (requests.get, urllib.request.urlopen, Net::HTTP.get, HttpClient.get) gives you a list of suspects.
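
The "simple grep" approach can be approximated in a few lines of Python. This sketch only looks for the four sink names mentioned above followed by an opening parenthesis; it does no data-flow analysis, and `find_sinks` is an illustrative name.

```python
import re

# The sink names listed above, matched only when followed by "(".
SINK_PATTERN = re.compile(
    r"\b(requests\.get|urllib\.request\.urlopen|Net::HTTP\.get|HttpClient\.get)\s*\("
)


def find_sinks(source: str):
    """Return (line_number, sink_name) pairs for each HTTP-client call found."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for match in SINK_PATTERN.finditer(line):
            hits.append((lineno, match.group(1)))
    return hits
```

Each hit is only a suspect: the reviewer still has to check whether the argument can be influenced by user input.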

Production monitoring #

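Monitor metadata-endpoint usage with host-level logging and the EC2 CloudWatch metric MetadataNoToken (which counts IMDSv1 calls); VPC Flow Logs do not record traffic to 169.254.169.254. Also alert when instance credentials are used from outside the instance (for example, Amazon GuardDuty's InstanceCredentialExfiltration findings)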
Alert on outbound attempts to private networks in firewall / NAT GW logs
WAF-layer rules that alert on literal strings like 169.254.169.254 in requests
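
The second item, alerting on outbound attempts to private networks, reduces to a range check over log records. The sketch below assumes the logs have already been parsed into `(source, destination)` IP pairs and that known internal dependencies are passed in as `expected`; both the record shape and the name `suspicious_destinations` are assumptions for illustration.

```python
import ipaddress

# Ranges that a public-facing URL fetcher should normally never contact.
WATCHED_NETS = [
    ipaddress.ip_network(n)
    for n in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16",
              "169.254.0.0/16", "127.0.0.0/8")
]


def suspicious_destinations(records, expected=frozenset()):
    """Return the (source, destination) pairs whose destination is internal."""
    flagged = []
    for src, dst in records:
        if dst in expected:
            continue  # known internal dependency, e.g. the app's database
        ip = ipaddress.ip_address(dst)
        if ip.version == 4 and any(ip in net for net in WATCHED_NETS):
            flagged.append((src, dst))
    return flagged
```

In practice the `expected` set is what keeps this usable: most app servers have a handful of legitimate internal destinations, and anything outside that set is worth an alert.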

Related attacks #

Attack	Relation
CSRF (Cross-Site Request Forgery)	Confusingly named, but structurally opposite. CSRF makes the victim's browser send the request; SSRF makes the server send it
RFI / LFI (Remote/Local File Inclusion)	PHP `include`-style vulnerabilities that read URLs. Functionally overlaps with SSRF, but the result is file ingestion → RCE. Mostly closed by `allow_url_include=Off`
XXE (XML External Entity)	XML parsers resolving `<!ENTITY xxe SYSTEM "http://...">` to fetch external resources — abused as a form of SSRF
CSWSH (Cross-Site WebSocket Hijacking)	Browser-side WebSocket origin-check flaws. Different from SSRF and structurally closer to CSRF: it abuses the cookies the browser attaches automatically to the WebSocket handshake
Smuggling (HTTP Request Smuggling)	Exploiting HTTP interpretation differences between front-end and back-end. Also a way to reach internal APIs, overlapping with SSRF's impact vector

Summary — 7 things developers must cover at minimum #

SSRF is unglamorous, but it is one of the most dangerous entry points among modern web application vulnerabilities. In today's stacks, where cloud metadata / internal APIs / private networks live just one hop away, the moment the server can fetch arbitrary URLs, IAM theft, internal exploration, and lateral movement immediately become possible.

▸ 7 things developers must cover at minimum

Introduce URL allowlist validation — restrict hostnames, schemes, and ports
Validation must target the final IP after DNS resolution, mechanically rejecting private/loopback/link-local
Disable HTTP client redirect following (or re-validate the redirect target)
On AWS, require IMDSv2 (HttpTokens=required)
Tighten egress at the VPC / SG / host firewall level: allow only required destinations, and block metadata IPs and private CIDRs at the exit
Give instances / Pods least-privilege IAM roles (both action and resource)
Don't roll your own SSRF defenses — use battle-tested libraries (safe-url family)

The simplest "take a URL and fetch it" features are precisely the ones that breed SSRF. At the design stage, remember: trust boundaries cannot be defined by IP addresses — only by explicit allowlists.