FastAPI Backend Security Checklist

2.1 CORS Configuration

The Risk

Using allow_origins=["*"] with allow_credentials=True in FastAPI is especially dangerous - it lets any website make authenticated requests to your API using a visitor's cookies. The real question is: does your API use cookies or header-based auth? This determines whether wildcard origins are safe or dangerous.

The Solution

If your API uses header-based auth (X-API-Key or Bearer token) and never reads cookies, allow_origins=["*"] with allow_credentials=False is safe - the auth header is what protects it, not the origin. This is the practical standard for backends with 30+ frontends. If your API uses cookies for auth, you MUST use an explicit origin allowlist. The key rule: allow_credentials=True + wildcard origins is always dangerous.

The Fix

import os
from fastapi.middleware.cors import CORSMiddleware

# PATTERN 1: Header-based auth (X-API-Key / Bearer)
# Wildcard is SAFE here - auth header is the real security
app.add_middleware(CORSMiddleware,
    allow_origins=["*"],
    allow_credentials=False,
    allow_methods=["*"],
    allow_headers=["Content-Type", "Authorization", "X-API-Key"],
)

# PATTERN 2: Cookie-based auth (sessions, JWTs in cookies)
# MUST use explicit origin list - cookies are sent automatically
origins = os.getenv("CORS_ORIGINS", "").split(",")
origins = [o.strip() for o in origins if o.strip()]
app.add_middleware(CORSMiddleware,
    allow_origins=origins,
    allow_credentials=True,  # sends cookies
    allow_methods=["*"],
    allow_headers=["Content-Type", "Authorization"],
)

Starlette gotcha: allow_origins=["https://*.yourdomain.com"] is treated as a LITERAL string, not a pattern. For subdomain wildcards, use allow_origin_regex=r"https://.*\.yourdomain\.com" (single backslashes inside the raw string; doubled backslashes would match a literal backslash and never match a real origin).

DANGER: Never use *.vercel.app, *.netlify.app, or *.railway.app as allowed origins - anyone can deploy a page on these platforms for free and call your API cross-origin.
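The literal-versus-regex distinction can be illustrated with a small standard-library sketch. The helper names and sample origins are hypothetical, and the sketch assumes the regex is applied as a full match against the Origin header, which is how Starlette's CORSMiddleware is understood to behave (verify against your installed version). It also uses a tighter character class than .* so only a single subdomain label is accepted.

```python
import re

# Hypothetical values for illustration only.
LITERAL_ORIGINS = ["https://*.yourdomain.com"]
ORIGIN_REGEX = re.compile(r"https://[a-z0-9-]+\.yourdomain\.com")


def literal_match(origin: str) -> bool:
    """Exact string comparison: the '*' is just a character, not a wildcard."""
    return origin in LITERAL_ORIGINS


def regex_match(origin: str) -> bool:
    """Full-match check: the whole origin must fit the pattern."""
    return ORIGIN_REGEX.fullmatch(origin) is not None


if __name__ == "__main__":
    for origin in ("https://app.yourdomain.com",
                   "https://app.yourdomain.com.evil.net"):
        print(origin, literal_match(origin), regex_match(origin))
```

The literal list never matches a real subdomain, while the full-match regex accepts the subdomain and rejects a look-alike host that merely starts with the allowed name.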

2.2 Rate Limiting (SlowAPI)

The Risk

Without rate limiting, a single attacker can exhaust your server's CPU, memory, and database connections by sending thousands of requests. This is especially critical for endpoints that run SQL queries or call external APIs - each request consumes real resources.

The Solution

Use a rate limiting library (SlowAPI for FastAPI) that tracks requests per IP address. Set a limit for each endpoint - for example, 30 requests per minute for a data query API. When someone exceeds the limit, they get a "too many requests" response and have to wait.

The Fix

import os
from fastapi import Request
from slowapi import Limiter, _rate_limit_exceeded_handler
from slowapi.errors import RateLimitExceeded

RATE_LIMIT = os.getenv("RATE_LIMIT", "30/minute")

# Use get_client_ip (item 2.8), NOT get_remote_address
# get_remote_address reads proxy IP behind Cloudflare
limiter = Limiter(key_func=get_client_ip)

# CRITICAL: without this, SlowAPI returns an HTML error page
# instead of a JSON 429 response
app.state.limiter = limiter
app.add_exception_handler(RateLimitExceeded, _rate_limit_exceeded_handler)

@app.get("/api/endpoint")
@limiter.limit(RATE_LIMIT)
async def endpoint(request: Request):  # request param is REQUIRED
    ...

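SILENT FAILURE: If your endpoint signature doesn't include request: Request, the @limiter.limit() decorator has no request to derive a key from and the endpoint can end up unprotected - no rate limiting, no warning in your logs. The endpoint works fine, you think it's rate limited, but it isn't. The parameter must be named request. Recent SlowAPI releases raise an exception at decoration time when it is missing, but do not rely on that: confirm the behavior on the version you deploy by actually exceeding the limit in a test. This is the #1 SlowAPI gotcha.

Also: without the exception handler, clients get an HTML page instead of a clean 429 JSON.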

2.3 Concurrency Limiting

The Risk

Rate limiting caps requests per minute, but an attacker can still send 30 simultaneous requests that all start executing at once. Each might run a heavy SQL query or allocate memory. Without concurrency limits, a burst of parallel requests can crash your server even within rate limits.

The Solution

Set a cap on how many requests from the same IP address can be processing at the same time - for example, 4 concurrent requests per visitor and 10 across all visitors. If someone already has 4 queries running, their 5th request waits or gets rejected. This prevents one person from hogging all your server's resources at once.

The Fix

MAX_PER_IP = int(os.getenv("MAX_CONCURRENT_PER_IP", "4"))
MAX_GLOBAL = int(os.getenv("MAX_CONCURRENT_GLOBAL", "10"))

# Critical: use asyncio.shield() on release
# Without shield, if the ASGI middleware cancels
# the request, the counter never decrements (leaks)
async def release_concurrency(ip):
    try:
        await asyncio.shield(_do_release(ip))
    except asyncio.CancelledError:
        pass  # shield caught it - counter is safe

Load limits from env vars so you can tune without redeploying. The CancelledError catch is required - shield() re-raises it after protecting the inner coroutine. Log ACQUIRED/RELEASED/DENIED events for diagnosing counter leaks in production.

2.4 SQL Validation Stack

The Risk

If your API accepts SQL queries from users (text-to-SQL, data exploration tools), a single malicious query can drop tables, read password hashes, or generate billions of rows to crash your database. Row limits alone don't protect you - CROSS JOIN creates trillion-row intermediate results before LIMIT applies, REPEAT() generates single-cell 95MB responses, and aggregate functions like STRING_AGG collapse 1M rows into one row bypassing LIMIT entirely. This requires a multi-layer defense - no single check is sufficient.

The Solution

Apply multiple layers of checks before running any user-submitted SQL. First, only allow queries that start with SELECT or similar read-only commands. Then block dangerous keywords like DELETE, DROP, or ALTER. Block resource exhaustion functions - CROSS JOIN (CPU bomb via cartesian products), REPEAT (bandwidth bomb via inflated strings), MD5/SHA256/REGEXP_REPLACE (CPU bombs on computed large strings), and recursive CTEs (a query that loops forever on a single table, pinning a CPU core with no large output to trip row or size limits). Add a SQL parser (sqlglot) for structural validation that catches attacks regex cannot see - CTE alias bypass, implicit cartesian products, nested blocked functions. Block data-generating functions, reject queries that contain SQL comments (comments are used to bypass filters, and stripping them can change the query's meaning), hide system tables, and enforce a row limit. No single check is enough - you need all of them together.

The Fix

# Apply ALL of these layers (regex + parser):
1. Prefix allowlist: SELECT, SHOW, DESCRIBE, EXPLAIN, WITH
2. Keyword blocklist: INSERT, UPDATE, DELETE, DROP, ALTER,
   CREATE, TRUNCATE, EXEC, COPY, GRANT, REVOKE, SET...
3. Block GENERATE_SERIES, RANGE, UNNEST (billion-row attacks)
4. Block CROSS JOIN, REPEAT, LPAD, RPAD, REGEXP_REPLACE, MD5, SHA256
   (CPU and bandwidth bombs - small output, massive compute)
   and recursive CTEs / WITH RECURSIVE (loop forever on one table)
5. SQL parser structural check (see item 2.4b for details)
   (catches CTE bypass, aliased self-joins, implicit cartesian)
6. Reject queries containing SQL comments (/* or --)
   (reject entirely - don't strip, stripping changes query meaning)
7. Block system catalog: pg_shadow, pg_authid, pg_roles
8. Auto-append LIMIT 1000 if missing
9. Enforce response size limit (e.g. 1MB) - catches STRING_AGG
   and other aggregate dumps that bypass row limits
10. Block computed ORDER BY (prevents function injection)
11. Block subqueries in ORDER BY (O(n^2) attacks)
12. Limit subquery depth (max 3 SELECT keywords)

WHOLE-WORD MATCHING: The keyword blocklist needs word boundaries or you get false positives - "OFFSET" contains "SET", "CREATED_AT" contains "CREATE". A simple approach is: if f" {kw} " in f" {upper} " or f" {kw}(" in f" {upper}(" or upper.startswith(kw). Note that a space-only check misses keywords that sit next to a newline, tab, or semicolon (for example "SELECT 1;DROP TABLE t"), so normalize whitespace and punctuation first, or match with a regex word boundary (\b).

MCP GOTCHA: MCP clients send multi-line SQL. Use re.DOTALL in the ORDER BY regex or function detection will miss matches across newlines.

SUBQUERY ORDER: Check the original SQL for ORDER BY subqueries BEFORE stripping content inside parentheses (for LIMIT detection) - stripping parens removes the subqueries you need to catch.

REGEX VS PARSER: Regex handles 95% of attacks (fast, simple). But regex cannot understand SQL structure - CTEs, aliases, and subqueries can disguise cartesian products. Use a SQL parser like sqlglot (item 2.4b) as a second structural layer.

ROW LIMITS VS RESPONSE SIZE: Auto-appending LIMIT protects against SELECT * but not against aggregates (STRING_AGG collapses 1M rows into 1 row) or CROSS JOIN (computation happens before LIMIT). The response size check is the catch-all.
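The whole-word rule can also be sketched with a regex word boundary. The keyword list is a shortened, illustrative subset of the blocklist above and find_blocked_keyword is a hypothetical helper name. One known limitation of this sketch: it also flags keywords that appear inside string literals, which is a false positive you may or may not accept.

```python
import re

# Illustrative subset of the blocklist; extend to match your own list.
BLOCKED_KEYWORDS = ["INSERT", "UPDATE", "DELETE", "DROP",
                    "ALTER", "CREATE", "TRUNCATE", "SET"]

# \b treats letters, digits and underscore as word characters, so
# CREATED_AT does not match CREATE and OFFSET does not match SET,
# while newlines, tabs and semicolons all count as boundaries.
_BLOCKED_RE = re.compile(
    r"\b(" + "|".join(BLOCKED_KEYWORDS) + r")\b", re.IGNORECASE
)


def find_blocked_keyword(sql: str):
    """Return the first blocked keyword found as a whole word, else None."""
    match = _BLOCKED_RE.search(sql)
    return match.group(1).upper() if match else None


if __name__ == "__main__":
    print(find_blocked_keyword("SELECT created_at FROM t LIMIT 5 OFFSET 10"))
    print(find_blocked_keyword("SELECT 1;\nDROP TABLE t"))
```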

2.4b SQL Parser Structural Validation (sqlglot)

The Risk

Regex-based SQL validation works on raw text - it cannot understand SQL structure. Attackers exploit this gap with CTEs (WITH t AS (...) SELECT FROM t a, t b), table aliases, nested function calls, and implicit joins that look different in text but produce the same dangerous execution plan. Every regex bypass the pen tester found in Round 3 was structural - the SQL was semantically identical to a blocked pattern but syntactically different enough to pass text-based checks.

The Solution

Add a SQL parser (sqlglot) as a structural validation layer on top of your regex checks. The parser builds an Abstract Syntax Tree (AST) and validates the actual query structure: how many tables appear in each SELECT's FROM clause, which functions are called regardless of nesting or aliasing. Keep the regex layer as the first line of defense (fast, catches obvious attacks) and use the parser as a second pass for structural checks that regex cannot handle. sqlglot supports both Postgres and DuckDB dialects and is pure Python with no C dependencies.

The Fix

import sqlglot
from sqlglot import exp as sqlglot_exp

def validate_sql_structure(sql, engine="postgres"):
    dialect = "duckdb" if engine == "duckdb" else "postgres"
    try:
        parsed = sqlglot.parse_one(sql, dialect=dialect)
    except sqlglot.errors.ParseError:
        return "Invalid SQL syntax"

    # Block multiple table sources per SELECT (catches cartesian products,
    # CTE alias bypass, implicit joins via FROM a, b)
    for select in parsed.find_all(sqlglot_exp.Select):
        from_clause = select.find(sqlglot_exp.From)
        if not from_clause:
            continue
        from_tables = list(from_clause.find_all(sqlglot_exp.Table))
        join_tables = []
        for join in select.find_all(sqlglot_exp.Join):
            if join.find_ancestor(sqlglot_exp.Select) is select:
                join_tables.extend(join.find_all(sqlglot_exp.Table))
        if len(from_tables) + len(join_tables) > 1:
            return "Multiple table sources not allowed"

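    # Block dangerous functions via AST (catches nested/aliased usage)
    BLOCKED = {"REPEAT", "LPAD", "RPAD", "GENERATE_SERIES", "MD5", "SHA256"}
    for func in parsed.find_all(sqlglot_exp.Func):
        # Anonymous (unrecognized) functions keep the SQL name in .name;
        # typed nodes report it via sql_name(), which may differ from the
        # SQL spelling - verify against your sqlglot version
        if isinstance(func, sqlglot_exp.Anonymous):
            name = func.name.upper()
        else:
            name = func.sql_name().upper()
        if name in BLOCKED:
            return f"Blocked function: {name}"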

    # Block recursive CTEs (a single-table query that can loop forever;
    # catches WITH RECURSIVE even when split across whitespace/newlines)
    for cte in parsed.find_all(sqlglot_exp.With):
        if cte.args.get("recursive"):
            return "Recursive CTEs not allowed"
    return None

HYBRID APPROACH: Do not replace regex with the parser - use both. Regex is fast and catches 95% of attacks. The parser catches the remaining 5% that require structural understanding.

GRACEFUL FALLBACK: Wrap the import in try/except so the server still works (regex-only) if sqlglot is not installed.

DIALECT MATTERS: sqlglot.parse_one(sql, dialect="duckdb") handles DuckDB-specific syntax that would fail with the default dialect.

PER-SELECT CHECK: Count tables per individual SELECT, not globally across the whole query - otherwise legitimate correlated subqueries (WHERE x IN (SELECT y FROM same_table)) get blocked.

SELF-JOIN DETECTION IS THE SAME TRAP: If you block self-joins by flagging a physical table whose name appears more than once, scope that duplicate check per query-scope too (per SELECT, per CTE body, or per parenthesized fragment), not globally across the whole statement. The same base table read in two SEPARATE CTEs (one CTE counts it, another aggregates it) is not a self-join and is fully bounded. A global duplicate-name check will 403 every normal multi-CTE dashboard query. A real self-join keeps both references inside one FROM or JOIN scope, so it is still caught; allow self-joins wrapped in an explicit subquery or CTE, and keep a per-query timeout plus connection recovery as the backstop.

See the Database MCP Server repo (github.com/amararun/shared-fastapi-database-mcp) for the full working implementation.

2.5 Admin Write Table Whitelist

The Risk

Even with API key authentication, a compromised client or misconfigured admin tool can issue DELETE FROM critical_table. If your endpoint allows write operations through a proxy-to-database pattern, restricting which tables are writable is a last-resort guard.

The Solution

Maintain a list of table names that are allowed for write operations (insert, update, delete). Before running any write query, extract the table name from the SQL and check it against your list. If the table isn't on the list, reject the query. This is a safety net - even if someone has valid credentials, they can only modify approved tables.

The Fix

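import re
from fastapi import HTTPException

ALLOWED_TABLES = {"users", "sessions", "audit_log"}

# Extract the first table name that follows INTO / UPDATE / FROM
match = re.search(
    r'\b(?:INTO|UPDATE|FROM|DELETE\s+FROM)\s+["\']?(\w+)',
    sql, re.IGNORECASE
)
# Fail closed: also reject when no table name can be extracted
if not match or match.group(1).lower() not in ALLOWED_TABLES:
    raise HTTPException(400, detail="Operation not allowed")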

Only needed for admin/write endpoints. Read-only APIs should use the full SQL validation stack (2.4) instead. DuckDB gotcha: DuckDB supports INSERT OR REPLACE INTO and INSERT OR IGNORE INTO, which a pattern anchored on INSERT INTO won't match. An unanchored search for INTO still finds the keyword; if you anchor your pattern, use a broader one: /(?:INTO|UPDATE|FROM|DELETE\s+FROM|OR\s+REPLACE\s+INTO|OR\s+IGNORE\s+INTO)\s+(\w+)/. Also ensure the table name case in your set matches the normalized SQL case.

2.6 Error Message Sanitization

The Risk

Returning str(e) to clients in error responses leaks internal details: file paths, table names, database versions, library versions. Attackers use this information to map your infrastructure and craft targeted attacks. Health endpoints that return version info are equally dangerous.

The Solution

Never send raw error messages to the user. Log the full error details on your server for debugging, but return only a generic, helpful message to the client like "Query failed - check your SQL syntax." Health check endpoints should return just "status: ok" without version numbers, file paths, or configuration details.

The Fix

# WRONG - leaks internals
raise HTTPException(400, detail=f"Error: {str(e)}")

# RIGHT - log internally, return generic message
logger.error(f"Query error: {e}")
raise HTTPException(400, detail="Query failed. Check SQL syntax.")

# SAFETY NET - global exception handler catches anything
# that slips through individual try/except blocks
@app.exception_handler(Exception)
async def global_exception_handler(request: Request, exc: Exception):
    logger.error(f"Unhandled: {request.url.path}: {exc}", exc_info=True)
    return JSONResponse(status_code=500,
        content={"detail": "Internal server error"})

# Health endpoints - only return status
@app.get("/health")
async def health():
    return {"status": "ok"}  # No versions, paths, or config

Serverless proxy gotcha: when forwarding backend responses, wrap response.json() in its own try/catch. Backends behind Cloudflare can return HTML challenge pages (Bot Fight Mode) or HTML error pages as 200 status. response.json() throws SyntaxError on non-JSON - handle it: try { data = await response.json(); } catch { return res.status(502).json({ error: "Invalid backend response" }); }

STATUS CODE UNIFORMITY: Beyond sanitizing error messages, make ALL rejection paths return the same HTTP status code AND body. If your endpoint returns 405 for wrong method, 400 for missing params, and 403 for bad auth - an attacker can distinguish these cases and map your internal routing logic. Use a single response (e.g., 403 + {"error":"Forbidden"}) for every rejection regardless of reason. This turns the endpoint into a black box - wrong method, invalid action, bad token all look identical from outside.

INPUT ECHO: Never echo user input in error messages (e.g., "Unknown action: ${action}"). This lets attackers enumerate valid values by observing which inputs get a different response than the echo.
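The uniform-rejection idea can be sketched as a single gate that every rejection path flows through. The function name, the action names, and the (status, body) tuple shape are hypothetical; in FastAPI the tuple would be turned into a JSONResponse. The specific reason is logged server-side only and never returned.

```python
import hmac
import logging
from typing import Optional, Tuple

logger = logging.getLogger("api")

# One response for every rejection, whatever the reason.
FORBIDDEN: Tuple[int, dict] = (403, {"error": "Forbidden"})
ALLOWED_ACTIONS = {"query", "export"}  # hypothetical action names


def check_request(method: str, action: Optional[str],
                  token: str, expected_token: str) -> Optional[Tuple[int, dict]]:
    """Return None when the request may proceed, else the uniform rejection."""
    reason = None
    if method != "POST":
        reason = "wrong method"
    elif action not in ALLOWED_ACTIONS:
        reason = "unknown action"  # the action value itself is never echoed
    elif not hmac.compare_digest(token.encode(), expected_token.encode()):
        reason = "bad token"
    if reason is None:
        return None
    logger.warning("request rejected: %s", reason)  # detail stays server-side
    return FORBIDDEN
```

From the outside, a wrong method, an invalid action, and a bad token are indistinguishable, while the server log still records which check failed.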

2.7 API Key Authentication

The Risk

Non-public endpoints without authentication are discoverable and callable by anyone. If the API key defaults to an empty string when the environment variable isn't set, every request passes validation - this is a common deployment mistake that silently disables auth.

The Solution

Require an API key in a request header for every non-public endpoint. At startup, check that the API key environment variable is actually set - if it's missing, refuse to start the server. This "fail-secure" approach ensures you never accidentally run without authentication because someone forgot to set the variable during deployment.

The Fix

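import os
import secrets
from fastapi import Depends, HTTPException, Request

API_KEY = os.getenv("API_KEY")
if not API_KEY:
    raise RuntimeError("API_KEY not configured")  # fail-secure

async def verify_api_key(request: Request):
    key = request.headers.get("X-API-Key", "")
    # Constant-time comparison avoids leaking key contents via timing
    if not secrets.compare_digest(key.encode(), API_KEY.encode()):
        raise HTTPException(401, detail="Invalid API key")

@app.post("/api/endpoint", dependencies=[Depends(verify_api_key)])
async def endpoint():
    ...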

If the env var is not set, REJECT all requests. Never default to empty string. X-API-Key vs Bearer: use X-API-Key for server-to-server calls (simpler, no token lifecycle). Use Bearer tokens (JWT) when you need user identity, token expiry, or OAuth flows. Don't mix both on the same endpoint - pick one auth scheme per endpoint.

2.8 Client IP Extraction (Anti-Spoofing)

The Risk

Behind Cloudflare or nginx, request.client.host returns the proxy's IP, not the actual user's IP. This means all users share one "IP" for rate limiting - either everyone gets blocked or nobody does. Worse, X-Forwarded-For, X-Real-IP, and any custom headers (like X-Original-Client-IP) can be freely set by the client - an attacker sends a fake header to bypass per-IP rate limiting entirely. Only CF-Connecting-IP is safe, and only when every request reaches your origin through Cloudflare: Cloudflare sets it from the actual TCP connection at the edge, overwriting any client-supplied value.

The Solution

Write a helper function that prioritizes cf-connecting-ip first - it is the only header that cannot be spoofed by the client. Cloudflare determines the client IP from the TCP connection itself (the IP layer of the network stack, not any HTTP header). All other headers (X-Forwarded-For, X-Real-IP, custom headers) are just HTTP headers that any client can set to any value. Fall back to x-forwarded-for and request.client.host for non-Cloudflare environments.

The Fix

def get_client_ip(request: Request) -> str:
    for header in ['cf-connecting-ip',
                   'x-forwarded-for',
                   'x-real-ip']:
        val = request.headers.get(header, '')
        if val:
            return val.split(',')[0].strip()
    return request.client.host if request.client else 'unknown'

# CRITICAL: SlowAPI ignores this unless you override key_func
limiter = Limiter(key_func=get_client_ip)  # NOT get_remote_address

SlowAPI's default get_remote_address reads request.client.host, which is the proxy IP behind Cloudflare. You MUST pass your custom IP extractor as key_func when creating the Limiter. This is a silent failure - rate limiting appears to work but applies to one shared IP instead of per-user.

IP SPOOFING: A pen test on the IMDB Dashboards backend (March 2026) confirmed that X-Original-Client-IP headers sent by the client were trusted, completely bypassing per-IP rate limiting and concurrency limits. Fix: always prioritize cf-connecting-ip - Cloudflare sets it at the TCP level, the client cannot override it.

MULTI-HOP GOTCHA (March 2026): CF-Connecting-IP is only correct when Cloudflare is the first proxy before your backend. If your architecture has multiple hops - for example, a Vercel serverless function that calls a Cloudflare-proxied backend - Cloudflare overwrites CF-Connecting-IP at each hop with the IP of whatever connected to it. Your backend sees the serverless platform's IP (AWS Lambda, Google Cloud Run), not the real user. The fix: the serverless function reads the real IP from CF-Connecting-IP (which IS correct at that hop, since the browser connected through Cloudflare to reach the serverless platform), then forwards it via a custom header that Cloudflare doesn't recognize and won't overwrite. Your logging or backend layer checks this custom header first. This is a common blind spot - CF-Connecting-IP being "unspoofable" creates false confidence when your traffic passes through Cloudflare more than once. Prefer fixing this in a centralized logging layer rather than in every backend's middleware - one deploy covers all backends instead of redeploying dozens of services.

LIVE SPOOFING TEST RESULTS (March 2026): Verified on a FastAPI backend behind Caddy reverse proxy on Hetzner, tested with Cloudflare on (orange cloud) and off (grey cloud).
(1) CF-Connecting-IP is trivially spoofable without Cloudflare - curl with -H "CF-Connecting-IP: 7.7.7.7" and the middleware trusts it completely. Anyone who discovers your origin server IP can bypass CF-Connecting-IP protection.
(2) Caddy reverse proxy OVERWRITES X-Forwarded-For with the real TCP source IP - spoofed XFF values are replaced. This is Caddy-specific behavior and a useful safety net.
(3) X-Real-IP passes through completely unvalidated by Caddy - anyone can set any value.
(4) request.client.host (FastAPI/Starlette ASGI scope) always returns the LAST PROXY IP (e.g. Caddy at 172.18.0.10), never the end user. This is the TCP-level connection source to your app process, which is your reverse proxy in any Docker/container setup. Only useful if your app receives connections directly with no proxy in front.
(5) With Cloudflare orange cloud: CF-Connecting-IP correctly shows the real user IP (verified: Indian IP 223.185.x.x). X-Forwarded-For shows a Cloudflare internal edge IP (172.70.x.x), NOT the user.
(6) No single header works in all architectures. Your get_client_ip function must match your actual network topology: how many proxies, is Cloudflare present, is there a serverless hop? Test with your own browser and verify what lands in your database.
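One way to make the topology explicit is to honor forwarded headers only when the TCP peer is a proxy you operate. The sketch below is an assumption-laden illustration: TRUSTED_PROXY_NETS is a hypothetical Docker bridge range, resolve_client_ip is a hypothetical helper, and header names are assumed to be lower-cased. It only protects against clients that connect to the app directly; if your own reverse proxy forwards whatever CF-Connecting-IP it receives, the origin still has to be restricted to Cloudflare at the proxy or firewall.

```python
import ipaddress
from typing import Mapping, Optional

# Hypothetical: networks your own reverse proxy connects from.
TRUSTED_PROXY_NETS = [ipaddress.ip_network("172.18.0.0/16")]


def _is_trusted_peer(peer_ip: Optional[str]) -> bool:
    """True when the TCP peer address falls inside a trusted proxy network."""
    if not peer_ip:
        return False
    try:
        addr = ipaddress.ip_address(peer_ip)
    except ValueError:
        return False
    return any(addr in net for net in TRUSTED_PROXY_NETS)


def resolve_client_ip(headers: Mapping[str, str],
                      peer_ip: Optional[str]) -> str:
    """Pick the client IP; forwarded headers count only behind a trusted proxy."""
    if _is_trusted_peer(peer_ip):
        for name in ("cf-connecting-ip", "x-forwarded-for"):
            value = headers.get(name, "")
            if value:
                return value.split(",")[0].strip()
    # Direct connection (or unknown peer): ignore client-supplied headers.
    return peer_ip or "unknown"
```

With this shape, a curl request sent straight to the app with a forged CF-Connecting-IP is keyed on its real TCP address, while requests relayed by the trusted proxy use the forwarded value.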

2.9 Webhook Signature Verification

The Risk

Without signature verification, anyone who discovers your webhook URL can POST fake events - fake payment confirmations, fake user signups, fake booking notifications. Your backend processes them as real, leading to data corruption or unauthorized actions.

The Solution

When receiving a webhook, verify the signature that the sender attached. The sender (Stripe, GitHub, Brevo, etc.) signs each webhook with a shared secret - your server recalculates the signature and compares. If they don't match, the webhook is fake. Also reject webhooks older than 5 minutes to prevent replay attacks where someone resends a captured webhook later.

The Fix

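import hashlib
import hmac
import os
import time

def verify_webhook(payload: bytes, signature: str, secret: str):
    expected = hmac.new(
        secret.encode(), payload, hashlib.sha256
    ).hexdigest()
    # Compare as bytes: compare_digest rejects non-ASCII str input
    return hmac.compare_digest(f"sha256={expected}".encode(), signature.encode())

# Inside your webhook handler:
raw_body = await request.body()
sig = request.headers.get("x-webhook-signature", "")
# os.environ[...] fails loudly if the secret is not configured
if not verify_webhook(raw_body, sig, os.environ["WEBHOOK_SECRET"]):
    raise HTTPException(401, detail="Invalid signature")

# Reject old webhooks (replay protection). This only helps if the
# sender includes the timestamp in the signed data; otherwise an
# attacker can replay the body with a fresh timestamp header.
try:
    timestamp = int(request.headers.get("x-webhook-timestamp", "0"))
except ValueError:
    timestamp = 0
if abs(time.time() - timestamp) > 300:  # 5 minutes
    raise HTTPException(401, detail="Webhook expired")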
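Replay protection is only meaningful when the timestamp is covered by the signature. The sketch below shows one way that could look, assuming the sender signs the string "timestamp.payload"; that format, the helper names, and the "sha256=" prefix are assumptions for illustration, since each provider documents its own scheme.

```python
import hashlib
import hmac
import time

MAX_AGE_SECONDS = 300  # 5 minutes, as in the check above


def sign(secret: str, timestamp: int, payload: bytes) -> str:
    """Sign timestamp and body together so neither can be swapped alone."""
    message = str(timestamp).encode() + b"." + payload
    digest = hmac.new(secret.encode(), message, hashlib.sha256).hexdigest()
    return f"sha256={digest}"


def verify_signed_webhook(secret: str, timestamp_header: str, signature: str,
                          payload: bytes, now: float = None) -> bool:
    """Return True only for a fresh webhook whose signature covers its timestamp."""
    try:
        timestamp = int(timestamp_header)
    except (TypeError, ValueError):
        return False  # missing or malformed timestamp
    current = time.time() if now is None else now
    if abs(current - timestamp) > MAX_AGE_SECONDS:
        return False  # too old (or too far in the future)
    expected = sign(secret, timestamp, payload)
    return hmac.compare_digest(expected.encode(), signature.encode())
```

Because the timestamp is part of the signed message, changing the timestamp header on a captured webhook invalidates the signature.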

2.10 SSRF Protection

The Risk

If any endpoint fetches a URL supplied by the client (proxy endpoints, URL preview generators), attackers can use your server to reach internal services: AWS metadata at 169.254.169.254 (leaks credentials), localhost admin panels, or private network scanning. Your server becomes a weapon against your own infrastructure.

The Solution

Maintain a list of approved domains and URL paths that your server is allowed to fetch. When a user provides a URL, check that the domain and path are on your approved list before making the request. Block any requests to internal addresses like localhost or cloud metadata endpoints. This prevents attackers from using your server as a proxy to reach things they shouldn't.

The Fix

# Python - domain allowlist + redirect validation
from urllib.parse import urlparse

import httpx
from fastapi import HTTPException

ALLOWED_DOMAINS = {"api.example.com", "cdn.example.com"}
MAX_REDIRECTS = 5  # cap the hop count so a redirect loop cannot spin forever

def validate_url(url: str):
    parsed = urlparse(url)
    if parsed.scheme not in ("https",):
        raise HTTPException(400, "HTTPS only")
    if parsed.hostname not in ALLOWED_DOMAINS:
        raise HTTPException(400, "Domain not allowed")

validate_url(user_url)  # check initial URL

# CRITICAL: disable auto-redirect, validate each hop
resp = httpx.get(user_url, follow_redirects=False)
hops = 0
while resp.is_redirect:
    hops += 1
    if hops > MAX_REDIRECTS:
        raise HTTPException(400, "Too many redirects")
    redirect_url = str(resp.next_request.url)
    validate_url(redirect_url)  # check EVERY redirect target
    resp = httpx.get(redirect_url, follow_redirects=False)

# Also enforce response size cap
MAX_SIZE = 50 * 1024 * 1024  # 50MB

Most HTTP clients follow redirects by default. An attacker hosts a redirect on an allowed domain that points to 169.254.169.254 (cloud metadata). Also block file:// URIs and IP addresses in hostnames.

PREFERRED PATTERN: If your proxy uses structured parameters (not raw paths), use a value-based whitelist - client sends a parameter, server maps to a fixed URL: ALLOWED = ['stocks', 'users']; url = f"{BACKEND}/api/{param}". This is safer because the user never controls the URL path directly.
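The value-based whitelist and the IP-literal check can be sketched with the standard library alone. BACKEND, the resource names, and both helper names are hypothetical placeholders.

```python
import ipaddress
from typing import Optional

BACKEND = "https://api.example.com"      # hypothetical fixed backend
ALLOWED_RESOURCES = {"stocks", "users"}  # values the client may request


def build_backend_url(param: str) -> Optional[str]:
    """Map a client-supplied value to a fixed URL, or None if not allowed."""
    if param not in ALLOWED_RESOURCES:
        return None
    return f"{BACKEND}/api/{param}"


def is_ip_literal(hostname: Optional[str]) -> bool:
    """True when the hostname is a raw IPv4 or IPv6 address."""
    if not hostname:
        return False
    try:
        ipaddress.ip_address(hostname.strip("[]"))
        return True
    except ValueError:
        return False
```

Because the client only ever supplies a value that is looked up in a set, path tricks such as "../admin" never reach the URL at all; the IP-literal check is an extra guard for endpoints that still accept full URLs.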

2.11 API Monitoring

The Risk

Without request logging, you can't detect attacks, debug failures, or understand usage patterns. When something goes wrong, you're flying blind. Monitoring should log endpoints, status codes, and response times - not request bodies or PII.

The Solution

Add a monitoring middleware that automatically logs every API request - which endpoint was called, whether it succeeded or failed, and how long it took. Send these logs to a centralized dashboard so you can spot unusual patterns (sudden spike in errors, slow queries, unknown endpoints being probed). Only log operational data, never personal information.

The Fix

pip install your-api-monitor  # or any monitoring middleware

# Use try/except - if package isn't installed, app still starts
try:
    from your_api_monitor import APIMonitorMiddleware
    app.add_middleware(APIMonitorMiddleware,
        app_name="YOUR_APP_NAME",
        include_prefixes=("/api/", "/analyze"),
    )
except ImportError:
    pass  # monitoring unavailable, app runs without it

# Env vars: API_MONITOR_URL, API_MONITOR_KEY

Without try/except, a missing package crashes the entire app on startup. This happened on fresh deploys where pip install failed silently. Monitoring should degrade gracefully - a monitoring failure shouldn't take down your API.

2.12 File Upload Validation

The Risk

If your backend accepts file uploads without validation, attackers can upload executable files (.py, .sh, .exe), oversized files that fill your disk, or files disguised with fake extensions (.jpg that is actually a .html with embedded scripts). If uploaded files are served back to users from your domain, an uploaded HTML file executes in your site's security context.

The Solution

Validate every uploaded file on three levels: check the file extension against an allowlist (only accept the types your app actually needs), enforce a maximum file size so nobody fills your disk, and store uploaded files in a location where they cannot be executed by the server. Never trust the browser's Content-Type header - check the actual file contents. Serve downloads with Content-Disposition: attachment so browsers download instead of rendering.

The Fix

import os
from fastapi import HTTPException, UploadFile

ALLOWED_EXTENSIONS = {".csv", ".xlsx", ".parquet", ".duckdb"}
MAX_FILE_SIZE_MB = 50

async def validate_upload(file: UploadFile):
    # 1. Check extension (filename can be missing, so default to "")
    ext = os.path.splitext(file.filename or "")[1].lower()
    if ext not in ALLOWED_EXTENSIONS:
        raise HTTPException(400, "File type not allowed")

    # 2. Check size (read in chunks, don't load all into memory)
    size = 0
    while chunk := await file.read(8192):
        size += len(chunk)
        if size > MAX_FILE_SIZE_MB * 1024 * 1024:
            raise HTTPException(413, "File too large")
    await file.seek(0)  # reset for actual processing

    # 3. Serve with safe headers
    # Content-Disposition: attachment (download, don't render)
    # X-Content-Type-Options: nosniff (don't guess type)

Never store uploads in a publicly accessible folder. Use signed URLs with expiry for download access.
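The advice to check actual file contents can be sketched as a leading-bytes check. Only two of the allowed types are covered here: XLSX files are ZIP archives (starting with PK\x03\x04) and Parquet files start with PAR1. CSV is plain text with no signature, and no signature is assumed for .duckdb in this sketch. The helper name and the fall-through policy for unknown types are assumptions.

```python
# Leading bytes for formats with a known signature.
MAGIC_PREFIXES = {
    ".xlsx": b"PK\x03\x04",  # XLSX is a ZIP archive
    ".parquet": b"PAR1",     # Parquet files begin with PAR1
}


def content_matches_extension(ext: str, head: bytes) -> bool:
    """Compare the first bytes of an upload with what its extension implies."""
    expected = MAGIC_PREFIXES.get(ext.lower())
    if expected is None:
        # No signature known for this type; rely on the other checks.
        return True
    return head.startswith(expected)
```

In the upload handler this would run on the first chunk read from the file, before the size loop continues, so an HTML page renamed to .xlsx is rejected early.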

2.13 SSL Certificate Verification in Backend Calls

The Risk

Python's requests.post(..., verify=False) disables SSL certificate checking, enabling man-in-the-middle attacks. An attacker on the network path can intercept and modify traffic between your backend and other services - reading API keys, injecting malicious responses, or stealing data in transit. This is especially common in backends that were set up during development and never fixed for production.

The Solution

Always use verify=True (the default) for external API calls. When the target is behind Cloudflare or has a valid certificate, there is no excuse for verify=False. For Docker-internal calls where services communicate without SSL, use an internal URL variable that explicitly skips SSL only for known-internal addresses. This way, internal calls use Docker networking (no SSL needed) and external calls always verify certificates.

The Fix

# WRONG - disables all SSL verification
resp = requests.post(url, json=data, verify=False)

# RIGHT - verify certificates for external calls
resp = requests.post(url, json=data)  # verify=True is default

# For Docker-internal services (no SSL):
BACKEND_URL = os.getenv("BACKEND_URL")         # external, verified
BACKEND_URL_INTERNAL = os.getenv("BACKEND_URL_INTERNAL")  # Docker alias

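# Plain def: requests is blocking and would stall the event loop in async code
def call_backend(path, data):
    url = BACKEND_URL_INTERNAL or BACKEND_URL
    # Skip SSL only when the internal URL is actually in use; an empty
    # BACKEND_URL_INTERNAL must not disable verification for the external URL
    verify = not BACKEND_URL_INTERNAL
    return requests.post(f"{url}{path}", json=data, verify=verify)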

verify=False and empty API_KEY defaults (item 2.7) are correlated - backends with one usually have both. Audit them together. If you see verify=False in code review, check API_KEY handling too.

2.14 Centralized Logging & PII Retention

The Risk

Item 1.14 says "don't log PII in console.log" - that's about careless logging in serverless functions visible to anyone with dashboard access. But for incident response, you NEED request bodies and client IPs. Without them, you can't answer "what happened?" after an attack - you can see that endpoints were called, but not what payloads were sent (SQL injection attempts, credential stuffing patterns, malformed requests from scanning tools). The question isn't whether to capture PII, it's how to capture it safely with controls and auto-deletion.

The Solution

Set up a centralized logging pipeline: every backend sends request metadata (endpoint, status, response time, client IP, request body) to a dedicated logging service, which stores it in a database with a retention policy. Keep identifiable data (IP, request body) for 30 days for incident response, then auto-NULL it via a daily cron job. Keep permanent non-identifiable hashes (SHA256 of request body) for pattern analysis. All backends flow to one dashboard so you can correlate attacks across services - during an incident, you need one place to look, not 30 containers to SSH into.

The Fix

# The pipeline:
# Backend → monitoring middleware → Logger Service → DB → Dashboard
#                                                        ↓
#                                          Daily cron: NULL PII > 30 days

# Each backend: 3 lines + 2 env vars
try:
    from your_api_monitor import APIMonitorMiddleware
    app.add_middleware(APIMonitorMiddleware,
        app_name="APP_NAME",
        include_prefixes=("/api/",),
    )
except ImportError:
    pass
# Env vars: API_MONITOR_URL, API_MONITOR_KEY

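# PII cleanup endpoint (called by daily cron) - keep it behind admin auth (item 2.7)
@app.post("/admin/cleanup-raw-data")
async def cleanup_raw_data():
    retention = int(os.getenv("PII_RETENTION_DAYS", "30"))
    # Bind the day count as a real parameter (Postgres make_interval); a
    # placeholder inside the quoted '... days' literal is not reliably substituted.
    # %s is psycopg-style - use $1 with asyncpg.
    result = await db.execute(
        "UPDATE api_logs SET client_ip = NULL, request_body = NULL "
        "WHERE created_at < NOW() - make_interval(days => %s) "
        "AND (client_ip IS NOT NULL OR request_body IS NOT NULL)",
        (retention,),
    )
    return {"nullified": result.rowcount}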
# request_body_hash stays - count patterns without identifying sender
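What the monitoring middleware might store per request can be sketched as a plain function. This is an assumption about the record shape, not the actual APIMonitorMiddleware: build_log_record and its parameters are hypothetical, and the field names simply mirror the api_logs columns used in the cleanup query.

```python
import hashlib
import time

MAX_BODY_BYTES = 10 * 1024  # stored bodies are truncated to 10KB

def build_log_record(path, status, headers, body, peer_ip):
    """Assemble one log row; headers is a dict with lowercase keys, body is bytes."""
    # Trust cf-connecting-ip only when the backend is reachable solely via Cloudflare.
    client_ip = headers.get("cf-connecting-ip") or peer_ip
    stored_body = body[:MAX_BODY_BYTES].decode("utf-8", errors="replace")
    return {
        "endpoint": path,
        "status": status,
        "client_ip": client_ip,            # NULLed after the retention period
        "request_body": stored_body,       # NULLed after the retention period
        "request_body_hash": hashlib.sha256(body).hexdigest(),  # kept permanently
        "created_at": time.time(),
    }
```

The hash is taken over the full body before truncation, so two oversized payloads that differ only past the 10KB mark still count as different patterns.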

This is the opposite of item 1.14 and that's intentional. 1.14 is about careless logging (console.log in Vercel). This is deliberate capture with access controls and auto-deletion. The middleware should extract the client IP automatically (from cf-connecting-ip when the backend sits behind Cloudflare, which ties into item 2.8) - don't rely on individual endpoints to log it. Truncate request bodies to 10KB max to prevent log storage abuse.

One more trap with this exact pipeline: do NOT rate-limit your own log collector per-IP. A normal rate limit (one address may make only N requests a minute) is designed for real visitors, who each arrive from their own address. But your logs all arrive from a HANDFUL of addresses - one shared backend, or a single edge worker that is one IP for everything. A per-IP limit then counts all your telemetry as if it were one person hammering you, and starts dropping your own logs - worst during a traffic spike, exactly when you most want the record. The collector is already gated by its secret key, so give it a high (or per-key) limit, not a tight per-IP one.
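A per-key limit for the collector could look roughly like this. It is a sketch under assumptions: PerKeyLimiter is a hypothetical in-memory helper for a single process, and the numbers in the usage note are placeholders.

```python
import time
from collections import defaultdict, deque

class PerKeyLimiter:
    """Sliding-window limiter keyed by collector API key, not by client IP."""

    def __init__(self, max_requests, window_seconds):
        self.max_requests = max_requests
        self.window = window_seconds
        self._hits = defaultdict(deque)   # key -> timestamps of recent requests

    def allow(self, api_key, now=None):
        now = time.monotonic() if now is None else now
        hits = self._hits[api_key]
        while hits and now - hits[0] >= self.window:
            hits.popleft()                # drop timestamps outside the window
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True
```

For example, PerKeyLimiter(6000, 60) would give each backend's key its own generous budget regardless of how many backends share an outbound IP. With SlowAPI the same idea would go into the key_func passed to Limiter, returning the API key instead of the remote address.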

2.15 Old Report / Temp File Cleanup

The Risk

Endpoints that generate reports (PDF, HTML, CSV) create files on disk. Without cleanup, these accumulate indefinitely - filling the disk, slowing the server, and potentially exposing old reports to anyone who guesses the filename pattern.

The Solution

Add cleanup logic that deletes generated files older than a set period (e.g., 2 hours). Run cleanup on each new request (lazy cleanup) or via a scheduled cron job. Use a dedicated output directory that's easy to sweep.

The Fix

import time
from pathlib import Path

OUTPUT_DIR = Path("./output")
MAX_AGE_HOURS = 2

def cleanup_old_files():
    """Delete generated files older than MAX_AGE_HOURS."""
    cutoff = time.time() - (MAX_AGE_HOURS * 3600)
    for f in OUTPUT_DIR.glob("*"):
        try:
            if f.is_file() and f.stat().st_mtime < cutoff:
                f.unlink()
        except FileNotFoundError:
            pass  # a concurrent request already removed this file

# Call at the start of each report generation endpoint
@app.post("/generate-report")
async def generate_report(...):
    cleanup_old_files()  # sweep old files first
    # ... generate new report

This applies to any endpoint that creates files: PDF generators, CSV exports, chart images, temp uploads. Never store generated files in a publicly accessible static directory - use a non-routable path and serve via a download endpoint with proper auth.
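The download endpoint mentioned above needs to map a requested name onto the output directory without letting the name climb out of it. A minimal sketch, assuming the same ./output directory; resolve_report_path is a hypothetical helper and the auth check itself is left to the endpoint.

```python
from pathlib import Path

OUTPUT_DIR = Path("./output").resolve()   # same directory the cleanup sweeps

def resolve_report_path(filename, base=OUTPUT_DIR):
    """Return the file's path inside base, or None if missing or outside base."""
    candidate = (base / filename).resolve()
    if not candidate.is_relative_to(base):   # Python 3.9+
        return None                          # e.g. "../../etc/passwd" or an absolute path
    return candidate if candidate.is_file() else None
```

In a FastAPI handler, an authenticated route would call this with the requested name and return a FileResponse for the path, or a 404 when it gets None.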

2.16 Read-Only Enforcement

The Risk

SQL validation (item 2.4) blocks write keywords in queries, but regex-based validation can have gaps - clever encoding, dialect-specific syntax, or new SQL features might bypass it. If the database connection itself allows writes, a validation bypass means data can be modified or deleted.

The Solution

Enforce read-only mode at both the SQL validation level AND the database connection level. This is defense in depth - even if SQL validation misses something, the database connection itself will reject any write operation.

The Fix

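-- Postgres: use a read-only database role
CREATE ROLE readonly_user WITH LOGIN PASSWORD 'xxx';
GRANT USAGE ON SCHEMA public TO readonly_user;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO readonly_user;
-- The GRANT above covers existing tables only; cover tables created later
-- (by the role that runs this statement) as well:
ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT SELECT ON TABLES TO readonly_user;
-- Connect with this role for all query endpoints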

# DuckDB: open in read-only mode
conn = duckdb.connect("data.duckdb", read_only=True)

# Application level: validate BEFORE database level
validated_sql = validate_sql(raw_sql)  # regex + parser
result = readonly_conn.execute(validated_sql)  # read-only conn

Write operations (data imports, admin updates) should use a separate connection with a different role - never the same connection used for user-facing queries. Keep write endpoints behind additional auth and rate limiting.
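The effect of the connection-level guard can be seen in a few lines. This sketch swaps in the standard library's sqlite3 purely as a stand-in so it runs without a server; it is not the Postgres or DuckDB setup above, and run_user_query is a hypothetical name.

```python
import os
import sqlite3
import tempfile

# Build a throwaway database with a normal (writable) connection.
db_path = os.path.join(tempfile.mkdtemp(), "demo.db")
rw = sqlite3.connect(db_path)
rw.execute("CREATE TABLE t (x INTEGER)")
rw.execute("INSERT INTO t VALUES (1)")
rw.commit()
rw.close()

# Open the same file read-only, as a user-facing query path would.
ro = sqlite3.connect(f"file:{db_path}?mode=ro", uri=True)

def run_user_query(sql):
    """Execute SQL on the read-only connection; writes fail at the database layer."""
    try:
        return ro.execute(sql).fetchall()
    except sqlite3.OperationalError as exc:
        return f"rejected: {exc}"
```

Here run_user_query("SELECT x FROM t") returns the row, while run_user_query("DELETE FROM t") comes back as a rejection even though no validator looked at it. That is the backstop behaviour the read-only role and read_only=True provide.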

2.17 Asymmetric-Cost Endpoints (Heavy Work Behind a Cheap Request)

The Risk

Think of a button that's one easy click for a visitor but makes your server sweat - "Download everything as one file", "Generate my PDF report", "Export the whole dataset". One click for them can be 30 seconds of hard work for your server. An attacker doesn't need an army: they just keep triggering that one expensive thing, from a handful of different internet addresses so it never looks like a single person hammering you - and your server stays pinned, slows down, or falls over. The sneaky part: most security alarms listen for ERROR responses (lots of "page not found", failed logins). But this expensive request returns a normal SUCCESS, so the alarm never rings while the damage is done. And because the attacker keeps switching addresses, a "too many requests from one address" limit never trips either. (In technical terms: a heavy endpoint returning HTTP 200 is invisible to 4xx-based jails, and rotating IPs defeat per-IP limits.)

The Solution

Don't rebuild the expensive thing every time someone asks. Make it a few times a day on a timer (a scheduled job) and hand everyone the same ready-made copy - like baking the bread each morning instead of from scratch per customer. If something truly must be built on demand, only let a couple be built at the same time (a queue/limit), so a flood waits in line instead of all crashing onto the server at once. Make sure that tacking random junk onto the web address can't trick your server into rebuilding - it should recognise it's the same file and serve the ready copy. And limit abuse not just per-address but per-network (the attacker's whole "neighbourhood" of addresses), since attackers hop between addresses. (Technical: pre-generate + cache, a global concurrency semaphore on the build path, a cache key that ignores junk params, and rate-limit per-IP AND per-ASN/subnet.)

The Fix

# Global concurrency cap on the expensive path (FastAPI example)
import asyncio
from fastapi import HTTPException

_build_sem = asyncio.Semaphore(2)   # at most 2 heavy builds at once, per worker process

@app.get("/report/heavy")
async def heavy(...):
    cached = get_cached(stable_key)          # key ignores junk ?cb= params
    if cached: return cached                 # cheap path for everyone
    if _build_sem.locked():                  # already at capacity
        raise HTTPException(429, "busy, retry shortly")
    async with _build_sem:
        # Run the blocking build in a thread; called inline it would freeze
        # the event loop and the cap would never see two builds at once.
        return await asyncio.to_thread(build_and_cache, ...)

# Better still: pre-generate on a cron and serve the static result (no build on request).

The jail-invisible-200 point is the subtle, often-missed one: most auto-block systems watch for a flood of 4xx. A heavy endpoint that happily returns 200 is exactly what they DON'T catch. Audit your app for "cheap request, expensive response" endpoints and protect those specifically - they're the take-it-down vector even when the rest of the app is well rate-limited. (For large file downloads specifically, the cleanest fix is to serve them from edge object storage - see the Perimeter section.)
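Two of the smaller pieces, the junk-proof cache key and the per-network rate-limit bucket, can be sketched with the standard library. The names stable_cache_key, subnet_key and ALLOWED_PARAMS are hypothetical, and the /24 and /64 prefix sizes are a common starting assumption rather than a rule. Per-ASN grouping needs an external IP-to-ASN database and is not shown.

```python
import ipaddress
from urllib.parse import parse_qsl, urlsplit

ALLOWED_PARAMS = {"format", "range"}   # hypothetical: the only params that change the output

def stable_cache_key(url):
    """Cache key built from the path plus whitelisted params, in sorted order."""
    parts = urlsplit(url)
    kept = sorted((k, v) for k, v in parse_qsl(parts.query) if k in ALLOWED_PARAMS)
    return parts.path + "?" + "&".join(f"{k}={v}" for k, v in kept)

def subnet_key(ip):
    """Rate-limit bucket: the /24 for IPv4, the /64 for IPv6."""
    addr = ipaddress.ip_address(ip)
    prefix = 24 if addr.version == 4 else 64
    return str(ipaddress.ip_network(f"{ip}/{prefix}", strict=False))
```

With this, /report/heavy?format=pdf&cb=123 and /report/heavy?cb=999&format=pdf map to the same cache entry, and 203.0.113.57 and 203.0.113.200 share one rate-limit bucket (203.0.113.0/24).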

2.18 Lock Down HTML Reports Your Backend Builds and Serves

The Risk

When your backend turns user input into a web page and serves it back (a report, a chart page, an analysis "tearsheet"), whatever the user typed (a stock symbol, a label, a name) gets printed inside that page. If you drop it in raw, a visitor can supply a value that is not really a label but a hidden instruction, and the browser will run it as code on YOUR web address. That smuggled code can read the visitor's login session, change what the page shows, or send them to a fake login screen. This is the same family as the front-end script-injection problem, just on the server side: the risk is ANY user value that lands inside an HTML page you hand back.

The Solution

Use two cheap defenses together. First, "escape" the user's text before you place it in the page, so characters like < and > turn into harmless symbols instead of the start of a tag. Second, attach two response headers that tell the browser to be strict: a Content-Security-Policy (which sources of scripts, styles and images are even allowed to run) and X-Content-Type-Options: nosniff (do not guess that a file is something other than what you said). Together they give you a second line: if a bad value slips past the escaping, a sufficiently strict policy makes the browser refuse to run the injected script. Apply the headers to every report page your backend serves - both files served from your /static folder and any HTML you return directly.

The Fix

import html
# 1. Escape the user value where it enters the page
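heading = f"<h1>{html.escape(user_symbol)} Report</h1>"   # <, >, &, " and ' become safe

# 2. Security headers on report pages served from /static (FastAPI middleware).
#    HTML returned directly from a route needs the same headers set on its response.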
@app.middleware("http")
async def report_headers(request, call_next):
    resp = await call_next(request)
    if request.url.path.startswith("/static/"):
        resp.headers["X-Content-Type-Options"] = "nosniff"
        if request.url.path.endswith(".html"):
            resp.headers["Content-Security-Policy"] = (
                "default-src 'self'; "
                "script-src 'self' 'unsafe-inline'; "   # the report's own inline scripts
                "style-src 'self' 'unsafe-inline'; "
                "img-src 'self' data: https:; "
                "object-src 'none'; base-uri 'none'; "
                "form-action 'none'; frame-ancestors 'self'"
            )
    return resp

The trap to avoid: if the report has its OWN legitimate scripts (interactive charts, expand/collapse buttons, a download button) then a strict "no scripts at all" policy will silently break them - the page still shows but the buttons go dead. Use a relaxed policy that allows the page's own inline scripts (and the chart library it loads from a CDN, if any - add that origin to script-src) while still blocking plugins (object-src none), address-bar tricks (base-uri none), forms that post to other sites (form-action none) and being embedded by foreign sites (frame-ancestors 'self'). Know the cost of that relaxation: with 'unsafe-inline' in script-src the policy no longer stops an injected inline script, so escaping is the defense that actually blocks injection and the headers limit what else can go wrong. If the report's scripts can be moved into files or given a nonce, drop 'unsafe-inline' and the policy blocks injected scripts too.

Always confirm the report renders the SAME before and after adding the headers. And escape at the one spot where the value enters the page, not scattered everywhere. If your backend just hands the HTML to another service to host (e.g. a markdown-to-PDF service that also returns an HTML preview), the headers and escaping belong on whoever actually serves the page.
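For HTML that a route returns directly rather than serving from /static, the same two defenses can be keyed on content type instead of URL path. A minimal sketch: render_heading and report_headers_for are hypothetical helpers, and REPORT_CSP is shortened here for illustration.

```python
import html

# Shortened for the sketch; use the full policy your report pages need.
REPORT_CSP = "default-src 'self'; object-src 'none'; base-uri 'none'"

def render_heading(user_symbol):
    """Escape the user value at the single point where it enters the page."""
    return f"<h1>{html.escape(user_symbol)} Report</h1>"

def report_headers_for(content_type):
    """Headers to add to a response, chosen by content type rather than URL path."""
    headers = {"X-Content-Type-Options": "nosniff"}
    if content_type.split(";")[0].strip().lower() == "text/html":
        headers["Content-Security-Policy"] = REPORT_CSP
    return headers
```

A value such as <script>alert(1)</script> comes out of render_heading as inert text (&lt;script&gt;...), and a middleware can merge report_headers_for(resp.headers.get("content-type", "")) into every response so directly returned HTML is covered as well.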
