Attack Vectors
Every attack type Arcis blocks, what it is, and how the protection works. Use this as a reference when investigating a blocked request or auditing coverage.
Bypass resistance (v1.6)
Every detector below sits on top of three normalization passes that close the encoding-, casing-, and Unicode-bypass classes in one place. A fullwidth ＜script＞, a URL-encoded %3Cscript%3E, and a plain <script> all reach the XSS detector as the same string.
NFKC normalization
Unicode NFKC compatibility normalization runs at the top of sanitizeString / sanitize_string / SanitizeString across all three SDKs. Fullwidth glyphs (＜, ＞, ＇), ligatures, half-width katakana, mathematical alphanumeric symbols, and other compatibility variants collapse to their canonical forms (plain ASCII for fullwidth Latin letters and punctuation) before any pattern matches. This closes the entire fullwidth-bypass class for every detector in one pass.
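A minimal sketch of what NFKC folding does to the inputs described above, using Python's standard unicodedata module. The helper name nfkc is illustrative only and is not part of the Arcis API.

```python
import unicodedata

def nfkc(text: str) -> str:
    """Apply Unicode NFKC compatibility normalization (illustrative helper)."""
    return unicodedata.normalize("NFKC", text)

# Fullwidth angle brackets (U+FF1C, U+FF1E) fold to ASCII < and >.
assert nfkc("\uff1cscript\uff1e") == "<script>"

# Ligatures expand to plain letters: U+FB01 becomes "fi".
assert nfkc("\ufb01le") == "file"

# Half-width katakana folds to the standard-width letter, not to ASCII.
assert nfkc("\uff76") == "\u30ab"
```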
Multi-decode chain
URL decoding plus HTML entity decoding, bounded at 4 passes, runs immediately after NFKC. Triple-encoded %2526lt%253Bscript%2526gt%253B resolves to plain <script> before pattern matching. The bound prevents pathological inputs from blowing up CPU.
Behavior change. As of v1.6, sanitizeString("John%20Doe") returns "John Doe" instead of "John%20Doe". If you were relying on percent-encoded literals reaching downstream code, decode at your own boundary instead.
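The bounded decode loop can be sketched as follows. This is an assumption about the general shape of such a chain, not the Arcis source: multi_decode and MAX_DECODE_PASSES are hypothetical names, and the order (URL decoding, then entity decoding, per pass) is one plausible reading of the description.

```python
import html
from urllib.parse import unquote

MAX_DECODE_PASSES = 4  # the bound stated in the text above

def multi_decode(text: str, max_passes: int = MAX_DECODE_PASSES) -> str:
    """URL-decode, then HTML-entity-decode, until stable or the bound is hit."""
    for _ in range(max_passes):
        decoded = html.unescape(unquote(text))
        if decoded == text:  # nothing left to decode, stop early
            break
        text = decoded
    return text
```

With this sketch the triple-encoded example settles after two passes, while an input encoded more deeply than the bound is returned partially decoded instead of being processed indefinitely.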
Mutation tester
A test suite that runs every payload through 8 mutators (alternating case, uppercase, URL-encode once, URL-encode twice, HTML hex entity, HTML decimal entity, HTML named entity, fullwidth ASCII) across the XSS, SQL injection, and path traversal corpora. It performs 142 mutation checks per SDK on Python and Node today; the Go variant lands in v1.7. If a future pattern change reopens any of these bypass classes, CI fails loudly.
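The eight mutators listed above could look roughly like this. The dictionary keys and helper names are hypothetical, and the real suite may implement each mutation differently.

```python
import html
from urllib.parse import quote

def fullwidth(text: str) -> str:
    """Shift printable ASCII (0x21-0x7E) into the fullwidth block (U+FF01-U+FF5E)."""
    return "".join(
        chr(ord(c) + 0xFEE0) if 0x21 <= ord(c) <= 0x7E else c for c in text
    )

# One entry per mutator named in the text; names are illustrative.
MUTATORS = {
    "alternating_case": lambda s: "".join(
        c.upper() if i % 2 else c.lower() for i, c in enumerate(s)
    ),
    "uppercase": str.upper,
    "url_once": lambda s: quote(s, safe=""),
    "url_twice": lambda s: quote(quote(s, safe=""), safe=""),
    "html_hex": lambda s: "".join(f"&#x{ord(c):x};" for c in s),
    "html_decimal": lambda s: "".join(f"&#{ord(c)};" for c in s),
    "html_named": html.escape,
    "fullwidth": fullwidth,
}

def mutate_all(payload: str) -> dict:
    """Return every mutated variant of a payload, keyed by mutator name."""
    return {name: mutate(payload) for name, mutate in MUTATORS.items()}
```

A CI check would then feed each variant from mutate_all(payload) to the detector under test and fail if any variant goes undetected.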
XSS (Cross-Site Scripting)
Attackers inject malicious scripts into pages viewed by other users, stealing cookies, sessions, or defacing content.
How Arcis stops it: Strips <script>, javascript:, event handlers (onclick=), <iframe>, <object>, <embed>, SVG onload, and HTML injection vectors (<form>, <meta>, <base>, <link>). Then HTML-encodes remaining special characters.
SQL Injection
Attackers inject SQL syntax into inputs to read, modify, or destroy database data.
How Arcis stops it: Detects SQL keywords (SELECT, UNION, DROP), comment syntax (--, /*), boolean logic (OR 1=1), and time-based blind attacks (SLEEP, BENCHMARK, pg_sleep, WAITFOR DELAY). Replaces detected patterns with [BLOCKED].
Defense-in-depth only. Arcis is not a replacement for parameterized queries. Always use an ORM or prepared statements as your primary defense.
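For reference, the primary defense mentioned above looks like this with Python's built-in sqlite3 driver. The table and the find_user helper are made up for the example; the same placeholder idea applies to other drivers and ORMs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, role TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 'admin')")

def find_user(conn: sqlite3.Connection, name: str) -> list:
    # The ? placeholder binds `name` as data, so it is never parsed as SQL.
    cur = conn.execute("SELECT name, role FROM users WHERE name = ?", (name,))
    return cur.fetchall()
```

A classic payload such as ' OR 1=1 -- is compared against the name column as a literal string and matches no rows.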
NoSQL Injection
Attackers pass MongoDB operators in JSON bodies to bypass authentication or query logic.
How Arcis stops it: Blocks 35 dangerous MongoDB operators including $gt, $where, $regex, $function, $accumulator, $expr, and aggregation pipeline operators. Dangerous keys are stripped from objects recursively (case-insensitive).
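A sketch of recursive, case-insensitive key stripping. BLOCKED_OPERATORS here holds only the six operators named above, not the full list of 35, and strip_operators is a hypothetical name.

```python
# Subset of the blocked operators, taken from the names listed in the text.
BLOCKED_OPERATORS = {"$gt", "$where", "$regex", "$function", "$accumulator", "$expr"}

def strip_operators(value):
    """Return a copy of `value` with blocked operator keys removed at any depth."""
    if isinstance(value, dict):
        return {
            key: strip_operators(child)
            for key, child in value.items()
            if not (isinstance(key, str) and key.lower() in BLOCKED_OPERATORS)
        }
    if isinstance(value, list):
        return [strip_operators(item) for item in value]
    return value
```

A login body such as {"user": "a", "password": {"$gt": ""}} comes out as {"user": "a", "password": {}}, so the operator never reaches the query layer.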
Command Injection
Attackers inject shell commands into inputs that get passed to exec(), spawn, or similar.
How Arcis stops it: Strips shell metacharacters (;, &, |, backtick, $()), URL-encoded control characters (%00 to %1F), and redirection operators (>>, <<).
Path Traversal
Attackers use ../ to access files outside the intended directory.
How Arcis stops it: Strips ../, ..\, URL-encoded variants (%2e%2e), and double-encoded sequences (%252e). Unicode-normalizes input first (NFKC) to catch fullwidth-dot bypasses. Loops until stable to catch nested sequences like ....//.
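The "loop until stable" step matters because a single stripping pass can reassemble a new traversal sequence. A minimal sketch, assuming percent-decoding has already happened upstream; strip_traversal is a hypothetical name.

```python
import re
import unicodedata

_TRAVERSAL = re.compile(r"\.\.[/\\]")  # "../" or "..\"

def strip_traversal(path: str) -> str:
    """Normalize with NFKC, then strip traversal sequences until nothing changes."""
    path = unicodedata.normalize("NFKC", path)
    while True:
        cleaned = _TRAVERSAL.sub("", path)
        if cleaned == path:  # stable: no sequence was removed in this pass
            return cleaned
        path = cleaned
```

For ....//etc/passwd the first pass leaves ../etc/passwd, and only the second pass reduces it to etc/passwd.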
SSRF (Server-Side Request Forgery)
Attackers trick the server into making requests to internal services or cloud metadata endpoints.
How Arcis stops it: validateUrl() blocks private IPs (10.x, 172.16-31.x, 192.168.x), loopback (127.x, ::1), link-local (169.254.0.0/16, which includes AWS/GCP/Azure metadata), cloud metadata hostnames, and IP-encoding bypasses: decimal (2130706433), octal (0177.0.0.1), hex (0x7f000001), IPv6-mapped (::ffff:127.0.0.1).
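The IP-encoding bypasses above can be handled by parsing the host the lenient way many resolvers do before classifying it. This sketch covers IP literals only and is not the validateUrl() implementation; parse_ipv4_lenient and is_blocked_host are hypothetical names, and hostname resolution and metadata hostnames are out of scope here.

```python
import ipaddress
import re

_IPV4_PART = re.compile(r"0[xX][0-9a-fA-F]+|[0-9]+")

def parse_ipv4_lenient(host: str):
    """Parse decimal, octal, hex, and short dotted IPv4 forms; None if not IPv4."""
    parts = host.split(".")
    if len(parts) > 4 or not all(_IPV4_PART.fullmatch(p) for p in parts):
        return None
    try:
        nums = [
            int(p, 16) if p[:2].lower() == "0x"
            else int(p, 8) if len(p) > 1 and p[0] == "0"
            else int(p)
            for p in parts
        ]
    except ValueError:  # e.g. "08" is not valid octal
        return None
    *head, last = nums
    # Leading parts are single bytes; the last part fills the remaining bytes.
    if any(n > 0xFF for n in head) or last >= 256 ** (4 - len(head)):
        return None
    value = last
    for i, n in enumerate(head):
        value |= n << (8 * (3 - i))
    return ipaddress.IPv4Address(value)

def is_blocked_host(host: str) -> bool:
    """True if `host` is an IP literal in a private, loopback, or link-local range."""
    host = host.strip("[]")  # bracketed IPv6 literal from a URL
    addr = parse_ipv4_lenient(host)
    if addr is None:
        try:
            addr = ipaddress.ip_address(host)
        except ValueError:
            return False  # not an IP literal; hostname checks are not shown
    mapped = getattr(addr, "ipv4_mapped", None)
    if mapped is not None:  # unwrap ::ffff:a.b.c.d
        addr = mapped
    return addr.is_private or addr.is_loopback or addr.is_link_local
```

The lenient IPv4 parse runs first on purpose: strict parsers either reject or misread forms like 0177.0.0.1, while many HTTP clients resolve them to 127.0.0.1.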
CSRF (Cross-Site Request Forgery)
Attackers trick logged-in users into submitting forms to your site from a malicious page.
How Arcis stops it: csrfProtection() issues an HMAC-signed double-submit cookie. Token validation uses constant-time comparison to prevent timing attacks. Optionally enforces the __Host- cookie prefix, which makes browsers accept the cookie only if it is Secure, has Path=/, and carries no Domain attribute.
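A rough sketch of an HMAC-signed double-submit token with constant-time checks. The token layout (nonce.signature), the binding to a session ID, and the function names are assumptions made for the example, not the format csrfProtection() uses.

```python
import hashlib
import hmac
import secrets

def _sign(secret: bytes, session_id: str, nonce: str) -> str:
    message = f"{session_id}.{nonce}".encode()
    return hmac.new(secret, message, hashlib.sha256).hexdigest()

def issue_token(secret: bytes, session_id: str) -> str:
    """Create a token to set as the cookie and echo in a header or form field."""
    nonce = secrets.token_hex(16)
    return f"{nonce}.{_sign(secret, session_id, nonce)}"

def verify_token(secret: bytes, session_id: str,
                 cookie_token: str, submitted_token: str) -> bool:
    """Check the double-submit match, then the HMAC, both in constant time."""
    if not hmac.compare_digest(cookie_token.encode(), submitted_token.encode()):
        return False
    nonce, _, signature = cookie_token.partition(".")
    expected = _sign(secret, session_id, nonce)
    return hmac.compare_digest(signature.encode(), expected.encode())
```

The signature is what stops an attacker who can plant a cookie (for example from a sibling subdomain) from minting a matching pair without the server secret.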
SSTI (Server-Side Template Injection)
Attackers inject template syntax that gets evaluated server-side (Jinja2, Twig, ERB, etc.).
How Arcis stops it: Detects and strips Jinja2 ({{ }}), Twig, Freemarker (${ }), ERB / EJS (<% %>), Pug (#{ }), and Python dunder chains (__class__, __mro__, __globals__).
XXE (XML External Entity)
Attackers inject malicious XML entities to read local files or make SSRF requests.
How Arcis stops it: Strips <!DOCTYPE>, <!ENTITY>, SYSTEM/PUBLIC references, <![CDATA[]]> sections, and parameter entity syntax.
LDAP Injection
Attackers inject LDAP filter syntax to bypass authentication or read directory data.
How Arcis stops it: sanitizeLdap() escapes filter operators (*, (, ), \, null byte) in search filters. sanitizeLdapDn() escapes DN special characters (\, #, +, <, >, ;, ").
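For search filters, the standard escaping (RFC 4515) replaces each special character with a backslash and its two-digit hex code. A sketch of that scheme follows; escape_ldap_filter is a hypothetical name, and DN escaping is not shown.

```python
# Filter metacharacters and their RFC 4515 escaped forms.
_LDAP_FILTER_ESCAPES = {
    "\\": r"\5c",
    "*": r"\2a",
    "(": r"\28",
    ")": r"\29",
    "\x00": r"\00",
}

def escape_ldap_filter(value: str) -> str:
    """Escape a value for safe interpolation into an LDAP search filter."""
    # Character-by-character mapping avoids double-escaping the backslash.
    return "".join(_LDAP_FILTER_ESCAPES.get(ch, ch) for ch in value)
```

An input like *)(uid=* then stays a literal value inside the filter instead of closing the clause and adding a new one.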
JSONP Injection
Attackers inject malicious JavaScript into JSONP callback names.
How Arcis stops it: sanitizeJsonpCallback() allows only safe identifier characters, with dots permitted solely as separators in dotted identifiers. Any payload containing brackets, parentheses, or other special characters is rejected.
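An allowlist for dotted identifiers can be expressed as a single anchored pattern. This is a sketch of the idea; the exact character set Arcis accepts is not stated in the text, so the pattern and the is_safe_callback name are assumptions.

```python
import re

# One or more JavaScript-style identifiers joined by single dots.
_IDENT = r"[A-Za-z_$][A-Za-z0-9_$]*"
_CALLBACK = re.compile(rf"{_IDENT}(?:\.{_IDENT})*")

def is_safe_callback(name: str) -> bool:
    """True only if the whole string is a plain or dotted identifier."""
    return _CALLBACK.fullmatch(name) is not None
```

Matching the whole string with fullmatch is the important part: a prefix match would accept handleData;alert(1).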
Prototype Pollution
Attackers inject __proto__, constructor, or prototype keys into JSON bodies to poison JavaScript object prototypes.
How Arcis stops it: Recursively strips 7 dangerous keys (case-insensitive): __proto__, constructor, prototype, __defineGetter__, __defineSetter__, __lookupGetter__, __lookupSetter__.
Rate Limiting
Attackers flood endpoints for brute force, scraping, or denial of service.
How Arcis stops it: Three algorithms available: fixed window (simple), sliding window (smooth), token bucket (burst-tolerant). Per-IP isolation. Pluggable storage: in-memory default, Redis for multi-instance deployments. Fails open on infrastructure errors.
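Of the three algorithms, the token bucket is the least obvious, so here is a minimal in-memory sketch. The class and parameter names are hypothetical; a production limiter would keep one bucket per IP and could store the state in Redis.

```python
import time

class TokenBucket:
    """Allow bursts up to `capacity`, refilled at `refill_per_sec` tokens per second."""

    def __init__(self, capacity: float, refill_per_sec: float, clock=time.monotonic):
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.clock = clock            # injectable so tests can control time
        self.tokens = float(capacity)
        self.updated = clock()

    def allow(self) -> bool:
        now = self.clock()
        # Add the tokens earned since the last call, capped at capacity.
        earned = (now - self.updated) * self.refill_per_sec
        self.tokens = min(self.capacity, self.tokens + earned)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

Compared with a fixed window, a client that has been idle can spend its saved tokens in a short burst, but its sustained rate never exceeds the refill rate.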
Bot Detection
Scrapers, crawlers, and scanner bots probe for vulnerabilities or scrape content.
How Arcis stops it: 635 patterns across 7 categories (search engines, social, monitoring, AI crawlers, scrapers, automated tools, unknown). Behavioral fingerprinting on top of user-agent matching.
Security Headers
Response headers instruct browsers to enforce additional security policies.
How Arcis stops it: Sets 16 security headers on every response, including Content-Security-Policy, Strict-Transport-Security, X-Frame-Options: DENY, X-Content-Type-Options: nosniff, Referrer-Policy, Cross-Origin-Opener-Policy, Cross-Origin-Resource-Policy, Cross-Origin-Embedder-Policy, Origin-Agent-Cluster.
Open Redirect
Attackers use your redirect endpoints to send users to phishing sites from your domain.
How Arcis stops it: validateRedirect() allows only relative paths or whitelisted hostnames. Blocks absolute URLs, javascript:, protocol-relative (//evil.com), and backslash bypasses.
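The checks listed above could be combined along these lines. This is a sketch, not the validateRedirect() source: is_safe_redirect and allowed_hosts are hypothetical names, and this version is deliberately strict in accepting only paths that start with a single slash.

```python
from urllib.parse import urlsplit

def is_safe_redirect(target: str, allowed_hosts=frozenset()) -> bool:
    """Accept same-site absolute paths, or http(s) URLs on an allowlisted host."""
    # Backslashes and control characters are normalized inconsistently by browsers.
    if "\\" in target or any(ord(c) < 0x20 for c in target):
        return False
    try:
        parts = urlsplit(target)
    except ValueError:
        return False
    if parts.scheme or parts.netloc:
        # Covers javascript:, protocol-relative //host, and foreign absolute URLs.
        return parts.scheme in ("http", "https") and parts.hostname in allowed_hosts
    # Relative target: one leading slash only ("///host" is treated as a host).
    return target.startswith("/") and not target.startswith("//")
```

Comparing parts.hostname instead of searching the raw string avoids being fooled by userinfo tricks such as https://app.example.com@evil.com.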
Error Leakage
Uncaught errors expose stack traces, database connection strings, file paths, and internal IPs.
How Arcis stops it: Error handler scrubs sensitive info from production responses: stack traces, DB errors, connection strings, internal IPs, framework internals, environment variable values.
HTTP Header Injection
Attackers inject CRLF into header values to split responses or inject fake headers.
How Arcis stops it: sanitizeHeaderValue() strips CRLF (\r\n), bare CR, bare LF, and null bytes from any string written to response headers.
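The stripping step amounts to deleting three characters. A minimal sketch (strip_header_breaks is a hypothetical name, not the Arcis function):

```python
# Map CR, LF, and NUL to None so str.translate deletes them.
_HEADER_UNSAFE = {ord("\r"): None, ord("\n"): None, 0: None}

def strip_header_breaks(value: str) -> str:
    """Remove CR, LF, and NUL so a value cannot terminate the header line."""
    return value.translate(_HEADER_UNSAFE)
```

With the line breaks gone, an injected Set-Cookie: fragment remains part of the original header value and is never parsed as a header of its own.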
HPP (HTTP Parameter Pollution)
Attackers send duplicate query/body parameters to bypass validation (?role=user&role=admin).
How Arcis stops it: Normalizes duplicates using last-value-wins. Original multi-value array preserved in req.queryPolluted for auditing. Per-parameter whitelist supported.
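Last-value-wins normalization, with the duplicates kept aside for auditing, can be sketched with the standard query-string parser. normalize_query is a hypothetical name; the second return value plays the role that req.queryPolluted plays in the description above.

```python
from urllib.parse import parse_qs

def normalize_query(query: str):
    """Return (normalized, polluted): single values, plus any multi-value originals."""
    multi = parse_qs(query, keep_blank_values=True)
    normalized = {key: values[-1] for key, values in multi.items()}
    polluted = {key: values for key, values in multi.items() if len(values) > 1}
    return normalized, polluted
```

For ?role=user&role=admin the handler sees role == "admin" as a plain string, and the audit record retains ["user", "admin"].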
Prompt Injection (LLM-handler routes)
User-controlled text embeds jailbreak frameworks (DAN/STAN/DUDE), fake <system> tags, base64-smuggled instructions, or (v1.6) agent toolcall markers that pivot the LLM into invoking dangerous tools.
How Arcis stops it: detectPromptInjection / detect_prompt_injection covers 28 signatures across HIGH/MEDIUM/LOW tiers, plus 5 v1.6 patterns for agent toolcall injection: "tool_call" / "function_call" markers, ANSI escape sequences, Claude <tool_use> XML tags, tool-name spoofing (exec / shell / run_command), and "tool_result" markers. tokenBudget middleware caps per-key LLM token spend over a sliding window.
Modern Deserialization (v1.6)
Request bodies that look like serialized-object payloads for runtimes where deserialization equals code execution: Python pickle, Java FastJSON @type, PHP unserialize, Ruby Marshal, .NET BinaryFormatter.
How Arcis stops it: detectDeserialization(payload) / detect_deserialization(payload) returns the runtime tag ('python_pickle', 'java_fastjson', 'php_unserialize', 'ruby_marshal', 'dotnet_binary_formatter') or null. Detection-only because a forgiving parser might still deserialize the remainder to something dangerous if you strip the head bytes. Caller decides: refuse, log a security event, route to a sandboxed handler.
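Detection of this kind usually keys on format signatures at the head of the payload. The sketch below uses well-known markers and returns the tags named above; the specific heuristics are assumptions, they will miss some variants (for example pickle protocols 0 and 1, which have no header), and detect_serialized is a hypothetical name.

```python
import re
from typing import Optional

# PHP serialize() output for an object or an array, e.g. O:8:"stdClass":0:{}
_PHP_SERIALIZED = re.compile(rb'^(?:O:\d+:"[^"]+":\d+:\{|a:\d+:\{)')
# .NET BinaryFormatter stream header: record type 0, root id 1, header id -1.
_DOTNET_HEADER = bytes.fromhex("0001000000ffffffff")

def detect_serialized(payload: bytes) -> Optional[str]:
    """Return a runtime tag if the payload looks like a serialized object, else None."""
    if len(payload) >= 2 and payload[0] == 0x80 and 2 <= payload[1] <= 5:
        return "python_pickle"            # PROTO opcode followed by protocol 2-5
    if payload.startswith(b"\x04\x08"):
        return "ruby_marshal"             # Marshal format version 4.8
    if payload.startswith(_DOTNET_HEADER):
        return "dotnet_binary_formatter"
    text = payload.lstrip()
    if _PHP_SERIALIZED.match(text):
        return "php_unserialize"
    if text.startswith(b"{") and b'"@type"' in text:
        return "java_fastjson"            # FastJSON autotype marker
    return None
```

The function only reports and never rewrites the payload, in line with the detection-only rationale above; the caller decides whether to refuse, log, or sandbox.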
GraphQL Abuse (depth, alias bomb, fragment cycle)
A single GraphQL query asks for the same field thousands of times via aliases: query { a:user{} b:user{} c:user{} ... } repeated 1000x multiplies the backend cost by 1000. Alternatively, a self-referential fragment (fragment A on User { ...A }) causes the executor to spin.
How Arcis stops it: graphqlGuard rejects oversize queries, deep selection sets, introspection in production, and (v1.6) alias bombs via max_aliases (default 50) and fragment cycles via block_fragment_cycles (default true). DFS walks fragment definitions with brace-matched body extraction so spreads inside a query operation do not pollute the dependency graph.
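The fragment-cycle check can be sketched as a depth-first search over a graph built only from fragment bodies. This is a simplified, regex-based illustration and not the graphqlGuard code: the function names are hypothetical, and it does not handle braces inside string literals.

```python
import re

_FRAGMENT_HEAD = re.compile(r"\bfragment\s+(\w+)\s+on\s+\w+[^{]*\{")
_SPREAD = re.compile(r"\.\.\.\s*(?!on\b)(\w+)")  # named spreads, not "... on Type"

def fragment_bodies(query: str) -> dict:
    """Map each fragment name to its brace-matched body text."""
    bodies = {}
    for match in _FRAGMENT_HEAD.finditer(query):
        depth, i = 1, match.end()
        while i < len(query) and depth:
            depth += {"{": 1, "}": -1}.get(query[i], 0)
            i += 1
        end = i - 1 if depth == 0 else i  # drop the closing brace when found
        bodies[match.group(1)] = query[match.end():end]
    return bodies

def has_fragment_cycle(query: str) -> bool:
    """True if any fragment reaches itself through its spreads."""
    graph = {name: set(_SPREAD.findall(body))
             for name, body in fragment_bodies(query).items()}
    UNSEEN, ACTIVE, DONE = 0, 1, 2
    state = dict.fromkeys(graph, UNSEEN)

    def visit(name: str) -> bool:
        state[name] = ACTIVE
        for dep in graph[name]:
            if dep not in graph:          # spread of an undefined fragment
                continue
            if state[dep] == ACTIVE or (state[dep] == UNSEEN and visit(dep)):
                return True               # reached a fragment still on the stack
        state[name] = DONE
        return False

    return any(state[name] == UNSEEN and visit(name) for name in graph)
```

Because edges come only from fragment bodies, a spread inside the query operation itself never adds an edge, which is the property the brace-matched extraction is meant to provide.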
Stateful per-IP Correlation (v1.6)
Per-request middleware judges each request alone. That misses scanner sweeps (one IP firing payloads from every category), credential stuffing (dozens of distinct usernames on /login from the same IP in a minute), and race-window probes (POST /transfer + GET /balance within 200ms).
How Arcis stops it: CorrelationWindow middleware records a small rolling event log per IP (60s window, capped at 10,000 IPs, 200 events per IP, LRU eviction). Detection helpers: detect_scanner(ip), detect_credential_stuffing(ip, route), detect_race_window(ip, (a, b)). protectLogin / protectSignup / protectApi accept a correlation: { window } option to wire it through with one line and refuse with 429 + structured detection details on a hit.
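A rolling per-IP event log with the limits described above can be sketched with a deque per IP inside an ordered dictionary. The class below is illustrative only: the method signatures, the username field, and the threshold of 20 distinct usernames are assumptions, and only the credential-stuffing helper is shown.

```python
import time
from collections import OrderedDict, deque

class CorrelationWindow:
    """Keep a short, bounded event history per IP (illustrative sketch)."""

    def __init__(self, window_s=60.0, max_ips=10_000, max_events=200,
                 clock=time.monotonic):
        self.window_s = window_s
        self.max_ips = max_ips
        self.max_events = max_events
        self.clock = clock                 # injectable so tests can control time
        self._events = OrderedDict()       # ip -> deque of (timestamp, route, username)

    def record(self, ip, route, username=None):
        log = self._events.get(ip)
        if log is None:
            log = self._events[ip] = deque(maxlen=self.max_events)
            if len(self._events) > self.max_ips:
                self._events.popitem(last=False)   # evict least recently used IP
        self._events.move_to_end(ip)               # mark this IP as most recent
        log.append((self.clock(), route, username))

    def recent(self, ip):
        cutoff = self.clock() - self.window_s
        return [event for event in self._events.get(ip, ()) if event[0] >= cutoff]

    def detect_credential_stuffing(self, ip, route, threshold=20):
        # Assumed rule: many distinct usernames on one route inside the window.
        names = {user for _, r, user in self.recent(ip)
                 if r == route and user is not None}
        return len(names) >= threshold
```

The deque's maxlen enforces the per-IP event cap and the ordered dictionary provides LRU eviction, so memory stays bounded regardless of how many addresses a sweep uses.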