Output Encoding

Introduction

Output encoding transforms data before rendering it in a specific context to prevent injection attacks. While input validation filters dangerous content, output encoding ensures that even if dangerous characters slip through, they are rendered as data rather than code. Context-sensitive encoding is critical — the same string must be encoded differently depending on whether it appears in HTML, JavaScript, CSS, or a URL.

Context-Sensitive Encoding

Different output contexts require different encoding strategies. Applying the wrong encoding is equivalent to no encoding at all.

| Context | Example | Encoding Method | Risk if Wrong | |---------|---------|----------------|---------------| | HTML Body | `USER` | HTML entity encode | HTML injection | | HTML Attribute | `` | Attribute encode | Attribute break-out | | JavaScript | `` | JavaScript string encode | XSS | | URL | `[` | URL encode | Open redirect, XSS | | CSS | `` | CSS hex encode | CSS injection |

HTML Body Context

import html

def encode_html_body(user_input):

"""Encode for HTML element content context."""

encoded = html.escape(user_input, quote=True)

# becomes

# <script>alert(1)</script>

return encoded

HTML Attribute Context

def encode_html_attribute(user_input):

"""Encode for HTML attribute values."""

# More aggressive than body encoding

replacements = {

'&': '&',

'"': '"',

"'": ''',

'<': '<',

'>': '>',

'/': '/', # Prevents attribute closing

'`': '`', # Backtick can close attributes in some browsers

}

for char, replacement in replacements.items():

user_input = user_input.replace(char, replacement)

return user_input

JavaScript Context

import json

import re

def encode_javascript_string(user_input):

"""Encode for JavaScript string context."""

# JSON encoding is safe for JS string literals

encoded = json.dumps(user_input, ensure_ascii=False)

# Additional hardening for in script blocks

encoded = encoded.replace('

return encoded

def encode_javascript_variable(user_input):

"""Hex encode each character for JS variable context."""

encoded = ''

for char in user_input:

encoded += f'\\x{ord(char):02x}'

return encoded

URL Context

from urllib.parse import quote, urlencode

def encode_url_param(user_input):

"""Encode for URL parameter values."""

return quote(user_input, safe='')

def encode_url_path(component):

"""Encode for URL path segments."""

return quote(component, safe='')

# Defense against javascript: protocol

def sanitize_url_for_href(user_input):

"""Block dangerous URL schemes."""

allowed_schemes = {'http', 'https', 'mailto', 'tel'}

user_input = user_input.strip()

# Lowercase for scheme check

for scheme in allowed_schemes:

if user_input.lower().startswith(f'{scheme}:'):

return quote(user_input, safe=':/?#[]@!$&\'()*+,;=-._~')

# Dangerous schemes

dangerous = ['javascript:', 'data:', 'vbscript:', 'file:']

for scheme in dangerous:

if scheme in user_input.lower():

return '' # Remove dangerous URLs

return quote(user_input, safe=':/?#[]@!$&\'()*+,;=-._~')

Template Engine Auto-Escaping

Modern template engines provide auto-escaping, which encodes output based on context.

Jinja2 (Python)

{# Jinja2 auto-escape enabled by default #}

Hello <script>alert(1)</script>

{# Safe filter marks as trusted HTML #}

{# WARNING: only use safe for trusted content #}

{# Escaping for specific attributes #}

{# Manual escaping functions #}

{% set encoded = user.name|e %}

React JSX

// React automatically escapes by default

function UserProfile({ user }) {

return (

{/* Automatic escaping — XSS safe */}

{user.bio}

{/* DangerouslySetInnerHTML bypasses escaping */}

{/* Manual URL encoding for href */}

Website

);

}

Common Pitfalls

Double Encoding

Input:

Output Encoding

Output Encoding

Related Articles