Output Encoding


Introduction

Output encoding transforms data before rendering it in a specific context to prevent injection attacks. While input validation filters dangerous content, output encoding ensures that even if dangerous characters slip through, they are rendered as data rather than code. Context-sensitive encoding is critical — the same string must be encoded differently depending on whether it appears in HTML, JavaScript, CSS, or a URL.

Context-Sensitive Encoding

Different output contexts require different encoding strategies. Applying the wrong encoding is equivalent to no encoding at all.

| Context | Example | Encoding Method | Risk if Wrong | |---------|---------|----------------|---------------| | HTML Body | `USER` | HTML entity encode | HTML injection | | HTML Attribute | `` | Attribute encode | Attribute break-out | | JavaScript | `` | JavaScript string encode | XSS | | URL | `[` | URL encode | Open redirect, XSS | | CSS | `` | CSS hex encode | CSS injection |

HTML Body Context




import html




def encode_html_body(user_input):


"""Encode for HTML element content context."""


encoded = html.escape(user_input, quote=True)


# becomes


# <script>alert(1)</script>


return encoded





HTML Attribute Context




def encode_html_attribute(user_input):


"""Encode for HTML attribute values."""


# More aggressive than body encoding


replacements = {


'&': '&',


'"': '"',


"'": ''',


'<': '<',


'>': '>',


'/': '/', # Prevents attribute closing


'`': '`', # Backtick can close attributes in some browsers


}


for char, replacement in replacements.items():


user_input = user_input.replace(char, replacement)


return user_input





JavaScript Context




import json


import re




def encode_javascript_string(user_input):


"""Encode for JavaScript string context."""


# JSON encoding is safe for JS string literals


encoded = json.dumps(user_input, ensure_ascii=False)




# Additional hardening for in script blocks


encoded = encoded.replace('


return encoded




def encode_javascript_variable(user_input):


"""Hex encode each character for JS variable context."""


encoded = ''


for char in user_input:


encoded += f'\\x{ord(char):02x}'


return encoded





URL Context




from urllib.parse import quote, urlencode




def encode_url_param(user_input):


"""Encode for URL parameter values."""


return quote(user_input, safe='')




def encode_url_path(component):


"""Encode for URL path segments."""


return quote(component, safe='')




# Defense against javascript: protocol


def sanitize_url_for_href(user_input):


"""Block dangerous URL schemes."""


allowed_schemes = {'http', 'https', 'mailto', 'tel'}


user_input = user_input.strip()




# Lowercase for scheme check


for scheme in allowed_schemes:


if user_input.lower().startswith(f'{scheme}:'):


return quote(user_input, safe=':/?#[]@!$&\'()*+,;=-._~')




# Dangerous schemes


dangerous = ['javascript:', 'data:', 'vbscript:', 'file:']


for scheme in dangerous:


if scheme in user_input.lower():


return '' # Remove dangerous URLs




return quote(user_input, safe=':/?#[]@!$&\'()*+,;=-._~')





Template Engine Auto-Escaping

Modern template engines provide auto-escaping, which encodes output based on context.

Jinja2 (Python)




{# Jinja2 auto-escape enabled by default #}


{{ user.bio }}


{#

Hello <script>alert(1)</script>

#}




{# Safe filter marks as trusted HTML #}


{{ user.bio|safe }}


{# WARNING: only use safe for trusted content #}




{# Escaping for specific attributes #}





{# Manual escaping functions #}


{% set encoded = user.name|e %}





React JSX




// React automatically escapes by default


function UserProfile({ user }) {


return (



{/* Automatic escaping — XSS safe */}


{user.bio}




{/* DangerouslySetInnerHTML bypasses escaping */}





{/* Manual URL encoding for href */}


Website



);


}





Common Pitfalls

Double Encoding




Input: