The Risk of File Uploads
File upload functionality is one of the most dangerous features an application can expose. An unrestricted file upload can lead to remote code execution, malware distribution, data breaches, and server compromise. Every file upload endpoint must be treated as a critical attack surface.
Threat Model
| Attack | Description |
|--------|-------------|
| Malicious file upload | Attacker uploads a PHP shell or executable |
| File size DoS | Huge files exhaust disk space or memory |
| Path traversal | Filename contains sequences such as `../` that place the file outside the upload directory |
| MIME type spoofing | Declared `Content-Type` or file extension does not match the actual content |
| Malware distribution | Legitimate-looking files containing malware |
| Zip bombs | Compressed archive that expands to enormous size |
| SSRF via file processing | Server-side parsers follow attacker-controlled URLs embedded in uploaded files (e.g., external references in SVG or XML) |
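The zip bomb row can be illustrated with a minimal sketch that inspects an archive's declared sizes before anything is extracted. The limits (`MAX_TOTAL_UNCOMPRESSED`, `MAX_RATIO`, `MAX_MEMBERS`) and the function name `check_zip_archive` are assumptions chosen for illustration, not values from this article.

```python
import io
import zipfile

# Illustrative limits (assumptions, tune for your application)
MAX_TOTAL_UNCOMPRESSED = 50 * 1024 * 1024  # 50 MB after expansion
MAX_RATIO = 100                            # uncompressed / compressed
MAX_MEMBERS = 1000                         # entries per archive


def check_zip_archive(data: bytes) -> None:
    """Raise ValueError if the archive's declared contents look like a zip bomb."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        members = archive.infolist()
        if len(members) > MAX_MEMBERS:
            raise ValueError("Archive has too many entries")
        # Sizes come from the archive's own metadata; nothing is extracted here
        total = sum(m.file_size for m in members)
        if total > MAX_TOTAL_UNCOMPRESSED:
            raise ValueError("Archive expands beyond the allowed size")
        compressed = sum(m.compress_size for m in members)
        if compressed and total / compressed > MAX_RATIO:
            raise ValueError("Archive compression ratio is suspicious")
```

Because the declared sizes are attacker-controlled metadata, a check like this is only a first filter; the extraction step itself should still cap the number of bytes it writes.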
Validation Strategy
1. Validate File Extension
Allowlist-based validation is essential. Blocklisting (e.g., rejecting `.exe` files) inevitably misses variants such as `.phtml`, `.php5`, or any executable extension that was not anticipated.
import os

ALLOWED_EXTENSIONS = {
    # Images (SVG is left out: it can carry embedded scripts)
    '.jpg', '.jpeg', '.png', '.gif', '.webp',
    # Documents
    '.pdf', '.doc', '.docx', '.xls', '.xlsx',
    # Other
    '.txt', '.csv'
}

def validate_extension(filename):
    # Compare case-insensitively against the allowlist
    ext = os.path.splitext(filename)[1].lower()
    if ext not in ALLOWED_EXTENSIONS:
        raise ValueError(f"Extension {ext} not allowed")
2. Validate MIME Type
Never trust the `Content-Type` header from the client. Inspect the actual file content:
import magic

ALLOWED_MIMES = {
    'image/jpeg', 'image/png', 'image/gif',
    'image/webp', 'application/pdf',
    'text/plain', 'text/csv'
}

def validate_mime(file_stream):
    # Detect the type from the leading bytes, then rewind for later readers
    mime = magic.from_buffer(file_stream.read(2048), mime=True)
    file_stream.seek(0)
    if mime not in ALLOWED_MIMES:
        raise ValueError(f"MIME type {mime} not allowed")
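Extension and content checks are stronger when they must agree with each other. The sketch below uses only the standard library and a handful of well-known leading byte signatures; the names `SIGNATURES`, `EXTENSION_TO_MIME`, `sniff_mime`, and `check_extension_matches_content` are hypothetical, and the signature table is deliberately incomplete. It is meant to complement a libmagic-based check, not replace it.

```python
import os

# Leading byte signatures for a few common formats (not exhaustive)
SIGNATURES = {
    "image/jpeg": [b"\xff\xd8\xff"],
    "image/png": [b"\x89PNG\r\n\x1a\n"],
    "image/gif": [b"GIF87a", b"GIF89a"],
    "application/pdf": [b"%PDF-"],
}

# Hypothetical mapping from extension to the type its content must have
EXTENSION_TO_MIME = {
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".png": "image/png",
    ".gif": "image/gif",
    ".pdf": "application/pdf",
}


def sniff_mime(head: bytes):
    """Return the MIME type whose signature matches the first bytes, or None."""
    for mime, prefixes in SIGNATURES.items():
        if any(head.startswith(prefix) for prefix in prefixes):
            return mime
    return None


def check_extension_matches_content(filename: str, head: bytes) -> str:
    """Return the agreed MIME type, or raise ValueError on any mismatch."""
    ext = os.path.splitext(filename)[1].lower()
    expected = EXTENSION_TO_MIME.get(ext)
    if expected is None:
        raise ValueError(f"Extension {ext!r} not allowed")
    if sniff_mime(head) != expected:
        raise ValueError("File content does not match its extension")
    return expected
```

With this in place, a PHP script renamed to `avatar.jpg` is rejected because its first bytes are not a JPEG signature.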
3. Validate File Size
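Enforce strict limits at multiple layers (for example, the reverse proxy, the upload middleware, and the application itself) so that oversized requests are rejected as early as possible: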
// Express middleware
const multer = require('multer');

const upload = multer({
  limits: {
    fileSize: 10 * 1024 * 1024, // 10 MB
    files: 1
  },
  fileFilter: (req, file, cb) => {
    // file.mimetype comes from the client's request, so this is only a
    // first-pass filter; verify the actual content after upload
    const allowed = ['image/jpeg', 'image/png', 'application/pdf'];
    if (!allowed.includes(file.mimetype)) {
      return cb(new Error('Invalid file type'), false);
    }
    cb(null, true);
  }
});
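The same limit can also be enforced inside the application while the body is being read, which helps when a declared `Content-Length` is missing or wrong. This is a minimal sketch assuming the upload is available as a file-like object; `read_limited`, `MAX_UPLOAD_BYTES`, and `CHUNK_SIZE` are illustrative names.

```python
import io

MAX_UPLOAD_BYTES = 10 * 1024 * 1024  # mirrors the 10 MB limit used above
CHUNK_SIZE = 64 * 1024               # read granularity (assumption)


def read_limited(stream, max_bytes: int = MAX_UPLOAD_BYTES) -> bytes:
    """Read a stream in chunks, aborting as soon as it exceeds max_bytes."""
    chunks = []
    total = 0
    while True:
        chunk = stream.read(CHUNK_SIZE)
        if not chunk:
            break
        total += len(chunk)
        if total > max_bytes:
            # Stop before buffering the rest of an oversized body
            raise ValueError("Upload exceeds the size limit")
        chunks.append(chunk)
    return b"".join(chunks)
```

Reading in chunks means an oversized upload is rejected after at most one chunk past the limit, rather than after the whole body has been buffered.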
4. Validate File Content
For images, decode and re-encode them on the server. Re-encoding discards embedded metadata and typically destroys payloads hidden in the original byte stream:
from PIL import Image
import io

def sanitize_image(file_bytes):
    """Re-encode image to strip metadata and break embedded payloads."""
    img = Image.open(io.BytesIO(file_bytes))
    # Convert to ensure clean output (note: RGB drops any alpha channel)
    img = img.convert('RGB')
    output = io.BytesIO()
    # Only the decoded pixels are written; original bytes are not copied
    img.save(output, format='PNG')
    return output.getvalue()
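Decoding an image allocates memory in proportion to its pixel count, so a small file that declares enormous dimensions can exhaust memory during re-encoding. One possible guard, sketched here for PNG only, reads the declared width and height from the header before any decoding happens. The `MAX_PIXELS` cap and the function names are assumptions; Pillow also provides its own `Image.MAX_IMAGE_PIXELS` safeguard.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
MAX_PIXELS = 25_000_000  # illustrative cap (assumption)


def png_dimensions(data: bytes):
    """Return (width, height) as declared in the PNG IHDR chunk."""
    # Layout: 8-byte signature, 4-byte chunk length, b"IHDR", width, height
    if len(data) < 24 or not data.startswith(PNG_SIGNATURE) or data[12:16] != b"IHDR":
        raise ValueError("Not a valid PNG header")
    width, height = struct.unpack(">II", data[16:24])
    return width, height


def check_png_size(data: bytes, max_pixels: int = MAX_PIXELS) -> None:
    """Raise ValueError if the declared pixel count is zero or too large."""
    width, height = png_dimensions(data)
    if width == 0 or height == 0 or width * height > max_pixels:
        raise ValueError(f"Image dimensions {width}x{height} not allowed")
```

A check like this would run before `Image.open`, so that oversized images are rejected without being decoded.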
Secure Storage
Never Store in Webroot
Storing uploaded files inside the web server's document root is dangerous. The server may execute an uploaded script when it is requested, and any file with a guessable path can be fetched directly, bypassing the application's access controls.
# UNSAFE: Stored in webroot
upload_dir = '/var/www/html/uploads/'
# SAFE: Outside webroot
upload_dir = '/data/uploads/' # Served through app logic
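Serving through application logic means the application maps a stored name to a path and decides which headers to send. A minimal sketch of that lookup follows, assuming the `/data/uploads/` directory from above; `resolve_upload` and `download_headers` are hypothetical helpers, not part of any framework.

```python
from pathlib import Path

UPLOAD_ROOT = Path("/data/uploads")


def resolve_upload(stored_name: str, root: Path = UPLOAD_ROOT) -> Path:
    """Map a stored filename to a path, refusing anything outside root."""
    root = root.resolve()
    candidate = (root / stored_name).resolve()
    # After resolving ".." and symlinks, the result must sit below root
    if candidate == root or root not in candidate.parents:
        raise PermissionError("Path escapes the upload directory")
    return candidate


def download_headers() -> dict:
    """Headers that make browsers download the file instead of rendering it."""
    return {
        "Content-Type": "application/octet-stream",
        "Content-Disposition": "attachment",
        "X-Content-Type-Options": "nosniff",
    }
```

The application would perform its own authorization check before calling `resolve_upload`, then stream the file with these headers.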
Generate Safe Filenames
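Never use user-provided filenames as storage paths. Generate random names instead, keeping only the extension after it has passed the allowlist check:
import uuid
import os

def safe_filename(original_name):
    # Only the (lower-cased) extension survives from the client's name
    ext = os.path.splitext(original_name)[1].lower()
    return f"{uuid.uuid4().hex}{ext}"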
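Applications often still want to show users the name they uploaded. One option is to keep a sanitized copy purely as a display label next to the random storage name. The sketch below assumes a simple dictionary record; `display_name`, `new_upload_record`, and `MAX_DISPLAY_LENGTH` are hypothetical names.

```python
import os
import re
import uuid

MAX_DISPLAY_LENGTH = 100  # illustrative cap (assumption)


def display_name(original_name: str) -> str:
    """Reduce a client-supplied filename to a harmless label for display."""
    # Treat both separator styles as path delimiters and keep the last part
    base = os.path.basename(original_name.replace("\\", "/"))
    # Keep a conservative character set and replace everything else
    base = re.sub(r"[^A-Za-z0-9._-]", "_", base)
    # Avoid hidden-file style names
    base = base.lstrip(".")
    return base[:MAX_DISPLAY_LENGTH] or "unnamed"


def new_upload_record(original_name: str) -> dict:
    """Pair a random storage name with a sanitized display label."""
    # Assumes the extension has already passed the allowlist check
    ext = os.path.splitext(original_name)[1].lower()
    return {
        "stored_name": f"{uuid.uuid4().hex}{ext}",
        "display_name": display_name(original_name),
    }
```

The display label is never used to build a filesystem path; only `stored_name` is.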
Object Storage
For production systems, use object storage with pre-signed URLs:
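import uuid

import boto3

s3 = boto3.client('s3')

# Illustrative mapping: derive the key's extension from the validated
# content type instead of hard-coding one
EXTENSION_BY_TYPE = {
    'image/jpeg': '.jpg',
    'image/png': '.png',
    'application/pdf': '.pdf',
}

def upload_file(file_bytes, content_type):
    ext = EXTENSION_BY_TYPE[content_type]  # KeyError for unexpected types
    key = f"uploads/{uuid.uuid4().hex}{ext}"
    s3.put_object(
        Bucket='my-app-uploads',
        Key=key,
        Body=file_bytes,
        ContentType=content_type
    )
    return key

def get_download_url(key, expires=3600):
    # Time-limited read access to an otherwise private object
    return s3.generate_presigned_url(
        'get_object',
        Params={'Bucket': 'my-app-uploads', 'Key': key},
        ExpiresIn=expires
    )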
Antivirus Scanning
Integrate virus scanning into the upload pipeline:
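import os
import subprocess

class SecurityError(Exception):
    """Raised when an upload fails a security check."""

def scan_file(file_path):
    result = subprocess.run(
        ['clamscan', '--stdout', file_path],
        capture_output=True,
        text=True
    )
    # clamscan exit codes: 0 = clean, 1 = virus found, 2 = error
    if result.returncode == 1:
        os.remove(file_path)
        raise SecurityError("Malware detected in uploaded file")
    if result.returncode != 0:
        # Fail closed when the scan itself could not complete
        raise SecurityError("Malware scan failed")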
Storage Limits and Quotas
| Level | Limit | Mitigation |
|-------|-------|------------|
| Per file | 10 MB | Reject on client and server |
| Per user | 500 MB | Track in database, reject when exceeded |
| Per day (total) | 5 GB | Aggregate monitoring, rate limit |
| Disk usage | 80% capacity | Alert, stop accepting uploads |
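The per-file and per-user rows can be sketched as a small accounting object. This in-memory version stands in for the database tracking the table mentions; `QuotaTracker` and its method names are hypothetical, and a real deployment would need persistent, concurrency-safe storage.

```python
from collections import defaultdict

PER_FILE_LIMIT = 10 * 1024 * 1024   # 10 MB, from the table above
PER_USER_LIMIT = 500 * 1024 * 1024  # 500 MB, from the table above


class QuotaTracker:
    """In-memory stand-in for database-backed upload accounting."""

    def __init__(self, per_file: int = PER_FILE_LIMIT, per_user: int = PER_USER_LIMIT):
        self.per_file = per_file
        self.per_user = per_user
        self.used = defaultdict(int)  # user_id -> bytes stored

    def reserve(self, user_id: str, size: int) -> None:
        """Record an upload of `size` bytes, or raise ValueError if over limit."""
        if size > self.per_file:
            raise ValueError("File exceeds the per-file limit")
        if self.used[user_id] + size > self.per_user:
            raise ValueError("User quota exceeded")
        self.used[user_id] += size

    def release(self, user_id: str, size: int) -> None:
        """Give space back when a file is deleted."""
        self.used[user_id] = max(0, self.used[user_id] - size)
```

Calling `reserve` before writing the file, and `release` on deletion, keeps the stored total from exceeding the per-user limit.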
Server Configuration
Prevent uploaded files from being executed by the web server:
# Nginx configuration
location /uploads/ {
    # Only serve specific file types, and force a download
    location ~* \.(jpg|jpeg|png|gif|pdf)$ {
        add_header Content-Disposition 'attachment';
        expires 30d;
    }
    # Deny everything else (return runs before access checks,
    # so a separate "deny all" would never be reached)
    location ~* \. {
        return 404;
    }
}
Summary
Secure file upload requires defense in depth. Validate extensions and MIME types server-side, sanitize images by re-encoding them, generate random filenames, store files outside the webroot or in object storage, enforce strict size limits at multiple layers, and scan for malware. Never trust client-provided metadata, and process uploaded files with minimal privileges in isolated environments.