Basic prompt engineering — "be specific" and "give examples" — gets you to 70% quality. The remaining 30% requires advanced techniques that most developers never learn. This guide covers the techniques that actually move the needle in production: structured prompts with XML tags, chain-of-thought orchestration, few-shot example design, and multi-turn conversation strategies. Each technique includes before/after comparisons with real code generation outputs.
Advanced Techniques Overview
| Technique | Quality Gain | Cost Impact | Best For |
|---|---|---|---|
| XML-structured prompts | +15-25% | +10% token overhead | Complex instructions, multi-step tasks |
| Few-shot with curated examples | +10-30% | +20-50% input tokens | Style matching, format adherence |
| Chain-of-thought orchestration | +20-40% | +30-80% output tokens | Complex reasoning, debugging |
| Multi-turn refinement | +15-25% | +50-200% total tokens | Iterative code reviews, design refinement |
| System prompt engineering | +10-20% | Negligible | Consistent behavior across sessions |
| Self-consistency (multiple samples) | +5-15% | 3-5x cost | High-stakes decisions, critical code |
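The self-consistency row above is the easiest of these to mechanize: sample the model several times and take a majority vote. A minimal sketch, where `sample_fn` is a hypothetical stand-in for your actual LLM call:

```python
from collections import Counter

def self_consistency(sample_fn, prompt, n=5):
    """Sample the model n times and return the most common answer.

    sample_fn is a stand-in for your LLM call: it takes a prompt
    string and returns an answer string. Returns the winning answer
    plus the agreement ratio, a rough confidence proxy.
    """
    answers = [sample_fn(prompt) for _ in range(n)]
    best, count = Counter(answers).most_common(1)[0]
    return best, count / n
```

Note the cost tradeoff from the table: `n=5` means roughly 5x the tokens, which is why this belongs on high-stakes decisions rather than every request.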
XML-Structured Prompts
Best for: Separating instructions, context, examples, and output format when the LLM needs to process multiple types of information.
Why it works: LLMs trained on HTML/XML data treat XML tags as structural delimiters, reducing confusion between different prompt components.
```xml
<system>
You are an expert code reviewer specializing in security vulnerabilities.
</system>
<context>
The codebase is a Next.js 15 SaaS app handling payment processing.
</context>
<task>
Review this code for security issues. Focus on: SQL injection, XSS, auth bypass, CSRF.
</task>
<code>
{todo: paste_code_here}
</code>
<output_format>
For each vulnerability:
- SEVERITY: Critical/High/Medium/Low
- LINE: affected line numbers
- ISSUE: what is wrong
- FIX: exact code to fix it
</output_format>
```
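Prompts like the one above are easiest to keep consistent when assembled programmatically. A minimal sketch (the `xml_prompt` helper is illustrative, not a library function):

```python
def xml_prompt(**sections):
    """Assemble an XML-tagged prompt from named sections.

    Section order follows keyword order (Python 3.7+ dicts preserve
    insertion order), so pass system first and output_format last.
    """
    parts = []
    for tag, body in sections.items():
        parts.append(f"<{tag}>\n{body.strip()}\n</{tag}>")
    return "\n".join(parts)

# Usage:
# prompt = xml_prompt(
#     system="You are an expert code reviewer.",
#     task="Review this code for security issues.",
#     code=user_code,
#     output_format="For each vulnerability: SEVERITY, LINE, ISSUE, FIX",
# )
```

Generating the tags from one place also means a typo cannot leave an unclosed tag, which is one of the quiet ways structured prompts degrade.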
Few-Shot Example Design
The quality of few-shot examples matters more than quantity: three well-chosen examples typically outperform ten mediocre ones.
- Match the target distribution: If your users ask about Python 80% of the time, your examples should be 80% Python
- Include edge cases: Show at least one example where the correct answer is "I do not know" or "this is not possible" to discourage hallucinated answers
- Show your formatting in examples: If you want code blocks with language tags, your examples must include them
- Progressive complexity: Order examples from simple to complex; models tend to weight the most recent examples heavily, so the last one should look most like the target task
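The ordering rule is easy to enforce in code. A sketch of a few-shot prompt builder, assuming each example carries a hand-assigned complexity score (the tuple shape and Q/A format here are illustrative choices, not a standard):

```python
def build_few_shot(examples, query):
    """Render few-shot examples ordered simple-to-complex.

    `examples` is a list of (prompt, answer, complexity) tuples; lower
    complexity sorts first, so the most complex example sits last,
    where recency tends to give it the most influence.
    """
    ordered = sorted(examples, key=lambda e: e[2])
    blocks = [f"Q: {p}\nA: {a}" for p, a, _ in ordered]
    blocks.append(f"Q: {query}\nA:")  # leave the final answer open
    return "\n\n".join(blocks)
```

Keeping the examples in a data structure rather than inlined in a prompt string also makes it cheap to A/B test different example sets.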
Chain-of-Thought for Code Generation
For complex coding tasks, explicitly ask the model to plan before writing:
```text
Before writing any code, first output a plan with:
1. What files need to be created or modified
2. What libraries/dependencies are needed
3. The data flow from request to response
4. Error states to handle
5. Testing approach

Then write the code. For each file, explain:
- Why this file exists (its responsibility)
- What it depends on
- One edge case it handles
```
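In production this often works better as two separate calls: one to produce the plan, one to implement it, so the plan can be inspected (or edited) before any code is generated. A minimal sketch, where `ask` is a hypothetical stand-in for your LLM call and the prompt wording is illustrative:

```python
PLAN_PROMPT = """Before writing any code, output a numbered plan covering:
files to create or modify, dependencies, data flow from request to
response, error states to handle, and a testing approach.
Do not write code yet.

Task: {task}"""

CODE_PROMPT = """Here is the approved plan:

{plan}

Now write the code. For each file, explain its responsibility, what it
depends on, and one edge case it handles.

Task: {task}"""

def plan_then_code(ask, task):
    """Two-call chain-of-thought orchestration.

    `ask` takes a prompt string and returns the model's text. Splitting
    planning from implementation lets a human (or a validator) gate the
    plan before paying for the longer code-generation call.
    """
    plan = ask(PLAN_PROMPT.format(task=task))
    code = ask(CODE_PROMPT.format(plan=plan, task=task))
    return plan, code
```

The split is also where the +30-80% output-token cost from the overview table comes from: you pay for the reasoning tokens, but for complex tasks the quality gain usually justifies it.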
Bottom line: Advanced prompt engineering is about structure, not magic words. XML delimiters, curated examples, and explicit reasoning steps produce the biggest quality gains. The best prompt engineers treat prompts like code — version controlled, tested, and iteratively improved with A/B comparisons. See also: Prompt Engineering Basics and AI API Integration Guide.