Introduction
Natural language to SQL (NL2SQL) is one of the most impactful applications of LLMs in data analytics. It enables non-technical users to query databases using plain English, democratizing data access across organizations. While early NL2SQL systems were fragile, modern LLMs can generate accurate SQL for complex analytical queries — when properly configured with schema context, safety guards, and validation.
How NL2SQL Works
The core architecture is straightforward:
User Question → Context Assembly → LLM SQL Generation → Query Validation → Execution → Result Formatting
The critical step is **context assembly** — providing the LLM with enough database metadata to generate correct SQL.
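The stages above can be sketched as one orchestrating function. This is a minimal illustration, not the article's full implementation: `llm` and `db` are hypothetical injected callables standing in for a real LLM client and database connection, and validation is reduced to a single SELECT-only check.

```python
def answer_question(question, schema, llm, db):
    """End-to-end NL2SQL pipeline: assemble context, generate, validate, execute."""
    # Context assembly: schema plus the user's question
    prompt = f"Database schema:\n{schema}\n\nQuestion: {question}\nSQL:"
    # LLM SQL generation
    query = llm(prompt).strip()
    # Minimal validation stand-in: refuse anything that is not a SELECT
    if not query.upper().startswith("SELECT"):
        raise ValueError(f"Refusing non-SELECT query: {query!r}")
    # Execution and result formatting
    rows = db(query)
    return {"query": query, "rows": rows}
```

Each stage is expanded in the sections that follow.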
Schema Context
The LLM needs to understand your database schema. A well-structured schema prompt includes:
```sql
CREATE TABLE customers (
    customer_id INTEGER PRIMARY KEY,
    first_name VARCHAR(100),
    last_name VARCHAR(100),
    email VARCHAR(255),
    signup_date DATE,
    country VARCHAR(50),
    lifetime_value DECIMAL(10,2)
);

CREATE TABLE orders (
    order_id INTEGER PRIMARY KEY,
    customer_id INTEGER REFERENCES customers(customer_id),
    order_date TIMESTAMP,
    total_amount DECIMAL(10,2),
    status VARCHAR(20) -- pending, shipped, delivered, cancelled
);

CREATE TABLE order_items (
    item_id INTEGER PRIMARY KEY,
    order_id INTEGER REFERENCES orders(order_id),
    product_name VARCHAR(200),
    quantity INTEGER,
    unit_price DECIMAL(10,2)
);
```
**Key additions for better accuracy:** annotate each column with an inline description, and include a sample row per table so the model sees real value formats:
```python
def build_schema_context(tables):
    context_parts = []
    for table in tables:
        ddl = f"CREATE TABLE {table.name} (\n"
        for col in table.columns:
            ddl += f"    {col.name} {col.type},  -- {col.description}\n"
        ddl += ");"
        context_parts.append(ddl)
        # Add a sample row for context on data patterns
        if table.sample_rows:
            context_parts.append(f"-- Sample row: {table.sample_rows[0]}")
    return "\n\n".join(context_parts)
```
Few-Shot Examples
For complex schemas, provide question-SQL pairs relevant to the query:
```python
few_shot_examples = [
    {
        "question": "Show me the top 5 customers by total spending",
        "query": """
            SELECT
                c.first_name || ' ' || c.last_name AS customer_name,
                SUM(o.total_amount) AS total_spent
            FROM customers c
            JOIN orders o ON c.customer_id = o.customer_id
            WHERE o.status != 'cancelled'
            GROUP BY c.customer_id, c.first_name, c.last_name
            ORDER BY total_spent DESC
            LIMIT 5
        """,
    },
    {
        "question": "What was the monthly revenue for 2025?",
        "query": """
            SELECT
                DATE_TRUNC('month', o.order_date) AS month,
                SUM(o.total_amount) AS revenue
            FROM orders o
            WHERE o.status = 'delivered'
              AND o.order_date >= '2025-01-01'
              AND o.order_date < '2026-01-01'
            GROUP BY month
            ORDER BY month
        """,
    },
]
```
**Dynamic example retrieval** using embedding similarity improves accuracy significantly. Store question-query pairs in a vector database and retrieve the most relevant ones for each new question.
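A minimal sketch of this retrieval step, with one deliberate simplification: a bag-of-words cosine similarity stands in for a real embedding model and vector database, so only `retrieve_examples` and its inputs reflect the actual technique.

```python
import math
from collections import Counter

def embed(text):
    # Stand-in for a real embedding model: bag-of-words term counts.
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve_examples(question, example_store, k=2):
    """Return the k stored question-query pairs most similar to the question."""
    q_vec = embed(question)
    ranked = sorted(
        example_store,
        key=lambda ex: cosine(q_vec, embed(ex["question"])),
        reverse=True,
    )
    return ranked[:k]
```

In production, swapping `embed` for an embedding-model call and the sort for a vector-index lookup keeps the same interface.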
SQL Generation Prompt
The complete prompt combines schema, examples, and safety instructions:
```python
def build_nl2sql_prompt(question, schema, examples, dialect="postgresql"):
    prompt = f"""You are a {dialect} SQL expert. Convert natural language questions into SQL queries.

Database Schema:
{schema}

Rules:
1. Return ONLY the SQL query, no explanations
2. Use proper JOINs based on foreign key relationships
3. Handle NULLs appropriately with COALESCE or IS NULL checks
4. Use appropriate aggregate functions (COUNT, SUM, AVG) when needed
5. Filter out rows the question implies are irrelevant (e.g. cancelled orders)
6. Use LIMIT for result size control
7. Order results when ordering is implied
8. Never use DELETE, UPDATE, INSERT, DROP, or ALTER statements

Examples:
"""
    for ex in examples:
        prompt += f"\nQ: {ex['question']}\nSQL: {ex['query']}\n"
    prompt += f"\nQ: {question}\nSQL:"
    return prompt
```
Query Validation
Before executing generated SQL, validate it:
```python
import re

import sqlparse  # third-party: pip install sqlparse

def validate_sql(query, allowed_tables, dialect="postgresql"):
    errors = []

    # Safety check: no mutations. Word-boundary matching avoids false
    # positives on identifiers like "updated_at" or "created_by".
    forbidden_keywords = ["DELETE", "UPDATE", "INSERT", "DROP", "ALTER", "TRUNCATE", "CREATE"]
    for keyword in forbidden_keywords:
        if re.search(rf"\b{keyword}\b", query, re.IGNORECASE):
            errors.append(f"Forbidden operation: {keyword}")

    # Check table references against the allow-list
    parsed = sqlparse.parse(query)[0]
    used_tables = extract_tables(parsed)  # helper that walks the parse tree
    for table in used_tables:
        if table not in allowed_tables:
            errors.append(f"Unauthorized table: {table}")

    # Syntax validation: EXPLAIN parses and plans without executing
    try:
        conn = get_dummy_connection(dialect)
        conn.execute(f"EXPLAIN {query}")
    except Exception as e:
        errors.append(f"SQL syntax error: {e}")

    return errors
```
Advanced Techniques
Self-Correction Loop
When the generated SQL fails or returns empty results, the system can self-correct:
```python
def nl2sql_with_correction(question, schema, allowed_tables, max_attempts=3):
    for attempt in range(max_attempts):
        query = generate_sql(question, schema)
        errors = validate_sql(query, allowed_tables)
        if not errors:
            try:
                results = execute_sql(query)
                if results:
                    return format_results(question, query, results)
                feedback = "The query returned no results. Try a broader search."
            except Exception as e:
                feedback = f"Execution error: {e}"
        else:
            feedback = f"Validation errors: {', '.join(errors)}"
        # Feed the failure back into the next generation attempt
        question = (
            f"Original question: {question}\n"
            f"Previous attempt failed: {feedback}\n"
            f"Please fix the SQL query."
        )
    raise RuntimeError(f"No working query after {max_attempts} attempts")
```
Schema Linking
For large schemas with many tables, first identify relevant tables before generating SQL:
```python
def identify_relevant_tables(question, all_tables, table_descriptions):
    prompt = f"""
Question: {question}

Available tables with descriptions:
{table_descriptions}

List only the tables needed to answer this question, comma-separated:
"""
    response = call_llm(prompt)
    # Filter against the known table list to drop any hallucinated names
    return [t.strip() for t in response.split(",") if t.strip() in all_tables]
```
This dramatically reduces context size and improves accuracy on large databases.
Production Deployment
**Security**: Always use a read-only database user, set statement timeouts, and implement rate limiting per user.
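For PostgreSQL, the read-only user and timeout can be set up with standard role commands. This is a sketch: the role, database, and schema names are placeholders for your environment.

```sql
-- Hypothetical names; adjust role, database, and schema for your setup.
CREATE ROLE nl2sql_readonly LOGIN PASSWORD 'change-me';
GRANT CONNECT ON DATABASE analytics TO nl2sql_readonly;
GRANT USAGE ON SCHEMA public TO nl2sql_readonly;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO nl2sql_readonly;
-- Kill any statement from this role that runs longer than 10 seconds
ALTER ROLE nl2sql_readonly SET statement_timeout = '10s';
```

A database-level read-only role is a stronger guarantee than keyword filtering alone, since it holds even if a mutation slips past validation.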
**Caching**: Cache common NL2SQL conversions to reduce LLM costs, but invalidate when schema changes.
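One way to get schema-change invalidation for free is to build the cache key from a schema version (or a hash of the DDL), so every schema change naturally misses old entries. A minimal in-memory sketch, with `generate` standing in for the NL2SQL call:

```python
import hashlib

def cache_key(question, schema_version):
    # Keying on the schema version invalidates entries when the schema changes.
    raw = f"{schema_version}:{question.strip().lower()}"
    return hashlib.sha256(raw.encode()).hexdigest()

_cache = {}

def cached_nl2sql(question, schema_version, generate):
    key = cache_key(question, schema_version)
    if key not in _cache:
        _cache[key] = generate(question)  # only call the LLM on a miss
    return _cache[key]
```

In production the dict would be Redis or similar, with a TTL as a second line of defense.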
**Monitoring**: Log all generated queries and user feedback. Track query success rate, execution time, and correction frequency.
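A small wrapper around query execution is enough to capture these metrics. A sketch, with `execute` as the injected database call and `stats` as a plain dict standing in for a real metrics backend:

```python
import logging
import time

log = logging.getLogger("nl2sql")

def execute_with_metrics(query, execute, stats):
    """Run a query while recording success/failure counts and latency."""
    start = time.monotonic()
    try:
        rows = execute(query)
        stats["success"] = stats.get("success", 0) + 1
        return rows
    except Exception:
        stats["failure"] = stats.get("failure", 0) + 1
        raise
    finally:
        # Latency is recorded whether the query succeeded or failed
        stats.setdefault("latencies", []).append(time.monotonic() - start)
        log.info("generated query: %r", query)
```

Success rate is then `success / (success + failure)`, and the latency list feeds percentile dashboards.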
**User experience**: Show the generated SQL to users (with a "view query" option), and allow them to provide feedback on results.
Conclusion
Natural language to SQL with LLMs is production-ready for analytical queries. The key success factors are high-quality schema context, relevant few-shot examples, robust query validation, and a self-correction loop. Start with a limited set of tables, measure query accuracy, and expand as you gain confidence. This technology is transforming data access, making self-serve analytics a reality for non-technical team members.