Introduction


Natural language to SQL (NL2SQL) is one of the most impactful applications of LLMs in data analytics. It enables non-technical users to query databases using plain English, democratizing data access across organizations. While early NL2SQL systems were fragile, modern LLMs can generate accurate SQL for complex analytical queries — when properly configured with schema context, safety guards, and validation.


How NL2SQL Works


The core architecture is straightforward:



User Question → Context Assembly → LLM SQL Generation → Query Validation → Execution → Result Formatting


The critical step is **context assembly** — providing the LLM with enough database metadata to generate correct SQL.
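The stages above can be tied together in a short driver. This is a sketch, not a prescribed implementation: the `llm` and `run_query` callables stand in for the model call and database execution, and the validation step is reduced to a placeholder check.

```python
def build_context(question, schema, examples):
    """Context assembly: schema + few-shot examples + the question."""
    shots = "\n".join(f"Q: {ex['question']}\nSQL: {ex['query']}" for ex in examples)
    return f"Schema:\n{schema}\n\nExamples:\n{shots}\n\nQ: {question}\nSQL:"

def answer_question(question, schema, examples, llm, run_query):
    """One pass through the pipeline."""
    prompt = build_context(question, schema, examples)       # context assembly
    sql = llm(prompt).strip()                                # LLM SQL generation
    if not sql.upper().startswith("SELECT"):                 # minimal validation
        raise ValueError(f"Rejected non-SELECT query: {sql}")
    rows = run_query(sql)                                    # execution
    return {"question": question, "sql": sql, "rows": rows}  # result formatting
```

In a real system the validation placeholder is replaced by the full checks covered later in this article.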


Schema Context


The LLM needs to understand your database schema. A well-structured schema prompt includes:



CREATE TABLE customers (
    customer_id INTEGER PRIMARY KEY,
    first_name VARCHAR(100),
    last_name VARCHAR(100),
    email VARCHAR(255),
    signup_date DATE,
    country VARCHAR(50),
    lifetime_value DECIMAL(10,2)
);

CREATE TABLE orders (
    order_id INTEGER PRIMARY KEY,
    customer_id INTEGER REFERENCES customers(customer_id),
    order_date TIMESTAMP,
    total_amount DECIMAL(10,2),
    status VARCHAR(20) -- pending, shipped, delivered, cancelled
);

CREATE TABLE order_items (
    item_id INTEGER PRIMARY KEY,
    order_id INTEGER REFERENCES orders(order_id),
    product_name VARCHAR(200),
    quantity INTEGER,
    unit_price DECIMAL(10,2)
);


**Key additions for better accuracy:**


  • Column descriptions: `-- customer's two-letter ISO country code`
  • Value examples: `-- status options: pending, shipped, delivered, cancelled`
  • Relationship hints: `-- join via customer_id to get customer details`
  • Index hints: `-- indexed on order_date for range queries`


A helper can assemble this context programmatically:

def build_schema_context(tables):
    context_parts = []
    for table in tables:
        ddl = f"CREATE TABLE {table.name} (\n"
        for col in table.columns:
            ddl += f"    {col.name} {col.type} -- {col.description}\n"
        ddl += ");"
        context_parts.append(ddl)
        # Add a sample row for context on data patterns
        if table.sample_rows:
            context_parts.append(f"-- Sample row: {table.sample_rows[0]}")
    return "\n\n".join(context_parts)

Few-Shot Examples


For complex schemas, provide question-SQL pairs relevant to the query:


    
few_shot_examples = [
    {
        "question": "Show me the top 5 customers by total spending",
        "query": """
SELECT
    c.first_name || ' ' || c.last_name AS customer_name,
    SUM(o.total_amount) AS total_spent
FROM customers c
JOIN orders o ON c.customer_id = o.customer_id
WHERE o.status != 'cancelled'
GROUP BY c.customer_id, c.first_name, c.last_name
ORDER BY total_spent DESC
LIMIT 5
"""
    },
    {
        "question": "What was the monthly revenue for 2025?",
        "query": """
SELECT
    DATE_TRUNC('month', o.order_date) AS month,
    SUM(o.total_amount) AS revenue
FROM orders o
WHERE o.status = 'delivered'
  AND o.order_date >= '2025-01-01'
  AND o.order_date < '2026-01-01'
GROUP BY month
ORDER BY month
"""
    }
]
    
    

**Dynamic example retrieval** using embedding similarity improves accuracy significantly. Store question-query pairs in a vector database and retrieve the most relevant ones for each new question.
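A minimal sketch of that retrieval step, assuming an `embed` callable that wraps your embedding model and an in-memory list standing in for the vector database:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def retrieve_examples(question, example_store, embed, k=3):
    """Return the k question-SQL pairs most similar to the new question.

    example_store: list of {"question": ..., "query": ..., "embedding": [...]}
    embed: callable mapping text to a vector (your embedding model).
    """
    q_vec = embed(question)
    scored = sorted(
        example_store,
        key=lambda ex: cosine_similarity(q_vec, ex["embedding"]),
        reverse=True,
    )
    return scored[:k]
```

A production system would swap the linear scan for an approximate-nearest-neighbor index, but the interface stays the same.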


SQL Generation Prompt


The complete prompt combines schema, examples, and safety instructions:


    
def build_nl2sql_prompt(question, schema, examples, dialect="postgresql"):
    prompt = f"""You are a {dialect} SQL expert. Convert natural language questions into SQL queries.

Database Schema:
{schema}

Rules:
1. Return ONLY the SQL query, no explanations
2. Use proper JOINs based on foreign key relationships
3. Handle NULLs appropriately with COALESCE or IS NULL checks
4. Use appropriate aggregate functions (COUNT, SUM, AVG) when needed
5. Always include filters to exclude irrelevant data
6. Use LIMIT for result size control
7. Order results when ordering is implied
8. Never use DELETE, UPDATE, INSERT, DROP, or ALTER statements

Examples:
"""
    for ex in examples:
        prompt += f"\nQ: {ex['question']}\nSQL: {ex['query']}\n"

    prompt += f"\nQ: {question}\nSQL:"
    return prompt
    
    

Query Validation


Before executing generated SQL, validate it:


    
import re
import sqlparse

def validate_sql(query, allowed_tables, dialect="postgresql"):
    errors = []

    # Safety check: no mutations. Match on word boundaries so a column
    # like "created_at" does not trip the CREATE check.
    forbidden_keywords = ["DELETE", "UPDATE", "INSERT", "DROP", "ALTER", "TRUNCATE", "CREATE"]
    for keyword in forbidden_keywords:
        if re.search(rf"\b{keyword}\b", query, re.IGNORECASE):
            errors.append(f"Forbidden operation: {keyword}")

    # Check table references (extract_tables is a helper that walks
    # the sqlparse token tree and collects table names)
    parsed = sqlparse.parse(query)[0]
    used_tables = extract_tables(parsed)
    for table in used_tables:
        if table not in allowed_tables:
            errors.append(f"Unauthorized table: {table}")

    # Syntax validation: EXPLAIN parses and plans without executing
    try:
        conn = get_dummy_connection(dialect)
        conn.execute(f"EXPLAIN {query}")
    except Exception as e:
        errors.append(f"SQL syntax error: {str(e)}")

    return errors
    
    

Advanced Techniques


Self-Correction Loop


When the generated SQL fails or returns empty results, the system can self-correct:


    
def nl2sql_with_correction(question, schema, allowed_tables, max_attempts=3):
    prompt_question = question
    for attempt in range(max_attempts):
        query = generate_sql(prompt_question, schema)
        errors = validate_sql(query, allowed_tables)

        if not errors:
            try:
                results = execute_sql(query)
                if results:
                    return format_results(question, query, results)
                feedback = "The query returned no results. Try a broader search."
            except Exception as e:
                feedback = f"Execution error: {str(e)}"
        else:
            feedback = f"Validation errors: {', '.join(errors)}"

        # Re-prompt with the ORIGINAL question plus the latest feedback,
        # rather than stacking feedback onto an already-augmented prompt
        prompt_question = (
            f"Original question: {question}\n"
            f"Previous attempt failed: {feedback}\n"
            f"Please fix the SQL query."
        )

    raise RuntimeError(f"Failed after {max_attempts} attempts: {feedback}")
    
    

Schema Linking


For large schemas with many tables, first identify relevant tables before generating SQL:


    
def identify_relevant_tables(question, all_tables, table_descriptions):
    prompt = f"""
Question: {question}
Available tables with descriptions:
{table_descriptions}

List only the tables needed to answer this question, comma-separated:
"""
    response = call_llm(prompt)
    # Keep only names that actually exist, in case the model hallucinates
    return [t.strip() for t in response.split(",") if t.strip() in all_tables]
    
    

This dramatically reduces context size and improves accuracy on large databases.


Production Deployment


**Security**: Always use a read-only database user, set statement timeouts, and implement rate limiting per user.
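Of these three, the read-only user and statement timeout are configured on the database side (e.g. `SET statement_timeout` in PostgreSQL); per-user rate limiting can be sketched in pure Python as a fixed-window counter:

```python
import time
from collections import defaultdict

class RateLimiter:
    """Fixed-window rate limiter: at most `limit` queries per user per window."""

    def __init__(self, limit=10, window_seconds=60):
        self.limit = limit
        self.window = window_seconds
        self.counts = defaultdict(int)   # (user, window index) -> count

    def allow(self, user_id, now=None):
        now = time.time() if now is None else now
        window_index = int(now // self.window)
        key = (user_id, window_index)
        if self.counts[key] >= self.limit:
            return False
        self.counts[key] += 1
        return True
```

A sliding-window or token-bucket variant smooths out the burst at window boundaries; the fixed window is just the simplest version to reason about.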


**Caching**: Cache common NL2SQL conversions to reduce LLM costs, but invalidate when the schema changes.
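One way to get the invalidation right is to key the cache on a fingerprint of the schema as well as the question, so any schema change automatically misses. A sketch (eviction policy omitted):

```python
import hashlib

class NL2SQLCache:
    """Cache NL-to-SQL results keyed on (normalized question, schema hash).

    When the schema DDL changes, its hash changes, so stale entries
    are simply never hit again.
    """

    def __init__(self):
        self._store = {}

    @staticmethod
    def _key(question, schema_ddl):
        schema_hash = hashlib.sha256(schema_ddl.encode()).hexdigest()[:16]
        return (question.strip().lower(), schema_hash)

    def get(self, question, schema_ddl):
        return self._store.get(self._key(question, schema_ddl))

    def put(self, question, schema_ddl, sql):
        self._store[self._key(question, schema_ddl)] = sql
```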


**Monitoring**: Log all generated queries and user feedback. Track query success rate, execution time, and correction frequency.
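Those metrics can all be derived from a structured log of query attempts. A minimal in-memory version (a real deployment would write to a database or metrics backend):

```python
from dataclasses import dataclass, field

@dataclass
class QueryLog:
    """Collects per-query records and derives the monitoring metrics."""
    records: list = field(default_factory=list)

    def record(self, question, sql, success, duration_ms, attempts=1):
        self.records.append({
            "question": question, "sql": sql, "success": success,
            "duration_ms": duration_ms, "attempts": attempts,
        })

    def success_rate(self):
        if not self.records:
            return 0.0
        return sum(r["success"] for r in self.records) / len(self.records)

    def correction_rate(self):
        """Fraction of queries that needed more than one attempt."""
        if not self.records:
            return 0.0
        return sum(r["attempts"] > 1 for r in self.records) / len(self.records)
```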


**User experience**: Show the generated SQL to users (with a "view query" option), and allow them to provide feedback on results.


Conclusion


Natural language to SQL with LLMs is production-ready for analytical queries. The key success factors are high-quality schema context, relevant few-shot examples, robust query validation, and a self-correction loop. Start with a limited set of tables, measure query accuracy, and expand as you gain confidence. This technology is transforming data access, making self-serve analytics a reality for non-technical team members.