Natural Language to SQL: How AI Translates Your Questions Into Database Queries
SQL has been the standard language for querying relational databases since the 1970s. It is powerful, precise, and utterly inaccessible to the vast majority of business professionals. Natural language to SQL (NL-to-SQL) changes this by using AI to convert plain English questions into executable SQL queries — making databases accessible to anyone who can type a question.
What Is Natural Language to SQL?
Natural language to SQL is a technology that takes a question written in plain English (or any human language) and converts it into a valid SQL query that can run against a relational database. For example:
Natural language:
“What were our top 10 customers by total spend last year?”
Generated SQL:
SELECT customer_name, SUM(order_total) AS total_spend
FROM orders
WHERE order_date >= '2025-01-01' AND order_date < '2026-01-01'
GROUP BY customer_name
ORDER BY total_spend DESC
LIMIT 10;
The user never sees or writes SQL. They ask a question, the AI generates and executes the query, and results are returned as formatted tables, charts, or narrative summaries.
How NL-to-SQL Works Under the Hood
Modern NL-to-SQL systems use large language models (LLMs) as their core engine. The process involves several stages:
1. Schema Injection
The AI receives a description of the database schema — table names, column names, data types, foreign key relationships, and sometimes sample values. This gives the model the vocabulary it needs to generate valid queries. Effective schema injection is one of the most important factors in NL-to-SQL accuracy.
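As a rough illustration of schema injection, the sketch below reads table, column, and foreign key metadata from a database and renders it as text that could be placed in a prompt. It uses Python's built-in sqlite3 module as a stand-in for a production database; the function name describe_schema and the output layout are assumptions made for this example, not a description of any particular product.

```python
import sqlite3


def describe_schema(conn: sqlite3.Connection) -> str:
    """Render tables, columns, types, and foreign keys as prompt-ready text."""
    lines = []
    tables = conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        # PRAGMA table_info rows: (cid, name, type, notnull, dflt_value, pk)
        cols = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        col_text = ", ".join(f"{c[1]} {c[2]}" for c in cols)
        lines.append(f"TABLE {table} ({col_text})")
        # PRAGMA foreign_key_list rows: (id, seq, table, from, to, ...)
        for fk in conn.execute(f'PRAGMA foreign_key_list("{table}")'):
            lines.append(f"  FK {table}.{fk[3]} -> {fk[2]}.{fk[4]}")
    return "\n".join(lines)


# Hypothetical two-table schema used only for this demonstration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, customer_name TEXT);
    CREATE TABLE orders (
        order_id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customers(customer_id),
        order_total REAL,
        order_date TEXT
    );
""")
schema_text = describe_schema(conn)
print(schema_text)
```

Other databases expose the same metadata through different catalogs (for example information_schema views), so the queries would change while the overall shape stays similar.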
2. Intent Parsing
The model analyzes the user's question to identify the analytical intent: Is this an aggregation? A filter? A comparison between groups? A time-series trend? A join across tables? The model must also resolve ambiguity — when a user says “last quarter,” the model needs to calculate the correct date range based on the current date.
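To make the "last quarter" example concrete, here is a minimal sketch of the date arithmetic involved. It assumes calendar quarters (January to March, and so on) and a half-open range; an organization with a different fiscal calendar would need different boundaries. The function name last_quarter_range is hypothetical.

```python
from datetime import date


def last_quarter_range(today: date) -> tuple[date, date]:
    """Return (start, end) of the calendar quarter before the one containing
    `today`. The end date is exclusive."""
    current_q = (today.month - 1) // 3  # 0 for Jan-Mar, ..., 3 for Oct-Dec
    if current_q == 0:
        # The previous quarter is Q4 of the prior year.
        year, q = today.year - 1, 3
    else:
        year, q = today.year, current_q - 1
    start = date(year, 3 * q + 1, 1)
    # Exclusive end: the first day of the quarter that follows.
    end = date(year + 1, 1, 1) if q == 3 else date(year, 3 * q + 4, 1)
    return start, end


start, end = last_quarter_range(date(2026, 2, 15))
print(start, end)  # 2025-10-01 2026-01-01
```

The resulting pair maps directly onto a filter such as order_date >= start AND order_date < end.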
3. SQL Generation
The LLM generates a SQL query that answers the question. This is the most visible step but depends entirely on the quality of steps 1 and 2. The generated SQL must be syntactically correct for the target database dialect (PostgreSQL, MySQL, and SQL Server all have slightly different syntax), semantically correct (querying the right tables and columns), and operationally safe (no mutations, no excessive resource consumption).
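One concrete dialect difference is row limiting: PostgreSQL and MySQL accept a trailing LIMIT n clause, while SQL Server uses SELECT TOP (n). The sketch below shows how a generator might render the same request per dialect. The function limit_rows and its dialect labels are assumptions for illustration only.

```python
def limit_rows(select_body: str, n: int, dialect: str) -> str:
    """Build a row-limited SELECT for the given dialect.

    `select_body` is everything after the SELECT keyword, with no trailing
    semicolon, e.g. "name FROM products ORDER BY name".
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if dialect in ("postgresql", "mysql"):
        return f"SELECT {select_body} LIMIT {n};"
    if dialect == "sqlserver":
        return f"SELECT TOP ({n}) {select_body};"
    raise ValueError(f"unsupported dialect: {dialect}")


body = "customer_name FROM orders ORDER BY order_total DESC"
print(limit_rows(body, 10, "postgresql"))
print(limit_rows(body, 10, "sqlserver"))
```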
4. Validation and Execution
Before execution, well-designed systems validate the generated query. This includes checking that only SELECT statements are generated (preventing writes or deletes), verifying that referenced tables and columns exist, applying query timeouts to prevent runaway operations, and ensuring the query respects access permissions.
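The checks described above can be sketched as follows. This is a deliberately conservative example, not a complete SQL parser: it rejects any query containing a semicolon or a write keyword, even inside a string literal, and it relies on SQLite's EXPLAIN to confirm that referenced tables and columns exist. Timeouts and permission checks are not shown. The name validate_query and the keyword list are assumptions for this example.

```python
import re
import sqlite3

# Keywords that indicate a write, DDL, or administrative statement.
FORBIDDEN = re.compile(
    r"\b(insert|update|delete|drop|alter|create|truncate|grant|revoke|attach|pragma)\b",
    re.IGNORECASE,
)


def validate_query(conn: sqlite3.Connection, sql: str) -> list[str]:
    """Return a list of problems; an empty list means the query passed."""
    problems = []
    stripped = sql.strip().rstrip(";").strip()
    if ";" in stripped:
        problems.append("multiple statements are not allowed")
    if not re.match(r"(?i)(select|with)\b", stripped):
        problems.append("only SELECT statements are allowed")
    if FORBIDDEN.search(stripped):
        problems.append("query contains a write or DDL keyword")
    if not problems:
        try:
            # EXPLAIN compiles the statement without running it, so unknown
            # tables or columns surface as errors here.
            conn.execute(f"EXPLAIN {stripped}")
        except sqlite3.Error as exc:
            problems.append(f"schema check failed: {exc}")
    return problems


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER, order_total REAL)")
print(validate_query(conn, "SELECT SUM(order_total) FROM orders;"))  # []
print(validate_query(conn, "DELETE FROM orders"))
print(validate_query(conn, "SELECT missing_col FROM orders"))
```

Text-level checks like these are best treated as a first line of defense, paired with database-level restrictions such as a read-only connection.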
Accuracy Considerations
NL-to-SQL accuracy has improved dramatically since 2023. On the Spider benchmark — the standard academic benchmark for text-to-SQL — top models now achieve above 85% execution accuracy on complex queries. However, real-world accuracy depends on several factors:
- Schema quality. Clear, descriptive column names significantly improve accuracy. A column named “rev” is harder for AI to interpret than “monthly_revenue_usd.”
- Question complexity. Simple aggregations and filters achieve near-perfect accuracy. Multi-table joins, subqueries, and window functions are more challenging.
- Domain context. Business-specific terminology (“churn rate,” “ARR,” “MoM growth”) requires the AI to understand domain definitions, not just SQL syntax.
- Ambiguity handling. Questions like “Show me recent sales” are inherently ambiguous. Good systems ask for clarification or make reasonable assumptions and state them explicitly.
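Execution accuracy, the metric mentioned above, counts a generated query as correct when it returns the same rows as a reference query. The sketch below is a simplified version of that idea using sqlite3; real benchmark harnesses handle more edge cases (ordering requirements, value normalization), and the function names and sample data here are hypothetical.

```python
import sqlite3
from collections import Counter


def execution_match(conn: sqlite3.Connection, predicted_sql: str, gold_sql: str) -> bool:
    """True when both queries return the same rows, ignoring row order."""
    try:
        predicted = conn.execute(predicted_sql).fetchall()
    except sqlite3.Error:
        return False  # a query that fails to run counts as wrong
    gold = conn.execute(gold_sql).fetchall()
    # Compare as multisets so duplicates matter but ordering does not.
    return Counter(predicted) == Counter(gold)


def execution_accuracy(conn: sqlite3.Connection, pairs: list[tuple[str, str]]) -> float:
    """Fraction of (predicted, gold) pairs whose results match."""
    return sum(execution_match(conn, p, g) for p, g in pairs) / len(pairs)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER, customer TEXT, total REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "a", 10.0), (2, "b", 20.0), (3, "a", 5.0)],
)
gold = "SELECT customer, SUM(total) FROM orders GROUP BY customer"
pairs = [
    # Same result in a different row order: counted as correct.
    ("SELECT customer, SUM(total) FROM orders GROUP BY customer ORDER BY customer DESC", gold),
    # Counts rows instead of summing totals: counted as wrong.
    ("SELECT customer, COUNT(*) FROM orders GROUP BY customer", gold),
]
print(execution_accuracy(conn, pairs))  # 0.5
```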
How DEX Implements NL-to-SQL
DEX AI uses Claude Sonnet 4.6 via Azure AI Foundry as its NL-to-SQL engine. When a user asks a question in Slack, Teams, or the DEX web interface, the system:
- Retrieves the schema of the connected database (PostgreSQL, MySQL, or SQL Server) or the structure of the uploaded dataset (CSV or Excel columns and types).
- Constructs a prompt that includes the schema, the user's question, any relevant semantic definitions configured by the team (custom business terms), and conversation context for follow-up questions.
- Generates a SQL query (for databases) or equivalent data operations (for file-based datasets).
- Validates the query — ensuring it is read-only, references valid schema elements, and includes appropriate limits.
- Executes the query with connection-level timeouts and row limits.
- Returns results with the generated SQL shown for transparency, a chart if the data lends itself to visualization, and a narrative summary of the findings.
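The prompt-construction step in the list above can be pictured with a small sketch. This is not DEX's actual prompt, which is not shown in this article; the section headings, instruction wording, and the build_prompt function are all assumptions chosen to illustrate how schema, definitions, conversation context, and the question might be combined.

```python
def build_prompt(
    schema: str,
    question: str,
    definitions: dict[str, str],
    history: list[str],
) -> str:
    """Assemble one prompt string from the pieces a NL-to-SQL model needs."""
    sections = [
        "You translate questions into a single read-only SQL SELECT statement.",
        "## Schema\n" + schema,
    ]
    if definitions:
        defs = "\n".join(f"- {term}: {meaning}" for term, meaning in definitions.items())
        sections.append("## Business definitions\n" + defs)
    if history:
        earlier = "\n".join(f"- {q}" for q in history)
        sections.append("## Earlier questions in this conversation\n" + earlier)
    # The question goes last so it is closest to where the model starts writing.
    sections.append("## Question\n" + question)
    return "\n\n".join(sections)


prompt = build_prompt(
    schema="TABLE orders (order_id INTEGER, customer_id INTEGER, order_total REAL)",
    question="And how does that compare to the month before?",
    definitions={"AOV": "average order value: SUM(order_total) / COUNT(*)"},
    history=["What was our AOV last month?"],
)
print(prompt)
```

Including the earlier question is what lets a follow-up such as "how does that compare" be resolved to a concrete metric.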
Semantic Definitions
One of DEX's distinguishing features is support for semantic definitions — also called an ontology layer. Teams can define business terms that map to specific database calculations. For example, you might define “churn rate” as “the number of customers who cancelled divided by the total number of active customers at the start of the period.” When any team member asks about churn rate, DEX applies the organization's official definition consistently.
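A semantic layer of this kind can be approximated as a lookup from business terms to agreed definitions, with only the terms mentioned in a question passed along to the model. The sketch below assumes a plain dictionary and whole-phrase matching; the ARR formula is a hypothetical team definition added for illustration, and the function name relevant_definitions is likewise an assumption.

```python
import re

# Hypothetical team-configured definitions.
DEFINITIONS = {
    "churn rate": (
        "customers who cancelled during the period divided by "
        "active customers at the start of the period"
    ),
    "ARR": "annual recurring revenue, defined here as 12 x current monthly recurring revenue",
}


def relevant_definitions(question: str, definitions: dict[str, str]) -> dict[str, str]:
    """Return only the definitions whose term appears in the question."""
    found = {}
    for term, meaning in definitions.items():
        # Whole-phrase, case-insensitive match, so "ARR" does not match "carry".
        if re.search(rf"\b{re.escape(term)}\b", question, re.IGNORECASE):
            found[term] = meaning
    return found


print(relevant_definitions("What was our churn rate last quarter?", DEFINITIONS))
print(relevant_definitions("How many orders did we carry over?", DEFINITIONS))  # {}
```

Because every question about a term resolves to the same stored definition, two team members asking about churn rate get answers computed the same way.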
Example Translations
“How many new users signed up each month this year?”
SELECT DATE_TRUNC('month', created_at) AS month,
COUNT(*) AS new_users
FROM users
WHERE created_at >= '2026-01-01'
GROUP BY month
ORDER BY month;
“What percentage of orders are from returning customers?”
SELECT
ROUND(100.0 * SUM(CASE WHEN order_count > 1 THEN order_count ELSE 0 END)
/ SUM(order_count), 1) AS returning_pct
FROM (
SELECT customer_id, COUNT(*) AS order_count
FROM orders
GROUP BY customer_id
) sub;
“Which products have declining sales over the last 3 months?”
WITH monthly AS (
SELECT product_id, product_name,
DATE_TRUNC('month', order_date) AS month,
SUM(quantity) AS units
FROM order_items
JOIN products USING (product_id)
WHERE order_date >= DATE_TRUNC('month', CURRENT_DATE) - INTERVAL '3 months'
AND order_date < DATE_TRUNC('month', CURRENT_DATE)
GROUP BY product_id, product_name, month
)
SELECT product_name
FROM monthly
GROUP BY product_id, product_name
HAVING REGR_SLOPE(units, EXTRACT(EPOCH FROM month)) < 0;
Benefits for Non-Technical Users
The primary value of NL-to-SQL is democratization. In most organizations, fewer than 10% of employees know SQL. The remaining 90% depend on those who do — creating bottlenecks, delays, and frustration. NL-to-SQL eliminates this bottleneck. A product manager can ask about feature adoption. A CFO can ask about budget variance. A customer success lead can ask about renewal rates. None of them need to learn SQL, submit a ticket, or wait for an analyst.
Security Considerations
NL-to-SQL introduces unique security considerations that responsible platforms must address:
- Read-only enforcement. Generated queries must be restricted to SELECT statements. DEX enforces this at both the AI prompt level and the database connection level.
- Injection prevention. The AI must not be tricked into generating malicious SQL through adversarial prompts. DEX uses parameterized query validation to prevent SQL injection.
- Access control. Not every user should be able to query every table. DEX's role-based access control ensures users only see data they are authorized to access.
- Audit logging. Every generated query is logged with the user who requested it, the question asked, the SQL generated, and the timestamp. This creates a complete audit trail for compliance.
- Credential encryption. Database credentials stored in DEX are encrypted with 256-bit AES. Credentials are never exposed in logs, error messages, or API responses.
Query your database in plain English
Connect PostgreSQL, MySQL, or SQL Server to DEX and start asking questions in Slack or Teams. Free plan available.