Blueprinting Text-to-SQL Assistants in Regulated Finance

Building conversational analytics systems in regulated finance requires balancing innovation with rigorous compliance. This article details the architectural decisions, technical challenges, and operational learnings from deploying AuroraQL Analyst Copilot at Stellantis Financial Services.

The Challenge

Financial analysts spend a large share of their time hand-writing SQL to extract insights from complex datasets. At Stellantis Financial Services, this meant:

40+ analysts struggling with complex ledger queries
Decision cycles taking days instead of hours
Knowledge silos preventing efficient data exploration
Compliance requirements demanding auditability and governance

Architecture Overview

The AuroraQL system combines three components:

1. Semantic Parsing Layer

We implemented a retrieval-augmented generation (RAG) pipeline that:

Converts natural language to SQL with context awareness
Maintains schema understanding through vector embeddings
Validates queries against predefined security policies

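def parse_query(natural_language_query: str) -> SQLQuery:
    """Translate an analyst's question into validated SQL."""
    # vector_store, llm, and validator are clients defined outside this snippet.
    # Retrieve the schema snippets most relevant to the question.
    schema_context = vector_store.retrieve(natural_language_query)

    # Generate a SQL candidate; a low temperature keeps the output stable.
    sql_candidate = llm.generate(
        prompt=natural_language_query,
        context=schema_context,
        temperature=0.2,
    )

    # Check the candidate against security policies before returning it.
    return validator.check(sql_candidate)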
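
The `validator.check` step is where security policy gets enforced. Here is a minimal sketch of one such check, assuming a read-only policy with a table allow-list. `ALLOWED_TABLES`, `PolicyViolation`, and `check_sql` are hypothetical names, not part of AuroraQL, and the regex-based table extraction is a deliberate simplification: it does not understand CTEs, subquery aliases, or string literals, so a production validator would more likely rely on a real SQL parser.

```python
import re

# Hypothetical allow-list; a real deployment would load this from governance config.
ALLOWED_TABLES = frozenset({"ledger_entries", "accounts"})

WRITE_KEYWORDS = re.compile(
    r"\b(insert|update|delete|drop|alter|create|truncate|grant)\b", re.IGNORECASE
)
TABLE_REFERENCE = re.compile(r"\b(?:from|join)\s+([A-Za-z_][\w.]*)", re.IGNORECASE)


class PolicyViolation(Exception):
    """Raised when generated SQL breaks a security policy."""


def check_sql(sql: str, allowed_tables: frozenset = ALLOWED_TABLES) -> str:
    """Return the statement (without trailing semicolon) if it passes, else raise."""
    statement = sql.strip().rstrip(";").strip()

    # Reject multi-statement payloads such as "SELECT 1; DROP TABLE x".
    if ";" in statement:
        raise PolicyViolation("multiple statements are not allowed")

    # Only plain SELECT statements are accepted in this sketch.
    if not re.match(r"select\b", statement, re.IGNORECASE):
        raise PolicyViolation("only SELECT queries are allowed")

    # Conservative: a write keyword anywhere, even inside a literal, is rejected.
    if WRITE_KEYWORDS.search(statement):
        raise PolicyViolation("write or DDL keyword found")

    # Naive table extraction: identifiers that follow FROM or JOIN.
    tables = {name.lower() for name in TABLE_REFERENCE.findall(statement)}
    unknown = tables - allowed_tables
    if unknown:
        raise PolicyViolation(f"tables not allowed: {sorted(unknown)}")

    return statement
```

The checks err on the side of rejecting: a false rejection costs an analyst a retry, while a false acceptance could run a query the policy forbids.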

2. Evaluation Framework

Achieving 85.6% validated query accuracy required continuous evaluation:

Unit tests for common query patterns
Regression testing on production queries
Human feedback loops for edge cases
Automated accuracy metrics tracking
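
To make the tracking concrete, here is a sketch of how a test suite might roll up into per-category and overall accuracy figures. It assumes exact match on normalized SQL as the pass criterion, which is stricter than comparing query results and may differ from the metric behind the 85.6% figure. `EvalCase`, `normalize`, and `run_suite` are hypothetical names, and the generator is passed in as a plain callable.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class EvalCase:
    question: str
    expected_sql: str
    category: str  # e.g. "unit" or "regression"


def normalize(sql: str) -> str:
    """Collapse case, whitespace, and a trailing semicolon.

    Lowercasing also touches string literals, a simplification in this sketch.
    """
    return " ".join(sql.strip().rstrip(";").lower().split())


def run_suite(
    cases: Iterable[EvalCase], generate: Callable[[str], str]
) -> dict[str, float]:
    """Return accuracy per category plus an "overall" figure."""
    totals: dict[str, int] = {}
    passed: dict[str, int] = {}
    for case in cases:
        ok = normalize(generate(case.question)) == normalize(case.expected_sql)
        for key in (case.category, "overall"):
            totals[key] = totals.get(key, 0) + 1
            passed[key] = passed.get(key, 0) + int(ok)
    return {key: passed[key] / totals[key] for key in totals}
```

Reporting accuracy per category makes it easier to see whether a drop comes from a known pattern regressing or from new edge cases.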

3. Compliance Guardrails

In regulated finance, every query must be:

Auditable: Full logging of inputs, outputs, and transformations
Secure: Role-based access control at query execution
Explainable: Transparency in how SQL was generated
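
The first two requirements can be sketched in a few lines: a role check before execution, and an append-only audit record for every request. The hash chaining below is our illustrative choice for making tampering detectable, not something the AuroraQL design is stated to use, and `ROLE_TABLES`, `authorize`, and `AuditLog` are hypothetical names.

```python
import hashlib
import json
from datetime import datetime, timezone

# Hypothetical role-to-table grants.
ROLE_TABLES = {
    "analyst": {"ledger_entries"},
    "controller": {"ledger_entries", "accounts"},
}


def authorize(role: str, tables: set) -> bool:
    """True only if the role is granted every table the query touches."""
    return tables <= ROLE_TABLES.get(role, set())


def _digest(body: dict) -> str:
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()


class AuditLog:
    """In-memory append-only log; each record stores the previous record's hash."""

    def __init__(self) -> None:
        self.records: list = []

    def append(self, user: str, role: str, question: str, sql: str, allowed: bool) -> dict:
        record = {
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "user": user,
            "role": role,
            "question": question,
            "sql": sql,
            "allowed": allowed,
            "prev_hash": self.records[-1]["hash"] if self.records else "",
        }
        record["hash"] = _digest(record)  # hash covers every field above
        self.records.append(record)
        return record

    def verify(self) -> bool:
        """Recompute the chain; any edited or reordered record breaks it."""
        prev_hash = ""
        for record in self.records:
            body = {k: v for k, v in record.items() if k != "hash"}
            if record["prev_hash"] != prev_hash or _digest(body) != record["hash"]:
                return False
            prev_hash = record["hash"]
        return True
```

Logging denied requests as well as allowed ones keeps the trail complete: an auditor can see what was attempted, not only what ran.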

Technical Stack

Amazon Bedrock: Foundation model hosting
LangChain: Orchestration and prompt engineering
AWS Lambda: Serverless compute for query processing
OpenSearch: Vector database for schema retrieval
Python: Backend orchestration and validation

Key Results

After 6 months in production:

✅ 85.6% query accuracy measured through automated validation
✅ Sub-20-second response times for complex analytical queries
✅ 40+ analysts onboarded with compliance approvals
✅ 27% shorter decision-making cycles

Lessons Learned

1. Start with Schema Quality

The foundation of any text-to-SQL system is schema understanding. We invested heavily in:

Comprehensive schema documentation
Semantic annotations for column meanings
Example queries for common patterns
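
As an illustration of how these three investments fit together, the sketch below stores documentation, annotations, and example queries per table and flattens them into one retrievable text. Token overlap stands in for the vector-embedding similarity used in the real pipeline, purely to keep the example dependency-free. `ColumnDoc`, `TableDoc`, and `retrieve` are hypothetical names.

```python
import re
from dataclasses import dataclass, field


@dataclass
class ColumnDoc:
    name: str
    dtype: str
    meaning: str  # semantic annotation written for humans and the model


@dataclass
class TableDoc:
    name: str
    description: str
    columns: list
    example_queries: list = field(default_factory=list)

    def to_text(self) -> str:
        """Flatten the documentation into the text that would be embedded."""
        cols = "; ".join(f"{c.name} ({c.dtype}): {c.meaning}" for c in self.columns)
        examples = " | ".join(self.example_queries)
        return f"Table {self.name}: {self.description}. Columns: {cols}. Examples: {examples}"


def tokens(text: str) -> set:
    # Underscores split identifiers, so "ledger_entries" yields "ledger" and "entries".
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question: str, docs: list, k: int = 1) -> list:
    """Rank tables by tokens shared with the question (stand-in for embeddings)."""
    q = tokens(question)
    ranked = sorted(docs, key=lambda d: len(q & tokens(d.to_text())), reverse=True)
    return ranked[:k]
```

Even with this crude scoring, the quality of the `meaning` and `description` fields decides which table is retrieved, which is the point of the lesson.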

2. Build Evaluation First

Before optimizing accuracy, establish clear metrics:

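from pandas import DataFrame  # assuming pandas; the original does not say

class QueryEvaluator:
    def evaluate(self, generated_sql: str, expected_result: DataFrame) -> float:
        """Score generated SQL by executing it and comparing result sets."""
        # execute_sql and calculate_similarity are helpers not shown in this snippet.
        actual_result = execute_sql(generated_sql)
        return self.calculate_similarity(actual_result, expected_result)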
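
The snippet leaves `calculate_similarity` undefined. One plausible definition, offered as an assumption rather than the metric AuroraQL uses, is a multiset Jaccard overlap between result rows, so that row order is ignored but duplicates still count. The sketch uses an in-memory SQLite database and plain tuples instead of a warehouse and DataFrames; `result_similarity` and `evaluate` are hypothetical names.

```python
import sqlite3
from collections import Counter


def result_similarity(actual_rows: list, expected_rows: list) -> float:
    """Multiset Jaccard overlap of two result sets; row order is ignored."""
    actual, expected = Counter(actual_rows), Counter(expected_rows)
    union = sum((actual | expected).values())
    if union == 0:
        return 1.0  # two empty results agree
    return sum((actual & expected).values()) / union


def evaluate(conn: sqlite3.Connection, generated_sql: str, expected_rows: list) -> float:
    """Execute the generated SQL and score it; SQL that fails to run scores 0.0."""
    try:
        actual_rows = conn.execute(generated_sql).fetchall()
    except sqlite3.Error:
        return 0.0
    return result_similarity(actual_rows, expected_rows)


# Tiny fixture standing in for a ledger table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ledger (account TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO ledger VALUES (?, ?)",
    [("cash", 100.0), ("cash", 50.0), ("debt", -30.0)],
)
expected = [("cash", 150.0), ("debt", -30.0)]
score = evaluate(conn, "SELECT account, SUM(amount) FROM ledger GROUP BY account", expected)
```

Comparing results rather than SQL text gives credit to queries that are written differently but return the same rows, which matters because many SQL formulations are equivalent.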

3. Embrace Incremental Deployment

We rolled out in phases:

Phase 1: Read-only queries on non-sensitive data
Phase 2: Expanded table access with audit logging
Phase 3: Full production deployment with monitoring
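
One way to make such phases enforceable is to express them as configuration that the query path consults. The sketch below is an assumption about how that could look: the table names are invented, and the flags simply mirror the phase descriptions above (audit logging from Phase 2, monitoring from Phase 3). `Phase`, `PHASES`, and `is_allowed` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Phase:
    name: str
    allowed_tables: frozenset
    audit_logging: bool
    monitoring: bool


# Hypothetical table names; each phase widens access over the previous one.
NON_SENSITIVE = frozenset({"fx_rates", "calendar"})
EXPANDED = NON_SENSITIVE | {"ledger_entries"}
FULL = EXPANDED | {"accounts"}

PHASES = {
    1: Phase("read-only, non-sensitive data", NON_SENSITIVE, audit_logging=False, monitoring=False),
    2: Phase("expanded access with audit logging", EXPANDED, audit_logging=True, monitoring=False),
    3: Phase("full production with monitoring", FULL, audit_logging=True, monitoring=True),
}


def is_allowed(phase_number: int, table: str) -> bool:
    """Check whether a table may be queried in the given rollout phase."""
    return table in PHASES[phase_number].allowed_tables
```

Keeping the phase definition in data rather than code means moving to the next phase is a reviewed configuration change, which is easier to audit than a code release.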

4. Human-in-the-Loop is Critical

At 85.6% accuracy, roughly one generated query in seven still fails validation, so we maintain:

Query preview before execution
Confidence scores displayed to users
Easy escalation to SQL experts
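
These three safeguards can be tied together with a small routing rule. The sketch below assumes a confidence score between 0 and 1 is already available; how that score is computed is not covered here, and the 0.6 threshold is an arbitrary placeholder. `Action`, `route`, and `render_preview` are hypothetical names.

```python
from enum import Enum


class Action(Enum):
    PREVIEW = "preview"    # show the SQL and wait for the analyst to confirm
    ESCALATE = "escalate"  # hand the question to a SQL expert


ESCALATION_THRESHOLD = 0.6  # placeholder value, to be tuned on real feedback


def route(confidence: float, threshold: float = ESCALATION_THRESHOLD) -> Action:
    """Nothing runs automatically: a query is either previewed or escalated."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    return Action.PREVIEW if confidence >= threshold else Action.ESCALATE


def render_preview(sql: str, confidence: float) -> str:
    """Build the text shown to the analyst before execution."""
    return f"Confidence: {confidence:.0%}\n{sql}\nRun this query? [y/N]"
```

Note that there is no branch that executes without a preview, which keeps the analyst as the final check regardless of how confident the model is.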

Future Directions

We're exploring:

Multi-turn conversations for complex analytical workflows
Query optimization suggestions based on execution patterns
Anomaly detection for unusual data access patterns
Integration with BI tools for seamless analyst experience

Conclusion

Building text-to-SQL assistants for regulated finance demands more than just LLM integration. It requires thoughtful architecture, rigorous evaluation, and deep respect for compliance requirements.

The AuroraQL system proves that with the right guardrails, conversational analytics can empower financial analysts while maintaining the security and auditability that regulated industries demand.