
Blueprinting Text-to-SQL Assistants in Regulated Finance

January 15, 2025 | 8 min read | By Jaafar Benabderrazak

Lessons from deploying conversational analytics at Stellantis Financial Services with latency, accuracy, and compliance guardrails.

Building conversational analytics systems in regulated finance requires balancing innovation with rigorous compliance. This article details the architectural decisions, technical challenges, and operational learnings from deploying AuroraQL Analyst Copilot at Stellantis Financial Services.

The Challenge

Financial analysts spend hours hand-writing SQL queries to extract insights from complex datasets. For Stellantis Financial Services, this meant:

  • 40+ analysts struggling with complex ledger queries
  • Decision cycles taking days instead of hours
  • Knowledge silos preventing efficient data exploration
  • Compliance requirements demanding auditability and governance

Architecture Overview

The AuroraQL system combines three components:

1. Semantic Parsing Layer

We implemented a retrieval-augmented generation (RAG) pipeline that:

  • Converts natural language to SQL with context awareness
  • Maintains schema understanding through vector embeddings
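  • Validates queries against predefined security policies

# vector_store, llm, and validator are assumed to be defined elsewhere in the
# pipeline (not shown); SQLQuery is the project's validated-query type.
def parse_query(natural_language_query: str) -> SQLQuery:
    # Retrieve relevant schema context
    schema_context = vector_store.retrieve(natural_language_query)

    # Generate SQL with LLM (a low temperature keeps output more repeatable)
    sql_candidate = llm.generate(
        prompt=natural_language_query,
        context=schema_context,
        temperature=0.2
    )

    # Validate and sanitize before anything reaches the database
    validated_sql = validator.check(sql_candidate)

    return validated_sql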
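
The validation step is the part of this pipeline that matters most for compliance, and its internals are not shown here. As a minimal sketch, assuming a read-only policy and a table allow-list, a check could look like the following. The names `validate_sql`, `PolicyViolation`, and `ALLOWED_TABLES` are hypothetical, as are the table names, and a production validator would rely on a real SQL parser rather than regular expressions.

```python
import re

# Hypothetical allow-list; the real policy store is not described in the article.
ALLOWED_TABLES = {"ledger_entries", "contracts"}

# Keywords that indicate a write or DDL statement.
FORBIDDEN = re.compile(
    r"\b(insert|update|delete|drop|alter|truncate|grant|create)\b", re.IGNORECASE
)
# Naive table extraction: the identifier following FROM or JOIN.
# Limitation: CTE names and subquery aliases are treated as tables.
TABLE_REF = re.compile(r"\b(?:from|join)\s+([A-Za-z_][\w.]*)", re.IGNORECASE)


class PolicyViolation(ValueError):
    """Raised when a candidate query breaks the illustrative policy."""


def validate_sql(sql: str, allowed_tables: set[str] = ALLOWED_TABLES) -> str:
    """Return the cleaned statement, or raise PolicyViolation."""
    statement = sql.strip().rstrip(";").strip()
    if ";" in statement:
        raise PolicyViolation("multiple statements are not allowed")
    if not re.match(r"(?i)^(select|with)\b", statement):
        raise PolicyViolation("only SELECT queries are allowed")
    if FORBIDDEN.search(statement):
        # Conservative: also rejects these keywords inside string literals.
        raise PolicyViolation("write or DDL keyword found")
    tables = {name.lower() for name in TABLE_REF.findall(statement)}
    unknown = tables - allowed_tables
    if unknown:
        raise PolicyViolation(f"tables not permitted: {sorted(unknown)}")
    return statement
```

The check is deliberately conservative: rejecting a valid query costs an analyst a retry, while letting a write statement through costs an audit finding.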

2. Evaluation Framework

Achieving 85.6% validated query accuracy required continuous evaluation:

  • Unit tests for common query patterns
  • Regression testing on production queries
  • Human feedback loops for edge cases
  • Automated accuracy metrics tracking
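
The article does not specify how accuracy is computed. One common definition is execution accuracy: a generated query passes if it returns the same rows as an analyst-approved reference query. The sketch below illustrates that definition with an in-memory SQLite database; `RegressionCase`, `execution_accuracy`, and the `ledger` table are hypothetical names chosen for the example.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class RegressionCase:
    question: str
    generated_sql: str  # output of the text-to-SQL pipeline for `question`
    reference_sql: str  # query written and approved by an analyst


def rows(conn: sqlite3.Connection, sql: str) -> list[tuple]:
    # Sort by repr so row order does not affect the comparison and
    # mixed types (for example NULLs) do not break sorting.
    return sorted(conn.execute(sql).fetchall(), key=repr)


def execution_accuracy(conn: sqlite3.Connection, cases: list[RegressionCase]) -> float:
    """Fraction of cases whose generated SQL returns the reference rows."""
    if not cases:
        return 0.0
    passed = 0
    for case in cases:
        try:
            if rows(conn, case.generated_sql) == rows(conn, case.reference_sql):
                passed += 1
        except sqlite3.Error:
            pass  # a query that fails to run counts as a miss
    return passed / len(cases)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ledger (account TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO ledger VALUES (?, ?)", [("A", 10.0), ("A", 5.0), ("B", 7.5)]
)

cases = [
    RegressionCase(
        "Total per account",
        "SELECT account, SUM(amount) FROM ledger GROUP BY account",
        "SELECT account, SUM(amount) AS total FROM ledger GROUP BY account ORDER BY account",
    ),
    RegressionCase(
        "Largest entry",
        "SELECT MIN(amount) FROM ledger",  # deliberately wrong
        "SELECT MAX(amount) FROM ledger",
    ),
]
accuracy = execution_accuracy(conn, cases)
print(f"execution accuracy: {accuracy:.1%}")
```

Comparing result rows rather than SQL text means two differently written but equivalent queries both count as correct.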

3. Compliance Guardrails

In regulated finance, every query must be:

  • Auditable: Full logging of inputs, outputs, and transformations
  • Secure: Role-based access control at query execution
  • Explainable: Transparency in how SQL was generated

Technical Stack

  • AWS Bedrock: Foundation model hosting
  • LangChain: Orchestration and prompt engineering
  • AWS Lambda: Serverless compute for query processing
  • OpenSearch: Vector database for schema retrieval
  • Python: Backend orchestration and validation

Key Results

After 6 months in production:

  • 85.6% query accuracy measured through automated validation
  • Sub-20-second response times for complex analytical queries
  • 40+ analysts onboarded with compliance approvals
  • 27% shorter decision-making cycles

Lessons Learned

1. Start with Schema Quality

The foundation of any text-to-SQL system is schema understanding. We invested heavily in:

  • Comprehensive schema documentation
  • Semantic annotations for column meanings
  • Example queries for common patterns
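
As an illustration of what such documentation can look like in code, the sketch below models a table with annotated columns and example queries, then renders it as plain text suitable for embedding or prompt context. The classes `ColumnDoc` and `TableDoc`, and the sample table, are hypothetical and not taken from the production schema.

```python
from dataclasses import dataclass, field


@dataclass
class ColumnDoc:
    name: str
    sql_type: str
    meaning: str  # semantic annotation in business terms


@dataclass
class TableDoc:
    name: str
    description: str
    columns: list[ColumnDoc]
    example_queries: list[str] = field(default_factory=list)

    def to_context(self) -> str:
        """Render the table as text for embedding or prompt context."""
        lines = [f"Table {self.name}: {self.description}"]
        lines += [f"  {c.name} ({c.sql_type}): {c.meaning}" for c in self.columns]
        lines += [f"  Example: {q}" for q in self.example_queries]
        return "\n".join(lines)


# Hypothetical table used only to show the rendered output.
ledger_doc = TableDoc(
    name="ledger_entries",
    description="One row per posted accounting entry.",
    columns=[
        ColumnDoc("posted_at", "DATE", "Date the entry was booked"),
        ColumnDoc("amount_eur", "DECIMAL", "Signed amount in euros"),
    ],
    example_queries=["SELECT SUM(amount_eur) FROM ledger_entries"],
)
print(ledger_doc.to_context())
```

Keeping annotations next to the schema definition makes it easier to regenerate embeddings whenever a column changes.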

2. Build Evaluation First

Before optimizing accuracy, establish clear metrics:

from pandas import DataFrame

class QueryEvaluator:
    def evaluate(self, generated_sql: str, expected_result: DataFrame) -> float:
        # execute_sql and calculate_similarity are project helpers (not shown)
        actual_result = execute_sql(generated_sql)
        return self.calculate_similarity(actual_result, expected_result)
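
The body of `calculate_similarity` is not shown. One possible metric, offered here as an assumption rather than the production implementation, is an order-insensitive overlap between result rows (multiset Jaccard similarity). The sketch works on plain row tuples so it stays dependency-free.

```python
from collections import Counter
from typing import Hashable, Iterable, Sequence

Row = Sequence[Hashable]


def calculate_similarity(actual: Iterable[Row], expected: Iterable[Row]) -> float:
    """Multiset Jaccard similarity between two result sets.

    Returns 1.0 when both contain the same rows with the same
    multiplicities, regardless of order, and 0.0 when they share none.
    """
    a = Counter(tuple(row) for row in actual)
    e = Counter(tuple(row) for row in expected)
    union = sum((a | e).values())  # max count per distinct row
    if union == 0:
        return 1.0  # two empty results are treated as identical
    return sum((a & e).values()) / union  # min count per distinct row
```

With pandas, rows can be obtained through `df.itertuples(index=False, name=None)`. A partial-credit score like this is useful for triage, although a strict pass/fail threshold is usually easier to explain to auditors.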

3. Embrace Incremental Deployment

We rolled out in phases:

  1. Phase 1: Read-only queries on non-sensitive data
  2. Phase 2: Expanded table access with audit logging
  3. Phase 3: Full production deployment with monitoring
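
A phased rollout is easier to enforce when each phase is expressed as configuration instead of convention. The sketch below is one way to do that; the `RolloutPhase` fields, the table names, and the assumption that audit logging starts in phase 2 and monitoring in phase 3 are illustrative readings of the list above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RolloutPhase:
    name: str
    allowed_tables: frozenset[str]
    audit_logging: bool
    monitoring: bool


# Hypothetical table names; each phase widens access and adds controls.
PHASES = {
    1: RolloutPhase(
        "read-only, non-sensitive", frozenset({"reference_rates"}), False, False
    ),
    2: RolloutPhase(
        "expanded access", frozenset({"reference_rates", "ledger_entries"}), True, False
    ),
    3: RolloutPhase(
        "full production",
        frozenset({"reference_rates", "ledger_entries", "contracts"}),
        True,
        True,
    ),
}


def tables_allowed(phase: int, tables: set[str]) -> bool:
    """True if every referenced table is in scope for the given phase."""
    return tables <= PHASES[phase].allowed_tables
```

Moving from one phase to the next then becomes a reviewed configuration change with a clear approval trail.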

4. Human-in-the-Loop Is Critical

Even at 85.6% accuracy, roughly one query in seven is still wrong, so we maintain:

  • Query preview before execution
  • Confidence scores displayed to users
  • Easy escalation to SQL experts
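
These three safeguards can be combined into a small routing rule. The sketch below assumes a confidence score between 0 and 1 and a cut-off of 0.7; both the threshold and the way confidence is computed are assumptions, since the article does not specify them.

```python
from enum import Enum
from typing import Optional


class Step(Enum):
    AWAIT_APPROVAL = "show preview and wait for the analyst"
    EXECUTE = "run the query"
    ESCALATE = "hand over to a SQL expert"


def next_step(
    confidence: float,
    user_approved: Optional[bool] = None,
    threshold: float = 0.7,  # hypothetical cut-off, to be tuned on real data
) -> Step:
    """Decide what happens to a generated query."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    if confidence < threshold:
        return Step.ESCALATE  # low confidence never reaches execution
    if user_approved is None:
        return Step.AWAIT_APPROVAL  # preview shown, no decision yet
    return Step.EXECUTE if user_approved else Step.ESCALATE
```

Nothing executes without an explicit approval, so a high confidence score alone is never enough to run a query.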

Future Directions

We're exploring:

  • Multi-turn conversations for complex analytical workflows
  • Query optimization suggestions based on execution patterns
  • Anomaly detection for unusual data access patterns
  • Integration with BI tools for seamless analyst experience

Conclusion

Building text-to-SQL assistants for regulated finance demands more than just LLM integration. It requires thoughtful architecture, rigorous evaluation, and deep respect for compliance requirements.

The AuroraQL system shows that, with the right guardrails, conversational analytics can empower financial analysts while maintaining the security and auditability that regulated industries demand.


Want to learn more? Check out the GitHub repository for code samples and architectural diagrams.

Questions or feedback? Connect with me on LinkedIn or read the full article on Medium.
