How Retrieval-Augmented LLMs Transform Fleet Diagnostics

January 10, 2025 · 7 min read · By Jaafar Benabderrazak

A field report on pairing sensor data with knowledge bases to power explainable maintenance copilots for connected vehicles.

Connected vehicles generate terabytes of sensor data daily. The challenge isn't collecting this data—it's transforming it into actionable maintenance insights that technicians can understand and trust. This article explores how we built VectraDrive Diagnostics at Stellantis R&D.

The Problem Space

Modern vehicles have hundreds of sensors monitoring everything from engine temperature to tire pressure. When something goes wrong:

  • Technicians receive cryptic error codes
  • Manuals span thousands of pages
  • Diagnosis takes hours of investigative work
  • Misdiagnosis costs €1.1M annually in warranty claims

Solution Architecture

VectraDrive combines real-time telemetry with LLM-powered reasoning to provide explainable diagnostics.

Component 1: Telemetry Ingestion

from typing import Dict, List

class TelemetryProcessor:
    def process_stream(self, vehicle_id: str, sensor_data: Dict) -> List[Anomaly]:
        # Normalize sensor readings into consistent units
        normalized = self.normalize(sensor_data)

        # Detect anomalies using statistical models
        anomalies = self.detect_anomalies(normalized)

        # Enrich each anomaly with vehicle context (model, mileage, service history)
        enriched = self.add_vehicle_context(vehicle_id, anomalies)

        return enriched
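As a concrete sketch of the normalization step, here is one way to map raw ADC-style readings into physical units. The calibration table, sensor names, and constants below are illustrative assumptions, not actual Stellantis values:

```python
# Per-sensor calibration: raw name -> (normalized name, offset, scale).
# These constants are made up for illustration.
CALIBRATION = {
    "coolant_temp_raw": ("coolant_temp_c", -40.0, 0.75),   # ADC counts -> degrees C
    "oil_pressure_raw": ("oil_pressure_bar", 0.0, 0.01),   # ADC counts -> bar
}

def normalize(sensor_data: dict) -> dict:
    """Convert raw sensor readings into consistent physical units."""
    normalized = {}
    for raw_name, raw_value in sensor_data.items():
        if raw_name in CALIBRATION:
            name, offset, scale = CALIBRATION[raw_name]
            normalized[name] = raw_value * scale + offset
        else:
            # Pass through sensors we have no calibration entry for
            normalized[raw_name] = raw_value
    return normalized
```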

Component 2: Knowledge Retrieval

We indexed technical manuals, repair procedures, and historical diagnostics in a vector database:

  • 50,000+ repair procedures
  • 2,000+ diagnostic workflows
  • Historical fault patterns from similar vehicles

def retrieve_repair_context(anomaly: Anomaly) -> RepairContext:
    # Create semantic query from anomaly
    query = f"""
    Vehicle: {anomaly.make} {anomaly.model}
    Error Code: {anomaly.code}
    Symptoms: {anomaly.description}
    Sensor Readings: {anomaly.sensor_values}
    """

    # Retrieve relevant procedures
    procedures = vector_db.search(
        query=query,
        top_k=5,
        filters={"vehicle_model": anomaly.model}
    )

    return RepairContext(procedures)
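To make the retrieval step concrete without depending on a FAISS deployment, here is a minimal pure-Python stand-in for the filtered similarity search, assuming documents are stored as plain embedding vectors with a `vehicle_model` field (the corpus shape is an assumption for illustration):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def search(query_vec, corpus, top_k=5, model_filter=None):
    """Filtered top-k search: apply the metadata filter first,
    then rank the remaining documents by embedding similarity."""
    candidates = [
        doc for doc in corpus
        if model_filter is None or doc["vehicle_model"] == model_filter
    ]
    ranked = sorted(
        candidates,
        key=lambda d: cosine_similarity(query_vec, d["embedding"]),
        reverse=True,
    )
    return ranked[:top_k]
```

In production the ranking would be delegated to the FAISS index; this sketch only shows the filter-then-rank shape of the query.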

Component 3: LLM Reasoning

The LLM synthesizes sensor data and retrieved knowledge into natural language diagnostics:

def generate_diagnosis(anomaly: Anomaly, context: RepairContext) -> Diagnosis:
    prompt = f"""
    You are an automotive diagnostic expert. Based on the following information:

    Anomaly Details:
    {anomaly.to_dict()}

    Relevant Repair Procedures:
    {context.format_procedures()}

    Provide:
    1. Root cause analysis
    2. Recommended repair steps
    3. Expected repair time
    4. Confidence level
    """

    diagnosis = llm.generate(prompt, temperature=0.3)

    return Diagnosis(diagnosis)
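Since the prompt asks the model for four numbered sections, the raw response has to be split back into structured fields before it reaches the dashboard. A minimal parser for that format might look like this (the section keys and the assumption that the model follows the "1. … 4." layout are illustrative; real responses need more defensive handling):

```python
import re

# Keys mirror the four items requested in the prompt above
SECTION_KEYS = ["root_cause", "repair_steps", "repair_time", "confidence"]

def parse_diagnosis(raw: str) -> dict:
    """Split an LLM response of the form '1. ...\n2. ...\n3. ...\n4. ...'
    into a dict keyed by diagnostic section."""
    # Split on the numbered headings; prepend '\n' so '1.' at the
    # start of the string is also matched
    parts = re.split(r"\n\s*\d\.\s*", "\n" + raw)
    # parts[0] is any preamble before '1.'; the rest map to sections
    body = parts[1:1 + len(SECTION_KEYS)]
    return {key: text.strip() for key, text in zip(SECTION_KEYS, body)}
```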

Technical Stack

  • AWS SageMaker: Model hosting and inference
  • Vector Database: FAISS for embedding search
  • AWS CDK: Infrastructure as code
  • Python: Backend processing
  • AWS Bedrock: LLM inference

Results & Impact

After six months of deployment across pilot garages:

Efficiency Gains

  • 24.5% faster troubleshooting
  • 72.8% anomaly detection accuracy
  • €1.1M annual savings in warranty claims

Technician Feedback

"VectraDrive turned hours of manual searching into seconds of clear guidance. The natural language explanations make complex diagnostics accessible." — Senior Technician, Stellantis Service Center

Key Challenges & Solutions

Challenge 1: False Positives

Problem: The initial system flagged too many benign sensor variations

Solution:

  • Implemented statistical thresholds based on normal operating ranges
  • Added vehicle-specific baselines
  • Incorporated driving conditions context
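The vehicle-specific baselines can be sketched as a simple store that learns each model's normal operating range and only flags readings well outside it. Class and field names here are illustrative; the 30-reading minimum and 3-sigma threshold are assumptions, not our tuned values:

```python
from collections import defaultdict
from statistics import mean, stdev

class BaselineStore:
    """Per-(model, sensor) operating baselines used to suppress
    false positives from benign variation."""

    def __init__(self, min_history=30):
        self.readings = defaultdict(list)
        self.min_history = min_history

    def record(self, model: str, sensor: str, value: float):
        """Accumulate a normal-operation reading for this model/sensor."""
        self.readings[(model, sensor)].append(value)

    def is_anomalous(self, model: str, sensor: str, value: float, k=3.0) -> bool:
        """Flag only readings more than k standard deviations from
        the model-specific baseline; never flag on thin history."""
        history = self.readings[(model, sensor)]
        if len(history) < self.min_history:
            return False  # too little data to judge: don't flag
        mu, sigma = mean(history), stdev(history)
        return sigma > 0 and abs(value - mu) > k * sigma
```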

Challenge 2: Trust & Adoption

Problem: Technicians were skeptical of "AI recommendations"

Solution:

  • Always show retrieved source documents
  • Provide confidence scores
  • Enable feedback loops for continuous improvement
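The feedback loop boils down to capturing a structured verdict from the technician after each repair and tracking how often diagnoses were accepted. This is a minimal sketch; the record fields are assumptions about what such a loop might store:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class DiagnosisFeedback:
    """A technician's verdict on one diagnosis, filed after the repair.
    Used downstream to re-rank retrieval and tune detection thresholds."""
    diagnosis_id: str
    was_correct: bool
    actual_root_cause: str = ""
    submitted_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

def acceptance_rate(feedback: list) -> float:
    """Fraction of diagnoses technicians confirmed as correct."""
    if not feedback:
        return 0.0
    return sum(f.was_correct for f in feedback) / len(feedback)
```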

Challenge 3: Latency

Problem: Real-time diagnostics required response times under 5 seconds

Solution:

# Optimize with caching and pre-computation
class CachedDiagnostics:
    def __init__(self):
        self.cache = RedisCache()
        self.precomputed_patterns = self.load_common_patterns()

    def diagnose(self, anomaly: Anomaly) -> Diagnosis:
        # Check cache first
        cache_key = anomaly.generate_key()
        if cached := self.cache.get(cache_key):
            return cached

        # Use pre-computed patterns for common issues
        if pattern := self.match_pattern(anomaly):
            return pattern.diagnosis

        # Fall back to LLM for complex cases
        return self.llm_diagnose(anomaly)
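For the cache to actually hit, `generate_key` has to map near-identical anomalies to the same key, e.g. by bucketing continuous sensor values before hashing. One way to sketch this (the bucket size and anomaly fields are assumptions):

```python
import hashlib
import json

def generate_key(anomaly: dict, bucket: float = 5.0) -> str:
    """Build a stable cache key: round sensor values into buckets so
    that near-identical anomalies share a cache entry."""
    stable = {
        "model": anomaly["model"],
        "code": anomaly["code"],
        "sensors": {
            name: round(value / bucket) * bucket
            for name, value in sorted(anomaly["sensor_values"].items())
        },
    }
    # sort_keys guarantees the same dict always serializes identically
    payload = json.dumps(stable, sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()
```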

Architecture Patterns

RAG Pipeline Design

┌─────────────┐
│   Sensors   │
└──────┬──────┘
       │
       ▼
┌─────────────────┐
│   Anomaly       │
│   Detection     │
└──────┬──────────┘
       │
       ▼
┌─────────────────┐      ┌──────────────┐
│   Vector DB     │◄─────┤   Manuals    │
│   Retrieval     │      │   Procedures │
└──────┬──────────┘      └──────────────┘
       │
       ▼
┌─────────────────┐
│   LLM           │
│   Synthesis     │
└──────┬──────────┘
       │
       ▼
┌─────────────────┐
│   Technician    │
│   Dashboard     │
└─────────────────┘
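The pipeline above can be sketched as a thin orchestrator that wires the stages together. All collaborators are injected and their names are illustrative, not the actual VectraDrive interfaces:

```python
def run_pipeline(vehicle_id, sensor_data, processor, retrieve, diagnose, dashboard):
    """End-to-end flow matching the diagram: sensors -> anomaly
    detection -> vector-DB retrieval -> LLM synthesis -> dashboard."""
    results = []
    for anomaly in processor.process_stream(vehicle_id, sensor_data):
        context = retrieve(anomaly)             # knowledge retrieval
        diagnosis = diagnose(anomaly, context)  # LLM synthesis
        dashboard.publish(vehicle_id, diagnosis)
        results.append(diagnosis)
    return results
```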

Lessons Learned

1. Domain Expertise is Critical

LLMs augment, not replace, automotive knowledge. Our best results came from:

  • Close collaboration with master technicians
  • Incorporating tribal knowledge into embeddings
  • Continuous validation against real repair outcomes

2. Explainability Builds Trust

Technicians need to understand why the system recommends specific actions:

class ExplainableDiagnosis:
    def explain(self, diagnosis: Diagnosis) -> Explanation:
        return Explanation(
            root_cause=diagnosis.cause,
            confidence=diagnosis.confidence,
            evidence=diagnosis.retrieved_docs,
            similar_cases=diagnosis.historical_matches,
            reasoning_steps=diagnosis.llm_chain_of_thought
        )

3. Start Small, Scale Gradually

We began with:

  1. Single vehicle model (most common in fleet)
  2. Most frequent error codes (80/20 rule)
  3. Pilot garage (friendly technicians)
  4. Gradual expansion based on feedback

Future Roadmap

We're exploring:

  • Predictive maintenance: Forecasting failures before they occur
  • Multi-modal inputs: Incorporating photos, audio from engines
  • Technician AR integration: Overlaying repair guidance in real-time
  • Fleet-wide pattern analysis: Learning from all connected vehicles

Conclusion

Retrieval-augmented LLMs bridge the gap between raw sensor data and actionable maintenance insights. By combining statistical anomaly detection, semantic knowledge retrieval, and natural language synthesis, VectraDrive empowers technicians to diagnose issues faster and more accurately.

The key is treating LLMs as reasoning engines that augment—not replace—human expertise.


Explore the code: GitHub Repository

Read more: Medium Article

Connect: LinkedIn

Enjoyed this article?

Check out more technical deep dives on AI systems, or connect with me to discuss your AI initiatives.