How Retrieval-Augmented LLMs Transform Fleet Diagnostics

January 10, 2025 · 7 min read · By Jaafar Benabderrazak

A field report on pairing sensor data with knowledge bases to power explainable maintenance copilots for connected vehicles.

Connected vehicles generate terabytes of sensor data daily. The challenge isn't collecting this data—it's transforming it into actionable maintenance insights that technicians can understand and trust. This article explores how we built VectraDrive Diagnostics at Stellantis R&D.

The Problem Space

Modern vehicles have hundreds of sensors monitoring everything from engine temperature to tire pressure. When something goes wrong:

  • Technicians receive cryptic error codes
  • Manuals span thousands of pages
  • Diagnosis takes hours of investigative work
  • Misdiagnosis costs €1.1M annually in warranty claims

Solution Architecture

VectraDrive combines real-time telemetry with LLM-powered reasoning to provide explainable diagnostics.

Component 1: Telemetry Ingestion

from typing import Dict, List

class TelemetryProcessor:
    def process_stream(self, vehicle_id: str, sensor_data: Dict) -> List[Anomaly]:
        # Normalize sensor readings into consistent units
        normalized = self.normalize(sensor_data)

        # Detect anomalies using statistical models
        anomalies = self.detect_anomalies(normalized)

        # Enrich each anomaly with vehicle context (model, mileage, service history)
        enriched = self.add_vehicle_context(vehicle_id, anomalies)

        return enriched
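As a concrete sketch of the normalization step, here is one way to map raw ADC-style readings into physical units. The calibration table, sensor names, and constants below are illustrative assumptions, not actual Stellantis values:

```python
# Per-sensor calibration: raw name -> (normalized name, offset, scale).
# These constants are made up for illustration.
CALIBRATION = {
    "coolant_temp_raw": ("coolant_temp_c", -40.0, 0.75),   # ADC counts -> degrees C
    "oil_pressure_raw": ("oil_pressure_bar", 0.0, 0.01),   # ADC counts -> bar
}

def normalize(sensor_data: dict) -> dict:
    """Convert raw sensor readings into consistent physical units."""
    normalized = {}
    for raw_name, raw_value in sensor_data.items():
        if raw_name in CALIBRATION:
            name, offset, scale = CALIBRATION[raw_name]
            normalized[name] = raw_value * scale + offset
        else:
            # Pass through sensors we have no calibration entry for
            normalized[raw_name] = raw_value
    return normalized
```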

Component 2: Knowledge Retrieval

We indexed technical manuals, repair procedures, and historical diagnostics in a vector database:

  • 50,000+ repair procedures
  • 2,000+ diagnostic workflows
  • Historical fault patterns from similar vehicles

def retrieve_repair_context(anomaly: Anomaly) -> RepairContext:
    # Create semantic query from anomaly
    query = f"""
    Vehicle: {anomaly.make} {anomaly.model}
    Error Code: {anomaly.code}
    Symptoms: {anomaly.description}
    Sensor Readings: {anomaly.sensor_values}
    """

    # Retrieve relevant procedures
    procedures = vector_db.search(
        query=query,
        top_k=5,
        filters={"vehicle_model": anomaly.model}
    )

    return RepairContext(procedures)
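To make the retrieval step concrete without depending on a FAISS deployment, here is a minimal pure-Python stand-in for the filtered similarity search, assuming documents are stored as plain embedding vectors with a `vehicle_model` field (the corpus shape is an assumption for illustration):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def search(query_vec, corpus, top_k=5, model_filter=None):
    """Filtered top-k search: apply the metadata filter first,
    then rank the remaining documents by embedding similarity."""
    candidates = [
        doc for doc in corpus
        if model_filter is None or doc["vehicle_model"] == model_filter
    ]
    ranked = sorted(
        candidates,
        key=lambda d: cosine_similarity(query_vec, d["embedding"]),
        reverse=True,
    )
    return ranked[:top_k]
```

In production the ranking would be delegated to the FAISS index; this sketch only shows the filter-then-rank shape of the query.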

Component 3: LLM Reasoning

The LLM synthesizes sensor data and retrieved knowledge into natural language diagnostics:

def generate_diagnosis(anomaly: Anomaly, context: RepairContext) -> Diagnosis:
    prompt = f"""
    You are an automotive diagnostic expert. Based on the following information:

    Anomaly Details:
    {anomaly.to_dict()}

    Relevant Repair Procedures:
    {context.format_procedures()}

    Provide:
    1. Root cause analysis
    2. Recommended repair steps
    3. Expected repair time
    4. Confidence level
    """

    diagnosis = llm.generate(prompt, temperature=0.3)

    return Diagnosis(diagnosis)
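Since the prompt asks the model for four numbered sections, the raw response has to be split back into structured fields before it reaches the dashboard. A minimal parser for that format might look like this (the section keys and the assumption that the model follows the "1. … 4." layout are illustrative; real responses need more defensive handling):

```python
import re

# Keys mirror the four items requested in the prompt above
SECTION_KEYS = ["root_cause", "repair_steps", "repair_time", "confidence"]

def parse_diagnosis(raw: str) -> dict:
    """Split an LLM response of the form '1. ...\n2. ...\n3. ...\n4. ...'
    into a dict keyed by diagnostic section."""
    # Split on the numbered headings; prepend '\n' so '1.' at the
    # start of the string is also matched
    parts = re.split(r"\n\s*\d\.\s*", "\n" + raw)
    # parts[0] is any preamble before '1.'; the rest map to sections
    body = parts[1:1 + len(SECTION_KEYS)]
    return {key: text.strip() for key, text in zip(SECTION_KEYS, body)}
```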

Technical Stack

  • AWS SageMaker: Model hosting and inference
  • Vector Database: FAISS for embedding search
  • AWS CDK: Infrastructure as code
  • Python: Backend processing
  • AWS Bedrock: LLM inference

Results & Impact

After six months of deployment across pilot garages:

Efficiency Gains

  • 24.5% faster troubleshooting
  • 72.8% anomaly detection accuracy
  • €1.1M annual savings in warranty claims

Technician Feedback

"VectraDrive turned hours of manual searching into seconds of clear guidance. The natural language explanations make complex diagnostics accessible." — Senior Technician, Stellantis Service Center

Key Challenges & Solutions

Challenge 1: False Positives

Problem: The initial system flagged too many benign sensor variations

Solution:

  • Implemented statistical thresholds based on normal operating ranges
  • Added vehicle-specific baselines
  • Incorporated driving conditions context
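The vehicle-specific baselines can be sketched as a simple store that learns each model's normal operating range and only flags readings well outside it. Class and field names here are illustrative; the 30-reading minimum and 3-sigma threshold are assumptions, not our tuned values:

```python
from collections import defaultdict
from statistics import mean, stdev

class BaselineStore:
    """Per-(model, sensor) operating baselines used to suppress
    false positives from benign variation."""

    def __init__(self, min_history=30):
        self.readings = defaultdict(list)
        self.min_history = min_history

    def record(self, model: str, sensor: str, value: float):
        """Accumulate a normal-operation reading for this model/sensor."""
        self.readings[(model, sensor)].append(value)

    def is_anomalous(self, model: str, sensor: str, value: float, k=3.0) -> bool:
        """Flag only readings more than k standard deviations from
        the model-specific baseline; never flag on thin history."""
        history = self.readings[(model, sensor)]
        if len(history) < self.min_history:
            return False  # too little data to judge: don't flag
        mu, sigma = mean(history), stdev(history)
        return sigma > 0 and abs(value - mu) > k * sigma
```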

Challenge 2: Trust & Adoption

Problem: Technicians were skeptical of "AI recommendations"

Solution:

  • Always show retrieved source documents
  • Provide confidence scores
  • Enable feedback loops for continuous improvement
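The feedback loop boils down to capturing a structured verdict from the technician after each repair and tracking how often diagnoses were accepted. This is a minimal sketch; the record fields are assumptions about what such a loop might store:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class DiagnosisFeedback:
    """A technician's verdict on one diagnosis, filed after the repair.
    Used downstream to re-rank retrieval and tune detection thresholds."""
    diagnosis_id: str
    was_correct: bool
    actual_root_cause: str = ""
    submitted_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

def acceptance_rate(feedback: list) -> float:
    """Fraction of diagnoses technicians confirmed as correct."""
    if not feedback:
        return 0.0
    return sum(f.was_correct for f in feedback) / len(feedback)
```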

Challenge 3: Latency

Problem: Real-time diagnostics required response times under 5 seconds

Solution:

# Optimize with caching and pre-computation
class CachedDiagnostics:
    def __init__(self):
        self.cache = RedisCache()
        self.precomputed_patterns = self.load_common_patterns()

    def diagnose(self, anomaly: Anomaly) -> Diagnosis:
        # Check cache first
        cache_key = anomaly.generate_key()
        if cached := self.cache.get(cache_key):
            return cached

        # Use pre-computed patterns for common issues
        if pattern := self.match_pattern(anomaly):
            return pattern.diagnosis

        # Fall back to LLM for complex cases
        return self.llm_diagnose(anomaly)
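For the cache to actually hit, `generate_key` has to map near-identical anomalies to the same key, e.g. by bucketing continuous sensor values before hashing. One way to sketch this (the bucket size and anomaly fields are assumptions):

```python
import hashlib
import json

def generate_key(anomaly: dict, bucket: float = 5.0) -> str:
    """Build a stable cache key: round sensor values into buckets so
    that near-identical anomalies share a cache entry."""
    stable = {
        "model": anomaly["model"],
        "code": anomaly["code"],
        "sensors": {
            name: round(value / bucket) * bucket
            for name, value in sorted(anomaly["sensor_values"].items())
        },
    }
    # sort_keys guarantees the same dict always serializes identically
    payload = json.dumps(stable, sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()
```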

Architecture Patterns

RAG Pipeline Design

┌─────────────┐
│   Sensors   │
└──────┬──────┘
       │
       ▼
┌─────────────────┐
│   Anomaly       │
│   Detection     │
└──────┬──────────┘
       │
       ▼
┌─────────────────┐      ┌──────────────┐
│   Vector DB     │◄─────┤   Manuals    │
│   Retrieval     │      │   Procedures │
└──────┬──────────┘      └──────────────┘
       │
       ▼
┌─────────────────┐
│   LLM           │
│   Synthesis     │
└──────┬──────────┘
       │
       ▼
┌─────────────────┐
│   Technician    │
│   Dashboard     │
└─────────────────┘
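The pipeline above can be sketched as a thin orchestrator that wires the stages together. All collaborators are injected and their names are illustrative, not the actual VectraDrive interfaces:

```python
def run_pipeline(vehicle_id, sensor_data, processor, retrieve, diagnose, dashboard):
    """End-to-end flow matching the diagram: sensors -> anomaly
    detection -> vector-DB retrieval -> LLM synthesis -> dashboard."""
    results = []
    for anomaly in processor.process_stream(vehicle_id, sensor_data):
        context = retrieve(anomaly)             # knowledge retrieval
        diagnosis = diagnose(anomaly, context)  # LLM synthesis
        dashboard.publish(vehicle_id, diagnosis)
        results.append(diagnosis)
    return results
```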

Lessons Learned

1. Domain Expertise is Critical

LLMs augment, not replace, automotive knowledge. Our best results came from:

  • Close collaboration with master technicians
  • Incorporating tribal knowledge into embeddings
  • Continuous validation against real repair outcomes

2. Explainability Builds Trust

Technicians need to understand why the system recommends specific actions:

class ExplainableDiagnosis:
    def explain(self, diagnosis: Diagnosis) -> Explanation:
        return Explanation(
            root_cause=diagnosis.cause,
            confidence=diagnosis.confidence,
            evidence=diagnosis.retrieved_docs,
            similar_cases=diagnosis.historical_matches,
            reasoning_steps=diagnosis.llm_chain_of_thought
        )

3. Start Small, Scale Gradually

We began with:

  1. Single vehicle model (most common in fleet)
  2. Most frequent error codes (80/20 rule)
  3. Pilot garage (friendly technicians)
  4. Gradual expansion based on feedback

Future Roadmap

We're exploring:

  • Predictive maintenance: Forecasting failures before they occur
  • Multi-modal inputs: Incorporating photos, audio from engines
  • Technician AR integration: Overlaying repair guidance in real-time
  • Fleet-wide pattern analysis: Learning from all connected vehicles

Conclusion

Retrieval-augmented LLMs bridge the gap between raw sensor data and actionable maintenance insights. By combining statistical anomaly detection, semantic knowledge retrieval, and natural language synthesis, VectraDrive empowers technicians to diagnose issues faster and more accurately.

The key is treating LLMs as reasoning engines that augment—not replace—human expertise.


Explore the code: GitHub Repository

Read more: Medium Article

Connect: LinkedIn

Enjoyed this article?

Check out more technical deep dives on AI systems, or connect with me to discuss your AI initiatives.