Fraud Detection Playbooks for National Rail
Fare evasion costs France's national railway €300M+ annually. For SNCF's TER division, reducing fraud while maintaining passenger experience required sophisticated ML systems, careful experimentation, and cross-functional alignment.
This article details how RailGuard Fraud Studio reduced fraud rates from 7.2% to 5.9% through predictive analytics and interpretable dashboards.
The Business Context
The Challenge
SNCF TER (regional trains) faces unique fraud challenges:
- Open-access platforms: No ticket gates in most stations
- High passenger volume: Millions of journeys monthly
- Limited inspectors: Can't check every passenger
- Customer experience: Must balance enforcement with service quality
The Opportunity
Existing fraud detection relied on:
- Manual inspector intuition
- Random checking patterns
- Reactive rather than predictive approaches
Goal: Build ML systems to predict high-risk journeys and optimize inspector deployment.
Solution Architecture
Phase 1: Data Foundation
We consolidated data from multiple sources:
```python
from pandas import DataFrame

class FraudDataPipeline:
    def __init__(self):
        self.sources = {
            'ticket_sales': TicketDatabase(),
            'inspection_logs': InspectionSystem(),
            'passenger_patterns': AnalyticsDB(),
            'train_schedules': ScheduleAPI()
        }

    def build_training_data(self, start_date: str, end_date: str) -> DataFrame:
        # Merge ticket sales with inspection outcomes
        tickets = self.sources['ticket_sales'].query(start_date, end_date)
        inspections = self.sources['inspection_logs'].query(start_date, end_date)

        # Join and create labels
        merged = tickets.merge(inspections, on='journey_id', how='left')
        # Only inspected journeys have a known outcome; drop the rest so
        # uninspected rows are not silently labelled as non-fraud
        merged = merged.dropna(subset=['valid_ticket'])
        merged['fraud'] = ~merged['valid_ticket'].astype(bool)

        # Feature engineering
        features = self.engineer_features(merged)
        return features
```
Phase 2: Feature Engineering
We identified key fraud indicators:
Temporal Features:
- Time of day (late evening = higher risk)
- Day of week (weekends = different patterns)
- Holiday periods
Journey Features:
- Route popularity
- Ticket purchase timing (last-minute = higher risk)
- Purchase channel (station vs online)
Behavioral Features:
- Historical pattern analysis
- Passenger frequency indicators
- Unusual travel patterns
```python
from pandas import DataFrame

def engineer_features(df: DataFrame) -> DataFrame:
    features = df.copy()

    # Temporal
    features['hour'] = features['departure_time'].dt.hour
    features['is_weekend'] = features['departure_time'].dt.dayofweek >= 5
    # Compare on the calendar date, not the full timestamp
    features['is_holiday'] = features['departure_time'].dt.date.isin(HOLIDAYS)

    # Journey
    features['route_popularity'] = (
        features.groupby('route')['journey_id'].transform('count')
    )
    features['minutes_before_departure'] = (
        features['departure_time'] - features['purchase_time']
    ).dt.total_seconds() / 60

    # Risk scoring
    features['purchase_channel_risk'] = features['channel'].map(CHANNEL_RISK_SCORES)
    return features
```
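The snippet above covers the temporal and journey features; the behavioral features listed earlier (passenger frequency, unusual travel patterns) can be sketched in plain Python. The record schema here (`passenger_id`, `route` keys) is illustrative, not the production one:

```python
from collections import Counter, defaultdict

def behavioral_features(journeys):
    """Derive per-passenger history features from a list of journey records."""
    # How often each passenger travels overall
    trip_counts = Counter(j['passenger_id'] for j in journeys)

    # Per-passenger route histogram, to identify each passenger's usual route
    route_counts = defaultdict(Counter)
    for j in journeys:
        route_counts[j['passenger_id']][j['route']] += 1

    enriched = []
    for j in journeys:
        pid = j['passenger_id']
        usual_route, _ = route_counts[pid].most_common(1)[0]
        enriched.append({
            **j,
            # Frequent travellers tend to have a different risk profile
            'passenger_trip_count': trip_counts[pid],
            # A journey off the passenger's usual route is an anomaly signal
            'is_unusual_route': j['route'] != usual_route,
        })
    return enriched
```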
Phase 3: Model Development
We experimented with multiple approaches:
Random Forest (Baseline)
```python
from sklearn.ensemble import RandomForestClassifier

rf_model = RandomForestClassifier(
    n_estimators=100,
    max_depth=10,
    class_weight='balanced',  # Handle imbalanced data
    random_state=42
)
rf_model.fit(X_train, y_train)
```
Results: AUC-ROC 0.83, good interpretability
XGBoost (Production Model)
```python
import xgboost as xgb

xgb_model = xgb.XGBClassifier(
    n_estimators=200,
    max_depth=8,
    learning_rate=0.1,
    scale_pos_weight=20,  # Upweight the rare fraud class (~7% of journeys)
    eval_metric='auc',
    early_stopping_rounds=10  # Constructor argument since XGBoost 1.6
)
xgb_model.fit(
    X_train, y_train,
    eval_set=[(X_val, y_val)]
)
```
Results: AUC-ROC 0.87, selected for production
LSTM for Time Series Patterns
```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense, Dropout

lstm_model = Sequential([
    LSTM(64, return_sequences=True, input_shape=(sequence_length, n_features)),
    Dropout(0.3),
    LSTM(32),
    Dropout(0.3),
    Dense(16, activation='relu'),
    Dense(1, activation='sigmoid')
])
lstm_model.compile(
    optimizer='adam',
    loss='binary_crossentropy',
    metrics=['AUC']
)
```
Results: Best for temporal patterns, used in ensemble
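The ensemble itself isn't shown above; a minimal sketch is a weighted average of the XGBoost and LSTM fraud probabilities, with the weight tuned on the validation set (the 0.7/0.3 split here is illustrative, not the production value):

```python
def ensemble_score(xgb_prob: float, lstm_prob: float, w_xgb: float = 0.7) -> float:
    """Blend the tabular model's and the sequence model's fraud probabilities.

    A convex combination keeps the output a valid probability in [0, 1].
    """
    return w_xgb * xgb_prob + (1 - w_xgb) * lstm_prob
```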
Phase 4: Volume Forecasting
Beyond fraud detection, we built forecasting models to predict:
- Expected passenger volume per route
- Optimal inspector allocation
```python
from sklearn.ensemble import GradientBoostingRegressor

volume_forecaster = GradientBoostingRegressor(
    n_estimators=150,
    learning_rate=0.05,
    max_depth=6
)
volume_forecaster.fit(X_train, y_train_volume)
```
Results: RMSE 14.7% on volume forecasting
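One way to turn these forecasts into an inspector deployment plan is largest-remainder apportionment over risk-weighted volume. The schema and exposure definition below are an illustrative sketch, not the production allocation logic:

```python
def allocate_inspectors(route_forecasts, n_inspectors):
    """Split a fixed inspector pool across routes in proportion to exposure.

    route_forecasts: dict mapping route -> (expected_volume, fraud_prob).
    Uses largest-remainder apportionment so counts sum exactly to the pool size.
    """
    # Expected fraudulent passengers per route = forecast volume x fraud rate
    exposure = {r: volume * p for r, (volume, p) in route_forecasts.items()}
    total = sum(exposure.values())

    # Fractional "fair share" of inspectors per route
    quotas = {r: n_inspectors * e / total for r, e in exposure.items()}
    allocation = {r: int(q) for r, q in quotas.items()}

    # Hand the leftover inspectors to the routes with the largest remainders
    leftover = n_inspectors - sum(allocation.values())
    for r in sorted(quotas, key=lambda r: quotas[r] - allocation[r], reverse=True)[:leftover]:
        allocation[r] += 1
    return allocation
```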
Deployment & Integration
Inspector Dashboard
We built an interpretable dashboard that ranks trains for inspection:
```python
from typing import List

class InspectorDashboard:
    def get_high_risk_journeys(self, date: str, inspector_zone: str) -> List[Journey]:
        # Get scheduled trains
        trains = self.get_trains(date, inspector_zone)

        # Predict fraud probability for each
        predictions = []
        for train in trains:
            features = self.extract_features(train)
            fraud_prob = self.model.predict_proba(features)[0][1]
            expected_volume = self.volume_model.predict(features)
            predictions.append({
                'train_id': train.id,
                'route': train.route,
                'departure': train.departure_time,
                'fraud_probability': fraud_prob,
                'expected_volume': expected_volume,
                'priority_score': self.calculate_priority(fraud_prob, expected_volume)
            })

        # Sort by priority
        return sorted(predictions, key=lambda x: x['priority_score'], reverse=True)
```
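`calculate_priority` is not defined above; a natural choice (an assumption on our part, not the production formula) is the expected number of fraudulent passengers per train, so that a busy train outranks a quiet one at the same fraud rate:

```python
def calculate_priority(fraud_probability: float, expected_volume: float) -> float:
    """Expected count of fraudulent passengers: fraud rate x expected volume.

    Ranking by expected catches rather than raw probability steers inspectors
    toward trains where a sweep is likely to surface the most fraud.
    """
    return fraud_probability * expected_volume
```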
Model Monitoring
```python
from typing import List

import numpy as np
from sklearn.metrics import precision_score, recall_score, roc_auc_score

class FraudModelMonitor:
    def track_performance(self, predictions: List, actuals: List):
        preds = np.asarray(predictions)

        # Calculate metrics (binary metrics thresholded at 0.5)
        metrics = {
            'auc_roc': roc_auc_score(actuals, preds),
            'precision': precision_score(actuals, preds > 0.5),
            'recall': recall_score(actuals, preds > 0.5),
            'false_positive_rate': self.calculate_fpr(actuals, preds)
        }

        # Alert if degradation
        if metrics['auc_roc'] < 0.80:
            self.send_alert("Model performance degraded")

        # Log for tracking
        self.log_metrics(metrics)
```
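Beyond accuracy metrics, drift in the score distribution itself is worth tracking. One common choice (not part of the monitor shown above) is the Population Stability Index between training-time and production scores, sketched here in pure Python with ten equal-width buckets:

```python
import math

def population_stability_index(expected, actual, bins=10):
    """PSI between two score samples in [0, 1]; > 0.2 is a common drift alarm."""
    edges = [i / bins for i in range(bins + 1)]

    def bucket_fraction(scores, lo, hi):
        # Right edge is inclusive only for the last bucket
        n = sum(1 for s in scores if lo <= s < hi or (hi == 1.0 and s == 1.0))
        return max(n / len(scores), 1e-6)  # floor avoids log(0) for empty buckets

    psi = 0.0
    for lo, hi in zip(edges, edges[1:]):
        e = bucket_fraction(expected, lo, hi)
        a = bucket_fraction(actual, lo, hi)
        psi += (a - e) * math.log(a / e)
    return psi
```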
Results & Impact
Quantitative Outcomes
- ✅ Fraud rate reduced: 7.2% → 5.9% (18% relative reduction)
- ✅ Model accuracy: AUC-ROC above 0.87 for risk scoring
- ✅ Forecast accuracy: RMSE 14.7% on volume predictions
- ✅ Inspector efficiency: 32% more fraudulent tickets caught per inspector
Qualitative Outcomes
Inspector Feedback:
"The dashboard transformed how we work. Instead of guessing which trains to check, we have data-driven priorities." — SNCF Inspector Team Lead
Stakeholder Alignment:
- Product owners understood model decisions through feature importance
- Inspectors trusted the system due to explainability
- Legal teams approved due to audit trails
Technical Stack
- Python: Data processing and modeling
- XGBoost: Primary fraud detection model
- LSTM (TensorFlow): Temporal pattern analysis
- Scikit-learn: Volume forecasting
- PostgreSQL: Data warehouse
- Streamlit: Inspector dashboard
- MLflow: Experiment tracking
Key Lessons
1. Class Imbalance Matters
With fraud at ~7% of journeys, a naive model that always predicts "no fraud" scores 93% accuracy while catching nothing:
```python
# Solution: Balanced sampling + calibrated thresholds
from imblearn.over_sampling import SMOTE

smote = SMOTE(sampling_strategy=0.3)
X_resampled, y_resampled = smote.fit_resample(X_train, y_train)

# Custom threshold tuning
optimal_threshold = find_optimal_threshold(
    y_val, y_pred_proba,
    target_metric='f1'  # Balance precision/recall
)
```
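`find_optimal_threshold` is left undefined above. A minimal sketch, assuming the validation labels and scores fit in memory and only the `'f1'` target is needed, is a brute-force sweep over candidate thresholds:

```python
def _f1(tp, fp, fn):
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0

def find_optimal_threshold(y_true, y_prob, target_metric='f1'):
    """Sweep thresholds in (0, 1) and keep the one maximising validation F1.

    Only the 'f1' target is sketched here; y_true holds 0/1 labels and
    y_prob the model's predicted fraud probabilities.
    """
    best_threshold, best_score = 0.5, -1.0
    for t in (i / 100 for i in range(1, 100)):
        tp = sum(1 for p, y in zip(y_prob, y_true) if p >= t and y)
        fp = sum(1 for p, y in zip(y_prob, y_true) if p >= t and not y)
        fn = sum(1 for p, y in zip(y_prob, y_true) if p < t and y)
        score = _f1(tp, fp, fn)
        if score > best_score:
            best_threshold, best_score = t, score
    return best_threshold
```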
2. Interpretability Builds Trust
Feature importance helped inspectors understand predictions:
```python
import shap

explainer = shap.TreeExplainer(xgb_model)
shap_values = explainer.shap_values(X_test)

# Show top features for a specific prediction
shap.force_plot(explainer.expected_value, shap_values[0], X_test.iloc[0])
```
3. Change Management is Critical
Technical excellence means nothing without adoption:
- Training sessions with inspector teams
- Pilot program with friendly zones first
- Feedback loops for continuous improvement
- Success stories shared across teams
Future Enhancements
We're exploring:
- Real-time prediction: Moving from batch to stream processing
- Multi-modal data: Incorporating CCTV and turnstile data
- Anomaly detection: Identifying new fraud patterns automatically
- Reinforcement learning: Optimizing inspector patrol routes dynamically
Conclusion
Fraud detection at national scale requires more than just accurate models. It demands:
- Thoughtful feature engineering grounded in domain expertise
- Interpretable models that stakeholders can trust
- Practical deployment that fits existing workflows
- Continuous monitoring and improvement
RailGuard Fraud Studio demonstrates that ML can deliver real business value when combined with rigorous experimentation and change management.
Code & Notebooks: GitHub Repository
Full Article: Medium
Let's Connect: LinkedIn