Machine Learning / Data Security

Unsupervised Fraud Detection System

Building an ensemble anomaly detection system that catches sophisticated fraud patterns without labeled training data

🎯 Business Problem

Traditional rule-based fraud detection systems face a fundamental limitation: they can only catch fraud patterns they've been explicitly programmed to recognize. This creates a critical vulnerability to zero-day attacks—novel fraud techniques that bypass existing rule sets.

The Detection Gap

Analysis of historical fraud cases revealed a disturbing pattern:

  • 40% of confirmed fraud passed all existing validation rules
  • 6-8 week lag between first occurrence and rule creation
  • €180K average loss per undetected fraud pattern
  • Manual review overload: 85% of flagged transactions were false positives
"We were always one step behind fraudsters. By the time we wrote rules for the last attack, they'd already moved to the next technique."
— Head of Risk Management

💡 Solution: Unsupervised Ensemble Detection

Instead of relying on labeled fraud examples, I built an unsupervised anomaly detection system that learns normal transaction patterns and flags statistical outliers—regardless of whether they match known fraud signatures.

Why Unsupervised Learning?

  • No Training Labels Required: Fraud is rare (0.17% of transactions), leaving too few labeled examples for reliable supervised training
  • Adaptive to New Patterns: Detects never-before-seen fraud techniques automatically
  • Reduced False Positives: Focuses on statistical deviations instead of rigid rules

Ensemble Architecture

I combined two complementary algorithms to maximize detection coverage:

1. Isolation Forest (Primary Detector)

  • Identifies anomalies by measuring how easily data points can be "isolated" from normal clusters
  • Excels at catching extreme outliers in high-dimensional data
  • Fast training: O(n log n) complexity

2. PCA-Based Reconstruction Error (Secondary Validator)

  • Reduces transaction data to principal components
  • Flags transactions with high reconstruction error (poor fit to normal patterns)
  • Catches subtle multi-feature anomalies that Isolation Forest might miss

Implementation

from sklearn.ensemble import IsolationForest
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler
import numpy as np

# Standardize features so PCA reconstruction error is not
# dominated by large-magnitude features such as raw amounts
X = StandardScaler().fit_transform(transaction_features)

# Isolation Forest: fit_predict returns -1 (anomaly) or 1 (normal)
iso_forest = IsolationForest(
    contamination=0.002,  # slightly above the observed 0.17% fraud rate
    n_estimators=200,
    max_features=10,
    random_state=42
)
iso_scores = iso_forest.fit_predict(X)

# PCA reconstruction: keep components explaining 95% of variance,
# then measure how poorly each transaction maps back
pca = PCA(n_components=0.95)
transformed = pca.fit_transform(X)
reconstructed = pca.inverse_transform(transformed)
reconstruction_error = np.sum((X - reconstructed) ** 2, axis=1)

# Ensemble decision: flag only transactions both detectors agree on
anomaly_threshold = np.percentile(reconstruction_error, 99.5)
final_anomalies = (iso_scores == -1) & (reconstruction_error > anomaly_threshold)

📊 Results & Impact

Detection Performance

  • 0.17% Anomaly Rate: Flagged 850 suspicious transactions from 500,000 total
  • 40% Lift Over Rules: Caught 340 anomalies that passed all existing validation checks
  • 68% Precision: Manual review confirmed 578 of 850 flags as genuine fraud/errors
  • Sub-2-Second Latency: Real-time scoring for incoming transactions

Novel Fraud Patterns Discovered

  1. Micro-Transaction Probing:
    • Detected automated bots making 50+ small transactions to test stolen card validity
    • Pattern: High transaction velocity + low amounts + sequential merchant IDs
  2. Geographic Velocity Violations:
    • Same card used in Portugal and Brazil within 2-hour window (physically impossible)
    • Rules only checked country mismatches, not time-distance feasibility
  3. Behavioral Deviation:
    • Long-term customers suddenly purchasing high-value electronics (account takeover indicator)
    • Model learned typical spending categories per customer segment
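The time-distance feasibility check behind the second pattern can be sketched with a haversine distance. This is a minimal illustration, not the production logic; the helper names, the dict-based transaction records, and the 900 km/h speed cutoff (roughly a commercial flight) are assumptions.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    r = 6371.0  # mean Earth radius in km
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(a))

def is_feasible(prev_txn, curr_txn, max_speed_kmh=900.0):
    """Return False if the implied travel speed between two
    card-present transactions exceeds the assumed cutoff."""
    dist = haversine_km(prev_txn["lat"], prev_txn["lon"],
                        curr_txn["lat"], curr_txn["lon"])
    hours = (curr_txn["ts"] - prev_txn["ts"]) / 3600.0  # ts in seconds
    return hours > 0 and dist / hours <= max_speed_kmh
```

For the Portugal/Brazil case above: Lisbon to São Paulo is roughly 7,900 km, so two transactions two hours apart imply close to 4,000 km/h and fail the check, even though a naive country-mismatch rule sees nothing unusual.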

Business Impact

  • €420K Prevented Loss: Estimated fraud value blocked in first 6 months
  • 72% Faster Investigation: Pre-scored risk levels reduced manual review time
  • Compliance Win: Enhanced PCI-DSS audit scores through proactive fraud controls

🔬 Technical Deep Dive

Feature Engineering

Created 23 behavioral features across 4 categories:

Transaction Characteristics (8 features)

  • Amount (raw + z-score normalized)
  • Transaction hour (cyclical encoding: sin/cos)
  • Merchant category code
  • Currency + cross-border flag
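The cyclical sin/cos encoding of the transaction hour can be sketched as follows (a minimal illustration, not the production feature pipeline):

```python
import numpy as np

def encode_hour_cyclical(hour):
    """Map transaction hour (0-23) onto the unit circle so that
    23:00 and 00:00 end up close together in feature space,
    which a raw 0-23 integer encoding would not capture."""
    radians = 2 * np.pi * np.asarray(hour) / 24.0
    return np.sin(radians), np.cos(radians)
```

With this encoding, the distance between 23:00 and 00:00 is small, while 12:00 sits diametrically opposite midnight, matching the intuition that late-night behaviour is similar on either side of the day boundary.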

Velocity Metrics (6 features)

  • Transactions in last 1h, 24h, 7 days
  • Total spend in last 24h, 7 days
  • Unique merchants in last 7 days
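Rolling velocity features like these can be computed with pandas time-offset windows. A sketch, assuming hypothetical column names 'card_id', 'ts' (datetime), and 'amount':

```python
import pandas as pd

def add_velocity_features(txns: pd.DataFrame) -> pd.DataFrame:
    """Add per-card rolling 24h transaction count and spend.
    Assumes columns 'card_id', 'ts' (datetime64), 'amount'."""
    txns = txns.sort_values("ts").set_index("ts")
    grouped = txns.groupby("card_id")
    # Time-offset rolling windows use each group's DatetimeIndex
    txns["txn_count_24h"] = grouped["amount"].transform(
        lambda s: s.rolling("24h").count())
    txns["spend_24h"] = grouped["amount"].transform(
        lambda s: s.rolling("24h").sum())
    return txns.reset_index()
```

The same pattern extends to the 1h and 7-day windows by changing the offset string, and to unique-merchant counts with a different aggregation.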

Behavioral Patterns (5 features)

  • Deviation from user's average transaction amount
  • Time since last transaction
  • Typical transaction hour consistency score
  • Merchant category diversity index

Geographic Features (4 features)

  • Distance from user's home location
  • Country mismatch with billing address
  • IP geolocation consistency

Model Optimization

Hyperparameter tuning balanced two competing goals, detection coverage versus false-positive load, across three parameters:

  • Contamination Rate: Tested 0.001 to 0.005 (0.002 optimal)
  • n_estimators: Diminishing returns after 200 trees
  • max_samples: The "auto" setting (min(256, n)) provided the best generalization

🚧 Challenges & Solutions

Challenge 1: Defining "Normal"

Problem: Legitimate high-value transactions (e.g., luxury purchases) were flagged as anomalies

Solution: Implemented customer segmentation—separate models for retail vs. premium cardholders
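Per-segment modelling can be sketched roughly as follows; the function names, the segment labels, and the parameters are illustrative, not the production code:

```python
from sklearn.ensemble import IsolationForest
import numpy as np

def fit_segment_models(features, segments, contamination=0.002, seed=42):
    """Train one IsolationForest per customer segment so that
    'normal' is defined relative to each segment's behaviour."""
    models = {}
    for seg in np.unique(segments):
        mask = segments == seg
        model = IsolationForest(contamination=contamination,
                                n_estimators=200, random_state=seed)
        models[seg] = model.fit(features[mask])
    return models

def score_segmented(models, features, segments):
    """Score each row with its own segment's model: -1 anomaly, 1 normal."""
    preds = np.empty(len(features), dtype=int)
    for seg, model in models.items():
        mask = segments == seg
        preds[mask] = model.predict(features[mask])
    return preds
```

A €3,000 purchase that is a clear outlier among retail cardholders can then sit comfortably inside the premium segment's normal region, instead of being judged against one global baseline.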

Challenge 2: Concept Drift

Problem: User behavior changes over time (e.g., summer travel increases geographic diversity)

Solution: Rolling 90-day training window with weekly model retraining
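A minimal sketch of the rolling-window refit, assuming a 'ts' datetime column and a feature-column list (names are illustrative):

```python
import pandas as pd
from sklearn.ensemble import IsolationForest

def retrain_on_window(txns: pd.DataFrame, feature_cols, as_of, days=90):
    """Refit the detector on the trailing `days` of transactions,
    so gradually drifting behaviour (seasonality, travel patterns)
    stays inside the learned definition of 'normal'."""
    cutoff = pd.Timestamp(as_of) - pd.Timedelta(days=days)
    window = txns[(txns["ts"] > cutoff) & (txns["ts"] <= pd.Timestamp(as_of))]
    model = IsolationForest(contamination=0.002, n_estimators=200,
                            random_state=42)
    return model.fit(window[feature_cols])
```

Run weekly (e.g. from a scheduler), each refit quietly absorbs the last three months of behaviour and forgets anything older.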

Challenge 3: Explainability Gap

Problem: Compliance team needed to explain why transactions were flagged

Solution: Added SHAP values to show top contributing features for each anomaly

import shap
import numpy as np
import pandas as pd

# Generate SHAP explanations for a flagged transaction
explainer = shap.TreeExplainer(iso_forest)
shap_values = explainer.shap_values(suspicious_transaction)

# Top 5 anomaly drivers
feature_importance = pd.DataFrame({
    'feature': feature_names,
    'impact': np.abs(shap_values).ravel()
}).sort_values('impact', ascending=False).head(5)

📈 Monitoring & Continuous Improvement

Production Metrics Dashboard

  • Daily Anomaly Rate: Track for sudden spikes (fraud campaigns) or drops (model degradation)
  • False Positive Feedback Loop: Manual reviewers label flagged transactions → retrain model monthly
  • Feature Drift Detection: Alert when feature distributions shift >2 standard deviations
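The >2-standard-deviation drift alert can be sketched as a mean-shift check per feature (a simplified illustration; the production monitor may compare full distributions rather than means):

```python
import numpy as np

def drifted_features(baseline, current, threshold=2.0):
    """Return indices of features whose current mean has moved more
    than `threshold` baseline standard deviations.
    Both arrays are (n_samples, n_features)."""
    base_mean = baseline.mean(axis=0)
    base_std = baseline.std(axis=0)
    base_std = np.where(base_std == 0, 1.0, base_std)  # avoid div-by-zero
    shift = np.abs(current.mean(axis=0) - base_mean) / base_std
    return np.flatnonzero(shift > threshold)
```

Any flagged index triggers an alert to review the upstream feature pipeline before the model silently degrades.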

A/B Testing Results

Controlled experiment: 50% of transactions scored by ensemble, 50% by rules-only

  • 28% more fraud caught in ensemble group
  • 15% reduction in false positive manual reviews
  • Cost-benefit: €4.20 saved per €1 invested in system development

🎓 Key Learnings

What Worked

  • Ensemble Approach: Combining Isolation Forest + PCA caught 23% more anomalies than either alone
  • Feature Engineering Matters: Velocity metrics were 3x more predictive than transaction amount
  • Operational Integration: Embedded in transaction approval flow (not post-hoc analysis)

What I'd Do Differently

  • Start with Simpler Model: The initial Random Forest attempt was overkill; Isolation Forest was simpler and faster
  • Involve Fraud Team Earlier: Their domain knowledge improved feature selection significantly
  • Automate Feedback Loop: Manual labeling is the bottleneck; it should be integrated with the case management system

🚀 Future Roadmap

  1. Graph-Based Fraud Networks: Detect organized fraud rings through transaction graph analysis
  2. Real-Time Feature Streaming: Replace batch ETL with Kafka for sub-100ms scoring
  3. Active Learning Pipeline: Prioritize human review of highest-uncertainty predictions
  4. Multi-Channel Integration: Expand beyond card transactions to ACH, wire transfers, cryptocurrency