Fraud Radar — Real-time card-fraud detection
A production-style fraud-monitoring platform that mirrors how tier-1 financial institutions decide, in milliseconds, whether a card transaction goes through. FastAPI backend with a layered architecture (api → service → repository → models), SQLAlchemy 2.0 + Alembic, Pydantic v2 schemas, and Stripe-pattern Idempotency-Key ingestion. A six-rule deterministic engine fronts an XGBoost classifier trained on 50,010 synthetic transactions across six injected fraud patterns; every decision is explained at inference time via a cached SHAP TreeExplainer. PR-AUC 0.9327 and Recall@1% FPR 0.9785 on a chronological held-out test fold, with service-layer p50 / p95 of 3.7 / 5.8 ms. React 18 + TypeScript + Tailwind dashboard with TanStack Query consumes the same typed schemas the backend emits.
Problem statement
Card fraud costs the global payments industry tens of billions of dollars a year. The platforms that fight it have to score transactions in single-digit milliseconds, produce explanations that hold up under regulatory scrutiny, and maintain immutable audit trails that survive compliance reviews years after the fact. The goal here was to reproduce those constraints — idempotent ingestion, hybrid rules + ML scoring, per-decision explanations, append-only audit log, decimal-precision money — inside a scope one engineer can build and reason about end to end.
Dataset & data
50,010 synthetic transactions across 500 customers and 200 merchants at a 1.52% fraud rate, fully reproducible from seed=42. The generator injects six real-world fraud patterns: card testing (rapid small-value bursts validating stolen numbers), geo-velocity (physically impossible travel), account takeover (long dormancy then a sudden high-value charge), high-amount anomalies, off-hours clustering, and merchant concentration. The feature extractor converts each transaction into a 17-dimensional vector covering amount, time-of-day, geographic mismatch, velocity over rolling windows, customer history, and merchant context.
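One of the feature families listed above, velocity over rolling windows, can be illustrated with a short sketch. It is not the project's FeatureExtractor: the function name `velocity_counts`, the five-minute window, and the sample timestamps are assumptions chosen for illustration. For each transaction on a card, it counts how many earlier transactions on that card fall inside the preceding window, which is the signal a card-testing burst lights up.

```python
from bisect import bisect_left
from datetime import datetime, timedelta


def velocity_counts(timestamps, window):
    """Count, for each transaction, the earlier ones inside `window` before it.

    `timestamps` must be sorted ascending and belong to a single card.
    (Hypothetical helper; the real extractor's interface is not shown.)
    """
    counts = []
    for i, ts in enumerate(timestamps):
        # First index whose timestamp is >= ts - window, searching only
        # among transactions that happened before position i.
        start = bisect_left(timestamps, ts - window, 0, i)
        counts.append(i - start)
    return counts


base = datetime(2024, 1, 1, 12, 0, 0)
# Illustrative card-testing burst: five charges 20 s apart, then one an hour later.
burst = [base + timedelta(seconds=20 * k) for k in range(5)]
burst.append(base + timedelta(hours=1))

counts = velocity_counts(burst, timedelta(minutes=5))
print(counts)  # [0, 1, 2, 3, 4, 0]
```

The count climbs during the burst and drops back to zero for the isolated charge an hour later, so a threshold on this feature separates the two cases.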
Architecture & design
Layered FastAPI monolith with deliberate seams: api/v1 routers do thin orchestration, services own business logic, repositories hold SQLAlchemy 2.0 data access, and models are typed ORM classes that use Decimal for every monetary field. Ingestion lives at POST /api/v1/transactions with a required Idempotency-Key header following the Stripe pattern: same key + same body returns the cached response with X-Idempotency-Replay: true; same key + different body returns 409. The request hash is a SHA-256 over the normalized Pydantic dump rather than raw HTTP bytes, so it survives whitespace and field reordering. The scoring pipeline runs a six-rule deterministic engine first, then a calibrated XGBoost classifier, then a cached SHAP TreeExplainer that attaches the top contributing features to every decision. The database is SQLite in dev, with Postgres-compatible schemas via SQLAlchemy; swapping engines takes a single DATABASE_URL change plus an Alembic run.
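The idempotency behaviour described above can be sketched with the standard library alone. This is a minimal illustration, not the project's code: a plain dict stands in for the normalized Pydantic dump, and `request_fingerprint`, `IdempotencyStore`, and the outcome strings are hypothetical names. The in-memory dict replaces whatever persistence the real service uses.

```python
import hashlib
import json
from decimal import Decimal


def request_fingerprint(payload: dict) -> str:
    """SHA-256 over a canonical JSON form of the validated payload."""
    canonical = json.dumps(
        payload,
        sort_keys=True,          # field order does not change the hash
        separators=(",", ":"),   # no incidental whitespace
        default=str,             # Decimal -> its exact string form
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class IdempotencyStore:
    """In-memory stand-in for the persistence layer (assumption)."""

    def __init__(self):
        self._seen = {}  # key -> (fingerprint, cached response)

    def handle(self, key, payload, process):
        fingerprint = request_fingerprint(payload)
        if key in self._seen:
            stored_fingerprint, response = self._seen[key]
            if stored_fingerprint != fingerprint:
                return "conflict", None      # would map to HTTP 409
            return "replayed", response      # X-Idempotency-Replay: true
        response = process(payload)          # first time: do the real work
        self._seen[key] = (fingerprint, response)
        return "processed", response


store = IdempotencyStore()
first = {"amount": Decimal("10.50"), "currency": "USD"}
reordered = {"currency": "USD", "amount": Decimal("10.50")}
changed = {"amount": Decimal("11.00"), "currency": "USD"}

print(store.handle("key-1", first, lambda p: {"id": 1}))      # processed
print(store.handle("key-1", reordered, lambda p: {"id": 2}))  # replayed, still id 1
print(store.handle("key-1", changed, lambda p: {"id": 3}))    # conflict
```

Hashing the canonical form rather than the raw bytes is what lets the reordered payload replay instead of conflicting. A production version would also need the lookup and insert to be atomic so that two concurrent requests with the same key cannot both be processed.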
Training pipeline
Chronological train/val/test split with no shuffling: the final 15% of transactions by created_at is held out, because fraud patterns drift in time and a shuffled split would leak future information into the training set. Hyperparameters are selected by a 25-iteration RandomizedSearchCV scored on PR-AUC; the final fit uses early_stopping_rounds=50 against the validation fold. The training script calls the production FeatureExtractor directly. That is slower than a vectorized pandas implementation, but it guarantees that byte-identical feature vectors enter XGBoost during training and at inference, eliminating train/serve skew as a class of bug.
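A chronological split of this kind can be sketched in a few lines. The function name `chronological_split` and the 15% validation fraction are assumptions for illustration; the text above only fixes the 15% held-out tail and the created_at ordering.

```python
from datetime import datetime, timedelta


def chronological_split(rows, val_frac=0.15, test_frac=0.15):
    """Split rows into train/val/test by created_at, oldest first, no shuffling.

    The fractions are illustrative defaults, not the project's confirmed values.
    """
    ordered = sorted(rows, key=lambda r: r["created_at"])
    n = len(ordered)
    n_test = round(n * test_frac)
    n_val = round(n * val_frac)
    n_train = n - n_val - n_test
    train = ordered[:n_train]
    val = ordered[n_train:n_train + n_val]
    test = ordered[n_train + n_val:]
    return train, val, test


base = datetime(2024, 1, 1)
# 100 hypothetical transactions, deliberately passed in reverse order.
rows = [{"id": i, "created_at": base + timedelta(minutes=i)} for i in range(100)]
train, val, test = chronological_split(rows[::-1])

print(len(train), len(val), len(test))  # 70 15 15
```

Every training row is older than every validation row, and every validation row is older than every test row, so nothing the model sees during fitting postdates what it is evaluated on.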
Results & performance
On the held-out chronological test fold: PR-AUC 0.9327, ROC-AUC 0.9989, Recall @ 1% FPR 0.9785, Recall @ 5% FPR 1.0000. At the selected operating threshold of 0.7431, precision is 0.61 and recall is 0.96. Latency: service-layer score_transaction() runs at p50 3.7 ms / p95 5.8 ms; the full HTTP round-trip including FastAPI routing, Pydantic validation, DB transaction commit, and JSON serialization sits at p50 16 ms / p95 20 ms on a developer laptop. 146 tests cover the chronological splitter, SHAP additivity, force/waterfall plot rendering, the six-rule engine with boundary parametrization, Stripe-pattern idempotency, and the /explain and /transactions endpoints via TestClient.
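For readers unfamiliar with the Recall @ FPR metrics quoted above, the following sketch shows one way to compute them: sweep the decision threshold from high to low and keep the best recall reached while the false-positive rate stays within budget. The project's evaluation code is not shown, so `recall_at_fpr` and the toy labels and scores are illustrative assumptions.

```python
def recall_at_fpr(y_true, scores, max_fpr):
    """Best recall over all thresholds whose false-positive rate <= max_fpr."""
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative label")

    # Highest scores first: lowering the threshold admits items in this order.
    ranked = sorted(zip(scores, y_true), key=lambda pair: -pair[0])
    tp = fp = 0
    best = 0.0
    i = 0
    while i < len(ranked):
        score = ranked[i][0]
        # Items tied on score cross the threshold together.
        while i < len(ranked) and ranked[i][0] == score:
            if ranked[i][1]:
                tp += 1
            else:
                fp += 1
            i += 1
        if fp / n_neg > max_fpr:
            break  # FPR only grows from here, so stop sweeping
        best = max(best, tp / n_pos)
    return best


# Toy data: 3 fraud cases (label 1) among 10 transactions.
labels = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
scores = [0.95, 0.9, 0.8, 0.7, 0.4, 0.3, 0.2, 0.1, 0.05, 0.01]

strict = recall_at_fpr(labels, scores, max_fpr=0.0)
relaxed = recall_at_fpr(labels, scores, max_fpr=0.2)
print(round(strict, 4), relaxed)  # 0.6667 1.0
```

With no false positives allowed, only the two top-scored fraud cases are caught; allowing one false positive out of seven legitimate transactions recovers the third.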