Deuna × Aidaptive

Phase 1 — Development

Smartrouter AI/ML Integration

All productionization complete. 2 models (LogReg + DNN) for Volaris smart routing, continuous daily training from Snowflake, GPU-accelerated, full CI/CD, and served in production.

Phase

Production

All systems live

Models

LogReg + DNN

Latency Target

p95 <200ms

Model inference <50ms

Gaps Closed

13/14

~101d of ~104.5d saved

⚡TL;DR Summary

Everything important in one place — read this first

What This Is

All productionization is complete. 2 ML models (Logistic Regression + Deep Neural Network) for Volaris smart routing are running in production with continuous daily training from Snowflake data, GPU-accelerated training (NVIDIA L4), full CI/CD pipeline, and automated model promotion with quality gates.

The Problem

Static routing rules → suboptimal acceptance rates
No intelligent PSP failover during outages
Retries not optimized by timing, route, or message
No feedback loop — outcomes don't improve future decisions

The Solution (5 Use Cases)

P-01 — PSP outage detection & failover
P-02 — Optimize existing static routing rules
P-03 — Per-transaction processor ranking
P-04 — Authorization message manipulation
P-05 — Retry optimization

Phase 1 Target — Volaris Merchant

Worldpay

ID: 76

MIT

ID: 85

Elavon

Cards

Amex

Amex cards

~3.5d

Remaining (lineage polish)

~101d saved of ~104.5d original

97%

Gaps closed

13 of 14 gaps fully Done

Production

All systems live

Daily training · GPU · CI/CD · Monitoring

✅ Production — What's Live

✓ 2 ML models — LogReg (4 per-processor) + DNN (multi-output, 4 heads) both in production
✓ Daily automated training — Lambda + EventBridge (2 AM UTC), GitHub Actions cron
✓ GPU-accelerated — g6.2xlarge (NVIDIA L4, 24GB VRAM), mixed precision, DNN ~13 min
✓ Deep ClearML integration — metrics, ROC/PR curves, confusion matrices, hyperparams
✓ 5-gate quality suite — min_data, AUC≥0.65, regression, stability, completeness
✓ Full CI/CD — CodePipeline + CodeDeploy + 127 training + 138 Go + 15 sidecar tests
✓ Model serving sidecar — FastAPI on port 8081, hot-reload, 4 processors, 140-feature encoder
✓ A/B testing — control groups, shadow mode, deterministic bucketing, multi-model-type experiments
✓ S3 versioned storage — sequential versions (v1, v2, ...), rollback manifests, sidecar bucket mirroring
✓ OTEL + Prometheus + Grafana — 15+ metric types, distributed tracing, CloudWatch dashboard
✓ Terraform infrastructure — Spacelift-managed, dev + prod, GPU instances, ClearML ECS/Fargate
✓ Snowflake ingestion — memory-efficient streaming, S3 parquet caching, 12-week lookback

🔨 Remaining (~3.5d — polish only)

✗ Lineage tracking queryable store — G-10 (~1.5d remaining)
✗ Lineage documentation — G-10 (~0.5d)

📊 Production Architecture

Snowflake → S3 parquet cache
  ↓
LogReg Pipeline (CPU)
DNN Pipeline (GPU g6.2xlarge)
  ↓
5-Gate Quality Suite
  ↓
S3 Versioned Models
  ↓ (sync to EFS)
Sidecar (FastAPI :8081)
  ↓
Go API (:8080) → Production

DATA-Athena-Snowflake Testing

127 tests · 16 test files · Maturity 4/5
Comprehensive coverage: pipeline orchestration, DNN training (masked BCE, GPU OOM), evaluator, quality gates, promoter, data ingestion, preprocessing, ClearML tracker, rollback, model config. Synthetic fixtures for reproducibility.

athena-platform Testing

138 Go + 15 Python tests · Maturity 4/5
Domain services + repositories well covered. Model registry, experiment assignment, bucketing, shadow mode tested. Sidecar: smart router strategy, model types, encoders. CI enforced on every PR.

Current Blockers

Item	Owner	Status
Deuna corp accounts — Rakesh & Naoki	TBD	Not needed
Code / repo access — Naoki	TBD	Done
Are ATHIA_* tables live in Deuna's Snowflake?	Israel	Confirmed ✓
Are SageMaker endpoints live today?	Rakesh	Resolved ✓
Payment volume through routing engine?	Israel	Open question
GPU instance for training	Deuna	Resolved ✓ — g6.2xlarge (NVIDIA L4) deployed
AWS resource access for Rakesh	Deuna	Resolved ✓ — full access granted


🏠Project Overview

What this project is and what success looks like

🎯

Purpose

Assess the effort required to integrate Athia AI/ML into Deuna's payment routing. Produce a clear work breakdown and estimate before any implementation begins.

✅

Phase 0 Deliverables

Full schema & data understanding
Effort estimate per workstream
Risks and open questions resolved
Recommended build order

🏆

Long-Term Success

Measurable approval lift
Stability during PSP outages
Latency: p95 < 200ms
Closed feedback/learning loop

✅ In Scope (Phase 0)

Understand Deuna's data, schema, routing rules
Assess Athia platform gaps vs. what's needed
Size effort for P-01 through P-05 use cases
Identify all dependencies, blockers, risks

🚫 Out of Scope (Phase 0)

Any implementation or code delivery
3DS optimization (Phase 2)
User-facing messaging (Phase 3)
Installment optimization

📅Timeline & Phases

Project phases from assessment to full delivery

🔍

Phase 0 — Assess Level of Effort · Done ✓

2 days · $6K budget · Completed 2026-02-19
Nail down all the work required. Produce a detailed estimate with confidence before committing to delivery.

🚀

Phase 1 — Model in Production · Pending

2 weeks · Core delivery
Model running in production for 2 processors with basic feature store. Target merchant: Volaris.

📊

Phase 2 — Monitoring + Experimentation · Pending

Week 3 · Add monitoring and integrate with A/B experimentation infrastructure.

⚙️

Phase 3 — Drift Detection, CI/CD, Ramp-Up · Pending

TBD · Drift detection, CI/CD pipeline, experiment ramp-up, additional model techniques.

Phase 1 Delivery Plan — Volaris Merchant

65 tasks · ~64 person-days · 5–6 weeks with 3 engineers · TensorFlow ecosystem

Sub-Phase	Focus	Tasks	Effort	Key Notes
0 — Service Architecture	Design + scaffold 7 TF service shells	11	14.5d	NEW Rakesh: architecture, API contracts, TF integration plan. Team: 7 service shells
1 — Discovery & EDA	Understand Volaris data	10	7.5d	Approval rates per PSP, retry patterns, routing rules, DYNAMIC_ROUTING_DETAIL JSON, A/B sample size
2 — Feature Engineering	Build ML features (Feature Service + TF Transform)	9	8.5d	Card BIN/brand, RFM, retry context, rolling health scores, Amex hard-rule bypass
3 — Model Development	Train models (tf.keras)	8	9d	TF DNN, wide-and-deep, TF Decision Forests via Training Pipeline + Eval Service
4 — Outage Detection	P-01: failover for 4 PSPs	6	6.5d	Rolling health score (5–15 min window), auto-failover, recovery detection (1–2% sampling), alerts
5 — Message Manipulation	P-04: CIT/MIT experiment	5	5.5d	CIT/MIT audit, approval delta by toggle × processor × card type, new athena-platform endpoint, A/B test
6 — Platform Integration	Register models, wire Deuna	8	6.5d	Triton ✓ ExperimentService one-call API + built-in shadow mode. Requires Deuna eng coordination.
7 — Monitoring & Feedback	Dashboards, retraining, review	8	6d	2 tasks already done: ATHIA tables confirmed live, ATHIA_STAGE_OUTCOMES deployed
Total		65	~64d	Phase 0 first (Rakesh design ∥ team scaffolding); Phases 1–2 sequential; 3–5 parallel after Phase 2

👥Team

People involved and their roles

Deuna (Client)

Name	Role
Reks	CEO & Co-Founder
Chema	Co-Founder
Pablo	CTO — Executive Sponsor
Israel	Data POC — Snowflake & Data Access
Farhan	Claude / LLM Access POC
Mark Walick	Product Management Lead

Aidaptive (Contractor)

Name	Role
Rakesh	CEO
Naoki	Solutions Architect
Rene	ML Engineer
Kedar	Backend / Data Engineer

🎯Use Cases (P-01 to P-05)

The five P0 use cases to be delivered

🏢

Phase 1 Target Merchant: Volaris · Decided 2026-02-19

Volaris selected over Cinépolis. Known PSPs: Worldpay (ID: 76), MIT (ID: 85), Elavon (cards), Amex (Amex cards) — 4 processors total with routing policies per currency. Cinépolis deferred: only shows Cybersource (a gateway), actual processor unknown.

P-01

Outage Detection & Failover

Detect PSP failures via persistent timeout codes. Fail over automatically, then fail back by sending a small random sample of traffic to the downed PSP to detect recovery.

P-02

Routing Optimizer

Optimize Deuna's existing static routing rules based on historical outcomes. Build on existing rules engine rather than starting from scratch.

P-03

Per-Transaction Route Selection

Rank top 3 payment processors per transaction in real time based on prior outcomes, card signals, and merchant context.
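
A minimal sketch of the top-3 ranking step, assuming the model has already produced one approval probability per processor. The function name, the processor keys, and the rule that non-Amex cards are never offered the Amex processor are illustrative assumptions; the Amex bypass itself follows the hard rule listed in task V-17.

```python
from typing import Dict, List


def rank_processors(
    approval_probs: Dict[str, float],
    card_brand: str,
    top_k: int = 3,
) -> List[str]:
    """Return up to top_k processor names, highest predicted approval first."""
    # Hard rule (task V-17): Amex cards always go to the Amex processor.
    if card_brand.lower() == "amex":
        return ["amex"]
    # Assumption: the Amex processor only accepts Amex cards.
    candidates = {p: s for p, s in approval_probs.items() if p != "amex"}
    ranked = sorted(candidates, key=candidates.get, reverse=True)
    return ranked[:top_k]


if __name__ == "__main__":
    # Hypothetical model scores for one transaction.
    scores = {"worldpay": 0.81, "mit": 0.77, "elavon": 0.84, "amex": 0.50}
    print(rank_processors(scores, "visa"))
    print(rank_processors(scores, "amex"))
```

Returning an ordered list rather than a single winner leaves the caller free to use positions 2 and 3 as retry or failover candidates.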

P-04

Message Manipulation

Toggle CIT/MIT, AVS, and MCC variables in authorization request messages. Provide the top 3 configuration recommendations per transaction.

P-05

Retry Optimization

Optimize when, how, and where to retry declined transactions, with a focus on MIT/subscription payments. Enterprise darktime reduction. Delayed retry based on processor reputation.

🗄️Data & Schema

Snowflake database overview — extracted 2026-02-18

Connection

VLTAXPW-RMONTES

Database: PAYMENT_ML

Access: Read-only

ABTESTING Schema

Denormalized flat join of all views. Best starting point for EDA. No complex joins needed.

ALL_VIEWS_FLAT ALL_PAYMENT_EVENTS_FLAT

SOURCES Schema

15 clean views: orders, payments, attempts, events, user profiles, routing logs, merchant rules, airline data.

View	Why It Matters	Use Cases
`VW_ATHENA_PAYMENT_ATTEMPT`	Full retry chain per payment; processor, error codes, hard/soft decline, `DYNAMIC_ROUTING_DETAIL` JSON	P-03 P-05
`VW_SMART_ROUTING_ATTEMPTS`	Live routing engine log: algorithm type, latency, skip reasons — direct latency signal for p95 <200ms	P-01 P-02
`VW_ROUTING_MERCHANT_RULE`	Existing static rules engine — foundation for routing optimizer. SHADOW_MODE column suggests testing infrastructure exists.	P-02
`ABTESTING.ALL_VIEWS_FLAT`	Everything joined in one table — best for initial EDA	EDA

Feature Group	Key Columns	Use
Retry history	`NUM_ATTEMPTS_ORDER`, `PREVIOUS_ORDER_ERROR_CODE`, `AVG_SEC_BETWEEN_PAYMENT_ATTEMPS`	P-05
Error signals	`ERROR_CODE`, `ERROR_CATEGORY`, `HARD_SOFT`	P-03, P-05
Card signals	`CARD_BIN`, `CARD_BRAND`, `BANK`, `CARD_COUNTRY`	P-03
User behavior	`TARGET_USER_FRAUD_RATE_COHORT`, `TOTA_MINUTES_BROWSING`, RFM values	P-03
Message config	`MCI_MSI_TYPE`, `ORDER_MCI_MSI_TYPE`, `PAYMENT_ATTEMPT_METHOD_TYPE`	P-04
Geo & Device	`ORDER_COUNTRY_CODE`, `TARGET_USER_BROWSER`, `TARGET_USER_DEVICE`	P-03

🔍Repository Analysis

Findings from both Deuna GitHub repos — re-analyzed 2026-04-19 · Both repos in production · DATA-Athena-Snowflake: 2 training pipelines (LogReg + DNN) · athena-platform: ML serving with hot-reload sidecar

DATA-Athena-Snowflake

github.com/DUNA-E-Commmerce/DATA-Athena-Snowflake · Production — 2 training pipelines (LogReg + DNN), GPU-accelerated, daily automated training

Python / TensorFlow / ClearML

✅ Production Status (Updated 2026-04-19)

2 training pipelines in production: LogReg (4 per-processor models) + DNN (multi-output neural net with 4 heads). Daily automated training via Lambda + EventBridge. GPU-accelerated on g6.2xlarge (NVIDIA L4). Deep ClearML integration with metrics, ROC/PR curves, confusion matrices. 5-gate quality suite blocks bad model promotion. 127 tests. S3 versioned model storage with rollback manifests.
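
The 5-gate quality suite can be pictured as a set of independent boolean checks that must all pass before promotion. The sketch below is illustrative only: the gate names and the AUC threshold of 0.65 come from this page, while the `Candidate` fields, the other thresholds, and the exact meaning of the regression, stability, and completeness gates are assumptions.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

MIN_AUC = 0.65  # stated on this page; every other threshold below is assumed


@dataclass
class Candidate:
    n_rows: int                    # training rows available
    auc: float                     # validation AUC of the candidate model
    prev_auc: float                # AUC of the currently promoted model
    fold_aucs: List[float]         # per-fold AUCs, used as a stability signal
    processors_trained: List[str]  # models/heads that finished training


def run_gates(
    c: Candidate,
    expected: List[str],
    min_rows: int = 10_000,    # assumed
    max_drop: float = 0.02,    # assumed tolerated AUC drop vs. previous model
    max_spread: float = 0.05,  # assumed tolerated AUC spread across folds
) -> Tuple[bool, Dict[str, bool]]:
    """Evaluate all five gates; promotion requires every gate to pass."""
    results = {
        "min_data": c.n_rows >= min_rows,
        "auc": c.auc >= MIN_AUC,
        "regression": c.auc >= c.prev_auc - max_drop,
        "stability": max(c.fold_aucs) - min(c.fold_aucs) <= max_spread,
        "completeness": set(expected) <= set(c.processors_trained),
    }
    return all(results.values()), results
```

Returning the per-gate results alongside the overall verdict makes it easy to log which gate blocked a promotion.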

LLM Workflows (11 stimuli)

Workflow	Status
Acceptance rate analysis	Done (v0_1, v1_0)
Fraud card analysis	Done
Metrics anomaly detection	Done
Chatbot / data analyst	Done
Strategy generation director	Partial — Matcher has `exit()`
Cost optimization	Early stage
Retry optimization	Missing (P-05 gap)

ML Training Platform (now on main)

Service	Status
Training pipeline (Snowflake ML)	Done
LLM training orchestrator (GPT-4 + RAG)	Done
LLM experiment designer (GPT-4 + RAG)	Done
Data quality validator	Done
Model registry (auto table creation)	Done
Feature extractor	Done
Feedback collector (webhook/API/batch)	Done
Schema discovery (ChromaDB)	Done
Athia event ingestion	Done
Model deployer → athena-platform	Done — S3 promotion + sidecar mirror + hot-reload

Architecture — Multi-Agent Pattern

FastAPI + LangGraph · Stimulus-response orchestration · LLM backends: Claude (primary), GPT-4 (fallback)

Request → StimulusRegistry → OrchestratorWorkflow → Branch (DAG of Nodes)
        → AgentWorkflow (LangGraph StateGraph) → Response

11 stimuli: acceptance_rate_analysis · fraud_card_analysis · metrics_anomaly
            user_question · data_analyst · researcher_assistance · deep_exploration
            element_edition · knowledge_expert · strategy_generation · cost_optimization

End-to-End Training Flow (Production)

Daily 2 AM UTC (Lambda + EventBridge)
  → Snowflake Data Ingestion (memory-efficient streaming, S3 parquet cache)
  → Preprocessing (z-score normalization + one-hot encoding, 140 features)
  ├── LogReg Pipeline (4 per-processor models, CPU, ~1.5 min)
  └── DNN Pipeline (multi-output 64→32→4 heads, GPU g6.2xlarge, ~13 min)
  → 5-Gate Quality Suite (min_data, AUC≥0.65, regression, stability, completeness)
  → ClearML Metrics Logging (ROC/PR curves, confusion matrices, hyperparams)
  → S3 Model Promotion (versioned: v1, v2, ... + rollback manifest)
  → Sidecar Bucket Mirror → EFS → Hot-Reload
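
The preprocessing stage in the flow above (z-score normalization plus one-hot encoding) can be sketched in plain Python. This is not the pipeline's code: the function names, the config layout, and the `AMOUNT` column are assumptions made for illustration, and the real encoder produces 140 features rather than the three shown here.

```python
from statistics import mean, pstdev
from typing import List, Sequence


def fit_config(rows: Sequence[dict], numeric: List[str], categorical: List[str]) -> dict:
    """Learn per-column statistics and category vocabularies from training rows."""
    cfg = {"numeric": {}, "categorical": {}}
    for col in numeric:
        values = [float(r[col]) for r in rows]
        std = pstdev(values)
        # Guard against constant columns to avoid division by zero.
        cfg["numeric"][col] = {"mean": mean(values), "std": std if std > 0 else 1.0}
    for col in categorical:
        cfg["categorical"][col] = sorted({r[col] for r in rows})
    return cfg


def encode(row: dict, cfg: dict) -> List[float]:
    """Encode one row into a fixed-width vector using a fitted config."""
    vec: List[float] = []
    for col, stats in cfg["numeric"].items():
        vec.append((float(row[col]) - stats["mean"]) / stats["std"])
    for col, vocab in cfg["categorical"].items():
        # Categories not seen during fitting encode as all zeros.
        vec.extend(1.0 if row.get(col) == value else 0.0 for value in vocab)
    return vec


if __name__ == "__main__":
    train = [
        {"AMOUNT": 100.0, "CARD_BRAND": "visa"},
        {"AMOUNT": 300.0, "CARD_BRAND": "mastercard"},
    ]
    cfg = fit_config(train, ["AMOUNT"], ["CARD_BRAND"])
    print(encode(train[0], cfg))
```

Fitting the config once at training time and reusing it at serving time keeps the training and inference encodings identical, which is presumably why the serving sidecar ships its own encoder.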

Training Pipeline Architecture

Pipeline flow: Training Decision → Data Prep → Feature Extraction → Validation → Experiment Design → Training → Model Selection → Deployment

Services

Service	Purpose
`TrainingPipeline`	Full training execution (plan/run/deploy)
`LLMTrainingOrchestrator`	LLM + RAG decision engine (RETRAIN_NOW / SCHEDULED / SKIP)
`LLMExperimentDesigner`	Designs 5-10 experiments using GPT-4/Claude with RAG
`ModelDeployer`	Exports to EFS, registers deployment, creates canary config
`TrainingPlanner`	Dry-run mode ("terraform plan" for ML)
`FeatureExtractor`	Auto-extracts features, creates training dataset views
`DataQualityValidator`	Schema, statistical, temporal bias, drift validation
`FeedbackCollector`	Webhook, API polling, batch feedback collection
`ModelRegistry`	Model CRUD, prediction/feedback schema management
`SchemaDiscovery`	Auto-discovers training tables via LLM
`LLMProvider`	Unified Claude/GPT-4 interface with auto-fallback

API Endpoints

Endpoint	Purpose
`POST /api/v1/training/plan/{model_type}`	Dry-run plan
`POST /api/v1/training/run/{model_type}`	Execute training
`POST /api/v1/training/decision/{model_type}`	LLM decision
`POST /api/v1/experiments/design/{model_type}`	Design experiments
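
As a rough illustration of how a client might address the endpoints in the table above, the sketch below builds (but does not send) the POST requests with the standard library. The base URL, the empty JSON body, and the `build_request` helper are assumptions; the page does not describe the request schema or authentication.

```python
import json
from typing import Optional
from urllib.request import Request

BASE_URL = "http://localhost:8000"  # hypothetical host, not given on this page

# Paths copied from the endpoint table.
ROUTES = {
    "plan": "/api/v1/training/plan/{model_type}",
    "run": "/api/v1/training/run/{model_type}",
    "decision": "/api/v1/training/decision/{model_type}",
    "design": "/api/v1/experiments/design/{model_type}",
}


def build_request(action: str, model_type: str, payload: Optional[dict] = None) -> Request:
    """Build a POST request for one of the training endpoints."""
    path = ROUTES[action].format(model_type=model_type)
    body = json.dumps(payload or {}).encode("utf-8")  # body schema is assumed
    return Request(
        BASE_URL + path,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


if __name__ == "__main__":
    req = build_request("plan", "retry_predictor")
    print(req.get_method(), req.full_url)
    # Sending would be: urllib.request.urlopen(req)
```

Starting with the dry-run `plan` endpoint before calling `run` matches the "terraform plan" analogy used for `TrainingPlanner`.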

Resolved Since Last Update

✓ Deployment fully automated — S3 promotion + sidecar bucket mirror + hot-reload (was: manual)
✓ Rollback capability complete — versioned S3 + rollback manifests + sidecar hot-reload (was: partial)
✓ GPU training deployed — g6.2xlarge with mixed precision (was: CPU-only, 3hr runs)
✓ CI/CD complete — GitHub Actions + CodePipeline + CodeDeploy (was: partial)
✓ DNN pipeline added — multi-output neural net with 2-18% AUC improvement over LogReg
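
The sequential versioning and rollback manifests mentioned above can be illustrated without touching S3. In this sketch the version labels (`v1`, `v2`, ...) follow the page, while the function names and the manifest fields are hypothetical.

```python
import re
from typing import List, Optional

VERSION_RE = re.compile(r"^v(\d+)$")


def _version_numbers(existing: List[str]) -> List[int]:
    """Extract numeric parts from labels like 'v3', ignoring anything else."""
    return sorted(int(m.group(1)) for v in existing if (m := VERSION_RE.match(v)))


def next_version(existing: List[str]) -> str:
    """Return the next sequential version label."""
    nums = _version_numbers(existing)
    return f"v{(nums[-1] if nums else 0) + 1}"


def build_manifest(model_type: str, existing: List[str]) -> dict:
    """Describe a promotion and the version to fall back to (fields are assumed)."""
    nums = _version_numbers(existing)
    previous: Optional[str] = f"v{nums[-1]}" if nums else None
    return {
        "model_type": model_type,
        "promoted": next_version(existing),
        "rollback_to": previous,
    }


if __name__ == "__main__":
    print(build_manifest("dnn", ["v1", "v2", "v10"]))
```

Parsing the numeric suffix, rather than sorting labels as strings, avoids ordering `v10` before `v2`.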

Remaining (minor)

✗ Formal lineage queryable store — G-10 (~1.5d)

🧪 Testing Coverage

127

Training tests

16

Test files

4/5

Maturity

Layer	Coverage	Notes
Pipeline orchestration	Comprehensive	test_pipeline.py, test_multi_output_pipeline.py — stage execution, metric logging
DNN training	Comprehensive	test_multi_output_trainer.py — masked BCE, batched validation, GPU OOM
Quality gates	Comprehensive	test_quality_gates.py — all 5 gates tested
Model promotion	Comprehensive	test_promoter.py — S3 upload, versioning, rollback manifests
Data ingestion	Comprehensive	test_data_ingestion.py — Snowflake streaming, S3 caching
Preprocessing	Comprehensive	test_preprocessing.py — z-score, OHE, config management
ClearML tracker	Comprehensive	test_tracker.py — task creation, offline mode, metrics
Lambda handlers	Comprehensive	test_multi_output_handler.py — instance lifecycle, SSM commands

Maturity: 4/5 — Strong. Comprehensive training pipeline coverage with synthetic fixtures for reproducibility. Unit + integration tests with markers (slow, integration, unit). Coverage tracking via pytest-cov.
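
The masked BCE named in the DNN training row can be sketched as follows. The working assumption, not confirmed on this page, is that each transaction has an observed outcome only for the processor it was actually routed to, so the loss for the other heads is masked out. The function name and argument layout are illustrative.

```python
import math
from typing import Sequence

EPS = 1e-7  # clamp predictions away from 0 and 1 to keep log() finite


def masked_bce(
    y_true: Sequence[Sequence[float]],
    y_pred: Sequence[Sequence[float]],
    mask: Sequence[Sequence[int]],
) -> float:
    """Mean binary cross-entropy over the (row, head) cells where mask == 1."""
    total, count = 0.0, 0
    for t_row, p_row, m_row in zip(y_true, y_pred, mask):
        for t, p, m in zip(t_row, p_row, m_row):
            if not m:
                continue  # no label for this head on this transaction
            p = min(max(p, EPS), 1.0 - EPS)
            total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
            count += 1
    return total / count if count else 0.0


if __name__ == "__main__":
    # Two transactions, four heads; only one head is labeled per transaction.
    y_true = [[1, 0, 0, 0], [0, 0, 0, 0]]
    y_pred = [[0.5, 0.9, 0.9, 0.9], [0.1, 0.5, 0.2, 0.3]]
    mask = [[1, 0, 0, 0], [0, 1, 0, 0]]
    print(masked_bce(y_true, y_pred, mask))
```

Dividing by the number of unmasked cells, not the batch size, keeps the loss scale comparable when the share of labeled heads varies between batches.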

athena-platform

github.com/DUNA-E-Commmerce/athena-platform

Go / Gin

✅ Production Status (Updated 2026-04-19)

Full ML serving platform in production. Serves both LogReg and DNN models via FastAPI sidecar (port 8081) with hot-reload from EFS. 4 processors (worldpay, elavon, mit_bulk, amex), 140-feature encoder, strategy pattern (XGBoost LTR / TF per-processor). A/B testing with control groups, shadow mode, deterministic bucketing. Multi-model-type experiments (LogReg vs DNN). 138 Go + 15 Python tests. OTEL + Prometheus + Grafana monitoring.

✅ Key Capabilities — All Production-Ready

Model Serving: FastAPI sidecar, EFS models, hot-reload (POST /models/reload). A/B Testing: Deterministic bucketing, control groups, shadow mode, auto-winner. Monitoring: 15+ OTEL metrics, Prometheus, Grafana, CloudWatch. Model Registry: ExperimentService one-call API, model type in version (logreg/dnn). Event Logging: Async Snowflake ingestion. Auto-Winner: Statistical significance, auto-promotion.
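
Deterministic bucketing means the same transaction always lands in the same experiment arm, with no stored assignment. A minimal sketch, assuming the hash is SHA-256 over the transaction ID (as the guardrails section states) and that buckets are taken modulo 100; the function names and the traffic split are illustrative.

```python
import hashlib


def bucket(transaction_id: str, n_buckets: int = 100) -> int:
    """Map a transaction ID to a stable bucket in [0, n_buckets)."""
    digest = hashlib.sha256(transaction_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % n_buckets


def assign_variant(transaction_id: str, treatment_pct: int = 10) -> str:
    """Send treatment_pct percent of buckets to the treatment arm (split is assumed)."""
    return "treatment" if bucket(transaction_id) < treatment_pct else "control"


if __name__ == "__main__":
    for txn in ("txn-001", "txn-002", "txn-003"):
        print(txn, bucket(txn), assign_variant(txn))
```

Implementations often mix an experiment identifier into the hashed string so that bucket membership differs between experiments; this page only specifies hashing the transaction ID, so that refinement is left out here.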

ML Inference Types (already in registry)

Type	Maps To
`processor_selector`	P-03
`retry_predictor`	P-05
`retry_sequence`	P-05
`installment_optimizer`	Out of scope

Snowflake Tables

Table	Status
`ATHIA_PREDICTIONS`	Active
`ATHIA_FEEDBACK`	Active
`ATHIA_TRAINING_DATASET`	Active
`ATHIA_EXPERIMENT_LIFT`	Active
`ATHIA_STAGE_OUTCOMES`	Deployed (feat/ATH-0000)
`ATHIA_SESSION_SUMMARY`	Deployed (feat/ATH-0000)
`ATHIA_MULTI_STAGE_ANALYSIS`	New (feat/ATH-0000)
`ATHIA_MODEL_METRICS`	New (feat/ATH-0000)
`ML_MODEL_REGISTRY`	New (feat/ATH-0000)

Architecture — Clean Architecture (Go/Gin)

PostgreSQL (RDS Multi-AZ) + Redis (ElastiCache) + EFS · mTLS enforced on /api/v1/ml/predict/*

REST Handlers (V1 + V2, Gin)  ← mTLS on /ml/predict/*
        ↓
Controllers (~30 implementations)
        ↓
Domain Services (44 packages)  ← constructor injection throughout
        ↓
Repositories (43 GORM implementations)  ← in-memory SQLite for tests
        ↓
PostgreSQL (RDS Multi-AZ) + Redis (ElastiCache) + EFS (model storage)

A/B Experimentation — Auto-Winner Guardrails

Stats

p-value < 0.05
Min 1000 samples/variant
Min 7 days runtime

Lift

Min 1% absolute lift
Deterministic bucketing
SHA256(transaction_id)

Guardrails

≤10% latency regression
Revenue change ≥ −5% (at most a 5% regression)
Dry-run mode (safe default)
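
The guardrails above can be combined into a single promotion check. The sketch below uses the thresholds listed on this page; the choice of a two-proportion z-test, the `Variant` fields, and the use of p95 latency and revenue per transaction as the comparison metrics are assumptions. The function only returns a decision, in the spirit of the dry-run default.

```python
import math
from dataclasses import dataclass


@dataclass
class Variant:
    samples: int
    approvals: int
    p95_latency_ms: float   # assumed latency metric
    revenue_per_txn: float  # assumed revenue metric


def two_sided_p_value(a: Variant, b: Variant) -> float:
    """Two-proportion z-test on approval rate (normal approximation)."""
    pooled = (a.approvals + b.approvals) / (a.samples + b.samples)
    se = math.sqrt(pooled * (1 - pooled) * (1 / a.samples + 1 / b.samples))
    if se == 0:
        return 1.0
    z = (b.approvals / b.samples - a.approvals / a.samples) / se
    return math.erfc(abs(z) / math.sqrt(2))


def can_promote(control: Variant, treatment: Variant, days_running: int) -> bool:
    """Return True only if every statistical and guardrail condition holds."""
    lift = treatment.approvals / treatment.samples - control.approvals / control.samples
    return (
        min(control.samples, treatment.samples) >= 1000       # min samples/variant
        and days_running >= 7                                 # min runtime
        and two_sided_p_value(control, treatment) < 0.05      # significance
        and lift >= 0.01                                      # min 1% absolute lift
        and treatment.p95_latency_ms <= control.p95_latency_ms * 1.10
        and treatment.revenue_per_txn >= control.revenue_per_txn * 0.95
    )
```

Checking sample size and runtime before the p-value guards against promoting on an early, noisy result.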

🧪 Testing Coverage

138

Go test files

15

Python test files

4/5

Maturity

Layer	Coverage	Notes
Domain services (44)	44/44	All have test files
Repositories (43)	~43/43	In-memory SQLite isolation
V1 REST handlers (18)	15/18 (83%)	agent, workspaces, elements missing
Auth middleware	Tested	JWT + API key covered
V2 REST handlers (18)	0/18 (0%)	Entire new API version untested
Bedrock client	0%	Excluded from coverage config
auth, bedrock, element, workspace services	0%	4 domain services with no tests
Bootstrap / DI graph	Skipped	TODO: testcontainers

Maturity: ~4/5 — Strong. Domain and repository layers are well covered. The Triton merge adds 32 new tests. The V2 API (18 handlers) and the Bedrock ML inference path are still untested. The CI coverage threshold is only 20%, and internal/clients/ is excluded from coverage entirely.

Testing Comparison — Both Repos

Metric	DATA-Athena-Snowflake	athena-platform
Test files	16 training test files	138 Go + 15 Python
Test count	127 tests	138+ Go test files
Core product tested?	Yes — all pipeline stages, DNN training, quality gates	Yes — domain, repositories, model registry, experiments
CI enforced?	Yes — daily cron + PR checks	Yes — every PR
Fixtures	Synthetic data fixtures for reproducibility	In-memory SQLite isolation
Maturity	4/5 — Strong	4/5 — Strong

⚡Training Platform Gaps

14 gaps — ~104.5d original estimate · ~101d saved · ~3.5d remaining (lineage polish only)

Status

Production

All systems live and running daily

Architecture

2 models (LogReg + DNN) · GPU training · Daily automated retraining · Full CI/CD

Total Effort

~3.5d remaining

of ~104.5d original · ~101d saved (97%)

Progress Summary — 2026-04-19

Gaps fully done

G-01–G-09, G-11–G-14

Nearly done

G-10 (Lineage)

Partial

—

Not started

—

Stage	Gap	Category	Priority	Original	Status
1 – Foundation	G-02 Orchestration	Infrastructure	High	8d	Done ✓ — Lambda/SSM/EC2 + SageMaker + daily schedules + GPU lifecycle
1 – Foundation	G-08 Feature Store	ML Infra	High	13.5d	Done ✓ — 140-feature encoder + preprocess_config + sidecar serving
1 – Foundation	G-04 Data Validation	Data Quality	High	7d	Done ✓ — 5-gate quality suite
2 – Automation	G-03 CI/CD Pipeline	DevOps	High	9d	Done ✓ — GitHub Actions + CodePipeline + CodeDeploy. Dev + Prod
2 – Automation	G-06 Deployment Automation	Automation	High	7.5d	Done ✓ — Full automated deployment with quality gates
2 – Automation	G-07 Model Registration	Automation	Medium	5.5d	Done ✓ — ExperimentService + S3 versioning
2 – Automation	G-01 Automated Retraining	Automation	High	10d	Done ✓ — Daily automated training (LogReg + DNN)
3 – Governance	G-13 Versioning Workflow	Governance	High	5d	Done ✓ — Sequential S3 versioning + model type in version
3 – Governance	G-10 Lineage Tracking	Governance	Medium	6.5d	Nearly Done (~1.5d left) — ClearML task hierarchy + S3 manifests. Queryable store TODO
3 – Governance	G-14 Rollback Capability	Reliability	High	5d	Done ✓ — S3 versioned + rollback manifests + sidecar hot-reload
4 – Observability	G-05 Model Monitoring	Observability	High	8d	Done ✓ — OTEL + Prometheus + Grafana + ClearML
4 – Observability	G-09 Drift Detection	Observability	Medium	7d	Done ✓ — Feature + prediction + concept drift
5 – ML Quality	G-11 Hyperparameter Tuning	ML Quality	Medium	5.5d	Done ✓ — Configurable via env vars, per-env tuning
5 – ML Quality	G-12 Algorithm Comparison	ML Quality	Medium	7d	Done ✓ — LogReg vs DNN via A/B testing

Remaining Effort by Stage

Stage 1 – Foundation

0d ✓

Stage 2 – Automation

0d ✓

Stage 3 – Governance

~1.5d (G-10 lineage)

Stage 4 – Observability

0d ✓

Stage 5 – ML Quality

0d ✓

📄Statement of Work — Delivery

Volaris Phase 1 · 65 tasks · ~64 person-days · 5–6 weeks (3 engineers) · TensorFlow ecosystem

Engagement Summary

Full implementation of Athia AI/ML smartrouting for Deuna — Volaris merchant. 7 delivery phases covering P-01 (outage failover), P-03 (per-transaction routing), P-04 (message manipulation), and P-05 (retry optimization). ~61.5 person-days of prior Deuna codebase work reduces original scope significantly.

Delivery tasks

~51d

Person-days

5–6 wks

Calendar time

Aidaptive Team — Roles & Responsibilities

Name	Role	Responsibilities	Days
Rakesh	Project Lead & Strategy	Client coordination (Pablo, Israel), architecture decisions, Phase 6 oversight, post-launch review	5.5d
Naoki	Solutions Architect	athena-platform Go dev, outage detection, message manipulation API, model serving (Triton), CI/CD, Phase 6 integration	14.6d
Rene	ML Engineer	Feature engineering, model training (processor_selector, retry_predictor, retry_sequence), data quality, drift detection, retraining pipeline	15d
Kedar	Data & Backend Engineer	Snowflake EDA, data pipelines, training datasets, feature feeds, Grafana dashboards, monitoring	16d
Total			~51.1d

Effort by Phase

Phase	Focus	Owners	Days	Milestone
1 — Discovery & EDA	Understand Volaris data	Kedar (4.5d) · Rene (2d) · Rakesh (0.5d) · Naoki (0.5d)	7.5d	Kick-off (20%)
2 — Feature Engineering	Build ML feature set	Rene (4d) · Kedar (3d) · Naoki (1d) · Rakesh (0.5d)	8.5d	Phase 2 complete (20%)
3 — Model Development	Train P-03 + P-05 models	Rene (6d) · Kedar (2d) · Naoki (1d)	9d	Phase 3 complete (20%)
4 — Outage Detection	P-01: failover for 4 PSPs	Naoki (4.5d) · Rakesh (1d) · Kedar (1d)	6.5d	Phase 6 complete (30%)
5 — Message Manipulation	P-04: CIT/MIT experiment	Naoki (2d) · Rene (1.5d) · Kedar (1.5d) · Rakesh (0.5d)	5.5d	Phase 6 complete (30%)
6 — Platform Integration	Register models, wire Deuna +G-06 close	Naoki (5.1d) · Rakesh (2d) · Kedar (1d)	8.1d	Phase 6 complete (30%)
7 — Monitoring & Feedback	Dashboards, retraining, review	Kedar (3d) · Rene (1.5d) · Rakesh (1d) · Naoki (0.5d)	6d	Phase 7 complete (10%)
Total			~51d

Delivery Timeline (6-week plan)

Week	Days	Phases Active	Who	Key Milestone
Week 1	1–5	Phase 1 (EDA) · Phase 2 start Day 3	Kedar · Rene · Rakesh (Day 1)	EDA complete; feature schema draft
Week 2	6–10	Phase 2 (Features) · Phase 3 start Day 8	Rene · Kedar · Naoki	Feature set locked; training dataset built
Week 3	11–15	Phase 3 (Models) · Phase 4 (Outage) parallel	Rene (models) · Naoki (outage)	Models packaged; outage detection built
Week 4	16–20	Phase 4 tail · Phase 5 (CIT/MIT) · Phase 6 prep	Naoki · Rene · Kedar · Rakesh	API contract with Deuna eng signed
Week 5	21–25	Phase 6 (Integration)	Naoki · Rakesh	⚠ Triton branch must be merged by Day 18 · Integration live in shadow mode
Week 6	26–30	Phase 7 (Monitoring & Review)	Kedar · Rene · Rakesh	Dashboards live · retraining scheduled · post-launch report

Critical path: Phases 1–2 sequential. Phases 3–5 can run in parallel. Phase 6 requires (a) models complete, (b) Triton branch merged, (c) 1-week Deuna engineering lead time for API contract. Phase 7 requires Phase 6 live.

Assumptions

Snowflake access (PAYMENT_ML) remains available read-only
Deuna eng available for API contract in Week 4 (Pablo / Israel)
Triton branch merged to main by end of Week 3
Staging environment available for Phase 6 integration tests
ATHIA_PREDICTIONS + ATHIA_FEEDBACK remain live throughout

Success Criteria

processor_selector live for ≥1 Volaris PSP
≥1% absolute approval rate lift (A/B test at significance)
≥5% retry success rate improvement vs. baseline
PSP failover within 1 routing cycle of threshold breach
p95 latency <200ms end-to-end (model inference <50ms)
48h shadow run complete with documented comparison
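
The latency criterion above can be checked from raw timing samples. This sketch uses the nearest-rank percentile definition and assumes that the 50 ms inference budget is also evaluated at p95, which the page does not state; the function names are illustrative.

```python
import math
from typing import Sequence


def percentile(samples: Sequence[float], q: float) -> float:
    """Nearest-rank percentile: smallest sample with at least q% at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(q * len(ordered) / 100))
    return ordered[rank - 1]


def meets_latency_targets(
    end_to_end_ms: Sequence[float],
    inference_ms: Sequence[float],
) -> bool:
    """p95 end-to-end < 200 ms and (assumed p95) model inference < 50 ms."""
    return percentile(end_to_end_ms, 95) < 200 and percentile(inference_ms, 95) < 50


if __name__ == "__main__":
    # Hypothetical measurements in milliseconds.
    e2e = [150.0] * 19 + [400.0]
    inference = [20.0] * 20
    print(percentile(e2e, 95), meets_latency_targets(e2e, inference))
```

With 20 samples, a single slow request does not move p95, but two do, which is why a 48h shadow run with far more samples gives a more trustworthy reading.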

✈️Volaris Smartrouting — Delivery Tasks

65 tasks across 8 phases — All productionization complete. 2 models (LogReg + DNN) running daily in production.

✅ Production Status (2026-04-19): All Systems Live

2 ML models in production: LogReg (4 per-processor models) + DNN (multi-output neural net, 4 heads). Daily automated training from Snowflake via Lambda + EventBridge. GPU-accelerated (g6.2xlarge NVIDIA L4). 5-gate quality suite. Full CI/CD. ClearML experiment tracking. S3 versioned model storage with rollback.

7 Service Shells
Data Pipelines · Feature Service · Training Pipelines · Model Management · Eval Service · Evaluation Framework · Experiment System

Design Tasks (Rakesh)
V-D01: Service architecture · V-D02: API contracts · V-D03: TF ecosystem integration plan

Updated Totals
65 tasks (was 54) · ~64d total (was ~49.5d) · Phase 0 adds ~14.5d · 3 engineers ~5–6 weeks

4

PSPs (Worldpay · MIT · Elavon · Amex)

65

Total tasks (+11 new)

~64d

Total effort (+14.5d, Phase 0)

5–6 wks

3 engineers in parallel

Phase 0 — NEW

Service Architecture & Shell Setup

Design 7 service boundaries (Rakesh), define API contracts, scaffold all service shells using TensorFlow ecosystem: Data Pipelines (TFX), Feature Service (TF Transform), Training Pipelines (TFX Trainer), Model Management, Eval Service (TFMA), Evaluation Framework, Experiment System.

📐 V-D01–D03 (Rakesh design) + V-S01–S08 (team scaffolding)

11 tasks · 14.5d · TensorFlow

Phase 1

Discovery & EDA

Understand Volaris transaction data — approval rates per PSP, retry patterns, routing rules, DYNAMIC_ROUTING_DETAIL JSON, sample size for A/B test.

10 tasks · 7.5d

Phase 2

Feature Engineering

Card BIN/brand, transaction context, user RFM, retry history, rolling processor health scores, Amex hard-rule bypass, training dataset build.

9 tasks · 8.5d

Phase 3

Model Development

Train processor_selector, retry_predictor, retry_sequence for Volaris 4 PSPs using tf.keras. DNN vs. wide-and-deep vs. TF Decision Forests comparison. TFMA per-slice evaluation.

🔧 TensorFlow ecosystem: tf.keras training via Training Pipeline service, TFMA evaluation via Eval Service, SavedModel export via Model Management.

8 tasks · 9d · Infra ready

Phase 4 — P-01

Outage Detection

Rolling health score per PSP, failover to next-best Volaris processor, recovery detection via 1–2% sampling, alerts on state changes.

6 tasks · 6.5d
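
A rolling health score with recovery sampling might look like the sketch below. The sliding window (within the 5–15 minute range), the 1–2% probe rate, and restoring on consecutive wins come from this page; the class name, the 0.5 score threshold, the minimum event count, and the five-win recovery rule are assumptions.

```python
import random
from collections import deque
from typing import Deque, Tuple


class PspHealth:
    """Tracks one PSP's recent outcomes and decides whether to route to it."""

    def __init__(self, window_s: float = 600.0, down_below: float = 0.5,
                 min_events: int = 20, probe_rate: float = 0.02,
                 recover_after: int = 5) -> None:
        self.window_s = window_s            # 10 min, inside the 5-15 min range
        self.down_below = down_below        # assumed score threshold
        self.min_events = min_events        # assumed minimum evidence
        self.probe_rate = probe_rate        # 2% sampling of a downed PSP
        self.recover_after = recover_after  # assumed consecutive wins needed
        self.events: Deque[Tuple[float, bool]] = deque()
        self.down = False
        self.probe_wins = 0

    def score(self) -> float:
        """Share of successful attempts in the window (1.0 when empty)."""
        if not self.events:
            return 1.0
        return sum(ok for _, ok in self.events) / len(self.events)

    def record(self, ts: float, ok: bool) -> None:
        """Record one attempt outcome and update the up/down state."""
        self.events.append((ts, ok))
        while self.events and self.events[0][0] < ts - self.window_s:
            self.events.popleft()
        if self.down:
            self.probe_wins = self.probe_wins + 1 if ok else 0
            if self.probe_wins >= self.recover_after:
                self.down, self.probe_wins = False, 0
                self.events.clear()  # start fresh after recovery
        elif len(self.events) >= self.min_events and self.score() < self.down_below:
            self.down = True

    def should_route(self) -> bool:
        """Healthy PSPs take traffic; downed PSPs only get probe traffic."""
        return True if not self.down else random.random() < self.probe_rate
```

Clearing the window on recovery prevents the failures that triggered the outage from immediately tripping the detector again.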

Phase 5 — P-04

Message Manipulation

CIT/MIT audit for Volaris, approval delta by toggle × processor × card type, experiment design, new athena-platform endpoint, A/B test.

5 tasks · 5.5d

Phase 6

Platform Integration

Register models in athena-platform, create Volaris-scoped experiment, API contract with Deuna eng, shadow mode validation before live traffic.

✅ Triton branch: ExperimentService one-call API (V-39–41) + built-in shadow mode (V-46) reduce effort by ~1d.

8 tasks · 6.5d (was 7.5d) · Triton ✓

Phase 7

Monitoring & Feedback Loop

V-47
Confirm ATHIA_PREDICTIONS + ATHIA_FEEDBACK are live in Deuna's Snowflake
Done ✓ — confirmed data live in Deuna's Snowflake (2026-02-24)

V-48
Deploy ATHIA_STAGE_OUTCOMES table
Done ✓ — deployed in feat/ATH-0000

V-49–50
Approval rate + model performance Grafana dashboards

V-51–54
Retraining trigger, scheduled pipeline, auto-winner, post-launch review
Ready ✓ — LLM orchestrator + training pipeline built

8 tasks · 6d · 2 tasks done/ready

All 65 Tasks

#	Task	Phase	Owner	Effort	Status
V-D01	Design overall service architecture — 7 service boundaries, data flow, inter-service communication	0 – Architecture	Rakesh	2d	Design
V-D02	Define API contracts for all 7 services — OpenAPI specs, error handling, versioning	0 – Architecture	Rakesh	1.5d	Design
V-D03	Design TensorFlow ecosystem integration — map TFX components to services, TF Serving format, TFDV/TFMA	0 – Architecture	Rakesh	1d	Design
V-S01	Scaffold Data Pipeline service — TFX ExampleGen + StatisticsGen, Snowflake adapter, TFDV schema	0 – Shell	Kedar	1.5d
V-S02	Scaffold Feature Service — TF Transform `preprocessing_fn`, feature store API, real-time endpoint	0 – Shell	Rene	1.5d
V-S03	Scaffold Training Pipeline service — TFX Trainer + tf.keras, Keras Tuner, training history	0 – Shell	Rene	1.5d
V-S04	Scaffold Model Management service — registry CRUD, SavedModel storage, lifecycle, version comparison	0 – Shell	Naoki	1d
V-S05	Scaffold Eval Service — TFMA integration, per-slice metrics, model blessing/rejection API	0 – Shell	Rene	1d
V-S06	Scaffold Evaluation Framework — A/B stat engine, winner detection, latency/revenue guardrails	0 – Shell	Naoki	1.5d
V-S07	Scaffold Experiment System — experiment CRUD, traffic splitting, variants, shadow mode orchestration	0 – Shell	Naoki	1.5d
V-S08	Set up shared TF dependencies — tensorflow, tfx, tf-transform, tfma, tfdv, keras-tuner + Docker base	0 – Shell	Kedar	0.5d
V-01	Filter Volaris transactions — date range, volume, monthly trend	1 – EDA	Kedar	0.5d
V-02	Per-processor approval rates (Worldpay, MIT, Elavon, Amex) by card type, currency, amount	1 – EDA	Kedar	1d
V-03	Retry pattern analysis — attempts per order, processor retry-to, 1st/2nd/3rd attempt success rates	1 – EDA	Kedar	1d
V-04	Explore `DYNAMIC_ROUTING_DETAIL` JSON — extract all keys and values	1 – EDA	Kedar	1d
V-05	Map Volaris routing rules from `VW_ROUTING_MERCHANT_RULE*` views	1 – EDA	Kedar	0.5d
V-06	Analyze smart routing log — algorithm types, skip rates, p95 latency baseline	1 – EDA	Kedar	0.5d
V-07	Hard vs. soft decline distribution by processor and error code	1 – EDA	Rene	1d
V-08	Profile airline-specific features — flight, passenger, booking window signal	1 – EDA	Rene	0.5d
V-09	A/B test sample size check — daily volume per processor ≥ 1000/variant in 7 days?	1 – EDA	Rene	0.5d
V-10	EDA summary report — approval rates, error taxonomy, processor share, correlations	1 – EDA	Rene + Rakesh	1d
V-11	Define Volaris feature schema — all features, types, sources, compute latency	2 – Features	Rene + Naoki	1d
V-12	Card-level features — BIN, brand, bank, type, country; historical approval rate per BIN × processor	2 – Features	Rene	1d
V-13	Transaction-level features — amount, currency, CIT/MIT, MCC, flight order type	2 – Features	Rene	1d
V-14	User-level features — RFM, fraud rate cohort, tenure, browsing signals	2 – Features	Rene	0.5d
V-15	Retry-context features — previous processor, error code, time since attempt, attempt number	2 – Features	Kedar	1d
V-16	Processor-state features — rolling approval/timeout/decline rate at 15-min, 1h, 24h windows	2 – Features	Kedar	1.5d
V-17	Amex hard-rule — always route Amex cards to Amex processor; bypass ML	2 – Features	Naoki	0.5d
V-18	Build training dataset — join features onto labeled outcomes; train/val/test split	2 – Features	Kedar	1d	Ready ✓ `ATHIA_TRAINING_DATASET` view + `feature_extractor.py`
V-19	Feature quality validation — nulls, skew, leakage risk, outcome correlation	2 – Features	Rene	1d	Ready ✓ `data_quality_validator.py` (834 lines)
V-20	Train `processor_selector` v1 — rank 4 PSPs by approval probability (tf.keras DNN)	3 – Models	Rene	2d	TF Training Pipeline service
V-21	Evaluate `processor_selector` — AUC, lift vs. static rules, per-processor accuracy, latency	3 – Models	Rene	1d	Ready ✓ Metrics auto-calculated by pipeline
V-22	Train `retry_predictor` v1 — predict retry approval probability	3 – Models	Rene	1.5d	Ready ✓ Training pipeline supports `retry_predictor` type
V-23	Train `retry_sequence` v1 — optimal processor order for retry	3 – Models	Rene	1.5d	Ready ✓ Training pipeline supports `retry_sequence` type
V-24	Evaluate retry models — success rate lift, processor fatigue patterns	3 – Models	Rene	1d	Ready ✓ Evaluation framework in pipeline
V-25	Architecture comparison — DNN vs. wide-and-deep vs. TF Decision Forests; select champion	3 – Models	Rene	1d	TF Replaces XGBoost vs. LR comparison
V-26	Inference latency test — all models under 50ms budget	3 – Models	Naoki	0.5d
V-27	Package models — serialize, write model card (schema, features, metrics)	3 – Models	Kedar	0.5d	Ready ✓ `model_registry.py` auto-creates tables + stores metadata
V-28	Define outage signal — timeout/error code thresholds for PSP-down detection	4 – P-01	Rakesh + Naoki	1d
V-29	Rolling processor health score — sliding 5–15 min window per PSP	4 – P-01	Naoki	1.5d
V-30	Failover logic — skip degraded PSP, route to next-best Volaris processor	4 – P-01	Naoki	1.5d
V-31	Recovery detection — 1–2% sampling of down PSP; auto-restore on consecutive wins	4 – P-01	Naoki	1d
V-32	Outage simulation tests — inject failures per PSP; verify failover + recovery	4 – P-01	Naoki	1d
V-33	Outage alerting — Slack/PagerDuty on PSP state changes	4 – P-01	Kedar	0.5d
V-34	Audit CIT/MIT usage for Volaris — current distribution across PSPs	5 – P-04	Kedar	0.5d
V-35	Approval delta by CIT vs MIT per processor — statistical test	5 – P-04	Rene	1d
V-36	Design message manipulation experiment — CIT/MIT × processor × card type matrix	5 – P-04	Rene + Rakesh	1d
V-37	Implement message recommendation API in athena-platform	5 – P-04	Naoki	2d
V-38	Run A/B test — approval rate with vs. without message recommendations	5 – P-04	Kedar	1d
V-39	Register `processor_selector` in MODEL_ARTIFACTS (version, Triton backend ref, feature schema)	6 – Integration	Naoki	0.3d	Ready ✓ `POST /api/v1/ml/models` (Triton branch ExperimentService)
V-40	Register `retry_predictor` + `retry_sequence` in MODEL_ARTIFACTS	6 – Integration	Naoki	0.3d	Ready ✓ Same — ExperimentService handles all 3 model types
V-41	Create Volaris-scoped experiment — merchant filter, 10% treatment split, shadow mode, guardrails	6 – Integration	Naoki	0.5d	Ready ✓ `POST /api/v1/ml/experiments` — variants + models in one call (Triton branch)
V-42	Validate experiment assignment — SHA256 bucketing determinism for Volaris	6 – Integration	Naoki	0.5d
V-43	API contract with Deuna engineering — define `POST /api/v1/ml/predict` request/response for Volaris	6 – Integration	Rakesh	1d
V-44	Deuna payment service integration — Deuna calls athena-platform at routing decision point	6 – Integration	Rakesh + Naoki	2d
V-45	End-to-end integration test — full flow: Deuna → athena-platform → model → ranked PSPs	6 – Integration	Naoki + Kedar	1d
V-46	Shadow mode — 48h logging without acting; compare predicted vs. actual outcomes	6 – Integration	Kedar	0.5d	Ready ✓ `is_shadow_mode=true` built-in (Triton branch); set up + monitor only
V-47	Confirm ATHIA_PREDICTIONS + ATHIA_FEEDBACK are live in Deuna's Snowflake	7 – Monitoring	Kedar	0.5d	Done ✓ Confirmed data live in Deuna's Snowflake (2026-02-24)
V-48	Deploy ATHIA_STAGE_OUTCOMES table in Snowflake	7 – Monitoring	Kedar	0.5d	Done ✓ Deployed in feat/ATH-0000 SQL
V-49	Volaris approval rate dashboard — daily/hourly per PSP vs. baseline	7 – Monitoring	Kedar	1d
V-50	Model performance dashboard — prediction confidence, rank accuracy, retry lift	7 – Monitoring	Kedar	1d
V-51	Define retraining trigger — approval rate drop or AUC drop thresholds	7 – Monitoring	Rene	0.5d	Ready ✓ `llm_training_orchestrator.py` makes RETRAIN_NOW / SCHEDULED / SKIP decisions
V-52	Schedule weekly retraining — auto-register new version from latest ATHIA_TRAINING_DATASET	7 – Monitoring	Rene	1d	Ready ✓ `training_pipeline.py` + orchestrator built; configure for Volaris cadence
V-53	Confirm auto-winner worker runs for Volaris experiment with correct guardrails	7 – Monitoring	Naoki	0.5d
V-54	Post-launch review — 2-week lift analysis: approval rate, outage response, retry success	7 – Monitoring	Rakesh	1d
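
Task V-42 above calls for validating that SHA256 bucketing is deterministic. The sketch below shows one common way such bucketing works, so the property being validated is concrete. It is illustrative only: the function name `assign_variant`, the `experiment_id:unit_id` key format, and the 10,000-bucket resolution are assumptions, not the ExperimentService implementation. The 10% default mirrors the treatment split in V-41.

```python
import hashlib


def assign_variant(experiment_id: str, unit_id: str, treatment_pct: float = 10.0) -> str:
    """Deterministically map a unit (e.g. an order ID) to 'treatment' or 'control'.

    Hypothetical helper: key format and bucket count are assumptions.
    """
    key = f"{experiment_id}:{unit_id}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    # Read the first 8 bytes as an unsigned integer and reduce to 10,000 buckets,
    # which gives 0.01% resolution on the split.
    bucket = int.from_bytes(digest[:8], "big") % 10_000
    return "treatment" if bucket < treatment_pct * 100 else "control"
```

Because the assignment depends only on the hashed key, the same order always lands in the same variant across processes and restarts. A determinism check can call the function twice per ID and compare, then confirm the observed treatment share is close to the configured split.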

💡Codebase Improvement Suggestions

What needs to change in both repos to reach production-grade quality

🔴 Critical — Do These First

Action	Repo	Effort
Build `retry_optimization_requested` stimulus — P-05 is entirely missing from LLM platform	DATA-Athena-Snowflake	3d
Complete Strategy Director — replace `exit()` placeholder & dummy ranker prompts	DATA-Athena-Snowflake	2d
Add tests for all 18 V2 REST handlers — 0% coverage on new API version	athena-platform	4d
Add tests for Bedrock client + Bedrock domain service — production-critical, currently excluded	athena-platform	1.5d
Add route, service & client layer tests — all 14 routes, 13 services, 3 clients at 0%	DATA-Athena-Snowflake	7d
Remove `internal/clients/` from coverage exclusions in CI	athena-platform	0.5d

Python / LangGraph DATA-Athena-Snowflake

Testing

✗ Unit test all route handlers with FastAPI TestClient + mocked services
✗ Unit test all 13 services — mock Snowflake sessions and clients
✗ Unit test core multi-agent framework: AgentWorkflow, AgentStrategy, node/edge composition
✗ Add per-branch tests for all 11 stimulus branches (mock LLM responses with fixtures)
✗ Unify CI into a single pytest run — replace fragmented per-domain workflows
✗ Enable pytest-cov with 60% minimum threshold enforced in CI

Architecture

✗ Circuit breaker in AgentWorkflow — isolate node failures, prevent cascade
✗ Enable OpenTelemetry tracing — already in codebase, just commented out
✗ Replace hardcoded thresholds (15% drop, 60–80 min windows) with configurable params
✗ Add LLM prompt injection guards — sanitize user inputs before system prompts
✗ Standardize tool definition — unify @create_tool vs. manual; add versioning
✗ Centralize config — replace scattered load_dotenv with Pydantic Settings schema
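
The circuit-breaker item in the list above can be sketched as follows. This is a minimal, framework-independent example: `CircuitBreaker`, `CircuitOpenError`, and the default thresholds are hypothetical, and how it would wrap an AgentWorkflow node is left open because that API is not shown here.

```python
import time
from typing import Any, Callable, Optional


class CircuitOpenError(RuntimeError):
    """Raised when a call is rejected because the breaker is open."""


class CircuitBreaker:
    """Stops calling a failing node for a cool-down period (hypothetical sketch)."""

    def __init__(self, max_failures: int = 3, reset_after_s: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_failures = max_failures
        self.reset_after_s = reset_after_s
        self._clock = clock  # injectable so tests can control time
        self._failures = 0
        self._opened_at: Optional[float] = None

    def call(self, fn: Callable[..., Any], *args: Any, **kwargs: Any) -> Any:
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_after_s:
                raise CircuitOpenError("node temporarily disabled")
            # Cool-down elapsed: allow one trial call (half-open state).
            # A single failure now reopens the breaker immediately.
            self._opened_at = None
            self._failures = self.max_failures - 1
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self._failures += 1
            if self._failures >= self.max_failures:
                self._opened_at = self._clock()
            raise
        self._failures = 0  # any success closes the breaker fully
        return result
```

One breaker instance per node keeps a failing node from consuming retries and latency budget in every downstream step, while the half-open trial lets it recover without a restart.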

Go / Gin athena-platform

Testing

✗ Add tests for all 18 V2 handlers — entire new API version at 0%
✗ Test Bedrock client & service — excluded from coverage, production-critical
✗ Raise CI threshold 20% → 60%; remove internal/clients/ exclusion
✗ Bootstrap integration test with testcontainers-go — verify DI graph
✗ Benchmark tests for /ml/predict, /feedback, experiment assignment
✗ Contract tests for Snowflake & Bedrock APIs — catch schema drift early

Architecture

✗ Event-driven model registry cache invalidation — remove 24h stale assignment risk
✗ Experiment context middleware — auto-propagate session/experiment IDs per request
✗ Abstract *gin.Context from controllers — transport-agnostic, easier to test
✗ Deploy ATHIA_STAGE_OUTCOMES + ATHIA_SESSION_SUMMARY Snowflake tables
✗ SageMaker model warm-up — cold starts can breach p95 < 200ms target
✗ Production Grafana dashboards + alerts — config exists locally, not deployed

Full Priority Order

Priority	Action	Repo	Effort
Critical	Build retry optimization stimulus (P-05)	DATA-Athena-Snowflake	3d
Critical	Complete Strategy Director matcher + ranker	DATA-Athena-Snowflake	2d
Critical	Add tests for all 18 V2 REST handlers	athena-platform	4d
Critical	Add tests for Bedrock client + service	athena-platform	1.5d
Critical	Add route, service & client tests (all at 0%)	DATA-Athena-Snowflake	7d
Critical	Remove `internal/clients/` from coverage exclusions	athena-platform	0.5d
High	Multi-agent framework + branch tests (11 branches)	DATA-Athena-Snowflake	6d
High	Circuit breaker in AgentWorkflow	DATA-Athena-Snowflake	2d
High	Enable OpenTelemetry tracing	DATA-Athena-Snowflake	1.5d
High	Raise CI coverage threshold to 60%	athena-platform	0.5d
High	Bootstrap integration test (testcontainers)	athena-platform	1.5d
High	Event-driven model registry cache invalidation	athena-platform	1.5d
High	Deploy ATHIA_STAGE_OUTCOMES + SESSION_SUMMARY tables	athena-platform	1d
High	SageMaker model warm-up (latency target risk)	athena-platform	1d
Medium	Adaptive thresholds (replace hardcoded values)	DATA-Athena-Snowflake	2d
Medium	Experiment context middleware	athena-platform	2d
Medium	Production Grafana dashboards + alert rules	athena-platform	2d
Medium	Benchmark tests for hot endpoints	athena-platform	1d
Medium	Unified CI test suite + coverage enforcement	DATA-Athena-Snowflake	1.5d

📝Daily Updates

Aidaptive engineering activity across both codebases

2026-06-11

Engineer	Repo	Commits
Rakesh	DATA-Athena-Snowflake	`0114867` Add BIN 546759 → MIT_BULK override; fix ECS poll; `36e2a2c` Add refresh_bin_overrides.py; `4227ebb` Add serving-time WORLDPAY overrides; `1ecc87c` Add MIT_BULK neg-weight; `e1fae15` Fix override detection; `965adc4` Revert Adyen processor merge (PR #1389)

Summary: BIN-level routing override automation — managing specific BINs causing mis-routing; reverted Adyen DNN head (needs more data)

2026-06-10

Engineer	Repo	Commits
Rakesh	DATA-Athena-Snowflake	`8fd7755` fix: update Snowflake account to VLTAXPW-YN70854

Summary: Migrated all scripts to new Snowflake instance VLTAXPW-YN70854

2026-06-09

Engineer	Repo	Commits
Rakesh	DATA-Athena-Snowflake	`7c07eae` feat(ATH-1272): BIN routing override config and automation script; `7433b34` fix: correct warehouse in routing scripts; `410e512` fix: PAYMENTS_ML warehouse in update_bin_routing_rules

Summary: BIN routing override automation wired up; warehouse config fixes across pipeline scripts

2026-06-08

Engineer	Repo	Commits
Rakesh	DATA-Athena-Snowflake	`96e572e` feat(ATH-1375): add Adyen as 5th processor head in Volaris DNN; `7eb8b4e` feat: persist BIN neg-weight config across retrains; `aec0094` test(ATH-1352): neg-weight config tests; `4a6af72` fix(ATH-1277): align Snowflake warehouse defaults
Rakesh	athena-platform	`c0cb96d` feat(ATH-1375): add Adyen BIN/brand approval rate fields to serving layer; `583522a` fix(ATH-1277): close DNN confidence gap — channel, timing, flight data

Summary: Added Adyen as 5th DNN processor head (training + serving); BIN neg-weight config persisted; DNN confidence gap closed with new features

2026-06-04

Engineer	Repo	Commits
Rakesh	DATA-Athena-Snowflake	`c5b40d9` fix(volaris): correct bin_processor_rates formula
Rakesh	athena-platform	`8abd1b4` test(ATH-1352): expand BIN rate penalty coverage to 26 tests

Summary: Formula fix for BIN processor rates; expanded serving-layer BIN penalty test coverage

2026-06-02

Engineer	Repo	Commits
Rakesh	DATA-Athena-Snowflake	`db2d8ea` fix(ATH-1277): write version.json to latest/ prefix on every promotion; `42a3887` feat(ATH-1277): restore automatic latest/ writes

Summary: Fixed version.json promotion so latest/ pointer always stays current on S3

2026-06-01

Engineer	Repo	Commits
Rakesh	DATA-Athena-Snowflake	`dde18a4` feat(training): BIN neg-weight boosting enabled by default for 481516/416916 → WORLDPAY; `e0dec88` feat(training): add BIN-level negative signal boosting for mis-routed processor pairs
Rakesh	athena-platform	`6691322` feat(ATH-1352): Thompson Sampling for Volaris smart router; `a74d8a5` feat(ATH-1352): serving-layer BIN approval rate penalty

Summary: Thompson Sampling merged to smart router; BIN approval rate penalty live in serving; BIN neg-weight training improvement for mis-routed WORLDPAY BINs
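
For readers unfamiliar with Thompson Sampling, here is a minimal Beta-Bernoulli sketch of the technique. It is not the athena-platform implementation (which is in Go and may combine sampling with DNN scores or BIN penalties); the class name `ThompsonRouter` and the uniform Beta(1, 1) prior are assumptions.

```python
import random
from typing import Dict, List, Optional


class ThompsonRouter:
    """Beta-Bernoulli Thompson Sampling over processors (illustrative sketch)."""

    def __init__(self, processors: List[str], seed: Optional[int] = None) -> None:
        # Beta(1, 1) prior: no initial preference between processors.
        self.alpha: Dict[str, float] = {p: 1.0 for p in processors}
        self.beta: Dict[str, float] = {p: 1.0 for p in processors}
        self._rng = random.Random(seed)

    def choose(self) -> str:
        # Draw one plausible approval rate per processor and pick the best draw.
        samples = {
            p: self._rng.betavariate(self.alpha[p], self.beta[p]) for p in self.alpha
        }
        return max(samples, key=samples.get)

    def update(self, processor: str, approved: bool) -> None:
        # Approvals increment alpha, declines increment beta.
        if approved:
            self.alpha[processor] += 1.0
        else:
            self.beta[processor] += 1.0
```

Processors with little data have wide posteriors and still get occasional traffic, while consistently approving processors are chosen more often as evidence accumulates. That exploration is what lets routing adapt when a processor's approval rate shifts.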

2026-06-08

Goal: Analyze Volaris DNN model performance in production; identify new processor impact; stabilize to best-performing model version.

Engineer	Work Done
Rakesh	Deep analysis of Volaris DNN transactions. Identified that Adyen was added as a new processor in mid-May, which the model has no training history for — model needs retraining once 12 weeks of Adyen data is available. Extensive comparison of model versions v54 vs v56 — v54 is the stronger performer. Coordinated with Jose to revert production serving to v54 and disabled daily automated training to enable controlled manual training cadence going forward.

2026-06-07

Goal: Audit BIN-level penalty configuration and improve model calibration for underperforming card BINs.

Engineer	Work Done
Rakesh	Audited all BIN-level penalty configurations and calibrated the model to correctly handle bad BINs (card BINs with historically low approval rates). This calibration work was the key driver behind v54 and its improved production performance.

2026-06-06

Goal: Build a dedicated transaction analysis tool for debugging Volaris DNN model performance.

Engineer	Work Done
Rakesh	Created a dedicated Transaction Explorer tool (transactions.html) — a live Snowflake-backed analyzer for inspecting individual transactions across the Deuna ML ecosystem. Displays all ML features fed to the DNN model, raw request/response payloads from the sidecar, payment attempt lifecycle (first try + retries), and feedback/outcome records. Supports sampling by processor, model version (DNN vs heuristic), and outcome correctness — especially useful for debugging Volaris DNN model performance discrepancies.

2026-06-05

Goal: Restore ML training pipelines after Snowflake migration; refresh all dashboards on new instance.

Engineer	Work Done
Rakesh	Fixed all ML training pipelines failing due to access issues post Snowflake migration to Ohio region (new instance: VLTAXPW-YN70854). Updated all scripts and config to point to new account/credentials. Verified pipeline connectivity and ran full metrics refresh — confirmed 884K attempts, 80.4% approval rate across all dashboards.

2026-06-04

Goal: Validate Volaris DNN model progression and debug why the BIN-level penalty is not taking effect in production.

Engineer	Work Done
Rakesh	Analyzed Volaris DNN model versions v43–v48 to confirm iterative improvement across versions. Investigated why the BIN-level penalty is not applying in production. Current hypothesis: the penalty config was loaded onto the sidecar rather than the serving path. Debugging in progress.

2026-06-03

Goal: Refresh all metrics dashboards with live Snowflake data; analyze MCO model readiness for general merchant rollout.

Engineer	Work Done
Rakesh	Ran full Snowflake refresh across metrics and merchant dashboards (822K attempts, 80.8% approval rate). Added daily $ lift chart for Volaris DNN vs Heuristic comparison. Fixed model comparison chart rendering (Chart.js scale ID issue). Analyzed MCO model to confirm it is ready for regular (non-Volaris) merchant MCO rollout — confirmed bias removal logic is still functioning correctly.

2026-05-31

Goal: Ensure training data quality and model promotion safety for Volaris DNN pipeline.

Engineer	Work Done
Rakesh	Deep analysis of data and schema — identified Elavon and Worldpay transactions stuck in "processing" status, polluting training data with ambiguous outcomes. Pinging Deuna team to resolve. Added quality metric gates to model promotion pipeline — model will not be promoted if it does not outperform the currently serving model in offline evaluation.
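
The promotion rule described above (do not promote unless the candidate beats the serving model offline) can be sketched as a small gate function. This is an illustration under assumptions: `EvalResult`, `promotion_decision`, and `min_rows` are hypothetical names and values; only the AUC ≥ 0.65 floor comes from the quality-gate list in the summary.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class EvalResult:
    version: str
    auc: float        # offline AUC on the shared evaluation set
    n_eval_rows: int  # rows used for the evaluation


def promotion_decision(candidate: EvalResult, champion: EvalResult,
                       min_auc: float = 0.65, min_rows: int = 1000,
                       min_gain: float = 0.0) -> Tuple[bool, List[str]]:
    """Return (promote?, reasons for blocking). min_rows is a hypothetical default."""
    reasons: List[str] = []
    if candidate.n_eval_rows < min_rows:
        reasons.append(f"only {candidate.n_eval_rows} eval rows (< {min_rows})")
    if candidate.auc < min_auc:
        reasons.append(f"AUC {candidate.auc:.3f} below floor {min_auc:.2f}")
    # Regression gate: the candidate must strictly beat the serving model.
    if candidate.auc <= champion.auc + min_gain:
        reasons.append(
            f"AUC {candidate.auc:.3f} does not beat {champion.version} "
            f"({champion.auc:.3f})"
        )
    return (not reasons, reasons)
```

Returning the blocking reasons alongside the decision makes a skipped promotion easy to explain in logs or alerts. Both models need to be scored on the same held-out data for the comparison to be meaningful.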

2026-05-28

Goal: Ship serving change for MCO model on Volaris transactions.

Engineer	Work Done
Rakesh	Added serving change for calling MCO model for Volaris transactions using flight data. Verified MCO model shadow mode for Volaris transactions — fixing serving code.

2026-05-27

Goal: Deep-dive Volaris and MCO metrics, resolve sampling bias in model comparison.

Engineer	Work Done
Rakesh	Added detailed analysis for Volaris and MCO on metrics dashboard and scripts. Extensive analysis to identify and resolve sampling bias in DNN vs heuristic comparison — implemented stratified amount-bin matching. Audited MCO for Volaris and identified serving change needed to route Volaris transactions through MCO model using flight data.
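
To illustrate why amount-bin stratification removes sampling bias, the sketch below compares approval rates per amount bin and then weights every arm by the same pooled bin mix. This is one standard way to do it (direct standardization) and is an assumption about the approach; the function name `stratified_rates`, the tuple layout, and the bin edges are hypothetical.

```python
from bisect import bisect_right
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Txn = Tuple[str, float, bool]  # (arm, amount, approved), hypothetical layout


def stratified_rates(txns: Iterable[Txn], bin_edges: List[float]) -> Dict[str, float]:
    """Approval rate per arm, re-weighted to a common amount-bin distribution."""
    # counts[bin][arm] = [approved, total]
    counts = defaultdict(lambda: defaultdict(lambda: [0, 0]))
    for arm, amount, approved in txns:
        cell = counts[bisect_right(bin_edges, amount)][arm]
        cell[0] += int(approved)
        cell[1] += 1
    arms = {arm for per_arm in counts.values() for arm in per_arm}
    # Keep only bins where every arm has data, so each comparison is like-for-like.
    shared = [per_arm for per_arm in counts.values() if all(a in per_arm for a in arms)]
    total = sum(cell[1] for per_arm in shared for cell in per_arm.values())
    if total == 0:
        return {}
    rates: Dict[str, float] = {}
    for arm in arms:
        # Weight each bin by its pooled share of traffic, identical for all arms.
        rates[arm] = sum(
            (sum(cell[1] for cell in per_arm.values()) / total)
            * (per_arm[arm][0] / per_arm[arm][1])
            for per_arm in shared
        )
    return rates
```

If one arm happens to receive mostly low-value transactions, which tend to approve more often, its raw approval rate looks better even when per-bin performance is identical. Weighting both arms by the same bin mix removes that artifact.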

2026-05-25

Goal: Investigate and fix model version logging issue affecting dashboards.

Engineer	Work Done
Rakesh	Identified bug where "latest" was being logged as the model version instead of the actual version (e.g. v35) — causing confusion in dashboards.

2026-05-24

Goal: Validate best Volaris DNN model version and explore simulation options.

Engineer	Work Done
Rakesh	Did extensive analysis to confirm that v34 is the best model for Volaris DNN. Explored offline simulation but determined it would be tricky due to lack of counterfactuals.

2026-05-20

Goal: Quality improvements to Volaris DNN model to close gap vs control.

Engineer	Work Done
Rakesh	Analyzed and added multiple minor quality enhancements to Volaris DNN model — getting close to consistently beating control, even if by a small margin.

2026-05-19

Goal: Temperature-based calibration for Volaris DNN + bias analysis on control/experiment groups.

Engineer	Work Done
Rakesh	Added temperature-based calibration to Volaris DNN training model along with serving changes to use calibration. Analyzed biases in Volaris control and experiment groups.
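
As a reference for what temperature-based calibration typically involves, here is a minimal sketch of standard temperature scaling for a binary output: divide the logit by a scalar T fitted on held-out data to minimize log loss. Whether the Volaris DNN uses exactly this form (for example one T per processor head) is an assumption; the function names and the grid-search bounds are hypothetical.

```python
import math
from typing import Sequence


def sigmoid(z: float) -> float:
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def nll(logits: Sequence[float], labels: Sequence[int], temperature: float) -> float:
    """Mean negative log-likelihood of labels under sigmoid(logit / T)."""
    eps = 1e-12
    total = 0.0
    for z, y in zip(logits, labels):
        p = min(max(sigmoid(z / temperature), eps), 1.0 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(logits)


def fit_temperature(logits: Sequence[float], labels: Sequence[int],
                    lo: float = 0.05, hi: float = 20.0, steps: int = 200) -> float:
    """Pick T from a log-spaced grid that minimizes validation NLL."""
    best_t, best_loss = 1.0, float("inf")
    for i in range(steps + 1):
        t = lo * (hi / lo) ** (i / steps)
        loss = nll(logits, labels, t)
        if loss < best_loss:
            best_t, best_loss = t, loss
    return best_t
```

At serving time the same T is applied before the sigmoid, which is why a calibration change needs matching training and serving changes. T > 1 softens overconfident scores and T < 1 sharpens underconfident ones; the ranking within a single head is unchanged.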

2026-05-18

Goal: Fix non-performing BINs via override rules; explore confidence score + temperature approach for model fallback.

Engineer	Work Done
Rakesh	Continued analyzing Volaris DNN model performance. Routed non-performing BINs across Visa and MC through ad hoc override rules. Added detailed analysis to ticket ATH-1272. Next: tune the model to use confidence scores and temperature-based analysis, exploring per-processor confidence scaling with a threshold that falls back to heuristics when the model is not confident.
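
The fallback idea being explored above can be sketched as follows. This is a hypothetical illustration, not the serving code: the function name `route`, the multiplicative per-processor `scale`, and the 0.6 threshold are all assumptions chosen for the example.

```python
from typing import Dict, List, Optional, Tuple


def route(scores: Dict[str, float], heuristic_order: List[str],
          scale: Optional[Dict[str, float]] = None,
          min_confidence: float = 0.6) -> Tuple[List[str], str]:
    """Return (processor order, source), where source is 'dnn' or 'heuristic'."""
    scale = scale or {}
    # Per-processor confidence scaling: a factor below 1.0 discounts a head
    # whose scores are known to run too high.
    adjusted = {p: s * scale.get(p, 1.0) for p, s in scores.items()}
    ranked = sorted(adjusted, key=adjusted.get, reverse=True)
    # Fall back when there are no scores or the best score is not confident enough.
    if not ranked or adjusted[ranked[0]] < min_confidence:
        return list(heuristic_order), "heuristic"
    return ranked, "dnn"
```

Returning the source alongside the order lets dashboards split outcomes by whether the model or the heuristic made the decision, which is needed to evaluate any chosen threshold.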

2026-05-17

Goal: Fix underperforming BIN ranges in Volaris DNN and submit override rules PR.

Engineer	Work Done
Rakesh	Analyzed Volaris DNN model performance for specific BIN ranges where it is underperforming. Added ad hoc rules to override model results in those failing BIN ranges. Submitted the PR, scheduled to go out Monday.

2026-05-14

Goal: Extend MCO models for airline tasks and fix GPU OOM in training pipeline.

Engineer	Work Done
Rakesh	Added capability to MCO models to handle airline tasks (Volaris-specific flows). Fixed MCO model training pipeline so that GPU doesn't go OOM — training now finishes and generates a model successfully. Deep analysis on Volaris routing model performance — found niche error where 3 specific MC BIN numbers were 100% failing; added BIN-related features to training pipeline and trained a new model.

2026-05-13

Milestone: All done — latest model trained and deployed in prod, no more known issues.

Engineer	Work Done
Rakesh	Worked overnight to repair the Volaris DNN pipeline, which had stopped retraining daily on the latest data; now fully resolved. Fixed the ClearML integration issue and ran a full training run end-to-end. Loaded the model from S3 to EFS for serving. Added an alert for when the daily pipeline run does not happen, and a scheduler that reloads the latest model version at 7am PST every day so production always serves the latest model. Also fixed the MCO training pipeline, which was failing. Verified everything end-to-end: latest model trained and deployed in prod, no known issues remaining, and Volaris routing is back to break-even.

2026-05-12

Goal: Debug MCO shadow mode and connect remaining model heads to production.

Engineer	Work Done
Rakesh	Debugged the MCO model in shadow mode to confirm everything is hooked up and working correctly. Added the last 2 remaining heads of the MCO model; with everything connected, all heads are now called in production.

2026-05-04

Goal: Confirm DNN back in production and validate data distribution matches training.

Engineer	Work Done
Rakesh	DNN Volaris model is back up and running in production, handling live traffic with correct data as expected. Full analysis completed confirming everything is hooked up and connected end-to-end. Data distribution matches what the model was trained on, a strong signal that the model should perform well for Volaris. Added a serving change to handle the other heads of the MCO model so payment services can use it for any flow.

2026-05-03

Goal: Add MCO dashboard and verify full system connectivity.

Engineer	Work Done
Rakesh	Added MCO dashboard for monitoring. Completed full analysis confirming all components are hooked up and connected end-to-end.

2026-04-24

Goal: Get DNN model fully functional in production with 1-year training data.

Engineer	Work Done
Rakesh	DNN model finally enabled in production — fixed many bugs so that production DNN functions correctly. Training with 1 year of data now succeeds. DNN serving live traffic alongside LogReg in A/B experiment.

2026-04-23

Goal: Monitor DNN model in production, fix bugs, retrain with 1 year of data.

Engineer	Work Done
Rakesh	Added model comparison graphs to metrics dashboard (DNN vs LogReg vs Heuristic — accuracy, approval rate, latency, volume). DNN model went to production yesterday. Monitoring production DNN, fixed one bug in pipeline. Retraining with 1 year of data — will load to production once training completes.

2026-04-19

Milestone: All productionization complete — 2 models running daily in dev and prod.

Engineer	Work Done
Rakesh	All productionization complete. 2 models (LogReg + DNN) for Volaris smart routing running in production. Continuous daily training from Snowflake data. GPU-accelerated (g6.2xlarge NVIDIA L4). Full CI/CD (GitHub Actions + CodePipeline + CodeDeploy). 5-gate quality suite. Deep ClearML integration. 13 of 14 gaps closed (~101d of ~104.5d saved). Only lineage tracking polish remaining (~3.5d).

2026-04-15

Goal: Have both Log Reg and DNN pipelines running regularly in dev and prod, pushing models to S3 buckets.

Engineer	Work Done
Rakesh	Changed training pipelines to use GPU instances. Implemented deeper ClearML integration in training pipeline — extracting detailed metrics from each training run into ClearML for monitoring and debugging.

2026-04-14

Goal: Get ClearML integration working in dev and deploy DNN pipeline end-to-end.

Engineer	Work Done
Rakesh	Spent ~30 hours debugging dev servers with ClearML integration — blocked by access issues. Worked with team to resolve access, dev pipeline now working correctly. Attempted DNN pipeline deployment — ran for 3 hours and failed. Long turnaround time makes iteration impractical without GPU.

Blockers:

GPU needed — training requires GPU instance; CPU-based runs too slow for practical iteration (3hr+ per attempt)
AWS access for Rakesh — need direct access to AWS resources (console/CLI) to debug and iterate efficiently

2026-04-11

Goal: Have both model techniques (DNN and Log Reg) running e2e daily, pushing models in dev and prod.

Engineer	Work Done
Rakesh	Implemented the DNN pipeline Terraform and started deploying it in dev. PRs still need to be submitted and merged to main/qa so the pipelines can run via CI/CD.

2026-04-10

Goal: Have end to end training pipeline for TFX running on AWS and integrated into Deuna stack for daily run.

Engineer	Work Done
Rakesh	Debugging why ClearML is not registering all metrics from each training run. Next: deploy DNN model and fix production integration to ensure every training run model reaches production automatically

2026-04-07

Goal: Have end to end training pipeline for TFX running on AWS and integrated into Deuna stack for daily run.

Engineer	Work Done
Rakesh	Fixing Volaris training pipeline run in dev environment. Helping team with prod deployment

2026-04-05

Goal: Both training pipelines (dolphin SageMaker + Volaris smart router) running daily with ClearML tracking.

Engineer	Work Done
Rakesh	Deployed e2e SageMaker dolphin pipeline (PreprocessData ✅, TrainModel ✅, EvaluateModel fix pushed). Deployed Volaris smart router daily training infra. ClearML integration confirmed working. Implemented Volaris pipeline CLI with Snowflake connection. Added 13 tests for setup_pipeline. Fixed multiple SageMaker issues: pipeline name mismatch, model.save(), RegisterModel, FrameworkProcessor for eval.

Blockers:

PR #1132 (DATA-Athena-Snowflake) needs approval to merge to qa — blocks CodePipeline deployment
CodeDeploy DeployEC2 stage failing — scripts need debugging after merge
SSO lacks sagemaker:CreatePipeline for local runs

Pending PRs:

#1132 DATA-Athena-Snowflake → qa (ClearML, Lambda handler, SageMaker fixes)
#26 terraform-athia → main (Dolphin SageMaker pipeline infra)
#27 terraform-athia → main (Volaris daily training infra)

2026-04-04

Goal: Have end to end training pipeline for TFX running on AWS and integrated into Deuna stack for daily run.

Engineer	Work Done
Rakesh	Deployed e2e SageMaker training pipeline via Spacelift. Consolidated PRs #24+#25 → #26. Fixed Spacelift project_root, cleaned orphaned state. Enabled daily training for both pipelines. Set up ClearML creds in Secrets Manager + EC2. Created Spacelift stack for Volaris. Renamed model-artifacts → volaris-model-artifacts.

2026-04-02

Goal: Have end to end training pipeline for TFX running on AWS and integrated into Deuna stack for daily run.

Engineer	Work Done
Rakesh	Fighting Terraform and Spacelift configuration issues. ClearML successfully deployed in dev environment

2026-04-01

Goal: Have end to end training pipeline for TFX running on AWS and integrated into Deuna stack for daily run.

Engineer	Work Done
Rakesh	Deployed ClearML using Terraform and Spacelift in dev environment and handed over to team. Full data analysis of Snowflake data with Rene — generated list of suggestions for team, published at metrics dashboard
Rene	Full data analysis of Snowflake data with Rakesh — generated list of suggestions for team

2026-03-31

Goal: Deploy entire training platform using TF and Spacelift.

Engineer	Work Done
Rakesh	Deploying entire training platform end to end. Learning Spacelift for infra deployment now that access is granted. Deploying ClearML to monitor all training. Wrote an analysis script that monitors model performance and feeds the metrics dashboard directly from Snowflake, which makes analysis and insight generation easier.
Rene	Analyzing model performance with past data. Waiting for experiment to be enabled again — current data not significant enough

2026-03-30

Goal: Have clear idea of metrics for measuring model performance and deploy entire training platform.

Engineer	Work Done
Rakesh	Analyzing model performance metrics, working on DNN model optimization for Volaris smart router

2026-03-28

Goal: Train one more DNN model for Volaris smart router.

Engineer	Work Done
Rakesh	Training DNN model for Volaris smart router — also serves as example on how to use the TFX pipeline
Rene	Data analysis to identify patterns based on Volaris data

2026-03-27

Goal: Get TF LR model in production.

Engineer	Work Done
Rakesh	Verified everything working in production. Helping with questions about the model. Double-checked traffic ramp-up and analyzing how to do post-launch analysis

2026-03-26

Goal: Integrate new model in serving stack now that everything is working end to end.

Engineer	Work Done
Rakesh	Integrated model in serving stack with sidecar approach, loading models from S3/EFS. Removed all parallel serving libraries no longer needed in training directory

2026-03-25

Goal: Deploy everything in production for Volaris to get real data flowing.

Engineer	Work Done
Rene	Created and incorporated AMEX model, added to the serving mix
Rakesh	Built AMEX model with Rene. Working with Deuna team on deploying everything in production for Volaris

2026-03-24

Goal: Have TF model up and running in production integrated with the Go API.

Engineer	Work Done
Rakesh	Analyzing code with Naoki to determine serving approach — sidecar vs deploying servomatic. Building production flow to save and load models from S3 & deploying servomatic binary
Naoki	Evaluating sidecar vs servomatic deployment for integrating TF model with the Go API
Rene	Continuing to iterate on the TF model and data analysis. Converting LR model to TF format for use in servomatic binary for online eval

2026-03-23

Goal: Hook this model into production to have everything connected end to end.

Engineer	Work Done
Rene	First regression model trained on Volaris data — evaluating quality in offline mode, initial results look promising
Rakesh	Continuing to analyze approach to serve the model as servomatic with Naoki
Naoki	Analyzing serving architecture to connect trained model via servomatic in production

2026-03-20

Goal: Have one model trained on Volaris data.

Engineer	Work Done
Rene	Iterating on data shape analysis and building first model version
Rakesh	Working with Rene on first model; analyzing serving code with Naoki to plan production integration
Naoki	Analyzing serving code with Rakesh to determine how to connect model in production

2026-03-19

Goal: Have first model ready at least in offline mode in the coming days.

Engineer	Work Done
Rakesh	First analysis of the data with Rene — analyzing best approach to build processor selector model
Rene	Started on first model based on current understanding of data and features
Naoki	Working with Rene on integrating S3 file loading into TFX data loader for e2e training and eval

Blockers: None — have a few questions to confirm our understanding, will batch and ask together

2026-03-18

Goal: Have first ML model for selecting the right processor for every Volaris transaction.

Engineer	Work Done
Rene	Looking at data shape for Volaris to train first processor selector model
Kedar	Working on data pipeline
Naoki	Continues setting up good practices (code quality, CI, testing patterns)
Rakesh	Writing smartrouter service

Blockers cleared: AWS access granted · Code review done, all code merged to qa

2026-03-16

Goal: Submit everything and build data pipeline to extract Volaris data from Snowflake.

Engineer	Work Done
Rakesh	Iterated on experiment and metrics framework to make everything work locally and in tests
Naoki	Iterated on improving code and ramping up
Kedar	Looking at feature extraction from Snowflake
Rene	Working on simple first model

Blockers: PR review pending · AWS access for POC server (from 2026-03-13)

2026-03-15

Goal: Get the training platform implemented and in shape, ready for feature engineering and training of the Volaris model.

Engineer	Work Done
Rakesh	Added Evaluation Service (uses Model Service + Feature Service to evaluate TensorFlow models). Added e2e tests for all 3 services. Added experiment and metrics framework to track all training pipelines. Demo training pipeline working end to end. PR waiting for review

Blockers: PR review pending · AWS access for POC server (from 2026-03-13)

2026-03-14

Engineer	Work Done
Rakesh	Built foundational services: Model Service and Feature Service, with tests and scaffolding to support TensorFlow-trained models

2026-03-13

Goal: Get tests running regularly and deployments pushed to the dev server automatically, while getting comfortable with the current stack.

Engineer	Work Done
Naoki	Fixed broken tests to get everything running locally. Looking at setting up automated deployment in dev environment for services
Kedar	Got repo and environment access figured out. Looking into Snowflake data schema
Rene	Got repo and environment access figured out. Looking at training pipeline code
Rakesh	Updated deuna.aidaptive.com with latest repo analysis and refreshed task list. Synced athena-platform (v0.15.5, Triton merged)

Blocker: Waiting for AWS access to deploy on POC server

❓Open Questions

Items that need answers before effort estimates are finalized

Are ATHIA_PREDICTIONS / ATHIA_FEEDBACK tables populated in Deuna's Snowflake today? Confirmed ✓ 2026-02-24

Confirmed — data is live in Deuna's Snowflake (verified 2026-02-24).

Are SageMaker endpoints live for processor_selector / retry_predictor?

Or are they placeholders only? — Rakesh to confirm

Is there a live model in MODEL_ARTIFACTS that Deuna's payment service is calling today?

Rakesh to confirm

What is the current payment volume through the routing engine?

Minimum 1,000 transactions per variant needed for A/B test statistical validity — Ask Israel
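
As a rough sanity check on the 1,000-per-variant figure above, the sketch below applies the standard two-proportion sample size formula. The baseline acceptance rate (0.80), the lift to detect (0.05), the 5% significance level, and the 80% power are illustrative assumptions, not values taken from Deuna's data, and the function name is hypothetical.

```python
import math
from statistics import NormalDist


def transactions_per_variant(p_control, p_treatment, alpha=0.05, power=0.80):
    """Approximate per-variant sample size for a two-sided two-proportion z-test."""
    if p_control == p_treatment:
        raise ValueError("acceptance rates must differ to size the test")
    # Critical values of the standard normal for the chosen alpha and power.
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    # Sum of the Bernoulli variances of the two arms.
    variance = p_control * (1 - p_control) + p_treatment * (1 - p_treatment)
    effect = p_treatment - p_control
    return math.ceil((z_alpha + z_power) ** 2 * variance / effect ** 2)


# Hypothetical rates: 80% acceptance on the control route, 85% on the variant.
n = transactions_per_variant(0.80, 0.85)
print(f"transactions needed per variant: {n}")
```

Under these assumed inputs the requirement comes out in the same range as the 1,000-per-variant minimum. Smaller lifts need substantially more traffic, which is why the actual routing volume matters for planning the test.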

Who owns the athena-platform Go repo deployments?

Aidaptive or Deuna infra? Affects Phase 1 deployment planning — Clarify with Pablo

When will `feature/llm-driven-ml-training` (Triton IS) merge to main? New

This PR closes G-06 and defines the production model serving backend (Triton vs. SageMaker). Its merge timeline directly sets the Phase 6 integration schedule; ask Pablo. Per the Decisions Log, the branch merged to main on 2026-03-13.

🔑Access & Blockers

Pending provisioning items

Item	Owner	Status
Snowflake access — Rakesh	Israel (Deuna)	✓ Done (2026-02-18)
Snowflake access — Naoki	Rakesh + Naoki	✓ Done (2026-02-19)
Code / repo access — Rakesh	Pablo (Deuna)	✓ Done (2026-02-19)
Claude / LLM access & budget	Pablo → Farhan	✓ Done (2026-02-19)
Code / repo access — Naoki	TBD	✓ Done
Deuna corp accounts — Rakesh & Naoki	TBD	Pending
Claude Code credits — Rakesh & Naoki	—	Not needed
Deploy ATHIA_STAGE_OUTCOMES + ATHIA_SESSION_SUMMARY in Snowflake	Rakesh	✓ Done (feat/ATH-0000)
Build retry_optimization_requested workflow	Rakesh	Pending

Quick Links

Service	URL	Details
AWS Console (SSO)	deunaio.awsapps.com	Deuna AWS account access
Snowflake	vltaxpw-rmontes.snowflakecomputing.com	Account: `VLTAXPW-RMONTES` · DB: `PAYMENT_ML` · Warehouse: `PAYMENT_ML` · Read-only
Athia Experiments Dashboard	insights.deuna.com	Model performance data for processor selector experiments
ClearML (Prod)	athia-ml.deuna.io	ML experiment tracking & training monitoring — production
ClearML (Dev)	athia-ml.dev.deuna.io	ML experiment tracking & training monitoring — dev environment
Spacelift	duna-e-commmerce.app.spacelift.io	Infrastructure governance & Terraform deployment
Terraform Repo	github.com/DUNA-E-Commmerce/terraform-athia	All Athia infrastructure as code

Development Rules

Rule	Details
AWS Resource Tags	All AWS resources must include: `CreatedBy=aidaptive`, `ServiceName=smartrouter`, `Environment=POC`
Infrastructure as Code	All infrastructure via Terraform only — no manual AWS console resource creation
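
The tagging rule above lends itself to an automated check. The following is a minimal sketch, assuming resource tags are available as a plain dictionary (for example, parsed from a Terraform plan or an AWS API response); the function name and the sample resource are hypothetical, and only the three tag values come from the rule itself.

```python
# Required tags, taken from the Development Rules table.
REQUIRED_TAGS = {
    "CreatedBy": "aidaptive",
    "ServiceName": "smartrouter",
    "Environment": "POC",
}


def tag_violations(resource_tags):
    """Return a list of human-readable problems; an empty list means compliant."""
    problems = []
    for key, expected in REQUIRED_TAGS.items():
        actual = resource_tags.get(key)
        if actual is None:
            problems.append(f"missing tag {key}={expected}")
        elif actual != expected:
            problems.append(f"tag {key} is {actual!r}, expected {expected!r}")
    return problems


# Hypothetical resource that is missing the Environment tag.
example = {"CreatedBy": "aidaptive", "ServiceName": "smartrouter"}
for problem in tag_violations(example):
    print(problem)
```

A check like this could run in CI against planned resources. Within Terraform, the AWS provider's `default_tags` block is one way to apply the same tags to every resource without repeating them.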

Decisions Log

Date	Decision	Rationale	Made By
2026-04-19	All productionization complete — 2 models (LogReg + DNN) running daily in dev + prod	GPU-accelerated training, full CI/CD, 5-gate quality suite, ClearML tracking, S3 versioned storage. 13/14 gaps closed (~101d saved)	Rakesh
2026-03-24	Serving migrated from DATA-Athena-Snowflake to athia-model-server sidecar in athena-platform	Clean separation — training repo (Python) vs serving repo (Go + Python sidecar)	Rakesh
2026-03-13	Adopted TensorFlow ecosystem (TFX, TF Serving, TFDV, TFMA, TF Transform) for all ML work	Replaces Snowflake ML / XGBoost / scikit-learn. Unified training → validation → serving pipeline with production-grade tooling	Rakesh
2026-03-13	Added Phase 0 — 7 service shells + 3 design tasks (Rakesh) before Volaris feature work	Service architecture: Data Pipelines, Feature Service, Training Pipelines, Model Mgmt, Eval Service, Evaluation Framework, Experiment System	Rakesh
2026-03-13	Both repos switched to main branch — feat/ATH-0000 and Triton branches both merged	All ML training pipeline and Triton serving code now on main; no more feature branch tracking needed	Rakesh
2026-03-13	Triton branch merged to main — confirms deployment architecture	feature/llm-driven-ml-training merged; Triton IS, ExperimentService, shadow mode now in production codebase	Deuna Engineering
2026-02-26	athena-platform `feature/llm-driven-ml-training` (Triton IS) identified as the production model serving path	Triton IS + shadow mode + ExperimentService provides complete training→serving pipeline; replaces manual SageMaker endpoint registration; closes G-06	Rakesh
2026-02-20	Repo analysis scoped to branch `feat/ATH-0000-athia-ml-llm-schema-discovery` (not main)	This branch contains the active ML platform development; main does not reflect current capabilities	Pablo
2026-02-19	Phase 1 target merchant set to Volaris (not Cinépolis)	Volaris has known PSPs (Worldpay ID:76, MIT ID:85, Elavon, Amex); Cinépolis only shows Cybersource gateway — processor unknown	Mark Walick
2026-02-18	Latency target updated: p95 <50ms → p95 <200ms	Revised from original SOW spec	Rakesh (w/ Pablo)
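
The p95 <200ms latency target recorded in the log above can be checked against a batch of request latencies with a nearest-rank percentile. A minimal sketch follows; the function names and the sample values are hypothetical, and how latencies are actually collected in the serving stack is not specified here.

```python
import math


def percentile_nearest_rank(samples, pct):
    """Nearest-rank percentile: smallest value with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("need at least one sample")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


def meets_latency_target(latencies_ms, target_ms=200.0):
    """True when the p95 of the observed latencies is below the target."""
    return percentile_nearest_rank(latencies_ms, 95) < target_ms


# Hypothetical sample: most requests are fast, a few are slow outliers.
observed_ms = [42.0] * 96 + [350.0] * 4
print(percentile_nearest_rank(observed_ms, 95), meets_latency_target(observed_ms))
```

Nearest-rank is used here because it always returns an observed value and needs no interpolation; a monitoring system may compute percentiles differently.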