Production RAG · Measured & regression-gated · EU-sovereign

Production GenAI is proven, not just demoed.

I build RAG and GenAI systems held to a production bar: retrieval measured against an expert-validated gold set, regression-gated, deterministic and auditable by design. When the data can't leave European jurisdiction, the stack stays sovereign end to end — self-hosted models, EU infrastructure, no US service in the path.

Try the sovereign demo · See the demo's retrieval, measured

78% Recall@5 · enterprise gold set

0.66 MRR · enterprise gold set

5 pt regression gate

1B+ records · production ETL

Work

Selected projects

Production RAG Knowledge Pipeline

rag · aws-bedrock · vector-search · evaluation · data-engineering

End-to-end knowledge and retrieval layer for a production RAG assistant — a deterministic, rule-based ingestion and chunking pipeline (no ML, no LLM), structure-aware indexing for AWS Bedrock, two-stage retrieval with reranking, and a four-pillar evaluation harness.

78% Recall@5 (n=56)

0.66 MRR

~800 pages indexed

View details →
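
As a rough illustration of what "measured and regression-gated" can mean in practice, the sketch below computes Recall@5 and MRR against a gold set and fails a candidate run if either metric drops by more than 5 percentage points. It is not the project's actual harness: the function names, the data shapes, and the reading of "5 pt" as a maximum drop in percentage points are all assumptions made for the example.

```python
from typing import Dict, List, Set


def recall_at_k(retrieved: List[str], relevant: Set[str], k: int = 5) -> float:
    """Fraction of the relevant document IDs found in the top-k results."""
    if not relevant:
        return 0.0
    return len(set(retrieved[:k]) & relevant) / len(relevant)


def reciprocal_rank(retrieved: List[str], relevant: Set[str]) -> float:
    """1 / rank of the first relevant result, or 0.0 if none was retrieved."""
    for rank, doc_id in enumerate(retrieved, start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0


def evaluate(runs: Dict[str, List[str]], gold: Dict[str, Set[str]], k: int = 5) -> Dict[str, float]:
    """Average both metrics over every query in the gold set."""
    n = len(gold)
    recall = sum(recall_at_k(runs.get(q, []), rel, k) for q, rel in gold.items()) / n
    mrr = sum(reciprocal_rank(runs.get(q, []), rel) for q, rel in gold.items()) / n
    return {"recall_at_k": recall, "mrr": mrr}


def passes_gate(baseline: Dict[str, float], candidate: Dict[str, float],
                max_drop_points: float = 5.0) -> bool:
    """Assumed gate: no metric may fall more than max_drop_points percentage points."""
    return all((baseline[m] - candidate[m]) * 100 <= max_drop_points for m in baseline)


# Hypothetical two-query gold set and retrieval run.
gold = {"q1": {"doc-a"}, "q2": {"doc-b"}}
runs = {"q1": ["doc-a", "doc-x"], "q2": ["doc-x", "doc-b"]}
scores = evaluate(runs, gold)
```

A gate like this would typically run in CI on every change to chunking, indexing, or reranking, so a regression blocks the merge instead of surfacing in production.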

EU-Sovereign Agentic Coding Environment

llm-ops · multi-agent · litellm · eu-sovereign

A controllable, EU-sovereign environment for agentic coding: LiteLLM tier-routing across on-device and EU-managed providers (any OpenAI-compatible backend swappable in), an adversarial reasoning/execution gate, and a single gateway guardrail.

5 model tiers

1 gateway guardrail

View details →

Bidirectional SAP-HANA Warehouse Engine

sap-hana · code-generation · compiler · testing

A spec-driven engine that generates SAP-HANA warehouse objects from YAML and parses existing ones back — bound by a shared intermediate representation and byte-stable roundtrip tests.

~24k LOC

453 tests

byte-stable roundtrips

View details →

Governance Framework for AI-Agent Delivery

ai-governance · workflow-dsl · determinism · audit

A deterministic framework that makes AI-agent software delivery reproducible, auditable, and approval-gated — a no-LLM kernel, single-shot agents, enforced invariants, and a tamper-evident audit log.

~14k LOC

132 tests

View details →
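
One common way to make an audit log tamper-evident is a hash chain, where each entry commits to the hash of the one before it. The sketch below shows that general technique only; it is not taken from the framework itself, and the entry layout, the `GENESIS` value, and the function names are assumptions made for illustration.

```python
import hashlib
import json
from typing import Any, Dict, List

# Assumed placeholder "previous hash" for the first entry in the chain.
GENESIS = "0" * 64


def _digest(prev_hash: str, payload: Dict[str, Any]) -> str:
    """SHA-256 over the previous hash plus a canonical JSON form of the payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_entry(log: List[Dict[str, Any]], payload: Dict[str, Any]) -> None:
    """Append a payload, linking it to the hash of the previous entry."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append({"payload": payload, "prev": prev_hash, "hash": _digest(prev_hash, payload)})


def verify(log: List[Dict[str, Any]]) -> bool:
    """Recompute the chain; any edited, removed, or reordered entry breaks it."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev"] != prev_hash or entry["hash"] != _digest(prev_hash, entry["payload"]):
            return False
        prev_hash = entry["hash"]
    return True


# Hypothetical entries: an agent step followed by a human approval.
log: List[Dict[str, Any]] = []
append_entry(log, {"step": "plan", "actor": "agent"})
append_entry(log, {"step": "approve", "actor": "human"})
intact = verify(log)

log[0]["payload"]["actor"] = "someone-else"  # simulate tampering with history
tampered_ok = verify(log)
```

A chain like this only detects edits. Preventing an attacker from rewriting the whole chain needs an external anchor, such as publishing or signing the latest hash.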

View all work →

Let's talk

Building an AI system that has to hold up in production, not just in a demo?

Whether you need a RAG system proven correct and kept that way as it changes, need a sovereignty constraint solved end to end, or are a delivery partner looking for someone who has actually shipped both, I'm happy to talk systems. No pitch needed.

— A RAG or retrieval system that has to prove its quality — and keep proving it as it changes
— A production AI system that has to be auditable and reproducible, not just demoable
— A sovereignty or compliance constraint blocking AI adoption
— Delivery partners needing proven, production-grade AI capacity

Building AI that has to be provably correct — or that has to stay in-jurisdiction? Get in touch.

Get in touch

GitHub ↗ LinkedIn ↗ CV (PDF) ↗