Open Source · Apache 2.0 · EU-Sovereign · Air-Gap Ready · Opt-in for regulated deployments

MoE Codex
EU Compliance Data Platform

From Latin codex — a book of laws, a manuscript of collected knowledge. An audit trail, a catalog, a lineage record — all in one sovereign stack. The open-source EU alternative to Palantir Foundry / AIP for regulated sectors: government, KritIS, healthcare, banking, and pharma compliance.

0 US API Keys Required

EU Data Jurisdiction

Apache 2.0 License

Air-Gap Deployable

The Concept

What is MoE Codex?

95 % of operators want a sovereign LLM gateway — that’s MoE Sovereign. The remaining 5 % operate in regulated sectors where AI deployments require documented risk classification, data lineage, approval workflows, and audit trails. That’s MoE Codex.

Data Catalog

Discover, classify, and annotate datasets and knowledge assets. Every source is tracked, every schema versioned.

Approval Workflows

No data enters AI pipelines without authorization. Multi-step approval gates, reviewer assignments, and documented decisions.

Data Lineage

End-to-end traceability from raw source to inference output. OpenLineage-compatible events for every pipeline run.

Drift Detection

Continuous monitoring of knowledge graph health and statistical data drift. Alerts when models see distribution shifts.

“BVerfG 2023 ruled the Palantir-based Hessendata platform unconstitutional. Every EU authority deploying AI infrastructure now needs a documented sovereign alternative. MoE Codex provides the compliance layer by construction.”
— EU Sovereignty Charter — MoE Sovereign Docs

Capabilities

Key Features

Built for compliance-driven operators. Deployable alongside any MoE Sovereign instance.

Data Catalog

Asset discovery, schema registry, tagging, and classification. Tracks every dataset from ingestion to model input. Integrates with OpenMetadata.

Approval Workflow

Multi-step authorization gates before data enters AI pipelines. Role-based reviewer assignment, timestamped decisions, and an irrefutable audit trail.
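
A minimal sketch of how ordered approval gates with a tamper-evident decision log could look. The class name, role names, and hash-chaining scheme are illustrative assumptions, not the MoE Codex API.

```python
import hashlib
import json
from dataclasses import dataclass, field
from datetime import datetime, timezone


def _digest(entry: dict) -> str:
    """SHA-256 over a canonical JSON encoding of one decision entry."""
    return hashlib.sha256(json.dumps(entry, sort_keys=True).encode()).hexdigest()


@dataclass
class ApprovalRequest:
    """Hypothetical approval request; `steps` lists reviewer roles in order."""
    asset_id: str
    steps: list[str]
    decisions: list[dict] = field(default_factory=list)

    @property
    def status(self) -> str:
        if any(not d["approved"] for d in self.decisions):
            return "rejected"
        if len(self.decisions) == len(self.steps):
            return "approved"
        return "pending"

    def decide(self, role: str, reviewer: str, approved: bool) -> None:
        """Record a timestamped decision for the next open step."""
        if self.status != "pending":
            raise ValueError(f"request is already {self.status}")
        expected = self.steps[len(self.decisions)]
        if role != expected:
            raise PermissionError(f"next step requires role {expected!r}")
        entry = {
            "asset_id": self.asset_id,
            "role": role,
            "reviewer": reviewer,
            "approved": approved,
            "decided_at": datetime.now(timezone.utc).isoformat(),
            # Each entry commits to the previous one, so edits are detectable.
            "prev_hash": self.decisions[-1]["hash"] if self.decisions else "0" * 64,
        }
        entry["hash"] = _digest(entry)
        self.decisions.append(entry)

    def verify_chain(self) -> bool:
        """Recompute every hash; False means the log was altered."""
        prev = "0" * 64
        for d in self.decisions:
            body = {k: v for k, v in d.items() if k != "hash"}
            if d["prev_hash"] != prev or _digest(body) != d["hash"]:
                return False
            prev = d["hash"]
        return True
```

A request created with `ApprovalRequest("dataset-42", ["data_steward", "dpo"])` stays `pending` until both roles have decided, in that order. Hash-chaining is one common way to make an audit log tamper-evident; a production system would also persist the log outside the application's write path.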

Data Lineage (OpenLineage)

Every ETL run, NiFi pipeline, and AI inference emits OpenLineage events captured by Marquez. Trace any output back to its source data.
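
As an illustration, an OpenLineage run event is a JSON document that names the job, the run, and its input and output datasets. The sketch below builds one with the standard library only. The namespace, job and dataset names, and the producer URI are placeholders, and the spec version in `SCHEMA_URL` should be adjusted to whatever the lineage backend expects.

```python
import uuid
from datetime import datetime, timezone

# Placeholder URI identifying the emitting code (assumption).
PRODUCER = "https://example.invalid/moe-codex/etl"
# One published OpenLineage spec version; adjust as needed.
SCHEMA_URL = "https://openlineage.io/spec/2-0-2/OpenLineage.json#/$defs/RunEvent"

VALID_EVENT_TYPES = {"START", "RUNNING", "COMPLETE", "ABORT", "FAIL", "OTHER"}


def build_run_event(event_type, job_name, inputs, outputs,
                    run_id=None, namespace="moe-codex"):
    """Return an OpenLineage-style RunEvent as a plain dict."""
    if event_type not in VALID_EVENT_TYPES:
        raise ValueError(f"unknown event type: {event_type}")
    return {
        "eventType": event_type,
        "eventTime": datetime.now(timezone.utc).isoformat(),
        "run": {"runId": run_id or str(uuid.uuid4())},
        "job": {"namespace": namespace, "name": job_name},
        "inputs": [{"namespace": namespace, "name": n} for n in inputs],
        "outputs": [{"namespace": namespace, "name": n} for n in outputs],
        "producer": PRODUCER,
        "schemaURL": SCHEMA_URL,
    }


event = build_run_event("COMPLETE", "nightly_ingest",
                        inputs=["raw.claims"], outputs=["curated.claims"])
```

Marquez accepts such events as JSON via `POST /api/v1/lineage`; the host and port depend on the deployment.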

Data Versioning (lakeFS)

Git-style branches and commits for datasets. Roll back to any snapshot, isolate experiments, and ship reproducible knowledge bundles.
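
To make the branch, commit, and rollback semantics concrete, here is a toy in-memory model. It is not the lakeFS API or SDK; the class and method names are invented for illustration, and files are represented only by a path-to-checksum mapping.

```python
import hashlib
import json


class SnapshotStore:
    """Toy model of Git-style dataset versioning (illustrative only)."""

    def __init__(self):
        self.commits = {}               # commit id -> {"parent", "message", "files"}
        self.branches = {"main": None}  # branch name -> head commit id

    def commit(self, branch, files, message):
        """Store an immutable snapshot and advance the branch head."""
        body = {"parent": self.branches[branch], "message": message,
                "files": dict(files)}
        commit_id = hashlib.sha256(
            json.dumps(body, sort_keys=True).encode()).hexdigest()[:12]
        self.commits[commit_id] = body
        self.branches[branch] = commit_id
        return commit_id

    def branch(self, name, source="main"):
        """Create a branch pointing at the source branch's head (no copy)."""
        self.branches[name] = self.branches[source]

    def rollback(self, branch, commit_id):
        """Move a branch head back to an earlier snapshot."""
        if commit_id not in self.commits:
            raise KeyError(commit_id)
        self.branches[branch] = commit_id

    def read(self, branch):
        head = self.branches[branch]
        return dict(self.commits[head]["files"]) if head else {}


store = SnapshotStore()
first = store.commit("main", {"claims.parquet": "checksum-a"}, "initial load")
store.branch("experiment")
store.commit("experiment", {"claims.parquet": "checksum-b"}, "dedupe test")
```

The experiment branch changes nothing on `main`, and `store.rollback("main", first)` restores the first snapshot after any later commit.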

Drift Detection

Continuous health monitoring of knowledge graph metrics. Statistical drift alerts when incoming data deviates from baselines. Prometheus-native metrics.
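
One widely used statistic for this kind of baseline comparison is the Population Stability Index (PSI). The sketch below is an assumption about how such a check could work, not the metric MoE Codex necessarily computes; the bin count and the alert threshold are conventional rules of thumb.

```python
import math


def population_stability_index(baseline, current, bins=10):
    """PSI between two numeric samples, binned on the baseline's range."""
    lo, hi = min(baseline), max(baseline)
    if hi == lo:
        raise ValueError("baseline has no spread")
    width = (hi - lo) / bins

    def proportions(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            # Values outside the baseline range fall into the edge bins.
            counts[min(max(idx, 0), bins - 1)] += 1
        # A small floor avoids log(0) for empty bins.
        return [max(c / len(values), 1e-6) for c in counts]

    p, q = proportions(baseline), proportions(current)
    return sum((qi - pi) * math.log(qi / pi) for pi, qi in zip(p, q))


baseline = [i / 100 for i in range(1000)]
shifted = [x + 5 for x in baseline]
psi = population_stability_index(baseline, shifted)
drifted = psi > 0.25  # common rule-of-thumb threshold for a major shift
```

The resulting value could be exported as a gauge so that existing Prometheus alert rules fire on it.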

Investigation Explorer

Graph-based query interface for compliance investigations. Link analysis, timeline views, and export-ready audit reports for regulators.
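
Link analysis often comes down to finding the shortest chain of relationships between two entities. The following breadth-first search shows the idea on an in-memory edge list; the entity identifiers are made up, and in a Neo4j-backed deployment the same question would normally be asked as a graph query instead.

```python
from collections import deque


def shortest_link(edges, start, goal):
    """Return the shortest path between two entities, or None."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)

    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        # Sorted for deterministic output across runs.
        for nxt in sorted(graph.get(node, ())):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


edges = [
    ("dataset:claims", "pipeline:ingest"),
    ("pipeline:ingest", "model:risk-v2"),
    ("model:risk-v2", "report:2024-q4"),
    ("dataset:claims", "user:alice"),
]
path = shortest_link(edges, "report:2024-q4", "dataset:claims")
```

Here `path` walks from the report through the model and pipeline back to the source dataset.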

ETL Automation (NiFi)

Apache NiFi for visual data flow design. Ingest from legacy sources, normalize, and route to Kafka, Postgres, or the AI pipeline — without writing code.

EU Object Storage (Garage)

S3-compatible object store by Deuxfleurs (French non-profit, AGPLv3). EU-origin, Rust-based, and a drop-in replacement for MinIO. MinIO itself is not used: it was archived in April 2026.

JupyterLab Notebook

Full JupyterLab environment proxied inside the sovereign perimeter. Reproducible data analysis and model evaluation without data leaving the stack.

Pipeline Builder (Kestra)

Declarative workflow orchestration for data-centric processes. Lighter-weight NiFi alternative for ETL chains, scheduled jobs, and event-driven pipelines.

Structured Forms (JSONForms)

Schema-driven data entry for compliance forms, risk assessments, and intake workflows. Rendered from JSON Schema; no frontend code required.
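
For example, a risk-assessment form could be described by a JSON Schema like the one below, with a small server-side check mirroring what the form enforces. The field names and risk tiers are illustrative assumptions, and the checker covers only `required`, `type`, and `enum`; it is not a full JSON Schema validator.

```python
RISK_ASSESSMENT_SCHEMA = {
    "type": "object",
    "properties": {
        "system_name": {"type": "string"},
        "risk_class": {
            "type": "string",
            "enum": ["minimal", "limited", "high", "unacceptable"],
        },
        "processes_personal_data": {"type": "boolean"},
    },
    "required": ["system_name", "risk_class"],
}

_PY_TYPES = {"string": str, "boolean": bool}


def validate(form, schema):
    """Return a list of human-readable errors (empty when the form is valid)."""
    errors = [f"{name}: required"
              for name in schema.get("required", []) if name not in form]
    for name, value in form.items():
        rules = schema["properties"].get(name)
        if rules is None:
            continue  # JSON Schema allows additional properties by default
        if not isinstance(value, _PY_TYPES[rules["type"]]):
            errors.append(f"{name}: expected {rules['type']}")
        elif "enum" in rules and value not in rules["enum"]:
            errors.append(f"{name}: must be one of {rules['enum']}")
    return errors


ok = validate({"system_name": "triage-bot", "risk_class": "high"},
              RISK_ASSESSMENT_SCHEMA)
bad = validate({"risk_class": "severe"}, RISK_ASSESSMENT_SCHEMA)
```

Keeping the schema as the single source of truth means the rendered form and the backend check cannot drift apart.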

Charts & Analytics

Embedded pivot analysis and visualisation of catalog and lineage data. Slice metrics by time, entity type, or pipeline stage without leaving the compliance stack.

Federated Search (OpenSearch)

Multi-tenant full-text and vector search across catalog assets. Supports structured filters, relevance ranking, and field-level access control.
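
A tenant-scoped catalog search might combine a relevance-ranked text match with exact-match filters, roughly as sketched here. The query structure follows the OpenSearch query DSL; the field names (`tenant_id`, `classification`, `name`, `description`, `tags`) are assumptions about the index mapping.

```python
def catalog_search_query(text, tenant_id, classification=None, size=10):
    """Build an OpenSearch request body for a tenant-scoped catalog search."""
    # Filters restrict the result set without affecting relevance scores.
    filters = [{"term": {"tenant_id": tenant_id}}]
    if classification:
        filters.append({"term": {"classification": classification}})
    return {
        "size": size,
        "query": {
            "bool": {
                "must": [{
                    "multi_match": {
                        "query": text,
                        # "^2" boosts matches in the asset name.
                        "fields": ["name^2", "description", "tags"],
                    }
                }],
                "filter": filters,
            }
        },
    }


body = catalog_search_query("patient claims", tenant_id="agency-a",
                            classification="confidential")
```

The resulting `body` can be sent to an index's `_search` endpoint with any HTTP or OpenSearch client.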

Timeline (vis-timeline)

Time-based visualisation of event chains across entities, pipeline runs, and data movements. Supports multi-lane views for parallel investigation tracks.

System Design

Architecture

MoE Codex is an opt-in layer that extends MoE Sovereign. Deploy it only where compliance obligations require it.

MoE Sovereign: LLM Gateway & GraphRAG · 15 Expert Templates · OpenAI-compatible API

↕ REST API

MoE Codex: Catalog · Approval · Lineage · Versioning · ETL (NiFi) · Drift Monitor · JupyterLab · Kestra · Forms · Charts · Timeline
Object Storage (Garage S3, EU-origin) · PostgreSQL · Neo4j · Valkey

↕ Audit UI / API

Regulated Operators: Government · KritIS · Healthcare · Banking · Pharma · Energy

Deployment model: MoE Sovereign runs autonomously without MoE Codex. Operators who deploy Codex get the full catalog & compliance layer on top. Only endpoint and credentials change in the application layer — no code changes.
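
In practice this usually means the application reads its endpoint and credentials from configuration rather than hard-coding them. A minimal sketch, assuming hypothetical environment variable names (`MOE_BASE_URL`, `MOE_API_KEY`) and a placeholder default URL:

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class GatewayConfig:
    base_url: str
    api_key: str


def load_config(env=os.environ):
    """Read the endpoint and credentials from the environment."""
    # Variable names and the default URL are illustrative assumptions.
    base_url = env.get("MOE_BASE_URL", "http://localhost:8000/v1")
    api_key = env.get("MOE_API_KEY")
    if not api_key:
        raise RuntimeError("MOE_API_KEY is not set")
    return GatewayConfig(base_url.rstrip("/"), api_key)


config = load_config({
    "MOE_BASE_URL": "https://codex.example.internal/v1/",
    "MOE_API_KEY": "replace-me",
})
```

Because the API is OpenAI-compatible, an existing client would only need `config.base_url` and `config.api_key` swapped in; the calling code stays the same.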

Ingest

NiFi flows pull data from legacy sources, APIs, and file shares. Catalog entries are created automatically on ingestion.

Review & Approve

Curators classify datasets in the catalog. Approval workflows route assets through multi-step authorization before AI pipeline access.

Version & Track

lakeFS creates immutable snapshots of approved datasets. Every pipeline run emits OpenLineage events to Marquez.

Monitor & Audit

Drift metrics surface anomalies. Investigation explorer enables graph-based queries. Export audit reports for regulators on demand.

Regulatory Fit

EU Compliance by Construction

Six regulatory drivers make a sovereign compliance layer mandatory for AI deployments in regulated EU sectors.

EU AI Act (Reg. 2024/1689)

In force since 1 August 2024; high-risk obligations apply from 2 August 2026. MoE Codex provides the risk documentation, audit trail, and traceability required for Annex III high-risk AI systems.

NIS2 & NIS2UmsuCG

NIS2 transposition requires risk management, incident reporting, and supply-chain accountability for essential entities. MoE Codex’s audit trail and lineage records directly address these obligations.

GDPR Art. 35 DPIA

Data Protection Impact Assessments are mandatory for high-risk personal data processing. Catalog metadata and lineage records provide the inventory and flow documentation Art. 35 requires.

BSI Grundschutz & C5

Mappings to the BSI IT-Grundschutz OPS modules (Bausteine), combined with C5-certified EU hosting (Hetzner, IONOS, STACKIT, OVHcloud), let KritIS operators pursue BSI certification without US-cloud dependencies.

Hessendata-Urteil (BVerfG 2023)

In February 2023, the Federal Constitutional Court ruled that the statutory basis for automated data analysis with the Palantir-based Hessendata policing platform was unconstitutional. MoE Codex is designed as a documented, rights-compliant alternative for law enforcement AI use cases.

No US CLOUD Act Exposure

AWS, Azure, and GCP are subject to the US CLOUD Act regardless of EU region. Deployed on EU-jurisdiction hosting (see eu_sovereignty_charter), MoE Codex has zero CLOUD Act exposure.

Comparison: MoE Codex vs. Palantir Foundry/AIP and US-Cloud LLM APIs

Concern	MoE Codex	Palantir Foundry/AIP	US-Cloud LLM APIs
Data leaves EU jurisdiction	❌ never	🟡 depends on contract	✅ always
Subject to US CLOUD Act	❌ no (EU host)	✅ yes	✅ yes
Source code auditable	✅ Apache 2.0	❌ proprietary	❌ proprietary
Air-gap deployment	✅ documented	🟡 contract option	❌ impossible
Vendor lock-in risk	❌ none (fork-able)	✅ high	✅ high
Per-token operator cost	€0 (own hardware)	metered + licence	metered

Under the Hood

Technology Stack

Every component is OSI-licensed, EU-deployable, and auditable. No BSL, SSPL, or ELv2 in the stack.

FastAPI

High-performance async Python API for catalog, approval, lineage, and versioning endpoints.

Apache NiFi

Visual ETL designer for data ingestion from legacy sources. Apache 2.0 licensed, containerized, no cloud dependency.

Marquez (OpenLineage)

OpenLineage-compatible lineage backend. Every data movement is recorded, queryable, and exportable for auditors.

lakeFS

Git for data: branch, commit, and merge datasets. Immutable snapshots ensure reproducibility across compliance audits.

Garage (S3, EU-origin)

Built by Deuxfleurs, a French non-profit. AGPLv3, Rust-based, and a drop-in replacement for MinIO. EU-authored object storage that replaces end-of-life MinIO (archived 2026-04-25).

PostgreSQL

Reliable relational storage for catalog metadata, approval state, and lineage records.

Neo4j

Graph database for knowledge assets and investigation queries. Semantic link analysis and traversal for compliance exploration.

Valkey (BSD-3)

Linux Foundation fork of Redis (replaces SSPL/RSALv2 Redis). Caching and rate limiting. Wire-protocol compatible.

Requirements

Docker and Docker Compose. 4 CPU cores and 8 GB RAM on top of the MoE Sovereign requirements. Scales from a single node to a Kubernetes cluster.

License Hygiene

Only Apache 2.0, MIT, BSD, AGPLv3 (separate container), and LGPL. Automated audit via scripts/audit-licenses.sh in CI.
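
The policy above can be expressed as an allowlist check over a component inventory. This sketch is not the contents of scripts/audit-licenses.sh; the SPDX identifiers chosen, the inventory format, and the component entries are assumptions for illustration.

```python
# Licenses accepted anywhere in the stack (SPDX identifiers, illustrative set).
ALLOWED = {
    "Apache-2.0", "MIT", "BSD-2-Clause", "BSD-3-Clause",
    "LGPL-2.1-or-later", "LGPL-3.0-or-later",
}
# Licenses accepted only when the component runs in its own container.
ISOLATED_ONLY = {"AGPL-3.0-only", "AGPL-3.0-or-later"}


def audit(components):
    """Return (name, license) pairs that violate the license policy.

    `components` is an iterable of (name, spdx_license, runs_isolated).
    """
    violations = []
    for name, license_id, runs_isolated in components:
        if license_id in ALLOWED:
            continue
        if license_id in ISOLATED_ONLY and runs_isolated:
            continue
        violations.append((name, license_id))
    return violations


inventory = [
    ("cache", "BSD-3-Clause", False),
    ("object-store", "AGPL-3.0-only", True),
    ("linked-agpl-lib", "AGPL-3.0-only", False),
    ("some-database", "SSPL-1.0", False),
]
violations = audit(inventory)
```

A CI job would exit non-zero whenever `violations` is non-empty, which keeps source-available licenses such as SSPL out of the stack by default.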

EU Hosting

Recommended: Hetzner, IONOS, STACKIT, OVHcloud, Open Telekom Cloud. All BSI-C5 or SecNumCloud certified. AWS/Azure/GCP explicitly not recommended for sovereignty-critical deployments.

Ecosystem

Part of MoE Sovereign

MoE Codex is the compliance layer of the MoE Sovereign AI ecosystem, designed for regulated operators.

moe-sovereign.org ↗

moe-libris.org ↗

docs.moe-sovereign.org ↗

GitHub: MoE Codex ↗
