Open Source · Apache 2.0 · EU-Sovereign · Air-Gap Ready · Opt-in for regulated deployments

MoE Codex
EU Compliance Data Platform

From Latin codex — a book of laws, a manuscript of collected knowledge. An audit trail, a catalog, a lineage record — all in one sovereign stack. The open-source EU alternative to Palantir Foundry / AIP for regulated sectors: government, KritIS, healthcare, banking, and pharma compliance.

0 US API Keys Required
EU Data Jurisdiction
Apache 2.0 License
Air-Gap Deployable

What is MoE Codex?

95 % of operators want a sovereign LLM gateway — that’s MoE Sovereign. The remaining 5 % operate in regulated sectors where AI deployments require documented risk classification, data lineage, approval workflows, and audit trails. That’s MoE Codex.

Data Catalog

Discover, classify, and annotate datasets and knowledge assets. Every source is tracked, every schema versioned.

Approval Workflows

No data enters AI pipelines without authorization. Multi-step approval gates, reviewer assignments, and documented decisions.

Data Lineage

End-to-end traceability from raw source to inference output. OpenLineage-compatible events for every pipeline run.

Drift Detection

Continuous monitoring of knowledge graph health and statistical data drift. Alerts when models see distribution shifts.

“BVerfG 2023 ruled the Palantir-based Hessendata platform unconstitutional. Every EU authority deploying AI infrastructure now needs a documented sovereign alternative. MoE Codex provides the compliance layer by construction.”
— EU Sovereignty Charter — MoE Sovereign Docs

Key Features

Built for compliance-driven operators. Deployable alongside any MoE Sovereign instance.

Data Catalog

Asset discovery, schema registry, tagging, and classification. Tracks every dataset from ingestion to model input. Integrates with OpenMetadata.

Approval Workflow

Multi-step authorization gates before data enters AI pipelines. Role-based reviewer assignment, timestamped decisions, and an irrefutable audit trail.
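The gate described above can be pictured with a minimal Python sketch. It is not MoE Codex code: the class names, step names, and reviewer names are all hypothetical, and a real deployment would persist decisions (for example in PostgreSQL) rather than hold them in memory.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class Decision:
    """One immutable, timestamped reviewer decision."""
    step: str
    reviewer: str
    approved: bool
    decided_at: datetime


@dataclass
class ApprovalRequest:
    """Hypothetical multi-step gate for a single dataset."""
    dataset: str
    steps: tuple[str, ...]            # ordered gate names (assumed)
    reviewers: dict[str, set[str]]    # step -> reviewers allowed to decide it
    decisions: list[Decision] = field(default_factory=list)

    @property
    def status(self) -> str:
        if any(not d.approved for d in self.decisions):
            return "rejected"
        if len(self.decisions) == len(self.steps):
            return "approved"
        return "pending"

    def decide(self, reviewer: str, approved: bool) -> Decision:
        """Record a decision for the next open step, enforcing role assignment."""
        if self.status != "pending":
            raise ValueError(f"request is already {self.status}")
        step = self.steps[len(self.decisions)]
        if reviewer not in self.reviewers.get(step, set()):
            raise PermissionError(f"{reviewer} may not decide step {step!r}")
        decision = Decision(step, reviewer, approved, datetime.now(timezone.utc))
        self.decisions.append(decision)
        return decision


request = ApprovalRequest(
    dataset="claims_2024",
    steps=("data_owner", "privacy_officer"),
    reviewers={"data_owner": {"alice"}, "privacy_officer": {"bob"}},
)
request.decide("alice", approved=True)
print(request.status)  # pending: one gate is still open
request.decide("bob", approved=True)
print(request.status)  # approved: every gate has an approving decision
```

A pipeline would check `request.status == "approved"` before reading the dataset; the list of `Decision` records is what an auditor would later review.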

Data Lineage (OpenLineage)

Every ETL run, NiFi pipeline, and AI inference emits OpenLineage events captured by Marquez. Trace any output back to its source data.
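As a rough illustration of what such an event looks like, the sketch below builds OpenLineage run events as plain JSON using only the Python standard library. The producer URI, job name, and dataset names are hypothetical, and the schema URL should be checked against the spec version your lineage backend expects.

```python
import json
import uuid
from datetime import datetime, timezone

# Hypothetical identifiers, chosen for illustration only.
PRODUCER = "https://example.org/moe-codex/lineage-sketch"
SCHEMA_URL = "https://openlineage.io/spec/2-0-2/OpenLineage.json#/$defs/RunEvent"


def dataset(namespace: str, name: str) -> dict:
    """Minimal OpenLineage dataset reference: namespace plus name."""
    return {"namespace": namespace, "name": name}


def run_event(event_type, run_id, job_name, inputs, outputs, namespace="moe-codex"):
    """Build one run event; START and COMPLETE events share the same runId."""
    return {
        "eventType": event_type,
        "eventTime": datetime.now(timezone.utc).isoformat(),
        "run": {"runId": run_id},
        "job": {"namespace": namespace, "name": job_name},
        "inputs": inputs,
        "outputs": outputs,
        "producer": PRODUCER,
        "schemaURL": SCHEMA_URL,
    }


run_id = str(uuid.uuid4())
source = dataset("file://legacy-share", "claims.csv")
target = dataset("postgres://catalog-db", "public.claims_normalized")

start = run_event("START", run_id, "ingest_claims", [source], [])
complete = run_event("COMPLETE", run_id, "ingest_claims", [source], [target])
print(json.dumps(complete, indent=2))
```

Marquez accepts events of this shape via an HTTP POST to its `/api/v1/lineage` endpoint; host, port, and authentication depend on the deployment and are not specified here.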

Data Versioning (lakeFS)

Git-style branches and commits for datasets. Roll back to any snapshot, isolate experiments, and ship reproducible knowledge bundles.

Drift Detection

Continuous health monitoring of knowledge graph metrics. Statistical drift alerts when incoming data deviates from baselines. Prometheus-native metrics.
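The page does not say which statistic MoE Codex uses for drift. One common choice is the Population Stability Index (PSI), sketched below in plain Python as an assumption rather than a description of the actual implementation. The function name, bin count, and smoothing constant are illustrative.

```python
import math


def psi(baseline, current, bins=10, eps=1e-6):
    """Population Stability Index between two numeric samples.

    Bins are equal-width over the baseline range; values of `current`
    outside that range are clamped into the first or last bin.
    """
    lo, hi = min(baseline), max(baseline)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 if the baseline is constant

    def proportions(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1
        # eps keeps empty bins from producing log(0) or division by zero
        return [max(c / len(values), eps) for c in counts]

    p = proportions(baseline)
    q = proportions(current)
    return sum((qi - pi) * math.log(qi / pi) for pi, qi in zip(p, q))


baseline = [i / 100 for i in range(1000)]   # values spread over 0.00 .. 9.99
shifted = [x + 5 for x in baseline]         # same shape, moved upward

print(psi(baseline, baseline))  # 0.0
print(psi(baseline, shifted) > 0.25)  # True
```

A widely used rule of thumb treats PSI below 0.1 as stable, 0.1 to 0.25 as moderate drift, and above 0.25 as significant drift. The resulting value could be exported as a Prometheus gauge and alerted on, with thresholds tuned per dataset.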

Investigation Explorer

Graph-based query interface for compliance investigations. Link analysis, timeline views, and export-ready audit reports for regulators.

ETL Automation (NiFi)

Apache NiFi for visual data flow design. Ingest from legacy sources, normalize, and route to Kafka, Postgres, or the AI pipeline — without writing code.

EU Object Storage (Garage)

S3-compatible object store by Deuxfleurs (French non-profit, AGPLv3). EU-origin, Rust-based, MinIO drop-in. Replaces MinIO, which was archived in April 2026.

Architecture

MoE Codex is an opt-in layer that extends MoE Sovereign. Deploy it only where compliance obligations require it.

Ingest

NiFi flows pull data from legacy sources, APIs, and file shares. Catalog entries are created automatically on ingestion.

Review & Approve

Curators classify datasets in the catalog. Approval workflows route assets through multi-step authorization before AI pipeline access.

Version & Track

lakeFS creates immutable snapshots of approved datasets. Every pipeline run emits OpenLineage events to Marquez.

Monitor & Audit

Drift metrics surface anomalies. Investigation explorer enables graph-based queries. Export audit reports for regulators on demand.

EU Compliance by Construction

Several regulatory and legal drivers make a sovereign compliance layer mandatory for AI deployments in regulated EU sectors.

EU AI Act (Reg. 2024/1689)

In force since 01.08.2024; high-risk obligations apply from 02.08.2026. MoE Codex provides the risk documentation, audit trail, and traceability required for Annex III high-risk AI systems.

NIS2 & NIS2UmsuCG

NIS2 transposition requires risk management, incident reporting, and supply-chain accountability for essential entities. MoE Codex’s audit trail and lineage records directly address these obligations.

GDPR Art. 35 DPIA

Data Protection Impact Assessments are mandatory for high-risk personal data processing. Catalog metadata and lineage records provide the inventory and flow documentation Art. 35 requires.

BSI Grundschutz & C5

BSI Grundschutz OPS Baustein (module) mappings and C5-certified EU hosting (Hetzner, IONOS, STACKIT, OVHcloud) ensure KritIS operators can achieve BSI certification without US-cloud dependencies.

Hessendata-Urteil (BVerfG 2023)

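In its judgment of 16 February 2023, the Federal Constitutional Court held that the Hessian statutory provision authorising automated data analysis, the legal basis for the Palantir-based Hessendata policing platform, was unconstitutional in the form then in force. MoE Codex is the documented, rights-compliant alternative for law enforcement AI use cases.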

No US CLOUD Act Exposure

AWS, Azure, and GCP are subject to the US CLOUD Act regardless of EU region. Deployed on EU-jurisdiction hosting (see eu_sovereignty_charter), MoE Codex has zero CLOUD Act exposure.

Comparison: MoE Codex vs. Palantir Foundry/AIP

Concern | MoE Codex | Palantir Foundry/AIP | US-Cloud LLM APIs
Data leaves EU jurisdiction | ❌ never | 🟡 depends on contract | ✅ always
Subject to US CLOUD Act | ❌ no (EU host) | ✅ yes | ✅ yes
Source code auditable | ✅ Apache 2.0 | ❌ proprietary | ❌ proprietary
Air-gap deployment | ✅ documented | 🟡 contract option | ❌ impossible
Vendor lock-in risk | ❌ none (fork-able) | ✅ high | ✅ high
Per-token operator cost | €0 (own hardware) | metered + licence | metered

Technology Stack

Every component is OSI-licensed, EU-deployable, and auditable. No BSL, SSPL, or ELv2 in the stack.

FastAPI

High-performance async Python API for catalog, approval, lineage, and versioning endpoints.

Apache NiFi

Visual ETL designer for data ingestion from legacy sources. Apache 2.0 licensed, containerized, no cloud dependency.

Marquez (OpenLineage)

OpenLineage-compatible lineage backend. Every data movement is recorded, queryable, and exportable for auditors.

lakeFS

Git for data: branch, commit, and merge datasets. Immutable snapshots ensure reproducibility across compliance audits.

Garage (S3, EU-origin)

Developed by Deuxfleurs, a French non-profit. AGPLv3, Rust-based, MinIO drop-in. EU-authored object storage replacing EOL MinIO (archived 2026-04-25).

PostgreSQL

Reliable relational storage for catalog metadata, approval state, and lineage records.

Neo4j

Graph database for knowledge assets and investigation queries. Semantic link analysis and traversal for compliance exploration.

Valkey (BSD-3)

Linux Foundation fork of Redis (replaces SSPL/RSALv2 Redis). Caching and rate limiting. Wire-protocol compatible.

Requirements

Docker and Docker Compose. 4 CPU cores and 8 GB RAM, in addition to what MoE Sovereign itself needs. Scales from a single node to a Kubernetes cluster.

License Hygiene

Only Apache 2.0, MIT, BSD, AGPLv3 (separate container), and LGPL. Automated audit via scripts/audit-licenses.sh in CI.
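The contents of scripts/audit-licenses.sh are not shown on this page. The Python sketch below only illustrates the allowlist idea behind such a check. The SPDX identifiers chosen for "BSD", "AGPLv3", and "LGPL", the function name, and the sample component list are assumptions for illustration.

```python
# Allowlist and denylist derived from the licences named on this page.
# The exact SPDX identifiers (e.g. "-only" variants) are assumptions.
ALLOWED = {
    "Apache-2.0", "MIT", "BSD-2-Clause", "BSD-3-Clause",
    "AGPL-3.0-only", "LGPL-2.1-only", "LGPL-3.0-only",
}
DENIED = {"BUSL-1.1", "SSPL-1.0", "Elastic-2.0"}


def audit(components: dict[str, str]) -> list[str]:
    """Return one message per component whose licence fails the policy."""
    violations = []
    for name, license_id in sorted(components.items()):
        if license_id in DENIED:
            violations.append(f"{name}: {license_id} is explicitly excluded")
        elif license_id not in ALLOWED:
            violations.append(f"{name}: {license_id} is not on the allowlist")
    return violations


# Sample inventory: the first three licences are as stated on this page,
# the last entry is a made-up component used to trigger a violation.
components = {
    "apache-nifi": "Apache-2.0",
    "garage": "AGPL-3.0-only",
    "valkey": "BSD-3-Clause",
    "hypothetical-cache": "SSPL-1.0",
}

for line in audit(components):
    print(line)  # hypothetical-cache: SSPL-1.0 is explicitly excluded
```

In CI, a non-empty result would typically fail the build, so that a disallowed licence cannot enter the stack unnoticed.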

EU Hosting

Recommended: Hetzner, IONOS, STACKIT, OVHcloud, Open Telekom Cloud. All BSI-C5 or SecNumCloud certified. AWS/Azure/GCP explicitly not recommended for sovereignty-critical deployments.

Part of MoE Sovereign

MoE Codex is the compliance layer of the MoE Sovereign AI ecosystem, designed for regulated operators.

moe-sovereign.org ↗

The core — distributed AI inference on your own hardware. API gateway, 15 expert templates, GraphRAG, MCP tools.

moe-libris.org ↗

Federated knowledge exchange between sovereign MoE instances. Voluntary, bilateral, no central authority.

docs.moe-sovereign.org ↗

Full documentation: architecture, API reference, EU sovereignty charter, license compliance guide.

GitHub: MoE Codex ↗

Source code, issues, and contribution guidelines. Apache 2.0. Fork-able without restriction.