ATANT: An Evaluation Framework for AI Continuity

Abstract

We present ATANT (Automated Test for Acceptance of Narrative Truth), an open evaluation framework for measuring continuity in AI systems: the ability to persist, update, disambiguate, and reconstruct meaningful context across time. While the AI industry has produced memory components (RAG pipelines, vector databases, long context windows, profile layers), no published framework formally defines or measures whether these components produce genuine continuity. We define continuity as a system property characterized by 7 required properties, introduce a 10-checkpoint evaluation methodology that operates without an LLM in the evaluation loop, and present a narrative test corpus of 250 stories comprising 1,835 verification questions across 6 life domains. We evaluate a reference implementation across 5 test-suite iterations, progressing from 58% (legacy architecture) to 100% in isolated mode (250 stories) and 100% in 50-story cumulative mode, with 96% at 250-story cumulative scale. The cumulative result is the primary measure: when 250 distinct life narratives coexist in the same database, the system must retrieve the correct fact for the correct context without cross-contamination. ATANT is system-agnostic, model-independent, and designed as a sequenced methodology for building and validating continuity systems. The framework specification, example stories, and evaluation protocol are available at https://github.com/Kenotic-Labs/ATANT. The full 250-story corpus will be released incrementally.
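The distinction between isolated and cumulative mode can be illustrated with a minimal sketch. It is not the ATANT harness or the reference implementation: `Story`, `NaiveStore`, `ScopedStore`, and `score` are hypothetical names introduced here, the two stories are toy data, and exact string matching stands in for whatever deterministic checks the framework actually specifies. The only assumptions taken from the text above are that scoring involves no LLM and that cumulative mode places all stories in one shared store.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Story:
    story_id: str
    facts: dict[str, str]      # facts stated in the narrative
    questions: dict[str, str]  # question key -> expected answer


@dataclass
class NaiveStore:
    """Hypothetical store that forgets which story a fact came from."""
    facts: dict[str, str] = field(default_factory=dict)

    def ingest(self, story: Story) -> None:
        # Later stories silently overwrite earlier ones with the same key.
        self.facts.update(story.facts)

    def answer(self, story_id: str, key: str) -> Optional[str]:
        return self.facts.get(key)


@dataclass
class ScopedStore:
    """Hypothetical store that keys every fact by its story."""
    facts: dict[tuple[str, str], str] = field(default_factory=dict)

    def ingest(self, story: Story) -> None:
        for key, value in story.facts.items():
            self.facts[(story.story_id, key)] = value

    def answer(self, story_id: str, key: str) -> Optional[str]:
        return self.facts.get((story_id, key))


def score(store_factory, stories: list[Story], cumulative: bool) -> float:
    """Fraction of questions answered correctly, checked by exact match."""
    shared = store_factory()
    if cumulative:
        # Cumulative mode: every story is ingested into one shared store.
        for story in stories:
            shared.ingest(story)
    correct = total = 0
    for story in stories:
        if cumulative:
            store = shared
        else:
            # Isolated mode: each story gets a fresh, empty store.
            store = store_factory()
            store.ingest(story)
        for key, expected in story.questions.items():
            total += 1
            correct += store.answer(story.story_id, key) == expected
    return correct / total


# Toy data: two narratives that share a question key.
stories = [
    Story("s1", {"pet_name": "Miso"}, {"pet_name": "Miso"}),
    Story("s2", {"pet_name": "Rex"}, {"pet_name": "Rex"}),
]

naive_isolated = score(NaiveStore, stories, cumulative=False)
naive_cumulative = score(NaiveStore, stories, cumulative=True)
scoped_cumulative = score(ScopedStore, stories, cumulative=True)
```

In this toy setup the naive store answers every question in isolated mode but loses accuracy once both stories share a store, which is the kind of cross-contamination the cumulative measure is meant to expose; the scoped store is unaffected.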
