ATANT: An Evaluation Framework for AI Continuity

Abstract

We present ATANT (Automated Test for Acceptance of Narrative Truth), an open evaluation framework for measuring continuity in AI systems: the ability to persist, update, disambiguate, and reconstruct meaningful context across time. Although the AI industry has produced memory components (RAG pipelines, vector databases, long context windows, profile layers), no published framework formally defines or measures whether these components produce genuine continuity. We define continuity as a system-level property with 7 required properties, introduce a 10-checkpoint evaluation methodology that operates without an LLM in the evaluation loop, and present a narrative test corpus of 250 stories comprising 1,835 verification questions across 6 life domains. We evaluate a reference implementation across 5 test suite iterations. In isolated mode (250 stories), accuracy rises from 58% (legacy architecture) to 100%; in cumulative mode, the system reaches 100% at 50 stories and 96% at 250 stories. The cumulative result is the primary measure: when 250 distinct life narratives coexist in the same database, the system must retrieve the correct fact for the correct context without cross-contamination. ATANT is system-agnostic, model-independent, and designed as a sequenced methodology for building and validating continuity systems. The framework specification, example stories, and evaluation protocol are available at https://github.com/Kenotic-Labs/ATANT. The full 250-story corpus will be released incrementally.
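The difference between isolated and cumulative mode can be illustrated with a minimal sketch. This is not the ATANT reference implementation or its 10-checkpoint protocol. It assumes two hypothetical toy stores (`ScopedStore`, `LeakyStore`), a hypothetical `Story` record, made-up example facts, and exact string matching as a stand-in for an evaluation that needs no LLM judge. None of these names come from the framework specification.

```python
from dataclasses import dataclass


@dataclass
class Story:
    """Hypothetical narrative record (illustrative, not the ATANT schema)."""
    story_id: str
    facts: dict        # fact key -> value stated in the narrative
    questions: list    # (fact key, expected answer) pairs


class ScopedStore:
    """Toy store that keys facts by (story, fact key), so narratives cannot mix."""

    def __init__(self):
        self._facts = {}

    def ingest(self, story):
        for key, value in story.facts.items():
            self._facts[(story.story_id, key)] = value

    def answer(self, story_id, key):
        return self._facts.get((story_id, key))


class LeakyStore(ScopedStore):
    """Toy store that ignores the story, so later narratives overwrite earlier ones."""

    def ingest(self, story):
        for key, value in story.facts.items():
            self._facts[("*", key)] = value

    def answer(self, story_id, key):
        return self._facts.get(("*", key))


def count_passes(store, story):
    """Deterministic check: exact match against the expected answer, no LLM judge."""
    passed = sum(
        store.answer(story.story_id, key) == expected
        for key, expected in story.questions
    )
    return passed, len(story.questions)


def run_isolated(store_cls, stories):
    """Each story is ingested into, and queried against, its own fresh store."""
    passed = total = 0
    for story in stories:
        store = store_cls()
        store.ingest(story)
        p, t = count_passes(store, story)
        passed, total = passed + p, total + t
    return passed / total if total else 0.0


def run_cumulative(store_cls, stories):
    """All stories share one store; every question is asked after all ingestion."""
    store = store_cls()
    for story in stories:
        store.ingest(story)
    passed = total = 0
    for story in stories:
        p, t = count_passes(store, story)
        passed, total = passed + p, total + t
    return passed / total if total else 0.0


# Made-up example data, for illustration only.
STORIES = [
    Story("ana", {"pet": "cat", "city": "Lisbon"}, [("pet", "cat"), ("city", "Lisbon")]),
    Story("ben", {"pet": "dog", "city": "Oslo"}, [("pet", "dog"), ("city", "Oslo")]),
]

if __name__ == "__main__":
    for cls in (ScopedStore, LeakyStore):
        print(cls.__name__, run_isolated(cls, STORIES), run_cumulative(cls, STORIES))
```

In this toy setup both stores answer every question in isolated mode, but only the scoped store keeps answering correctly once the narratives share one database. This is the kind of cross-contamination that the cumulative result is meant to expose.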

