Beyond Recall: Behavioral Specification as an Interpretive Layer for AI Personalization

Abstract

If an AI agent makes decisions on a person's behalf, those decisions must align with that person. We introduce representational accuracy, a measure of how faithfully a system captures a person's interpretation. We operationalize the interpretive layer as a Behavioral Specification. Our reference implementation aggressively compresses a person's data into interpretive patterns, which are served as context to a language model. We evaluate the Specification on a prototype benchmark of held-out behavioral predictions scored by a calibrated 5-judge LLM panel, testing it both on its own and in composition with a range of context conditions: the full raw corpus, the full set of extracted facts, and four commercial memory systems (Mem0, Letta, Supermemory, Zep). Across 14 public-domain autobiographical corpora, the Specification lifts representational accuracy in aggregate and nearly eliminates model hedging. It recovers most of what the raw corpus delivers at roughly 25x lower context cost. The Specification lifts subjects toward a common predictive level regardless of pretraining baseline; the lift in absolute points is therefore largest where the baseline is lowest, suggesting that the population of relevance is anyone not adequately represented in pretraining. Lift is greatest on interpretation-required questions, where an interpretive layer enables model behavior that extracted facts or the raw corpus do not. Conversely, on recall-required questions, the layer can interfere rather than help. We conclude that representational accuracy is distinct from recall and that human-AI alignment depends on how accurately the user is represented. Representational accuracy makes that alignment testable.