EviMem: Evidence-Gap-Driven Iterative Retrieval for Long-Term Conversational Memory
April 30, 2026
Authors: Yuyang Li, Yime He, Zeyu Zhang, Dong Gong
cs.AI
Abstract
Long-term conversational memory requires retrieving evidence scattered across multiple sessions, yet single-pass retrieval fails on temporal and multi-hop questions. Existing iterative methods refine queries via generated content or document-level signals, but none explicitly diagnoses the evidence gap, namely what is missing from the accumulated retrieval set, leaving query refinement untargeted. We present EviMem, combining IRIS (Iterative Retrieval via Insufficiency Signals), a closed-loop framework that detects evidence gaps through sufficiency evaluation, diagnoses what is missing, and drives targeted query refinement, with LaceMem (Layered Architecture for Conversational Evidence Memory), a coarse-to-fine memory hierarchy supporting fine-grained gap diagnosis. On LoCoMo, EviMem improves Judge Accuracy over MIRIX on temporal (73.3% to 81.6%) and multi-hop (65.9% to 85.2%) questions at 4.5x lower latency. Code: https://github.com/AIGeeksGroup/EviMem.
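The closed loop the abstract describes, retrieve, evaluate sufficiency, diagnose the gap, refine the query, can be sketched as follows. This is an illustrative sketch only: the function names (`retrieve`, `evaluate_sufficiency`, `diagnose_gap`, `refine_query`) and the round limit are hypothetical placeholders, not EviMem's actual API.

```python
# Hedged sketch of an evidence-gap-driven retrieval loop, following the
# abstract's description. All callables here are hypothetical placeholders
# supplied by the caller, not EviMem's real components.

def iterative_retrieve(question, retrieve, evaluate_sufficiency,
                       diagnose_gap, refine_query, max_rounds=3):
    """Accumulate evidence until it is judged sufficient or rounds run out."""
    evidence = []
    query = question
    for _ in range(max_rounds):
        evidence.extend(retrieve(query))           # fetch candidates for the current query
        if evaluate_sufficiency(question, evidence):
            break                                  # evidence gap closed; stop early
        gap = diagnose_gap(question, evidence)     # diagnose what is still missing
        query = refine_query(question, gap)        # target the next query at the gap
    return evidence
```

The key difference from content- or document-signal refinement, as the abstract frames it, is that the next query is conditioned on an explicit diagnosis of the missing evidence rather than on the retrieved text alone.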