TIDE: Proactive Discovery of Multiple Problems via Template-Guided Iteration

Abstract

Agents are widely deployed as assistants over documents, tools, and code. However, they typically act only on explicit user requests, which surface only the problems the user has already noticed. Many other important problems coexist within the broader user context, hidden in plain sight, and their total number is unknown in advance. We frame this as the task of discovering multiple hidden problems from context: coexisting problems should be uncovered, grounded in supporting evidence, and paired with concrete actions. To this end, we introduce TIDE, a template-guided iterative framework with two complementary mechanisms, motivated by the observation that single-pass prediction anchors on the most salient cases and yields generic claims. The first mechanism, iterative discovery, surfaces a small batch of candidates per round while conditioning on what has already been found, so that subsequent rounds extend coverage. The second mechanism, thought templates, provides reusable schemas distilled from previously solved cases that specify which contextual signals to attend to and how to connect them, anchoring each prediction in a recognizable problem class. We validate TIDE in two realistic settings, personal workspaces and software repositories, across four model backbones, showing substantial gains over single-shot and parallel multi-agent baselines on task coverage, identification, and resolution.
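
The round-based loop described in the abstract can be illustrated with a minimal sketch. This is not TIDE's implementation: the names `Template`, `Finding`, `keyword_proposer`, and `iterative_discovery`, the parameters `batch_size` and `max_rounds`, and the stopping rule are all assumptions made for illustration, and a toy keyword matcher stands in for the model call that would propose candidates in practice.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class Template:
    """Hypothetical thought template: a problem class plus signals to look for."""
    problem_class: str
    signals: Sequence[str]


@dataclass(frozen=True)
class Finding:
    """A discovered problem, tied to a problem class and a piece of evidence."""
    problem_class: str
    evidence: str


def keyword_proposer(context, templates, found, batch_size):
    """Toy stand-in for a model call (assumption): match signals against context.

    Returns at most `batch_size` findings that are not already in `found`.
    """
    seen = {(f.problem_class, f.evidence) for f in found}
    batch: List[Finding] = []
    for line in context:
        for t in templates:
            if any(s in line.lower() for s in t.signals):
                key = (t.problem_class, line)
                if key not in seen:
                    seen.add(key)
                    batch.append(Finding(t.problem_class, line))
                    if len(batch) >= batch_size:
                        return batch
    return batch


def iterative_discovery(context, templates, propose, batch_size=2, max_rounds=5):
    """Run rounds until one adds nothing new or the round budget is spent."""
    found: List[Finding] = []
    for _ in range(max_rounds):
        # Each round is conditioned on everything found so far.
        batch = propose(context, templates, found, batch_size)
        if not batch:
            break
        found.extend(batch)
    return found


# Illustrative data only; not drawn from the paper's benchmarks.
context = [
    "Invoice #12 due date passed last week",
    "Lunch with Sam on Friday",
    "TODO: rotate API key before release",
    "Meeting overlaps with dentist appointment",
]
templates = [
    Template("missed deadline", ["due date passed", "overdue"]),
    Template("pending task", ["todo"]),
    Template("schedule conflict", ["overlaps"]),
]
findings = iterative_discovery(context, templates, keyword_proposer)
for f in findings:
    print(f"{f.problem_class}: {f.evidence}")
```

With a batch size of 2, the first round returns two findings and the second round, seeing those, returns the remaining one. Passing the proposer as an argument keeps the loop independent of how candidates are generated, so the toy matcher could be swapped for a model-backed proposer without changing the control flow.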