Distilling Examples into Task Instructions: Enhanced In-Context Learning for Real-World B2B Conversations

Abstract

In-context learning (ICL) is a standard method for low-resource classification, yet its efficacy in specialized domains remains largely unexplored. We address the challenge of classifying semantically complex, multi-party B2B conversations, where traditional ICL encounters significant limitations as context length grows from concatenating multiple few-shot examples. We introduce the Call Playbook dataset, which comprises five classification tasks derived from real-world B2B conversations and targeting core sales concepts. To bridge the gap between performance and practical utility, we propose novel knowledge extraction methods that distill verbose examples into compact, interpretable representations consisting of structured classification criteria and precise task descriptions. Our approach reduces token usage by 99% and improves macro-averaged AUC by up to 7% over traditional ICL. Notably, it remains robust as context grows, unlike advanced token compression baselines, which degrade by over 9 F1 points. Importantly, our framework enables direct refinement of the classification logic, addressing critical needs for transparency, efficiency, and user interaction in real-world NLP applications.
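
To make the contrast between concatenated few-shot examples and a distilled task description concrete, the following is a minimal Python sketch, not the authors' implementation. The abstract does not say how the distillation is performed, so the criteria below are hand-written placeholders. The conversations are invented for illustration and are not taken from the Call Playbook dataset. All names (`Example`, `build_few_shot_prompt`, `build_distilled_prompt`, `approx_tokens`, `token_reduction`) are hypothetical, and the whitespace word count is only a crude stand-in for a real tokenizer.

```python
from dataclasses import dataclass


@dataclass
class Example:
    transcript: str  # invented toy conversation, not real data
    label: str


def build_few_shot_prompt(task: str, examples: list[Example], query: str) -> str:
    """Conventional ICL: concatenate every labeled example ahead of the query."""
    shots = "\n\n".join(
        f"Conversation:\n{ex.transcript}\nLabel: {ex.label}" for ex in examples
    )
    return f"{task}\n\n{shots}\n\nConversation:\n{query}\nLabel:"


def build_distilled_prompt(task: str, criteria: list[str], query: str) -> str:
    """Distilled variant: a short task description plus explicit criteria."""
    bullets = "\n".join(f"- {c}" for c in criteria)
    return f"{task}\n\nCriteria:\n{bullets}\n\nConversation:\n{query}\nLabel:"


def approx_tokens(text: str) -> int:
    """Whitespace word count, an assumption standing in for a real tokenizer."""
    return len(text.split())


def token_reduction(baseline: str, compact: str) -> float:
    """Fraction of (approximate) tokens saved relative to the baseline prompt."""
    return 1.0 - approx_tokens(compact) / approx_tokens(baseline)


task = "Decide whether pricing is discussed in the conversation. Answer yes or no."
examples = [
    Example("Rep: Our annual plan is billed per seat. Buyer: What does a seat cost?", "yes"),
    Example("Rep: Let me walk you through the dashboard. Buyer: Sure, go ahead.", "no"),
] * 20  # repeated only to mimic a long many-shot context
criteria = [  # hand-written placeholders, not output of the paper's method
    "yes if a price, discount, or billing term is stated or requested",
    "no if the talk stays on product features or scheduling",
]
query = "Buyer: Is there a discount for a two-year contract?"

few_shot = build_few_shot_prompt(task, examples, query)
distilled = build_distilled_prompt(task, criteria, query)
saving = token_reduction(few_shot, distilled)
print(f"approximate token reduction: {saving:.0%}")
```

The figure printed here depends entirely on the toy inputs and the word-count proxy, so it should not be compared with the 99% reported above. The sketch also shows why the distilled form is easier to refine: each criterion is a plain sentence that a user can edit directly, rather than behavior implied by dozens of examples.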