Distilling Examples into Task Instructions: Enhanced In-Context Learning for Real-World B2B Conversations

Abstract

In-context learning (ICL) is the standard method for low-resource classification, yet its efficacy in specialized domains remains largely unexplored. We address the challenge of classifying semantically complex, multi-party B2B conversations, where traditional ICL encounters significant limitations, especially as context length grows from concatenating multiple few-shot examples. We introduce the Call Playbook dataset, which comprises five classification tasks derived from real-world B2B conversations and targets core sales concepts. To bridge the gap between performance and practical utility, we propose novel knowledge extraction methods that distill verbose examples into compact, interpretable representations consisting of structured classification criteria and precise task descriptions. Our approach achieves a 99% reduction in token usage and improves macro-averaged AUC by up to 7% over traditional ICL. Notably, it remains robust as context grows, whereas advanced token compression baselines degrade by more than 9 F1 points. Importantly, our framework enables direct refinement of the classification logic, addressing critical needs for transparency, efficiency, and user interaction in real-world NLP applications.