Large Language Models as Zero-shot Dialogue State Tracker through Function Calling
February 16, 2024
Authors: Zekun Li, Zhiyu Zoey Chen, Mike Ross, Patrick Huber, Seungwhan Moon, Zhaojiang Lin, Xin Luna Dong, Adithya Sagar, Xifeng Yan, Paul A. Crook
cs.AI
Abstract
Large language models (LLMs) are increasingly prevalent in conversational systems due to their advanced understanding and generative capabilities in general contexts. However, their effectiveness in task-oriented dialogue (TOD), which requires not only response generation but also effective dialogue state tracking (DST) within specific tasks and domains, remains unsatisfactory. In this work, we propose a novel approach, FnCTOD, that solves DST with LLMs through function calling. This method improves zero-shot DST, allowing adaptation to diverse domains without extensive data collection or model tuning. Our experimental results demonstrate that the approach achieves exceptional performance with both modestly sized open-source and proprietary LLMs: with in-context prompting it enables various 7B- or 13B-parameter models to surpass the previous state of the art (SOTA) achieved by ChatGPT, and it improves ChatGPT's performance, beating the SOTA by 5.6% average joint goal accuracy (JGA). Individual results for GPT-3.5 and GPT-4 are boosted by 4.8% and 14%, respectively. We also show that by fine-tuning on a small collection of diverse task-oriented dialogues, we can equip a modestly sized model, specifically a 13B-parameter LLaMA2-Chat model, with function-calling capabilities and DST performance comparable to ChatGPT while preserving its chat capabilities. We plan to open-source our experimental code and models.
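
To make the core idea concrete, the sketch below shows one way DST can be framed as function calling: each task domain is exposed to the model as a function whose parameters are the domain's trackable slots, and the model reports the dialogue state by emitting a call to that function with the slot values mentioned so far. This is a minimal Python illustration under assumed conventions; the book_hotel schema, its slot names, and the build_dst_messages helper are hypothetical examples and do not reproduce the paper's exact prompt or schema format.

# Minimal sketch: dialogue state tracking (DST) framed as function calling.
# The "hotel" domain schema below is a hypothetical example, not the paper's.
import json

# A domain is exposed as a function whose parameters are its trackable slots.
HOTEL_SCHEMA = {
    "name": "book_hotel",
    "description": "Track the user's hotel booking constraints.",
    "parameters": {
        "type": "object",
        "properties": {
            "area": {"type": "string", "description": "Part of town, e.g. 'centre'"},
            "price_range": {"type": "string", "enum": ["cheap", "moderate", "expensive"]},
            "stars": {"type": "integer", "description": "Hotel star rating"},
        },
    },
}

def build_dst_messages(dialogue_history: list[dict]) -> list[dict]:
    """Assemble a chat prompt asking a function-calling LLM to report the
    current dialogue state as arguments to the domain function."""
    system = (
        "You are a dialogue state tracker. After reading the conversation, "
        "call the provided function with every slot value the user has specified."
    )
    return [{"role": "system", "content": system}, *dialogue_history]

dialogue = [
    {"role": "user", "content": "I need a cheap hotel in the centre."},
    {"role": "assistant", "content": "Sure, any star rating in mind?"},
    {"role": "user", "content": "Four stars, please."},
]

# These messages plus HOTEL_SCHEMA (passed as a tool/function definition) would
# go to any function-calling-capable chat model; the emitted call, e.g.
#   book_hotel(area="centre", price_range="cheap", stars=4),
# is parsed back into slot-value pairs to give the zero-shot dialogue state.
print(json.dumps({"messages": build_dst_messages(dialogue), "tools": [HOTEL_SCHEMA]}, indent=2))

Because the slot schema is supplied entirely at inference time, swapping in a new domain only requires a new function definition, which is what allows the zero-shot adaptation described in the abstract.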