Large Language Models as Zero-shot Dialogue State Tracker through Function Calling
February 16, 2024
Authors: Zekun Li, Zhiyu Zoey Chen, Mike Ross, Patrick Huber, Seungwhan Moon, Zhaojiang Lin, Xin Luna Dong, Adithya Sagar, Xifeng Yan, Paul A. Crook
cs.AI
Abstract
Large language models (LLMs) are increasingly prevalent in conversational
systems due to their advanced understanding and generative capabilities in
general contexts. However, their effectiveness in task-oriented dialogue
(TOD), which requires not only response generation but also effective dialogue
state tracking (DST) within specific tasks and domains, remains
unsatisfactory. In this work, we propose FnCTOD, a novel approach that solves
DST with LLMs through function calling. This method improves zero-shot DST,
allowing adaptation to diverse domains without extensive data collection or
model tuning. Our experimental results demonstrate that our approach achieves
exceptional performance with both modestly sized open-source and proprietary
LLMs: with in-context prompting, it enables various 7B or 13B parameter models
to surpass the previous state-of-the-art (SOTA) achieved by ChatGPT, and
improves ChatGPT's own performance, beating the SOTA by 5.6% average joint
goal accuracy (JGA).
Individual model results for GPT-3.5 and GPT-4 are boosted by 4.8% and 14%,
respectively. We also show that by fine-tuning on a small collection of diverse
task-oriented dialogues, we can equip modestly sized models, specifically a 13B
parameter LLaMA2-Chat model, with function-calling capabilities and DST
performance comparable to ChatGPT while maintaining their chat capabilities. We
plan to open-source our experimental code and models.
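
To make the mechanism concrete, below is a minimal illustrative sketch of DST cast as function calling: each dialogue domain is exposed as a function schema whose arguments are the slots to track, and the model's emitted function call is parsed back into a dialogue state. This is not the authors' released code; the `book_hotel` schema, its slot names, and the `parse_function_call` helper are all hypothetical.

```python
import json

# Hypothetical function schema for one TOD domain. Each slot the
# tracker must fill is declared as an argument of the "function".
HOTEL_SCHEMA = {
    "name": "book_hotel",
    "description": "Track the user's hotel booking constraints.",
    "parameters": {
        "type": "object",
        "properties": {
            "area": {"type": "string", "description": "Part of town, e.g. 'north'"},
            "price_range": {"type": "string", "enum": ["cheap", "moderate", "expensive"]},
            "stars": {"type": "string", "description": "Desired star rating"},
        },
    },
}

def parse_function_call(model_output: str, schema: dict) -> dict:
    """Parse a model completion like `book_hotel({"area": "north"})`
    into a {domain: {slot: value}} dialogue state."""
    name, _, arg_str = model_output.partition("(")
    args = json.loads(arg_str.rstrip(")"))
    # Keep only slots that the domain schema actually declares.
    slots = schema["parameters"]["properties"]
    state = {k: v for k, v in args.items() if k in slots}
    return {name.strip(): state}

# A plausible completion for the user turn
# "I need a cheap hotel in the north."
completion = 'book_hotel({"area": "north", "price_range": "cheap"})'
print(parse_function_call(completion, HOTEL_SCHEMA))
# {'book_hotel': {'area': 'north', 'price_range': 'cheap'}}
```

In the zero-shot setting the abstract describes, such a schema would be supplied in the prompt (or through a model's native function-calling interface) so the LLM can fill the arguments directly from the conversation, without in-domain training data.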