Natural Language Decomposition and Interpretation of Complex Utterances

Abstract

Natural language interfaces often require supervised data to translate user requests into programs, database queries, or other structured intent representations. During data collection, it can be difficult to anticipate and formalize the full range of user needs. For example, in a system designed to handle simple requests (like "find my meetings tomorrow" or "move my meeting with my manager to noon"), users may also express more elaborate requests (like "swap all my calls on Monday and Tuesday"). We introduce an approach for equipping a simple language-to-code model to handle complex utterances via a process of hierarchical natural language decomposition. Our approach uses a pre-trained language model to decompose a complex utterance into a sequence of smaller natural language steps, then interprets each step using the language-to-code model. To test our approach, we collect and release DeCU, a new NL-to-program benchmark for evaluating Decomposition of Complex Utterances. Experiments show that the proposed approach enables the interpretation of complex utterances with almost no complex training data, while outperforming standard few-shot prompting approaches.
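
The two-stage flow described above (decompose, then interpret each step) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function names, the canned decomposition, and the `run(...)` output format are hypothetical stand-ins and are not taken from the paper or from DeCU. The sketch also shows only a flat, single-level decomposition and does not model the hierarchical aspect.

```python
from typing import Callable, List


def interpret_complex_utterance(
    utterance: str,
    decompose: Callable[[str], List[str]],
    to_code: Callable[[str], str],
) -> List[str]:
    """Split an utterance into natural language steps, then translate each step."""
    steps = decompose(utterance)
    return [to_code(step) for step in steps]


# Hypothetical stand-in for the pre-trained language model that decomposes
# a complex utterance. A real system would prompt a model here.
def toy_decompose(utterance: str) -> List[str]:
    canned = {
        "swap all my calls on Monday and Tuesday": [
            "find my calls on Monday",
            "find my calls on Tuesday",
            "move the Monday calls to Tuesday",
            "move the Tuesday calls to Monday",
        ]
    }
    # Utterances without a canned decomposition are treated as a single step.
    return canned.get(utterance, [utterance])


# Hypothetical stand-in for the simple language-to-code model.
def toy_to_code(step: str) -> str:
    return f"run({step!r})"


program = interpret_complex_utterance(
    "swap all my calls on Monday and Tuesday", toy_decompose, toy_to_code
)
for line in program:
    print(line)
```

Passing the two models in as callables keeps the sketch independent of any particular model or API; only the composition of the two stages is shown.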
