UNcommonsense Reasoning: Abductive Reasoning about Uncommon Situations

November 14, 2023
Authors: Wenting Zhao, Justin T Chiu, Jena D. Hwang, Faeze Brahman, Jack Hessel, Sanjiban Choudhury, Yejin Choi, Xiang Lorraine Li, Alane Suhr
cs.AI

Abstract

Language technologies that accurately model the dynamics of events must perform commonsense reasoning. Existing work evaluating commonsense reasoning focuses on making inferences about common, everyday situations. To instead investigate the ability to model unusual, unexpected, and unlikely situations, we explore the task of uncommonsense abductive reasoning. Given a piece of context with an unexpected outcome, this task requires reasoning abductively to generate a natural language explanation that makes the unexpected outcome more likely in the context. To this end, we curate and release a new English language corpus called UNcommonsense. We characterize the differences between the performance of human explainers and the best performing large language models, finding that model-enhanced human-written explanations achieve the highest quality by trading off between specificity and diversity. Finally, we experiment with several online imitation learning algorithms to train open and accessible language models on this task. When compared with the vanilla supervised fine-tuning approach, these methods consistently reduce lose rates on both common and uncommonsense abductive reasoning judged by human evaluators.