
UNcommonsense Reasoning: Abductive Reasoning about Uncommon Situations

November 14, 2023
Authors: Wenting Zhao, Justin T Chiu, Jena D. Hwang, Faeze Brahman, Jack Hessel, Sanjiban Choudhury, Yejin Choi, Xiang Lorraine Li, Alane Suhr
cs.AI

Abstract

Language technologies that accurately model the dynamics of events must perform commonsense reasoning. Existing work evaluating commonsense reasoning focuses on making inferences about common, everyday situations. To instead investigate the ability to model unusual, unexpected, and unlikely situations, we explore the task of uncommonsense abductive reasoning. Given a piece of context with an unexpected outcome, this task requires reasoning abductively to generate a natural language explanation that makes the unexpected outcome more likely in the context. To this end, we curate and release a new English-language corpus called UNcommonsense. We characterize the differences between the performance of human explainers and the best-performing large language models, finding that model-enhanced human-written explanations achieve the highest quality by trading off between specificity and diversity. Finally, we experiment with several online imitation learning algorithms to train open and accessible language models on this task. When compared with the vanilla supervised fine-tuning approach, these methods consistently reduce lose rates on both commonsense and uncommonsense abductive reasoning, as judged by human evaluators.
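The abstract reports "lose rates" without defining them. One plausible reading, assumed here and not confirmed by the abstract, is the fraction of pairwise human comparisons in which a model's explanation is judged worse than a reference explanation. The sketch below illustrates that reading only; the function name `lose_rate`, the label set, and the sample judgments are hypothetical and are not data from the paper.

```python
from collections import Counter


def lose_rate(judgments):
    """Return the fraction of pairwise comparisons the model lost.

    Assumption: each human judgment is one of the labels "win", "tie",
    or "lose", given from the model's point of view. This labeling
    scheme is illustrative and is not taken from the paper.
    """
    counts = Counter(judgments)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no judgments given")
    return counts["lose"] / total


# Made-up judgments for two hypothetical training methods.
sft_judgments = ["lose", "lose", "tie", "win"]
online_il_judgments = ["lose", "tie", "win", "win"]

print(lose_rate(sft_judgments))
print(lose_rate(online_il_judgments))
```

Under this reading, a lower value means the model's explanations were judged worse than the reference less often, which is the direction of improvement the abstract describes.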