UNcommonsense Reasoning: Abductive Reasoning about Uncommon Situations
November 14, 2023
Authors: Wenting Zhao, Justin T Chiu, Jena D. Hwang, Faeze Brahman, Jack Hessel, Sanjiban Choudhury, Yejin Choi, Xiang Lorraine Li, Alane Suhr
cs.AI
Abstract
Language technologies that accurately model the dynamics of events must
perform commonsense reasoning. Existing work evaluating commonsense reasoning
focuses on making inferences about common, everyday situations. To instead
investigate the ability to model unusual, unexpected, and unlikely situations,
we explore the task of uncommonsense abductive reasoning. Given a piece of
context with an unexpected outcome, this task requires reasoning abductively to
generate a natural language explanation that makes the unexpected outcome more
likely in the context. To this end, we curate and release a new English
language corpus called UNcommonsense. We characterize the differences between
the performance of human explainers and the best performing large language
models, finding that model-enhanced human-written explanations achieve the
highest quality by trading off between specificity and diversity. Finally, we
experiment with several online imitation learning algorithms to train open and
accessible language models on this task. When compared with the vanilla
supervised fine-tuning approach, these methods consistently reduce lose rates
(the fraction of pairwise comparisons lost) on both common and uncommonsense
abductive reasoning, as judged by human evaluators.
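The task formulation above (context + unexpected outcome → explanation) can be sketched as a simple prompt-construction step. The snippet below is a hypothetical illustration only: the field names, prompt wording, and example text are assumptions, not the actual UNcommonsense schema.

```python
from dataclasses import dataclass


@dataclass
class UncommonsenseExample:
    """One abductive-reasoning instance: a context paired with an unlikely outcome."""
    context: str
    outcome: str


def build_abductive_prompt(ex: UncommonsenseExample) -> str:
    """Ask a model for an explanation that makes the unexpected outcome plausible."""
    return (
        f"Context: {ex.context}\n"
        f"Unexpected outcome: {ex.outcome}\n"
        "Write a short explanation that makes this outcome "
        "more likely given the context."
    )


example = UncommonsenseExample(
    context="A chef tasted the soup and smiled.",
    outcome="Every guest sent the soup back untouched.",
)
print(build_abductive_prompt(example))
```

The prompt string would then be passed to whichever language model is being evaluated; the dataset itself pairs such prompts with human- or model-written explanations.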
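The online imitation learning recipe mentioned above generally follows a DAgger-style pattern: roll out the current learner, have an expert supply corrected outputs, aggregate those into the training set, and retrain. The toy loop below is a schematic sketch of that pattern only; the "learner" and "expert" are trivial stand-ins, not the paper's actual models or algorithms.

```python
def learner_generate(weights: dict, context: str) -> str:
    # Toy "policy": pick the candidate explanation with the highest weight.
    return max(weights, key=weights.get)


def expert_label(context: str) -> str:
    # Stand-in for the expert (e.g., a stronger LLM or a human annotator).
    return "plausible explanation"


def finetune(weights: dict, dataset: list) -> dict:
    # Toy "training step": upweight the expert-preferred outputs.
    for _, target in dataset:
        weights[target] = weights.get(target, 0.0) + 1.0
    return weights


contexts = ["context A", "context B", "context C"]
weights = {"implausible explanation": 1.0, "plausible explanation": 0.5}
dataset = []

for round_ in range(3):                           # online rounds
    for ctx in contexts:
        learner_generate(weights, ctx)            # roll out the current policy
        dataset.append((ctx, expert_label(ctx)))  # expert corrects the rollout
    weights = finetune(weights, dataset)          # retrain on the aggregate set
```

After a few rounds, the learner's weights favor the expert-preferred output, which is the essential mechanism an online imitation method exploits when distilling a stronger explainer into a smaller open model.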