DAR: Deontic Reasoning with Agentic Harnesses

Abstract

Deontic reasoning is the task of answering questions by applying explicit rules and policies to case-specific facts, for example computing tax liability under a statute or determining the outcome of an immigration appeal. A key technical challenge for LLM-based deontic reasoning is that the relevant ruleset can be long and cross-referenced, so models may still fail to locate the rules needed for a particular reasoning step. We introduce Deontic Agentic Reasoning (DAR), an agentic reasoning setup in which the model interacts with the statutes on demand. We evaluate DAR under multiple harnesses on hard subsets of DeonticBench. Across these settings, we find that agentic harnesses can push the frontier on deontic reasoning tasks, but improvements are not uniform: weaker models often degrade on numerical tasks while consuming far more tokens.