DAR: Deontic Reasoning with Agentic Harnesses

Abstract

Deontic reasoning is the task of answering questions by applying explicit rules and policies to case-specific facts, for example computing tax liability under a statute or determining the outcome of an immigration appeal. A key technical challenge for LLM-based deontic reasoning is that the relevant ruleset can be long and cross-referenced, so models may still fail to locate the rules needed for a particular reasoning step. We introduce Deontic Agentic Reasoning (DAR), an agentic reasoning setup in which the model interacts with the statutes on demand. We evaluate DAR under multiple harnesses on hard subsets of DeonticBench. Across these settings, we find that agentic harnesses can push the frontier on deontic reasoning tasks, but improvements are not uniform: weaker models often degrade on numerical tasks while consuming far more tokens.