A Multi-Dimensional Constraint Framework for Evaluating and Improving Instruction Following in Large Language Models

Abstract

Instruction-following evaluation assesses the ability of large language models (LLMs) to generate outputs that adhere to user-defined constraints. However, existing benchmarks often rely on templated constraint prompts, which lack the diversity of real-world usage and limit fine-grained performance assessment. To fill this gap, we propose a multi-dimensional constraint framework encompassing three constraint patterns, four constraint categories, and four difficulty levels. Building on this framework, we develop an automated instruction generation pipeline that performs constraint expansion, conflict detection, and instruction rewriting, yielding 1,200 code-verifiable instruction-following test samples. We evaluate 19 LLMs across seven model families and uncover substantial variation in performance across constraint forms. For instance, average performance drops from 77.67% at Level I to 32.96% at Level IV. Furthermore, we demonstrate the utility of our approach by using it to generate data for reinforcement learning, achieving substantial gains in instruction following without degrading general performance. In-depth analysis indicates that these gains stem primarily from changes to the parameters of the model's attention modules, which enhance constraint recognition and adherence. Code and data are available at https://github.com/Junjie-Ye/MulDimIF.
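To make the notion of a "code-verifiable" test sample concrete, the following is a minimal sketch of how constraints attached to an instruction might be checked programmatically. It is not taken from the released code: the constraint types (a word limit, a required keyword, valid JSON) and the names `max_words`, `contains_keyword`, `is_valid_json`, `verify`, and `follows_all` are all hypothetical and chosen only for illustration.

```python
import json
from typing import Callable, Dict

Check = Callable[[str], bool]


def max_words(limit: int) -> Check:
    """Hypothetical length constraint: at most `limit` whitespace-separated words."""
    return lambda text: len(text.split()) <= limit


def contains_keyword(keyword: str) -> Check:
    """Hypothetical content constraint: the keyword appears (case-insensitive)."""
    return lambda text: keyword.lower() in text.lower()


def is_valid_json(text: str) -> bool:
    """Hypothetical format constraint: the whole response parses as JSON."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


def verify(response: str, checks: Dict[str, Check]) -> Dict[str, bool]:
    """Run every named check and report a pass/fail result per constraint."""
    return {name: check(response) for name, check in checks.items()}


def follows_all(response: str, checks: Dict[str, Check]) -> bool:
    """A response counts as following the instruction only if every check passes."""
    return all(verify(response, checks).values())


if __name__ == "__main__":
    # Illustrative sample: three constraints attached to one instruction.
    checks = {
        "max_words": max_words(10),
        "keyword": contains_keyword("Paris"),
        "json": is_valid_json,
    }
    response = '{"summary": "Paris is the capital of France."}'
    print(verify(response, checks))
    print(follows_all(response, checks))
```

Because each check is a plain function from a response string to a boolean, a sample with more constraints simply carries more entries in `checks`, and scoring needs no human or model judge. The benchmark's actual verification logic may differ from this sketch.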
