A Multi-Dimensional Constraint Framework for Evaluating and Improving Instruction Following in Large Language Models

Abstract

Instruction-following evaluation assesses the ability of large language models (LLMs) to generate outputs that adhere to user-defined constraints. However, existing benchmarks often rely on templated constraint prompts, which lack the diversity of real-world usage and limit fine-grained performance assessment. To fill this gap, we propose a multi-dimensional constraint framework encompassing three constraint patterns, four constraint categories, and four difficulty levels. Building on this framework, we develop an automated instruction generation pipeline that performs constraint expansion, conflict detection, and instruction rewriting, yielding 1,200 code-verifiable instruction-following test samples. We evaluate 19 LLMs across seven model families and uncover substantial variation in performance across constraint forms. For instance, average performance drops from 77.67% at Level I to 32.96% at Level IV. Furthermore, we demonstrate the utility of our approach by using it to generate data for reinforcement learning, achieving substantial gains in instruction following without degrading general performance. In-depth analysis indicates that these gains stem primarily from changes to the parameters of the model's attention modules, which enhance constraint recognition and adherence. Code and data are available at https://github.com/Junjie-Ye/MulDimIF.
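
To make the notion of a "code-verifiable" test sample concrete, the following is a minimal sketch of how a response might be checked programmatically against several constraints at once. The checker names (`max_words`, `contains_keyword`, `is_valid_json`, `verify`) and the specific constraints are hypothetical illustrations, not taken from the released code, and they are not meant to reproduce the framework's actual constraint categories.

```python
import json
from typing import Callable, Dict

# Hypothetical checkers: the names and the choice of constraints are
# illustrative assumptions, not the paper's implementation.

def max_words(limit: int) -> Callable[[str], bool]:
    """Return a check that passes when the text has at most `limit` words."""
    return lambda text: len(text.split()) <= limit

def contains_keyword(keyword: str) -> Callable[[str], bool]:
    """Return a check that passes when `keyword` appears (case-insensitive)."""
    return lambda text: keyword.lower() in text.lower()

def is_valid_json(text: str) -> bool:
    """Pass when the whole response parses as JSON."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False

def verify(response: str, checks: Dict[str, Callable[[str], bool]]) -> Dict[str, bool]:
    """Run every named check on the response and report each outcome."""
    return {name: check(response) for name, check in checks.items()}

# A sample carrying three constraints at once; adding more constraints to a
# single instruction is one plausible way to make it harder to satisfy.
checks = {
    "at_most_10_words": max_words(10),
    "mentions_attention": contains_keyword("attention"),
    "valid_json": is_valid_json,
}

response = '{"summary": "Attention layers matter."}'
results = verify(response, checks)
all_satisfied = all(results.values())
```

Under this sketch, a response counts as following the instruction only if every check passes, so a single violated constraint fails the whole sample. Whether the benchmark scores samples in exactly this all-or-nothing way is an assumption here.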
