
Entropy Regularizing Activation: Boosting Continuous Control, Large Language Models, and Image Classification with Activation as Entropy Constraints

October 9, 2025
Authors: Zilin Kang, Chonghua Liao, Tingqiang Xu, Huazhe Xu
cs.AI

Abstract

We propose ERA, a new paradigm that constrains the sampling entropy above a given threshold by applying specially designed activations to the outputs of models. Our approach demonstrates broad effectiveness across different domains: 1) for large language models (LLMs), boosting the AIME 2025 score for Qwen2.5-Math-7B by 37.4%; 2) for continuous control reinforcement learning agents, improving performance by more than 30% over strong baselines such as SAC on the challenging HumanoidBench; 3) for image classification, enhancing ImageNet top-1 accuracy by 0.69% for ResNet-50. These gains are achieved with a computational overhead of less than 7%. Our work validates output activation as a powerful tool for entropy control, opening a new direction for designing simpler and more robust algorithms.
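The abstract does not spell out ERA's activation design, but the core idea, transforming the model's output so that the entropy of the sampling distribution cannot fall below a threshold, can be illustrated with a minimal sketch. Everything below is an assumption for illustration only: a categorical output head, and a hypothetical `min_entropy_activation` that enforces the constraint by bisecting on a per-row softmax temperature. It is not the paper's actual activation.

```python
import torch
import torch.nn.functional as F

def entropy(logits: torch.Tensor) -> torch.Tensor:
    """Shannon entropy (in nats) of the categorical distribution softmax(logits)."""
    logp = F.log_softmax(logits, dim=-1)
    return -(logp.exp() * logp).sum(dim=-1)

def min_entropy_activation(logits: torch.Tensor,
                           h_min: float,
                           iters: int = 30) -> torch.Tensor:
    """Hypothetical output activation (NOT the paper's ERA): rescale each
    row of logits by a temperature, found by bisection, so that the
    sampling entropy is at least h_min. Softmax entropy increases
    monotonically with temperature, so bisection is well-posed. Rows
    whose entropy already exceeds h_min are left unchanged."""
    t_lo = torch.ones(logits.shape[:-1], device=logits.device)
    t_hi = torch.full_like(t_lo, 100.0)  # large temperature ~ near-uniform
    needs_fix = entropy(logits) < h_min
    for _ in range(iters):
        t_mid = 0.5 * (t_lo + t_hi)
        h = entropy(logits / t_mid.unsqueeze(-1))
        t_lo = torch.where(h < h_min, t_mid, t_lo)   # too peaked: heat up
        t_hi = torch.where(h >= h_min, t_mid, t_hi)  # constraint met: cool down
    # t_hi always satisfies the constraint; apply it only where needed.
    t = torch.where(needs_fix, t_hi, torch.ones_like(t_hi))
    return logits / t.unsqueeze(-1)

# Example: a sharp row is smoothed until H >= 1.0 nat (max for 3 classes
# is ln 3 ~ 1.0986); an already-uniform row passes through unchanged.
x = torch.tensor([[10.0, 0.0, 0.0], [1.0, 1.0, 1.0]])
y = min_entropy_activation(x, h_min=1.0)
print(entropy(y))  # both rows now have entropy >= 1.0
```

An iterative search like this is purely illustrative; the paper's reported overhead of under 7% suggests ERA uses a much cheaper, directly applied activation rather than a per-step root-finding loop.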