
Entropy Regularizing Activation: Boosting Continuous Control, Large Language Models, and Image Classification with Activation as Entropy Constraints

October 9, 2025
Authors: Zilin Kang, Chonghua Liao, Tingqiang Xu, Huazhe Xu
cs.AI

Abstract

We propose ERA, a new paradigm that constrains the sampling entropy above a given threshold by applying specially designed activations to the outputs of models. Our approach demonstrates broad effectiveness across different domains: 1) for large language models (LLMs), boosting the AIME 2025 score for Qwen2.5-Math-7B by 37.4%; 2) for continuous control reinforcement learning agents, improving performance by more than 30% over strong baselines such as SAC on the challenging HumanoidBench; 3) for image classification, enhancing ImageNet top-1 accuracy by 0.69% for ResNet-50. These gains are achieved with a computational overhead of less than 7%. Our work validates output activation as a powerful tool for entropy control, opening a new direction for designing simpler and more robust algorithms.
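The abstract does not spell out ERA's activation design, but the core idea, transforming the model's output so that the entropy of the sampling distribution cannot fall below a threshold, can be illustrated with a minimal sketch. Everything below is an assumption for illustration only: a categorical output head, and a hypothetical `min_entropy_activation` that enforces the constraint by bisecting on a per-row softmax temperature. It is not the paper's actual activation.

```python
import torch
import torch.nn.functional as F

def entropy(logits: torch.Tensor) -> torch.Tensor:
    """Shannon entropy (in nats) of the categorical distribution softmax(logits)."""
    logp = F.log_softmax(logits, dim=-1)
    return -(logp.exp() * logp).sum(dim=-1)

def min_entropy_activation(logits: torch.Tensor,
                           h_min: float,
                           iters: int = 30) -> torch.Tensor:
    """Hypothetical output activation (NOT the paper's ERA): rescale each
    row of logits by a temperature, found by bisection, so that the
    sampling entropy is at least h_min. Softmax entropy increases
    monotonically with temperature, so bisection is well-posed. Rows
    whose entropy already exceeds h_min are left unchanged."""
    t_lo = torch.ones(logits.shape[:-1], device=logits.device)
    t_hi = torch.full_like(t_lo, 100.0)  # large temperature ~ near-uniform
    needs_fix = entropy(logits) < h_min
    for _ in range(iters):
        t_mid = 0.5 * (t_lo + t_hi)
        h = entropy(logits / t_mid.unsqueeze(-1))
        t_lo = torch.where(h < h_min, t_mid, t_lo)   # too peaked: heat up
        t_hi = torch.where(h >= h_min, t_mid, t_hi)  # constraint met: cool down
    # t_hi always satisfies the constraint; apply it only where needed.
    t = torch.where(needs_fix, t_hi, torch.ones_like(t_hi))
    return logits / t.unsqueeze(-1)

# Example: a sharp row is smoothed until H >= 1.0 nat (max for 3 classes
# is ln 3 ~ 1.0986); an already-uniform row passes through unchanged.
x = torch.tensor([[10.0, 0.0, 0.0], [1.0, 1.0, 1.0]])
y = min_entropy_activation(x, h_min=1.0)
print(entropy(y))  # both rows now have entropy >= 1.0
```

An iterative search like this is purely illustrative; the paper's reported overhead of under 7% suggests ERA uses a much cheaper, directly applied activation rather than a per-step root-finding loop.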