Entropy Regularizing Activation: Boosting Continuous Control, Large Language Models, and Image Classification with Activation as Entropy Constraints
October 9, 2025
Authors: Zilin Kang, Chonghua Liao, Tingqiang Xu, Huazhe Xu
cs.AI
Abstract
We propose ERA, a new paradigm that constrains the sampling entropy above
given thresholds by applying specially designed activations to the outputs of
models. Our approach demonstrates broad effectiveness across different domains:
1) for large language models (LLMs), boosting the AIME 2025 score for
Qwen2.5-Math-7B by 37.4%; 2) for continuous control reinforcement learning
agents, improving performance by more than 30% over strong baselines such as
SAC on the challenging HumanoidBench; 3) for image classification, enhancing
ImageNet top-1 accuracy by 0.69% for ResNet-50. These gains are achieved with a
computational overhead of less than 7%. Our work validates output activation as
a powerful tool for entropy control, opening a new direction for designing
simpler and more robust algorithms.
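To make the core idea concrete, here is a minimal sketch of one way an output activation can enforce an entropy floor. The abstract does not specify ERA's actual activation, so everything below is an illustrative assumption: for a diagonal-Gaussian policy, the entropy is determined entirely by the log-standard-deviations, so mapping the network's raw log-std outputs through a softplus-based activation that bounds each one from below guarantees the total entropy stays above a chosen threshold. The function names and parameterization are hypothetical, not from the paper.

```python
import numpy as np

def gaussian_entropy(log_std):
    # Differential entropy of a diagonal Gaussian:
    # H = 0.5 * d * log(2*pi*e) + sum_i log_std_i
    d = log_std.size
    return 0.5 * d * np.log(2 * np.pi * np.e) + log_std.sum()

def entropy_floor_activation(raw_log_std, min_entropy):
    # Hypothetical ERA-style activation (an assumption, not the paper's
    # design): shift-and-softplus the raw log-std outputs so that each
    # coordinate exceeds (min_entropy - base) / d, which forces the
    # total Gaussian entropy above min_entropy for any network output.
    d = raw_log_std.size
    base = 0.5 * d * np.log(2 * np.pi * np.e)
    per_dim_floor = (min_entropy - base) / d
    # softplus(x) > 0 for all x, so the output is strictly above the floor.
    slack = np.log1p(np.exp(raw_log_std - per_dim_floor))
    return per_dim_floor + slack

# Even extreme raw outputs cannot drive entropy below the threshold.
raw = np.array([-5.0, -3.0, 0.0])
log_std = entropy_floor_activation(raw, min_entropy=1.0)
print(gaussian_entropy(log_std) >= 1.0)  # True
```

The appeal of this formulation, as the abstract suggests, is that the constraint lives in the architecture rather than in an auxiliary loss term: no entropy-bonus coefficient needs tuning, and the bound holds by construction at every training step.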