
Revisiting Model Interpolation for Efficient Reasoning

October 13, 2025
作者: Taiqiang Wu, Runming Yang, Tao Liu, Jiahao Wang, Ngai Wong
cs.AI

Abstract

Model merging, typically on Instruct and Thinking models, has shown remarkable performance for efficient reasoning. In this paper, we systematically revisit the simplest merging method that interpolates two weights directly. Particularly, we observe that model interpolation follows a three-stage evolutionary paradigm with distinct behaviors on the reasoning trajectory. These dynamics provide a principled guide for navigating the performance-cost trade-off. Empirical results demonstrate that a strategically interpolated model surprisingly surpasses sophisticated model merging baselines on both efficiency and effectiveness. We further validate our findings with extensive ablation studies on model layers, modules, and decoding strategies. Ultimately, this work demystifies model interpolation and offers a practical framework for crafting models with precisely targeted reasoning capabilities. Code is available at https://github.com/wutaiqiang/MI.
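The "simplest merging method that interpolates two weights directly" described in the abstract amounts to a linear blend of the two models' parameters, θ = (1 − α)·θ_instruct + α·θ_thinking. A minimal sketch of this idea (not the paper's released code; function and key names are illustrative, and real checkpoints would hold per-layer tensors rather than scalars):

```python
def interpolate_weights(instruct_sd, thinking_sd, alpha):
    """Linearly interpolate two parameter dicts:
    theta = (1 - alpha) * instruct + alpha * thinking.
    alpha=0 recovers the Instruct model, alpha=1 the Thinking model.
    """
    assert instruct_sd.keys() == thinking_sd.keys(), "models must share architecture"
    return {
        name: (1 - alpha) * instruct_sd[name] + alpha * thinking_sd[name]
        for name in instruct_sd
    }

# Toy example with scalar "weights"; a real state dict maps layer names to tensors.
instruct = {"layer0.weight": 1.0, "layer0.bias": 0.0}
thinking = {"layer0.weight": 3.0, "layer0.bias": 2.0}
merged = interpolate_weights(instruct, thinking, alpha=0.5)
```

Sweeping α between 0 and 1 traces the interpolation path whose three-stage behavior the paper analyzes; picking α strategically along this path is what trades reasoning cost against performance.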