Control LLM: Controlled Evolution for Intelligence Retention in LLM
January 19, 2025
Authors: Haichao Wei, Yunxiang Ren, Zhoutong Fu, Aman Lunia, Yi-Lin Chen, Alice Leung, Ya Xu
cs.AI
Abstract
Large Language Models (LLMs) demand significant computational resources,
making it essential to enhance their capabilities without retraining from
scratch. A key challenge in this domain is catastrophic forgetting
(CF), which hampers performance during Continuous Pre-training (CPT) and
Continuous Supervised Fine-Tuning (CSFT). We propose Control LLM, a
novel approach that leverages parallel pre-trained and expanded transformer
blocks, aligning their hidden states through interpolation strategies. This
method effectively preserves performance on existing tasks while seamlessly
integrating new knowledge.
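The core idea above — running a frozen pre-trained block in parallel with a new trainable block and interpolating their hidden states — can be sketched as follows. This is a minimal illustration, not the paper's implementation: the blocks are stand-in functions on plain float vectors rather than real transformer layers, and `alpha`, `frozen_block`, `expanded_block`, and `control_block` are names invented here for clarity.

```python
def frozen_block(h):
    # Stand-in for a frozen pre-trained transformer block (retains old knowledge).
    return [x * 0.9 for x in h]

def expanded_block(h):
    # Stand-in for the new trainable expanded block (learns new tasks).
    return [x * 1.2 + 0.1 for x in h]

def control_block(h, alpha=0.5):
    """Run both branches in parallel, then interpolate their hidden states.

    alpha = 0 recovers the frozen branch (full retention of the original
    model); alpha = 1 uses only the expanded branch (full plasticity).
    Linear interpolation is the simplest of the alignment strategies the
    paper describes.
    """
    h_frozen = frozen_block(h)
    h_new = expanded_block(h)
    return [(1 - alpha) * a + alpha * b for a, b in zip(h_frozen, h_new)]

hidden = [1.0, -0.5, 2.0]
blended = control_block(hidden, alpha=0.5)  # mixes old and new representations
```

Because the frozen branch always contributes to the blended hidden state, the original model's representations are never fully overwritten, which is what mitigates catastrophic forgetting during continued training.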
Extensive experiments demonstrate the effectiveness of Control LLM in both
CPT and CSFT. On Llama3.1-8B-Instruct, it achieves significant improvements in
mathematical reasoning (+14.4% on Math-Hard) and coding performance (+10%
on MBPP-PLUS). On Llama3.1-8B, it enhances multilingual capabilities (+10.6%
on C-Eval, +6.8% on CMMLU, and +30.2% on CMMLU-0shot-CoT). It surpasses
existing methods and achieves SOTA among open-source models tuned from the same
base model, using substantially less data and compute. Crucially, these gains
are realized while preserving strong original capabilities, with minimal
degradation (<4.3% on MMLU) compared to >35% in open-source Math
and Coding models. This approach has been successfully deployed in LinkedIn's
GenAI-powered job seeker and Ads unit products.
To support further research, we release the training and evaluation code
(https://github.com/linkedin/ControlLLM) along with models trained on
public datasets (https://huggingface.co/ControlLLM) to the community.