Target-Oriented Pretraining Data Selection via Neuron-Activated Graph

Abstract

Everyday tasks come with a target, and pretraining a model around that target is what turns it into an expert. In this paper, we study target-oriented language model (LM) pretraining by introducing Neuron-Activated Graph Ranking (NAG-based Ranking), a training-free and interpretable framework for target pretraining data selection. Rather than relying on black-box representations, our approach directly characterizes each target input by a sparse set of high-impact neurons in any off-the-shelf LLM. Concretely, we quantify neuron impact, assemble the most influential neurons across layers into a compact Neuron-Activated Graph (NAG), and rank candidate data by NAG similarity to the target examples. In experiments across six benchmarks, NAG-based Ranking improves target-oriented pretraining by 4.9% on average over random sampling and outperforms state-of-the-art baselines by 5.3% in accuracy on HellaSwag. It also remains effective in a more practical multi-target setting, where our best setup surpasses two baselines by 1.1% and 4.1%, respectively. Furthermore, we provide a comprehensive analysis of why and how NAG works. For example, deactivating the NAG-selected neurons (only 0.12% of all neurons) causes a 23.5% performance collapse, and restricting NAG to the final layer incurs a 4.1% average drop, indicating that NAG captures a sparse "functional backbone" for learning target features. We release the code at https://github.com/asillycat/NAG.
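The ranking pipeline summarized in the abstract (score neurons, keep the most influential ones per layer, compare the resulting graphs) can be illustrated with a minimal sketch. This is not the released implementation. It assumes that per-layer neuron impact scores have already been computed by some means, treats a NAG simply as a set of (layer, neuron) pairs, and uses Jaccard overlap as the similarity measure. The names build_nag, nag_similarity and rank_candidates, the parameter k, and the mean-over-targets scoring are all hypothetical choices made for illustration.

```python
from typing import Dict, FrozenSet, List, Sequence, Tuple

# Hypothetical input format: impact[layer][neuron] is a precomputed impact score.
Impact = Sequence[Sequence[float]]
# Assumption: a NAG is represented only by its nodes, as (layer, neuron) pairs.
NAG = FrozenSet[Tuple[int, int]]


def build_nag(impact: Impact, k: int) -> NAG:
    """Keep the k highest-scoring neurons of every layer as (layer, neuron) pairs."""
    nodes = set()
    for layer, scores in enumerate(impact):
        top = sorted(range(len(scores)), key=lambda n: scores[n], reverse=True)[:k]
        nodes.update((layer, n) for n in top)
    return frozenset(nodes)


def nag_similarity(a: NAG, b: NAG) -> float:
    """Jaccard overlap between two NAGs (an assumed similarity measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank_candidates(
    target_nags: List[NAG], candidate_nags: Dict[str, NAG]
) -> List[Tuple[str, float]]:
    """Score each candidate by its mean similarity to the targets, highest first."""
    if not target_nags:
        raise ValueError("at least one target NAG is required")
    scored = []
    for doc_id, nag in candidate_nags.items():
        score = sum(nag_similarity(nag, t) for t in target_nags) / len(target_nags)
        scored.append((doc_id, score))
    return sorted(scored, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Toy scores for a 2-layer, 3-neuron model; real scores would come from an LLM.
    target = build_nag([[0.9, 0.1, 0.0], [0.2, 0.8, 0.1]], k=1)
    candidates = {
        "doc_a": build_nag([[0.7, 0.2, 0.1], [0.1, 0.9, 0.0]], k=1),
        "doc_b": build_nag([[0.0, 0.1, 0.9], [0.6, 0.1, 0.2]], k=1),
    }
    for doc_id, score in rank_candidates([target], candidates):
        print(doc_id, score)
```

In this toy run, doc_a shares both of its top neurons with the target and is ranked first, while doc_b shares none. A set-based representation keeps the comparison cheap when only a small fraction of neurons is retained, but the actual graph structure and similarity used by the authors may differ.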
