
LPZero: Language Model Zero-cost Proxy Search from Zero

October 7, 2024
Authors: Peijie Dong, Lujun Li, Xiang Liu, Zhenheng Tang, Xuebo Liu, Qiang Wang, Xiaowen Chu
cs.AI

Abstract

Despite its outstanding performance, Neural Architecture Search (NAS) is criticized for its massive computational cost. Recently, Zero-shot NAS has emerged as a promising alternative: it relies on Zero-cost (ZC) proxies, which markedly reduce computational demands. However, existing ZC proxies depend heavily on expert knowledge and incur significant trial-and-error costs. In NLP tasks in particular, most existing ZC proxies fail to surpass the naive baseline. To address these challenges, we introduce LPZero, a novel framework that is the first to automatically design ZC proxies for various tasks, achieving higher ranking consistency than human-designed proxies. Specifically, we model a ZC proxy as a symbolic equation and define a unified proxy search space, built from a predefined set of mathematical symbols, that encompasses existing ZC proxies. To search this space heuristically, LPZero uses genetic programming to find the optimal symbolic composition. We also propose a Rule-based Pruning Strategy (RPS), which preemptively eliminates unpromising proxies and thereby mitigates the risk of proxy degradation. Extensive experiments on FlexiBERT, GPT-2, and LLaMA-7B demonstrate that LPZero achieves superior ranking ability and downstream-task performance compared with current approaches.
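The search loop described in the abstract can be illustrated with a small toy sketch. It is not the authors' implementation: the symbol set (`UNARY`, `BINARY`), the two per-architecture statistics (`weight`, `grad`), the single pruning rule in `is_pruned`, the synthetic data, and the use of Kendall's tau as the ranking-consistency measure are all assumptions made for illustration only.

```python
import math
import random

# Hypothetical symbol set; the actual symbols used by LPZero are not listed here.
UNARY = {
    "identity": lambda x: x,
    "neg": lambda x: -x,
    "abs": abs,
    "square": lambda x: x * x,
    "log1p_abs": lambda x: math.log1p(abs(x)),
}
BINARY = {"add": lambda a, b: a + b, "mul": lambda a, b: a * b}
INPUTS = ("weight", "grad")  # assumed per-architecture statistics
POOLS = [INPUTS, list(UNARY), INPUTS, list(UNARY), list(BINARY)]


def kendall_tau(xs, ys):
    """Kendall's tau-a: (concordant - discordant) / number of pairs."""
    n, score = len(xs), 0
    for i in range(n):
        for j in range(i + 1, n):
            s = (xs[i] - xs[j]) * (ys[i] - ys[j])
            score += (s > 0) - (s < 0)
    return score / (n * (n - 1) / 2)


def evaluate(proxy, stats):
    """A proxy is the symbolic equation bin(un_a(in_a), un_b(in_b))."""
    in_a, un_a, in_b, un_b, bin_op = proxy
    return BINARY[bin_op](UNARY[un_a](stats[in_a]), UNARY[un_b](stats[in_b]))


def is_pruned(proxy):
    """Illustrative rule: drop x + (-x), which is zero for every architecture."""
    in_a, un_a, in_b, un_b, bin_op = proxy
    return bin_op == "add" and in_a == in_b and {un_a, un_b} == {"identity", "neg"}


def random_proxy(rng):
    return tuple(rng.choice(pool) for pool in POOLS)


def mutate(proxy, rng):
    genes = list(proxy)
    i = rng.randrange(len(genes))
    genes[i] = rng.choice(POOLS[i])  # resample one symbol
    return tuple(genes)


def crossover(a, b, rng):
    cut = rng.randrange(1, len(a))  # single-point crossover
    return a[:cut] + b[cut:]


def fitness(proxy, archs, accs):
    return kendall_tau([evaluate(proxy, s) for s in archs], accs)


def search(archs, accs, generations=20, pop_size=16, seed=0):
    rng = random.Random(seed)
    population = []
    while len(population) < pop_size:
        cand = random_proxy(rng)
        if not is_pruned(cand):  # prune before any evaluation
            population.append(cand)
    for _ in range(generations):
        population.sort(key=lambda p: fitness(p, archs, accs), reverse=True)
        parents = population[: pop_size // 2]  # keep the better half
        children = []
        while len(children) < pop_size - len(parents):
            child = mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
            if not is_pruned(child):
                children.append(child)
        population = parents + children
    best = max(population, key=lambda p: fitness(p, archs, accs))
    return best, fitness(best, archs, accs)


# Synthetic "architectures" and ground-truth scores, invented for this sketch.
data_rng = random.Random(42)
archs = [
    {"weight": data_rng.uniform(0.1, 2.0), "grad": data_rng.uniform(-1.0, 1.0)}
    for _ in range(30)
]
accs = [s["weight"] * s["grad"] ** 2 for s in archs]

best_proxy, best_tau = search(archs, accs)
print(best_proxy, round(best_tau, 3))
```

In this sketch the pruning rule is applied before a candidate is ever scored, which mirrors the "preemptive" elimination the abstract attributes to RPS; a real search space would operate on model weights, gradients, and activations rather than two scalar statistics.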

