
LPZero: Language Model Zero-cost Proxy Search from Zero

October 7, 2024
Authors: Peijie Dong, Lujun Li, Xiang Liu, Zhenheng Tang, Xuebo Liu, Qiang Wang, Xiaowen Chu
cs.AI

Abstract

Despite its outstanding performance, Neural Architecture Search (NAS) is criticized for its massive computational cost. Recently, zero-shot NAS has emerged as a promising approach: it exploits zero-cost (ZC) proxies, which markedly reduce computational demands. However, existing ZC proxies rely heavily on expert knowledge and incur significant trial-and-error costs. In NLP tasks in particular, most existing ZC proxies fail to surpass a naive baseline. To address these challenges, we introduce LPZero, a novel framework that is the first to automatically design ZC proxies for various tasks, achieving higher ranking consistency than human-designed proxies. Specifically, we model a ZC proxy as a symbolic equation and build a unified proxy search space that encompasses existing ZC proxies, which are themselves composed of a predefined set of mathematical symbols. To search this space heuristically, LPZero uses genetic programming to find the optimal symbolic composition. We also propose a Rule-based Pruning Strategy (RPS), which preemptively eliminates unpromising proxies, thereby mitigating the risk of proxy degradation. Extensive experiments on FlexiBERT, GPT-2, and LLaMA-7B demonstrate LPZero's superior ranking ability and performance on downstream tasks compared to current approaches.
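
The following is a minimal sketch of the general idea described in the abstract, not the authors' implementation. It treats a proxy as a small symbolic expression (an input statistic followed by a chain of unary operations), evolves a population of such expressions, discards candidates that violate simple rules before they are evaluated, and scores the survivors by rank correlation against ground-truth values. The operation set, the pruning rules, the `make_toy_archs` data generator, and all other names are hypothetical and chosen only for illustration; the abstract does not specify them.

```python
import math
import random

# Hypothetical symbol set; the abstract does not list the actual symbols.
UNARY_OPS = {
    "identity": lambda v: v,
    "abs": abs,
    "square": lambda v: v * v,
    "log1p_abs": lambda v: math.log1p(abs(v)),
    "neg": lambda v: -v,
}
INPUTS = ("weight", "grad")
# Hypothetical pruning rules: consecutive op pairs that add nothing.
REDUNDANT_PAIRS = {("abs", "abs"), ("neg", "neg"),
                   ("identity", "identity"), ("square", "abs")}

def evaluate(proxy, arch):
    """Apply the op chain element-wise to one input statistic, then average."""
    input_name, ops = proxy
    values = arch[input_name]
    for op in ops:
        values = [UNARY_OPS[op](v) for v in values]
    return sum(values) / len(values)

def is_pruned(proxy):
    """Rule-based filter applied before any (costly) evaluation."""
    ops = proxy[1]
    return any(pair in REDUNDANT_PAIRS for pair in zip(ops, ops[1:]))

def ranks(xs):
    """Ordinal ranks; ties are broken by position (a simplification)."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    result = [0.0] * len(xs)
    for rank, i in enumerate(order):
        result[i] = float(rank)
    return result

def spearman(a, b):
    """Spearman correlation computed as Pearson correlation of the ranks."""
    ra, rb = ranks(a), ranks(b)
    n = len(a)
    ma, mb = sum(ra) / n, sum(rb) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(ra, rb))
    va = sum((x - ma) ** 2 for x in ra)
    vb = sum((y - mb) ** 2 for y in rb)
    return cov / math.sqrt(va * vb)

def random_proxy(rng, depth=2):
    ops = tuple(rng.choice(list(UNARY_OPS)) for _ in range(depth))
    return (rng.choice(INPUTS), ops)

def mutate(proxy, rng):
    ops = list(proxy[1])
    ops[rng.randrange(len(ops))] = rng.choice(list(UNARY_OPS))
    return (proxy[0], tuple(ops))

def crossover(a, b, rng):
    cut = rng.randrange(1, len(a[1]))
    return (rng.choice((a[0], b[0])), a[1][:cut] + b[1][cut:])

def search(archs, truth, generations=20, pop_size=12, seed=0):
    """Evolve proxies; fitness is rank correlation with the ground truth."""
    rng = random.Random(seed)

    def fitness(proxy):
        return spearman([evaluate(proxy, a) for a in archs], truth)

    population = []
    while len(population) < pop_size:
        candidate = random_proxy(rng)
        if not is_pruned(candidate):
            population.append(candidate)
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]  # keep the better half
        children = []
        while len(children) < pop_size - len(parents):
            child = mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
            if not is_pruned(child):  # pruned candidates are never evaluated
                children.append(child)
        population = parents + children
    best = max(population, key=fitness)
    return best, fitness(best)

def make_toy_archs(n=16, seed=1):
    """Synthetic stand-in for real architectures and their measured accuracy."""
    rng = random.Random(seed)
    archs, truth = [], []
    for _ in range(n):
        scale = rng.uniform(0.5, 2.0)
        archs.append({"weight": [rng.gauss(0, 1) for _ in range(32)],
                      "grad": [rng.gauss(0, scale) for _ in range(32)]})
        truth.append(scale)  # toy "accuracy" tied to gradient magnitude
    return archs, truth

if __name__ == "__main__":
    toy_archs, toy_truth = make_toy_archs()
    print(search(toy_archs, toy_truth))
```

In a real setting, the toy statistics would presumably be replaced by quantities taken from an untrained network (for example, weights or gradients from a single batch), and the ground truth by measured downstream performance. The sketch only illustrates how symbolic composition, rule-based pruning, and ranking-based fitness fit together.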
