φ-Decoding: Adaptive Foresight Sampling for Balanced Inference-Time Exploration and Exploitation

Abstract

Inference-time optimization scales up computation to derive deliberate reasoning steps and thereby improve performance. Previous search-based strategies address the short-sightedness of auto-regressive generation, but the vast search space leads to excessive exploration and insufficient exploitation. To balance the two efficiently when deriving the optimal step, we frame the decoding strategy as foresight sampling, which leverages simulated future steps to obtain a globally optimal step estimation. Building on this, we propose a novel decoding strategy named φ-Decoding. To provide a precise and expressive estimate of step value, φ-Decoding approximates two distributions via foresight and clustering. Sampling from their joint distribution selects the optimal steps for exploitation. To support adaptive computation allocation, we propose in-width and in-depth pruning strategies, a lightweight solution for inference efficiency. Extensive experiments across seven benchmarks show that φ-Decoding outperforms strong baselines in both performance and efficiency. Additional analysis demonstrates its generalization across various LLMs and its scalability across a wide range of computing budgets. The code will be released at https://github.com/xufangzhi/phi-Decoding, and an open-source PyPI package is coming soon.
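
As a rough illustration of the selection step, the sketch below builds two per-candidate distributions (one from foresight scores, one from cluster sizes), combines them, and samples a step. It is a minimal sketch under stated assumptions, not the released implementation: the function names, the softmax normalization, the use of cluster size as the clustering signal, and the product-then-renormalize combination rule are all hypothetical choices made here for illustration, since the abstract does not specify them.

```python
import math
import random
from collections import Counter


def softmax(values, temperature=1.0):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp((v - peak) / temperature) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def joint_step_distribution(foresight_scores, cluster_ids, temperature=1.0):
    """Combine two per-candidate distributions into one sampling distribution.

    foresight_scores: hypothetical score for each candidate step, e.g. the
        log-probability of its simulated future rollout (an assumption).
    cluster_ids: hypothetical cluster label for each candidate's rollout.
    """
    # Distribution 1: candidates whose simulated futures score higher get more mass.
    from_foresight = softmax(foresight_scores, temperature)

    # Distribution 2: candidates that fall in larger clusters get more mass.
    sizes = Counter(cluster_ids)
    n = len(cluster_ids)
    from_clusters = softmax([sizes[c] / n for c in cluster_ids], temperature)

    # Assumed combination rule: elementwise product, then renormalize.
    product = [a * b for a, b in zip(from_foresight, from_clusters)]
    total = sum(product)
    return [p / total for p in product]


def sample_step(candidates, probs, rng=random):
    """Draw one candidate step according to the combined distribution."""
    return rng.choices(candidates, weights=probs, k=1)[0]


if __name__ == "__main__":
    candidates = ["step A", "step B", "step C"]
    probs = joint_step_distribution([-1.2, -0.4, -2.0], [0, 0, 1])
    print(sample_step(candidates, probs, random.Random(0)))
```

Sampling from the combined distribution, rather than always taking its argmax, keeps some exploration among candidate steps while still favoring those that both look promising under foresight and agree with many other rollouts.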
