PathFinder: Guided Search over Multi-Step Reasoning Paths

Abstract

With recent advancements in large language models, methods such as chain-of-thought prompting, which elicit reasoning chains, have been shown to improve results on reasoning tasks. However, tasks that require multiple steps of reasoning still pose significant challenges to state-of-the-art models. Drawing inspiration from the beam search algorithm, we propose PathFinder, a tree-search-based approach to generating reasoning paths. It enhances diverse branching and multi-hop reasoning by integrating dynamic decoding, enabled by varying sampling methods and parameters. Using constrained reasoning, PathFinder integrates novel quality constraints, pruning, and exploration methods to improve the efficiency and quality of generation. It also includes scoring and ranking features to improve candidate selection. Our approach outperforms competitive baselines on three complex arithmetic and commonsense reasoning tasks by 6% on average. Our model generalizes well to longer, unseen reasoning chains, reflecting similar complexities to beam search with large branching factors.
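
To make the general idea concrete, the following is a minimal sketch of a beam-style tree search over reasoning steps with pruning and score-based ranking. It is not PathFinder's implementation: the names `guided_search`, `toy_propose`, and `toy_is_final`, the additive scoring, and the two quality constraints (a minimum step score and no repeated steps) are assumptions made for illustration. The `propose` callback stands in for a language model sampled under varying decoding settings.

```python
from typing import Callable, List, Tuple

Path = Tuple[str, ...]
Scored = Tuple[Path, float]


def guided_search(
    propose: Callable[[Path], List[Tuple[str, float]]],
    is_final: Callable[[Path], bool],
    beam_width: int = 2,
    max_depth: int = 4,
    min_step_score: float = 0.1,
) -> List[Scored]:
    """Beam-style search over reasoning paths; returns finished paths, best first."""
    frontier: List[Scored] = [((), 0.0)]
    finished: List[Scored] = []
    for _ in range(max_depth):
        candidates: List[Scored] = []
        for path, score in frontier:
            for step, step_score in propose(path):
                # Illustrative quality constraints: drop weak or repeated steps.
                if step_score < min_step_score or step in path:
                    continue
                candidates.append((path + (step,), score + step_score))
        # Pruning: keep only the top `beam_width` paths at this depth.
        candidates.sort(key=lambda c: c[1], reverse=True)
        frontier = []
        for path, score in candidates[:beam_width]:
            (finished if is_final(path) else frontier).append((path, score))
        if not frontier:
            break
    # Ranking: order completed paths by cumulative score.
    return sorted(finished, key=lambda c: c[1], reverse=True)


# Hypothetical stand-in for a language model: a fixed table of candidate
# next steps with made-up scores, keyed by the most recent step.
def toy_propose(path: Path) -> List[Tuple[str, float]]:
    table = {
        None: [("3 + 2 = 5", 0.9), ("3 * 2 = 6", 0.4), ("guess", 0.05)],
        "3 + 2 = 5": [("3 + 2 = 5", 0.8), ("answer: 5", 0.7)],
        "3 * 2 = 6": [("answer: 6", 0.9)],
    }
    return table.get(path[-1] if path else None, [])


def toy_is_final(path: Path) -> bool:
    return path[-1].startswith("answer")


results = guided_search(toy_propose, toy_is_final)
for path, score in results:
    print(round(score, 2), " -> ".join(path))
```

In this toy run the low-scoring "guess" step and the repeated "3 + 2 = 5" step are filtered out before ranking, and the two surviving paths are ordered by their summed step scores. A real system would replace `toy_propose` with model sampling and would likely use a more careful scoring function.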
