Language Models can be Logical Solvers

Abstract

Logical reasoning is a fundamental aspect of human intelligence and a key component of tasks such as problem-solving and decision-making. Recent advances have enabled Large Language Models (LLMs) to potentially exhibit reasoning capabilities, but complex logical reasoning remains a challenge. State-of-the-art solver-augmented language models use LLMs to first parse natural-language logical questions into symbolic representations, and then pass those representations to external logical solvers, which output the answers. Despite their impressive performance, any parsing error causes the external logical solver to fail to execute, leaving the logical question unanswered. In this paper, we introduce LoGiPT, a novel language model that directly emulates the reasoning processes of logical solvers and bypasses parsing errors by learning to adhere strictly to solver syntax and grammar. LoGiPT is fine-tuned on a newly constructed instruction-tuning dataset derived from revealing and refining the invisible reasoning process of deductive solvers. Experimental results on two public deductive reasoning datasets demonstrate that LoGiPT outperforms state-of-the-art solver-augmented LMs and few-shot prompting methods on competitive LLMs such as ChatGPT and GPT-4.
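The failure mode of solver-augmented pipelines described in the abstract can be illustrated with a toy example. The following is a minimal sketch, not the solver or pipeline used in the paper. It assumes a hypothetical propositional rule syntax `a, b -> c`, a strict parser (`parse_rule`) that raises an error on any malformed rule, and a forward-chaining loop (`forward_chain`) that records every derivation step. The recorded trace is only loosely analogous to the "invisible reasoning process" of a deductive solver that LoGiPT is trained to reproduce. All function names and the rule format are assumptions made for illustration.

```python
def parse_rule(line):
    """Parse a hypothetical rule 'a, b -> c' into (frozenset of premises, conclusion).

    Atoms are plain propositional symbols. Any malformed line raises
    ValueError, mimicking a solver that refuses to run on a parsing error.
    """
    if line.count("->") != 1:
        raise ValueError(f"cannot parse rule: {line!r}")
    body, head = (part.strip() for part in line.split("->"))
    premises = frozenset(p.strip() for p in body.split(",") if p.strip())
    if not premises or not head:
        raise ValueError(f"cannot parse rule: {line!r}")
    return premises, head


def forward_chain(facts, rules):
    """Apply rules until no new fact is derived; return (known facts, trace)."""
    known = set(facts)
    trace = []  # human-readable record of each derivation step
    changed = True
    while changed:
        changed = False
        for premises, head in rules:
            if premises <= known and head not in known:
                known.add(head)
                trace.append(f"{' & '.join(sorted(premises))} => {head}")
                changed = True
    return known, trace


def solve(facts, rule_lines, query):
    """Parse every rule first (one bad rule aborts everything), then deduce."""
    rules = [parse_rule(line) for line in rule_lines]
    known, trace = forward_chain(facts, rules)
    return query in known, trace
```

Calling `solve({"cat"}, ["cat -> mammal", "mammal -> animal"], "animal")` returns the answer together with the step-by-step trace, whereas a single malformed rule such as `"cat mammal"` makes `solve` raise `ValueError` before any reasoning happens. This is the situation in which a solver-augmented pipeline produces no answer at all.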
