Deductive Verification of Chain-of-Thought Reasoning

Abstract

Large Language Models (LLMs) benefit significantly from Chain-of-Thought (CoT) prompting across a variety of reasoning tasks. While CoT lets models produce more comprehensive reasoning processes, its emphasis on intermediate reasoning steps can inadvertently introduce hallucinations and accumulated errors, limiting models' ability to solve complex reasoning tasks. Inspired by how humans engage in careful and meticulous deductive logical reasoning to solve tasks, we seek to enable language models to perform explicit and rigorous deductive reasoning, and to ensure the trustworthiness of their reasoning process through self-verification. However, directly verifying the validity of an entire deductive reasoning process is challenging, even with advanced models like ChatGPT. In light of this, we propose to decompose the reasoning verification process into a series of step-by-step subprocesses, each receiving only its necessary context and premises. To facilitate this procedure, we propose Natural Program, a natural language-based deductive reasoning format. Our approach enables models to generate precise reasoning steps in which subsequent steps are more rigorously grounded in prior steps. It also empowers language models to carry out reasoning self-verification in a step-by-step manner. By integrating this verification process into each deductive reasoning stage, we significantly enhance the rigor and trustworthiness of the generated reasoning steps. In the process, we also improve answer correctness on complex reasoning tasks. Code will be released at https://github.com/lz1oceani/verify_cot.
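The step-by-step decomposition described in the abstract can be illustrated with a minimal sketch. It is not the released implementation and does not reproduce the actual Natural Program format: the `Step` record, the `verify_chain` and `chain_is_valid` helpers, and the `verify_step` callback are all hypothetical names introduced here. The sketch assumes only that each reasoning step lists the statements it depends on, so the verifier sees those statements and nothing else.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Step:
    """One reasoning step plus the ids of the statements it relies on (hypothetical layout)."""
    id: int
    text: str
    premise_ids: List[int]


def verify_chain(
    premises: Dict[int, str],
    steps: List[Step],
    verify_step: Callable[[List[str], str], bool],
) -> List[bool]:
    """Check each step in isolation, passing only the statements it cites."""
    known = dict(premises)  # statements available so far, keyed by id
    results: List[bool] = []
    for step in steps:
        if all(i in known for i in step.premise_ids):
            # Minimal context: only the cited premises or earlier steps.
            context = [known[i] for i in step.premise_ids]
            results.append(verify_step(context, step.text))
        else:
            # A step that cites an unknown statement cannot be grounded.
            results.append(False)
        known[step.id] = step.text
    return results


def chain_is_valid(results: List[bool]) -> bool:
    """Accept the chain only if every individual step passed."""
    return all(results)
```

In practice `verify_step` would presumably wrap a language-model call that judges whether the step follows from the supplied context; here it is left as a plain callback so the control flow can be exercised with any stand-in checker.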
