CRITIC: Large Language Models Can Self-Correct with Tool-Interactive Critiquing

Abstract

Recent developments in large language models (LLMs) have been impressive. However, these models sometimes show inconsistencies and problematic behavior, such as hallucinating facts, generating flawed code, or creating offensive and toxic content. Unlike these models, humans typically utilize external tools to cross-check and refine their initial content, like using a search engine for fact-checking or a code interpreter for debugging. Inspired by this observation, we introduce a framework called CRITIC that allows LLMs, which are essentially "black boxes," to validate and progressively amend their own outputs in a manner similar to human interaction with tools. More specifically, starting with an initial output, CRITIC interacts with appropriate tools to evaluate certain aspects of the text, and then revises the output based on the feedback obtained during this validation process. Comprehensive evaluations involving free-form question answering, mathematical program synthesis, and toxicity reduction demonstrate that CRITIC consistently enhances the performance of LLMs. Meanwhile, our research highlights the crucial importance of external feedback in promoting the ongoing self-improvement of LLMs.
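The verify-then-correct cycle described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the names `critic_loop`, `Critique`, `calculator_tool`, and the `toy_*` functions are hypothetical, the "LLM" is replaced by fixed stand-in functions, and the "external tool" is a trivial multiplication checker. It only shows the control flow: draft an output, ask a tool to evaluate it, and revise using the tool's feedback until the check passes or a round limit is reached.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Critique:
    ok: bool        # True when the tool found nothing to fix
    feedback: str   # tool feedback passed on to the revision step


def critic_loop(
    task: str,
    generate: Callable[[str], str],
    verify: Callable[[str, str], Critique],
    revise: Callable[[str, str, str], str],
    max_rounds: int = 3,
) -> str:
    """Draft an output, check it with a tool, and revise until it passes."""
    output = generate(task)
    for _ in range(max_rounds):
        critique = verify(task, output)
        if critique.ok:
            break
        output = revise(task, output, critique.feedback)
    return output


# --- Hypothetical stand-ins so the sketch runs without a model or network ---

def toy_generate(task: str) -> str:
    # Stand-in for an LLM's initial draft; deliberately wrong.
    return "381"


def calculator_tool(expression: str) -> int:
    # Stand-in for an external tool such as a code interpreter.
    left, right = expression.split("*")
    return int(left) * int(right)


def toy_verify(task: str, output: str) -> Critique:
    # Compare the draft against what the tool computes.
    expected = calculator_tool(task)
    if output.strip() == str(expected):
        return Critique(ok=True, feedback="")
    return Critique(ok=False, feedback=f"calculator returned {expected}")


def toy_revise(task: str, output: str, feedback: str) -> str:
    # Stand-in for an LLM revision conditioned on the tool feedback.
    return feedback.split()[-1]


if __name__ == "__main__":
    print(critic_loop("17 * 23", toy_generate, toy_verify, toy_revise))
```

In a real setting, `generate` and `revise` would presumably be prompts to the same LLM, and `verify` would wrap a search engine, code interpreter, or toxicity classifier, matching the three task families the abstract mentions.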
