Chain-of-Thought Tokens are Computer Program Variables

Abstract

Chain-of-thought (CoT) prompting requires large language models (LLMs) to generate intermediate steps before reaching the final answer, and has proven effective in helping LLMs solve complex reasoning tasks. However, the inner mechanism of CoT remains largely unclear. In this paper, we empirically study the role of CoT tokens in LLMs on two compositional tasks: multi-digit multiplication and dynamic programming. While CoT is essential for solving these problems, we find that preserving only the tokens that store intermediate results achieves comparable performance. Furthermore, we observe that storing intermediate results in an alternative latent form does not affect model performance. We also randomly intervene on some values in the CoT and find that subsequent CoT tokens and the final answer change correspondingly. These findings suggest that CoT tokens may function like variables in computer programs, but with potential drawbacks such as unintended shortcuts and computational complexity limits between tokens. The code and data are available at https://github.com/solitaryzero/CoTs_are_Variables.
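As a minimal illustration of the variable analogy, multi-digit multiplication can be written so that each partial product and running sum is an explicitly stored intermediate value. This is a sketch only, not the authors' code: the function name `multiply_with_trace`, the `intervene` hook, and the trace format are assumptions made for the example. Overwriting one stored value propagates to the later values and to the final answer, which is the kind of behavior the intervention experiment looks for in CoT tokens.

```python
def multiply_with_trace(a, b, intervene=None):
    """Multiply a by b one digit of b at a time, recording intermediates.

    Assumes b is a non-negative integer. `intervene` is a hypothetical
    hook: a dict mapping a step index to a value that overwrites the
    partial product computed at that step.
    """
    intervene = intervene or {}
    trace = []       # stored intermediates, playing the role of CoT tokens
    running_sum = 0
    for step, digit_char in enumerate(reversed(str(b))):
        partial = a * int(digit_char) * 10 ** step
        # Overwrite the stored value if an intervention targets this step.
        partial = intervene.get(step, partial)
        running_sum += partial
        trace.append((partial, running_sum))
    return running_sum, trace


# Unmodified run: every later value depends on the stored partial products.
clean_answer, clean_trace = multiply_with_trace(123, 45)

# Intervened run: replace the first partial product (615) with 600.
patched_answer, patched_trace = multiply_with_trace(123, 45, intervene={0: 600})

print(clean_answer, clean_trace)
print(patched_answer, patched_trace)
```

In the intervened run the later running sum and the final answer shift by the same amount as the overwritten value, much as reassigning a variable changes everything computed from it afterwards.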
