Wait, We Don't Need to "Wait"! Removing Thinking Tokens Improves Reasoning Efficiency

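Abstract

Recent advances in large reasoning models have enabled complex, step-by-step reasoning, but these models often overthink, producing verbose and redundant outputs that hinder efficiency. In this study, we examine whether explicit self-reflection, signaled by tokens such as "Wait" and "Hmm", is necessary for advanced reasoning. We propose NoWait, a simple yet effective approach that disables explicit self-reflection by suppressing these tokens during inference. Extensive experiments on ten benchmarks across textual, visual, and video reasoning tasks show that NoWait reduces chain-of-thought trajectory length by up to 27%-51% in five R1-style model series without compromising model utility. NoWait thus offers a plug-and-play solution for efficient and utility-preserving multimodal reasoning.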
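The abstract says only that NoWait suppresses reflection tokens during inference; it does not say how. One common way to do this is to mask the scores of the unwanted tokens before each decoding step, and the following is a minimal sketch under that assumption. The keyword list, the toy vocabulary, the logit values, and all function names are hypothetical illustrations, not the authors' implementation.

```python
import math

# Hypothetical keyword list: the abstract names only "Wait" and "Hmm".
REFLECTION_KEYWORDS = ("wait", "hmm")


def banned_token_ids(vocab, keywords=REFLECTION_KEYWORDS):
    """Return the ids of vocabulary entries that are variants of a keyword."""
    banned = set()
    for token_id, text in enumerate(vocab):
        # Ignore surrounding whitespace, trailing punctuation, and case.
        normalized = text.strip().strip(",.!?").lower()
        if normalized in keywords:
            banned.add(token_id)
    return banned


def suppress(logits, banned):
    """Return a copy of logits with banned token scores set to -inf."""
    return [-math.inf if i in banned else score
            for i, score in enumerate(logits)]


def greedy_pick(logits):
    """Return the index of the highest-scoring token."""
    return max(range(len(logits)), key=logits.__getitem__)


# Toy vocabulary and next-token scores (assumed values for illustration).
vocab = ["The", " answer", " Wait", "Hmm", ",", " is", " 4"]
logits = [0.1, 0.3, 2.5, 1.9, 0.0, 0.7, 0.2]

banned = banned_token_ids(vocab)
next_id = greedy_pick(suppress(logits, banned))
print(repr(vocab[next_id]))
```

In this toy step, greedy decoding without masking would choose " Wait"; with masking, the decoder falls through to the best remaining token. Because the mask is applied only at decoding time, no model weights change, which matches the plug-and-play framing in the abstract.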

