Large Language Models as General Pattern Machines

Abstract

We observe that pre-trained large language models (LLMs) are capable of autoregressively completing complex token sequences, from arbitrary ones procedurally generated by probabilistic context-free grammars (PCFG) to richer spatial patterns found in the Abstract Reasoning Corpus (ARC), a general AI benchmark, prompted in the style of ASCII art. Surprisingly, pattern completion proficiency can be partially retained even when the sequences are expressed using tokens randomly sampled from the vocabulary. These results suggest that, without any additional training, LLMs can serve as general sequence modelers driven by in-context learning. In this work, we investigate how these zero-shot capabilities may be applied to problems in robotics: from extrapolating sequences of numbers that represent states over time in order to complete simple motions, to least-to-most prompting of reward-conditioned trajectories that can discover and represent closed-loop policies (e.g., a stabilizing controller for CartPole). While difficult to deploy today on real systems due to latency, context size limitations, and compute costs, the approach of using LLMs to drive low-level control may provide an exciting glimpse into how the patterns among words could be transferred to actions.
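
To make the idea of "extrapolating sequences of numbers that represent states over time" concrete, the following is a minimal sketch of how a state trajectory might be serialized into a text prompt and how a completion might be parsed back into a state. The comma-separated line format and the function names (`serialize_states`, `parse_next_state`, `linear_extrapolator`) are assumptions made for illustration, not the authors' implementation. `linear_extrapolator` is a hypothetical stand-in for a real LLM call, so the example runs without any model.

```python
from typing import List, Sequence


def serialize_states(states: Sequence[Sequence[int]]) -> str:
    """Render each integer state vector as one comma-separated line.

    Assumed format (illustrative only): one time step per line.
    """
    return "\n".join(",".join(str(v) for v in s) for s in states) + "\n"


def parse_next_state(completion: str, dim: int) -> List[int]:
    """Read the first line of a completion back into a state vector."""
    lines = completion.strip().splitlines()
    if not lines:
        raise ValueError("empty completion")
    values = [int(tok) for tok in lines[0].split(",")]
    if len(values) != dim:
        raise ValueError(f"expected {dim} values, got {len(values)}")
    return values


def linear_extrapolator(prompt: str) -> str:
    """Hypothetical stand-in for an LLM: repeat the last per-dimension step."""
    rows = [[int(x) for x in line.split(",")]
            for line in prompt.strip().splitlines()]
    prev, last = rows[-2], rows[-1]
    nxt = [2 * b - a for a, b in zip(prev, last)]
    return ",".join(str(v) for v in nxt) + "\n"


# Usage: three observed time steps of a 2-D state, then one predicted step.
history = [[10, 50], [12, 48], [14, 46]]
prompt = serialize_states(history)
completion = linear_extrapolator(prompt)  # swap in a real model call here
next_state = parse_next_state(completion, dim=2)
print(next_state)  # [16, 44]
```

In a real setting, the stand-in would be replaced by a call to a language model that receives `prompt` and returns text. The parsing step matters because a model may emit more than one line or malformed output.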
