Large Language Model Programs
May 9, 2023
Authors: Imanol Schlag, Sainbayar Sukhbaatar, Asli Celikyilmaz, Wen-tau Yih, Jason Weston, Jürgen Schmidhuber, Xian Li
cs.AI
Abstract
In recent years, large pre-trained language models (LLMs) have demonstrated the ability to follow instructions and perform novel tasks from a few examples. Parameterising an LLM through such in-context examples widens its capabilities at a much lower cost than finetuning. We extend this line of reasoning and present a method which further expands the capabilities of an LLM by embedding it within an algorithm or program. To demonstrate the benefits of this approach, we present an illustrative example of evidence-supported question-answering. We obtain a 6.4% improvement over the chain-of-thought baseline through a more algorithmic approach, without any finetuning. Furthermore, we highlight recent work from this perspective and discuss the advantages and disadvantages in comparison to the standard approaches.
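
To make the idea of embedding an LLM within a program concrete, here is a minimal sketch of evidence-supported question-answering decomposed into two algorithmic steps: score each candidate paragraph with one LLM call, then answer a second prompt conditioned only on the top-ranked evidence. The `llm` helper, the prompt wording, and the filter-then-answer decomposition are hypothetical stand-ins for illustration, not the paper's exact pipeline.

```python
# Sketch of an "LLM program": plain Python control flow around LLM calls,
# assuming a hypothetical llm(prompt: str) -> str completion function.

def llm(prompt: str) -> str:
    """Placeholder for a call to any instruction-following LLM API."""
    raise NotImplementedError

def score_paragraph(question: str, paragraph: str) -> float:
    """Ask the model to rate how useful a paragraph is for the question."""
    prompt = (
        f"Question: {question}\n"
        f"Paragraph: {paragraph}\n"
        "On a scale from 0 to 10, how useful is this paragraph for "
        "answering the question? Reply with a single number."
    )
    reply = llm(prompt)
    try:
        return float(reply.strip().split()[0])
    except (ValueError, IndexError):
        return 0.0  # treat unparseable replies as irrelevant evidence

def answer_with_evidence(question: str, paragraphs: list[str], k: int = 3) -> str:
    """Filter evidence with one LLM call per paragraph, then answer a
    second prompt conditioned only on the k highest-scoring paragraphs."""
    ranked = sorted(paragraphs,
                    key=lambda p: score_paragraph(question, p),
                    reverse=True)
    evidence = "\n\n".join(ranked[:k])
    return llm(f"Evidence:\n{evidence}\n\nQuestion: {question}\nAnswer:")
```

The point of the decomposition is that each LLM call solves a narrow subtask with its own focused prompt, while the surrounding program, rather than the model, carries the overall control flow.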