Vector-ICL: In-context Learning with Continuous Vector Representations
October 8, 2024
Authors: Yufan Zhuang, Chandan Singh, Liyuan Liu, Jingbo Shang, Jianfeng Gao
cs.AI
Abstract
Large language models (LLMs) have shown remarkable in-context learning (ICL)
capabilities on textual data. We explore whether these capabilities can be
extended to continuous vectors from diverse domains, obtained from black-box
pretrained encoders. By aligning input data with an LLM's embedding space
through lightweight projectors, we observe that LLMs can effectively process
and learn from these projected vectors, which we term Vector-ICL. In
particular, we find that pretraining projectors with general language modeling
objectives enables Vector-ICL, while task-specific finetuning further enhances
performance. In our experiments across various tasks and modalities, including
text reconstruction, numerical function regression, text classification,
summarization, molecule captioning, time-series classification, graph
classification, and fMRI decoding, Vector-ICL often surpasses both few-shot ICL
and domain-specific models or tuning. We further conduct analyses and case
studies, indicating the potential of LLMs to process vector representations
beyond traditional token-based paradigms.
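The core mechanism the abstract describes, projecting vectors from a frozen black-box encoder into the LLM's embedding space with a lightweight projector and splicing them into the prompt as "soft tokens", can be sketched as follows. This is a minimal illustrative sketch assuming a PyTorch / Hugging Face-style setup; the names (Projector, build_prompt_embeddings), the choice of a single linear layer, and the " -> " prompt formatting are assumptions for illustration, not the authors' implementation.

```python
# Illustrative sketch (not the paper's code): a lightweight projector mapping
# encoder vectors into an LLM's token-embedding space for Vector-ICL-style prompting.
import torch
import torch.nn as nn


class Projector(nn.Module):
    """Maps d_enc-dimensional encoder vectors to the LLM's d_llm embedding size."""

    def __init__(self, d_enc: int, d_llm: int):
        super().__init__()
        self.proj = nn.Linear(d_enc, d_llm)  # lightweight: a single linear layer

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.proj(x)


def build_prompt_embeddings(llm_embed: nn.Embedding, tokenizer, projector: Projector,
                            demo_vectors, demo_labels, query_vector):
    """Interleave projected vectors with text tokens to form few-shot input embeddings.

    demo_vectors / query_vector: tensors of shape (d_enc,) from a frozen encoder.
    demo_labels: label strings for the in-context demonstrations.
    Special-token and device/dtype handling is omitted for brevity.
    """
    pieces = []
    for vec, label in zip(demo_vectors, demo_labels):
        pieces.append(projector(vec).unsqueeze(0))  # soft token for the example
        label_ids = tokenizer(f" -> {label}\n", return_tensors="pt",
                              add_special_tokens=False).input_ids[0]
        pieces.append(llm_embed(label_ids))         # text tokens for its label
    pieces.append(projector(query_vector).unsqueeze(0))  # soft token for the query
    query_ids = tokenizer(" -> ", return_tensors="pt",
                          add_special_tokens=False).input_ids[0]
    pieces.append(llm_embed(query_ids))
    # Shape (1, seq_len, d_llm); pass to the LLM via inputs_embeds instead of input_ids.
    return torch.cat(pieces, dim=0).unsqueeze(0)
```

In this sketch the projector is the only trainable component: per the abstract, it would be pretrained with a general language-modeling objective and optionally finetuned per task, while both the encoder and the LLM stay frozen.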