Vector-ICL: In-context Learning with Continuous Vector Representations

October 8, 2024
Authors: Yufan Zhuang, Chandan Singh, Liyuan Liu, Jingbo Shang, Jianfeng Gao
cs.AI

Abstract

Large language models (LLMs) have shown remarkable in-context learning (ICL) capabilities on textual data. We explore whether these capabilities can be extended to continuous vectors from diverse domains, obtained from black-box pretrained encoders. By aligning input data with an LLM's embedding space through lightweight projectors, we observe that LLMs can effectively process and learn from these projected vectors, which we term Vector-ICL. In particular, we find that pretraining projectors with general language modeling objectives enables Vector-ICL, while task-specific finetuning further enhances performance. In our experiments across various tasks and modalities, including text reconstruction, numerical function regression, text classification, summarization, molecule captioning, time-series classification, graph classification, and fMRI decoding, Vector-ICL often surpasses both few-shot ICL and domain-specific models or tuning. We further conduct analyses and case studies, indicating the potential of LLMs to process vector representations beyond traditional token-based paradigms.
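
Below is a minimal, illustrative PyTorch-style sketch of the mechanism the abstract describes: a lightweight projector maps continuous vectors from a frozen, black-box encoder into the LLM's token-embedding space, and the projected vectors are interleaved with ordinary token embeddings to build an in-context prompt. The names (`Projector`, `build_vector_icl_inputs`) and the one-soft-token-per-example layout are assumptions for illustration, not the authors' released implementation; per the abstract, such a projector would first be pretrained with a general language modeling objective and optionally finetuned per task.

```python
import torch
import torch.nn as nn

class Projector(nn.Module):
    """Lightweight projector (hypothetical sketch): maps a black-box
    encoder's output vector into the LLM's token-embedding space."""
    def __init__(self, encoder_dim: int, llm_embed_dim: int):
        super().__init__()
        self.proj = nn.Linear(encoder_dim, llm_embed_dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (encoder_dim,) or (batch, encoder_dim) continuous vector(s)
        return self.proj(x)

def build_vector_icl_inputs(llm, tokenizer, projector,
                            example_vectors, example_labels, query_vector):
    """Interleave projected vectors with normal token embeddings to form an
    in-context prompt: (vector_i, label_i) pairs, then the query vector."""
    embed = llm.get_input_embeddings()  # the LLM's token-embedding table
    pieces = []
    for vec, label in zip(example_vectors, example_labels):
        pieces.append(projector(vec).unsqueeze(0))          # one "soft token" per example
        label_ids = tokenizer(f" {label}\n", return_tensors="pt").input_ids[0]
        pieces.append(embed(label_ids))                      # label as ordinary tokens
    pieces.append(projector(query_vector).unsqueeze(0))      # query vector; answer is generated
    # Ensure dtype/device match the LLM's embeddings before concatenating in practice.
    inputs_embeds = torch.cat(pieces, dim=0).unsqueeze(0)    # (1, seq_len, llm_embed_dim)
    return inputs_embeds  # e.g. llm.generate(inputs_embeds=inputs_embeds, ...)
```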
