Item-Language Model for Conversational Recommendation

Abstract

Large language models (LLMs) have been extremely successful at tasks such as complex dialogue understanding, reasoning, and coding, owing to their emergent abilities. These abilities have been extended through multimodality to cover image, audio, and video. Recommender systems, on the other hand, have been critical for information seeking and item discovery. Recently, there have been attempts to apply LLMs to recommendation. One difficulty with current attempts is that the underlying LLM is usually not trained on recommender system data, which largely consists of user interaction signals and is often not publicly available. Another difficulty is that user interaction signals often follow a different pattern from natural language text, and it is currently unclear whether the LLM training setup can learn more non-trivial knowledge from interaction signals than traditional recommender system methods can. Finally, it is difficult to train multiple LLMs for different use cases and to retain the original language and reasoning abilities when learning from recommender system data. To address these three limitations, we propose an Item-Language Model (ILM), which is composed of an item encoder that produces text-aligned item representations encoding user interaction signals, and a frozen LLM that can understand those item representations while preserving its pretrained knowledge. We conduct extensive experiments that demonstrate the importance of both language alignment and user interaction knowledge in the item encoder.
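
The two-part design described in the abstract (an item encoder feeding a frozen LLM) can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not specify the encoder architecture, so a single linear projection stands in for it here, and the `<item>` placeholder token, the class and function names, the dimensions, and the item vector are all hypothetical choices made for illustration. The sketch only shows how an interaction-derived item vector might be mapped into the same space as the LLM's token embeddings and spliced into the input sequence, while the LLM-side table stays untouched.

```python
import random

def make_matrix(rows, cols, rng):
    """Small random matrix as a list of rows (toy initialisation)."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

class FrozenTokenEmbeddings:
    """Stand-in for the frozen LLM's token embedding table (hypothetical)."""
    def __init__(self, vocab, dim, rng):
        self.index = {tok: i for i, tok in enumerate(vocab)}
        # Tuples make the table immutable, mirroring "frozen" weights.
        self.table = tuple(tuple(row) for row in make_matrix(len(vocab), dim, rng))

    def embed(self, token):
        return list(self.table[self.index[token]])

class ItemEncoder:
    """Assumed adapter: one linear map from item space to LLM embedding space."""
    def __init__(self, item_dim, llm_dim, rng):
        # In a real setup these would be the only weights being trained.
        self.weight = make_matrix(llm_dim, item_dim, rng)

    def encode(self, item_vec):
        return [sum(w * x for w, x in zip(row, item_vec)) for row in self.weight]

def build_llm_input(tokens, item_vecs, token_emb, item_encoder, placeholder="<item>"):
    """Replace each placeholder with an encoded item; embed all other tokens."""
    items = iter(item_vecs)
    sequence = []
    for tok in tokens:
        if tok == placeholder:
            sequence.append(item_encoder.encode(next(items)))
        else:
            sequence.append(token_emb.embed(tok))
    return sequence

rng = random.Random(0)
vocab = ["recommend", "something", "like", "please"]
token_emb = FrozenTokenEmbeddings(vocab, dim=8, rng=rng)
item_encoder = ItemEncoder(item_dim=4, llm_dim=8, rng=rng)

prompt = ["recommend", "something", "like", "<item>", "please"]
item_vectors = [[0.2, -0.5, 0.1, 0.9]]  # made-up interaction-derived embedding
llm_input = build_llm_input(prompt, item_vectors, token_emb, item_encoder)
print(len(llm_input), len(llm_input[0]))  # prints "5 8"
```

Every position in `llm_input` has the same dimensionality, so a language model that accepts precomputed input embeddings could consume the sequence without any change to its own weights. Under this assumed setup, only the encoder would need retraining for a new item catalogue or use case, which is one way to read the abstract's point about avoiding multiple fine-tuned LLMs.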

