ActionPiece: Contextually Tokenizing Action Sequences for Generative Recommendation

Abstract

Generative recommendation (GR) is an emerging paradigm where user actions are tokenized into discrete token patterns and autoregressively generated as predictions. However, existing GR models tokenize each action independently, assigning the same fixed tokens to identical actions across all sequences without considering contextual relationships. This lack of context-awareness can lead to suboptimal performance, as the same action may hold different meanings depending on its surrounding context. To address this issue, we propose ActionPiece to explicitly incorporate context when tokenizing action sequences. In ActionPiece, each action is represented as a set of item features, which serve as the initial tokens. Given the action sequence corpora, we construct the vocabulary by merging feature patterns as new tokens, based on their co-occurrence frequency both within individual sets and across adjacent sets. Considering the unordered nature of feature sets, we further introduce set permutation regularization, which produces multiple segmentations of action sequences with the same semantics. Experiments on public datasets demonstrate that ActionPiece consistently outperforms existing action tokenization methods, improving NDCG@10 by 6.00% to 12.82%.
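The vocabulary construction and set permutation steps described above can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not say how co-occurrence counts are weighted or how merged tokens are applied back to the corpus, so the sketch assumes plain unweighted counts and covers only a single counting pass. The names count_pairs, permuted_flatten, and example_corpus, and the feature strings, are hypothetical.

```python
import random
from collections import Counter
from itertools import combinations, product


def count_pairs(corpus):
    """Count feature pairs within each action set and across adjacent sets.

    corpus: list of action sequences; each action is a set of feature tokens.
    Assumption: every co-occurrence counts as 1 (no weighting).
    """
    counts = Counter()
    for sequence in corpus:
        # Pairs inside one action: unordered, so sort to get a canonical key.
        for action in sequence:
            for a, b in combinations(sorted(action), 2):
                counts[("within", a, b)] += 1
        # Pairs spanning two neighbouring actions: (earlier, later).
        for left, right in zip(sequence, sequence[1:]):
            for a, b in product(sorted(left), sorted(right)):
                counts[("across", a, b)] += 1
    return counts


def permuted_flatten(sequence, rng):
    """Flatten a sequence of feature sets, shuffling the order inside each set.

    Different shuffles give different token orders for the same actions,
    which is the idea behind set permutation regularization.
    """
    tokens = []
    for action in sequence:
        features = sorted(action)  # fixed starting order for reproducibility
        rng.shuffle(features)
        tokens.extend(features)
    return tokens


# Hypothetical toy corpus: two users, two actions each.
example_corpus = [
    [{"brand:a", "cat:x"}, {"brand:b", "cat:x"}],
    [{"brand:a", "cat:x"}, {"brand:a", "cat:y"}],
]

pair_counts = count_pairs(example_corpus)
best_pair, best_count = pair_counts.most_common(1)[0]
flat_tokens = permuted_flatten(example_corpus[0], random.Random(0))
```

In a full tokenizer, the most frequent pair would be added to the vocabulary as a new token, the corpus rewritten with it, and the counting repeated until the target vocabulary size is reached. The sketch stops after one counting pass.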
