UniGame: Turning a Unified Multimodal Model Into Its Own Adversary

Abstract

Unified Multimodal Models (UMMs) have shown impressive performance in both understanding and generation with a single architecture. However, UMMs still exhibit a fundamental inconsistency: understanding favors compact embeddings, whereas generation favors reconstruction-rich representations. This structural trade-off produces misaligned decision boundaries, degraded cross-modal coherence, and heightened vulnerability under distributional and adversarial shifts. In this paper, we present UniGame, a self-adversarial post-training framework that directly targets this inconsistency. By applying a lightweight perturber at the shared token interface, UniGame enables the generation branch to actively seek out and challenge fragile understanding, turning the model itself into its own adversary. Experiments demonstrate that UniGame significantly improves consistency (+4.6%). It also achieves substantial improvements in understanding (+3.6%), generation (+0.02), and out-of-distribution and adversarial robustness (+4.8% and +6.2% on NaturalBench and AdVQA, respectively). The framework is architecture-agnostic, adds fewer than 1% additional parameters, and is complementary to existing post-training methods. These results position adversarial self-play as a general and effective principle for enhancing the coherence, stability, and unified competence of future multimodal foundation models. The official code is available at: https://github.com/AIFrontierLab/UniGame
