Macaron-A2UI: A Model for Generative UI in Personal Agents

Abstract

As personal agents evolve to handle complex, user-centric tasks, static plain-text chat is rapidly becoming a bottleneck. Generative UI emerges as the necessary new interface layer: it dynamically synthesizes the right controls, options, and state from the interaction context in real time. We present Macaron-A2UI, a Generative UI model designed for personal agents. Our goal is to move beyond text-only interaction by enabling agents to generate natural language together with lightweight, executable UI actions for information collection, preference refinement, confirmation, and multi-goal organization. We build a large-scale Generative UI corpus from heterogeneous dialogue sources, introduce A2UI-Bench for controlled evaluation, and train 30B, 235B, and 754B models with parameter-efficient LoRA-based supervised fine-tuning followed by reward-driven reinforcement learning. The best Macaron-A2UI model reaches an overall score of 75.6 on A2UI-Bench without explicit schema hints, surpassing the strongest frontier baseline that is given the full schema. We release the models, benchmark, and evaluation protocol to support future work on Generative UI for personal agents.