G-FOCUS: Towards a Robust Method for Assessing UI Design Persuasiveness
May 8, 2025
Authors: Jaehyun Jeon, Jang Han Yoon, Min Soo Kim, Sumin Shim, Yejin Choi, Hanbin Kim, Youngjae Yu
cs.AI
Abstract
Evaluating user interface (UI) design effectiveness extends beyond aesthetics
to influencing user behavior, a principle central to Design Persuasiveness. A/B
testing is the predominant method for determining which UI variations drive
higher user engagement, but it is costly and time-consuming. While recent
Vision-Language Models (VLMs) can automate UI analysis, current
approaches focus on isolated design attributes rather than comparative
persuasiveness, the key factor in optimizing user interactions. To address this,
we introduce WiserUI-Bench, a benchmark designed for the Pairwise UI Design
Persuasiveness Assessment task, featuring 300 real-world UI image pairs labeled
with A/B test results and expert rationales. Additionally, we propose G-FOCUS,
a novel inference-time reasoning strategy that enhances VLM-based
persuasiveness assessment by reducing position bias and improving evaluation
accuracy. Experimental results show that G-FOCUS surpasses existing inference
strategies in consistency and accuracy for pairwise UI evaluation. By
advancing VLM-driven evaluation of UI persuasiveness, our work offers an
approach to complement A/B testing, propelling progress in scalable UI
preference modeling and design optimization. Code and data will be released
publicly.
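
To make the notions of position bias, consistency, and accuracy in pairwise evaluation concrete, the following is a minimal sketch, not the G-FOCUS method itself. It assumes a hypothetical `judge(first, second)` callable that returns "first" or "second", and it assumes simple metric definitions (agreement across the two presentation orders, and correctness against the A/B test winner); the paper's exact definitions may differ.

```python
from typing import Callable, Dict, Sequence, Tuple

# Hypothetical judge interface: sees two UI variants in a given order and
# returns "first" or "second" for the one it considers more persuasive.
Judge = Callable[[str, str], str]

# Each pair is (ui_a, ui_b, winner), where winner is "a" or "b"
# according to the recorded A/B test result.
Pair = Tuple[str, str, str]


def evaluate_pairwise(judge: Judge, pairs: Sequence[Pair]) -> Dict[str, float]:
    """Score a judge on accuracy and order consistency (assumed definitions)."""
    if not pairs:
        raise ValueError("pairs must not be empty")

    forward_correct = 0      # correct when shown in the original order
    consistent = 0           # same choice in both presentation orders
    consistent_correct = 0   # same choice in both orders, and correct

    for ui_a, ui_b, winner in pairs:
        # Original order: "first" refers to ui_a.
        fwd = "a" if judge(ui_a, ui_b) == "first" else "b"
        # Swapped order: "first" now refers to ui_b.
        rev = "b" if judge(ui_b, ui_a) == "first" else "a"

        if fwd == winner:
            forward_correct += 1
        if fwd == rev:
            consistent += 1
            if fwd == winner:
                consistent_correct += 1

    n = len(pairs)
    return {
        "accuracy_forward": forward_correct / n,
        "consistency": consistent / n,
        "consistent_accuracy": consistent_correct / n,
    }


def always_first(first: str, second: str) -> str:
    """Toy judge with pure position bias: it ignores the content entirely."""
    return "first"
```

A judge that always picks the first image can still look half right when only one presentation order is scored, yet it never agrees with itself once the order is swapped, which is why consistency is worth reporting alongside accuracy.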