Understanding Generative AI Capabilities in Everyday Image Editing Tasks

May 22, 2025
Authors: Mohammad Reza Taesiri, Brandon Collins, Logan Bolton, Viet Dac Lai, Franck Dernoncourt, Trung Bui, Anh Totti Nguyen
cs.AI

Abstract

Generative AI (GenAI) holds significant promise for automating everyday image editing tasks, especially following the recent release of GPT-4o on March 25, 2025. However, what subjects do people most often want edited? What kinds of editing actions do they want to perform (e.g., removing or stylizing the subject)? Do people prefer precise edits with predictable outcomes or highly creative ones? By understanding the characteristics of real-world requests and the corresponding edits made by freelance photo-editing wizards, can we draw lessons for improving AI-based editors and determine which types of requests can currently be handled successfully by AI editors? In this paper, we present a unique study addressing these questions by analyzing 83k requests from the past 12 years (2013-2025) on the Reddit community, which collectively received 305k PSR-wizard edits. According to human ratings, only approximately 33% of requests can be fulfilled by the best AI editors (including GPT-4o, Gemini-2.0-Flash, and SeedEdit). Interestingly, AI editors perform worse on low-creativity requests that require precise editing than on more open-ended tasks. They often struggle to preserve the identity of people and animals, and frequently make non-requested touch-ups. On the other side of the table, VLM judges (e.g., o1) judge differently from human judges and may prefer AI edits over human edits. Code and qualitative examples are available at: https://psrdataset.github.io
