PartCraft: Crafting Creative Objects by Parts

July 5, 2024
Authors: Kam Woh Ng, Xiatian Zhu, Yi-Zhe Song, Tao Xiang
cs.AI

Abstract

This paper propels creative control in generative visual AI by allowing users to "select". Departing from traditional text- or sketch-based methods, we allow users, for the first time, to choose visual concepts by parts for their creative endeavors. The outcome is fine-grained generation that precisely captures the selected visual concepts, ensuring a holistically faithful and plausible result. To achieve this, we first parse objects into parts through unsupervised feature clustering. Then, we encode parts into text tokens and introduce an entropy-based normalized attention loss that operates on them. This loss design enables our model to learn generic prior topology knowledge about an object's part composition and to generalize further to novel part compositions, so that the generation looks holistically faithful. Lastly, we employ a bottleneck encoder to project the part tokens. This not only enhances fidelity but also accelerates learning by leveraging shared knowledge and facilitating information exchange among instances. Visual results in the paper and supplementary material showcase the compelling power of PartCraft in crafting highly customized, innovative creations, exemplified by the "charming" and creative birds. Code is released at https://github.com/kamwoh/partcraft.
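
The first step in the abstract, parsing objects into parts through unsupervised feature clustering, can be illustrated with a minimal sketch. The abstract does not say which clustering algorithm or feature extractor is used, so everything here is an assumption for illustration only: plain k-means with a farthest-point initialisation, synthetic 2-D points standing in for image patch features, and the hypothetical helper names `farthest_point_init` and `kmeans_parts`. It is not the released PartCraft code.

```python
import numpy as np


def farthest_point_init(features, num_parts):
    """Pick initial centroids deterministically: start from the first patch,
    then repeatedly take the patch farthest from all centroids chosen so far."""
    chosen = [0]
    min_dist = ((features - features[0]) ** 2).sum(axis=1)
    for _ in range(num_parts - 1):
        idx = int(min_dist.argmax())
        chosen.append(idx)
        dist_to_new = ((features - features[idx]) ** 2).sum(axis=1)
        min_dist = np.minimum(min_dist, dist_to_new)
    return features[chosen].copy()


def kmeans_parts(features, num_parts, num_iters=20):
    """Cluster per-patch features into `num_parts` groups with plain k-means.

    features: array of shape (num_patches, dim).
    Returns (labels, centroids); labels[i] is the part index of patch i.
    """
    centroids = farthest_point_init(features, num_parts)
    labels = np.zeros(len(features), dtype=int)
    for _ in range(num_iters):
        # Squared Euclidean distance from every patch to every centroid.
        diffs = features[:, None, :] - centroids[None, :, :]
        labels = (diffs ** 2).sum(axis=-1).argmin(axis=1)
        for k in range(num_parts):
            members = features[labels == k]
            if len(members) > 0:  # keep the old centroid if a cluster empties
                centroids[k] = members.mean(axis=0)
    return labels, centroids


# Toy stand-in for patch features (an assumption): three well-separated blobs,
# 50 points each, playing the role of three object parts.
rng = np.random.default_rng(0)
blob_centers = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
toy_features = np.concatenate(
    [c + 0.1 * rng.standard_normal((50, 2)) for c in blob_centers]
)
labels, centroids = kmeans_parts(toy_features, num_parts=3)
```

In a real pipeline the rows of `features` would presumably come from a pretrained image encoder, and each resulting cluster index would identify one part that is later encoded as a text token; both of those details are inferred from the abstract rather than confirmed by it.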
