
pix2gestalt: Amodal Segmentation by Synthesizing Wholes

January 25, 2024
Authors: Ege Ozguroglu, Ruoshi Liu, Dídac Surís, Dian Chen, Achal Dave, Pavel Tokmakov, Carl Vondrick
cs.AI

Abstract

We introduce pix2gestalt, a framework for zero-shot amodal segmentation, which learns to estimate the shape and appearance of whole objects that are only partially visible behind occlusions. By capitalizing on large-scale diffusion models and transferring their representations to this task, we learn a conditional diffusion model for reconstructing whole objects in challenging zero-shot cases, including examples that break natural and physical priors, such as art. As training data, we use a synthetically curated dataset containing occluded objects paired with their whole counterparts. Experiments show that our approach outperforms supervised baselines on established benchmarks. Our model can furthermore be used to significantly improve the performance of existing object recognition and 3D reconstruction methods in the presence of occlusions.
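To make the abstract's core idea concrete, below is a minimal sketch of a conditional diffusion sampler that reconstructs a whole object from an image and a mask of the object's visible region. This is not the authors' implementation: the denoiser architecture, the conditioning-by-channel-concatenation scheme, and all hyperparameters are illustrative assumptions; it only shows the general shape of "condition on (occluded RGB, visible mask), sample completed RGB."

```python
# Hypothetical sketch of conditional diffusion for amodal completion.
# Not the pix2gestalt code; network and hyperparameters are placeholders.
import torch
import torch.nn as nn

T = 1000                                   # number of diffusion steps (assumed)
betas = torch.linspace(1e-4, 0.02, T)      # linear noise schedule (assumed)
alphas = 1.0 - betas
alpha_bars = torch.cumprod(alphas, dim=0)

class TinyDenoiser(nn.Module):
    """Placeholder eps-prediction network. Input channels:
    3 (noisy whole-object RGB) + 3 (occluded RGB) + 1 (visible-region mask)."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(7, 64, 3, padding=1), nn.SiLU(),
            nn.Conv2d(64, 3, 3, padding=1),
        )

    def forward(self, x_t, occluded_rgb, visible_mask, t):
        # Condition by concatenating the noisy target with the conditioning inputs.
        return self.net(torch.cat([x_t, occluded_rgb, visible_mask], dim=1))

@torch.no_grad()
def sample_whole_object(eps_model, occluded_rgb, visible_mask):
    """DDPM-style reverse process: start from Gaussian noise and denoise step by
    step, conditioning every step on the occluded image and visible mask."""
    x_t = torch.randn_like(occluded_rgb)
    for t in reversed(range(T)):
        eps = eps_model(x_t, occluded_rgb, visible_mask, t)
        alpha, alpha_bar = alphas[t], alpha_bars[t]
        mean = (x_t - (1 - alpha) / (1 - alpha_bar).sqrt() * eps) / alpha.sqrt()
        noise = torch.randn_like(x_t) if t > 0 else torch.zeros_like(x_t)
        x_t = mean + betas[t].sqrt() * noise
    return x_t  # completed RGB of the whole object; threshold it for an amodal mask

# Usage with random tensors standing in for a real image and visible mask:
model = TinyDenoiser()
rgb = torch.randn(1, 3, 64, 64)
mask = torch.ones(1, 1, 64, 64)
whole = sample_whole_object(model, rgb, mask)
```

In the paper's setting, the amodal segmentation mask would be derived from the completed appearance of the whole object, which is why the sketch samples an RGB image rather than a mask directly.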