PlaceIt3D: Language-Guided Object Placement in Real 3D Scenes

May 8, 2025
Authors: Ahmed Abdelreheem, Filippo Aleotti, Jamie Watson, Zawar Qureshi, Abdelrahman Eldesokey, Peter Wonka, Gabriel Brostow, Sara Vicente, Guillermo Garcia-Hernando
cs.AI

Abstract

We introduce the novel task of Language-Guided Object Placement in Real 3D Scenes. Our model is given a 3D scene's point cloud, a 3D asset, and a textual prompt broadly describing where the 3D asset should be placed. The task here is to find a valid placement for the 3D asset that respects the prompt. Compared with other language-guided localization tasks in 3D scenes such as grounding, this task has specific challenges: it is ambiguous because it has multiple valid solutions, and it requires reasoning about 3D geometric relationships and free space. We inaugurate this task by proposing a new benchmark and evaluation protocol. We also introduce a new dataset for training 3D LLMs on this task, as well as the first method to serve as a non-trivial baseline. We believe that this challenging task and our new benchmark could become part of the suite of benchmarks used to evaluate and compare generalist 3D LLMs.
