CAD-MLLM: Unifying Multimodality-Conditioned CAD Generation With MLLM
November 7, 2024
Authors: Jingwei Xu, Chenyu Wang, Zibo Zhao, Wen Liu, Yi Ma, Shenghua Gao
cs.AI
Abstract
This paper aims to design a unified Computer-Aided Design (CAD) generation
system that can easily generate CAD models based on the user's inputs in the
form of textual description, images, point clouds, or even a combination of
them. Towards this goal, we introduce CAD-MLLM, the first system capable of
generating parametric CAD models conditioned on multimodal input.
Specifically, within the CAD-MLLM framework, we leverage the command sequences
of CAD models and then employ advanced large language models (LLMs) to align
the feature spaces of these diverse modalities with the vectorized
representations of CAD models. To facilitate model training, we design a
comprehensive data construction and annotation pipeline that equips each CAD
model with corresponding multimodal data. Our resulting dataset, named
Omni-CAD, is the first multimodal CAD dataset that contains a textual
description, multi-view images, points, and a command sequence for each CAD
model. It contains approximately 450K instances and their CAD construction
sequences. To thoroughly evaluate the quality of our generated CAD models, we
go beyond current evaluation metrics that focus on reconstruction quality by
introducing additional metrics that assess topology quality and surface
enclosure extent. Extensive experimental results demonstrate that CAD-MLLM
significantly outperforms existing conditional generative methods and remains
highly robust to noise and missing points. The project page and more
visualizations can be found at: https://cad-mllm.github.io/