Raster2Seq: Polygon Sequence Generation for Floorplan Reconstruction
May 11, 2026
Authors: Hao Phung, Hadar Averbuch-Elor
cs.AI
Abstract
Reconstructing a structured vector-graphics representation from a rasterized floorplan image is often an important prerequisite for computational tasks involving floorplans, such as automated understanding or CAD workflows. However, existing techniques struggle to faithfully reproduce the structure and semantics conveyed by complex floorplans that depict large indoor spaces with many rooms and varying numbers of polygon corners. To this end, we propose Raster2Seq, which frames floorplan reconstruction as a sequence-to-sequence task in which floorplan elements (such as rooms, windows, and doors) are represented as labeled polygon sequences that jointly encode geometry and semantics. Our approach introduces an autoregressive decoder that learns to predict the next corner conditioned on image features and previously generated corners, guided by learnable anchors. These anchors represent spatial coordinates in image space and thereby direct the attention mechanism toward informative image regions. Because decoding is autoregressive, the output format is flexible, which lets the method efficiently handle complex floorplans with numerous rooms and diverse polygon structures. Our method achieves state-of-the-art performance on standard benchmarks such as Structured3D, CubiCasa5K, and Raster2Graph, while also demonstrating strong generalization to more challenging datasets such as WAFFLE, which contains diverse room structures and complex geometric variations.
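To make the idea of a "labeled polygon sequence" concrete, here is a minimal sketch of how a floorplan might be flattened into one token sequence and recovered from it. The token layout (corner tokens, then a label token that closes each polygon, then an end-of-sequence token) and all names, such as `LabeledPolygon`, `serialize`, and `deserialize`, are assumptions made for illustration. The abstract does not specify the paper's actual vocabulary, coordinate quantization, or ordering, and this sketch does not model the decoder or the learnable anchors.

```python
from dataclasses import dataclass
from typing import List, Tuple

Corner = Tuple[int, int]
Token = Tuple  # ("corner", x, y), ("label", name), or ("eos",)


@dataclass
class LabeledPolygon:
    label: str             # semantic class, e.g. "bedroom" or "door"
    corners: List[Corner]  # ordered corners in image pixel coordinates


def serialize(polygons: List[LabeledPolygon]) -> List[Token]:
    """Flatten polygons into a single sequence (hypothetical layout)."""
    tokens: List[Token] = []
    for poly in polygons:
        for x, y in poly.corners:
            tokens.append(("corner", x, y))
        # The label token both closes the polygon and assigns its class.
        tokens.append(("label", poly.label))
    tokens.append(("eos",))  # marks the end of the whole floorplan
    return tokens


def deserialize(tokens: List[Token]) -> List[LabeledPolygon]:
    """Rebuild polygons from a sequence produced by serialize()."""
    polygons: List[LabeledPolygon] = []
    corners: List[Corner] = []
    for tok in tokens:
        kind = tok[0]
        if kind == "corner":
            corners.append((tok[1], tok[2]))
        elif kind == "label":
            polygons.append(LabeledPolygon(tok[1], corners))
            corners = []
        elif kind == "eos":
            break
    return polygons


plan = [
    LabeledPolygon("bedroom", [(0, 0), (10, 0), (10, 8), (0, 8)]),
    LabeledPolygon("door", [(4, 0), (6, 0)]),
]
tokens = serialize(plan)
restored = deserialize(tokens)
```

Because each polygon contributes as many corner tokens as it has corners, a sequence of this kind handles rooms with any number of corners and floorplans with any number of rooms without a fixed-size output, which is the flexibility the abstract attributes to autoregressive decoding.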