Anything in Any Scene: Photorealistic Video Object Insertion

January 30, 2024
Authors: Chen Bai, Zeman Shao, Guoxiang Zhang, Di Liang, Jie Yang, Zhuorui Zhang, Yujian Guo, Chengzhang Zhong, Yiqiao Qiu, Zhendong Wang, Yichen Guan, Xiaoyin Zheng, Tao Wang, Cheng Lu
cs.AI

Abstract

Realistic video simulation has shown significant potential across diverse applications, from virtual reality to film production. This is particularly true for scenarios where capturing videos in real-world settings is either impractical or expensive. Existing approaches in video simulation often fail to accurately model the lighting environment, represent the object geometry, or achieve high levels of photorealism. In this paper, we propose Anything in Any Scene, a novel and generic framework for realistic video simulation that seamlessly inserts any object into an existing dynamic video with a strong emphasis on physical realism. Our proposed general framework encompasses three key processes: 1) integrating a realistic object into a given scene video with proper placement to ensure geometric realism; 2) estimating the sky and environmental lighting distribution and simulating realistic shadows to enhance lighting realism; 3) employing a style transfer network that refines the final video output to maximize photorealism. We experimentally demonstrate that the Anything in Any Scene framework produces simulated videos of great geometric realism, lighting realism, and photorealism. By significantly mitigating the challenges associated with video data generation, our framework offers an efficient and cost-effective solution for acquiring high-quality videos. Furthermore, its applications extend well beyond video data augmentation, showing promising potential in virtual reality, video editing, and various other video-centric applications. Please check our project website https://anythinginanyscene.github.io for access to our project code and more high-resolution video results.