Anything in Any Scene: Photorealistic Video Object Insertion

Abstract

Realistic video simulation has shown significant potential across diverse applications, from virtual reality to film production, particularly in scenarios where capturing video in real-world settings is impractical or expensive. Existing video simulation approaches often fail to model the lighting environment accurately, to represent object geometry, or to achieve a high level of photorealism. In this paper, we propose Anything in Any Scene, a novel and generic framework for realistic video simulation that seamlessly inserts any object into an existing dynamic video with a strong emphasis on physical realism. The framework comprises three key processes: 1) integrating a realistic object into a given scene video with proper placement to ensure geometric realism; 2) estimating the sky and environmental lighting distribution and simulating realistic shadows to enhance lighting realism; 3) employing a style transfer network that refines the final video output to maximize photorealism. We experimentally demonstrate that the Anything in Any Scene framework produces simulated videos with high geometric realism, lighting realism, and photorealism. By significantly mitigating the challenges associated with video data generation, our framework offers an efficient and cost-effective solution for acquiring high-quality videos. Furthermore, its applications extend well beyond video data augmentation, showing promising potential in virtual reality, video editing, and various other video-centric applications. See the project website (https://anythinginanyscene.github.io) for the project code and more high-resolution video results.
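
As a rough illustration of how the three processes could chain together for each frame, the following is a toy Python sketch, not the authors' implementation. It assumes grayscale frames stored as nested lists, a flat 2D sprite standing in for the inserted object, a fixed shadow offset and strength standing in for the estimated lighting, and a caller-supplied function standing in for the style transfer network. All names (`Placement`, `cast_shadow`, `insert_object`, `simulate_video`) are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Frame = List[List[float]]  # grayscale image, pixel values in [0, 1]


@dataclass
class Placement:
    """Hypothetical top-left position of the object in one frame."""
    row: int
    col: int


def _in_bounds(frame: Frame, y: int, x: int) -> bool:
    return 0 <= y < len(frame) and 0 <= x < len(frame[0])


def cast_shadow(frame: Frame, alpha: Frame, p: Placement,
                offset: Tuple[int, int], strength: float) -> Frame:
    """Lighting stand-in: darken pixels under the object mask shifted by `offset`.

    `offset` and `strength` are assumed inputs here; in the framework they
    would follow from the estimated lighting.
    """
    out = [row[:] for row in frame]
    dy, dx = offset
    for i, mask_row in enumerate(alpha):
        for j, a in enumerate(mask_row):
            y, x = p.row + i + dy, p.col + j + dx
            if _in_bounds(out, y, x):
                out[y][x] *= 1.0 - strength * a
    return out


def insert_object(frame: Frame, sprite: Frame, alpha: Frame, p: Placement) -> Frame:
    """Placement stand-in: alpha-composite the sprite at position `p`."""
    out = [row[:] for row in frame]
    for i, sprite_row in enumerate(sprite):
        for j, value in enumerate(sprite_row):
            y, x = p.row + i, p.col + j
            if _in_bounds(out, y, x):
                a = alpha[i][j]
                out[y][x] = a * value + (1.0 - a) * out[y][x]
    return out


def simulate_video(frames: List[Frame], sprite: Frame, alpha: Frame,
                   placements: List[Placement],
                   shadow_offset: Tuple[int, int], shadow_strength: float,
                   refine: Callable[[Frame], Frame] = lambda f: f) -> List[Frame]:
    """Run the three stand-in stages on every frame and return new frames."""
    result = []
    for frame, p in zip(frames, placements):
        # The shadow is drawn first so the object is not darkened by its own shadow.
        shaded = cast_shadow(frame, alpha, p, shadow_offset, shadow_strength)
        composed = insert_object(shaded, sprite, alpha, p)
        # `refine` is a placeholder for a learned style transfer network.
        result.append(refine(composed))
    return result
```

The input frames are copied rather than modified, so the same source video can be reused with different objects or placements.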
