Generative Image Dynamics

Abstract

We present an approach to modeling an image-space prior on scene dynamics. Our prior is learned from a collection of motion trajectories extracted from real video sequences containing natural, oscillating motion such as trees, flowers, candles, and clothes blowing in the wind. Given a single image, our trained model uses a frequency-coordinated diffusion sampling process to predict a per-pixel long-term motion representation in the Fourier domain, which we call a neural stochastic motion texture. This representation can be converted into dense motion trajectories that span an entire video. Along with an image-based rendering module, these trajectories can be used for a number of downstream applications, such as turning still images into seamlessly looping dynamic videos, or allowing users to realistically interact with objects in real pictures.
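
As a rough illustration of the step in which the Fourier-domain representation is converted into dense motion trajectories, the following minimal sketch applies an inverse real FFT along the frequency axis at every pixel. It is not the authors' implementation: the function name `spectrum_to_trajectories`, the `(K, H, W, 2)` array layout, and the zero-padding of unmodeled high frequencies are all assumptions made for this example.

```python
import numpy as np


def spectrum_to_trajectories(spectrum: np.ndarray, num_frames: int) -> np.ndarray:
    """Convert per-pixel Fourier coefficients into time-domain displacements.

    spectrum: complex array of shape (K, H, W, 2) holding the K lowest
        frequency coefficients of the x and y displacement at each pixel.
        This layout is hypothetical, chosen only for the sketch.
    num_frames: number of video frames T to generate.

    Returns a real array of shape (T, H, W, 2): the displacement of every
    pixel, relative to the input image, at each frame.
    """
    k = spectrum.shape[0]
    n_bins = num_frames // 2 + 1  # bins a real FFT of length T produces
    if k > n_bins:
        raise ValueError("more frequency terms than num_frames can represent")

    # Zero-pad the frequencies that are not modeled (an assumption here).
    full = np.zeros((n_bins,) + spectrum.shape[1:], dtype=complex)
    full[:k] = spectrum

    # Inverse real FFT along the frequency axis yields one displacement
    # per frame for every pixel and for both motion components.
    return np.fft.irfft(full, n=num_frames, axis=0)


# Usage: a single pixel swaying horizontally at the lowest nonzero frequency.
T = 16
demo_spectrum = np.zeros((2, 1, 1, 2), dtype=complex)
demo_spectrum[1, 0, 0, 0] = 3.0 * T / 2  # amplitude of 3 pixels in x
demo_trajectories = spectrum_to_trajectories(demo_spectrum, T)
```

With this scaling, the demo pixel follows `3 * cos(2 * pi * t / T)` in x and stays fixed in y. A trajectory array of this kind is the sort of input an image-based rendering module would use to warp the still image frame by frame.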
