Video Analysis and Generation via a Semantic Progress Function

Abstract

Transformations produced by image and video generation models often evolve in a highly non-linear manner: long stretches in which the content barely changes are followed by abrupt semantic jumps. To analyze and correct this behavior, we introduce a Semantic Progress Function, a one-dimensional representation that captures how the meaning of a given sequence evolves over time. For each frame, we compute distances between semantic embeddings and fit a smooth curve that reflects the cumulative semantic shift across the sequence. Departures of this curve from a straight line reveal uneven semantic pacing. Building on this insight, we propose a semantic linearization procedure that reparameterizes (or retimes) the sequence so that semantic change unfolds at a constant rate, yielding smoother and more coherent transitions. Beyond linearization, our framework provides a model-agnostic foundation for identifying temporal irregularities, comparing semantic pacing across different generators, and steering both generated and real-world video sequences toward arbitrary target pacing.
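The pipeline described in the abstract (per-frame embedding distances, a cumulative progress curve, then retiming to a constant rate) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not specify the embedding model, the distance, or the curve-fitting method, so the cosine distance, the moving-average smoothing, and the function names `semantic_progress` and `linearize` are all assumptions made for illustration.

```python
import numpy as np


def semantic_progress(embeddings, smooth_window=1):
    """Normalized cumulative semantic change per frame, in [0, 1].

    Assumptions (not from the paper): cosine distance between consecutive
    frame embeddings, and an optional moving average as the "smooth fit".
    """
    e = np.asarray(embeddings, dtype=float)
    e = e / np.linalg.norm(e, axis=1, keepdims=True)
    # Cosine distance between each frame and its predecessor.
    step = np.clip(1.0 - np.sum(e[1:] * e[:-1], axis=1), 0.0, None)
    if smooth_window > 1:
        kernel = np.ones(smooth_window) / smooth_window
        step = np.convolve(step, kernel, mode="same")
    # A tiny offset keeps the curve strictly increasing so it can be inverted.
    step = step + 1e-12
    progress = np.concatenate([[0.0], np.cumsum(step)])
    return progress / progress[-1]


def linearize(progress, num_out=None):
    """Fractional source-frame times at which progress is evenly spaced.

    Inverts the progress curve by linear interpolation: sampling the source
    sequence at the returned times makes semantic change unfold at a
    constant rate.
    """
    progress = np.asarray(progress, dtype=float)
    if num_out is None:
        num_out = len(progress)
    frame_times = np.arange(len(progress), dtype=float)
    targets = np.linspace(0.0, 1.0, num_out)
    return np.interp(targets, progress, frame_times)


if __name__ == "__main__":
    # Toy sequence: three nearly static steps followed by two large jumps.
    angles = np.array([0.0, 0.01, 0.02, 0.03, 1.0, 1.5])
    emb = np.stack([np.cos(angles), np.sin(angles)], axis=1)
    p = semantic_progress(emb)
    print("progress:", np.round(p, 3))
    print("retimed frame positions:", np.round(linearize(p), 3))
```

In this sketch, a progress curve that hugs a straight line means evenly paced semantic change, while a curve that stays flat and then jumps concentrates the retimed sample positions inside the jump. Turning the fractional frame positions into actual frames (by interpolation or regeneration) is left out, since the abstract does not say how that step is done.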
