Video Coding Using Advanced Motion Models

3D DCT of a video sequence with global horizontal motion

Team: N. Bozinovic, J. Konrad
Collaborators: M. Barlaud, A. Thomas, University of Nice, France
Funding: National Science Foundation (CISE-CCR-SPS, International Collaboration USA-France)
Status: Completed (2001-2006)

Background: It is widely believed in the research community today that the next significant video coding gains will come from joint compression of multiple video frames, such as that offered by 3D wavelet coding. To a degree, this is already exploited in motion-compensated hybrid coders, such as those standardized by MPEG and ISO, by means of B-frames and multi-frame prediction. Can further coding gains be obtained by more advanced methods that exploit the continuity of motion in time?

R-D performance of 3D DCT-based compression versus MPEG-2 and MPEG-4

Summary: Global, constant-velocity, translational motion in an image sequence induces a characteristic energy footprint in the Fourier-transform (FT) domain: the spectrum is confined to a plane whose orientation is defined by the direction of motion. Methods have been proposed that estimate such global motion by detecting these spectral occupancy planes. We showed that global, constant-velocity, translational motion in an image sequence also induces spectral occupancy planes in the DCT domain, similarly to the FT domain. Unlike in the FT case, however, these planes are subject to spectral folding. Based on this analysis, we proposed a motion estimation method in the DCT domain and showed that it can produce results comparable to standard block matching. Moreover, by realizing that significant energy in the DCT domain concentrates around a folded plane, we proposed a new approach to video compression. The approach is based on a 3D DCT applied to a group of frames, followed by motion-adaptive scanning of the DCT coefficients (akin to “zig-zag” scanning in MPEG coders), their adaptive quantization, and final entropy coding; it is competitive with MPEG standards. We also developed a discrete wavelet transform (DWT)-based method that uses lifting along multi-frame motion trajectories.
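The FT-domain property described in the Summary can be checked numerically on a toy example. The following is a minimal sketch, not the project's implementation: it assumes a small random image that is circularly shifted by an integer number of pixels per frame, and it uses a naive DFT in place of a continuous Fourier transform. Under these assumptions, all spectral energy should fall on the bins satisfying (u*vx + v*vy + w) mod n = 0, where (u, v, w) are the horizontal, vertical, and temporal frequency indices. All function names (dft, translating_sequence, dft3, energy_on_plane) are hypothetical and introduced only for illustration. The DCT case, with its spectral folding, is not covered here.

```python
import cmath
import random


def dft(values):
    """Naive 1D DFT of a list of real or complex numbers."""
    n = len(values)
    return [
        sum(values[k] * cmath.exp(-2j * cmath.pi * f * k / n) for k in range(n))
        for f in range(n)
    ]


def translating_sequence(n, vx, vy, seed=0):
    """Return n frames (each n x n) of a random zero-centered image that is
    circularly shifted by (vx, vy) pixels per frame; indexed seq[t][y][x]."""
    rng = random.Random(seed)
    base = [[rng.uniform(-1.0, 1.0) for _ in range(n)] for _ in range(n)]
    return [
        [[base[(y - vy * t) % n][(x - vx * t) % n] for x in range(n)]
         for y in range(n)]
        for t in range(n)
    ]


def dft3(seq):
    """Separable 3D DFT of seq[t][y][x]; result is indexed spec[w][v][u]."""
    n = len(seq)
    # Transform along x (horizontal frequency u).
    a = [[dft(seq[t][y]) for y in range(n)] for t in range(n)]
    # Transform along y (vertical frequency v).
    b = [[[0j] * n for _ in range(n)] for _ in range(n)]
    for t in range(n):
        for u in range(n):
            col = dft([a[t][y][u] for y in range(n)])
            for v in range(n):
                b[t][v][u] = col[v]
    # Transform along t (temporal frequency w).
    spec = [[[0j] * n for _ in range(n)] for _ in range(n)]
    for v in range(n):
        for u in range(n):
            line = dft([b[t][v][u] for t in range(n)])
            for w in range(n):
                spec[w][v][u] = line[w]
    return spec


def energy_on_plane(spec, vx, vy):
    """Fraction of spectral energy in bins with (u*vx + v*vy + w) % n == 0."""
    n = len(spec)
    total = 0.0
    on_plane = 0.0
    for w in range(n):
        for v in range(n):
            for u in range(n):
                e = abs(spec[w][v][u]) ** 2
                total += e
                if (u * vx + v * vy + w) % n == 0:
                    on_plane += e
    return on_plane / total


if __name__ == "__main__":
    # Global horizontal motion of 1 pixel per frame on an 8 x 8 x 8 block.
    spectrum = dft3(translating_sequence(8, vx=1, vy=0))
    print("fraction on the true motion plane:", energy_on_plane(spectrum, 1, 0))
    print("fraction on the zero-motion plane:", energy_on_plane(spectrum, 0, 0))
```

Evaluating energy_on_plane over a set of candidate velocities and keeping the one with the largest fraction is one simple way to turn this observation into a global motion estimate. Real sequences with non-integer motion, non-periodic borders, or occlusions would only approximately satisfy the plane condition.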