Video Coding Using Advanced Motion Models

Figure: 3D DCT of a video sequence with global horizontal motion

Team: N. Bozinovic, J. Konrad
Collaborators: M. Barlaud, A. Thomas, University of Nice, France
Funding: National Science Foundation (CISE-CCR-SPS, International Collaboration USA-France)
Status: Completed (2001-2006)

Background: It is widely believed in the research community today that the next significant video coding gains will come from joint compression of multiple video frames, such as that offered by 3D wavelet coding. To a degree, this is already exploited in motion-compensated hybrid coders, such as those standardized by MPEG and ISO, by means of B-frames and multi-frame prediction. Can further coding gains be obtained by more advanced methods that exploit the continuity of motion in time?

Figure: R-D performance of 3D DCT-based compression versus MPEG-2 and MPEG-4

Summary: Global, constant-velocity, translational motion in an image sequence induces a characteristic energy footprint in the Fourier-transform (FT) domain: the spectrum is confined to a plane whose orientation is determined by the velocity (direction and speed) of the motion. Methods have been proposed that estimate such global motion by detecting these spectral occupancy planes. We showed that the same type of motion also induces spectral occupancy planes in the DCT domain. Unlike in the FT case, however, these planes are subject to spectral folding. Based on this analysis, we proposed a motion estimation method in the DCT domain and showed that it produces results comparable to standard block matching. Moreover, observing that significant energy in the DCT domain concentrates around a folded plane, we proposed a new approach to video compression. The approach applies a 3D DCT to a group of frames, followed by motion-adaptive scanning of the DCT coefficients (akin to “zig-zag” scanning in MPEG coders), adaptive quantization, and entropy coding; it is competitive with MPEG standards. We also developed a discrete wavelet transform (DWT) based method that uses lifting along multi-frame motion trajectories.
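The FT-domain property described in the summary can be checked numerically on a toy sequence. The sketch below is only an illustration under idealized assumptions, not the method from the publications: frames are small (N = 8), motion is an integer number of pixels per frame, and shifts are cyclic so that the 3D DFT is exact. Under those assumptions all energy should fall on the plane kt + vx*kx + vy*ky = 0 (mod N). All function names are hypothetical, and the spectral folding that occurs in the DCT domain is not modeled here.

```python
import cmath
import random

N = 8  # assumed toy size: N frames of N x N pixels


def random_frame(n, rng):
    """Zero-mean random n x n frame (stand-in for image content)."""
    return [[rng.random() - 0.5 for _ in range(n)] for _ in range(n)]


def translating_sequence(n, vx, vy, seed=0):
    """n frames of one random frame shifted cyclically by (vx, vy) per frame."""
    base = random_frame(n, random.Random(seed))
    # seq[t][y][x] = base[(y - vy*t) mod n][(x - vx*t) mod n]
    return [[[base[(y - vy * t) % n][(x - vx * t) % n] for x in range(n)]
             for y in range(n)] for t in range(n)]


def noise_sequence(n, seed=1):
    """n unrelated random frames: no coherent motion, for contrast."""
    rng = random.Random(seed)
    return [random_frame(n, rng) for _ in range(n)]


def dft3_energy(seq):
    """Squared magnitude of the 3D DFT, keyed by (kt, ky, kx)."""
    n = len(seq)
    w = [cmath.exp(-2j * cmath.pi * k / n) for k in range(n)]  # twiddle factors
    energy = {}
    for kt in range(n):
        for ky in range(n):
            for kx in range(n):
                acc = 0j
                for t in range(n):
                    for y in range(n):
                        phase = kt * t + ky * y
                        for x in range(n):
                            acc += seq[t][y][x] * w[(phase + kx * x) % n]
                energy[(kt, ky, kx)] = abs(acc) ** 2
    return energy


def plane_energy_fraction(seq, vx, vy):
    """Share of spectral energy on the plane kt + vx*kx + vy*ky = 0 (mod n)."""
    n = len(seq)
    energy = dft3_energy(seq)
    on_plane = sum(e for (kt, ky, kx), e in energy.items()
                   if (kt + vx * kx + vy * ky) % n == 0)
    return on_plane / sum(energy.values())


moving = plane_energy_fraction(translating_sequence(N, vx=1, vy=0), vx=1, vy=0)
noise = plane_energy_fraction(noise_sequence(N), vx=1, vy=0)
print(f"translating sequence: {moving:.6f} of energy on the plane")
print(f"unrelated frames:     {noise:.6f} of energy on the plane")
```

For the translating sequence the fraction should equal 1 up to rounding error, while for unrelated frames it should be much smaller, since the plane holds only N^2 of the N^3 coefficients. Real video departs from this ideal case (non-cyclic borders, sub-pixel and non-global motion), so the energy would be expected to cluster around the plane rather than lie exactly on it.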

Publications:

N. Božinović and J. Konrad, “Scan order and quantization for 3D-DCT coding,” in Proc. SPIE Visual Communications and Image Process., vol. 5150, pp. 1204-1215, July 2003.
N. Božinović and J. Konrad, “Mesh-based motion models for wavelet video coding,” in Proc. IEEE Int. Conf. Acoustics Speech Signal Processing, vol. III, pp. 141-144, May 2004.
N. Božinović, J. Konrad, T. André, M. Antonini, and M. Barlaud, “Motion-compensated lifted wavelet video coding: toward optimal motion/transform configuration,” in Signal Process. XII: Theories and Applications (Proc. Twelfth European Signal Process. Conf.), pp. 1975-1978, Sept. 2004.
J. Konrad and N. Božinović, “Importance of motion in motion-compensated temporal discrete wavelet transforms,” in Proc. SPIE Image and Video Communications and Process., vol. 5685, pp. 354-365, Jan. 2005.
N. Božinović, J. Konrad, W. Zhao, and C. Vázquez, “On the importance of motion invertibility in MCTF/DWT video coding,” in Proc. IEEE Int. Conf. Acoustics Speech Signal Processing, vol. II, pp. 49-52, Mar. 2005.
N. Božinović and J. Konrad, “Motion analysis in 3D DCT domain and its application to video coding,” Signal Process., Image Commun., vol. 20, pp. 510-528, July 2005 (2004-2005 EURASIP Image Communication Best Paper Award).
N. Božinović and J. Konrad, “Modeling motion for spatial scalability,” in Proc. IEEE Int. Conf. Acoustics Speech Signal Processing, May 2006.