Efficient Prediction Structures for Multiview Video Coding

Abstract
An experimental analysis of multiview video coding (MVC) for various temporal and inter-view prediction structures is presented. The compression method is based on the multiple reference picture technique in the H.264/AVC video coding standard. The idea is to exploit the statistical dependencies from both temporal and inter-view reference pictures for motion-compensated prediction. The effectiveness of this approach is demonstrated by an experimental analysis of temporal versus inter-view prediction in terms of the Lagrange cost function. The results show that prediction with temporal reference pictures is highly efficient, but for 20% of a picture's blocks on average prediction with reference pictures from adjacent views is more efficient. Hierarchical B pictures are used as basic structure for temporal prediction. Their advantages are combined with inter-view prediction for different temporal hierarchy levels, starting from simulcast coding with no inter-view prediction up to full level inter-view prediction. When using inter-view prediction at key picture temporal levels, average gains of 1.4-dB peak signal-to-noise ratio (PSNR) are reported, while additionally using inter-view prediction at nonkey picture temporal levels, average gains of 1.6-dB PSNR are reported. For some cases, gains of more than 3 dB, corresponding to bit-rate savings of up to 50%, are obtained.

This publication has 21 references indexed in Scilit: