Efficient Integer DCT Architectures for HEVC

Abstract
In this paper, we present area- and power-efficient architectures for implementing the integer discrete cosine transform (DCT) of different lengths for use in High Efficiency Video Coding (HEVC). We show that an efficient constant matrix-multiplication scheme can be used to derive parallel architectures for the 1-D integer DCT of different lengths. We also show that the proposed structure can be reused for DCTs of lengths 4, 8, 16, and 32 with a throughput of 32 DCT coefficients per cycle, irrespective of the transform size. Moreover, the proposed architecture can be pruned to reduce the implementation complexity substantially with only a marginal effect on coding performance. We propose power-efficient structures for folded and full-parallel implementations of the 2-D DCT. Synthesis results show that, averaged over integer DCTs of lengths 4, 8, 16, and 32, the proposed architecture has nearly 14% lower area-delay product (ADP) and 19% lower energy per sample (EPS) than a direct implementation of the reference algorithm. The proposed pruning algorithm yields an additional 19% saving in ADP and 20% saving in EPS at nearly the same throughput rate. The proposed architecture is found to support ultrahigh-definition (UHD) 7680 × 4320 video at 60 frames/s, which is one of the applications of HEVC.
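
To make the idea of a constant matrix-multiplication scheme concrete, the following is a minimal Python sketch of the 4-point case only. It is not the architecture proposed in the paper. It uses the 4-point core transform matrix from the HEVC standard and compares a direct matrix product with an even-odd (butterfly) form in which the constant multiplications by 83 and 36 are replaced by shifts and adds. The function names (`dct4_direct`, `dct4_butterfly`, `mul83`, `mul36`, `min_clock_hz`) are hypothetical, and the normalization shifts and rounding that follow each transform stage in a real codec are omitted. The last function is a rough estimate of the clock rate needed for one 1-D pass over UHD video, assuming 4:2:0 chroma sampling (1.5 samples per pixel); it ignores the second pass of a folded 2-D structure.

```python
# HEVC 4-point core transform matrix (integer approximation of the DCT-II).
C4 = [
    [64, 64, 64, 64],
    [83, 36, -36, -83],
    [64, -64, -64, 64],
    [36, -83, 83, -36],
]


def dct4_direct(x):
    """Reference form: 16 multiplications, one per matrix entry."""
    return [sum(c * v for c, v in zip(row, x)) for row in C4]


def mul83(v):
    """Multiply by the constant 83 = 64 + 16 + 2 + 1 using shifts and adds."""
    return (v << 6) + (v << 4) + (v << 1) + v


def mul36(v):
    """Multiply by the constant 36 = 32 + 4 using shifts and adds."""
    return (v << 5) + (v << 2)


def dct4_butterfly(x):
    """Even-odd decomposition: an input butterfly, then shift-add constants."""
    a0, a1 = x[0] + x[3], x[1] + x[2]   # sums feed the even outputs
    b0, b1 = x[0] - x[3], x[1] - x[2]   # differences feed the odd outputs
    y0 = (a0 + a1) << 6                 # 64 * (a0 + a1)
    y2 = (a0 - a1) << 6                 # 64 * (a0 - a1)
    y1 = mul83(b0) + mul36(b1)
    y3 = mul36(b0) - mul83(b1)
    return [y0, y1, y2, y3]


def min_clock_hz(width, height, fps, coeffs_per_cycle, samples_per_pixel=1.5):
    """Rough clock estimate for one 1-D pass (assumes 4:2:0 by default)."""
    return width * height * fps * samples_per_pixel / coeffs_per_cycle


if __name__ == "__main__":
    x = [1, 2, 3, 4]
    print(dct4_direct(x))
    print(dct4_butterfly(x))
    print(min_clock_hz(7680, 4320, 60, 32) / 1e6, "MHz")
```

Both forms produce identical outputs for any integer input; the butterfly form needs no general-purpose multiplier, which is the kind of saving a constant matrix-multiplication design exploits in hardware.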
