Non-generic floating-point software support for embedded media processing

Abstract

This paper presents work in progress on the design and implementation of efficient floating-point software support for embedded integer processors. We provide quantitative evidence of the benefits of supporting various non-generic (that is, fused, specialized, or paired) operations in addition to the five basic arithmetic operations: for individual calls, speedups range from 1.12x to 4.86x, while on DSP kernels and benchmarks our approach is up to 1.59x faster.
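To make the idea of a specialized operation concrete, here is a minimal sketch of one possible candidate: doubling a binary32 value using integer instructions only, instead of calling a generic software multiplication. The function name `sf_mul2`, the helpers `f2u` and `u2f`, and the choice of this particular operation are assumptions made for illustration; the abstract does not say which operations the paper implements. The sketch also assumes round-to-nearest and returns NaN inputs unchanged.

```c
#include <stdint.h>
#include <string.h>

/* Reinterpret a float as its binary32 bit pattern and back (illustrative helpers). */
static uint32_t f2u(float x) { uint32_t u; memcpy(&u, &x, sizeof u); return u; }
static float u2f(uint32_t u) { float x; memcpy(&x, &u, sizeof x); return x; }

/* Hypothetical specialized operation: 2*x on the binary32 encoding,
 * using only integer operations. Assumes round-to-nearest. */
static uint32_t sf_mul2(uint32_t x)
{
    uint32_t sign = x & 0x80000000u;
    uint32_t exp  = (x >> 23) & 0xFFu;
    uint32_t frac = x & 0x007FFFFFu;

    if (exp == 0xFFu)              /* infinity or NaN: returned unchanged */
        return x;
    if (exp == 0u)                 /* zero or subnormal: shift the fraction;   */
        return sign | (frac << 1); /* a carry into bit 23 yields exponent 1    */
    if (exp == 0xFEu)              /* largest exponent: doubling overflows     */
        return sign | 0x7F800000u; /* to infinity under round-to-nearest       */
    return x + 0x00800000u;        /* normal case: increment the exponent      */
}
```

For example, `u2f(sf_mul2(f2u(1.5f)))` evaluates to `3.0f`. No rounding step is needed because doubling is exact whenever it does not overflow, which is the kind of saving a generic multiplication routine cannot exploit.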
