A compiler framework for extracting superword level parallelism
- 11 June 2012
- conference paper
- Published by Association for Computing Machinery (ACM)
- Vol. 47 (6), 347-358
- https://doi.org/10.1145/2254064.2254106
Abstract
SIMD (single-instruction multiple-data) instruction set extensions are quite common today in both high performance and embedded microprocessors, and enable the exploitation of a specific type of data parallelism called SLP (Superword Level Parallelism). While prior research shows that significant performance savings are possible when SLP is exploited, placing SIMD instructions in an application code manually can be very difficult and error prone. In this paper, we propose a novel automated compiler framework for improving superword level parallelism exploitation. The key part of our framework consists of two stages: superword statement generation and data layout optimization. The first stage is our main contribution and has two phases, statement grouping and statement scheduling, of which the primary goals are to increase SIMD parallelism and, more importantly, capture more superword reuses among the superword statements through global data access and reuse pattern analysis. Further, as a complementary optimization, our data layout optimization organizes data in memory space such that the price of memory operations for SLP is minimized. The results from our compiler implementation and tests on two systems indicate performance improvements as high as 15.2% over a state-of-the-art SLP optimization algorithm.
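To make the idea of statement grouping concrete, here is a minimal sketch in Python. It is not the algorithm from the paper: it is a toy greedy packer, written under the assumption that two statements may share a superword statement when they use the same operator with the same number of operands and have no data dependence on each other. The names `Stmt`, `independent`, `isomorphic`, `group_statements`, and the default width of 4 lanes are all hypothetical, and the sketch ignores superword reuse, scheduling, memory adjacency, and data layout, which the abstract names as the framework's actual focus.

```python
from typing import List, NamedTuple, Tuple


class Stmt(NamedTuple):
    """A scalar statement `dest = srcs[0] op srcs[1] ...` (hypothetical model)."""
    dest: str
    op: str
    srcs: Tuple[str, ...]


def independent(s: Stmt, t: Stmt) -> bool:
    """True if s and t have no read-after-write, write-after-read,
    or write-after-write dependence on each other."""
    return s.dest != t.dest and s.dest not in t.srcs and t.dest not in s.srcs


def isomorphic(s: Stmt, t: Stmt) -> bool:
    """Toy isomorphism test: same operator and same operand count."""
    return s.op == t.op and len(s.srcs) == len(t.srcs)


def group_statements(stmts: List[Stmt], width: int = 4) -> List[List[Stmt]]:
    """Greedily pack statements into groups of at most `width` lanes.

    Groups execute in list order. A statement may only be hoisted into an
    earlier group if it is independent of every statement it moves past.
    """
    groups: List[List[Stmt]] = []
    for s in stmts:
        placed = False
        for g in reversed(groups):
            if not all(independent(s, t) for t in g):
                break  # s depends on this group, so it cannot move above it
            if len(g) < width and isomorphic(s, g[0]):
                g.append(s)
                placed = True
                break
        if not placed:
            groups.append([s])
    return groups


# Illustrative input: four array additions interleaved with dependent scalars.
example = [
    Stmt("a[0]", "+", ("b[0]", "c[0]")),
    Stmt("a[1]", "+", ("b[1]", "c[1]")),
    Stmt("d", "*", ("a[0]", "a[1]")),
    Stmt("a[2]", "+", ("b[2]", "c[2]")),
    Stmt("a[3]", "+", ("b[3]", "c[3]")),
    Stmt("e", "+", ("d", "a[2]")),
]
groups = group_statements(example)
for g in groups:
    print([s.dest for s in g])
```

In this sketch the four additions into `a[0]` through `a[3]` end up in one group, which a SIMD back end could in principle emit as a single 4-wide add, while the statements writing `d` and `e` stay scalar because each depends on an earlier result.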
This publication has 17 references indexed in Scilit:
- A SIMD optimization framework for retargetable compilers. ACM Transactions on Architecture and Code Optimization, 2009
- A practical automatic polyhedral parallelizer and locality optimizer. Published by Association for Computing Machinery (ACM), 2008
- Auto-vectorization of interleaved data for SIMD. Published by Association for Computing Machinery (ACM), 2006
- Efficient SIMD Code Generation for Runtime Alignment and Length Conversion. Published by Institute of Electrical and Electronics Engineers (IEEE), 2005
- Vectorization for SIMD architectures with alignment constraints. Published by Association for Computing Machinery (ACM), 2004
- Exploiting superword level parallelism with multimedia instruction sets. Published by Association for Computing Machinery (ACM), 2000
- AMD 3DNow! technology: architecture and implementations. IEEE Micro, 1999
- SUIF. ACM SIGPLAN Notices, 1994
- Relaxing SIMD control flow constraints using loop transformations. Published by Association for Computing Machinery (ACM), 1992
- Strip mining on SIMD architectures. Published by Association for Computing Machinery (ACM), 1991