Practical Structure Layout Optimization and Advice
- 7 April 2006
- conference paper
- Published by Institute of Electrical and Electronics Engineers (IEEE) in International Symposium on Code Generation and Optimization (CGO'06)
- p. 233-244
- https://doi.org/10.1109/cgo.2006.29
Abstract
With the delta between processor clock frequency and memory latency ever increasing and with the standard locality-improving transformations maturing, compilers increasingly seek to modify an application's data layout to improve spatial and temporal locality and to reduce cache miss and page fault penalties. In this paper we describe a practical implementation of the data layout optimizations Structure Splitting, Structure Peeling, Structure Field Reordering and Dead Field Removal, both for profile and non-profile based compilations. We demonstrate significant performance gains, but find that automatic transformations fail for a relatively high number of record types because of legality violations or profitability constraints. Additionally, we find a class of desirable transformations for which the framework cannot provide satisfying results. To address this issue we complement the automatic transformations with an advisory tool. We reuse the compiler analysis done for automatic transformation and correlate its results with performance data collected during runtime for structure fields, such as data cache misses and latencies. We then use the compiler as a performance analysis and reporting tool and provide insight into how to lay out structure types more efficiently.
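The layout transformations named in the abstract can be illustrated with a small hand-written sketch. This is not the paper's implementation: the record `node_orig`, its fields, the hot/cold classification in the comments, and the helper `sum_weights` are all hypothetical, and the transformed types show roughly what a compiler might produce under those assumptions.

```c
#include <stddef.h>

/* Hypothetical record as first written: frequently used (hot) fields are
 * interleaved with a rarely used (cold) one, and small fields force padding. */
struct node_orig {
    char   tag;                /* hot */
    double weight;             /* hot */
    char   flag;               /* hot */
    char   debug_name[48];     /* cold: assumed to be touched only when logging */
    struct node_orig *next;    /* hot */
};

/* Field reordering: same fields, hot ones grouped at the front and ordered
 * by alignment so that less padding is needed. */
struct node_reordered {
    double weight;
    struct node_reordered *next;
    char   tag;
    char   flag;
    char   debug_name[48];
};

/* Structure splitting: cold fields move to a separate record that the hot
 * part reaches through a link pointer. */
struct node_cold {
    char debug_name[48];
};

struct node_split {
    double weight;
    struct node_split *next;
    struct node_cold  *cold;   /* link pointer introduced by the split */
    char   tag;
    char   flag;
};

/* Structure peeling (sketch): when all instances live in one array, the hot
 * and cold parts can become parallel arrays indexed by the same position,
 * so no link pointer is required. */
struct particle_hot  { double x; double v; };
struct particle_cold { char label[48]; };

/* A hot loop over the split layout reads only the compact hot records. */
double sum_weights(const struct node_split *head)
{
    double total = 0.0;
    for (const struct node_split *n = head; n != NULL; n = n->next)
        total += n->weight;
    return total;
}
```

Dead field removal would go one step further and drop a field such as `debug_name` entirely if analysis showed that it is never read. Exact sizes and padding depend on the target ABI, so the size reduction should be checked with `sizeof` on the platform of interest.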
This publication has 17 references indexed in Scilit:
- Static branch frequency and program profile analysis. Institute of Electrical and Electronics Engineers (IEEE), 2005
- Automatic pool allocation. Association for Computing Machinery (ACM), 2005
- SYZYGY - A Framework for Scalable Cross-Module IPO. Institute of Electrical and Electronics Engineers (IEEE), 2004
- Array regrouping and structure splitting using whole-program reference affinity. Association for Computing Machinery (ACM), 2004
- A data locality optimizing algorithm. ACM SIGPLAN Notices, 2004
- Data remapping for design space optimization of embedded memory systems. ACM Transactions on Embedded Computing Systems, 2003
- HP Caliper: a framework for performance analysis tools. IEEE Concurrency, 2000
- Automated data-member layout of heap objects to improve memory-hierarchy performance. ACM Transactions on Programming Languages and Systems, 2000
- Graph layout through the VCG tool. Lecture Notes in Computer Science, 1995
- Cache performance of garbage-collected programs. Association for Computing Machinery (ACM), 1994