Locality Exists in Graph Processing: Workload Characterization on an Ivy Bridge Server
- 1 October 2015
- conference paper
- Published by Institute of Electrical and Electronics Engineers (IEEE)
Abstract
Graph processing is an increasingly important application domain and is typically communication-bound. In this work, we analyze the performance characteristics of three high-performance graph algorithm codebases using hardware performance counters on a conventional dual-socket server. Unlike many other communication-bound workloads, graph algorithms struggle to fully utilize the platform's memory bandwidth, and so increasing memory bandwidth utilization could be just as effective as decreasing communication. Based on our observations of simultaneous low compute and bandwidth utilization, we find there is substantial room for a different processor architecture to improve performance without requiring a new memory system.
This publication has 31 references indexed in Scilit:
- Microarchitectural performance characterization of irregular GPU kernels. Published by Institute of Electrical and Electronics Engineers (IEEE), 2014
- A lightweight infrastructure for graph analytics. Published by Association for Computing Machinery (ACM), 2013
- Pannotia: Understanding irregular GPGPU graph applications. Published by Institute of Electrical and Electronics Engineers (IEEE), 2013
- Ligra. Published by Association for Computing Machinery (ACM), 2013
- Fast and Efficient Graph Traversal Algorithm for CPUs: Maximizing Single-Node Efficiency. Published by Institute of Electrical and Electronics Engineers (IEEE), 2012
- Measurement and analysis of online social networks. Published by Association for Computing Machinery (ACM), 2007
- On the Memory Access Patterns of Supercomputer Applications: Benchmark Selection and Its Implications. International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2007
- A Study on the Locality Behavior of Minimum Spanning Tree Algorithms. Lecture Notes in Computer Science, 2006
- Detection of functional modules from protein interaction networks. Proteins, 2003
- Runahead execution: an alternative to very large instruction windows for out-of-order processors. Published by Institute of Electrical and Electronics Engineers (IEEE), 2003