Locality Exists in Graph Processing: Workload Characterization on an Ivy Bridge Server

1 October 2015

conference paper
Published by Institute of Electrical and Electronics Engineers (IEEE)

p. 56-65
https://doi.org/10.1109/iiswc.2015.12

Abstract

Graph processing is an increasingly important application domain and is typically communication-bound. In this work, we analyze the performance characteristics of three high-performance graph algorithm codebases using hardware performance counters on a conventional dual-socket server. Unlike many other communication-bound workloads, graph algorithms struggle to fully utilize the platform's memory bandwidth, and so increasing memory bandwidth utilization could be just as effective as decreasing communication. Based on our observations of simultaneous low compute and bandwidth utilization, we find there is substantial room for a different processor architecture to improve performance without requiring a new memory system.

Cited by 118 articles