The Mondrian Data Engine
- 24 June 2017
- journal article
- Published by Association for Computing Machinery (ACM) in ACM SIGARCH Computer Architecture News
- Vol. 45 (2), 639-651
- https://doi.org/10.1145/3140659.3080233
Abstract
The increasing demand for extracting value out of ever-growing data poses an ongoing challenge to system designers, a task only made trickier by the end of Dennard scaling. As the performance density of traditional CPU-centric architectures stagnates, advancing compute capabilities necessitates novel architectural approaches. Near-memory processing (NMP) architectures are reemerging as promising candidates to improve computing efficiency through tight coupling of logic and memory. NMP architectures are especially fitting for data analytics, as they provide immense bandwidth to memory-resident data and dramatically reduce data movement, the main source of energy consumption. Modern data analytics operators are optimized for CPU execution and hence rely on large caches and employ random memory accesses. In the context of NMP, such random accesses result in wasteful DRAM row buffer activations that account for a significant fraction of the total memory access energy. In addition, utilizing NMP's ample bandwidth with fine-grained random accesses requires complex hardware that cannot be accommodated under NMP's tight area and power constraints. Our thesis is that efficient NMP calls for an algorithm-hardware co-design that favors algorithms with sequential accesses to enable simple hardware that accesses memory in streams. We introduce an instance of such a co-designed NMP architecture for data analytics, the Mondrian Data Engine. Compared to a CPU-centric and a baseline NMP system, the Mondrian Data Engine improves the performance of basic data analytics operators by up to 49x and 5x, and efficiency by up to 28x and 5x, respectively.
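The abstract's argument hinges on random accesses causing wasteful DRAM row buffer activations, which sequential streams avoid. The following sketch (not from the paper; row size, access granularity, and the single-open-row policy are illustrative assumptions) counts row activations for the same set of accesses issued sequentially versus in random order:

```python
# Illustrative sketch: count DRAM row-buffer activations under a
# simple open-row policy, for sequential vs. random access orders.
# ROW_BYTES and ACCESS_BYTES are assumed values, not from the paper.
import random

ROW_BYTES = 8192       # assumed DRAM row (page) size in bytes
ACCESS_BYTES = 64      # assumed cache-line-sized access granularity
N_ACCESSES = 100_000

def row_activations(addresses, row_bytes=ROW_BYTES):
    """Count activations: a new row must be activated whenever an
    access targets a different row than the currently open one."""
    open_row = None
    activations = 0
    for addr in addresses:
        row = addr // row_bytes
        if row != open_row:
            activations += 1
            open_row = row
    return activations

sequential = [i * ACCESS_BYTES for i in range(N_ACCESSES)]
shuffled = sequential[:]
random.shuffle(shuffled)

seq_acts = row_activations(sequential)   # one activation per row touched
rnd_acts = row_activations(shuffled)     # nearly one activation per access
print(seq_acts, rnd_acts)
```

Under these assumptions the sequential stream activates each row exactly once (128 accesses per activation), while the randomized order pays an activation for almost every access, which is the energy gap the co-design exploits.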
Funding Information
- CE-EuroLab-4-HPC (671610)
- DIVIDEND - ERA-Net (FNS 20CH21_155014)
- MSR Cambridge PhD Scholarship Programme (2013-029)
- nano-tera.ch (NT-YINS RTD - 20NA21_150939)