The Mondrian Data Engine
- 24 June 2017
- journal article
- Published by Association for Computing Machinery (ACM) in ACM SIGARCH Computer Architecture News
- Vol. 45 (2), 639-651
- https://doi.org/10.1145/3140659.3080233
Abstract
The increasing demand for extracting value out of ever-growing data poses an ongoing challenge to system designers, a task only made trickier by the end of Dennard scaling. As the performance density of traditional CPU-centric architectures stagnates, advancing compute capabilities necessitates novel architectural approaches. Near-memory processing (NMP) architectures are reemerging as promising candidates to improve computing efficiency through tight coupling of logic and memory. NMP architectures are especially fitting for data analytics, as they provide immense bandwidth to memory-resident data and dramatically reduce data movement, the main source of energy consumption. Modern data analytics operators are optimized for CPU execution and hence rely on large caches and employ random memory accesses. In the context of NMP, such random accesses result in wasteful DRAM row buffer activations that account for a significant fraction of the total memory access energy. In addition, utilizing NMP's ample bandwidth with fine-grained random accesses requires complex hardware that cannot be accommodated under NMP's tight area and power constraints. Our thesis is that efficient NMP calls for an algorithm-hardware co-design that favors algorithms with sequential accesses to enable simple hardware that accesses memory in streams. We introduce an instance of such a co-designed NMP architecture for data analytics, the Mondrian Data Engine. Compared to a CPU-centric and a baseline NMP system, the Mondrian Data Engine improves the performance of basic data analytics operators by up to 49x and 5x, and efficiency by up to 28x and 5x, respectively.
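The abstract's argument hinges on random accesses causing wasteful DRAM row buffer activations, which sequential streams avoid. The following sketch (not from the paper; row size, access granularity, and the single-open-row policy are illustrative assumptions) counts row activations for the same set of accesses issued sequentially versus in random order:

```python
# Illustrative sketch: count DRAM row-buffer activations under a
# simple open-row policy, for sequential vs. random access orders.
# ROW_BYTES and ACCESS_BYTES are assumed values, not from the paper.
import random

ROW_BYTES = 8192       # assumed DRAM row (page) size in bytes
ACCESS_BYTES = 64      # assumed cache-line-sized access granularity
N_ACCESSES = 100_000

def row_activations(addresses, row_bytes=ROW_BYTES):
    """Count activations: a new row must be activated whenever an
    access targets a different row than the currently open one."""
    open_row = None
    activations = 0
    for addr in addresses:
        row = addr // row_bytes
        if row != open_row:
            activations += 1
            open_row = row
    return activations

sequential = [i * ACCESS_BYTES for i in range(N_ACCESSES)]
shuffled = sequential[:]
random.shuffle(shuffled)

seq_acts = row_activations(sequential)   # one activation per row touched
rnd_acts = row_activations(shuffled)     # nearly one activation per access
print(seq_acts, rnd_acts)
```

Under these assumptions the sequential stream activates each row exactly once (128 accesses per activation), while the randomized order pays an activation for almost every access, which is the energy gap the co-design exploits.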
Funding Information
- CE-EuroLab-4-HPC (671610)
- DIVIDEND - ERA-Net (FNS 20CH21_155014)
- MSR Cambridge PhD Scholarship Programme (2013-029)
- nano-tera.ch (NT-YINS RTD - 20NA21_150939)