IMC-Sort: In-Memory Parallel Sorting Architecture using Hybrid Memory Cube
- 7 September 2020
- conference paper
- Published by Association for Computing Machinery (ACM) in Proceedings of the 2020 on Great Lakes Symposium on VLSI
Abstract
Processing-in-memory (PIM) architectures have gained significant importance as an alternative to the von Neumann architecture, alleviating the memory wall and technology scaling problems. PIM architectures have achieved significant latency and energy improvements for various emerging and widely used workloads such as deep neural networks, graph analytics, databases, and computational genomics. In this work, we propose a PIM-based accelerator architecture (IMC-Sort) for sorting. Sorting is one of the fundamental and most widely used algorithms in applications such as databases, networking, and data analytics. The IMC-Sort architecture augments the hybrid memory cube (HMC) memory system by incorporating a custom sorting network in the logic layer of each HMC vault. IMC-Sort uses an optimized folded bitonic sort and merge network to sort input sequences of arbitrary length at each vault, and an optimized address mapping mechanism to distribute the input data across HMC vaults. The sorted results of the individual vaults are merged using the same per-vault sorting networks, which communicate with other vaults through the HMC's crossbar network. Overall, IMC-Sort achieves 16.8x and 1.1x speedups and 375.5x and 13.6x energy savings compared with a widely used CPU implementation and a state-of-the-art near-memory custom sort accelerator, respectively.
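The flow described in the abstract (distribute the input across vaults, sort each vault's share with a bitonic network, then merge the per-vault results) can be illustrated with a small software sketch. It is not the IMC-Sort hardware design: the round-robin distribution, the padding with sentinel values, the use of `heapq.merge` in place of the network-based cross-vault merge, and the names `bitonic_sort`, `sort_across_vaults`, and `num_vaults` are all assumptions made for illustration.

```python
import heapq
import math


def bitonic_sort(values):
    """Sort numbers in ascending order with an iterative bitonic network.

    The input is padded to a power-of-two length with +inf sentinels
    (an assumption of this sketch, not the paper's folding scheme).
    """
    n = len(values)
    if n <= 1:
        return list(values)
    size = 1 << (n - 1).bit_length()  # next power of two >= n
    data = list(values) + [math.inf] * (size - n)

    k = 2  # size of the bitonic sequences being merged
    while k <= size:
        j = k // 2  # distance between compared elements
        while j >= 1:
            for i in range(size):
                partner = i ^ j
                if partner > i:
                    ascending = (i & k) == 0
                    # Compare-exchange: swap when the pair is out of order
                    # for the direction of this sub-sequence.
                    if (data[i] > data[partner]) == ascending:
                        data[i], data[partner] = data[partner], data[i]
            j //= 2
        k *= 2
    return data[:n]  # sentinels end up at the tail and are dropped


def sort_across_vaults(values, num_vaults=4):
    """Hypothetical model: split data over vaults, sort locally, merge."""
    # Round-robin distribution stands in for the paper's address mapping.
    chunks = [values[v::num_vaults] for v in range(num_vaults)]
    sorted_chunks = [bitonic_sort(chunk) for chunk in chunks]
    # heapq.merge is a software stand-in for the cross-vault merge.
    return list(heapq.merge(*sorted_chunks))
```

For instance, `sort_across_vaults([9, 3, 7, 1, 8, 2, 6, 0, 5, 4], num_vaults=4)` sorts four interleaved chunks independently and merges them into one ascending list. The fixed compare-exchange pattern of the bitonic network does not depend on the data, which is what makes this family of sorters a natural fit for hardware.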
Funding Information
- Semiconductor Research Corporation