Enabling cost-effective data processing with smart SSD

1 May 2013

conference paper
Published by Institute of Electrical and Electronics Engineers (IEEE)

p. 1-12
https://doi.org/10.1109/msst.2013.6558444

Abstract

This paper explores the benefits and limitations of in-storage processing on current Solid-State Disk (SSD) architectures. While disk-based in-storage processing has not been widely adopted, due to the characteristics of hard disks, modern SSDs provide high performance on concurrent random writes, and have powerful processors, memory, and multiple I/O channels to flash memory, enabling in-storage processing with almost no hardware changes. In addition, offloading I/O tasks allows a host system to fully utilize devices' internal parallelism without knowing the details of their hardware configurations. To leverage the enhanced data processing capabilities of modern SSDs, we introduce the Smart SSD model, which pairs in-device processing with a powerful host system capable of handling data-oriented tasks without modifying operating system code. By isolating the data traffic within the device, this model promises low energy consumption, high parallelism, low host memory footprint and better performance. To demonstrate these capabilities, we constructed a prototype implementing this model on a real SATA-based SSD. Our system uses an object-based protocol for low-level communication with the host, and extends the Hadoop MapReduce framework to support a Smart SSD. Our experiments show that total energy consumption is reduced by 50% due to the low-power processing inside a Smart SSD. Moreover, a system with a Smart SSD can outperform host-side processing by a factor of two or three by efficiently utilizing internal parallelism when applications have light traffic to the device DRAM under the current architecture.

Cited by 78 articles