BlueDBM

13 June 2015

conference paper
Published by Association for Computing Machinery (ACM)

p. 1-13
https://doi.org/10.1145/2749469.2750412

Abstract

Complex data queries, because of their need for random accesses, have proven to be slow unless all the data can be accommodated in DRAM. There are many domains, such as genomics, geological data, and daily Twitter feeds, where the datasets of interest are 5 TB to 20 TB. For such a dataset, one would need a cluster with 100 servers, each with 128 GB to 256 GB of DRAM, to accommodate all the data in DRAM. On the other hand, such datasets could be stored easily in the flash memory of a rack-sized cluster. Flash storage has much better random access performance than hard disks, which makes it desirable for analytics workloads. In this paper we present BlueDBM, a new system architecture which has flash-based storage with in-store processing capability and a low-latency, high-throughput inter-controller network. We show that BlueDBM outperforms a flash-based system without these features by a factor of 10 for some important applications. While the performance of a RAM-cloud system falls sharply even if only 5% to 10% of the references are to the secondary storage, this sharp performance degradation is not an issue in BlueDBM. BlueDBM presents an attractive point in the cost-performance trade-off for Big Data analytics.
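
The sizing and sensitivity claims in the abstract can be checked with simple arithmetic. The sketch below is illustrative only and is not taken from the paper: the function names are hypothetical, 1 TB is taken as 1000 GB, and the DRAM and secondary-storage latencies are placeholder assumptions chosen to show the shape of the effect, not measured values.

```python
def dram_cluster_capacity_tb(servers: int, dram_gb_per_server: float) -> float:
    """Aggregate DRAM of a cluster in TB, taking 1 TB = 1000 GB."""
    return servers * dram_gb_per_server / 1000


def average_access_time_us(miss_fraction: float, dram_us: float, storage_us: float) -> float:
    """Weighted average access time when a fraction of references
    goes to secondary storage instead of DRAM."""
    return (1 - miss_fraction) * dram_us + miss_fraction * storage_us


def slowdown_vs_all_dram(miss_fraction: float, dram_us: float, storage_us: float) -> float:
    """How many times slower the average access is than a pure DRAM access."""
    return average_access_time_us(miss_fraction, dram_us, storage_us) / dram_us


if __name__ == "__main__":
    # Cluster sizes quoted in the abstract: 100 servers, 128 GB to 256 GB each.
    print(dram_cluster_capacity_tb(100, 128))  # 12.8
    print(dram_cluster_capacity_tb(100, 256))  # 25.6

    # ASSUMED latencies (not from the paper): 0.1 us for DRAM and
    # 100 us for secondary storage.
    for miss in (0.0, 0.05, 0.10):
        print(miss, slowdown_vs_all_dram(miss, dram_us=0.1, storage_us=100.0))
```

With 100 servers, the aggregate DRAM ranges from 12.8 TB to 25.6 TB, which brackets the upper end of the 5 TB to 20 TB datasets mentioned. Under the assumed latencies, even a few percent of references to secondary storage dominates the average access time, which is the kind of sharp degradation the abstract describes for DRAM-centric systems.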

Funding Information

Quanta Computers (Agmt. Dtd. 04/01/05)
Samsung (Res. Agmt. Eff. 01/01/12)
Intel Corporation (Agmt. Eff. 07/23/12)
Lincoln Laboratory (PO7000261350)

Cited by 115 articles