SparkBench
- 6 May 2015
- conference paper
- Published by Association for Computing Machinery (ACM)
Abstract
Spark has been increasingly adopted by industry in recent years for big data analysis because it provides a fault-tolerant, scalable, and easy-to-use in-memory abstraction. Moreover, the community has been actively developing a rich ecosystem around Spark, making it even more attractive. However, the literature does not yet offer a Spark-specific benchmark to guide the development and cluster deployment of Spark so that it better fits the resource demands of user applications. In this paper, we present SparkBench, a Spark-specific benchmarking suite that includes a comprehensive set of applications. SparkBench covers four main categories of applications: machine learning, graph computation, SQL queries, and streaming applications. We also characterize the resource consumption, data flow, and timing information of each application and evaluate the performance impact of a key configuration parameter to guide the design and optimization of Spark data analytics platforms.