Fast and Lightweight Execution Time Predictions for Spark Applications
- 1 July 2019
- conference paper
- Published by Institute of Electrical and Electronics Engineers (IEEE) in 2019 IEEE 12th International Conference on Cloud Computing (CLOUD)
- p. 493-495
- https://doi.org/10.1109/cloud.2019.00088
Abstract
Users and operators of cloud-based Spark clusters often require quick insight into how the execution time of an application is likely to be affected by the resources allocated to it, e.g., the number of Spark executor cores assigned, and by the size of the data to be processed. Existing techniques typically require extensive prior executions of the application under various resource allocation settings and data sizes to obtain an accurate model. In this paper, we explore the accuracy of a model built from fewer prior executions of the application. Such a model can be useful in situations where quick predictions are required and few cluster resources are available for building a model. We use logs from two executions of an application with small sample data and different resource settings, and explore the accuracy of the predictions for other resource allocation settings and input data sizes.
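To illustrate what a two-run prediction might look like, here is a minimal sketch. It is not the authors' model, which the abstract does not describe. It assumes an Amdahl-style split of execution time into a serial part and a part that shrinks with the number of executor cores, that both sample runs use the same input size, and that execution time grows linearly with input size. The names `Run`, `fit_amdahl`, and `predict` are hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Run:
    cores: int        # executor cores allocated to the run
    seconds: float    # measured execution time, e.g. taken from the run's logs
    data_size: float  # input size in any consistent unit


def fit_amdahl(run_a: Run, run_b: Run) -> Tuple[float, float]:
    """Solve T(c) = serial + parallel / c for (serial, parallel) from two runs.

    Assumption: both runs processed the same sample data and differ only
    in the number of cores.
    """
    if run_a.cores == run_b.cores:
        raise ValueError("the two runs must use different core counts")
    if run_a.data_size != run_b.data_size:
        raise ValueError("the two runs must use the same sample data size")
    inv_a, inv_b = 1.0 / run_a.cores, 1.0 / run_b.cores
    parallel = (run_a.seconds - run_b.seconds) / (inv_a - inv_b)
    serial = run_a.seconds - parallel * inv_a
    return serial, parallel


def predict(serial: float, parallel: float, sample_size: float,
            cores: int, data_size: float) -> float:
    """Predict execution time for another core count and input size."""
    # Assumption: time scales linearly with input size (illustrative only).
    scale = data_size / sample_size
    return scale * (serial + parallel / cores)


if __name__ == "__main__":
    # Made-up measurements, used only to exercise the functions.
    a = Run(cores=1, seconds=100.0, data_size=1.0)
    b = Run(cores=4, seconds=40.0, data_size=1.0)
    serial, parallel = fit_amdahl(a, b)
    print(predict(serial, parallel, a.data_size, cores=8, data_size=10.0))
```

Two measurements are exactly enough to determine the two unknowns of this simple model, which is why it is cheap to build. The trade-off is that it cannot capture effects such as shuffle overhead or stragglers, or data sizes that no longer fit in memory.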
This publication has 4 references indexed in Scilit:
- Dynamic Configuration of Partitioning in Spark Applications. IEEE Transactions on Parallel and Distributed Systems, 2017
- A Novel Method for Tuning Configuration Parameters of Spark Based on Machine Learning. Published by Institute of Electrical and Electronics Engineers (IEEE), 2016
- Stage Aware Performance Modeling of DAG Based in Memory Analytic Platforms. Published by Institute of Electrical and Electronics Engineers (IEEE), 2016
- Validity of the single processor approach to achieving large scale computing capabilities. Published by Association for Computing Machinery (ACM), 1967