An empirical analysis of scheduling techniques for real-time cloud-based data processing
- 1 December 2011
- conference paper
- Published by Institute of Electrical and Electronics Engineers (IEEE)
Abstract
In this paper, we explore the challenges and needs of current cloud infrastructures, to better support cloud-based data-intensive applications that are not only latency-sensitive but also require strong timing guarantees. These applications have strict deadlines (e.g., to perform time-dependent mission critical tasks or to complete real-time control decisions using a human-in-the-loop), and deadline misses are undesirable. To highlight the challenges in this space, we provide a case study of the online scheduling of MapReduce jobs executed by Hadoop. Our evaluations on Amazon EC2 show that the existing Hadoop scheduler is ill-equipped to handle jobs with deadlines. However, by adapting existing multiprocessor scheduling techniques for the cloud environment, we observe significant performance improvements in minimizing missed deadlines and tardiness. Based on our case study, we discuss a range of challenges in this domain posed by virtualization and scale, and propose our research agenda centered around the application of advanced real-time scheduling techniques in the cloud environment.