Interruptible tasks

Abstract

Real-world data-parallel programs commonly suffer from severe memory pressure, especially when they process large datasets. Memory problems lead to excessive garbage collection (GC) effort and out-of-memory errors, significantly hurting system performance and scalability. This paper proposes a systematic approach that helps data-parallel tasks survive memory pressure, improving their performance and scalability without any manual tuning of system parameters. Our approach advocates the interruptible task (ITask), a new type of data-parallel task that can be interrupted under memory pressure, with part or all of its used memory reclaimed, and resumed when the pressure goes away. To support ITasks, we propose a novel programming model and a runtime system, and have instantiated them on two state-of-the-art platforms, Hadoop and Hyracks. A thorough evaluation demonstrates the effectiveness of ITask: it has helped real-world Hadoop programs survive 13 out-of-memory problems reported on StackOverflow; a second set of experiments with 5 already well-tuned programs in Hyracks on datasets of different sizes shows that the ITask-based versions are 1.5-3x faster and scale to 3-24x larger datasets than their regular counterparts.
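The interrupt-and-resume idea in the abstract can be illustrated with a small sketch. This is not the paper's programming model or runtime API: the class `WordCountTask`, the methods `run` and `release`, and the helper `run_to_completion` are hypothetical names chosen for illustration, and memory pressure is simulated by a list of booleans rather than by monitoring a real heap. The sketch assumes a task whose partial results can be merged, so interrupting it and reclaiming its memory does not change the final answer.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List


@dataclass
class WordCountTask:
    """Hypothetical interruptible task: counts words one record at a time."""
    records: List[str]
    cursor: int = 0  # index of the next unprocessed record
    counts: Dict[str, int] = field(default_factory=dict)

    def run(self, under_pressure: Callable[[], bool]) -> bool:
        """Process records until done or interrupted; True means finished."""
        while self.cursor < len(self.records):
            if under_pressure():  # safe point between two records
                return False      # interrupted; progress is kept for resume
            for word in self.records[self.cursor].split():
                self.counts[word] = self.counts.get(word, 0) + 1
            self.cursor += 1
        return True

    def release(self) -> Dict[str, int]:
        """Hand back the partial output and drop already-consumed input."""
        partial, self.counts = self.counts, {}
        self.records = self.records[self.cursor:]
        self.cursor = 0
        return partial


def merge(parts: Iterable[Dict[str, int]]) -> Dict[str, int]:
    """Combine partial word counts into one result."""
    total: Dict[str, int] = {}
    for part in parts:
        for word, count in part.items():
            total[word] = total.get(word, 0) + count
    return total


def run_to_completion(task: WordCountTask,
                      pressure_signals: Iterable[bool]) -> Dict[str, int]:
    """Drive the task, reclaiming its memory at every interrupt (simulated)."""
    signals = iter(pressure_signals)
    reclaimed = []
    while True:
        finished = task.run(lambda: next(signals, False))
        reclaimed.append(task.release())  # stand-in for spilling to disk
        if finished:
            return merge(reclaimed)


interrupted = run_to_completion(
    WordCountTask(["a b", "b c", "c a", "a"]),
    [False, True, False, False, True],  # two simulated pressure events
)
uninterrupted = run_to_completion(WordCountTask(["a b", "b c", "c a", "a"]), [])
```

In this sketch the interrupted and uninterrupted runs produce the same counts, which is the property an interruptible task needs: an interrupt changes when memory is held, not what is computed.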

Funding Information

National Science Foundation (CCF-0846195, CCF-1217854, CNS-1228995, CCF-1319786, CNS-1321179, CCF-1409829, CCF-1439091, CCF-1514189, CNS-1514256)
Office of Naval Research (N00014-14-1-0549)
Alfred P. Sloan Foundation
