When Do Redundant Requests Reduce Latency?
Open Access
- 7 December 2015
- Journal article
- Published by Institute of Electrical and Electronics Engineers (IEEE) in IEEE Transactions on Communications
- Vol. 64 (2), 715-722
- https://doi.org/10.1109/tcomm.2015.2506161
Abstract
Many systems possess the flexibility to serve requests in more than one way, such as distributed storage systems that store multiple copies of the data. In such systems, the latency of serving the requests may potentially be reduced by sending redundant requests: a request may be sent to more servers than needed and deemed served when the requisite number of servers complete service. Such a mechanism trades off the possibility of faster execution of the request with the increase in the load on the system. Several recent works empirically evaluate the latency performance of redundant requests in diverse settings. In this paper, we perform an analytical study of the latency performance of redundant requests, with the primary goals of characterizing under what scenarios sending redundant requests will help (and under what scenarios it will not), and of designing optimal redundant-requesting policies. We show that when service times are i.i.d. memoryless or “heavier,” and when the additional copies of already-completed jobs can be removed instantly, maximally scheduling redundant requests achieves the optimal average latency. On the other hand, when service times are i.i.d. “lighter” or when service times are memoryless and removal of jobs is not instantaneous, then not having any redundancy in the requests is optimal under high loads. Our results are applicable to arbitrary arrival processes.
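The intuition behind the memoryless case can be illustrated with a minimal Monte Carlo sketch (not from the paper; function names and parameters are illustrative). With i.i.d. exponential service times of rate μ, a request dispatched redundantly to r servers, where the first completion is kept and the other copies are cancelled instantly, finishes at the minimum of r exponentials, which is itself exponential with rate rμ:

```python
import random

def mean_latency(r, mu=1.0, trials=100_000, seed=0):
    """Estimate the mean completion time of a request sent redundantly
    to r servers with i.i.d. Exp(mu) service times, assuming the request
    finishes at the first completion and the remaining copies are
    removed instantly (the idealized setting in the abstract)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        total += min(rng.expovariate(mu) for _ in range(r))
    return total / trials

# min of r i.i.d. Exp(mu) variables is Exp(r*mu), so the mean drops
# from 1/mu (no redundancy) toward 1/(r*mu) with r-way redundancy.
```

Note that this sketch captures only the service-time side of the trade-off for a single request in isolation; it ignores queueing and the extra load that redundant copies place on the system, which is exactly what the paper's analysis accounts for.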
Funding Information
- Berkeley Fellowship
- Microsoft Research PhD Fellowship
- KFAS Fellowship
- AFOSR MURI (FA9550-10-1-0567)
- NSF (CCF-1116404)