Two-Phase Low-Energy N-Modular Redundancy for Hard Real-Time Multi-Core Systems
- 11 June 2015
- journal article
- research article
- Published by Institute of Electrical and Electronics Engineers (IEEE) in IEEE Transactions on Parallel and Distributed Systems
- Vol. 27 (5), 1497-1510
- https://doi.org/10.1109/tpds.2015.2444402
Abstract
This paper proposes an N-modular redundancy (NMR) technique with low energy overhead for hard real-time multi-core systems. NMR is well suited to multi-core platforms because they provide multiple processing units and low-overhead communication for voting. However, NMR can impose considerable energy overhead, and controlling that overhead is the primary concern of this paper. For this purpose, system operation is divided into two phases: an indispensable phase and an on-demand phase. In the indispensable phase, only half-plus-one copies of each task are executed. When no fault occurs during this phase, the results must be identical and hence the remaining copies are not required. Otherwise, the remaining copies must be executed in the on-demand phase to perform a complete majority vote. For such a two-phase NMR, an energy-management technique is developed that considers two new concepts: (i) block-partitioned scheduling, which enables parallel task execution during the on-demand phase and thereby leaves more slack for energy saving; and (ii) pseudo-dynamic slack, which results when a task has no faulty execution during the indispensable phase, so that the time reserved for its copies in the on-demand phase is reclaimed for energy saving. The energy-management technique has an off-line part that manages static and pseudo-dynamic slack at design time and an online part that mainly manages dynamic slack at run time. Experimental results show that the proposed NMR technique provides up to 29 percent energy saving and is six orders of magnitude more reliable than a recent previous work.
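The two-phase voting idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `two_phase_nmr`, the representation of task copies as zero-argument callables, and the use of result equality as the fault indicator are all assumptions made for illustration, and the sketch does not model scheduling, slack reclamation, or energy management.

```python
from collections import Counter
from typing import Callable, Hashable, Sequence


def two_phase_nmr(copies: Sequence[Callable[[], Hashable]]) -> Hashable:
    """Run N redundant copies of one task in two phases; return the voted result.

    Hypothetical helper: each element of `copies` stands for one redundant
    execution of the same task.
    """
    n = len(copies)
    if n == 0:
        raise ValueError("at least one task copy is required")
    majority = n // 2 + 1  # the "half-plus-one" copies

    # Indispensable phase: execute only a majority-sized subset of the copies.
    results = [run() for run in copies[:majority]]
    if len(set(results)) == 1:
        # All executed copies agree, so this value already holds a majority
        # of the N possible votes; the remaining copies are skipped.
        return results[0]

    # On-demand phase: a mismatch was observed, so execute the remaining
    # copies and take a complete majority vote over all N results.
    results += [run() for run in copies[majority:]]
    value, votes = Counter(results).most_common(1)[0]
    if votes < majority:
        raise RuntimeError("no majority among the N results")
    return value
```

With N = 5 and no mismatch, only three copies run in this sketch; the time that would have been spent on the other two corresponds to what the abstract calls pseudo-dynamic slack.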
Funding Information
- Sharif University of Technology (G930827)
- EPSRC (EP/K034448/1)