Published: 9 October 2014
IEEE Transactions on Parallel and Distributed Systems, Volume 26, pp 2849-2862; https://doi.org/10.1109/tpds.2014.2362540
The omni-kernel architecture is designed around pervasive monitoring and scheduling. Motivated by new requirements in virtualized environments, this architecture ensures that all resource consumption is measured, that resource consumption resulting from a scheduling decision is attributable to an activity, and that scheduling decisions are fine-grained. Vortex, implemented for multi-core x86-64 platforms, instantiates the omni-kernel architecture, providing a wide range of operating system functionality and abstractions. With Vortex, we experimentally demonstrated the efficacy of the omni-kernel architecture to provide accurate scheduler control over resource allocation despite competing workloads. Experiments involving Apache, MySQL, and Hadoop quantify the cost of pervasive monitoring and scheduling in Vortex to be below 6 percent of CPU consumption.
Published: 1 September 2013
Conference: 2013 IEEE 18th Conference on Emerging Technologies & Factory Automation (ETFA), 2013-9-10 - 2013-9-13, Cagliari, Italy
Fast kernel boot-time is one of the major concerns in industrial embedded systems. Application domains where boot time is relevant include, among others, automation, automotive, and avionics. Linux is one of the big players among operating system solutions for general embedded systems; hence, a relevant question is how fast Linux can boot on typical hardware platforms (ARM9) used in such industrial systems. One important constraint is that this boot-time optimization should be as non-intrusive as possible, because industrial embedded systems typically have high demands on reliability and stability. For example, adding, removing, or changing critical source code (such as kernel or initialization code) is impermissible. This paper shows the steps towards a fast-booting Linux kernel using non-intrusive methods. Moreover, targeting embedded systems with temporal constraints, the paper shows how fast the real-time scheduling framework ExSched can be loaded and started during bootup. This scheduling framework supports several real-time scheduling algorithms (user-defined, multi-core, partitioned, fixed-priority periodic tasks, etc.) and does not modify the Linux kernel source code. Hence, the non-intrusive bootup optimization methods, together with the unmodified Linux kernel and the non-patched real-time scheduler module, offer both reliability and predictability.
Published: 1 April 2013
Conference: 2013 IEEE 19th Real-Time and Embedded Technology and Applications Symposium (RTAS), 2013-4-9 - 2013-4-11
In this paper we present a new synchronization protocol called RRP (Rollback Resource Policy), which is compatible with hierarchically scheduled open systems and specialized for resources that can be aborted and rolled back. We conduct an extensive event-based simulation and compare RRP against all equivalent existing protocols in hierarchical fixed-priority preemptive scheduling: SIRAP (Subsystem Integration and Resource Allocation Policy), OPEN-HSRPnP (open-systems version of Hierarchical Stack Resource Policy, no Payback), and OPEN-HSRPwP (open-systems version of Hierarchical Stack Resource Policy, with Payback). Our simulation study shows that RRP has better average-case response times than the state-of-the-art protocol in open systems, i.e., SIRAP, and that it performs better than OPEN-HSRPnP/OPEN-HSRPwP in terms of schedulability of randomly generated systems. The simulations consider both resources that are compatible with rollback and resources incompatible with rollback (abort only), so that the resource-rollback overhead can be evaluated. We also measure CPU overhead costs (in VxWorks) related to the rollback mechanism of tasks and resources. We use the eXtremeDB (embedded real-time) database to measure the resource-rollback overhead.
Published: 1 June 2012
Conference: 2012 7th IEEE International Symposium on Industrial Embedded Systems (SIES), 2012-6-20 - 2012-6-22, Karlsruhe, Germany
In our previous work, we introduced an adaptive hierarchical scheduling framework as a solution for composing dynamic real-time systems, i.e., systems where the CPU demand of the tasks is subject to unknown and potentially drastic changes during run-time. The framework uses a PI controller which periodically adapts the system to the current load situation. The conventional PI controller, despite its simplicity and low CPU overhead, provides acceptable performance. However, increasing the pressure on the controller, e.g., with an application consisting of multiple tasks with drastically oscillating execution times, degrades the performance of the PI controller. Therefore, in this paper we modify the structure of our adaptive framework by replacing the PI controller with a fuzzy controller to achieve better performance. Furthermore, we conduct a simulation-based case study in which we compose dynamic tasks, such as video decoder tasks, with a set of static tasks into a single system, and we show that the new fuzzy controller outperforms our previous PI controller.
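The adaptation loop described in this abstract can be sketched as a discrete PI controller that periodically adjusts a subsystem's CPU budget toward the observed demand. All names and gain values below are illustrative assumptions, not the authors' implementation:

```python
# Hypothetical sketch of periodic PI-based budget adaptation: on each
# control period, the error between observed demand and the current
# budget drives a proportional plus integral correction. Gains kp/ki
# are assumed values chosen for the example only.

def make_pi_budget_controller(kp=0.5, ki=0.1, min_budget=0.0, max_budget=1.0):
    integral = 0.0

    def update(budget, demand):
        nonlocal integral
        error = demand - budget            # positive -> under-allocated
        integral += error                  # accumulate steady-state error
        new_budget = budget + kp * error + ki * integral
        return max(min_budget, min(max_budget, new_budget))

    return update

update = make_pi_budget_controller()
budget = 0.2
for demand in [0.4, 0.4, 0.4]:             # load rises; budget should follow
    budget = update(budget, demand)
```

The fuzzy controller proposed in the paper would replace the linear `kp * error + ki * integral` term with rule-based corrections; the surrounding periodic loop stays the same.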
Published: 1 July 2011
Conference: 2011 23rd Euromicro Conference on Real-Time Systems (ECRTS), 2011-7-5 - 2011-7-8, Porto, Portugal
Hierarchical scheduling has major benefits when it comes to integrating hard real-time applications. One of those benefits is that it gives a clear runtime separation of applications in the time domain. This, in turn, protects against timing-error propagation between applications. However, these benefits rely on the assumption that the scheduler itself schedules applications correctly according to the scheduling parameters and the chosen scheduling policy. A faulty scheduler can affect all applications in a negative way. Hence, being able to guarantee that the scheduler is correct is of great importance. Therefore, in this paper, we study how properties of hierarchical scheduling can be verified. We model a hierarchically scheduled system using task automata, and we conduct verification with model checking using the Times tool. Further, we generate C code from the model and execute the hierarchical scheduler in the VxWorks kernel. The CPU and memory overhead of the modelled scheduler is compared against an equivalent manually coded two-level hierarchical scheduler. We show that the worst-case memory consumption is similar and that there is a considerable difference in CPU overhead.
Published: 1 April 2011
Conference: 2011 IEEE Real-Time and Embedded Technology and Applications Symposium (RTAS), 2011-4-11 - 2011-4-14, Chicago, United States
This paper presents HiRes, a system structured around predictable, hierarchical resource management (HRM). Applications and different subsystems use customized resource managers that control the allocation and usage of memory, CPU, and I/O. This increased resource management flexibility enables subsystems with different timing constraints to specialize resource management around meeting these requirements. In HiRes, subsystems delegate the management of resources to other subsystems, thus creating the resource management hierarchy. In delegating the control of resources, the subsystem focuses on providing isolation between competing subsystems. To make HRM both predictable and efficient, HiRes ensures that regardless of a subsystem's depth in the hierarchy, the overheads of resource usage and control remain constant. In doing so, HiRes encourages HRM as a fundamental system design technique. Results show that HiRes has competitive performance with existing systems, and that HRM naturally provides both strong isolation guarantees, and flexible and efficient subsystem control over resources.
Published: 1 March 2010
Conference: 2010 Design, Automation & Test in Europe Conference & Exhibition (DATE 2010), 2010-3-8 - 2010-3-12, Dresden, Germany
Nowadays, most embedded devices need to support multiple applications running concurrently. In contrast to desktop computing, very often the set of applications is known at design time, and the designer needs to assure that critical applications meet their constraints in every possible use-case. In order to do this, all possible use-cases, i.e., subsets of applications running simultaneously, have to be verified thoroughly. One approach to reducing the verification effort is to perform composability analysis, which has been studied for sets of applications modeled as Synchronous Dataflow Graphs. In this paper we introduce a framework that supports a more general parallel programming model based on the Kahn Process Networks Model of Computation and integrates a complete MPSoC programming environment, including compiler-centric analysis, performance estimation, simulation, and mapping and scheduling of multiple applications. In our solution, composability analysis is performed on parallel traces obtained by instrumenting the application code. A case study performed on three typical embedded applications, JPEG, GSM and MPEG-2, demonstrated the applicability of our approach.
Published: 16 March 2009
Published: 1 January 2009
Conference: 2009 22nd International Conference on VLSI Design: concurrently with the 8th International Conference on Embedded Systems, 2009-1-5 - 2009-1-9, Delhi, India
Power density has been increasing at an alarming rate in recent processor generations, resulting in high on-chip temperature. Higher temperature results in poor reliability and increased leakage current. In this paper, we propose a temperature-aware scheduling technique in the context of embedded multi-tasking systems. We observe that there is high variability in the thermal properties of different embedded applications. We design a temperature-aware scheduling (TAS) scheme that exploits this variability to maintain the system temperature below a desired level while satisfying a number of requirements such as throughput, fairness, and real-time constraints. Moreover, TAS enables exploration of the tradeoffs between throughput and fairness in temperature-constrained systems. Compared against standard schedulers with reactive hardware-level thermal management, TAS provides better throughput with negligible impact on fairness.
Published: 1 December 2006
Conference: Proceedings of 2006 IEEE 27th Real-Time Systems Symposium, 2006-12-5 - 2006-12-8, Rio de Janeiro, Brazil
In this paper, we describe the design and evaluation of a scheduler (referred to as Everest) for allocating processors to services in high-performance, multi-service routers. A scheduler for such routers is required to maximize the number of packets processed within a given delay tolerance, while isolating the performance of services from each other. The design of such a scheduler is novel and challenging because of three domain-specific characteristics: (1) difficult-to-predict and high packet arrival rates, (2) small delay tolerances of packets, and (3) significant overheads for switching allocation of processors from one service to another. These characteristics require that the scheduler be agile and wary simultaneously. Whereas agility enables the scheduler to react quickly to fluctuations in packet arrival rates, wariness prevents the scheduler from wasting computational resources on unnecessary context switches. We demonstrate that by balancing agility and wariness, Everest, as compared to conventional schedulers, reduces by more than an order of magnitude the average delay and the percentage of packets that experience delays greater than their tolerance. We describe a prototype implementation of Everest on Intel's IXP2400 network processor.
Published: 21 July 2006
Conference: 2006 15th IEEE International Conference on High Performance Distributed Computing, Paris, France
ALPS is a per-application user-level proportional-share scheduler that operates with low overhead and without any special kernel support. ALPS is useful for a range of applications, including scientific applications that need to control the CPU apportionment to the processes they create, Web servers that need to limit the proportion of available CPU time given to spawned processes that service Web requests, and middleware that supports multiple execution environments that are to run at different rates. ALPS works by minimally sampling the progress of processes under its control, and making simple predictions for when it should selectively pause and resume the processes. We present the algorithm, a UNIX-based implementation, and a performance evaluation. Our results show that the ALPS approach is practical; we can achieve good accuracy (under 5% error) and low overhead (under 1% of CPU), despite user-level operation.
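The sampling-based pause/resume decision at the heart of this approach can be illustrated with a small sketch. This is an assumption-laden reconstruction of the idea, not the ALPS algorithm itself: sample each process's observed CPU usage and pause whichever processes are running ahead of their target share.

```python
# Hedged sketch of user-level proportional sharing by selective pausing:
# a process whose measured fraction of total CPU usage exceeds its
# target share is a candidate to be paused until the others catch up.

def pause_decisions(usages, shares):
    """usages: observed CPU seconds per process (sampled externally).
    shares: target fractions per process, summing to 1.
    Returns the set of process indices that should be paused."""
    total = sum(usages)
    if total == 0:
        return set()               # nothing measured yet; pause nobody
    return {i for i, (u, s) in enumerate(zip(usages, shares))
            if u / total > s}
```

On a UNIX system, a plausible (assumed) mechanism for acting on these decisions without kernel support is sending `SIGSTOP` to the flagged processes and `SIGCONT` once their relative usage drops back under the target.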
Published: 1 January 2006
Conference: 2006 27th IEEE International Real-Time Systems Symposium (RTSS'06), 2006-12-5 - 2006-12-8
The constant bandwidth server (CBS) is an effective scheduling technique frequently used to handle overruns and implement resource reservation in real-time systems where tasks have variable execution requirements. The behavior of the server is tuned by two parameters: the server bandwidth, which defines the fraction of the processor allocated to the task, and the server period, which defines the time granularity of the allocation. The effect of the granularity on task executions has never been studied before, so it is typically assigned using ad-hoc considerations. This paper presents a statistical study to evaluate the effects of the server parameters on task response times, and proposes a technique to compute the best parameters that minimize the average response time of the served tasks.
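The two parameters discussed in the abstract map directly onto the standard CBS replenishment rule: with budget Q and period T, the server provides bandwidth Q/T, and when the budget is exhausted it is recharged to Q while the server deadline is postponed by T. A minimal sketch of that rule (a simplified single-server model, not the paper's implementation):

```python
# Minimal constant bandwidth server model. Q is the per-period budget,
# T the server period; Q/T is the reserved bandwidth and T the
# granularity whose effect the paper studies.

class CBS:
    def __init__(self, Q, T):
        self.Q, self.T = Q, T
        self.budget = Q
        self.deadline = T

    def run(self, exec_time, now=0.0):
        """Serve exec_time units of work, applying CBS replenishment.
        Returns (completion_time, final_server_deadline)."""
        t = now
        while exec_time > 0:
            slice_ = min(exec_time, self.budget)
            t += slice_
            self.budget -= slice_
            exec_time -= slice_
            if self.budget == 0 and exec_time > 0:
                self.budget = self.Q       # recharge budget
                self.deadline += self.T    # postpone server deadline
        return t, self.deadline
```

A small T gives fine-grained allocation but more frequent deadline postponements (and, in a real kernel, more preemption overhead); a large T gives coarser, burstier service. That is the granularity trade-off the paper quantifies.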
Published: 23 December 2004
Published: 25 June 2003
Conference: WORDS 2002: 7th International Workshop on Object-Oriented Real-Time Dependable Systems, San Diego, United States
In this paper, we present the design and implementation details of a flexible reflective scheduling framework that supports conjunctive scheduling of both tasks and messages within a distributed, message-based environment. In the future, distributed environments will need to fine-tune their systems to provide diverse services, often implementing dissimilar policies and functionality. We expect that future distributed systems will require schedulers that are tailor-made or customizable to suit the diverse workloads present at different times. The framework is therefore fashioned to provide both coarse- and fine-grained scheduling, for better tunability and improved performance. Though this model is designed to work with any thread-based system, we have investigated the applicability of these concepts on actors (active objects) within the Compose|Q framework. Scheduling of soft real-time tasks is handled by the framework to conform to guarantees, even in the presence of normal time-sharing tasks. We expect the proposed solution to be scalable while providing higher flexibility than simple task-based scheduling. Authors: S. Mohapatra and N. Venkatasubramanian, University of California, Irvine, CA, USA.
Published: 20 January 2003
Many scheduling paradigms have been studied for real-time applications and real-time communication networks. Among them, the most commonly used are the priority-driven, time-driven, and share-driven paradigms. In this paper, we present a general scheduling framework designed to integrate these paradigms. The framework is implemented in RED-Linux, our real-time extension of the Linux kernel. Two scheduler components are used in the framework: the Allocator and the Dispatcher. For each job, the framework identifies four scheduling attributes: priority, start time, finish time, and budget. We show that the framework can be used to efficiently implement many well-known scheduling algorithms. We also measure and analyze the performance of the framework implemented in RED-Linux.
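The four-attribute job model described in the abstract can be sketched to show how one generic dispatcher can emulate the different paradigms by changing which attribute it orders on. The attribute names come from the abstract; the dispatch policies below are illustrative interpretations, not the RED-Linux implementation:

```python
# Sketch of the four-attribute job model: a paradigm-neutral job record
# plus a dispatcher that selects the next job by a paradigm-specific
# attribute. The "share" rule shown (prefer largest remaining budget)
# is one simple assumed interpretation of share-driven dispatching.

from dataclasses import dataclass

@dataclass
class Job:
    priority: int   # lower value = higher priority (assumed convention)
    start: float    # earliest eligible dispatch time
    finish: float   # required finish time (deadline-like)
    budget: float   # remaining execution budget

def dispatch(jobs, paradigm, now):
    eligible = [j for j in jobs if j.start <= now and j.budget > 0]
    if not eligible:
        return None
    if paradigm == "priority":   # priority-driven
        return min(eligible, key=lambda j: j.priority)
    if paradigm == "time":       # time-driven (EDF-like on finish time)
        return min(eligible, key=lambda j: j.finish)
    if paradigm == "share":      # share-driven (most budget outstanding)
        return max(eligible, key=lambda j: j.budget)
    raise ValueError(paradigm)
```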
Published: 24 December 2002
Conference: 17th IEEE Real-Time Systems Symposium, 4 Dec 1996 - 6 Dec 1996, Los Alamitos, United States
We propose and analyze a proportional share resource allocation algorithm for realizing real-time performance in time-shared operating systems. Processes are assigned a weight which determines a share (percentage) of the resource they are to receive. The resource is then allocated in discrete-sized time quanta in such a manner that each process makes progress at a precise, uniform rate. Proportional share allocation algorithms are of interest because they provide a natural means of seamlessly integrating real- and non-real-time processing; they are easy to implement; they provide a simple and effective means of precisely controlling the real-time performance of a process; and they provide a natural means of policing, so that processes that use more of a resource than they request have no ill effect on well-behaved processes. We analyze our algorithm in the context of an idealized system in which a resource is assumed to be granted in arbitrarily small intervals of time, and show that our algorithm guarantees that the difference between the service time that a process should receive and the service time it actually receives is optimally bounded by the size of a time quantum. In addition, the algorithm provides support for dynamic operations, such as processes joining or leaving the competition, and for both fractional and non-uniform time quanta. As a proof of concept we have implemented a prototype of a CPU scheduler under FreeBSD. The experimental results show that our implementation performs within the theoretical bounds and hence supports real-time execution in a general-purpose operating system.
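The lag bound described in the abstract (actual service never deviates from ideal fluid service by more than one quantum) can be illustrated with a simple greedy sketch: each quantum goes to the process whose actual service lags furthest behind its weighted ideal share. This is an illustrative reconstruction of the general idea, not the paper's exact algorithm:

```python
# Weighted proportional-share quantum allocation sketch. The "ideal"
# service is what each process would receive under an idealized fluid
# allocation; granting each quantum to the most-lagging process keeps
# every process's lag within one quantum.

def proportional_share(weights, quanta):
    total_w = sum(weights)
    service = [0.0] * len(weights)     # actual service received so far
    elapsed = 0.0
    schedule = []
    for _ in range(quanta):
        elapsed += 1.0                 # one unit-sized quantum elapses
        # lag = ideal fluid service minus actual service
        lags = [elapsed * w / total_w - s
                for w, s in zip(weights, service)]
        pick = max(range(len(weights)), key=lambda i: lags[i])
        service[pick] += 1.0
        schedule.append(pick)
    return schedule, service
```

With weights 3:1 over four quanta, the heavier process receives three quanta and the lighter one quantum, matching the requested shares exactly at the period boundary.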
Published: 27 November 2002
Conference: 1998 IEEE ATM Workshop 'Meeting the challenges of deploying the global broadband network infrastructure'
In this paper we propose a solution that integrates traffic shaping with fair queueing into a single scheduling algorithm. The proposed algorithm, called the "fair shaping mechanism", is used to manage bandwidth on ATM virtual channels in non-real-time network terminals. It takes into account the general expression of ATM traffic parameters. We discuss the necessary modifications of these parameters for bursty IP traffic. A leaky bucket model is used to shape this traffic, and a packet-to-cell parameter conversion is proposed. We present an implementation prototype of this server in the IP driver for the Solaris 2.5 operating system (OS).
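The leaky-bucket shaping and packet-to-cell conversion mentioned in the abstract can be sketched as follows. The 48-byte cell payload (AAL5-style segmentation) and all function names are assumptions for illustration; the paper's actual parameter conversion may differ:

```python
# Hedged sketch of leaky-bucket shaping with packet-to-cell conversion:
# an IP packet is segmented into fixed-payload ATM cells, and the bucket
# drains cells at a constant rate, delaying bursts accordingly.

import math

CELL_PAYLOAD = 48  # assumed bytes of payload per ATM cell

def packet_to_cells(packet_bytes):
    return math.ceil(packet_bytes / CELL_PAYLOAD)

def shape(arrivals, rate):
    """arrivals: list of (arrival_time, packet_bytes), sorted by time.
    rate: cells drained per time unit.
    Returns the departure time of each packet's last cell."""
    next_free = 0.0                       # when the bucket is next idle
    departures = []
    for t, size in arrivals:
        cells = packet_to_cells(size)
        start = max(t, next_free)         # wait for earlier packets to drain
        next_free = start + cells / rate
        departures.append(next_free)
    return departures
```

A back-to-back burst of packets is thus smoothed to the configured cell rate, which is the shaping behavior the fair shaping mechanism combines with fair queueing across virtual channels.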