Middleware '20: 21st International Middleware Conference

Conference Information
Name: Middleware '20: 21st International Middleware Conference
Location: Delft, Netherlands

Latest articles from this conference

Yuquan Shan, George Kesidis, Aman Jain, Bhuvan Urgaonkar, Jalal Khamse-Ashari, Ioannis Lambadaris
Proceedings of the 2020 6th International Workshop on Container Technologies and Container Clouds; https://doi.org/10.1145/3429885.3429962

Using tiny tasks (microtasks) has long been regarded as an effective way of load balancing in parallel computing systems. When combined with containerized execution nodes pulling in work upon becoming idle, microtasking has the desirable property of automatically adapting its load distribution to the processing capacities of participating nodes - more powerful nodes finish their work sooner and, therefore, pull in additional work faster. As a result, microtasking is deemed especially desirable in settings with heterogeneous processing capacities and poorly characterized workloads. However, microtasking carries additional scheduling and I/O overheads that may make it costly in some scenarios. Moreover, the optimal task size generally needs to be learned. We herein study an alternative load balancing scheme - Heterogeneous MacroTasking (HeMT) - wherein the workload is intentionally skewed according to the nodes' processing capacities. We implemented and open-sourced a prototype of HeMT within the Apache Spark application framework and conducted experiments using the Apache Mesos cluster manager. We show experimentally that, when workload-specific estimates of nodes' processing capacities are learned, Spark with HeMT offers up to 10% shorter average completion times for realistic, multi-stage data-processing workloads over the baseline Homogeneous microTasking (HomT) system.
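The capacity-skewed allocation the abstract describes can be illustrated with a small sketch. This is not the authors' implementation; the function name and the proportional-rounding scheme below are hypothetical, and it assumes per-node capacity estimates have already been learned:

```python
# Hypothetical sketch of capacity-weighted macrotask sizing, the core idea
# behind HeMT: skew each node's share of the input in proportion to its
# estimated processing capacity, instead of handing out uniform microtasks.

def hemt_partition(total_units, capacities):
    """Split total_units of work across nodes proportionally to capacity.

    capacities: estimated work units per second for each node.
    Returns a list of integer task sizes, one per node, summing to total_units.
    """
    total_capacity = sum(capacities)
    shares = [total_units * c / total_capacity for c in capacities]
    sizes = [int(s) for s in shares]
    # Hand the rounding remainder to the largest fractional shares.
    remainder = total_units - sum(sizes)
    order = sorted(range(len(shares)),
                   key=lambda i: shares[i] - sizes[i], reverse=True)
    for i in order[:remainder]:
        sizes[i] += 1
    return sizes
```

With this skew, a node that is twice as fast receives twice the work up front, so all nodes finish at roughly the same time without the per-task pull overhead of microtasking.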
Guillaume Everarts de Velp, Etienne Rivière, Ramin Sadre
Proceedings of the 2020 6th International Workshop on Container Technologies and Container Clouds; https://doi.org/10.1145/3429885.3429967

Many application server backends leverage container technologies to support workloads formed of short-lived, but potentially I/O-intensive, operations. The latency at which container-supported operations complete impacts both the users' experience and the throughput that the platform can achieve. This latency results from both the bootstrap and execution time of the containers and is greatly impacted by the performance of the I/O subsystem. Appropriately configuring the container environment and technology stack to obtain good performance is not an easy task, due to the variety of options and poor visibility into their interactions. We present in this paper a benchmarking tool for the multi-parametric study of container bootstrap time and I/O performance, allowing us to understand such interactions within a controlled environment. We report the results obtained by evaluating a large number of environment configurations. Our conclusions highlight differences in support and performance between container runtime environments and I/O subsystems.
Eddy Truyen, Bert Lagaisse, Wouter Joosen, Arnout Hoebreckx, Cédric De Dycker
Proceedings of the 2020 6th International Workshop on Container Technologies and Container Clouds; https://doi.org/10.1145/3429885.3429963

This paper presents the concept of the PolyPod, which consists of multiple Pods that run different versions of the same container image on the same node in order to share common libraries in memory. Its novelty is that it proposes a blueprint for blue-green deployments that balances maximum flexibility in the number of migration steps with maximum workload consolidation within a fixed total resource cost. This balance between flexibility and improved resource utilization is important for application areas where users are served by the same application instance but have different time preferences for being upgraded to a new application version. The PolyPod concept is also relevant for a planned feature of Kubernetes that would allow Pods to be vertically scaled without restarting them, but where scaling actions are aborted if the capacity of the node would be exceeded. We explain how the PolyPod concept supports balancing flexible migration and resource utilization, with and without Pod restarts, by simulating various migration scenarios based on a quantitative cost model.
Richard Li, Min Du, Hyunseok Chang, Sarit Mukherjee, Eric Eide
Proceedings of the 2020 6th International Workshop on Container Technologies and Container Clouds; https://doi.org/10.1145/3429885.3429965

While distributed application-layer tracing is widely used for performance diagnosis in microservices, its coarse granularity at the service level limits its applicability to detecting more fine-grained, system-level issues. To address this problem, cross-layer stitching of tracing information has been proposed. However, all existing cross-layer stitching approaches either require modification of the kernel or need updates in the application-layer tracing library to propagate stitching information, both of which add further complex modifications to existing tracing tools. This paper introduces Deepstitch, a deep learning based approach to stitch cross-layer tracing information without requiring any changes to existing application-layer tracing tools. Deepstitch leverages a global view of a distributed application composed of multiple services and learns the global system call sequences across all services involved. This knowledge is then used to stitch system call sequences with service-level traces obtained from a deployed application. Our proof-of-concept experiments show that the proposed approach successfully maps application-level interactions onto system call sequences and can identify thread-level interactions.
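As a rough illustration of the stitching idea only (not Deepstitch's actual deep learning model), one can imagine matching an observed system-call window to the service-level span whose learned call profile it overlaps most; all names and the n-gram scheme below are hypothetical:

```python
# Hypothetical sketch: associate a window of observed system calls with a
# service-level trace span by comparing learned syscall n-gram profiles.

def ngrams(seq, n=3):
    """All contiguous n-grams of a syscall sequence, as a set of tuples."""
    return {tuple(seq[i:i + n]) for i in range(len(seq) - n + 1)}

def stitch(window, profiles, n=3):
    """profiles: span name -> syscall sequence recorded during training.
    Returns the span whose n-gram profile overlaps the window the most."""
    window_grams = ngrams(window, n)

    def overlap(name):
        return len(window_grams & ngrams(profiles[name], n))

    return max(profiles, key=overlap)
```

Deepstitch replaces this kind of hand-built matching with a learned model, which is what lets it work without modifying the tracing library or the kernel.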
Shripad Nadgowda
Proceedings of the 2020 6th International Workshop on Container Technologies and Container Clouds; https://doi.org/10.1145/3429885.3429961

The use of filesystems has become a standard, with a purpose well beyond just storing and accessing application data. For instance, common system security and compliance operations, such as software or package installation, system and application configuration, or process management, also leverage a filesystem. Over decades, various system management and security tools have been designed to access system state and to implement their respective functions through a file interface. However, we observe that these tools do not require access to all the files in the filesystem, and in some cases they can even work with incomplete file contents. Motivated by these observations, we propose filenail (Filesystem Thumbnail), a system that exercises an incomplete filesystem state marshalling and un-marshalling protocol. We discuss the use of filenail to implement an effective and optimal disaggregated solution to perform common system security tasks for container clouds. In general, depending on the use case, not all files in the filesystem are equal, and an incomplete filesystem state can often be enough. The results of this paper show that filenail is very efficient in capturing and transferring filesystem state and enables implementing disaggregated security solutions in the cloud.
Nuno Lopes, Rolando Martins, Manuel Eduardo Correia, Sérgio Serrano, Francisco Nunes
Proceedings of the 2020 6th International Workshop on Container Technologies and Container Clouds; https://doi.org/10.1145/3429885.3429966

Nowadays the use of container technologies is ubiquitous, and thus the need to make them secure arises. Container technologies such as Docker provide several options to improve container security, one of which is the use of a Seccomp profile. A major problem with these profiles is that they are hard to maintain, for two reasons: they need to be updated quite often, and determining exactly what to update is a complex and time-consuming task; as a result, not many people use them. The research goal of this paper is to make Seccomp profiles a viable technique in a production environment by proposing a reliable method to generate custom Seccomp profiles for arbitrary containerized applications. This research focused on developing a solution with few requirements, allowing for easy integration with any environment with no human intervention. Results show that using a custom Seccomp profile can mitigate several attacks, and even some zero-day vulnerabilities, on containerized applications. This represents a big step forward for using Seccomp in a production environment, which would benefit users worldwide.
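A custom profile of the kind the paper generates can be pictured as an allow-list in Docker's Seccomp JSON format. The generator below is a hypothetical sketch, assuming the set of syscalls observed for the target application is already known; it is not the paper's generation method:

```python
import json

# Hypothetical generator for an allow-list Seccomp profile in Docker's
# JSON format: everything is denied by default (SCMP_ACT_ERRNO) and only
# the syscalls observed for the target application are allowed.

def build_profile(observed_syscalls):
    return {
        "defaultAction": "SCMP_ACT_ERRNO",  # deny anything not listed
        "syscalls": [{
            "names": sorted(observed_syscalls),
            "action": "SCMP_ACT_ALLOW",
        }],
    }

profile = build_profile({"read", "write", "exit_group", "futex"})
# Written to a file, such a profile is applied at container start with:
#   docker run --security-opt seccomp=profile.json ...
print(json.dumps(profile, indent=2))
```

The deny-by-default shape is what mitigates unknown attacks: a zero-day exploit that needs a syscall the application never uses simply fails.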
Laaziz Lahlou, Nadjia Kara, Mohssine Arouch, Claes Edstrom
Proceedings of the 2020 6th International Workshop on Container Technologies and Container Clouds; https://doi.org/10.1145/3429885.3429964

Recently, attributed graphs have been extensively employed in modeling, studying and analyzing complex interactions in real-world systems. A myriad of techniques have been proposed to partition these graphs into clusters that exhibit small entropy with respect to both the compositional attributes and the structural properties of the graph. In cloud network infrastructures, they play an important role in understanding end users, compute nodes and their interactions. One of the main challenges in today's large-scale cloud infrastructures is to categorize these compute nodes into clusters that share similar attributes. Existing unsupervised machine learning techniques such as k-Means and DBSCAN are inadequate for partitioning large-scale computer network infrastructures, due to their unsuitability for such contexts and to algorithmic complexities that prevent them from scaling to such sizes in a reasonable time. In this paper, we first formulate the problem of partitioning attributed graphs in the context of cloud infrastructures as a Quadratic Assignment Problem to solve small- to medium-scale instances, and show its NP-hardness. We then propose Cheetah, a fast and scalable multi-objective topology-aware unsupervised machine learning technique that is tailored to effectively partition large-scale cloud network infrastructures. In terms of complexity, Cheetah is linear, as it leverages the Breadth-First Search algorithm. Experimental results demonstrate its ability to quickly construct good-quality clusters (≈ 1.63 seconds for 1000 nodes, compared to ≈ 2.78 seconds for k-Means and ≈ 24.76 seconds for DBSCAN) and reveal its suitability for large-scale infrastructures, making it an appealing solution for integration into orchestration systems.
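Since Cheetah's linear complexity comes from a BFS traversal, the general flavor of BFS-based attributed-graph clustering can be sketched as below. This is a hypothetical illustration of the approach, not Cheetah's actual algorithm; the distance metric and similarity threshold are assumptions:

```python
from collections import deque

def bfs_attribute_clusters(adj, attrs, threshold=1.0):
    """Partition an attributed graph into clusters with one BFS pass.

    adj:   dict node -> list of neighbor nodes.
    attrs: dict node -> numeric attribute vector (tuple of floats).
    A neighbor joins the current cluster only if its attribute distance
    to the cluster's seed is within `threshold`; otherwise it seeds a
    cluster of its own later. Each node and edge is visited once, so the
    traversal is O(V + E), i.e. linear in the graph size.
    """
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    cluster_of = {}
    clusters = []
    for start in adj:
        if start in cluster_of:
            continue
        cluster = [start]
        cluster_of[start] = len(clusters)
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v in cluster_of:
                    continue
                if dist(attrs[v], attrs[start]) <= threshold:
                    cluster_of[v] = len(clusters)
                    cluster.append(v)
                    queue.append(v)
        clusters.append(cluster)
    return clusters
```

The contrast with k-Means and DBSCAN is that clusters here grow along graph edges, so topology and attributes are considered together in a single pass.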
Quinten Stokkink, Alexander Stannat, Johan Pouwelse
Proceedings of the 1st International Workshop on Distributed Infrastructure for Common Good; https://doi.org/10.1145/3428662.3428790

Successful classification of good or bad behavior in the digital domain is limited to central governance, as can be seen with trading platforms, search engines and news feeds. We explore and consolidate existing work on decentralized reputation systems to form a common denominator for what makes a reputation system successful when applied without a centralized reputation authority, formalized in 7 axioms and 3 postulates. Reputation must start from nothing and always reward performed work, respectively lowering and increasing as work is consumed and performed. However, it is impossible for nodes to perform work in a purely synchronous attack-proof work model and real systems must necessarily employ relaxations to such a work model. We show how the relaxations of performing parallel work, allowing unconsumed work and seeding well-known identities with work satisfy our model. Our formalizations allow constraint driven design of decentralized reputation mechanisms.
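Two of the properties described above, reputation starting from nothing and moving only as work is performed or consumed, can be sketched minimally. The class name and the symmetric accounting rule are hypothetical illustrations, not the paper's formal axioms:

```python
# Minimal sketch of a work-based reputation ledger: newcomers start at
# zero, and reputation changes only when work is performed (credit) or
# consumed (debit).

class Reputation:
    def __init__(self):
        self.scores = {}  # node -> reputation; absent means 0

    def record_work(self, worker, consumer, amount):
        """`worker` performs `amount` units of work for `consumer`."""
        assert amount > 0
        self.scores[worker] = self.scores.get(worker, 0) + amount
        self.scores[consumer] = self.scores.get(consumer, 0) - amount

    def score(self, node):
        return self.scores.get(node, 0)  # reputation starts from nothing
```

Starting every identity at zero is what makes seeding well-known identities with work a relaxation, as the abstract notes: it deliberately breaks the "start from nothing" rule for bootstrap purposes.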
Martijn de Vos, Johan Pouwelse
Proceedings of the 1st International Workshop on Distributed Infrastructure for Common Good; https://doi.org/10.1145/3428662.3428789

Preventing the abuse of resources is a crucial requirement in shared-resource systems. This concern can be addressed through a centralized gatekeeper, yet that enables manipulation by the gatekeeper itself. We present ConTrib, a decentralized mechanism for tracking resource usage across different shared-resource systems. In ConTrib, participants maintain a personal ledger with tamper-proof records. A record describes a resource consumption or contribution and links to other records. Fraud, i.e., maintaining multiple copies of a personal ledger, is detected by users themselves through the continuous exchange of records and by validating their consistency against known ones. We implement ConTrib and run experiments. Our evaluation with up to 1,000 instances reveals that fraud can be detected within 22 seconds and with moderate bandwidth usage. To demonstrate the applicability of our work, we deploy ConTrib in a Tor-like overlay and show how resource abuse by free-riders is effectively deterred. This longitudinal, large-scale trial has resulted in over 137 million records, created by more than 86,000 volunteers.
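The hash-linked personal ledger and fork-based fraud detection described above can be sketched as follows. This is a minimal illustration; the record layout and function names are assumptions, not ConTrib's wire format:

```python
import hashlib
import json

def make_record(prev_hash, seq, delta):
    """A tamper-evident record in a personal ledger (hypothetical layout):
    links to the previous record by hash and carries a sequence number
    plus a resource delta (negative = consumption, positive = contribution)."""
    body = {"prev": prev_hash, "seq": seq, "delta": delta}
    digest = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()
    return {**body, "hash": digest}

def detect_fork(records):
    """Fraud in ConTrib means keeping multiple copies of one's ledger.
    Two distinct records claiming the same sequence number expose a fork,
    which peers notice by exchanging and cross-checking records."""
    seen = {}
    for r in records:
        if r["seq"] in seen and seen[r["seq"]] != r["hash"]:
            return True
        seen[r["seq"]] = r["hash"]
    return False
```

Because each record commits to its predecessor's hash, a cheater cannot rewrite history; they can only fork, and the continuous record exchange makes forks visible to honest peers.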