EuroSys '22: Seventeenth European Conference on Computer Systems

Conference Information
Name: EuroSys '22: Seventeenth European Conference on Computer Systems
Location: Rennes, France

Latest articles from this conference

Ahmed M. Abdelmoniem, Chen-Yu Ho, Pantelis Papageorgiou, Marco Canini
Proceedings of the 2nd European Workshop on Machine Learning and Systems; https://doi.org/10.1145/3517207.3526969

Abstract:
Federated learning (FL) is becoming a popular paradigm for collaborative learning over distributed, private datasets owned by non-trusting entities. FL has seen successful deployment in production environments, and it has been adopted in services such as virtual keyboards, auto-completion, item recommendation, and several IoT applications. However, FL comes with the challenge of performing training over largely heterogeneous datasets, devices, and networks that are out of the control of the centralized FL server. Motivated by this inherent setting, we make a first step towards characterizing the impact of device and behavioral heterogeneity on the trained model. We conduct an extensive empirical study spanning close to 1.5K unique configurations on five popular FL benchmarks. Our analysis shows that these sources of heterogeneity have a major impact on both model performance and fairness, thus shedding light on the importance of considering heterogeneity in FL system design.
Hongrui Shi, Valentin Radu
Proceedings of the 2nd European Workshop on Machine Learning and Systems; https://doi.org/10.1145/3517207.3526980

Abstract:
The Federated Learning (FL) workflow of training a centralized model with distributed data is growing in popularity. However, until recently, this was the realm of contributing clients with similar computing capability. The fast-expanding IoT space, with data being generated and processed at the edge, is encouraging more effort to extend federated learning to heterogeneous systems. Previous approaches distribute light-weight models to clients and rely on knowledge transfer to distill the characteristics of local data in partitioned updates. However, the additional knowledge exchange transmitted through the network degrades the communication efficiency of FL. We propose to reduce the size of the knowledge exchanged in these FL setups by clustering and selecting only the most representative bits of information from the clients. The partitioned global update adopted in our work splits the global deep neural network into a lower part for generic feature extraction and an upper part that is more sensitive to this selected client knowledge. Our experiments show that only 1.6% of the initially exchanged data can effectively transfer the characteristics of the client data to the global model in our FL approach using split networks. These preliminary results advance our understanding of federated learning by demonstrating efficient training using strategically selected training samples.
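
As a rough illustration of the split-network idea described in this abstract, the sketch below (PyTorch with scikit-learn's KMeans; the split point, the use of cluster centroids with majority labels as the "most representative" client information, and all sizes are assumptions for illustration, not the paper's exact method) shows a client summarizing its local features into a handful of representatives that the server uses to update only the upper part of the model.

```python
# Rough sketch only: split the model into a lower (generic) and upper
# (client-sensitive) part, and have a client send a few cluster-representative
# feature vectors with majority labels instead of a full update.
import torch
import torch.nn as nn
from sklearn.cluster import KMeans

lower = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 128), nn.ReLU())  # generic feature extractor
upper = nn.Sequential(nn.Linear(128, 10))                                # updated from client knowledge

def client_summary(images, labels, k=8):
    """Cluster local features; return only k centroids and their majority labels."""
    with torch.no_grad():
        feats = lower(images)
    km = KMeans(n_clusters=k, n_init=10).fit(feats.numpy())
    reps = torch.tensor(km.cluster_centers_, dtype=torch.float32)
    rep_labels = torch.stack([labels[torch.from_numpy(km.labels_ == c)].mode().values
                              for c in range(k)])
    return reps, rep_labels

def server_update(reps, rep_labels, steps=10, lr=0.01):
    """Fine-tune only the upper part on the small set of representatives."""
    opt = torch.optim.SGD(upper.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(steps):
        opt.zero_grad()
        loss_fn(upper(reps), rep_labels).backward()
        opt.step()

# Toy round with fake MNIST-sized client data.
images, labels = torch.randn(256, 1, 28, 28), torch.randint(0, 10, (256,))
server_update(*client_summary(images, labels))
```
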
Deepak George Thomas, Tichakorn Wongpiromsarn, Ali Jannesari
Proceedings of the 2nd European Workshop on Machine Learning and Systems; https://doi.org/10.1145/3517207.3526968

Abstract:
The function approximators employed by traditional image-based Deep Reinforcement Learning (DRL) algorithms usually lack a temporal learning component and instead focus on learning the spatial component. We propose a technique, Temporal Shift Reinforcement Learning (TSRL), in which the temporal and spatial components are learned jointly. Moreover, TSRL does not require additional parameters to perform temporal learning. We show that TSRL outperforms the commonly used frame-stacking heuristic on all of the Atari environments we test on, while beating the SOTA on all except one of them. This investigation has implications for the robotics and sequential decision-making domains. Our code is available at https://github.com/Deepakgthomas/TSM_RL
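
For readers unfamiliar with temporal shift modules, the sketch below shows a parameter-free temporal shift in the spirit of TSM, which the abstract's "no additional parameters" claim rests on (PyTorch; the tensor layout, shift fraction, and Atari-like shapes are illustrative assumptions, not necessarily the exact implementation in the linked repository).

```python
# Illustrative sketch of a parameter-free temporal shift (TSM-style).
import torch

def temporal_shift(x, shift_div=8):
    """x: [batch, time, channels, H, W]. Shift a fraction of channels forward and
    backward along the time axis; the operation adds no learnable parameters."""
    b, t, c, h, w = x.shape
    fold = c // shift_div
    out = torch.zeros_like(x)
    out[:, 1:, :fold] = x[:, :-1, :fold]                   # shift forward in time
    out[:, :-1, fold:2 * fold] = x[:, 1:, fold:2 * fold]   # shift backward in time
    out[:, :, 2 * fold:] = x[:, :, 2 * fold:]              # remaining channels untouched
    return out

frames = torch.randn(4, 4, 32, 84, 84)  # e.g. 4 stacked Atari frames as the time axis
shifted = temporal_shift(frames)
```
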
Nasrullah Sheikh, Xiao Qin, Berthold Reinwald, Chuan Lei
Proceedings of the 2nd European Workshop on Machine Learning and Systems; https://doi.org/10.1145/3517207.3526974

Abstract:
Developing scalable solutions for training Graph Neural Networks (GNNs) for link prediction tasks is challenging due to the inherent data dependencies which entail high computational costs and a huge memory footprint. We propose a new method for scaling training of knowledge graph embedding models for link prediction to address these challenges. Towards this end, we propose the following algorithmic strategies: self-sufficient partitions, constraint-based negative sampling, and edge mini-batch training. The experimental evaluation shows that our scaling solution for GNN-based knowledge graph embedding models achieves a 16x speed up on benchmark datasets while maintaining a comparable model performance to non-distributed methods on standard metrics.
Kai-Hsun Chen, Huan-Ping Su, Wei-Chiu Chuang, Hung-Chang Hsiao, Wangda Tan, Zhankun Tang, Xun Liu, Yanbo Liang, Wen-Chih Lo, Wanqiang Ji, et al.
Proceedings of the 2nd European Workshop on Machine Learning and Systems; https://doi.org/10.1145/3517207.3526984

Abstract:
As machine learning is applied more widely, a machine-learning platform is needed for both infrastructure administrators and users, including expert data scientists and citizen data scientists [24], to improve their productivity. However, existing machine-learning platforms are ill-equipped to address the "Machine Learning tech debts" [36] such as glue code, reproducibility, and portability. Furthermore, existing platforms only take expert data scientists into consideration, and are thus inflexible for infrastructure administrators and unfriendly to citizen data scientists. We propose Submarine, a unified machine-learning platform that takes infrastructure administrators, expert data scientists, and citizen data scientists all into consideration. Submarine has been widely used in many technology companies, including Ke.com and LinkedIn. We present two use cases in Section 5.
Muhammad Sabih, Frank Hannig, Jürgen Teich
Proceedings of the 2nd European Workshop on Machine Learning and Systems; https://doi.org/10.1145/3517207.3526982

Abstract:
Filter pruning is one of the most effective ways to accelerate Convolutional Neural Networks (CNNs). Most existing works focus on static pruning of CNN filters. For dynamic pruning of CNN filters, existing works are based on the idea of switching between different branches of a CNN or exiting early based on the hardness of a sample. These approaches can reduce the average latency of inference, but they cannot reduce the longest-path latency of inference. In contrast, we present a novel approach to dynamic filter pruning that utilizes explainable AI along with early coarse prediction in the intermediate layers of a CNN. This coarse prediction is performed by a simple branch that is trained to perform top-k classification. The branch either predicts the output class with high confidence, in which case the remaining computations are skipped, or it narrows the prediction to a subset of the possible output classes. After this coarse prediction, only those filters that are important for this subset of classes are evaluated. The importance of each filter for each output class is obtained using explainable AI. Using this concept of dynamic pruning, we are able to reduce not only the average latency of inference but also the longest-path latency of inference. Our proposed architecture for dynamic pruning can be deployed on different hardware platforms.
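
A minimal sketch of the coarse-prediction-then-prune control flow might look as follows (PyTorch; the tiny network, the confidence threshold, and the random placeholder for the explainable-AI filter-importance matrix are assumptions for illustration; a real implementation would skip pruned filters rather than merely zeroing them in order to actually save latency).

```python
# Illustrative sketch: an intermediate branch makes a coarse top-k prediction;
# if confident we exit early, otherwise we keep only the filters marked
# important for the predicted class subset.
import torch
import torch.nn as nn

num_classes, mid_channels = 10, 32
stem = nn.Sequential(nn.Conv2d(3, mid_channels, 3, padding=1), nn.ReLU())
branch = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                       nn.Linear(mid_channels, num_classes))        # coarse top-k head
tail = nn.Sequential(nn.Conv2d(mid_channels, 64, 3, padding=1), nn.ReLU(),
                     nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, num_classes))

# importance[c, f] = relevance of intermediate filter f for class c.
# Placeholder values; in the paper this would come from an XAI attribution method.
importance = torch.rand(num_classes, mid_channels)

def infer(x, k=3, conf_threshold=0.9, keep_ratio=0.5):
    feats = stem(x)
    probs = branch(feats).softmax(dim=-1)
    conf, topk = probs.topk(k, dim=-1)
    if conf[0, 0] > conf_threshold:               # confident: skip the rest entirely
        return topk[0, 0].item()
    # Keep only filters important for the top-k candidate classes.
    class_importance = importance[topk[0]].max(dim=0).values
    keep = class_importance >= class_importance.quantile(1 - keep_ratio)
    mask = keep.float().view(1, -1, 1, 1)         # zeroing stands in for skipping
    return tail(feats * mask).argmax(dim=-1).item()

print(infer(torch.randn(1, 3, 32, 32)))
```
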
Guilherme H. Apostolo, Pablo Bauszat, Vinod Nigade, Henri E. Bal, Lin Wang
Proceedings of the 2nd European Workshop on Machine Learning and Systems; https://doi.org/10.1145/3517207.3526973

Abstract:
Many private and public organizations deploy large numbers of cameras, which are used in application services for public safety, healthcare, and traffic control. Recent advances in deep learning have demonstrated remarkable accuracy on the computer vision tasks that are fundamental for these applications, such as object detection and action recognition. While deep learning opens the door to the automation of camera-based applications, deploying pipelines for live video analytics is still a complicated process that requires domain expertise in machine learning, computer vision, computer systems, and networking. The problem is further amplified when multiple pipelines need to be deployed on the same infrastructure to meet different users' diverse and dynamic needs. In this paper, we present a live-video-analytics-as-a-service vision, aiming to remove this complexity barrier and achieve flexibility, agility, and efficiency for applications based on live video analytics. We motivate our vision by identifying its requirements and the shortcomings of existing approaches. Based on our analysis, we present our envisioned system design and discuss the challenges that need to be addressed to make it a reality.
Filip Svoboda, Javier Fernandez-Marques, Edgar Liberis, Nicholas D. Lane
Proceedings of the 2nd European Workshop on Machine Learning and Systems; https://doi.org/10.1145/3517207.3526978

Abstract:
Microcontrollers are an attractive deployment target due to their low cost, modest power usage and abundance in the wild. However, deploying models to such hardware is non-trivial due to a small amount of on-chip RAM (often < 512KB) and limited compute capabilities. In this work, we delve into the requirements and challenges of fast DNN inference on MCUs: we describe how the memory hierarchy influences the architecture of the model, expose often under-reported costs of compression and quantization techniques, and highlight issues that become critical when deploying to MCUs compared to mobiles. Our findings and experiences are also distilled into a set of guidelines that should ease the future deployment of DNN-based applications on microcontrollers.
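
To make the RAM constraint mentioned above concrete, here is a back-of-the-envelope check (not taken from the paper; the layer shapes, int8 activations, and the two-buffer residency assumption are illustrative) of whether a small CNN's peak activation memory fits a 512 KB budget.

```python
# Rough estimate of peak activation memory for a small CNN on an MCU.
# Assumes int8 activations and that only the input and output buffers of the
# current layer must be resident at the same time.
RAM_BUDGET = 512 * 1024          # bytes, e.g. an MCU with 512 KB SRAM

# (name, output height, output width, output channels) for each layer.
layers = [
    ("input", 96, 96,  3),
    ("conv1", 48, 48, 16),
    ("conv2", 24, 24, 32),
    ("conv3", 12, 12, 64),
    ("pool",   1,  1, 64),
]

def activation_bytes(h, w, c, bytes_per_elem=1):
    return h * w * c * bytes_per_elem

peak = 0
for prev, cur in zip(layers, layers[1:]):
    need = activation_bytes(*prev[1:]) + activation_bytes(*cur[1:])
    peak = max(peak, need)
    print(f"{prev[0]} -> {cur[0]}: {need / 1024:.1f} KB")

print(f"peak activation memory ~ {peak / 1024:.1f} KB "
      f"({'fits' if peak < RAM_BUDGET else 'exceeds'} the {RAM_BUDGET // 1024} KB budget)")
```
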
Davide Sanvito, Giuseppe Siracusano, Sharan Santhanam, Roberto Gonzalez, Roberto Bifulco
Proceedings of the 2nd European Workshop on Machine Learning and Systems; https://doi.org/10.1145/3517207.3526979

Abstract:
While monitoring system behavior to detect anomalies and failures is important, existing methods based on log-analysis can only be as good as the information contained in the logs, and other approaches that look at the OS-level software state introduce high overheads. We tackle the problem with syslrn, a system that first builds an understanding of a target system offline, and then tailors the online monitoring instrumentation based on the learned identifiers of normal behavior. While our syslrn prototype is still preliminary and lacks many features, we show in a case study for the monitoring of OpenStack failures that it can outperform state-of-the-art log-analysis systems with little overhead.
Sami Alabed, Eiko Yoneki
Proceedings of the 2nd European Workshop on Machine Learning and Systems; https://doi.org/10.1145/3517207.3526977

Abstract:
Current auto-tuners struggle with computer systems due to their large, complex parameter spaces and high evaluation costs. We propose BoGraph, an auto-tuning framework that builds a graph of the system's components via causal structure learning before optimizing it. The graph contextualizes the system by decomposing the parameter space, enabling faster convergence and the handling of many parameters. Furthermore, BoGraph exposes an API for encoding expert knowledge of the system via performance models and a known dependency structure among the components. We evaluated BoGraph in a hardware design case study, achieving a 5x to 7x improvement in energy and latency over the default configuration across a variety of tasks.
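
The decomposition idea can be illustrated with a toy sketch (all component names, parameters, and the objective below are made up, and plain random search stands in for the Bayesian-optimization step that BoGraph would actually use): each component's small sub-space is tuned in dependency order instead of searching the full joint space.

```python
# Toy sketch of tuning a decomposed parameter space component by component.
import random

# Expert-encoded structure: each component's parameters and its upstream dependencies.
components = {
    "cache":     {"params": {"size_kb": [64, 128, 256]},    "depends_on": []},
    "scheduler": {"params": {"quantum_us": [50, 100, 200]}, "depends_on": ["cache"]},
}

def measure(config):
    """Stand-in performance model: lower is better (e.g. latency)."""
    return 1000 / config["size_kb"] + config["quantum_us"] * 0.5

def tune(budget_per_component=10):
    config = {}
    # Optimize components in dependency order, one small sub-space at a time.
    for name in ["cache", "scheduler"]:
        best = None
        for _ in range(budget_per_component):
            trial = dict(config)
            for p, choices in components[name]["params"].items():
                trial[p] = random.choice(choices)
            full = {**{"size_kb": 64, "quantum_us": 100}, **trial}  # defaults for unset params
            cost = measure(full)
            if best is None or cost < best[0]:
                best = (cost, trial)
        config.update(best[1])
    return config

print(tune())
```
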