BayesPerf: minimizing performance monitoring errors using Bayesian statistics
- 17 April 2021
- conference paper
- Published by Association for Computing Machinery (ACM) in Proceedings of the 26th ACM International Conference on Architectural Support for Programming Languages and Operating Systems
Abstract
Hardware performance counters (HPCs) that measure low-level architectural and microarchitectural events provide dynamic contextual information about the state of the system. However, HPC measurements are error-prone due to nondeterminism (e.g., undercounting due to event multiplexing, or OS interrupt-handling behaviors). In this paper, we present BayesPerf, a system for quantifying uncertainty in HPC measurements by using a domain-driven Bayesian model that captures microarchitectural relationships between HPCs to jointly infer their values as probability distributions. We provide the design and implementation of an accelerator that allows for low-latency and low-power inference of the BayesPerf model for x86 and ppc64 CPUs. BayesPerf reduces the average error in HPC measurements from 40.1% to 7.6% when events are being multiplexed. The value of BayesPerf in real-time decision-making is illustrated with a simple example of scheduling of PCIe transfers.
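The general idea in the abstract, treating a multiplexed counter reading as an uncertain estimate and tightening it with a second estimate obtained from related counters, can be illustrated with a minimal Python sketch. This is not the BayesPerf model: it assumes simple linear time scaling of a multiplexed count and two independent Gaussian estimates of the same quantity, and every name and number below (`scale_multiplexed`, `fuse`, the counts and variances) is hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Gaussian:
    """A value described by a mean and a variance (an illustrative assumption)."""
    mean: float
    var: float


def scale_multiplexed(raw_count: float, time_enabled: float, time_running: float) -> float:
    """Linearly extrapolate a count that was only sampled for part of the interval."""
    if time_running <= 0:
        raise ValueError("counter never ran; no estimate is possible")
    return raw_count * time_enabled / time_running


def fuse(a: Gaussian, b: Gaussian) -> Gaussian:
    """Combine two independent Gaussian estimates of one quantity.

    Each estimate is weighted by its precision (1 / variance), so the
    result is never less certain than either input.
    """
    precision = 1.0 / a.var + 1.0 / b.var
    mean = (a.mean / a.var + b.mean / b.var) / precision
    return Gaussian(mean, 1.0 / precision)


# Hypothetical readings: the counter ran for a quarter of the interval,
# so the extrapolated value is given a large variance.
direct = Gaussian(scale_multiplexed(250_000, 1.0, 0.25), var=4.0e10)

# Hypothetical second estimate of the same event count, derived from
# related counters and assumed here to be more certain.
derived = Gaussian(900_000.0, var=1.0e10)

posterior = fuse(direct, derived)
print(posterior)
```

With these made-up inputs the fused mean lies closer to the more certain estimate, and the fused variance is smaller than either input variance, which is the sense in which relationships between counters can reduce measurement uncertainty.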
Funding Information
- National Science Foundation (CNS 13-37732, CNS 16-24790, CCF 20-29049)