Synthesizing representative I/O workloads using iterative distillation

Abstract
Storage systems designers are still searching for better methods of obtaining representative I/O workloads to drive studies of I/O systems. Traces of production workloads are very accurate, but inflexible and difficult to obtain. The use of synthetic workloads addresses these limitations; however, synthetic workloads are accurate only if they share certain key properties with the production workload on which they are based (e.g., mean request size, read percentage). Unfortunately, we do not know which properties are "key" for a given workload and storage system.

We have developed a tool, the Distiller, that automatically identifies the key properties ("attribute-values") of the workload. The Distiller then uses these attribute-values to generate a synthetic workload representative of the production workload. This paper presents the design and evaluation of the Distiller. We demonstrate how the Distiller finds representative synthetic workloads for simple artificial workloads and three production workload traces.
