Kangaroo
Open Access
- 26 October 2021
- conference paper
- conference paper
- Published by Association for Computing Machinery (ACM)
Abstract
Many social-media and IoT services have very large working sets consisting of billions of tiny (≈100 B) objects. Large, flash-based caches are important to serving these working sets at acceptable monetary cost. However, caching tiny objects on flash is challenging for two reasons: (i) SSDs can read/write data only in multi-KB "pages" that are much larger than a single object, stressing the limited number of times flash can be written; and (ii) very few bits per cached object can be kept in DRAM without losing flash's cost advantage. Unfortunately, existing flash-cache designs fall short of addressing these challenges: write-optimized designs require too much DRAM, and DRAM-optimized designs require too many flash writes. We present Kangaroo, a new flash-cache design that optimizes both DRAM usage and flash writes to maximize cache performance while minimizing cost. Kangaroo combines a large, set-associative cache with a small, log-structured cache. The set-associative cache requires minimal DRAM, while the log-structured cache minimizes Kangaroo's flash writes. Experiments using traces from Facebook and Twitter show that Kangaroo achieves DRAM usage close to the best prior DRAM-optimized design, flash writes close to the best prior write-optimized design, and miss ratios better than both. Kangaroo's design is Pareto-optimal across a range of allowed write rates, DRAM sizes, and flash sizes, reducing misses by 29% over the state of the art. These results are corroborated with a test deployment of Kangaroo in a production flash cache at Facebook.Keywords
This publication has 19 references indexed in Scilit:
- WhirlpoolPublished by Association for Computing Machinery (ACM) ,2016
- Talus: A simple way to remove cliffs in cache performancePublished by Institute of Electrical and Electronics Engineers (IEEE) ,2015
- Exact analysis of TTL cache networksPerformance Evaluation, 2014
- Performance evaluation of the random replacement policy for networks of cachesACM SIGMETRICS Performance Evaluation Review, 2012
- SILTPublished by Association for Computing Machinery (ACM) ,2011
- Performance of the move-to-front algorithm with Markov-modulated request sequencesOperations Research Letters, 1999
- A comparison of trace-sampling techniques for multi-megabyte cachesInternational Conference on Acoustics, Speech, and Signal Processing (ICASSP), 1994
- The design and implementation of a log-structured file systemPublished by Association for Computing Machinery (ACM) ,1991
- An approximate analysis of the LRU and FIFO buffer replacement schemesPublished by Association for Computing Machinery (ACM) ,1990
- Principles of Optimal Page ReplacementJournal of the ACM, 1971