RUE: A caching method for identifying and managing hot data by leveraging resource utilization efficiency

Document Type

Conference Proceeding

Publication Date


Publication Title

Software - Practice and Experience




Keywords

cache replacement algorithm, caching, hot data identification and management, resource utilization efficiency, reuse distance


Abstract

In this study, we propose a caching method called RUE for dynamic large-scale data streams. We define a data model to facilitate hot data identification and management. At the heart of the RUE model is the hot degree, which takes into account two factors, data resource utilization efficiency and reuse distance, and aims to quantitatively reflect data popularity in a dynamic data stream. Based on the hot degree of the data, RUE classifies data into four types, each of which is assigned an associated cache residence time. Guided by the RUE model, we develop the HM algorithm to identify and manage hot data in a dynamic data stream. The HM algorithm is implemented with four stacks, namely, the new stack, short stack, long stack, and temp stack. Moreover, an eviction algorithm and a migration algorithm are integrated into HM to facilitate block replacement and migration. To evaluate the performance of the HM algorithm, we quantitatively compare RUE with three state-of-the-art algorithms, namely, LRU, LIRS, and ARC, under various replacement policies, operations, and workloads. Experimental results show that RUE outperforms these three existing algorithms in terms of both read and write hit rates. Furthermore, we show that with the four stacks in place, the computing overhead of HM is negligible.
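As a rough illustration of one input to the hot degree, the sketch below computes reuse distance for a block trace and measures the hit rate of LRU, one of the baselines named above. It is not the RUE or HM algorithm: the abstract does not give the hot degree formula, the four data types, or the stack rules, so none of those are reproduced. The function names `reuse_distances` and `lru_hit_rate`, the sample trace, and the definition of reuse distance used here (the number of distinct other blocks accessed since the previous access to the same block) are assumptions made for this example.

```python
from collections import OrderedDict
from typing import Hashable, Iterable, List, Optional


def reuse_distances(trace: Iterable[Hashable]) -> List[Optional[int]]:
    """Return one reuse distance per access; None marks a first access.

    Assumed definition: the count of distinct other blocks touched since
    the previous access to the same block.
    """
    stack: "OrderedDict[Hashable, None]" = OrderedDict()  # most recent last
    distances: List[Optional[int]] = []
    for block in trace:
        if block in stack:
            keys = list(stack)
            # Every key after `block` was accessed more recently than it.
            distances.append(len(keys) - 1 - keys.index(block))
            stack.move_to_end(block)
        else:
            distances.append(None)
            stack[block] = None
    return distances


def lru_hit_rate(trace: Iterable[Hashable], capacity: int) -> float:
    """Simulate a plain LRU cache and return the fraction of hits."""
    cache: "OrderedDict[Hashable, None]" = OrderedDict()
    hits = 0
    total = 0
    for block in trace:
        total += 1
        if block in cache:
            hits += 1
            cache.move_to_end(block)
        else:
            cache[block] = None
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict the least recently used
    return hits / total if total else 0.0


if __name__ == "__main__":
    sample_trace = ["a", "b", "c", "a", "b", "d", "a"]  # hypothetical trace
    print(reuse_distances(sample_trace))
    print(lru_hit_rate(sample_trace, capacity=3))
```

Under this definition, an access hits in an LRU cache of capacity C exactly when its reuse distance is smaller than C, which is one reason reuse distance is a natural signal for judging how long a block is worth keeping.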
