Improving Energy Efficiency of Hadoop Clusters using Approximate Computing

Document Type

Conference Proceeding

Publication Date


Publication Title

Proceedings - 5th IEEE International Conference on Big Data Security on Cloud, BigDataSecurity 2019, 5th IEEE International Conference on High Performance and Smart Computing, HPSC 2019 and 4th IEEE International Conference on Intelligent Data and Security, IDS 2019

First Page


Last Page



Approximate computing, Energy-efficiency, Hadoop, MapReduce, Pi, Thermal-efficiency


© 2019 IEEE. There is an ongoing search for finding energy-efficient solutions in multi-core computing platforms. Approximate computing is one such solution leveraging the forgiving nature of applications to improve the energy efficiency at different layers of the computing platform ranging from applications to hardware. We are interested in understanding the benefits of approximate computing in the realm of Apache Hadoop and its applications. A few mechanisms for introducing approximation in programming models include sampling input data, skipping selective computations, relaxing synchronization, and user-defined quality-levels. We believe that it is straightforward to apply the aforementioned mechanisms to conserve energy in Hadoop clusters as well. The emerging trend of approximate computing motivates us to systematically investigate thermal profiling of approximate computing strategies in this research. In particular, we design a thermal-aware approximate computing framework called tHadoop2, which is an extension of tHadoop proposed by Chavan et al. We investigated the thermal behavior of a MapReduce application called Pi running on Hadoop clusters by varying two input parameters-number of maps and number of sampling points per map. Our profiling results show that Pi exhibits inherent resilience in terms of the number of precision digits present in its value.

This document is currently not available here.