Increased processing power of MapReduce clusters generally enhances performance and availability at the cost of substantial energy consumption, which often incurs higher operational costs (e.g., electricity bills) and negative environmental impacts (e.g., carbon dioxide emissions). The few greening methods for computing clusters in the literature focus mainly on computational energy consumption, leaving out cooling energy, which accounts for a significant portion of the total energy consumed by such clusters. To this end, in this paper, we propose a machine learning-based approach named Green MapReduce Cluster (GMC) that reduces the total energy consumption of a MapReduce cluster by considering both computational energy and cooling energy. GMC predicts the number of machines that results in minimum total energy consumption. We perform the prediction by applying different machine learning techniques over year-long data collected from a real setup. We evaluate the performance of GMC over a real testbed. Our evaluation reveals that GMC reduces total energy consumption by up to 47% compared to other alternatives while experiencing marginal throughput degradation in a few cases.
GMC: Greening MapReduce Clusters Considering both Computation Energy and Cooling Energy
1. GMC: Greening MapReduce Clusters Considering both Computation Energy and Cooling Energy
Tarik Reza Toha1, Mohammad M. R. Lunar2, A S M Rizvi3,
Novia Nurain4, and A. B. M. Alim Al Islam5
1,4,5Bangladesh University of Engineering and Technology, Bangladesh
2University of Nebraska-Lincoln, USA
3University of Southern California, USA
ICC 2018
Kansas City, MO, USA
2. Outline
• Background and motivation
• Related work
• Proposed methodology
– Considering both computation and cooling energy
consumption
• Performance evaluation
– Test-bed implementation
• Conclusion
4. Architectures of Parallel Computing
• Cluster machines are connected by a local area network
• Clouds and grids are geographically distributed
• Clusters are tightly coupled, whereas clouds and grids are loosely coupled
[Figures: cluster topology; grid computing; cloud computing; a large Linux cluster at a University of Technology in Germany; a home-made cluster]
5. MapReduce Clusters
The MapReduce programming model is increasingly used to parallelize and distribute jobs across a cluster
Advantages
• Performance
• Availability
Disadvantages
• High energy consumption
  – High operational costs (e.g., electricity bills)
  – Negative environmental impacts (e.g., CO2 emissions)
It is of utmost importance to invest our efforts in energy efficiency in addition to performance improvement of MapReduce clusters
6. Energy Consumption Breakdown
A breakdown of energy consumption by different components of a data center
Dayarathna, Miyuru, et al., 2016
Cooling energy occupies a significant portion of the total energy consumption of parallel and distributed systems, including clusters
7. Existing Greening Approaches
• Dynamic Energy-Aware Capacity Provisioning for Cloud Computing Environments
  – Zhang et al., ICAC, 2012
  – A homogeneous cloud solution
    • Provides the optimum number of machines
  – Trades off energy efficiency (considering computational power) against waiting time as performance
    • Considers the cost of turning servers on and off and fluctuations in energy prices
Cooling power is not considered!
8. Existing Greening Approaches (contd.)
• Power Management in Heterogeneous MapReduce Clusters
  – Sunuwar et al., MSc Thesis, UNL, 2016
  – A heterogeneous MapReduce cluster solution
    • Addresses data unavailability of a MapReduce cluster due to dynamic capacity provisioning
  – Trades off energy efficiency (considering computational power) against throughput as performance
    • Restricts CPU utilization of slave nodes
Cooling power is not considered!
9. Existing Greening Approaches (contd.)
• Thermal Aware Server Provisioning and Workload Distribution for Internet Data Centers
  – Abbasi et al., HPDC, 2010
  – A homogeneous solution for Internet data centers
  – Trades off energy efficiency (considering both computational and cooling power) against response time as performance
    • Uses a thermodynamic model of the data center
    • Selects active servers based on least recirculated heat
Considers neither indoor nor outdoor weather!
Cooling energy minimization for spatially distributed machines in a cloud does not use indoor and outdoor weather data, as such data are difficult to collect
Indoor and outdoor weather conditions play a significant role in cooling energy minimization of a co-located cluster environment
10. Our Contribution
We propose a machine learning based approach,
GMC, which predicts the number of machines for
minimum total energy consumption of a
MapReduce cluster including both computational
energy and cooling energy while considering
weather conditions with minimal overhead
Green MapReduce Cluster (GMC)
11. Our Proposed GMC Framework
Block diagram of operations in our proposed framework
21. Energy Consumption Comparison
In all cases, GMC achieves the best trade-off in total energy per bit (on average, 29% over Zhang et al. and 55% over Sunuwar et al.)
In all cases, GMC reduces total energy (on average, 19% over Zhang et al. and 47% over Sunuwar et al.)
22. Conclusion
• MapReduce clusters consume a substantial amount of energy, including both computation and cooling energy
  – Little effort has been spent on minimizing energy consumption considering both energy components
• We provide a greening scheme that simultaneously considers both computational and cooling power consumption with minimal overhead
  – Outperforms existing greening methods while maintaining performance similar to the static method
• Future work
  – Simulate GMC in a large heterogeneous cluster
  – Use context-dependent classification in cooling power prediction (e.g., Kalman filter)
  – Use many-objective optimization techniques to optimize all performance terms
Hello everyone! Welcome to my presentation. I am going to present the paper titled “GMC: Greening MapReduce Clusters Considering both Computation Energy and Cooling Energy”. The authors are Tarik Reza Toha, Mohammad M. R. Lunar, A S M Rizvi, Novia Nurain, and A. B. M. Alim Al Islam. The second author is from the University of Nebraska-Lincoln, the third author is from the University of Southern California, and the other authors are from the Dept. of CSE, BUET. This work has been funded by the ICT Division of Bangladesh. Let’s see the outline of my presentation.
At first, I will talk about the background and motivation behind our work. Then, I will talk about our proposed methodology followed by performance evaluation. Finally, I will conclude with some future work. Let’s start.
Parallel computing is widely used for extensive computation: consider galaxy formation simulation, climate change prediction, weather forecasting, or even traffic simulation on busy roads. Modeling, simulating, and experimenting with such complex real-world phenomena demand rigorous computing. Hence, we need parallel computing. Let’s see the implementations of parallel computing.
Parallel computing can be implemented in three architectures: cluster, cloud, and grid. Cluster machines are connected by a local area network, whereas clouds and grids are geographically distributed. Hence, clusters are tightly coupled, while clouds and grids are loosely coupled. Here are some snapshots of a large cluster and a home-made cluster. Let’s see how a cluster works.
To parallelize and distribute jobs across a cluster, a programming model named MapReduce is being used increasingly. MapReduce clusters increase performance and availability at the expense of high energy consumption. Hence, operational costs, i.e., electricity bills, are increasing, and CO2 emissions are also increasing, which has an adverse effect on nature. Therefore, energy efficiency matters in addition to performance improvement of MapReduce clusters. Let’s see the energy breakdown of a parallel computing architecture.
Here is the breakdown of energy consumption by different components of a data center. We can see that 50% of the energy is required to cool the system. Note that a data center is an implementation of a cloud or grid that is constructed using a cluster of machines. Now, we will review the existing power-saving approaches.
Zhang et al. propose a homogeneous cloud solution that trades off computational power against waiting time. They consider the cost of turning machines on and off and fluctuations in energy prices. However, they do not consider cooling power. This is a cloud solution; let’s see a cluster one.
Sunuwar et al. propose a heterogeneous MapReduce cluster solution that trades off computing power against throughput. They restrict the CPU usage of slave nodes and address data unavailability of the MapReduce cluster by keeping all machines powered on. However, they do not consider cooling power. Let’s see another solution that does consider cooling power.
Abbasi et al. propose a homogeneous solution for Internet data centers that trades off total power (including cooling power) against response time. They use a thermodynamic model of the data center. However, they do not consider external weather data. Note that, although cooling energy minimization for spatially distributed machines in a cloud needs no external weather data, such data are essential for the minimization task in a co-located cluster environment. Therefore, a novel method is needed for cooling energy minimization in computing clusters.
We propose a machine learning-based approach named Green MapReduce Cluster that predicts the number of machines for minimum total energy consumption of a MapReduce cluster, including both computational energy and cooling energy, while considering weather conditions with minimal overhead. Let’s see the block diagram of our proposed method.
GMC has three modules: a sensing module, a learning module, and a capacity provisioning module. The sensing module senses room temperature, environment temperature, and the power consumption of both computing and cooling machines. It also collects the starting and ending times of a specific job. Then it feeds the data to the learning module as a training data set.
The learning module uses the sensed data and predicts response time, computational power, and cooling power. Then it generates a distribution of total predicted energy over all numbers of machines and returns the number of machines that results in minimum predicted total energy. The capacity provisioning module adjusts the cluster size by turning machines on or off to reach the optimum number of machines.
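The decision made by the learning and capacity provisioning modules can be sketched as a simple argmin over candidate cluster sizes. The toy time and power models below are hypothetical stand-ins for GMC's learned predictors; only the selection structure mirrors the framework.

```python
# Sketch of GMC's provisioning decision: predict total energy for every
# candidate cluster size, then pick the size with the minimum.

def choose_cluster_size(max_machines, predict_time_s, predict_power_w):
    """Return the machine count whose predicted total energy is minimum."""
    def total_energy_j(n):
        # Predicted energy = predicted response time x predicted total power
        return predict_time_s(n) * predict_power_w(n)
    return min(range(1, max_machines + 1), key=total_energy_j)

# Hypothetical models: response time falls roughly as 1/n; power has a base
# term, a per-machine computational term, and a superlinear cooling term.
time_model = lambda n: 3600.0 / n
power_model = lambda n: 50.0 + 40.0 * n + 0.5 * n ** 2

best = choose_cluster_size(30, time_model, power_model)
```

With these toy models the predicted energy curve is U-shaped, so the minimum lands strictly inside the 1..30 range rather than at either extreme, which is the situation where capacity provisioning actually pays off.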
Let’s see the configuration of our experimental setup.
We have a cluster of 30 machines, where we use Apache Hadoop to implement MapReduce. We have 29 Core 2 Duo slave nodes and one Core i5 master node. We use Ubuntu 14.04 32-bit and Hadoop 1.0.3 on all machines. We have three split-type 2-ton ACs. Let’s see some snapshots of the test-bed implementation.
Here are some snapshots of our test-bed. We use a DHT temperature and humidity sensor, and a web API to collect outdoor weather data. We use an Arduino energy monitor to collect the power consumption data of CPUs and ACs. Let’s see the accuracy of our prediction tasks.
We can see that the indoor temperature stays fairly close to the expected temperature. Let’s see the response time prediction.
We can see that response time decreases exponentially with the increase in the number of machines. It also increases with the data size. Using these training data, a 1-nearest-neighbor model can predict the response time of an incoming job with 87% accuracy. We use Auto-WEKA to select the learning model along with its parameters. Next, we will see the computational power prediction.
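The 1-nearest-neighbor step above can be sketched without any ML library. The feature layout and training values below are hypothetical; the 87% accuracy comes from the model Auto-WEKA selected over the real measurements.

```python
# Dependency-free 1-nearest-neighbor sketch of the response-time predictor.

def knn1_predict(train, query):
    """train: list of ((n_machines, data_size_gb), response_time_s) pairs."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    _, time_s = min(train, key=lambda rec: sq_dist(rec[0], query))
    return time_s

# Hypothetical history: more machines -> shorter time; more data -> longer time.
history = [((5, 1), 700.0), ((10, 1), 360.0), ((20, 1), 190.0), ((10, 2), 720.0)]
predicted = knn1_predict(history, (9, 1))   # nearest neighbor is (10, 1)
```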
We can see that computational power increases with the number of machines and the data size. In the test phase, Auto-WEKA indicates that a support vector machine for regression performs best. It predicts the CPU power with 98.6% accuracy. Now, we will see the cooling power prediction.
Cooling power does not follow any trend; rather, it depends on the environmental weather. As cooling power is related to the environment, it is highly non-linear and depends on various factors that we do not consider. Nevertheless, we can see that in August the cooling power is much higher than in February, since the temperature in August is higher. In the testing phase, we use additive regression with a random forest classifier, as suggested by Auto-WEKA, and get 67.76% accuracy. We believe that the accuracy will increase with more training data. Let’s see the environmental conditions during training data collection.
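Because cooling power tracks weather rather than cluster size, a sketch of this predictor keys on outdoor temperature. The nearest-historical-reading scheme and all numbers below are hypothetical simplifications; the paper's actual model is additive regression with a random-forest base learner.

```python
# Predict cooling power from outdoor temperature via the nearest historical
# observation, capturing the August-vs-February effect described above.

def predict_cooling_power(history, outdoor_temp_c):
    """history: list of (outdoor_temp_c, cooling_power_w) observations."""
    _, watts = min(history, key=lambda rec: abs(rec[0] - outdoor_temp_c))
    return watts

history = [
    (33.0, 5200.0), (35.0, 5600.0),   # August-like readings: ACs work harder
    (18.0, 2100.0), (21.0, 2500.0),   # February-like readings
]
hot = predict_cooling_power(history, 34.5)    # nearest reading: 35.0 C
cold = predict_cooling_power(history, 19.0)   # nearest reading: 18.0 C
```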
By multiplying the response time by the total power, we get the total energy. We can see that the accuracy of the total energy prediction is 97%. Let’s see the working process of GMC.
The working process is very simple. After a job arrives, GMC predicts the optimum number of machines using historical data. Then it adjusts the cluster size and allows the job to finish. After the job finishes, it collects all the sensor data and stores them in a database. Now, we will see the experimental evaluation.
We compare our method with two state-of-the-art green methods and a naïve (static) method. We can see that GMC reduces total energy in all cases: on average, by 19% over Zhang et al. and 47% over Sunuwar et al. Comparing in terms of total energy per bit, GMC achieves the best trade-off in all cases: on average, 29% over Zhang et al. and 55% over Sunuwar et al. Now, we will see the performance comparison among these methods.
We can see that GMC sacrifices a little performance compared to the static method, the best performance provider, in terms of response time and throughput. On average, GMC degrades response time by 4.51% and throughput by 5.71% compared to the static method. In terms of throughput, GMC performs well compared to the other green methods, which show a 42.8% degradation over the static method. Therefore, GMC provides the maximum energy efficiency in exchange for minimal performance degradation among all the alternatives. Now, it is time to conclude.
MapReduce clusters consume a huge amount of energy, including both computation and cooling energy. However, few studies consider cooling energy in energy-efficiency efforts. We provide a greening method that considers both energy components with minimal overhead and outperforms all other alternatives.
We plan to simulate GMC in a large heterogeneous cluster and use modern classifiers for cooling power prediction. We also want to use many-objective optimization techniques to optimize all performance terms. That’s all. Thank you.