GMC: Greening MapReduce Clusters Considering both Computation Energy and Cooling Energy
Tarik Reza Toha (1), Mohammad M. R. Lunar (2), A s m Rizvi (3), Novia Nurain (4), and A. B. M. Alim Al Islam (5)
(1, 4, 5) Bangladesh University of Engineering and Technology, Bangladesh
(2) University of Nebraska-Lincoln, USA
(3) University of Southern California, USA
ICC 2018, Kansas City, MO, USA
Outline
• Background and motivation
• Related work
• Proposed methodology
  – Considering both computation and cooling energy consumption
• Performance evaluation
  – Test-bed implementation
• Conclusion
Parallel Computing
Modeling, simulation, and experimentation of complex real-world phenomena demand rigorous computing
Examples: galaxy formation, planetary movements, climate changes, traffic simulation, plate tectonics, weather forecasts
Architectures of Parallel Computing
• Cluster machines are connected by a local area network
• Clouds and grids are geographically distributed
• Clusters are tightly coupled, whereas clouds and grids are loosely coupled
[Figures: cluster topology; grid computing; cloud computing; large Linux cluster in University of Technology, Germany; home-made cluster]
MapReduce Clusters
MapReduce programming model is increasingly
being used in recent times to parallelize and
distribute jobs across a cluster
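To make the MapReduce programming model concrete, here is a minimal single-process sketch of word counting, the same kind of job the test-bed later runs with Hadoop's wordcount. It is only an illustration in plain Python, not Hadoop code: the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical, and each input string stands in for a split that a real cluster would hand to a separate mapper.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one input split."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Reduce: sum the counts collected for each word."""
    return {word: sum(counts) for word, counts in grouped.items()}


def word_count(documents):
    """Run map, shuffle, and reduce sequentially over all splits."""
    pairs = []
    for document in documents:  # each document plays the role of one mapper's split
        pairs.extend(map_phase(document))
    return reduce_phase(shuffle(pairs))


counts = word_count(["green cluster", "green energy"])
```

On a real cluster the map and reduce calls run in parallel on different machines, which is why the number of active machines affects both response time and energy.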
Advantages
• Performance
• Availability
Disadvantages
• High energy consumption
  – High operational costs (e.g., electricity bills)
  – Negative environmental impacts (e.g., CO2 emissions)
It is therefore essential to invest effort in the energy efficiency of MapReduce clusters, in addition to their performance improvement
Energy Consumption Breakdown
A breakdown of energy consumption by different components of a data center
Dayarathna, Miyuru, et al., 2016
Cooling energy accounts for a significant portion of total energy consumption in parallel and distributed systems, including clusters
Existing Greening Approaches
• Dynamic Energy-Aware Capacity Provisioning for Cloud Computing Environments
  – Zhang et al., ICAC, 2012
  – A homogeneous cloud solution
    • Provides the optimum number of machines
  – Trades off energy efficiency (considering computational power) against performance (measured as waiting time)
    • Considers the cost of turning servers on and off and fluctuation in energy prices
Cooling power is not considered!
Existing Greening Approaches (contd.)
• Power Management in Heterogeneous MapReduce Cluster
  – Sunuwar et al., MSc Thesis, UNL, 2016
  – A heterogeneous MapReduce cluster solution
    • Addresses data unavailability of a MapReduce cluster due to dynamic capacity provisioning
  – Trades off energy efficiency (considering computational power) against performance (measured as throughput)
    • Restricts CPU utilization of slave nodes
Cooling power is not considered!
Existing Greening Approaches (contd.)
• Thermal Aware Server Provisioning and Workload Distribution for Internet Data Centers
  – Abbasi et al., HPDC, 2010
  – A homogeneous solution for Internet data centers
  – Trades off energy efficiency (considering both computational and cooling power) against performance (measured as response time)
    • Uses a thermodynamic model of the data center
    • Selects active servers based on least recirculated heat
Considers neither indoor nor outdoor weather!
Our Contribution
Cooling energy minimization for spatially distributed machines in a cloud does not need indoor and outdoor weather data, as such data are difficult to collect
Indoor and outdoor weather conditions play a significant role in cooling energy minimization of a co-located cluster environment
We propose a machine learning based approach,
GMC, which predicts the number of machines for
minimum total energy consumption of a
MapReduce cluster including both computational
energy and cooling energy while considering
weather conditions with minimal overhead
Green MapReduce Cluster (GMC)
Our Proposed GMC Framework
Block diagram of operations in our proposed framework
Test-bed Implementation
• Configuration of our Hadoop cluster

  Processor                          Cores  RAM   # of machines  Node type
  Intel Core 2 Duo E4600 @ 2.40GHz   2      1 GB  1              Slave
  Intel Core 2 Duo E7300 @ 2.66GHz   2      2 GB  11             Slave
  Intel Core 2 Duo E7400 @ 2.80GHz   2      2 GB  17             Slave
  Intel Core i5-3470 @ 3.20GHz       4      4 GB  1              Master

• Other configurations
  – Operating system: Ubuntu 14.04 LTS (x86)
  – Hadoop version: 1.0.3 (default configuration)
  – Input data-set: Wikimedia database
  – Algorithm: wordcount (provided by Hadoop)
  – Number of air-conditioners: 3
    • Type: Split-type
    • Tons: 2
Snapshots of Test-bed Implementation
[Photos: Hadoop cluster; weather sensing module; CPU power sensing module; AC power sensing module]
Indoor and Outdoor Weather Data
Temperature condition during training and testing phases
Prediction 1: Response Time
A 1-nearest-neighbor model can predict response time with 87.27% accuracy
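As a rough illustration of how a 1-nearest-neighbor predictor works, the sketch below returns the response time of the single closest training sample. The feature layout (number of machines, indoor temperature, outdoor temperature), the function name `predict_1nn`, and every number in the example are assumptions made for illustration; the slides do not state the exact features, distance metric, or training data used.

```python
import math


def predict_1nn(train_features, train_targets, query):
    """Return the target of the training sample closest to `query`.

    Closeness is plain Euclidean distance (an assumption); with k = 1 the
    prediction is simply copied from the nearest neighbor.
    """
    nearest = min(
        range(len(train_features)),
        key=lambda i: math.dist(train_features[i], query),
    )
    return train_targets[nearest]


# Made-up samples: (machines, indoor temp in C, outdoor temp in C) -> response time in s
features = [(4, 24.0, 30.0), (8, 24.5, 31.0), (16, 25.0, 33.0)]
response_times = [900.0, 520.0, 310.0]

predicted = predict_1nn(features, response_times, (9, 24.4, 31.2))
```

In practice the features would usually be scaled first, since an unscaled machine count can dominate the distance.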
Prediction 2: Computational Power
Support vector regression can predict computational power with 98.62% accuracy
Prediction 3: Cooling Power
Additive regression with random forest can predict cooling power with 67.76% accuracy
Overall Prediction: Total Energy
Accuracy of total energy prediction is 97.23%
Total Energy = (CPU Power + AC Power) × Response Time
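The formula above can be written as a short worked example. This is a sketch that assumes power is given in watts and response time in seconds, so the result is in joules; the slides do not state the units, and the input values below are invented purely to show the arithmetic.

```python
def total_energy_joules(cpu_power_w, ac_power_w, response_time_s):
    """Total Energy = (CPU Power + AC Power) x Response Time."""
    return (cpu_power_w + ac_power_w) * response_time_s


def joules_to_kwh(energy_j):
    """Convert joules to kilowatt-hours (1 kWh = 3.6e6 J)."""
    return energy_j / 3.6e6


# Illustrative values only: 1500 W of CPU power, 2500 W of AC power, a 600 s job
energy_j = total_energy_joules(1500.0, 2500.0, 600.0)
energy_kwh = joules_to_kwh(energy_j)
```

Because response time multiplies both power terms, a configuration that draws more power can still use less energy if it finishes the job sooner.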
Working Process of GMC Framework
Working process of our proposed GMC framework
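One plausible reading of this working process is a search over candidate machine counts: predict response time, CPU power, and AC power for each count, combine them with the total energy formula, and keep the count with the lowest predicted energy. The sketch below follows that reading. The function `choose_machine_count` and the three toy predictor functions are hypothetical stand-ins, not the trained models or the exact procedure from the slides.

```python
def choose_machine_count(candidates, predict_time, predict_cpu_power, predict_ac_power):
    """Return the candidate machine count with the lowest predicted total energy."""

    def predicted_energy(n):
        # Total Energy = (CPU Power + AC Power) x Response Time
        return (predict_cpu_power(n) + predict_ac_power(n)) * predict_time(n)

    return min(candidates, key=predicted_energy)


# Toy predictors (assumptions for illustration, not measured behavior)
def toy_time(n):
    return 3600.0 / n + 60.0  # more machines finish sooner, plus fixed overhead


def toy_cpu_power(n):
    return 80.0 * n  # each active machine adds CPU power


def toy_ac_power(n):
    return 500.0 + 10.0 * n  # cooling load grows with active machines


best = choose_machine_count(range(1, 30), toy_time, toy_cpu_power, toy_ac_power)
```

With these toy curves the minimum falls strictly inside the candidate range, which illustrates why neither the smallest nor the largest cluster is automatically the most energy-efficient choice.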
Performance Comparison
In all cases, GMC achieves the best trade-off (on average, 29% over Zhang et al. and 55% over Sunuwar et al.)
Energy Consumption Comparison
In all cases, GMC reduces total energy (on average, 19% over Zhang et al. and 47% over Sunuwar et al.)
Conclusion
• MapReduce clusters consume a substantial amount of energy, including both computation and cooling energy
  – Little effort has been spent on minimizing energy consumption considering both energy components
• We provide a greening scheme that simultaneously considers both computational power and cooling power consumption with minimal overhead
  – Outperforms existing greening methods while maintaining performance similar to the static method
• Future work
  – Simulate GMC in a large heterogeneous cluster
  – Use context-dependent classification in cooling power prediction (e.g., Kalman filter)
  – Use a many-objective optimization technique to optimize all performance terms
Thank you
Questions are welcome!
Email: alim_razi@cse.buet.ac.bd

 
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptxPOINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptxSayali Powar
 
Introduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher EducationIntroduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher Educationpboyjonauth
 
Arihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdfArihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdfchloefrazer622
 

Recently uploaded (20)

Interactive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationInteractive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communication
 
Privatization and Disinvestment - Meaning, Objectives, Advantages and Disadva...
Privatization and Disinvestment - Meaning, Objectives, Advantages and Disadva...Privatization and Disinvestment - Meaning, Objectives, Advantages and Disadva...
Privatization and Disinvestment - Meaning, Objectives, Advantages and Disadva...
 
Grant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy ConsultingGrant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy Consulting
 
Código Creativo y Arte de Software | Unidad 1
Código Creativo y Arte de Software | Unidad 1Código Creativo y Arte de Software | Unidad 1
Código Creativo y Arte de Software | Unidad 1
 
Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3
 
Employee wellbeing at the workplace.pptx
Employee wellbeing at the workplace.pptxEmployee wellbeing at the workplace.pptx
Employee wellbeing at the workplace.pptx
 
Sanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfSanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdf
 
Staff of Color (SOC) Retention Efforts DDSD
Staff of Color (SOC) Retention Efforts DDSDStaff of Color (SOC) Retention Efforts DDSD
Staff of Color (SOC) Retention Efforts DDSD
 
microwave assisted reaction. General introduction
microwave assisted reaction. General introductionmicrowave assisted reaction. General introduction
microwave assisted reaction. General introduction
 
Paris 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityParis 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activity
 
Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991
Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991
Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991
 
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
 
Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104
 
1029 - Danh muc Sach Giao Khoa 10 . pdf
1029 -  Danh muc Sach Giao Khoa 10 . pdf1029 -  Danh muc Sach Giao Khoa 10 . pdf
1029 - Danh muc Sach Giao Khoa 10 . pdf
 
Measures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeMeasures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and Mode
 
Activity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfActivity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdf
 
mini mental status format.docx
mini    mental       status     format.docxmini    mental       status     format.docx
mini mental status format.docx
 
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptxPOINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
 
Introduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher EducationIntroduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher Education
 
Arihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdfArihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdf
 

GMC: Greening MapReduce Clusters Considering both Computation Energy and Cooling Energy

  • 1. GMC: Greening MapReduce Clusters Considering both Computation Energy and Cooling Energy. Tarik Reza Toha (1), Mohammad M. R. Lunar (2), A s m Rizvi (3), Novia Nurain (4), and A. B. M. Alim Al Islam (5). Affiliations: (1, 4, 5) Bangladesh University of Engineering and Technology, Bangladesh; (2) University of Nebraska-Lincoln, USA; (3) University of Southern California, USA. ICC 2018, Kansas City, MO, USA.
  • 2. Outline
      • Background and motivation
      • Related work
      • Proposed methodology
          – Considering both computation and cooling energy consumption
      • Performance evaluation
          – Test-bed implementation
      • Conclusion
  • 3. Parallel Computing. Modeling, simulation, and experimentation of complex real-world phenomena demand rigorous computing. Examples: galaxy formation, planetary movements, climate changes, traffic simulation, plate tectonics, weather forecasts.
  • 4. Architectures of Parallel Computing: cluster, grid computing, and cloud computing. Cluster machines are connected by a local area network; clouds and grids are geographically distributed. Clusters are tightly coupled, whereas clouds and grids are loosely coupled. Pictured: cluster topology; a large Linux cluster in University of Technology, Germany; a home-made cluster.
  • 5. MapReduce Clusters. The MapReduce programming model is increasingly being used to parallelize and distribute jobs across a cluster.
      • Advantages: performance, availability
      • Disadvantages: high energy consumption
          – High operational costs (e.g., electricity bills)
          – Negative environmental impacts (e.g., CO2 emissions)
      It is of utmost importance to invest our efforts in energy efficiency in addition to performance improvement of MapReduce clusters.
  • 6. Energy Consumption Breakdown. Figure: a breakdown of energy consumption by different components of a data center (Dayarathna, Miyuru, et al., 2016). Cooling energy occupies a significant portion of total energy consumption in parallel and distributed systems encompassing clusters.
  • 7. Existing Greening Approaches
      • Dynamic Energy-Aware Capacity Provisioning for Cloud Computing Environments (Zhang et al., ICAC, 2012)
          – A homogeneous cloud solution
              • Provides the optimum number of machines
          – Trades off energy efficiency (considering computational power) against waiting time as the performance measure
              • Considers the cost of turning servers on and off and the fluctuation in energy prices
      Cooling power is not considered!
  • 8. Existing Greening Approaches (contd.)
      • Power Management in Heterogeneous MapReduce Cluster (Sunuwar et al., MSc Thesis, UNL, 2016)
          – A heterogeneous MapReduce cluster solution
              • Addresses data unavailability of a MapReduce cluster due to dynamic capacity provisioning
          – Trades off energy efficiency (considering computational power) against throughput as the performance measure
              • Restricts CPU utilization of slave nodes
      Cooling power is not considered!
  • 9. Existing Greening Approaches (contd.)
      • Thermal Aware Server Provisioning and Workload Distribution for Internet Data Centers (Abbasi et al., HPDC, 2010)
          – A homogeneous solution for Internet data centers
          – Trades off energy efficiency (considering both computational and cooling power) against response time as the performance measure
              • Uses a thermodynamic model of the data center
              • Selects active servers based on least recirculated heat
      Considers neither indoor nor outdoor weather!
      Cooling energy minimization of spatially distributed machines in a cloud does not need indoor and outdoor weather data, as they are difficult to collect. Indoor and outdoor weather conditions play a significant role in cooling energy minimization of a co-located cluster environment.
  • 10. Our Contribution: Green MapReduce Cluster (GMC). We propose a machine-learning-based approach, GMC, which predicts the number of machines for minimum total energy consumption of a MapReduce cluster, including both computational energy and cooling energy, while considering weather conditions with minimal overhead.
  • 11. Our Proposed GMC Framework. Figure: block diagram of operations in our proposed framework.
  • 12. Test-bed Implementation
      • Configuration of our Hadoop cluster
          Processor                           Cores   RAM    # of machines   Node
          Intel Core 2 Duo E4600 @ 2.40GHz    2       1 GB   1               Slave
          Intel Core 2 Duo E7300 @ 2.66GHz    2       2 GB   11              Slave
          Intel Core 2 Duo E7400 @ 2.80GHz    2       2 GB   17              Slave
          Intel Core i5-3470 @ 3.20GHz        4       4 GB   1               Master
      • Other configurations
          – Operating system: Ubuntu 14.04 LTS (x86)
          – Hadoop version: 1.0.3 (default configuration)
          – Input data-set: Wikimedia database
          – Algorithm: wordcount (provided by Hadoop)
          – Number of air-conditioners: 3
              • Type: Split-type
              • Tons: 2
  • 13. Snapshots of Test-bed Implementation. Pictured: Hadoop cluster; weather sensing module; CPU power sensing module; AC power sensing module.
  • 14. Indoor and Outdoor Weather Data. Figure: temperature condition during training and testing phases.
  • 15. Prediction 1: Response Time. 1-nearest neighbors can predict it with 87.27% accuracy.
  • 16. Prediction 2: Computational Power. Support vector machine for regression can predict it with 98.62% accuracy.
  • 17. Prediction 3: Cooling Power. Additive regression with random forest can predict it with 67.76% accuracy.
  • 18. Overall Prediction: Total Energy. Total Energy = (CPU Power + AC Power) × Response Time. Accuracy of total energy prediction is 97.23%.
  • 19. Working Process of GMC Framework. Figure: working process of our proposed GMC framework.
  • 21. Energy Consumption Comparison. In all cases, GMC achieves the best trade-off (on average, 29% over Zhang et al. and 55% over Sunuwar et al.). In all cases, GMC reduces total energy (on average, 19% over Zhang et al. and 47% over Sunuwar et al.).
  • 22. Conclusion
      • MapReduce clusters consume a substantial amount of energy, including both computation and cooling energy
          – Little effort has been spent to minimize the energy consumption considering both energy components
      • We provide a greening scheme simultaneously considering both computational power and cooling power consumption with minimal overhead
          – Outperforms existing greening methods while maintaining performance similar to the static method
      • Future work
          – Simulate GMC in a large heterogeneous cluster
          – Use context-dependent classification in cooling power prediction (e.g., Kalman filter)
          – Use many-objective optimization techniques to optimize all performance terms
  • 23. Thank you. Questions are welcome! Email: alim_razi@cse.buet.ac.bd
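
The total-energy relation on slide 18 and the selection goal stated on slide 10 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Prediction` type, the function names, the units (watts and seconds, giving joules), and the example numbers are all hypothetical and do not come from the paper. In GMC the three predicted quantities would come from the learned models; here they are typed in by hand.

```python
from typing import Dict, NamedTuple


class Prediction(NamedTuple):
    """Predicted quantities for one candidate cluster size (hypothetical type)."""
    cpu_power_w: float      # predicted computational power, assumed in watts
    ac_power_w: float       # predicted cooling (AC) power, assumed in watts
    response_time_s: float  # predicted response time, assumed in seconds


def total_energy_j(p: Prediction) -> float:
    """Total Energy = (CPU Power + AC Power) x Response Time, as on slide 18."""
    return (p.cpu_power_w + p.ac_power_w) * p.response_time_s


def best_machine_count(predictions: Dict[int, Prediction]) -> int:
    """Return the machine count whose predicted total energy is lowest."""
    if not predictions:
        raise ValueError("no candidate machine counts")
    return min(predictions, key=lambda n: total_energy_j(predictions[n]))


# Hypothetical predictions keyed by number of machines; NOT measurements.
example = {
    10: Prediction(cpu_power_w=600.0, ac_power_w=4000.0, response_time_s=900.0),
    20: Prediction(cpu_power_w=1200.0, ac_power_w=4200.0, response_time_s=500.0),
    29: Prediction(cpu_power_w=1740.0, ac_power_w=4500.0, response_time_s=450.0),
}

if __name__ == "__main__":
    for n, p in sorted(example.items()):
        print(n, total_energy_j(p))
    print("chosen machine count:", best_machine_count(example))
```

With these made-up inputs, adding machines shortens the job but raises power draw, so the minimum falls at an intermediate cluster size rather than at either extreme.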

Editor's Notes

  1. Hello everyone, and welcome to my presentation. I am going to present the paper titled “GMC: Greening MapReduce Clusters Considering both Computation Energy and Cooling Energy”. The authors are Tarik Reza Toha, Mohammad M. R. Lunar, A s m Rizvi, Novia Nurain, and A. B. M. Alim Al Islam. The second author is from the University of Nebraska-Lincoln, the third author is from the University of Southern California, and the other authors are from the Dept. of CSE, BUET. This work has been funded by the ICT Division of Bangladesh. Let’s see the outline of my presentation.
  2. First, I will talk about the background and motivation behind our work. Then, I will talk about our proposed methodology, followed by the performance evaluation. Finally, I will conclude with some future work. Let’s start.
  3. Parallel computing is widely used for extensive computation. Examples include galaxy formation simulation, climate change prediction, weather forecasting, and even traffic simulation on busy roads. All such modeling, simulation, and experimentation of complex real-world phenomena demand rigorous computing. Hence, we need parallel computing. Let’s see how parallel computing is implemented.
  4. Parallel computing can be implemented in three architectures: cluster, cloud, and grid. Cluster machines are connected by a local area network, whereas clouds and grids are geographically distributed. Hence, clusters are tightly coupled, and clouds and grids are loosely coupled. Here are some snapshots of a large cluster and a home-made cluster. Let’s see how a cluster works.
  5. To parallelize and distribute jobs across a cluster, a programming model named MapReduce is increasingly being used. MapReduce clusters improve performance and availability at the expense of high energy consumption. Hence, operational costs (e.g., electricity bills) increase, and so do CO2 emissions, which have an adverse effect on the environment. Therefore, energy efficiency deserves attention in addition to performance improvement of MapReduce clusters. Let’s see the energy breakdown of a parallel computing architecture.
  6. Here is the breakdown of energy consumption by different components of a data center. We can see that 50% of the energy is required to cool the system. Note that a data center is an implementation of a cloud or grid that is constructed using clusters of machines. Now, we will review the existing power-saving approaches.
  7. Zhang et al. propose a homogeneous cloud solution that trades off computational power against waiting time. They consider the cost of turning machines on and off and the fluctuation of energy prices. However, they do not consider cooling power. This is a cloud solution; let’s see a cluster one.
  8. Sunuwar et al. propose a heterogeneous MapReduce cluster solution that trades off computing power against throughput. They restrict the CPU usage of slave nodes and address the data unavailability of a MapReduce cluster by keeping all machines powered on. However, they do not consider cooling power. Let’s see another solution that considers cooling power.
  9. Abbasi et al. propose a homogeneous cloud solution that trades off total power (including cooling power) against response time. They use a thermodynamic model of the data center. However, they do not consider external weather data. Note that although cooling energy minimization of spatially distributed machines in a cloud needs no external weather data, these data are required for the minimization task in a co-located cluster environment. Therefore, a novel method is needed for cooling energy minimization in computing clusters.
  10. We propose a machine-learning-based approach named Green MapReduce Cluster (GMC) that predicts the number of machines for minimum total energy consumption of a MapReduce cluster, including both computational energy and cooling energy, while considering weather conditions with minimal overhead. Let’s see the block diagram of our proposed method.
  11. GMC has three modules: sensing, learning, and capacity provisioning. The sensing module senses the room temperature, the environment temperature, and the power consumption of both computing and cooling machines. It also collects the start and end times of each job. It then feeds these data to the learning module as the training data set. The learning module uses the sensed data to predict response time, computational power, and cooling power. It then generates a distribution of predicted total energy over all candidate numbers of machines and returns the number of machines that results in the minimum predicted total energy. The capacity provisioning module adjusts the cluster size by turning machines on or off to reach this optimum number. Let’s see the configuration of our experimental setup.
  12. We have a cluster of 30 machines, where we use Apache Hadoop to implement MapReduce. We have 29 Core 2 Duo slave nodes and one Core i5 master node. We use 32-bit Ubuntu 14.04 and Hadoop 1.0.3 on all machines. We have three split-type 2-ton ACs. Let’s see some snapshots of the test-bed implementation.
  13. Here are some snapshots of our test-bed. We use a DHT temperature and humidity sensor, and a web API to collect outdoor weather data. We use an Arduino energy monitor to collect power consumption data of CPUs and ACs. Let’s see the accuracy of our prediction tasks.
  14. We can see that the indoor temperature is somewhat close to the expected temperature. Let’s see the total energy prediction.
  15. We can see that response time decreases exponentially as the number of machines increases, and it increases with data size. Using these training data, 1-nearest neighbors can predict the response time of an incoming job with about 87% accuracy. We use Auto-WEKA to select the learning models and their parameters. Next, we will see the computational power prediction.
  16. We can see that computational power increases with the number of machines and the data size. In the test phase, Auto-WEKA selects a support vector machine for regression as the best model. It predicts the CPU power with 98.6% accuracy. Now, we will see the cooling power prediction.
  17. Cooling power does not follow a clear trend; rather, it depends on the weather. Because cooling power is tied to the environment, it is highly non-linear and depends on various factors that we do not consider. Nevertheless, we can see that the cooling power in August is much higher than in February, since the temperature in August is higher. In the testing phase, we use additive regression with random forest, as suggested by Auto-WEKA, and get 67.76% accuracy. We believe the accuracy will increase with more training data. Let’s see the environmental condition during training data collection.
  18. By multiplying the response time by the total power, we get the total energy. We can see that the accuracy of total energy prediction is about 97%. Let’s see the working process of GMC.
  19. The working process is very simple. After a job arrives, GMC predicts the optimum number of machines using historical data. Then it adjusts the cluster size and allows the job to finish. After the job finishes, it collects all sensor data and stores them in a database. Now, we will see the experimental evaluation.
  20. We can see that GMC degrades performance slightly, in terms of response time and throughput, compared to the static method, which provides the best performance. On average, GMC degrades response time by 4.51% and throughput by 5.71% compared to the static method. In terms of throughput, GMC performs well compared to other green methods and 42.8% degradation over static method. Therefore, among all alternatives, GMC provides the maximum energy efficiency in exchange for minimal performance degradation. Now, it is time to conclude.
  21. We compare our method with two state-of-the-art green methods and the naïve (static) method. We can see that GMC reduces total energy in all cases, on average by 19% over Zhang et al. and 47% over Sunuwar et al. In terms of total energy per bit, GMC achieves the best trade-off in all cases, on average by 29% over Zhang et al. and 55% over Sunuwar et al. Now, we will see the performance comparison among these methods.
  22. MapReduce clusters consume a huge amount of energy, including computation and cooling energy. However, few studies consider cooling energy in energy efficiency. We provide a greening method that considers both energy components with minimal overhead and outperforms the other alternatives. We plan to simulate GMC in a large heterogeneous cluster and use a modern classifier in cooling power prediction. We also want to use many-objective optimization techniques to optimize all performance terms. That’s all. Thank you.
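
Note 15 says response time is predicted with 1-nearest neighbors from historical data. A minimal pure-Python sketch of that idea follows. It is an assumption-laden illustration, not the authors' Auto-WEKA model: the feature choice (number of machines, data size), the unscaled Euclidean distance, the function name `predict_1nn`, and every value in `history` are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

# One past job: ((number of machines, data size), observed response time).
# Features and units are assumptions made for this sketch.
Sample = Tuple[Tuple[float, ...], float]


def predict_1nn(history: List[Sample], query: Sequence[float]) -> float:
    """Return the response time of the stored job closest to `query`.

    Closeness is plain Euclidean distance on the raw features; a real
    setup would likely normalize features first (not done here).
    """
    if not history:
        raise ValueError("history is empty")
    _, target = min(history, key=lambda s: math.dist(s[0], query))
    return target


# Hypothetical job history; NOT data from the test-bed.
history: List[Sample] = [
    ((10.0, 1.0), 900.0),
    ((20.0, 1.0), 500.0),
    ((29.0, 1.0), 450.0),
    ((20.0, 2.0), 950.0),
]

if __name__ == "__main__":
    # A new job with 22 machines and data size 1.0 is nearest to (20, 1.0).
    print(predict_1nn(history, (22.0, 1.0)))
```

Because 1-nearest neighbors simply reuses the closest past observation, it needs no training step, which fits the working process in note 19 where each finished job is appended to the stored history.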