EEDC 34330
Execution Environments for Distributed Computing
European Master in Distributed Computing – EMDC

Automatic Energy-Aware Scheduling
A GREEN Project

Group members:
Maria Stylianou – marsty5@gmail.com
Georgia Christodoulidou – geochris71@gmail.com

Outline

● Problem Statement
● Green500 List
● Automatic Energy-Aware Scheduling
● Conclusions

Problem Statement

Energy costs dominate!

Performance = Speed

Bad Effects: Reliability, Availability, Usability

→ Huge increase in total cost for maintaining a data center

The Green500 List


● Description
● Top10 supercomputers
● Trends for energy consumption decrease

Description

● Started in April 2005
● Ranking of the most energy-efficient supercomputers in the world
● Aim
    → Raise awareness of other performance metrics
      ● Performance per watt
      ● Energy efficiency for improved reliability
    → Encourage “greener” supercomputers
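
As a rough illustration of the performance-per-watt metric, the sketch below ranks a few systems by MFLOPS per watt. The system names and all figures are invented for the example and are not taken from the Green500 list.

```python
from dataclasses import dataclass


@dataclass
class System:
    name: str
    mflops: float   # sustained performance in MFLOPS (hypothetical value)
    watts: float    # average power draw in watts (hypothetical value)

    @property
    def mflops_per_watt(self) -> float:
        return self.mflops / self.watts


def rank_by_efficiency(systems):
    """Return the systems sorted from most to least energy-efficient."""
    return sorted(systems, key=lambda s: s.mflops_per_watt, reverse=True)


# Hypothetical systems, not real Green500 entries.
systems = [
    System("fast-but-hungry", mflops=8_000_000, watts=10_000),
    System("small-and-lean", mflops=2_000_000, watts=1_000),
    System("middle", mflops=4_000_000, watts=4_000),
]

ranking = rank_by_efficiency(systems)
for s in ranking:
    print(f"{s.name}: {s.mflops_per_watt:.0f} MFLOPS/W")
```

Under these made-up numbers the fastest system ends up last, which is the point of ranking by efficiency rather than by raw speed.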

Top10 Supercomputers




[Figure: Top 10 of the Green500 list, November 2011]
Retrieved from http://www.green500.org/lists/2011/11/top/list.php

Trends for energy consumption decrease

● Aggregate many low-power processors
● Use energy-efficient accelerators from the gaming market

No use of automatic energy-based scheduling!

Automatic Energy-Aware Scheduling

● Problem Restatement
● Energy Management Technologies
    ● How to address the problem
    ● Server Virtualization
    ● Additional Help
● What's in the market

Problem Restatement

● Previously said: Energy costs dominate!
● Peak loads are handled by adding servers
    → Servers are underutilized

“the average server utilization varies between 11% and 50% for workloads from sports, e-commerce, financial, and Internet proxy clusters.”

Energy Management Technologies
● Awareness
    ● Energy consumption in data centers
    ● Substantial carbon footprint

Solutions
● Hardware Level: build energy efficiency into components & systems design
● System Level: manage power consumption of servers & systems, adapting to changing conditions in the workload

How to address the problem

  Power-aware dynamic app placement!




  This is...
  Automatic Energy-aware scheduling!


Server Virtualization

● Appeared in the 1960s
● Disruptive business model
● Aim: Workload consolidation
    → Reduce the energy costs

Server Virtualization
● P1: Servers are heavily underutilized
    → Static consolidation of workloads
    → Reduction of servers

Reference [1]
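
Static consolidation can be sketched as a bin-packing problem. The example below uses a first-fit-decreasing heuristic, which is an assumption chosen for illustration and not necessarily the method of [1]; the workloads are hypothetical CPU demands, each given as a fraction of one server.

```python
def consolidate(demands, capacity=1.0):
    """Pack workloads onto servers with the first-fit-decreasing heuristic.

    demands  -- CPU demand of each workload, as a fraction of one server
    capacity -- usable capacity of a single server
    Returns a list of servers, each a list of the demands placed on it.
    """
    servers = []
    for demand in sorted(demands, reverse=True):
        if demand > capacity:
            raise ValueError(f"workload {demand} does not fit on any server")
        for server in servers:
            # Small tolerance guards against floating-point rounding.
            if sum(server) + demand <= capacity + 1e-9:
                server.append(demand)
                break
        else:
            # No running server has room: start a new one.
            servers.append([demand])
    return servers


# Six hypothetical workloads, each originally on its own underutilized server.
demands = [0.2, 0.5, 0.1, 0.4, 0.3, 0.25]
placement = consolidate(demands)
print(f"{len(demands)} servers before, {len(placement)} after")
```

The placement is computed once and then left alone, which is what makes it static; it does not react to later changes in the workload.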
Server Virtualization

● P2: Servers are underutilized for long periods each day
    → Consolidation of workloads
    → Servers in a low-power state

Reference [1]
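
To see why moving idle servers into a low-power state can pay off, consider a simple linear power model in which an active server draws a fixed idle power plus a part proportional to its utilization. The model and every wattage figure below are assumptions for illustration, not measurements from [1].

```python
def server_power(utilization, idle_watts=150.0, peak_watts=250.0):
    """Power draw of one active server under an assumed linear model."""
    return idle_watts + (peak_watts - idle_watts) * utilization


def cluster_power(utilizations, sleep_watts=10.0):
    """Total power; servers with zero utilization are assumed to be asleep."""
    return sum(
        sleep_watts if u == 0 else server_power(u)
        for u in utilizations
    )


# Four servers at 20% utilization each, versus the same load on one server.
spread = cluster_power([0.2, 0.2, 0.2, 0.2])
consolidated = cluster_power([0.8, 0, 0, 0])
print(f"spread: {spread:.0f} W, consolidated: {consolidated:.0f} W")
```

Under this model the saving comes almost entirely from the idle power of the three servers that no longer need to stay fully on.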
Server Virtualization

● P3: Low resource utilization of applications
● P4: Applications have complementary resource behavior
    → Dynamic consolidation of workloads
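
Complementary behavior can be illustrated with two hypothetical CPU traces: when the peaks of two applications do not coincide, the peak of their combined demand is lower than the sum of their individual peaks, so both may share one server. The traces are invented for the example.

```python
def combined_peak(trace_a, trace_b):
    """Peak of the summed demand of two applications sharing a server."""
    return max(a + b for a, b in zip(trace_a, trace_b))


# Hypothetical CPU demand (fraction of one server) over six time slots.
day_app = [0.7, 0.6, 0.5, 0.2, 0.1, 0.1]    # busy in the early slots
night_app = [0.1, 0.1, 0.2, 0.5, 0.6, 0.7]  # busy in the late slots

sum_of_peaks = max(day_app) + max(night_app)
peak_of_sum = combined_peak(day_app, night_app)
fits_together = peak_of_sum <= 1.0

print(f"sum of peaks: {sum_of_peaks:.1f}, peak of sum: {peak_of_sum:.1f}")
print("both fit on one server:", fits_together)
```

Sizing by individual peaks would call for two servers here, while sizing by the combined trace needs only one.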




Server Virtualization
 Scheduling policies
 ●   Random: assigns tasks to servers at random
     → only if the task fits into the server
 ●   Round Robin: assigns a task to each available node in turn
     → maximizes the number of resources given to a task
     → implies sparse usage of the resources
 ●   Backfilling: fills machines that are already turned on before starting offline ones
 ●   Dynamic Backfilling: additionally able to move tasks between machines
     → provides a higher consolidation level
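
The difference between Round Robin and Backfilling can be sketched as follows. This is a minimal illustration under stated assumptions (a single CPU dimension, hypothetical task sizes, identical servers), not the implementation evaluated in the cited work; Dynamic Backfilling would additionally migrate tasks that are already running, which is left out here.

```python
def round_robin(tasks, n_servers, capacity=1.0):
    """Give each task to the next server in turn, skipping servers where it does not fit."""
    loads = [0.0] * n_servers
    nxt = 0
    for task in tasks:
        for step in range(n_servers):
            i = (nxt + step) % n_servers
            if loads[i] + task <= capacity + 1e-9:
                loads[i] += task
                nxt = (i + 1) % n_servers
                break
        else:
            raise RuntimeError("task does not fit on any server")
    return loads


def backfilling(tasks, n_servers, capacity=1.0):
    """Fill servers that are already on (load > 0) before turning on an offline one."""
    loads = [0.0] * n_servers
    for task in tasks:
        powered_on = [i for i in range(n_servers) if loads[i] > 0]
        offline = [i for i in range(n_servers) if loads[i] == 0]
        for i in powered_on + offline:
            if loads[i] + task <= capacity + 1e-9:
                loads[i] += task
                break
        else:
            raise RuntimeError("task does not fit on any server")
    return loads


def servers_on(loads):
    """Number of servers that host at least one task."""
    return sum(1 for load in loads if load > 0)


tasks = [0.3, 0.2, 0.4, 0.1, 0.3, 0.2]  # hypothetical CPU demands
rr = round_robin(tasks, n_servers=4)
bf = backfilling(tasks, n_servers=4)
print("round robin servers on:", servers_on(rr))
print("backfilling servers on:", servers_on(bf))
```

With these task sizes Round Robin spreads the load over all four servers, while Backfilling keeps two of them offline.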



Server Virtualization
 ● Benefits
     ● More efficient utilization of hardware
     ● Reduced floor space
     ● Reduced facilities management costs
     ● Hides the heterogeneity in server hardware
     ● Makes apps more portable/resilient to hardware changes

Additional Help – Hardware Level

  Cooling
 ● Automatic Air Cooling
 ● Water Cooling
     “water as a coolant has the ability to capture heat about 4,000 times more efficiently than air” ~IBM
     → Aquasar Supercomputer – IBM Research Zurich
       Use of powerful chip water coolers
       → the water does not need to be chilled to low temperatures

Additional Help – System Level

    Machine Learning
● Scheduling information not available → use predictive methods to model the missing information
● Dynamic Backfilling Scheduling Policy
    [Figure: 1st step, 2nd step]
    → Replace static data with estimated data
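
Replacing static data with estimated data might look like the sketch below: a placement check that uses a predicted CPU share instead of a fixed, worst-case reservation. The slides do not say which predictive method is used, so the least-squares line fitted here is an assumption, and the history values and the static reservation are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical history: requests per second -> observed CPU share of a task.
requests = [10, 20, 30, 40, 50]
cpu_used = [0.12, 0.21, 0.33, 0.40, 0.52]

slope, intercept = fit_line(requests, cpu_used)


def estimated_cpu(req_rate):
    """Predicted CPU share for a task at the given request rate."""
    return slope * req_rate + intercept


def fits(server_load, req_rate, capacity=1.0):
    """Placement check that uses the estimate instead of a static reservation."""
    return server_load + estimated_cpu(req_rate) <= capacity


static_reservation = 0.6  # hypothetical fixed, worst-case reservation
fits_static = 0.5 + static_reservation <= 1.0
fits_estimated = fits(0.5, 25)

print(f"estimate at 25 req/s: {estimated_cpu(25):.2f}")
print("fits with static data:   ", fits_static)
print("fits with estimated data:", fits_estimated)
```

With the estimate, the scheduler can place the task on a half-loaded server that the static reservation would have ruled out, which is how prediction raises the consolidation level.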
What's in the market
● VMturbo
    ● Created in 2009
    ● Aim: real-time Intelligent Workload Management solution for Cloud & Virtualized environments
    ● Overall strategy: replace manual, partitioned management with scalable, automated, and unified resource & performance management
    ● Use of economic techniques for IT resource management
        ● Economic Scheduling Engine: dynamically adjusts resource allocation

Conclusions
● Automatic energy-based scheduling
     → is a recent area
     → should be considered more by researchers
     → should become a target for Top10 supercomputers → even better results!
     → Server Virtualization is an efficient way of reducing energy costs

References
1. G. Dasgupta, A. Sharma, A. Verma, A. Neogi, R. Kothari, “Workload Management for Power Efficiency in Virtualized Data Centers”, Communications of the ACM, 54:7, July 2011.
2. The Green500, retrieved on 9th May 2012, http://www.green500.org.
3. J. Ll. Berral, Í. Goiri, R. Nou, F. Julià, J. Guitart, R. Gavaldà, J. Torres, “Towards energy-aware scheduling in data centers using machine learning”, in Proceedings of the 1st International Conference on Energy-Efficient Computing and Networking, Germany, April 2010.
4. IBM builds water-cooled processor for Zurich supercomputer, retrieved on 10th May 2012, http://www.computerweekly.com/feature/IBM-builds-water-cooled-processor-for-Zurich-supercomputer.
5. IBM's Water-Cooled Aquasar Supercomputer Uses Waste Heat to Warm Dorms, retrieved on 10th May 2012, http://www.popsci.com/technology/article/2010-04/ibms-water-cooled-supercomputer-could-cut-energy-costs.
6. VMturbo: Intelligent Workload Management for Cloud and Virtualized Environments, retrieved on 10th May 2012, http://www.vmturbo.com/.
7. Operations Management in the Age of Virtualization, a VMturbo whitepaper.



