
glideinWMS - The Larger Picture

This talk presents glideinWMS in a larger context, allowing you to understand what this product is all about.

Published in: Technology

glideinWMS - The Larger Picture

  1. glideinWMS training
     glideinWMS - The Larger Picture
     i.e. Is it something you would be interested in?
     by Igor Sfiligoi (UCSD)
  2. Why this talk?
     If you never heard of glideinWMS before, you likely have no idea if this is a product you would be interested in using.
     This talk presents glideinWMS in a larger context, allowing you to understand what this product is all about.
  3. The basics
     ● glideinWMS has been designed to address the needs of High Throughput Computing (HTC)
       ● Better known as batch processing
     ● In a nutshell, we are trying to facilitate the effective use of a large number of CPUs by a large number of users
  4. High Throughput Computing
     ● The basic premise of HTC is that there is always more demand than available CPUs
     ● We should make good use of those CPUs
       ● Keep them busy, ideally around the clock, all year
     ● Sustained utilization is thus more important than peak performance
       ● Measure of success is FLOPY = Floating Point Operations per Year, not FLOPS = Floating Point Operations per Second
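The FLOPY-versus-FLOPS point can be illustrated with a small calculation. This is only a sketch: the function name `flopy` and both cluster figures are hypothetical and do not come from the talk.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600


def flopy(peak_flops: float, utilization: float) -> float:
    """Floating point operations delivered in one year.

    peak_flops  -- peak rate of the resource, in operations per second
    utilization -- fraction of the year the CPUs are actually busy (0..1)
    """
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be between 0 and 1")
    return peak_flops * utilization * SECONDS_PER_YEAR


# Two hypothetical clusters (made-up numbers, for illustration only).
fast_but_idle = flopy(peak_flops=2e12, utilization=0.30)
slow_but_busy = flopy(peak_flops=1e12, utilization=0.95)

# The cluster with half the peak rate delivers more work over the year.
print(slow_but_busy > fast_but_idle)
```

Under these assumed numbers, the slower cluster wins on FLOPY because it is kept busy, which is the sense in which sustained utilization matters more than peak performance.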
  5. HTC from the user point of view
     ● As a side effect, users must be HTC-aware
     ● There are some negative aspects
       ● No interactive access, only process queuing
         – Usually referred to as user jobs
       ● Waiting in line to get access to CPUs
     ● But the payoff is potentially huge
       ● A single user can use 1000s of CPUs at a time
       ● Performing in a few days computations that would take several years on a single machine
  6. HTC in simplified picture
     [Figure: diagram with the labels "Repository" and "Scheduler", and the note "User scheduling usually not FIFO"]
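The slide only says that user scheduling is usually not FIFO; it does not name a policy. As one possible illustration, the sketch below uses a simple fair-share rule, which is an assumption made here and not something the talk specifies. The function name and the sample users and jobs are hypothetical.

```python
from collections import deque


def fair_share_order(queues: dict[str, list[str]]) -> list[str]:
    """Return the order in which queued jobs would start, one free CPU at a time.

    Each time a CPU frees up, the user who has started the fewest jobs
    so far goes next (ties are broken by user name).
    """
    pending = {user: deque(jobs) for user, jobs in queues.items() if jobs}
    started = {user: 0 for user in pending}
    order = []
    while pending:
        user = min(pending, key=lambda u: (started[u], u))
        order.append(pending[user].popleft())
        started[user] += 1
        if not pending[user]:
            del pending[user]
    return order


# alice submitted three jobs before bob submitted one.
queues = {"alice": ["a1", "a2", "a3"], "bob": ["b1"]}
print(fair_share_order(queues))
```

With strict FIFO, bob's job would wait behind all of alice's; with this toy rule it starts second.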
  7. HTC products
     ● There are many HTC products available
       ● Although most call themselves “batch systems”
     ● A non-exhaustive list:
       ● Condor
       ● PBS, with variants like Torque/Maui
       ● LSF
       ● SGE, also known as Oracle Grid Engine
  8. Why another system?
     ● All of the mentioned HTC systems assume full control of the compute resources (i.e. CPUs)
       ● And there are many places where this is the case
     ● glideinWMS was developed to support non-dedicated use of compute resources
       ● i.e. when CPUs are given to the system only for a limited duration at a time
  9. Non-dedicated resources
     ● In the past decade, two paradigms emerged
       ● Grid computing
       ● Cloud computing
     ● Both allow a user community to use compute resources they don't own
       ● Often called resource elasticity
     ● Managing a large number of Grid and Cloud resources by hand is impractical
       ● glideinWMS creates an HTC system using them
  10. Grid vs Cloud (a short summary)
     ● Grid computing is basically a federation of HTC clusters
       ● Thus recently called Distributed HTC
     ● Job queuing is a native paradigm
     ● (Commercial) Clouds are about leasing resources on a pay-as-you-go basis
       ● And they happen to use virtualization
     ● Instances expected to start almost immediately
     ● So-called “scientific clouds” are typically just Grid systems that use virtualization (and a different middleware stack)
  11. Grid vs Cloud (a short summary)
     (Same comparison as slide 10, with one callout added)
     ● glideinWMS currently optimized for the Grid model
  12. glideinWMS and the Grid (Cloud resources are used in a similar way)
     ● glideinWMS creates an overlay system on top of the various HTC clusters
     ● From the user community point of view, glideinWMS is a single HTC system
       ● Just a dynamic one
     ● glideinWMS completely automates the process
     [Figure: several boxes labeled "HTC", representing the underlying HTC clusters]
  13. Implementation and support
     ● glideinWMS is heavily based on Condor
       ● Essentially a thin layer on top of it
     ● Most of the software support thus coming from the Condor development team
       ● At University of Wisconsin – Madison
         http://research.cs.wisc.edu/condor/
     ● The glideinWMS-specific layer is supported by a team spanning Fermilab, UCSD and ISI
       http://tinyurl.com/glideinWMS
  14. glideinWMS and Condor
     ● Condor handles the HTC system
       ● Most Condor features thus available
     ● glideinWMS role limited to scheduling, configuring and starting the Condor process on the compute resources
     [Figure: diagram with the labels "HTC", "glideinWMS", "Condor", "CPU", "Handler", "User", "Job", "Condor", "Job", "Repository"]
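The division of labor described on this slide can be mimicked with a toy model: something starts a worker process on a leased CPU, and that worker then pulls user jobs from a central repository. The sketch below is not glideinWMS or Condor code and uses none of their APIs; `UserQueue`, `run_pilot`, `provision`, the site names and the `lease_slots` parameter (standing in for the limited duration of a lease) are all hypothetical.

```python
from collections import deque


class UserQueue:
    """Stands in for the job repository that holds the user jobs."""

    def __init__(self, jobs):
        self._jobs = deque(jobs)

    def next_job(self):
        """Hand out the next queued job, or None when the queue is empty."""
        return self._jobs.popleft() if self._jobs else None

    def __len__(self):
        return len(self._jobs)


def run_pilot(site, queue, lease_slots):
    """A worker started on a leased CPU at `site` pulls up to `lease_slots` jobs."""
    done = []
    for _ in range(lease_slots):
        job = queue.next_job()
        if job is None:
            break  # nothing left to run, give the CPU back early
        done.append((site, job))
    return done


def provision(queue, sites, lease_slots=2):
    """Keep starting workers on the sites, round-robin, while jobs remain queued."""
    log = []
    i = 0
    while len(queue) > 0:
        log.extend(run_pilot(sites[i % len(sites)], queue, lease_slots))
        i += 1
    return log


log = provision(UserQueue(["j1", "j2", "j3", "j4", "j5"]), ["siteA", "siteB"])
for site, job in log:
    print(site, job)
```

In this model the users only ever talk to the queue. Which site ran a job is decided when a worker shows up, which is why the overlay looks like a single, dynamic HTC system from the user side.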
  15. Summary
     ● glideinWMS is an HTC product
       ● i.e. enables effective use of a large number of CPUs by a large number of users
     ● glideinWMS creates an HTC system out of non-dedicated compute resources
       ● e.g. Grid and Cloud resources
     ● glideinWMS is heavily based on Condor
       ● thus benefits from the Condor team support
  16. Pointers
     ● glideinWMS development team is reachable at glideinwms-support@fnal.gov
     ● The official project Web page is http://tinyurl.com/glideinWMS
     ● OSG glidein factory at UCSD
       http://hepuser.ucsd.edu/twiki2/bin/view/UCSDTier2/OSGgfactory
       http://glidein-1.t2.ucsd.edu:8319/glidefactory/monitor/glidein_Production_v4_1/factoryStatus.html
  17. Acknowledgments
     ● This document was sponsored by grants from the US NSF and US DOE, and by the UC system