glideinWMS training @ UCSD           glideinWMS architecture                     by Igor Sfiligoi (UCSD)UCSD Jan 17th 2012...
Outline                     ●   A high level overview                         of the glideinWMS                     ●   De...
glideinWMS                      glideinWMS                     from 10k feetUCSD Jan 17th 2012       glideinWMS architectu...
Refresher - Condor ●   A Condor pool is composed of 3 pieces                        Central manager                       ...
What is a glidein? ●   A glidein is just a properly configured     execution node submitted as a Grid job                 ...
What is glideinWMS? ●   glideinWMS is an automated tool for submitting     glideins on demand                        Centr...
glideinWMS architecture ●   glideinWMS has 3 logical piecesFrontend domain Monitor                                 Submit ...
glideinWMS architecture ●   glideinWMS has 3 logical pieces      ●   glidein_startup – Configures and starts              ...
Cardinality ●   N-to-M relationship      ●   Each Frontend can talk to many Factories      ●   Each Factory may serve many...
Many operators ●   Factory and Frontend are usually operated     by different people ●   Frontends VO specific      ●   Op...
glideinWMS               A (sort of) detailed view of                     glidein_startupUCSD Jan 17th 2012        glidein...
Refresher – glideinWMS arch. ●   glidein_startup configures and starts Condor                     Monitor     Submit node ...
glidein_startup tasks ●   Validate node (environment) ●   Download Condor binaries                        Performed       ...
glidein_startup plugins ●   Config files and scripts loaded via HTTP      ●   From both the factory and the frontend Web s...
glidein_startup scripts ●   Standard plugins      ●   Basic Grid node validation (certs, disk space, etc.)      ●   Setup ...
glideinWMS          A (sort of) detailed view of the                     glidein factoryUCSD Jan 17th 2012        glideinW...
Refresher – glideinWMS arch. ●   The factory knowns about the grid and     submits glideins                               ...
Glidein factory ●   Glidein factory knows how to contact sites      ●   List in a local config      ●   Only trusted and t...
Glidein factory role ●   The glidein factory is just a slave      ●   The frontend(s) tell it how many glideins          t...
Factory collector ●    The factory collector handles all communication                                                    ...
Frontends ●   The factory admin decides     which Frontends to serve                                                      ...
Glidein submission ●   The glidein factory (entry) uses     Condor-G to submit glideins      ●   Condor-G does the heavy l...
Credentials/Proxy ●   Proxy typically provided by the frontend      ●   Although the factory can provide a default one (ra...
glideinWMS          A (sort of) detailed view of the                     VO frontendUCSD Jan 17th 2012      glideinWMS arc...
Refresher – glideinWMS arch. ●   The frontend monitors the user Condor pool,     does the matchmaking and requests glidein...
VO frontend ●   The VO frontend is the brain     of a glideinWMS-based pool      ●   Like a site-level “negotiator” VO dom...
Two level matchmaking ●   The frontend triggers glidein submission      ●   The “regular” negotiator matches jobs to glide...
Frontend logic ●   The glideinWMS glidein request logic     is based on the principle on “constant pressure”      ●   Fron...
Frontend logic ●   Frontend matches job attrs against entry attrs      ●   It then counts the matched idle jobs      ●   A...
Frontend config ●   The frontend owns the “glidein proxy”      ●   And delegates it to the factory(s)          when reques...
glideinWMS                      And the                     summaryUCSD Jan 17th 2012    glideinWMS architecture   31
Summary ●   Glideins are just properly configured Condor     execute nodes submitted as Grid jobs ●   The glideinWMS is a ...
Pointers ●   glideinWMS development team is reachable at     glideinwms-support@fnal.gov ●   The official project Web page...
Acknowledgments ●   The glideinWMS is a CMS-led project     developed mostly at FNAL, with contributions     from UCSD and...
Upcoming SlideShare
Loading in …5
×

glideinWMS Architecture - glideinWMS Training Jan 2012

615 views

Published on

An overview of the glideinWMS Architecture. Part of the glideinWMS Training session held in Jan 2012 at UCSD.

Published in: Technology, Education
1 Comment
0 Likes
Statistics
Notes
  • Be the first to like this

No Downloads
Views
Total views
615
On SlideShare
0
From Embeds
0
Number of Embeds
2
Actions
Shares
0
Downloads
6
Comments
1
Likes
0
Embeds 0
No embeds

No notes for slide

glideinWMS Architecture - glideinWMS Training Jan 2012

  1. 1. glideinWMS training @ UCSD glideinWMS architecture by Igor Sfiligoi (UCSD)UCSD Jan 17th 2012 glideinWMS architecture 1
  2. 2. Outline ● A high level overview of the glideinWMS ● Description of the componentsUCSD Jan 17th 2012 glideinWMS architecture 2
  3. 3. glideinWMS glideinWMS from 10k feetUCSD Jan 17th 2012 glideinWMS architecture 3
  4. 4. Refresher - Condor ● A Condor pool is composed of 3 pieces Central manager Execution node Collector Execution node Negotiator Submit node Execution node Submit node Execution node Submit node Execution node Schedd Startd JobUCSD Jan 17th 2012 glideinWMS architecture 4
  5. 5. What is a glidein? ● A glidein is just a properly configured execution node submitted as a Grid job Central manager glidein Execution node Collector glidein Execution node Negotiator Submit node Submit node glidein Execution node Submit node Execution node glidein Schedd Startd JobUCSD Jan 17th 2012 glideinWMS architecture 5
  6. 6. What is glideinWMS? ● glideinWMS is an automated tool for submitting glideins on demand Central manager glidein Execution node Collector CREAM glidein Execution node Negotiator Submit node Submit node glidein Execution node Submit node Execution node glidein Schedd Startd Globus Job glideinWMSUCSD Jan 17th 2012 glideinWMS architecture 6
  7. 7. glideinWMS architecture ● glideinWMS has 3 logical piecesFrontend domain Monitor Submit node Configure Condor Submit node Condor G.N. Frontend node Submit node Worker node Frontend Central manager glidein_startup Match Request CREAM Startd glideins Factory node Condor glidein Execution node Globus Factory glidein Execution node Submit glideinsUCSD Jan 17th 2012 glideinWMS architecture 7
  8. 8. glideinWMS architecture ● glideinWMS has 3 logical pieces ● glidein_startup – Configures and starts Condor execution daemons Runtime environment discovery and validation ● Factory – Knows about the sites and does the submission Grid knowledge and troubleshooting ● Frontend – Knows about user jobs and requests glideins Site selection logic and job monitoringUCSD Jan 17th 2012 glideinWMS architecture 8
  9. 9. Cardinality ● N-to-M relationship ● Each Frontend can talk to many Factories ● Each Factory may serve many Frontends VO Frontend VO Frontend Glidein Factory Collector Schedd Negotiator Collector Startd Startd Schedd User job User job Negotiator Startd Glidein Factory User jobUCSD Jan 17th 2012 glideinWMS architecture 9
  10. 10. Many operators ● Factory and Frontend are usually operated by different people ● Frontends VO specific ● Operated by VO admins ● Each sets policies for its users ● Factories generic ● Do not need to be affiliated with any group ● Factory ops main task is Grid monitoring and troubleshootingUCSD Jan 17th 2012 glideinWMS architecture 10
  11. 11. glideinWMS A (sort of) detailed view of glidein_startupUCSD Jan 17th 2012 glideinWMS architecture 11
  12. 12. Refresher – glideinWMS arch. ● glidein_startup configures and starts Condor Monitor Submit node Condor Configure Submit node Condor G.N. Frontend node Submit node Worker node Frontend Central manager glidein_startup Match CREAM Startd Request glideins Factory node Condor glidein Execution node Globus Factory glidein Execution node Submit glideinsUCSD Jan 17th 2012 glideinWMS architecture 12
  13. 13. glidein_startup tasks ● Validate node (environment) ● Download Condor binaries Performed by plugins ● Configure Condor ● Start Condor daemon(s) ● Collect post-mortem monitoring info ● CleanupUCSD Jan 17th 2012 glideinWMS architecture 13
  14. 14. glidein_startup plugins ● Config files and scripts loaded via HTTP ● From both the factory and the frontend Web servers ● Can use local Web proxy (e.g. Squid) ● Mechanism tamper proof and cache coherent Factory node glidein_startup HTTPd ● Load files from factory Web Squid ● Load files from frontend Web Frontend node ● Run executables ● Start Condor Startd HTTPd ● CleanupUCSD Jan 17th 2012 glideinWMS architecture 14
  15. 15. glidein_startup scripts ● Standard plugins ● Basic Grid node validation (certs, disk space, etc.) ● Setup Condor (glexec, CCB, etc.) ● VO provided plugins ● Optional, but can be anything ● CMS@UCSD checks for CMS SW ● Factory admin can also provide them ● Details about the plugins can be found at http://tinyurl.com/glideinWMS/doc.prd/factory/custom_scripts.htmlUCSD Jan 17th 2012 glideinWMS architecture 15
  16. 16. glideinWMS A (sort of) detailed view of the glidein factoryUCSD Jan 17th 2012 glideinWMS architecture 16
  17. 17. Refresher – glideinWMS arch. ● The factory knowns about the grid and submits glideins Configure Submit node Condor G.N. Frontend node Monitor Submit node Condor Worker node Frontend Central manager glidein_startup Match CREAM Startd Request glideins Factory node Condor glidein Execution node Globus Factory glidein Execution node Submit glideinsUCSD Jan 17th 2012 glideinWMS architecture 17
  18. 18. Glidein factory ● Glidein factory knows how to contact sites ● List in a local config ● Only trusted and tested sites should be included ● For each site (called entry) ● Contact info (Node, grid type, jobmanager) ● Site config (startup dir, glexec, OS type, …) ● VOs supported ● Other attributes (Site name, closest SE, ...) ● Admin maintained, periodically compared to BDII http://tinyurl.com/glideinWMS/doc.prd/factory/configuration.htmlUCSD Jan 17th 2012 glideinWMS architecture 18
  19. 19. Glidein factory role ● The glidein factory is just a slave ● The frontend(s) tell it how many glideins to submit where ● Once the glideins start to run, they report to the VO collector and the factory is not involved ● The communication is based on ClassAds ● The factory has a Collector for this purpose Frontend node Factory node Frontend Collector FactoryUCSD Jan 17th 2012 glideinWMS architecture 19
  20. 20. Factory collector ● The factory collector handles all communication Factory node Frontend node Find sites Collector Frontend Request glideins . Advertise Retrieve . entry orders . Entry ... Entry Frontend node Frontend Spawn Factory http://tinyurl.com/glideinWMS/doc.prd/factory/design_data_exchange.htmlUCSD Jan 17th 2012 glideinWMS architecture 20
  21. 21. Frontends ● The factory admin decides which Frontends to serve Frontend node ● Valid proxy Frontend with known DN needed to talk to the collector ● Factory config has further Factory node fine grained controls Collector Frontend node Factory FrontendUCSD Jan 17th 2012 glideinWMS architecture 21
  22. 22. Glidein submission ● The glidein factory (entry) uses Condor-G to submit glideins ● Condor-G does the heavy lifting ● The factory just monitors the progress glidein glidein Factory node CREAM Submit Entry Schedd . Monitor . . . . . glidein Submit Schedd Globus Entry glidein MonitorUCSD Jan 17th 2012 glideinWMS architecture 22
  23. 23. Credentials/Proxy ● Proxy typically provided by the frontend ● Although the factory can provide a default one (rarely used) ● Proxy delivered encrypted in the ClassAd ● Factory (entry) provides the encryption key (PKI) ● Proxy stored on disk ● Each VO mapped to a different UID Frontend node Factory node Get key Frontend Collector Schedd Deliver proxy (encrypted) EntryUCSD Jan 17th 2012 glideinWMS architecture 23
  24. 24. glideinWMS A (sort of) detailed view of the VO frontendUCSD Jan 17th 2012 glideinWMS architecture 24
  25. 25. Refresher – glideinWMS arch. ● The frontend monitors the user Condor pool, does the matchmaking and requests glideinsFrontend domain Configure Submit node Condor G.N. Frontend node Monitor Submit node Condor Worker node Frontend Central manager glidein_startup Match CREAM Startd Request glideins Factory node Condor glidein Execution node Globus Factory glidein Execution node Submit glideinsUCSD Jan 17th 2012 glideinWMS architecture 25
  26. 26. VO frontend ● The VO frontend is the brain of a glideinWMS-based pool ● Like a site-level “negotiator” VO domain Find Find Submit node idle jobs entries Frontend node Monitor Submit node Frontend Condor Match Central manager Match Request Request glideins glideins Factory nodeUCSD Jan 17th 2012 glideinWMS architecture 26
  27. 27. Two level matchmaking ● The frontend triggers glidein submission ● The “regular” negotiator matches jobs to glideins Central manager glidein Execution node Collector CREAM glidein Execution node Negotiator Submit node Schedd glidein Execution node glidein Execution node Startd Globus Job Frontend FactoryUCSD Jan 17th 2012 glideinWMS architecture 27
  28. 28. Frontend logic ● The glideinWMS glidein request logic is based on the principle on “constant pressure” ● Frontend requests a certain number of “idle glideins” in the factory queue at all times ● It does not request a specific number of glideins ● This is done due to the asynchronous nature of the system ● Both the factory and the frontend are in a polling loop and talk to each other indirectlyUCSD Jan 17th 2012 glideinWMS architecture 28
  29. 29. Frontend logic ● Frontend matches job attrs against entry attrs ● It then counts the matched idle jobs ● A fraction of this number becomes the “pressure requests” (up to 1/3) ● The matchmaking expression is defined by the frontend admin ● Not the user ● Debatable if it is better or worse, but it does reduce frontend code complexityUCSD Jan 17th 2012 glideinWMS architecture 29
  30. 30. Frontend config ● The frontend owns the “glidein proxy” ● And delegates it to the factory(s) when requesting glideins ● Must keep it valid at all times (usually at OS level) ● The VO frontend can (and should) provide VO‑specific validation scripts ● The VO frontend can (and should) set the glidein start expression ● Used by the VO negotiator for final matchmakingUCSD Jan 17th 2012 glideinWMS architecture 30
  31. 31. glideinWMS And the summaryUCSD Jan 17th 2012 glideinWMS architecture 31
  32. 32. Summary ● Glideins are just properly configured Condor execute nodes submitted as Grid jobs ● The glideinWMS is a mechanism to automate glidein submission ● The glideinWMS is composed of three logical entities, two being actual services: ● Glidein factories know about the Grid ● VO frontend know about the users and drive the factoriesUCSD Jan 17th 2012 glideinWMS architecture 32
  33. 33. Pointers ● glideinWMS development team is reachable at glideinwms-support@fnal.gov ● The official project Web page is http://tinyurl.com/glideinWMS ● CMS frontend at UCSD http://glidein-collector.t2.ucsd.edu:8319/vofrontend/monitor/frontend_UCSD-v5_2/frontendStatus.html ● OSG glidein factory at UCSD http://hepuser.ucsd.edu/twiki2/bin/view/UCSDTier2/OSGgfactory http://glidein-1.t2.ucsd.edu:8319/glidefactory/monitor/glidein_Production_v4_1/factoryStatus.htmlUCSD Jan 17th 2012 glideinWMS architecture 33
  34. 34. Acknowledgments ● The glideinWMS is a CMS-led project developed mostly at FNAL, with contributions from UCSD and ISI ● The glideinWMS factory operations at UCSD is sponsored by OSG ● The funding comes from NSF, DOE and the UC systemUCSD Jan 17th 2012 glideinWMS architecture 34

×