glideinWMS Architecture - glideinWMS Training Jan 2012


  1. glideinWMS training @ UCSD: glideinWMS architecture by Igor Sfiligoi (UCSD). UCSD, Jan 17th 2012
  2. Outline ● A high level overview of the glideinWMS ● Description of the components
  3. glideinWMS: glideinWMS from 10k feet
  4. Refresher - Condor ● A Condor pool is composed of 3 pieces: the central manager (Collector, Negotiator), submit nodes (Schedd) and execution nodes (Startd, running the job) [Diagram: one central manager, several submit nodes and several execution nodes]
  5. What is a glidein? ● A glidein is just a properly configured execution node submitted as a Grid job [Diagram: the Condor pool from the previous slide, with some of the execution nodes provided by glideins]
  6. What is glideinWMS? ● glideinWMS is an automated tool for submitting glideins on demand [Diagram: glideinWMS next to the Condor pool, with glideins reaching the execution nodes through CREAM and Globus]
  7. glideinWMS architecture ● glideinWMS has 3 logical pieces [Diagram labels: Frontend domain with submit nodes, central manager and Frontend node (Frontend); Factory node (Factory, Condor); CREAM and Globus; worker node (glidein_startup, Startd); glidein execution nodes. Arrow labels: Monitor, Match, Request glideins, Submit glideins, Configure Condor]
  8. glideinWMS architecture ● glideinWMS has 3 logical pieces ● glidein_startup – Configures and starts Condor execution daemons (runtime environment discovery and validation) ● Factory – Knows about the sites and does the submission (Grid knowledge and troubleshooting) ● Frontend – Knows about user jobs and requests glideins (site selection logic and job monitoring)
  9. Cardinality ● N-to-M relationship ● Each Frontend can talk to many Factories ● Each Factory may serve many Frontends [Diagram: two VO Frontends, each next to a Condor pool (Collector, Negotiator, Schedd, Startds running user jobs), and two Glidein Factories]
  10. Many operators ● Factory and Frontend are usually operated by different people ● Frontends are VO specific ● Operated by VO admins ● Each sets policies for its users ● Factories are generic ● Do not need to be affiliated with any group ● The main task of Factory ops is Grid monitoring and troubleshooting
  11. glideinWMS: A (sort of) detailed view of glidein_startup
  12. Refresher – glideinWMS arch. ● glidein_startup configures and starts Condor [Diagram: the glideinWMS architecture diagram from slide 7]
  13. glidein_startup tasks ● Validate node (environment) ● Download Condor binaries (Performed by plugins) ● Configure Condor ● Start Condor daemon(s) ● Collect post-mortem monitoring info ● Cleanup
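The task sequence on this slide can be sketched as a small driver that runs the steps in order, stops at the first failure, and always cleans up. This is only an illustration in Python, not the actual glidein_startup script; the name `run_startup` and the step callables are hypothetical.

```python
from typing import Callable, List, Tuple

Step = Tuple[str, Callable[[], bool]]


def run_startup(steps: List[Step], cleanup: Callable[[], None]) -> List[str]:
    """Run named steps in order, stop at the first failure, always clean up."""
    log: List[str] = []
    try:
        for name, step in steps:
            ok = step()
            log.append(f"{name}: {'ok' if ok else 'FAILED'}")
            if not ok:
                break  # e.g. a failed validation must not start Condor
    finally:
        cleanup()
        log.append("cleanup: done")
    return log


# Hypothetical stand-ins for the tasks listed on the slide.
steps = [
    ("validate node", lambda: True),
    ("download Condor binaries", lambda: True),
    ("configure Condor", lambda: True),
    ("start Condor daemons", lambda: True),
    ("collect post-mortem info", lambda: True),
]
startup_log = run_startup(steps, cleanup=lambda: None)
```

Putting the cleanup in a `finally` clause means it runs whether the steps succeed, fail, or raise.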
  14. glidein_startup plugins ● Config files and scripts loaded via HTTP ● From both the factory and the frontend Web servers ● Can use local Web proxy (e.g. Squid) ● Mechanism tamper proof and cache coherent [Diagram: glidein_startup, through Squid, contacts the HTTPd on the Factory node and the HTTPd on the Frontend node; steps shown: Load files from factory Web, Load files from frontend Web, Run executables, Start Condor (Startd), Cleanup]
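The slide does not say how the download mechanism is made tamper proof and cache coherent. One common approach, assumed here purely for illustration, is to publish a digest for every file and to embed a content-derived tag in the file name, so a proxy such as Squid never serves a stale copy under the same URL. The function names below are hypothetical.

```python
import hashlib
import hmac
import os


def sha1_hex(content: bytes) -> str:
    """Hex digest of the file content."""
    return hashlib.sha1(content).hexdigest()


def is_untampered(content: bytes, expected_sha1: str) -> bool:
    """Accept a downloaded file only if it matches the published digest."""
    return hmac.compare_digest(sha1_hex(content), expected_sha1)


def versioned_name(name: str, content: bytes) -> str:
    """Embed a content tag in the name: new content means a new URL."""
    stem, ext = os.path.splitext(name)
    return f"{stem}.{sha1_hex(content)[:8]}{ext}"


# Hypothetical plugin script and its published digest.
script = b"#!/bin/sh\nexit 0\n"
published_digest = sha1_hex(script)
accepted = is_untampered(script, published_digest)
rejected = is_untampered(script + b"# modified in transit\n", published_digest)
```

In this sketch the digest list itself would have to come from a trusted source, for example as part of the glidein submission.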
  15. glidein_startup scripts ● Standard plugins ● Basic Grid node validation (certs, disk space, etc.) ● Setup Condor (glexec, CCB, etc.) ● VO provided plugins ● Optional, but can be anything ● CMS@UCSD checks for CMS SW ● Factory admin can also provide them ● Details about the plugins can be found at http://tinyurl.com/glideinWMS/doc.prd/factory/custom_scripts.html
  16. glideinWMS: A (sort of) detailed view of the glidein factory
  17. Refresher – glideinWMS arch. ● The factory knows about the grid and submits glideins [Diagram: the glideinWMS architecture diagram from slide 7]
  18. Glidein factory ● Glidein factory knows how to contact sites ● List in a local config ● Only trusted and tested sites should be included ● For each site (called entry) ● Contact info (Node, grid type, jobmanager) ● Site config (startup dir, glexec, OS type, …) ● VOs supported ● Other attributes (Site name, closest SE, ...) ● Admin maintained, periodically compared to BDII http://tinyurl.com/glideinWMS/doc.prd/factory/configuration.html
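The per-entry information listed on this slide can be pictured as a simple record. The sketch below is not the factory's configuration format (see the URL above for that); the class, field names and sample values are all hypothetical and only mirror the bullet points.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Entry:
    """One site (entry) as described on the slide; field names are made up."""
    name: str
    node: str                 # contact info: node to submit to
    grid_type: str            # contact info: grid type
    jobmanager: str           # contact info: jobmanager
    startup_dir: str          # site config
    use_glexec: bool          # site config
    os_type: str              # site config
    supported_vos: List[str]
    attrs: Dict[str, str] = field(default_factory=dict)  # site name, closest SE, ...


def entries_for_vo(entries: List[Entry], vo: str) -> List[Entry]:
    """Entries that declare support for the given VO."""
    return [e for e in entries if vo in e.supported_vos]


# Hypothetical sample data.
entries = [
    Entry("Site_A", "ce.site-a.example", "grid-type-1", "condor", "/tmp",
          True, "linux", ["cms"], {"closest_se": "se.site-a.example"}),
    Entry("Site_B", "ce.site-b.example", "grid-type-2", "pbs", "/scratch",
          False, "linux", ["cms", "atlas"]),
]
atlas_entries = [e.name for e in entries_for_vo(entries, "atlas")]
```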
  19. Glidein factory role ● The glidein factory is just a slave ● The frontend(s) tell it how many glideins to submit where ● Once the glideins start to run, they report to the VO collector and the factory is not involved ● The communication is based on ClassAds ● The factory has a Collector for this purpose [Diagram: the Frontend on the Frontend node talks to the Collector on the Factory node, which also hosts the Factory]
  20. Factory collector ● The factory collector handles all communication [Diagram: on the Factory node, the Factory spawns the Entry processes, which advertise the entry to the Collector and retrieve orders from it; the Frontends, on their own nodes, find sites and request glideins through the Collector] http://tinyurl.com/glideinWMS/doc.prd/factory/design_data_exchange.html
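The exchange through the factory collector can be mimicked with a tiny in-memory stand-in. This is a toy model, not Condor's ClassAd API: the `Collector` class, the ad attributes (`type`, `entry`, `idle_glideins`) and the site names are hypothetical. It only shows that the frontend and the entries never talk directly; both read and write ads in the collector.

```python
from typing import Dict, List

Ad = Dict[str, object]


class Collector:
    """In-memory stand-in for the factory Collector (hypothetical API)."""

    def __init__(self) -> None:
        self._ads: List[Ad] = []

    def advertise(self, ad: Ad) -> None:
        """Store a copy of one ad."""
        self._ads.append(dict(ad))

    def query(self, **constraints: object) -> List[Ad]:
        """Return the ads whose attributes equal all given constraints."""
        return [ad for ad in self._ads
                if all(ad.get(k) == v for k, v in constraints.items())]


collector = Collector()

# Factory side: each entry advertises itself.
collector.advertise({"type": "entry", "name": "Site_A", "vo": "cms"})
collector.advertise({"type": "entry", "name": "Site_B", "vo": "atlas"})

# Frontend side: find usable sites, then post a request for each of them.
sites = [ad["name"] for ad in collector.query(type="entry", vo="cms")]
for site in sites:
    collector.advertise({"type": "request", "entry": site, "idle_glideins": 5})

# Entry side: retrieve the orders addressed to this entry.
orders = collector.query(type="request", entry="Site_A")
```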
  21. Frontends ● The factory admin decides which Frontends to serve ● Valid proxy with known DN needed to talk to the collector ● Factory config has further fine grained controls [Diagram: two Frontend nodes, each running a Frontend, talking to the Collector on the Factory node]
  22. Glidein submission ● The glidein factory (entry) uses Condor-G to submit glideins ● Condor-G does the heavy lifting ● The factory just monitors the progress [Diagram: Entries on the Factory node submit to and monitor Schedds, which send glideins to CREAM and Globus sites]
  23. Credentials/Proxy ● Proxy typically provided by the frontend ● Although the factory can provide a default one (rarely used) ● Proxy delivered encrypted in the ClassAd ● Factory (entry) provides the encryption key (PKI) ● Proxy stored on disk ● Each VO mapped to a different UID [Diagram: the Frontend gets the key from the Collector on the Factory node and delivers the encrypted proxy, which reaches the Entry; a Schedd is also shown on the Factory node]
  24. glideinWMS: A (sort of) detailed view of the VO frontend
  25. Refresher – glideinWMS arch. ● The frontend monitors the user Condor pool, does the matchmaking and requests glideins [Diagram: the glideinWMS architecture diagram from slide 7]
  26. VO frontend ● The VO frontend is the brain of a glideinWMS-based pool ● Like a site-level “negotiator” [Diagram: in the VO domain, the Frontend monitors the submit nodes to find idle jobs, finds entries, matches the two, and requests glideins from the Factory nodes]
  27. Two level matchmaking ● The frontend triggers glidein submission ● The “regular” negotiator matches jobs to glideins [Diagram: Frontend and Factory next to the Condor pool; glideins reach the execution nodes through CREAM and Globus, and the Negotiator on the central manager matches jobs from the Schedd to the glidein Startds]
  28. Frontend logic ● The glideinWMS glidein request logic is based on the principle of “constant pressure” ● Frontend requests a certain number of “idle glideins” in the factory queue at all times ● It does not request a specific number of glideins ● This is done due to the asynchronous nature of the system ● Both the factory and the frontend are in a polling loop and talk to each other indirectly
  29. Frontend logic ● Frontend matches job attrs against entry attrs ● It then counts the matched idle jobs ● A fraction of this number becomes the “pressure requests” (up to 1/3) ● The matchmaking expression is defined by the frontend admin ● Not the user ● Debatable if it is better or worse, but it does reduce frontend code complexity
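The two "Frontend logic" slides can be condensed into a few lines of arithmetic. The sketch below is an illustration, not glideinWMS code: all function names, the job and entry attributes, and the round-up rule are assumptions. Only the "fraction of matched idle jobs, up to 1/3" rule and the idea of keeping a number of idle glideins in the factory queue come from the slides. The top-up rule on the factory side is likewise an assumed reading of "constant pressure".

```python
import math
from fractions import Fraction
from typing import Callable, Dict, List

Job = Dict[str, object]
Entry = Dict[str, object]


def count_matched_idle(jobs: List[Job], entry: Entry,
                       match: Callable[[Job, Entry], bool]) -> int:
    """Count idle jobs that the admin-defined match expression accepts."""
    return sum(1 for job in jobs
               if job.get("status") == "idle" and match(job, entry))


def pressure_request(matched_idle: int,
                     fraction: Fraction = Fraction(1, 3)) -> int:
    """Idle glideins to keep in the factory queue (rounding up is assumed)."""
    if not 0 < fraction <= Fraction(1, 3):
        raise ValueError("fraction must be in (0, 1/3]")
    return math.ceil(matched_idle * fraction)


def glideins_to_submit(requested_idle: int, idle_in_queue: int) -> int:
    """Assumed factory side: top the queue up to the requested idle level."""
    return max(0, requested_idle - idle_in_queue)


# Hypothetical match expression and data.
def wants_site(job: Job, entry: Entry) -> bool:
    return job.get("site") == entry.get("name")


entry = {"name": "Site_A"}
jobs = [{"status": "idle", "site": "Site_A"} for _ in range(4)]
jobs += [{"status": "running", "site": "Site_A"},
         {"status": "idle", "site": "Site_B"}]

matched = count_matched_idle(jobs, entry, wants_site)
requested = pressure_request(matched)
to_submit = glideins_to_submit(requested, idle_in_queue=1)
```

Because the request is a target queue depth rather than a one-off count, a lost or delayed polling cycle on either side just leaves the target unchanged for the next cycle.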
  30. Frontend config ● The frontend owns the “glidein proxy” ● And delegates it to the factory(s) when requesting glideins ● Must keep it valid at all times (usually at OS level) ● The VO frontend can (and should) provide VO‑specific validation scripts ● The VO frontend can (and should) set the glidein start expression ● Used by the VO negotiator for final matchmaking
  31. glideinWMS: And the summary
  32. Summary ● Glideins are just properly configured Condor execute nodes submitted as Grid jobs ● The glideinWMS is a mechanism to automate glidein submission ● The glideinWMS is composed of three logical entities, two being actual services: ● Glidein factories know about the Grid ● VO frontends know about the users and drive the factories
  33. Pointers ● glideinWMS development team is reachable at glideinwms-support@fnal.gov ● The official project Web page is http://tinyurl.com/glideinWMS ● CMS frontend at UCSD http://glidein-collector.t2.ucsd.edu:8319/vofrontend/monitor/frontend_UCSD-v5_2/frontendStatus.html ● OSG glidein factory at UCSD http://hepuser.ucsd.edu/twiki2/bin/view/UCSDTier2/OSGgfactory http://glidein-1.t2.ucsd.edu:8319/glidefactory/monitor/glidein_Production_v4_1/factoryStatus.html
  34. Acknowledgments ● The glideinWMS is a CMS-led project developed mostly at FNAL, with contributions from UCSD and ISI ● The glideinWMS factory operations at UCSD are sponsored by OSG ● The funding comes from NSF, DOE and the UC system
