• Like
glideinWMS Architecture - glideinWMS Training Jan 2012
Upcoming SlideShare
Loading in...5
×

Thanks for flagging this SlideShare!

Oops! An error has occurred.

glideinWMS Architecture - glideinWMS Training Jan 2012

  • 468 views
Published

An overview of the glideinWMS Architecture. Part of the glideinWMS Training session held in Jan 2012 at UCSD.

An overview of the glideinWMS Architecture. Part of the glideinWMS Training session held in Jan 2012 at UCSD.

Published in Technology , Education
  • Full Name Full Name Comment goes here.
    Are you sure you want to
    Your message goes here
    Be the first to like this
No Downloads

Views

Total Views
468
On SlideShare
0
From Embeds
0
Number of Embeds
1

Actions

Shares
Downloads
3
Comments
1
Likes
0

Embeds 0

No embeds

Report content

Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

Cancel
    No notes for slide

Transcript

  • 1. glideinWMS training @ UCSD glideinWMS architecture by Igor Sfiligoi (UCSD)UCSD Jan 17th 2012 glideinWMS architecture 1
  • 2. Outline ● A high level overview of the glideinWMS ● Description of the componentsUCSD Jan 17th 2012 glideinWMS architecture 2
  • 3. glideinWMS glideinWMS from 10k feetUCSD Jan 17th 2012 glideinWMS architecture 3
  • 4. Refresher - Condor ● A Condor pool is composed of 3 pieces Central manager Execution node Collector Execution node Negotiator Submit node Execution node Submit node Execution node Submit node Execution node Schedd Startd JobUCSD Jan 17th 2012 glideinWMS architecture 4
  • 5. What is a glidein? ● A glidein is just a properly configured execution node submitted as a Grid job Central manager glidein Execution node Collector glidein Execution node Negotiator Submit node Submit node glidein Execution node Submit node Execution node glidein Schedd Startd JobUCSD Jan 17th 2012 glideinWMS architecture 5
  • 6. What is glideinWMS? ● glideinWMS is an automated tool for submitting glideins on demand Central manager glidein Execution node Collector CREAM glidein Execution node Negotiator Submit node Submit node glidein Execution node Submit node Execution node glidein Schedd Startd Globus Job glideinWMSUCSD Jan 17th 2012 glideinWMS architecture 6
  • 7. glideinWMS architecture ● glideinWMS has 3 logical piecesFrontend domain Monitor Submit node Configure Condor Submit node Condor G.N. Frontend node Submit node Worker node Frontend Central manager glidein_startup Match Request CREAM Startd glideins Factory node Condor glidein Execution node Globus Factory glidein Execution node Submit glideinsUCSD Jan 17th 2012 glideinWMS architecture 7
  • 8. glideinWMS architecture ● glideinWMS has 3 logical pieces ● glidein_startup – Configures and starts Condor execution daemons Runtime environment discovery and validation ● Factory – Knows about the sites and does the submission Grid knowledge and troubleshooting ● Frontend – Knows about user jobs and requests glideins Site selection logic and job monitoringUCSD Jan 17th 2012 glideinWMS architecture 8
  • 9. Cardinality ● N-to-M relationship ● Each Frontend can talk to many Factories ● Each Factory may serve many Frontends VO Frontend VO Frontend Glidein Factory Collector Schedd Negotiator Collector Startd Startd Schedd User job User job Negotiator Startd Glidein Factory User jobUCSD Jan 17th 2012 glideinWMS architecture 9
  • 10. Many operators ● Factory and Frontend are usually operated by different people ● Frontends VO specific ● Operated by VO admins ● Each sets policies for its users ● Factories generic ● Do not need to be affiliated with any group ● Factory ops main task is Grid monitoring and troubleshootingUCSD Jan 17th 2012 glideinWMS architecture 10
  • 11. glideinWMS A (sort of) detailed view of glidein_startupUCSD Jan 17th 2012 glideinWMS architecture 11
  • 12. Refresher – glideinWMS arch. ● glidein_startup configures and starts Condor Monitor Submit node Condor Configure Submit node Condor G.N. Frontend node Submit node Worker node Frontend Central manager glidein_startup Match CREAM Startd Request glideins Factory node Condor glidein Execution node Globus Factory glidein Execution node Submit glideinsUCSD Jan 17th 2012 glideinWMS architecture 12
  • 13. glidein_startup tasks ● Validate node (environment) ● Download Condor binaries Performed by plugins ● Configure Condor ● Start Condor daemon(s) ● Collect post-mortem monitoring info ● CleanupUCSD Jan 17th 2012 glideinWMS architecture 13
  • 14. glidein_startup plugins ● Config files and scripts loaded via HTTP ● From both the factory and the frontend Web servers ● Can use local Web proxy (e.g. Squid) ● Mechanism tamper proof and cache coherent Factory node glidein_startup HTTPd ● Load files from factory Web Squid ● Load files from frontend Web Frontend node ● Run executables ● Start Condor Startd HTTPd ● CleanupUCSD Jan 17th 2012 glideinWMS architecture 14
  • 15. glidein_startup scripts ● Standard plugins ● Basic Grid node validation (certs, disk space, etc.) ● Setup Condor (glexec, CCB, etc.) ● VO provided plugins ● Optional, but can be anything ● CMS@UCSD checks for CMS SW ● Factory admin can also provide them ● Details about the plugins can be found at http://tinyurl.com/glideinWMS/doc.prd/factory/custom_scripts.htmlUCSD Jan 17th 2012 glideinWMS architecture 15
  • 16. glideinWMS A (sort of) detailed view of the glidein factoryUCSD Jan 17th 2012 glideinWMS architecture 16
  • 17. Refresher – glideinWMS arch. ● The factory knowns about the grid and submits glideins Configure Submit node Condor G.N. Frontend node Monitor Submit node Condor Worker node Frontend Central manager glidein_startup Match CREAM Startd Request glideins Factory node Condor glidein Execution node Globus Factory glidein Execution node Submit glideinsUCSD Jan 17th 2012 glideinWMS architecture 17
  • 18. Glidein factory ● Glidein factory knows how to contact sites ● List in a local config ● Only trusted and tested sites should be included ● For each site (called entry) ● Contact info (Node, grid type, jobmanager) ● Site config (startup dir, glexec, OS type, …) ● VOs supported ● Other attributes (Site name, closest SE, ...) ● Admin maintained, periodically compared to BDII http://tinyurl.com/glideinWMS/doc.prd/factory/configuration.htmlUCSD Jan 17th 2012 glideinWMS architecture 18
  • 19. Glidein factory role ● The glidein factory is just a slave ● The frontend(s) tell it how many glideins to submit where ● Once the glideins start to run, they report to the VO collector and the factory is not involved ● The communication is based on ClassAds ● The factory has a Collector for this purpose Frontend node Factory node Frontend Collector FactoryUCSD Jan 17th 2012 glideinWMS architecture 19
  • 20. Factory collector ● The factory collector handles all communication Factory node Frontend node Find sites Collector Frontend Request glideins . Advertise Retrieve . entry orders . Entry ... Entry Frontend node Frontend Spawn Factory http://tinyurl.com/glideinWMS/doc.prd/factory/design_data_exchange.htmlUCSD Jan 17th 2012 glideinWMS architecture 20
  • 21. Frontends ● The factory admin decides which Frontends to serve Frontend node ● Valid proxy Frontend with known DN needed to talk to the collector ● Factory config has further Factory node fine grained controls Collector Frontend node Factory FrontendUCSD Jan 17th 2012 glideinWMS architecture 21
  • 22. Glidein submission ● The glidein factory (entry) uses Condor-G to submit glideins ● Condor-G does the heavy lifting ● The factory just monitors the progress glidein glidein Factory node CREAM Submit Entry Schedd . Monitor . . . . . glidein Submit Schedd Globus Entry glidein MonitorUCSD Jan 17th 2012 glideinWMS architecture 22
  • 23. Credentials/Proxy ● Proxy typically provided by the frontend ● Although the factory can provide a default one (rarely used) ● Proxy delivered encrypted in the ClassAd ● Factory (entry) provides the encryption key (PKI) ● Proxy stored on disk ● Each VO mapped to a different UID Frontend node Factory node Get key Frontend Collector Schedd Deliver proxy (encrypted) EntryUCSD Jan 17th 2012 glideinWMS architecture 23
  • 24. glideinWMS A (sort of) detailed view of the VO frontendUCSD Jan 17th 2012 glideinWMS architecture 24
  • 25. Refresher – glideinWMS arch. ● The frontend monitors the user Condor pool, does the matchmaking and requests glideinsFrontend domain Configure Submit node Condor G.N. Frontend node Monitor Submit node Condor Worker node Frontend Central manager glidein_startup Match CREAM Startd Request glideins Factory node Condor glidein Execution node Globus Factory glidein Execution node Submit glideinsUCSD Jan 17th 2012 glideinWMS architecture 25
  • 26. VO frontend ● The VO frontend is the brain of a glideinWMS-based pool ● Like a site-level “negotiator” VO domain Find Find Submit node idle jobs entries Frontend node Monitor Submit node Frontend Condor Match Central manager Match Request Request glideins glideins Factory nodeUCSD Jan 17th 2012 glideinWMS architecture 26
  • 27. Two level matchmaking ● The frontend triggers glidein submission ● The “regular” negotiator matches jobs to glideins Central manager glidein Execution node Collector CREAM glidein Execution node Negotiator Submit node Schedd glidein Execution node glidein Execution node Startd Globus Job Frontend FactoryUCSD Jan 17th 2012 glideinWMS architecture 27
  • 28. Frontend logic ● The glideinWMS glidein request logic is based on the principle on “constant pressure” ● Frontend requests a certain number of “idle glideins” in the factory queue at all times ● It does not request a specific number of glideins ● This is done due to the asynchronous nature of the system ● Both the factory and the frontend are in a polling loop and talk to each other indirectlyUCSD Jan 17th 2012 glideinWMS architecture 28
  • 29. Frontend logic ● Frontend matches job attrs against entry attrs ● It then counts the matched idle jobs ● A fraction of this number becomes the “pressure requests” (up to 1/3) ● The matchmaking expression is defined by the frontend admin ● Not the user ● Debatable if it is better or worse, but it does reduce frontend code complexityUCSD Jan 17th 2012 glideinWMS architecture 29
  • 30. Frontend config ● The frontend owns the “glidein proxy” ● And delegates it to the factory(s) when requesting glideins ● Must keep it valid at all times (usually at OS level) ● The VO frontend can (and should) provide VO‑specific validation scripts ● The VO frontend can (and should) set the glidein start expression ● Used by the VO negotiator for final matchmakingUCSD Jan 17th 2012 glideinWMS architecture 30
  • 31. glideinWMS And the summaryUCSD Jan 17th 2012 glideinWMS architecture 31
  • 32. Summary ● Glideins are just properly configured Condor execute nodes submitted as Grid jobs ● The glideinWMS is a mechanism to automate glidein submission ● The glideinWMS is composed of three logical entities, two being actual services: ● Glidein factories know about the Grid ● VO frontend know about the users and drive the factoriesUCSD Jan 17th 2012 glideinWMS architecture 32
  • 33. Pointers ● glideinWMS development team is reachable at glideinwms-support@fnal.gov ● The official project Web page is http://tinyurl.com/glideinWMS ● CMS frontend at UCSD http://glidein-collector.t2.ucsd.edu:8319/vofrontend/monitor/frontend_UCSD-v5_2/frontendStatus.html ● OSG glidein factory at UCSD http://hepuser.ucsd.edu/twiki2/bin/view/UCSDTier2/OSGgfactory http://glidein-1.t2.ucsd.edu:8319/glidefactory/monitor/glidein_Production_v4_1/factoryStatus.htmlUCSD Jan 17th 2012 glideinWMS architecture 33
  • 34. Acknowledgments ● The glideinWMS is a CMS-led project developed mostly at FNAL, with contributions from UCSD and ISI ● The glideinWMS factory operations at UCSD is sponsored by OSG ● The funding comes from NSF, DOE and the UC systemUCSD Jan 17th 2012 glideinWMS architecture 34