glideinWMS Frontend Internals - glideinWMS Training Jan 2012

This presentation takes a detailed look at the internal workings of the glideinWMS Frontend. It was part of the glideinWMS Training session held in Jan 2012 at UCSD.

Published in: Technology, Business

Transcript

  1. glideinWMS Training @ UCSD: glideinWMS Frontend Internals, by Igor Sfiligoi (UCSD). UCSD, Jan 17th 2012.
  2. Refresher - Glideins
     ● A glidein is just a properly configured Condor execution node submitted as a Grid job
     ● glideinWMS provides automation
     [Diagram labels: Central manager, Collector, Negotiator, Submit node, Schedd, Execution node, glidein, Startd, Job, Globus, CREAM, glideinWMS]
  3. Refresher – Glidein Frontend
     ● The Frontend monitors the user Condor pool, does the matchmaking and requests glideins
     ● The Factory is a slave
     [Diagram labels: Configure Condor G.N., Submit node, Frontend node, Worker node, Monitor, Frontend, Condor, Central manager, Match, Request glideins, Factory node, Factory, Submit glideins, glidein, Startd, Job, Execution node, Globus, CREAM]
  4. Refresher - Cardinality
     ● N-to-M relationship
     ● Each Frontend can talk to many Factories
     ● Each Factory may serve many Frontends
     [Diagram labels: VO Frontend, Glidein Factory, Collector, Negotiator, Schedd, Startd, User job]
  5. Frontend architecture
     ● The Frontend is composed of:
       ● The Condor daemons
       ● The glideinWMS frontend proper
       ● Condor client – to talk to the factories
       ● Web server – deliver code and data to glideins + monitoring
     ● The glideinWMS frontend is itself composed of:
       ● Group processes – do the real work
       ● Master frontend – controls the others and aggregates monitoring
  6. Frontend arch - Picture
     [Diagram labels: Frontend Domain, Factory, Submit node, Central manager, Frontend node, Group, Entry, glidein, Spawn, Web Server, Frontend]
  7. Condor processes
     ● Explained in enough detail in previous talk
     ● Will not repeat myself
     [Diagram labels: Central manager, Collector, Negotiator, Submit node, Schedd]
  8. Frontend processes
     ● Real work is performed by the Group processes
       ● glideinFrontendElement.py
       ● One process per Group
       ● Note: "Frontend" == "Frontend Group" in the rest of the talk
     ● They are controlled by the master Frontend
       ● glideinFrontend.py
       ● Starts the other processes
       ● Aggregates monitoring
  9. Frontend role
     ● The VO frontend is the brain of a glideinWMS-based pool
     ● Like a site-level “negotiator”
     [Diagram labels: VO domain, Submit node, Monitor, Frontend, Condor, Central manager, Find idle jobs, Find entries, Match, Request glideins, Factory node]
  10. Reminder - Two level matchmaking
      ● The Frontend triggers glidein submission
      ● The “regular” negotiator matches jobs to glideins
      [Diagram labels: Central manager, Collector, Negotiator, Submit node, Schedd, Frontend, Factory, Execution node, glidein, Startd, Job, Globus, CREAM]
  11. Matchmaking logic
      ● The Frontend matchmaking policy is implemented centrally
        ● By the VO admin – not by the users
        ● It can use the attributes from both the job and Factory ClassAds
      ● Should be kept in sync with the Negotiator policy
        ● Which is not centralized
        ● One way is to define it in the glidein START expression
        ● Unfortunately, one is a Python expression and the other a ClassAd expression
  12. Example matchmaking logic
      ● Frontend (Python expression):
        job.has_key("DESIRED_Sites") and glidein["attrs"].get("GLIDEIN_Site") in job["DESIRED_Sites"].split(",")
      ● Negotiator (via glidein START):
        GLIDECLIENT_Start = stringListMember(GLIDEIN_Site, DESIRED_Sites,",")=?=True
      More details at http://tinyurl.com/glideinWMS/doc.prd/factory/custom_vars.html
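To make the Frontend side of slide 12 concrete, here is a minimal sketch that wraps the expression in a function and evaluates it against sample data. The function name, the sample job and entry dictionaries, and the site names are assumptions made for illustration. The slide's `has_key` is Python 2 syntax; the sketch uses the equivalent `in` test so that it runs on Python 3.

```python
def frontend_match(job, glidein):
    """Python 3 spelling of the Frontend match expression shown on the slide."""
    return ("DESIRED_Sites" in job
            and glidein["attrs"].get("GLIDEIN_Site") in job["DESIRED_Sites"].split(","))

# Hypothetical sample data: one job and two Factory entries.
job = {"DESIRED_Sites": "UCSD,FNAL"}
entry_ucsd = {"attrs": {"GLIDEIN_Site": "UCSD"}}
entry_other = {"attrs": {"GLIDEIN_Site": "Nebraska"}}

print(frontend_match(job, entry_ucsd))   # True: UCSD is in the job's list
print(frontend_match(job, entry_other))  # False: site not requested
print(frontend_match({}, entry_ucsd))    # False: job has no DESIRED_Sites
```

The ClassAd expression on the Negotiator side has to encode the same rule by hand, which is why the two policies can drift apart.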
  13. Communication Protocol
      ● No listen sockets
        ● All communication one way (Frontend->Factory)
      ● Each Factory provides a Collector
        ● Communication based on ClassAds
        ● All security implemented in the Collector
      ● Use standard cmdline tools for communication
        ● condor_status and condor_advertise
  14. Protocol sequence
      ● Polling loop
        ● Read Factory ClassAds from all factory Collectors
        ● Match against jobs
        ● Advertise own existence and requests
      ● Frontend sends 4 types of info
        ● Own identity
        ● Glidein submission regulation instructions
        ● Glidein parameters
        ● Pilot Proxy
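The three steps of the polling loop on slide 14 can be sketched as a function that takes its I/O as callables. This is only an illustration of the control flow, not the Frontend's actual code: the real Frontend talks to the Collectors with condor_status and condor_advertise, while every name below (`polling_cycle`, `run`, the sample entries and jobs) is hypothetical and the Collector queries are replaced by in-memory stand-ins.

```python
import time

def polling_cycle(read_factory_ads, read_idle_jobs, match, advertise):
    """One pass of the loop: read Factory ads, match against jobs, advertise."""
    entries = read_factory_ads()      # step 1: Factory ClassAds from all Collectors
    jobs = read_idle_jobs()
    requests = {}
    for entry in entries:             # step 2: match against jobs
        requests[entry["name"]] = sum(1 for job in jobs if match(job, entry))
    advertise(requests)               # step 3: advertise own existence and requests
    return requests

def run(cycle, iterations, delay_s=0.0):
    """Repeat the cycle a fixed number of times; a daemon would loop until stopped."""
    for _ in range(iterations):
        cycle()
        time.sleep(delay_s)

# Hypothetical in-memory stand-ins for the Collector queries and the advertise step.
ENTRIES = [{"name": "entry_A", "site": "UCSD"}, {"name": "entry_B", "site": "FNAL"}]
JOBS = [{"DESIRED_Sites": "UCSD"}, {"DESIRED_Sites": "UCSD,FNAL"}]
advertised = []

def site_match(job, entry):
    return entry["site"] in job["DESIRED_Sites"].split(",")

run(lambda: polling_cycle(lambda: ENTRIES, lambda: JOBS, site_match, advertised.append),
    iterations=2)
print(advertised[-1])  # {'entry_A': 2, 'entry_B': 1}
```

Passing the I/O in as callables keeps the loop testable without a running Condor pool.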
  15. Glidein submission regulation
      ● The glideinWMS glidein request logic is based on the principle of “constant pressure”
        ● The Frontend Group requests a certain number of “idle glideins” in the factory queue at all times
        ● It does not request a specific number of glideins
      ● This is done due to the asynchronous nature of the system
        ● Both the factory entries and the frontend groups are in a polling loop and talk to each other indirectly
  16. Glidein requests
      ● The Frontend matches job attrs against entry attrs
        ● It then counts the matched idle jobs
      ● A fraction of this number becomes the “pressure request” (up to 1/3)
        ● This number is then capped (~20)
        ● The attribute in the ClassAd is ReqIdleGlideins
      ● The Frontend also advertises ReqMaxRunningGlideins
        ● Emergency brake
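The arithmetic on slide 16 can be illustrated with a short sketch. The slide gives only "up to 1/3" and "~20", so the exact fraction, the cap value, the round-up behaviour, and the `max_running` value used here are all assumptions, as are the function names.

```python
def req_idle_glideins(matched_idle_jobs, divisor=3, cap=20):
    """Pressure request: one third of the matched idle jobs, rounded up, then capped.

    divisor=3 and cap=20 follow the slide's "up to 1/3" and "~20";
    rounding up is an assumption so that a single idle job still asks for a glidein.
    """
    pressure = -(-matched_idle_jobs // divisor)  # integer ceiling division
    return min(pressure, cap)

def build_request(matched_idle_jobs, max_running):
    """The two regulation attributes named on the slide, as a plain dict."""
    return {
        "ReqIdleGlideins": req_idle_glideins(matched_idle_jobs),
        "ReqMaxRunningGlideins": max_running,  # emergency brake; value is hypothetical
    }

for n in (0, 1, 30, 300):
    print(n, build_request(n, max_running=1000))
```

With 300 matched idle jobs the request stays at 20 idle glideins, which is the "constant pressure" idea: the Factory keeps refilling its queue as glideins start, rather than receiving one large order.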
  17. Scaling back
      ● The Frontend can also request that existing glideins in the Factory queues are removed, via ReqRemoveExcess
        ● NO – Default, never remove
        ● WAIT – Remove any glidein not yet at a site
        ● IDLE – Remove any glidein that has not started yet
        ● ALL – Remove all glideins
      ● The Frontend is pretty conservative
        ● Only requests removal if no user jobs in the queues
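The four ReqRemoveExcess values and the conservative rule from slide 17 can be sketched as below. The enum and function names are hypothetical, and which level the Frontend picks once the queues are empty is not stated on the slide, so the `policy_when_empty` default is an assumption.

```python
from enum import Enum

class RemoveExcess(Enum):
    NO = "NO"      # default, never remove
    WAIT = "WAIT"  # remove any glidein not yet at a site
    IDLE = "IDLE"  # remove any glidein that has not started yet
    ALL = "ALL"    # remove all glideins

def choose_remove_excess(user_jobs_in_queues, policy_when_empty=RemoveExcess.WAIT):
    """Request removal only when there are no user jobs in the queues."""
    if user_jobs_in_queues > 0:
        return RemoveExcess.NO
    return policy_when_empty  # assumed default; the slide does not name the level

print(choose_remove_excess(5).value)  # NO
print(choose_remove_excess(0).value)  # WAIT
```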
  18. Parameters
      ● The Frontend can send attributes to glideins:
        ● Dynamically – as parameter in the ClassAd
        ● Statically – as entry in a config file
      ● Attributes typically static
        ● Current Frontend implementation does not really have much support for dynamicity
  19. Pilot proxy delegation
      ● Pilot proxy is encrypted with factory pub key
        ● Then published in the ClassAd
        ● Only owner of priv. key can decrypt it
      [Diagram labels: Frontend node, Frontend, Factory node, Collector, Entry, Get key, Deliver proxy (encrypted), Use proxy, glidein, Globus]
      ● However
        ● Must make sure we are talking to a trusted Factory!
          – not just anyone providing a pub key
        ● More details in a few slides
  20. Pilot proxy selection
      ● A Frontend must have at least one pilot proxy
        ● But can have more than one
      ● Many proxies can be used for priority reasons
        ● When competing with non-pilot submission
        ● Want to have as many proxies as users served
      ● Proxy selection is plugin based
  21. Pilot proxy plugins
      ● Several standard plugins
        ● ProxyFirst – Only the first listed (most used)
        ● ProxyAll – All listed
        ● ProxyUserCardinality – First N, with N=#users
        ● ProxyUserMapWRecycling – N, with pilot-to-user mapping
      ● A VO admin could implement their own, if desired
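The selection rules of the first three plugins on slide 21 are simple enough to sketch as plain functions. This shows the selection logic only: the function names, the `(proxies, users)` signature, and the sample values are hypothetical and do not reproduce the actual glideinWMS plugin interface. ProxyUserMapWRecycling is left out because the slide does not describe its mapping in enough detail.

```python
def proxy_first(proxies, users):
    """ProxyFirst: only the first listed proxy."""
    return proxies[:1]

def proxy_all(proxies, users):
    """ProxyAll: every listed proxy."""
    return list(proxies)

def proxy_user_cardinality(proxies, users):
    """ProxyUserCardinality: the first N proxies, with N = number of distinct users."""
    return proxies[:len(set(users))]

# Hypothetical inputs: three configured pilot proxies, jobs from two distinct users.
proxies = ["pilot1.pem", "pilot2.pem", "pilot3.pem"]
users = ["alice", "bob", "alice"]

print(proxy_first(proxies, users))             # ['pilot1.pem']
print(proxy_all(proxies, users))               # ['pilot1.pem', 'pilot2.pem', 'pilot3.pem']
print(proxy_user_cardinality(proxies, users))  # ['pilot1.pem', 'pilot2.pem']
```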
  22. Factory ClassAd
      (slide content not captured in the transcript)
  23. Frontend ClassAd
      (slide content not captured in the transcript)
  24. Security - Authorization
      ● Authentication based on GSI/x509
      ● Mutual authorization
        ● The Frontend admin decides which Factories to talk to
        ● The Factory admin decides which Frontends to serve
      ● Based on x509 DNs
        ● Both sides have whitelists
      ● The Frontend needs a service proxy
      [Diagram labels: Frontend node, Frontend, Factory node, Factory, Collector]
  25. Trusting the factory key
      ● It is all just ClassAds!
        ● Anyone can publish a ClassAd and declare to be a factory
      ● However, the Factory Collector knows who published it
        ● And advertises it as the attribute AuthenticatedIdentity
        ● Cannot be faked by the client
      ● The Frontend has a whitelist of trusted factories
      [Diagram labels: Frontend, Collector, Factory, ClassAds with attributes (a1, b1, c1, ID1), (a2, b2, c2, ID2), (a3, b3, c3, ID3)]
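The whitelist check described on slide 25 amounts to filtering Factory ads by the Collector-supplied AuthenticatedIdentity. A minimal sketch, assuming ads are represented as dictionaries; the function name, the entry names, and the identity strings are hypothetical.

```python
def trusted_factory_ads(ads, trusted_identities):
    """Keep only ads whose Collector-stamped AuthenticatedIdentity is whitelisted.

    The ad's own claims are ignored on purpose: anyone can publish them.
    """
    return [ad for ad in ads if ad.get("AuthenticatedIdentity") in trusted_identities]

# Hypothetical ads as read from a Factory Collector.
ads = [
    {"Name": "entry_A", "AuthenticatedIdentity": "factory@factory.example.org"},
    {"Name": "entry_B", "AuthenticatedIdentity": "mallory@evil.example.org"},
    {"Name": "entry_C"},  # no identity recorded: never trusted
]
whitelist = {"factory@factory.example.org"}

print([ad["Name"] for ad in trusted_factory_ads(ads, whitelist)])  # ['entry_A']
```

Only after this filter would the Frontend use an ad's public key to encrypt the pilot proxy.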
  26. Security handles
      ● As we said, mutual authentication with Factory
      ● Frontend provides (and Factory whitelists)
        ● Service Proxy to talk to Factory Collector (one set for whole Frontend, all Groups)
        ● Frontend Security name (one set for whole Frontend, all Groups)
        ● Proxy Security Class (one per pilot proxy)
      ● Frontend whitelists (obtained from Factory admins), one set per factory collector
        ● Factory Collector DN
        ● Own mapping @Factory
        ● Factory mapping @Factory
  27. Security within the VO domain
      ● Frontend process, Collector and schedds often not on the same node
        ● Need network security
        ● Could be even over WAN (the CMS setup has nodes in CA, IL and Europe)
      ● All processes must whitelist each other
        ● Again, GSI based
      [Diagram labels: Schedd, Monitor, Frontend, Condor, Collector/Negotiator]
  28. THE END
  29. Pointers
      ● The official project Web page is http://tinyurl.com/glideinWMS
      ● The glideinWMS development team is reachable at glideinwms-support@fnal.gov
      ● OSG glidein factory at UCSD
        http://hepuser.ucsd.edu/twiki2/bin/view/UCSDTier2/OSGgfactory
        http://glidein-1.t2.ucsd.edu:8319/glidefactory/monitor/glidein_Production_v4_1/factoryStatus.html
  30. Acknowledgments
      ● glideinWMS is a CMS-led project developed mostly at FNAL, with contributions from UCSD and ISI
      ● The glideinWMS factory operations at UCSD are sponsored by OSG
      ● The funding comes from NSF, DOE and the UC system
