Aug 2014 HTCondor Overview 1
glideinWMS Training
HTCondor Overview
by Igor Sfiligoi, UC San Diego

glideinWMS Training 2014 - HTCondor Internals

These slides present an HTCondor overview, with high-level views of the daemons involved, the communication paths, and scalability considerations.

Published in: Technology

  1. Aug 2014, HTCondor Overview: glideinWMS Training, "HTCondor Overview", by Igor Sfiligoi, UC San Diego
  2. Overview
     ● These slides present an HTCondor overview, with high-level views of
       – Daemons involved
       – Communication paths
       – Scalability considerations
  3. HTCondor Daemons: The basics
     [Diagram: Schedd (submit side), Collector and Negotiator (glue), Startd (execute side)]
  4. HTCondor Daemons: The basics + the master
     [Diagram: same daemons as in slide 3, plus three Master boxes]
  5. HTCondor Daemons
     ● One startd per (logical) compute resource
       – Can handle multiple CPUs
     ● One schedd per submit node
       – Can handle multiple users
     ● The Collector has the list of all other daemons
     ● The Negotiator matches user jobs to machine slots
     ● The Master starts all other processes
       – It is ignored in the rest of the talk
  6. Communication flow
     [Diagram: Collector, Schedd, Startd, Negotiator]
     ● Push: "I am here and these are my properties" (one ClassAd per slot)
     ● Push: "I am (still) here"
     ● Pull: "Send me the list of idle jobs"
  7. Claiming protocol
     ● Startds keep their state current in the Collector
       – By periodically pushing updates (every 5 mins by default)
     ● On a matchmaking cycle, the Negotiator will
       – Pull the startd slot list from the Collector
         ● In Unclaimed state only (unless preemption is enabled)
       – Pull the job list from the schedds
       – Create a priority list of matches
       – Send the matches to the relevant schedds
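The matchmaking cycle in slide 7 can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the `Slot` and `Job` classes, the `negotiation_cycle` function, and the rule "lower priority number is served first" are all hypothetical names and choices made for illustration, and no ClassAd requirement matching or preemption is modeled.

```python
from dataclasses import dataclass


@dataclass
class Slot:
    name: str
    state: str  # "Unclaimed" or "Claimed" in this sketch


@dataclass
class Job:
    job_id: str
    schedd: str
    user: str


def negotiation_cycle(slots, jobs, user_priority):
    """Return (job, slot) matches for one simplified matchmaking cycle.

    user_priority maps user -> number; a lower number is served first
    (an assumption of this sketch).
    """
    # Pull the slot list, keeping Unclaimed slots only (no preemption).
    free_slots = [s for s in slots if s.state == "Unclaimed"]
    # Pull the job list and order it by the owner's priority.
    ordered_jobs = sorted(jobs, key=lambda j: user_priority[j.user])
    # Pair jobs with slots; each match would be sent to job.schedd.
    return list(zip(ordered_jobs, free_slots))


slots = [Slot("slot1@wn1", "Unclaimed"),
         Slot("slot1@wn2", "Claimed"),
         Slot("slot1@wn3", "Unclaimed")]
jobs = [Job("1.0", "schedd_a", "alice"),
        Job("2.0", "schedd_b", "bob"),
        Job("3.0", "schedd_a", "alice")]
matches = negotiation_cycle(slots, jobs, {"alice": 2.0, "bob": 1.0})
for job, slot in matches:
    print(f"job {job.job_id} on {job.schedd} -> {slot.name}")
```

With two Unclaimed slots and three idle jobs, only two matches are produced; the third job waits for a later cycle.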
  8. Claiming protocol
     ● The schedd will contact the startd
       – Once the connection is accepted, the schedd owns that slot
     ● The schedd will spawn a shadow
       – Which takes over the connection
       – The schedd moves on to other business
     ● Similarly, the startd spawns a starter
       – And advertises a Claimed state
  9. Communication flow
     [Diagram: Collector, Schedd, Startd, Negotiator; slot state Claimed/Idle]
  10. HTCondor Daemons: Stage 2
      [Diagram: Collector, Schedd, Startd, Negotiator, Shadow, Starter; slot state Claimed/Busy]
      ● A shadow and a starter are created for every running job
  11. HTCondor Daemons: Stage 2
      [Diagram: same daemons as in slide 10, with a ✗ mark; slot state Claimed/Busy]
      ● If the network connection is lost, either side can re-establish it
  12. HTCondor Daemons
      ● The shadow takes ownership of a running job
        – One per job
      ● The starter takes ownership of a claimed slot
        – One per slot
      ● Together they babysit the two sides until the job is done and the slot can be unclaimed
      ● Corollary:
        – Each schedd node will have O(10k) shadows
  13. Claiming protocol
      ● Once the job terminates
        – The starter goes away
        – The schedd will send another job to the startd, unless
          ● The lease has expired
          ● There are no more suitable jobs
        – The existing shadow can be reused
          ● But does not need to be
      ● If the schedd does not send a new job
        – The startd goes into the Unclaimed state
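The decision described in slide 13 fits in one small function. This is a minimal sketch; the function name `state_after_job` and its two boolean parameters are hypothetical, and the returned strings simply reuse the state labels that appear on the slides.

```python
def state_after_job(lease_expired: bool, suitable_job_available: bool) -> str:
    """Slot state once a job terminates, following the rules in slide 13."""
    if not lease_expired and suitable_job_available:
        # The schedd sends another job over the existing claim.
        return "Claimed/Busy"
    # No new job is sent, so the startd gives up the claim.
    return "Unclaimed"


for lease_expired in (False, True):
    for has_job in (False, True):
        print(lease_expired, has_job, state_after_job(lease_expired, has_job))
```

Only one of the four combinations keeps the slot claimed: the lease is still valid and the schedd has another suitable job.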
  14. Matchmaking and latency
      ● The Negotiator pulls the startd slot list from the Collector
        – In a single transaction
      ● The Negotiator pulls the job list from the schedds
        – Basically, one at a time!
        – But it does cluster similar jobs together at the Schedd level
        – The idea being that it will not ask for more if either the user runs out of priority or no more slots are available
      ● The Negotiator is thus sensitive to network latencies
        – Matching Schedds far away may be limited by network latency, not Negotiator CPU use
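A back-of-the-envelope calculation shows why sequential request/reply exchanges make the Negotiator latency-bound. The function name and the numbers below (1000 exchanges, 1 ms and 150 ms round-trip times) are illustrative assumptions, not measurements from the talk.

```python
def network_wait_seconds(round_trips: int, rtt_ms: float) -> float:
    """Lower bound on time spent waiting on the network when
    request/reply exchanges happen strictly one after another."""
    return round_trips * rtt_ms / 1000.0


# Same number of sequential exchanges, different distance to the schedd.
nearby = network_wait_seconds(round_trips=1000, rtt_ms=1.0)
far_away = network_wait_seconds(round_trips=1000, rtt_ms=150.0)
print(f"nearby schedd:   {nearby:.1f} s")
print(f"far-away schedd: {far_away:.1f} s")
```

The work done per exchange is the same in both cases; only the waiting time grows with the round-trip time, which is why clustering similar jobs (fewer exchanges) helps.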
  15. Security considerations: The glidein use case
      [Diagram: Collector, Schedd, Startd, Negotiator]
      ● The Collector is the center of all trust
      ● Mutual authentication between Startd and Collector using x509 whitelisting
      ● Mutual authentication between Startd and Collector/Negotiator using x509 whitelisting
      ● Negotiator and Collector are co-located, FS auth
  16. Security considerations: The glidein use case
      [Diagram: Collector, Schedd, Startd, Negotiator]
      ● The Collector is the center of all trust
      ● After the initial handshake, use a shared secret (label shown on two connections)
      ● Negotiator and Collector are co-located, FS auth
      ● Full x509 is expensive, so it is used only on daemon restart and periodically once every few days
  17. Security considerations: The glidein use case
      [Diagram: Collector, Schedd, Startd]
      ● The Collector is the center of all trust
      ● The Startd also sends a shared secret for matchmaking
  18. Security considerations: The glidein use case
      [Diagram: Collector, Schedd, Startd, Negotiator]
      ● The Collector is the center of all trust
      ● The Negotiator is the only authorized user of the Startd's shared secrets
      ● The Startd shared secret is sent on job match
      ● The Schedd may get many secrets, one per matched job
  19. Security considerations: The glidein use case
      [Diagram: Collector, Schedd, Startd]
      ● The Collector is the center of all trust
      ● Use the given shared secret for auth
      ● No other credentials in play
  20. Security considerations: The glidein use case
      [Diagram: Collector, Schedd, Startd, Shadow, Starter]
      ● The Collector is the center of all trust
      ● Shadow and starter inherit the socket
      ● They also inherit the shared secret, for reconnect
  21. Security considerations: The glidein use case
      [Diagram: Collector, Schedd, Startd]
      ● The Collector is the center of all trust
      ● If the Startd goes into the Unclaimed state, a new secret is created and sent
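The lifecycle of the per-slot shared secret in slides 17 to 21 can be modeled as follows. This is a toy sketch only: the class `SlotSecret`, its methods, and the use of a random hex token are assumptions made for illustration, and it says nothing about HTCondor's actual secret format or wire protocol.

```python
import hmac
import secrets


class SlotSecret:
    """Toy model of the per-slot shared secret held by a startd."""

    def __init__(self):
        # Secret advertised for matchmaking (format is an assumption).
        self.secret = secrets.token_hex(16)

    def authorize(self, presented: str) -> bool:
        # The schedd (and later the shadow) presents the secret it was given.
        return hmac.compare_digest(self.secret, presented)

    def back_to_unclaimed(self) -> None:
        # A fresh secret is created when the slot returns to Unclaimed.
        self.secret = secrets.token_hex(16)


slot = SlotSecret()
handed_to_schedd = slot.secret          # delivered with the job match
print(slot.authorize(handed_to_schedd))  # accepted while the claim lasts
slot.back_to_unclaimed()
print(slot.authorize(handed_to_schedd))  # the old secret no longer works
```

Rotating the secret on release means a schedd that held an earlier claim cannot reuse its old credential against the same slot.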
  22. Security cost and scalability: The glidein use case
      [Diagram: Collector, Schedd, Startd]
      ● x509 is too expensive for a single central service (due to both CPU use and network latency issues)
      ● Mutual authentication between Startd and Collector using x509 whitelisting
      ● Mutual authentication between Startd and Collector/Negotiator using x509 whitelisting
      ● Glideins can start at a 10+ Hz rate
  23. Security cost and scalability: The glidein use case
      [Diagram: Collector, Schedd, Startd, N child Collectors]
      ● Spread the load over multiple child Collectors
      ● Child Collectors forward all ads
        – Co-located, thus cheap
      ● Mutual authentication between Startd and child Collector using x509 whitelisting
      ● Randomly pick one of many child Collectors and then stick with it
      ● New Schedds join rarely
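The "randomly pick one and then stick with it" strategy from slide 23 can be sketched as below. The host name `cm.example.org`, the port range starting at 9620, and the class and function names are hypothetical; the slides do not say how child Collectors are addressed.

```python
import random


def pick_child_collector(host: str, first_port: int, count: int,
                         rng: random.Random) -> str:
    """Choose one of `count` child collectors uniformly at random."""
    port = first_port + rng.randrange(count)
    return f"{host}:{port}"


class Glidein:
    """Picks a child collector once at startup and keeps it afterwards."""

    def __init__(self, host: str, first_port: int, count: int,
                 rng: random.Random):
        self.collector = pick_child_collector(host, first_port, count, rng)

    def send_update(self) -> str:
        # Every periodic update goes to the same child collector.
        return self.collector


rng = random.Random()
glideins = [Glidein("cm.example.org", 9620, 10, rng) for _ in range(5)]
for g in glideins:
    print(g.send_update())
```

Because each glidein chooses independently, the load spreads roughly evenly over the child Collectors without any coordination, and sticking with the choice lets the cheaper post-handshake authentication be reused.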
  24. Network/Firewall considerations
      [Diagram: Collector, Schedd, Startd, Negotiator, Shadow, Starter]
      ● HTCondor is conceptually a peer-to-peer system
      ● Everyone talks to everyone
  25. Network/Firewall considerations
      [Diagram: Collector, Schedd, Startd, Shadow, Starter; one ✓ mark]
      ● Execute nodes are often behind firewalls and/or NATs
  26. Network/Firewall considerations
      [Diagram: Collector, Schedd, Startd; one ✓ mark]
      ● The CCB protocol creates a tunnel
      ● The Collector implements the CCB
      ● Startd->Collector communication is still direct
      ● A separate channel over a long-lived TCP socket
  27. Network/Firewall considerations
      [Diagram: Collector, Schedd, Startd; one ✓ mark]
      ● The CCB protocol creates a tunnel
      ● The Collector implements the CCB
      ● The CCB delivers messages to the startd
  28. Network/Firewall considerations
      [Diagram: Collector, Schedd, Startd; two ✓ marks]
      ● The CCB protocol creates a tunnel
      ● The Collector implements the CCB
      ● Only the callback request goes through the CCB
      ● The Startd opens a long-lived TCP connection to the Schedd
      ● All further communication uses that channel from there on
  29. Network/Firewall considerations
      [Diagram: Collector, Schedd, Startd, Shadow, Starter; two ✓ marks]
      ● The CCB protocol creates a tunnel
      ● The Collector implements the CCB
      ● Shadow and Starter inherit this socket
  30. CCB and scalability
      [Diagram: Collector, Startd]
      ● A single central service cannot really handle all the load
      ● Processes are usually limited to O(1k) sockets
      ● We can have O(10k+) glideins
  31. CCB and scalability
      [Diagram: Startd, N Collectors]
      ● No real need to use a single CCB; any number of dedicated CCB Collectors could be used
      ● The standard strategy is to just piggy-back on the "Child Collector" paradigm
      ● Randomly pick one of many CCBs and then stick with it
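The sizing argument in slides 30 and 31 is a simple division. The sketch below assumes concrete values for the order-of-magnitude figures on the slides (1,000 sockets per process, 10,000 glideins); the function name and the `connections_per_glidein` parameter are hypothetical.

```python
def ccb_collectors_needed(glideins: int, connections_per_glidein: int,
                          sockets_per_process: int) -> int:
    """Minimum number of CCB processes so that none exceeds its socket limit."""
    total = glideins * connections_per_glidein
    # Integer ceiling division, avoiding floating point.
    return -(-total // sockets_per_process)


print(ccb_collectors_needed(10_000, 1, 1_000))
print(ccb_collectors_needed(10_000, 3, 1_000))
```

The result grows linearly with the number of long-lived CCB connections each glidein holds, so reducing that per-glidein count directly reduces the number of CCB Collectors required.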
  32. CCB and scalability
      [Diagram: Collector, Schedd, Startd; two ✓ marks]
      ● Only the callback request goes through the CCB
      ● The Startd opens a long-lived TCP connection to the Schedd
      ● The Schedd now needs to accept incoming connections
      ● The default HTCondor mechanism of "one port per connection" does not scale (only ~30k usable ports in IP)
  33. CCB and scalability
      [Diagram: Collector, Shared_Port_Daemon, Schedd, Startd; two ✓ marks]
      ● Only the callback request goes through the CCB
      ● The Startd opens a long-lived TCP connection to the shared_port_daemon
        – Specifying that it is for the schedd
      ● HTCondor added the shared_port_daemon to multiplex requests on a single port
  34. CCB and scalability
      [Diagram: Collector, Shared_Port_Daemon, Schedd, Startd; two ✓ marks]
      ● The socket is moved to the schedd
        – Same node, local UNIX command
      ● HTCondor added the shared_port_daemon to multiplex requests on a single port
  35. CCB and scalability
      [Diagram: Collector, Shared_Port_Daemon, Shadow, Startd, Starter; one ✓ mark]
      ● The socket is moved to the schedd
        – Same node, local UNIX command
      ● Can be used by the starter to contact the Shadow, too
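The multiplexing idea behind the shared_port_daemon (slides 33 to 35) is that a connection arriving on the single public port names its target, and the daemon hands it over locally. The sketch below models only the routing table with plain Python callables; the class `SharedPortRouter` and its methods are hypothetical, and the real hand-off of the socket between processes is not modeled.

```python
class SharedPortRouter:
    """Toy dispatcher: one entry point, many local target daemons."""

    def __init__(self):
        self._daemons = {}

    def register(self, name, handler):
        # A local daemon (e.g. the schedd or a shadow) announces itself.
        self._daemons[name] = handler

    def route(self, target, connection):
        # The incoming request specifies which daemon it is for.
        handler = self._daemons.get(target)
        if handler is None:
            raise KeyError(f"no daemon registered as {target!r}")
        return handler(connection)


router = SharedPortRouter()
router.register("schedd", lambda conn: f"schedd got {conn}")
router.register("shadow_42", lambda conn: f"shadow_42 got {conn}")
print(router.route("schedd", "connection from startd"))
print(router.route("shadow_42", "connection from starter"))
```

One listening port is enough because the target name, not the port number, selects the receiving daemon.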
  36. CCB and scalability
      [Diagram: Collector, Startd, N Starters; two ✓ marks]
      ● The Starter also accepts incoming connections
        – Thus it needs a CCB connection
      ● And there is one Starter per slot
      ● Plus there is normally one for the Master, too
  37. CCB and scalability
      [Diagram: Collector, Shared_Port_Daemon, Startd, N Starters; one ✓ mark]
      ● Adding a shared_port_daemon will cut the number of CCB connections to exactly one
      ● It routes incoming requests to the appropriate daemon
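Slides 36 and 37 imply a simple count of CCB connections per execute node. The sketch below encodes that count; the function name and the example of 8 slots are assumptions, and it assumes every slot has a running Starter.

```python
def ccb_connections(slots: int, shared_port: bool) -> int:
    """CCB connections held by one execute node (glidein)."""
    if shared_port:
        # The shared_port_daemon holds the only CCB connection and
        # routes incoming requests to the appropriate daemon.
        return 1
    # One for the Startd, one for the Master, one per Starter (one per slot).
    return 1 + 1 + slots


print(ccb_connections(slots=8, shared_port=False))
print(ccb_connections(slots=8, shared_port=True))
```

For multi-slot glideins the saving is roughly proportional to the slot count, which matters when each CCB process is limited to O(1k) sockets.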
  38. High Availability setup
      [Diagram: Schedd, Startd, one Central Manager (CM) node with Collector, Negotiator, and N child Collectors]
      ● Using a single CM node is risky; if it dies, the pool dies with it
      ● Having multiple child Collectors does not help
  39. High Availability setup
      [Diagram: Schedd, Startd, two CM nodes, each with Collector, Negotiator, and N child Collectors]
      ● HTCondor allows for 2 or more CM nodes
      ● Schedds and Startds talk to all of them
        – Including one CCB per CM node
  40. High Availability setup
      [Diagram: as in slide 39, with a HAD on each CM node]
      ● There can be only one active Negotiator, to make user priority decisions
      ● HAD daemons keep only one alive, with the others in standby mode
  41. High Availability setup
      [Diagram: as in slide 40, with M Schedds]
      ● Schedd "HA" is typically just "partition the jobs between many schedds"
        – Temporary Schedd downtimes result in other schedds taking over the slots
      ● True Schedd HA is possible, but requires a shared FS
      ● HAD daemons keep only one alive, with the others in standby mode
      ● No real need for Startd HA
  42. The end
