Understanding priorities in HTCondor

  1. glideinWMS for users: Understanding priorities in HTCondor
     by Igor Sfiligoi (UCSD)
     CERN, Dec 2012
  2. Scope of this talk
     This talk provides an overview of how priorities work in HTCondor, both between users and among jobs of the same user, and how the user can affect policies.
     The reader is expected to already have a basic understanding of HTCondor.
  3. HTCondor Architecture
     ● As a reminder
     [Diagram: one central manager, several submit nodes and several execute nodes, labelled "Condor"]
  4. HTCondor Architecture
     ● And with relevant daemon names
     [Diagram: same layout, with the Negotiator on the central manager and the Schedd on the submit nodes; the execute nodes remain labelled "Condor"]
  5. User Priorities
  6. What is a user?
     ● Before talking about priorities between users, we need to define what a user IS
     ● An “HTCondor user” is represented as Owner@Domain
     ● Yes, priorities are based on the User, not the Owner
     ● In most setups, the Owner is the “Login User Name” on the submit node
     ● The Domain may represent either the submit node itself, or a set of submit nodes that share the same Owner identification policies
     ● Both rules are defined by the HTCondor admin and cannot be changed by the end user
  7. User priorities
     ● By default, the Negotiator treats all users equally
     ● You get fair-share out of the box
     ● Each user is assigned a priority number
     ● The lower, the better
     ● Two users with the same priority number each get, on average, half of the Slots
     ● User priority asymptotically steers toward the number of Slots used
     ● Both up and down
     http://research.cs.wisc.edu/htcondor/manual/v7.8/3_4User_Priorities.html#SECTION00444000000000000000
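
The "steers toward the number of Slots used" behaviour can be illustrated with a small simulation. This is only a sketch: it assumes the exponential-smoothing update described in the HTCondor manual, with a half-life of one day (the documented default of PRIORITY_HALFLIFE). The function names and the one-hour update step are made up for illustration and are not part of HTCondor.

```python
def step_priority(priority, slots_used, dt_seconds, half_life_seconds=86400.0):
    """One update of a user's priority number (assumed smoothing formula).

    The old value decays with the given half-life while the value is
    pulled toward the number of slots currently in use.
    """
    beta = 0.5 ** (dt_seconds / half_life_seconds)
    return beta * priority + (1.0 - beta) * slots_used


def simulate(start_priority, slots_used, hours, dt_seconds=3600.0):
    """Apply the update once per hour (hypothetical interval) for `hours` hours."""
    priority = start_priority
    for _ in range(hours):
        priority = step_priority(priority, slots_used, dt_seconds)
    return priority


if __name__ == "__main__":
    # A user who starts using 100 slots: the number climbs toward 100.
    print(simulate(0.0, 100.0, hours=24))
    # A user who stops running jobs: the number decays back down.
    print(simulate(100.0, 0.0, hours=24))
```

Under these assumptions, after one half-life the priority number has covered half the distance to the current usage, in either direction.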
  8. Special users
     ● If not all users are equally important, the Negotiator supports
     ● Accounting groups
       – When you need to group users
     ● Priority factors
       – Works on a user-by-user basis
     ● The two mechanisms can be combined
     http://research.cs.wisc.edu/htcondor/manual/v7.8/3_4User_Priorities.html
  9. Accounting groups
     ● Users can be combined into accounting groups
     ● The Negotiator defines the groups, but jobs specify which group they belong to
     ● Each group can be given a quota
     ● Can be absolute or relative to the size of the pool
     ● Sum of running jobs in the group cannot exceed it
     ● If the quotas add up to more than 100%, they can be used for relative priority
     ● Here higher is better
     ● Each group will be given, on average, quota_G/sum(quotas) of slots
     ● Jobs without any group may never get anything
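
As a worked example of the quota_G/sum(quotas) rule, here is a minimal sketch. The second group name, the quota values and the pool size are hypothetical; only the formula comes from the slide.

```python
def expected_slots(quotas, pool_slots):
    """Average number of slots per group: pool_slots * quota_G / sum(quotas)."""
    total = sum(quotas.values())
    if total <= 0:
        raise ValueError("at least one group needs a positive quota")
    return {group: pool_slots * quota / total for group, quota in quotas.items()}


if __name__ == "__main__":
    # Hypothetical relative quotas that add up to 150% of the pool.
    quotas = {"group_higgs": 60, "group_susy": 90}
    for group, slots in expected_slots(quotas, pool_slots=1000).items():
        print(group, slots)
```

With these made-up numbers, group_higgs averages 400 slots and group_susy 600 slots out of 1000, since only the ratio between quotas matters once they oversubscribe the pool.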
  10. Mapping jobs to A.G.
      ● Users must specify which group they belong to
      ● No automatic mapping or validation in Condor
      ● Based on trust
      ● Jobs must add to their submit file +AccountingGroup = "<group>.<owner>"

          Universe   = vanilla
          Executable = cosmos
          Arguments  = -k 1543.3
          Output     = cosmos.out
          Input      = cosmos.in
          Log        = cosmos.log
          +AccountingGroup = "group_higgs.frieda"
          Queue 1
  11. Mapping jobs to A.G. (continued)
      ● “AccountingGroup@Domain” is effectively the identifier used by the Negotiator for Priority purposes
      ● With the default being A.G. == Owner
  12. Priority Factors
      ● Each user can be assigned a Priority Factor
      ● PF>1 will reduce a user's priority
        – If users X and Y have PF_X = (N-1)*PF_Y, on average user X gets 1/N of slots (with user Y getting the rest)
      ● Can be managed with the command-line tool condor_userprio
        – Only a superuser can set the factor

          $ condor_userprio -all -allusers |grep user1@node1
          group1.user1@node1 8016.22 8.02 10.00 0 15780.63 11/23/2011 05:59 11/30/2012 20:37
          $ condor_userprio -setfactor group1.user1@node1 1000
          The priority factor of group1.user1@node1 was set to 1000.000000
          $ condor_userprio -all -allusers |grep user1@node1
          group1.user1@node1 8016.22 8.02 1000.00 0 15780.63 11/23/2011 05:59 11/30/2012 20:37

      ● The admin has likely set a high default PF (e.g. 1000)
        – PF cannot go below 1
      http://research.cs.wisc.edu/htcondor/manual/v7.8/2_7Priorities_Preemption.html#sec:user-priority-explained
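
The PF_X = (N-1)*PF_Y rule follows if each user's long-run share is taken to be proportional to 1/PF. The sketch below makes that assumption explicit, and also assumes every user always has idle jobs waiting; the user names and factor values are hypothetical.

```python
def expected_shares(priority_factors):
    """Long-run fraction of slots per user, assuming share is proportional to 1/PF."""
    for user, factor in priority_factors.items():
        if factor < 1:
            # The slide notes that a priority factor cannot go below 1.
            raise ValueError(f"priority factor for {user} must be >= 1")
    weights = {user: 1.0 / factor for user, factor in priority_factors.items()}
    total = sum(weights.values())
    return {user: weight / total for user, weight in weights.items()}


if __name__ == "__main__":
    # N = 4: user X has a factor (N-1) = 3 times larger than user Y.
    shares = expected_shares({"X": 3000.0, "Y": 1000.0})
    print(shares)
```

With N = 4, user X ends up with about 1/4 of the slots and user Y with the remaining 3/4, matching the rule on the slide.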
  13. Efficiency trade-off
      ● After getting a Slot, the schedd will keep it for an extended period of time
      ● i.e. it will schedule several jobs of the same user on it
        – Configurable, but it is a trade-off; in glideinWMS, the default is the lifetime of the glidein
      ● For efficiency reasons
        – The Negotiator can take a few minutes to do the matching
      ● As a side effect
      ● A low priority user may keep the execute node even if jobs from a higher priority user show up
  14. Preemption
      ● HTCondor has the notion of preemption
      ● If a job from a higher priority user shows up, the Negotiator may instruct an execute node to kill the running job and re-negotiate
      ● Yes, all work done to that point is lost (unless the job is able to checkpoint)
      ● Disabled by default on glideinWMS systems
  15. Submit node limits
      ● HTCondor resource usage on the submit node scales with the number of running jobs
      ● So an admin will likely set a limit (MAX_JOBS_RUNNING)
      ● If the submit node gets close to the limit, you are likely to see “weird behavior”
      ● The Negotiator will try to be fair, and distribute the remaining wiggle room among several users with a similar priority number
      ● Remember: User priority is a dynamic property
  16. Monitoring per-user usage
      ● Submitter ClassAds provide per-user info
      ● But one ClassAd per submit node

          $ condor_status -submitters

          Actual ClassAds:
          Name                  Machine     Running  IdleJobs  HeldJobs
          uscms3024@cmsanalysi  glidein-2.      802       299         1
          uscms3024@cmsanalysi  submit-2.t     2063      1131         0
          uscms3044@cmsanalysi  submit-2.t      663       344         0
          uscms3045@cmsanalysi  submit-2.t        0         1         0

          Summary:
                                RunningJobs  IdleJobs  HeldJobs
          uscms3024@cmsanalysi         2865      1430         1
          uscms3044@cmsanalysi          663       344         0
          uscms3045@cmsanalysi            0         1         0
          Total                        3528      1775         1

      ● The long format contains info about limits
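
The relationship between the per-submit-node ClassAds and the summary can be reproduced with a few lines of Python. This is only an illustrative sketch: the rows are copied from the listing on the slide (with names truncated exactly as shown there), the helper names are made up, and no HTCondor API is called.

```python
from collections import defaultdict

# (Name, Machine, Running, IdleJobs, HeldJobs), as shown on the slide.
ROWS = [
    ("uscms3024@cmsanalysi", "glidein-2.", 802, 299, 1),
    ("uscms3024@cmsanalysi", "submit-2.t", 2063, 1131, 0),
    ("uscms3044@cmsanalysi", "submit-2.t", 663, 344, 0),
    ("uscms3045@cmsanalysi", "submit-2.t", 0, 1, 0),
]


def summarize(rows):
    """Add up running/idle/held counts per user across all submit nodes."""
    totals = defaultdict(lambda: [0, 0, 0])
    for name, _machine, running, idle, held in rows:
        counts = totals[name]
        counts[0] += running
        counts[1] += idle
        counts[2] += held
    return {name: tuple(counts) for name, counts in totals.items()}


def grand_total(summary):
    """Column-wise sum over all users."""
    return tuple(sum(column) for column in zip(*summary.values()))


if __name__ == "__main__":
    summary = summarize(ROWS)
    for name, counts in summary.items():
        print(name, *counts)
    print("Total", *grand_total(summary))
```

The per-user sums match the summary section of the condor_status output: the two ClassAds for uscms3024 collapse into a single line.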
  17. Job Priorities
  18. Priority-FIFO
      ● So, a user will have many jobs
      ● In which order will they be executed?
      ● HTCondor guarantees the Priority-FIFO policy
      ● Each job has a priority associated with it
        – User-specific: will not affect priority between users
      ● Jobs in the same priority class will start in FIFO order
      ● Jobs with higher priority always start before jobs with lower priority
        – i.e. higher priority is better
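
The Priority-FIFO policy amounts to a two-key sort: highest job priority first, then earliest submission first. A minimal sketch follows; the Job class and its field names are hypothetical and are not HTCondor attribute names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    job_id: str       # hypothetical identifier
    priority: int     # higher is better
    submit_time: int  # e.g. seconds since the first submission


def start_order(jobs):
    """Order one user's jobs by descending priority, then by submission time."""
    return sorted(jobs, key=lambda job: (-job.priority, job.submit_time))


if __name__ == "__main__":
    queue = [
        Job("A", priority=0, submit_time=1),
        Job("B", priority=5, submit_time=2),
        Job("C", priority=0, submit_time=3),
        Job("D", priority=5, submit_time=4),
    ]
    print([job.job_id for job in start_order(queue)])
```

Jobs B and D share the higher priority, so they go first in submission order, followed by A and C.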
  19. Non-uniform environments
      ● Of course, everything is contingent on matching
      ● P-FIFO only applies to jobs that match at least one Slot
      ● If not all Slots are uniform
      ● Lower priority (or later submitted) Jobs may start before higher priority (or earlier submitted) Jobs if the latter do not match any Unclaimed Slots
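
The effect of matching on the start order can be sketched as "take the first job, in P-FIFO order, that fits the free slot". The memory requirement used here is just a stand-in for whatever the job requirements are; all field names and numbers are hypothetical.

```python
def next_job_for_slot(jobs, slot_memory_mb):
    """Return the first job in Priority-FIFO order that fits the slot, or None.

    Each job is a dict with hypothetical keys: "id", "priority",
    "submit_time" and "min_memory_mb".
    """
    ordered = sorted(jobs, key=lambda job: (-job["priority"], job["submit_time"]))
    for job in ordered:
        if job["min_memory_mb"] <= slot_memory_mb:
            return job
    return None


if __name__ == "__main__":
    jobs = [
        {"id": "big", "priority": 10, "submit_time": 1, "min_memory_mb": 8000},
        {"id": "small", "priority": 0, "submit_time": 2, "min_memory_mb": 1000},
    ]
    # A small slot frees up: the high priority job does not match it,
    # so the low priority job starts first.
    print(next_job_for_slot(jobs, slot_memory_mb=2000)["id"])
    # A large slot frees up: the high priority job wins as usual.
    print(next_job_for_slot(jobs, slot_memory_mb=16000)["id"])
```

This is why a low priority job can overtake a high priority one without any violation of the Priority-FIFO guarantee.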
  20. Job restarts
      ● If an execute node dies for whatever reason, HTCondor will try to restart the job that was running there somewhere else
      ● In a typical (glidein) setup, it will get the next available matching slot for that user
      ● i.e. it will not preempt a lower priority job
  21. Multiple submit nodes
      ● The same user may have submitted jobs on many submit nodes
      ● Here assuming they share the same Domain name
      ● Each submit node will handle its jobs on its own
      ● No guarantee on the execution order between jobs on different nodes
      ● HTCondor will try to Round-Robin between them
      ● NEW: In 7.9.x, HTCondor can be configured to treat the Job priority as a global property
      ● i.e. high priority jobs first, no matter which submitter
      ● But still no guarantee within the priority class
  22. Priorities in glideinWMS
  23. None
      ● The glideinWMS layer does not handle priorities in any shape or form
      ● All jobs from all users are treated the same
      ● Although it may create different execute node requirements for some of them
        – But it is effectively a binary decision
  24. The End
  25. Pointers
      ● HTCondor Home Page: http://research.cs.wisc.edu/htcondor/
      ● HTCondor support: htcondor-users@cs.wisc.edu, htcondor-admin@cs.wisc.edu
  26. Acknowledgments
      ● The creation of this document was sponsored by grants from the US NSF and US DOE, and by the University of California system
