Cw13 0.01-final
Presentation on miscreant jobs in HTCondor, presented at HTCondor Week 2013, showing how to reduce the number of bad jobs run and increase the chances of good jobs running quickly.

Transcript

  • 1. Making HTCondor Energy Efficient by identifying miscreant jobs. Stephen McGough, Matthew Forshaw & Clive Gerrard, Newcastle University; Stuart Wheater, Arjuna Technologies Limited.
  • 2. Outline: Motivation | Policy and Simulation | Conclusion.
    Task lifecycle: an HTC user submits a task and resource selection places it on a computer. A run ends when an interactive user logs in, the computer reboots, or the task is otherwise evicted; each eviction is followed by another resource selection, until the task eventually completes. Tasks are 'Good' (will complete) or 'Bad' (will never complete); a 'Miscreant' has been evicted, but we don't yet know whether it is good or bad.
  • 3. Motivation
    – We have run a high-throughput cluster for ~6 years, allowing many researchers to perform more work more quickly.
    – The University has a strong desire to reduce energy consumption and CO2 production; it is currently powering down computers and buying low-power PCs: "If a computer is not 'working' it should be powered down."
    – Can we go further to reduce wasted energy? Reduce the time computers spend running work which does not complete; prevent re-submission of 'bad' jobs; reduce the number of resubmissions for 'good' jobs.
    – Aims: investigate policy for reducing energy consumption, and determine the impact on high-throughput users.
  • 4. Can we fix the number of retries?
    [Figure: cumulative wasted seconds (millions) vs number of evictions, for good and bad jobs]
    – ~57 years of computing time during 2010, of which ~39 years was wasted: ~27 years on 'bad' tasks (average 45 retries, max 1,946) and ~12 years on 'good' tasks (average 1.38 retries, max 360).
    – 100% 'good' task completion -> 360 retries, but this still wastes ~13 years on 'bad' tasks.
    – 95% 'good' task completion -> 3 retries: 9,808 good tasks killed (3.32%).
    – 99% 'good' task completion -> 6 retries: 2,022 good tasks killed (0.68%).
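The retry-cap trade-off on this slide can be reproduced for any trace with a short sketch (the function names and the sample data are illustrative assumptions, not the authors' code): given the eviction counts of tasks known to be good, find the smallest cap that lets a target fraction of them complete, and count how many good tasks a given cap would kill.

```python
import math

def retry_cap(eviction_counts, target_fraction):
    """Smallest retry cap n such that at least target_fraction of the
    (good) tasks are evicted no more than n times, i.e. still complete."""
    counts = sorted(eviction_counts)
    k = math.ceil(target_fraction * len(counts))  # tasks that must survive
    return counts[k - 1]

def good_tasks_killed(eviction_counts, cap):
    """Good tasks abandoned because they exceed the retry cap."""
    return sum(1 for c in eviction_counts if c > cap)
```

On the talk's 2010 trace these two quantities are exactly what the slide reports: e.g. a cap of 3 retries reaches 95% good-task completion at the cost of 9,808 good tasks killed.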
  • 5. Can we make tasks short enough?
    [Figure: cumulative free seconds vs idle time (minutes)]
    – Make tasks short enough to reduce miscreants. The average idle interval is 371 minutes.
    – But to ensure availability of intervals: 95% availability requires reducing the time limit to 2 minutes; 99% requires 1 minute.
    – Impractical to make tasks this short.
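The availability figures above come from comparing task length against the distribution of idle intervals; a minimal sketch of that calculation (illustrative function name and data):

```python
def interval_availability(idle_minutes, task_minutes):
    """Fraction of idle intervals long enough for a task of the given
    length to run to completion without being interrupted."""
    long_enough = sum(1 for m in idle_minutes if m >= task_minutes)
    return long_enough / len(idle_minutes)
```

Applied to the real trace, this is the calculation showing that 95% availability requires tasks of at most 2 minutes, despite a 371-minute average interval: the interval distribution is heavily skewed by a few very long gaps.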
  • 6. Cluster Simulation
    – High-level simulation of Condor, driven by trace logs from a twelve-month period: user logins/logouts (which computer was used) and Condor job submission times ('good'/'bad' and duration).
    – High-end applications are unlikely to migrate to virtual desktops or user-owned devices due to hardware requirements and licensing conditions, so we expect to need to maintain a pool of hardware that will be useful for Condor for some time.
    – PUE values have been assigned at the cluster level, in the range 0.9 to 1.4. These values have not been empirically evaluated but are used here to steer jobs. In most cases the cluster rooms have a low enough computer density not to require cooling, giving those clusters a PUE of 1.0. However, two clusters are located in rooms that require air conditioning, giving these a PUE of 1.4. Likewise, four clusters are based in a basement room which is cold all year round; computer heat is used to offset heating requirements for the room, giving a PUE of 0.9.
    – By default, computers within the cluster enter the sleep state after a given interval of inactivity, which depends on whether the cluster is open. During open hours computers remain idle for one hour before entering the sleep state; outside these hours the idle interval before sleep is reduced to 15 minutes. This policy (P2) was originally trialled under Windows XP, where the time for computers to resume from the shutdown state was considerable (sleep was an unreliable option for our environment). Likewise, the time interval before a Condor job could start using a computer (M1) was set to 15 minutes during cluster opening hours and 0 minutes outside.

    Table I: Computer Types
    Type       Cores   Speed    Power: Active / Idle / Sleep
    Normal     2       ~3 GHz   57 W / 40 W / 2 W
    High End   4       ~3 GHz   114 W / 67 W / 3 W
    Legacy     2       ~2 GHz   100-180 W / 50-80 W / 4 W

    – Figure 3 illustrates the interactive logins for this period, showing the high degree of seasonality in the data: weekdays and weekends are easy to distinguish, as are the three terms and the vacations. This represents 1,229,820 interactive uses of the computers.
    – Figure 4 (Condor job submission profile) depicts the 532,467 job submissions made to Condor during this period; the submissions follow no clearly definable pattern. Of these, 131,909 were later killed by the original Condor user. To simulate these killed jobs, the simulation assumes they are non-terminating and keeps resubmitting them to resources until the time at which the high-throughput user terminates them. The graph is clipped on Thursday 03/06/2010, as this date had 93,000 job submissions.
    – For the simulations we report the total power consumed (in MWh) for the period. To determine the effect of a policy on high-throughput users we also report the average overhead observed by jobs submitted to Condor (in seconds), where overhead is the amount of time in excess of the execution duration of the job. Other statistics are reported as appropriate.
    [Figure: system architecture — interactive users and cycle-stealing users share computers moving between active, idle and sleep states (woken via WOL) under a high-throughput management policy]
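The energy accounting implied by Table I and the per-cluster PUE values can be sketched as follows (the dictionary layout and function name are assumptions; the Legacy ranges are simplified to midpoints for illustration):

```python
# Power figures (watts) from Table I of the paper.
POWER_W = {
    "normal":   {"active": 57,  "idle": 40, "sleep": 2},
    "high_end": {"active": 114, "idle": 67, "sleep": 3},
    "legacy":   {"active": 140, "idle": 65, "sleep": 4},  # midpoints of ranges
}

def energy_kwh(comp_type, hours_by_state, pue):
    """Energy drawn by one computer over the given per-state hours,
    scaled by its cluster's PUE (0.9-1.4 in the paper)."""
    draw = POWER_W[comp_type]
    watt_hours = sum(draw[state] * hours for state, hours in hours_by_state.items())
    return pue * watt_hours / 1000.0  # watt-hours -> kWh
```

Summing this over all computers and the whole trace period gives the total consumption (reported in MWh) that the later policy comparisons optimise.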
  • 7. n reallocation policies
    [Figure: number of good tasks killed, energy consumption (MWh) and overheads on all good tasks vs number of retries (n), for policies N1-N3 combined with C1-C3]
    – N1(n): Abandon task if deallocated n times.
    – N2(n): Abandon task if deallocated n times, ignoring interactive users.
    – N3(n): Abandon task if deallocated n times, ignoring planned machine reboots.
    – C1: Tasks allocated to resources at random, favouring awake resources.
    – C2: Target less-used computers (longer idle times).
    – C3: Tasks are allocated to computers in clusters with the least amount of time used by interactive users.
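The three N policies differ only in which deallocation causes count toward the cap; a minimal sketch (the event-cause labels and record format are assumptions, not the authors' simulator):

```python
def should_abandon(deallocation_causes, policy, n):
    """N1 counts every deallocation; N2 ignores those caused by
    interactive logins; N3 ignores planned machine reboots."""
    ignored = {
        "N1": set(),
        "N2": {"interactive_login"},
        "N3": {"planned_reboot"},
    }[policy]
    counted = sum(1 for cause in deallocation_causes if cause not in ignored)
    return counted >= n
```

The filtering is why N2 performs well in the results: evictions caused by interactive users say nothing about whether the task itself is bad, so not counting them avoids killing good tasks that were simply unlucky in their placement.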
  • 8. Accrued Time Policies
    [Figure: number of good tasks killed, energy consumption (MWh) and overheads on all good tasks vs accrued time (hours), for policies A1-A3 combined with C1-C3]
    – Impose a limit on cumulative execution time for a task.
    – A1(t): Abandon if accrued time > t and task deallocated.
    – A2(t): As A1, discounting interactive users.
    – A3(t): As A1, discounting reboots.
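A sketch of the accrued-time bookkeeping, under the assumption that "discounting" a cause means time from runs ended by that cause does not count toward the limit t (the record format is also an assumption):

```python
def should_abandon_accrued(executions, policy, t_hours):
    """executions: list of (hours_run, end_cause) pairs for one task.
    A1 counts all accrued execution time; A2 discounts runs ended by
    interactive logins; A3 discounts runs ended by planned reboots."""
    ignored = {
        "A1": set(),
        "A2": {"interactive_login"},
        "A3": {"planned_reboot"},
    }[policy]
    accrued = sum(hours for hours, cause in executions if cause not in ignored)
    return accrued > t_hours
```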
  • 9. Individual Time Policy
    [Figure: energy consumption (MWh), number of good tasks killed and overheads on all good tasks vs individual time (hours), for policy I1 combined with C1-C3]
    – Impose a limit on individual execution time for a task. Nightly reboots bound this to 24 hours; what is the impact of lowering it?
    – I1(t): Abandon if individual time > t.
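Limits of this general shape can also be expressed directly to HTCondor. A hedged sketch of a submit-description-file fragment using standard job ClassAd attributes (NumJobStarts counts executions, RemoteWallClockTime accrues wall-clock time across them, JobCurrentStartDate marks the current run); the thresholds are illustrative, and this is not how the talk's policies were actually enforced (they were evaluated in the authors' simulator):

```
# Illustrative only: roughly N1(n=6), A1(t=50h) and I1(t=24h) in one expression
periodic_remove = (NumJobStarts > 6) || (RemoteWallClockTime > 50 * 3600) || ((JobStatus == 2) && (time() - JobCurrentStartDate > 24 * 3600))
```

(JobStatus == 2 means the job is currently running, so the last clause bounds the individual execution time of the run in progress.)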
  • 10. Dedicated Resources
    [Figure: energy consumption (MWh), number of good tasks killed and overheads on all good tasks vs maximum execution duration (hours), for m = 10-40 dedicated resources and n = 10-30]
    – D1(m,d): Miscreant tasks are permitted to continue executing on a dedicated set of m resources (without interactive users or reboots), with a maximum duration d.
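The bookkeeping D1(m, d) implies can be sketched as a small pool object (class and method names are assumptions): miscreant tasks wait for one of m dedicated machines and are abandoned once a run there exceeds d hours.

```python
class DedicatedPool:
    """m dedicated machines (no interactive users, no reboots) reserved
    for miscreant tasks, each run capped at d hours."""

    def __init__(self, m, d_hours):
        self.free_slots = m
        self.d_hours = d_hours

    def admit(self, task_id):
        """Try to place a miscreant task on a dedicated machine."""
        if self.free_slots == 0:
            return False  # task must wait for a slot
        self.free_slots -= 1
        return True

    def check(self, task_id, hours_run):
        """Abandon the task once it exceeds the maximum duration d,
        freeing its slot for the next miscreant task."""
        if hours_run > self.d_hours:
            self.free_slots += 1
            return "abandoned"
        return "running"
```

Because dedicated machines never evict, any miscreant that is actually good completes here unless it genuinely needs more than d hours, which is why the good-tasks-killed counts on this slide are so low.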
  • 11. Conclusion
    – Simple policies can be used to reduce the effect of miscreant tasks in a multi-use cycle-stealing cluster. N2 (total evictions, ignoring interactive users) gives an order-of-magnitude reduction in energy consumption by reducing the effort wasted on tasks that will never complete.
    – Policies may be combined to achieve further improvements, e.g. adding in dedicated computers.
  • 12. Questions? stephen.mcgough@ncl.ac.uk, m.j.forshaw@ncl.ac.uk. More info: McGough, A. Stephen; Forshaw, Matthew; Gerrard, Clive; Wheater, Stuart; Reducing the Number of Miscreant Tasks Executions in a Multi-use Cluster, Cloud and Green Computing (CGC), 2012.