Programming the cloud with Skywriting
An introduction to cloud programming models and the Skywriting project. Talk originally given at the University of Cambridge, on 11th June 2010.

More information about the Skywriting project can be found here: http://www.cl.cam.ac.uk/netos/skywriting/

Presentation Transcript

• Programming the cloud with Skywriting
  Derek Murray
• Outline
  • Existing parallel programming models
  • Introduction to Skywriting
  • Ongoing work
• SMP programming
  • Symmetric multiprocessing
    – All cores share the same address space
    – Usually assumes cache-coherency
  • Multi-threaded programs
    – Threads created dynamically
    – Shared, writable data structures
    – Synchronisation
      • Atomic compare-and-swap
      • Mutexes, semaphores
      • {Software, Hardware} Transactional Memory
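The shared-memory model on this slide can be sketched in a few lines of Python: several threads mutate a shared counter, and a mutex prevents lost updates. The counter, thread count, and increment count are illustrative, not from the talk.

```python
import threading

counter = 0
lock = threading.Lock()  # mutex guarding the shared, writable counter

def worker(increments):
    global counter
    for _ in range(increments):
        with lock:  # synchronise access to shared state
            counter += 1

# threads created dynamically, all sharing the same address space
threads = [threading.Thread(target=worker, args=(1000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # 4000: the lock serialises the increments
```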
• Distributed programming
  • Shared nothing*
    – Processors communicate by message passing
    – Standard assumption for large supercomputers
    – ...and data centres
    – ...and recent manycore machines (e.g. Intel SCC)
  • Explicit messages
    – MPI, Pregel, actor-based
  • Implicit messages: task parallelism
    – MapReduce, Dryad
• Task parallelism
  • Master-worker architecture
    – Master maintains task queue
    – Workers execute independent tasks in parallel
  • Fault tolerance
    – Re-execute tasks on failed workers
    – Speculatively re-execute “slow” tasks
  • Load balancing
    – Workers consume tasks at their own rate
    – Task granularity must be optimised
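A minimal single-machine sketch of the master-worker pattern, using Python threads in place of cluster workers (the squaring task and the pool size are made up for illustration): the master fills a queue with independent tasks, and workers consume them at their own rate.

```python
import queue
import threading

tasks = queue.Queue()    # the master's task queue
results = queue.Queue()

def worker():
    while True:
        task = tasks.get()
        if task is None:           # sentinel: no more work
            break
        results.put(task * task)   # execute an independent task

# master enqueues independent tasks
for i in range(10):
    tasks.put(i)

# workers pull tasks in parallel, balancing load implicitly
workers = [threading.Thread(target=worker) for _ in range(3)]
for w in workers:
    w.start()
for _ in workers:
    tasks.put(None)
for w in workers:
    w.join()

out = []
while not results.empty():
    out.append(results.get())
print(sorted(out))  # squares of 0..9
```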
• Task graph
  [Diagram: two tasks, A → B; A runs before B, i.e. B depends on A]
• MapReduce
  • Two kinds of task: map and reduce
  • User-specified map and reduce functions
    – map() a record to a list of key-value pairs
    – reduce() a key and the list of related values
  • Tasks apply functions to data partitions
  [Diagram: map tasks (M) feeding reduce tasks (R)]
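The two user-specified functions can be sketched with the classic word-count example in plain Python (the records and the in-memory shuffle stand in for distributed partitions and data transfer):

```python
from collections import defaultdict

def map_fn(record):
    # map() a record to a list of key-value pairs
    return [(word, 1) for word in record.split()]

def reduce_fn(key, values):
    # reduce() a key and the list of related values
    return (key, sum(values))

records = ["the cloud", "the sky"]

# map phase: apply map_fn to every record in every partition
pairs = [kv for record in records for kv in map_fn(record)]

# shuffle phase: group values by key
groups = defaultdict(list)
for k, v in pairs:
    groups[k].append(v)

# reduce phase: one reduce_fn call per key
counts = dict(reduce_fn(k, vs) for k, vs in groups.items())
print(counts)  # {'the': 2, 'cloud': 1, 'sky': 1}
```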
• Dryad
  • Task graph is first class
    – Vertices run arbitrary code
    – Multiple inputs and outputs
    – Channels specify data transport
  • Graph must be acyclic and finite
    – Permits topological sorting
    – Prevents unbounded iteration
• Architecture
  [Diagram: a driver program submits code to the cluster and receives results]
•   while (!converged)
        do work in parallel;
• Existing systems
  [Diagram: the driver program loops — while (…) submitJob(); — shipping code to the cluster and collecting results on each iteration]
• Skywriting
    while (…)
        doStuff();
  • Turing-complete language for job specification
  • Whole job executed on the cluster
• Spawning a Skywriting task
    function f(arg1, arg2) { … }
    result = spawn(f, [arg1, arg2]);
    // result is a “future”
    value_of_result = *result;
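For readers more familiar with Python, the spawn/dereference pair behaves much like submitting work to `concurrent.futures` — this is only an analogy, not the Skywriting runtime, and `f` and its arguments are placeholders:

```python
from concurrent.futures import ThreadPoolExecutor

def f(arg1, arg2):
    return arg1 + arg2

with ThreadPoolExecutor() as pool:
    result = pool.submit(f, 7, 8)      # like spawn(f, [arg1, arg2]): returns a future
    value_of_result = result.result()  # like *result: block until the value is ready

print(value_of_result)  # 15
```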
• Building a task graph
    function f(x, y) { … }
    function g(x, y) { … }
    function h(x, y) { … }
    a = spawn(f, [7, 8]);
    b = spawn(g, [a, 0]);
    c = spawn(g, [a, 1]);
    d = spawn(h, [b, c]);
    return d;
  [Diagram: f produces a; two g tasks consume a and produce b and c; h combines b and c into d]
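The same diamond-shaped dependency graph can be mimicked with Python futures: each downstream task dereferences its inputs' futures before running. The bodies of `f`, `g`, and `h` are arbitrary stand-ins (the talk leaves them unspecified).

```python
from concurrent.futures import ThreadPoolExecutor

def f(x, y): return x * y   # placeholder body
def g(x, y): return x + y   # placeholder body
def h(x, y): return x - y   # placeholder body

with ThreadPoolExecutor(max_workers=4) as pool:
    a = pool.submit(f, 7, 8)                            # a = spawn(f, [7, 8])
    b = pool.submit(lambda: g(a.result(), 0))           # b depends on a
    c = pool.submit(lambda: g(a.result(), 1))           # c depends on a
    d = pool.submit(lambda: h(b.result(), c.result()))  # d joins the two branches
    val = d.result()

print(val)  # (56 + 0) - (56 + 1) = -1
```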
• Iterative algorithm
    current = …;
    do {
        prev = current;
        a = spawn(f, [prev, 0]);
        b = spawn(f, [prev, 1]);
        c = spawn(f, [prev, 2]);
        current = spawn(g, [a, b, c]);
        done = spawn(h, [current]);
    } while (!*done);
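A Python analogue of the loop, again using `concurrent.futures` in place of Skywriting's spawn. The concrete `f`, `g`, `h` bodies and the convergence threshold are invented here purely so the loop terminates:

```python
from concurrent.futures import ThreadPoolExecutor

def f(prev, i): return prev + i        # one parallel work unit (illustrative)
def g(a, b, c): return a + b + c       # combine the partial results
def h(current): return current >= 100  # convergence test

current = 1
with ThreadPoolExecutor() as pool:
    done = False
    while not done:
        prev = current
        # three independent tasks spawned per iteration
        a = pool.submit(f, prev, 0)
        b = pool.submit(f, prev, 1)
        c = pool.submit(f, prev, 2)
        # dereferencing the futures joins the iteration's results
        current = g(a.result(), b.result(), c.result())
        done = h(current)

print(current)  # 1 -> 6 -> 21 -> 66 -> 201, which passes the threshold
```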
• Iterative algorithm
  [Diagram: each iteration spawns three f tasks feeding a g task, followed by an h task for the convergence test]
• Aside: recursive algorithm
    function f(x) {
        if (/* x is small enough */) {
            return /* do something with x */;
        } else {
            x_lo = /* bottom half of x */;
            x_hi = /* top half of x */;
            return [spawn(f, [x_lo]),
                    spawn(f, [x_hi])];
        }
    }
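One concrete instantiation of this divide-and-conquer skeleton, summing a list: the "small enough" threshold and the leaf work are invented for illustration, and the recursion runs sequentially here where Skywriting would spawn each half as a parallel task.

```python
def f(x):
    if len(x) <= 2:                  # /* x is small enough */ (assumed threshold)
        return sum(x)                # /* do something with x */
    mid = len(x) // 2
    x_lo = x[:mid]                   # bottom half of x
    x_hi = x[mid:]                   # top half of x
    # in Skywriting these two calls would be spawned as tasks
    return f(x_lo) + f(x_hi)

total = f(list(range(8)))
print(total)  # 0 + 1 + ... + 7 = 28
```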
• Performance case studies
  • All experiments used Amazon EC2
    – m1.small instances, running Ubuntu 8.10
  • Microbenchmark
  • Smith-Waterman
• Job creation overhead
  [Chart: overhead (seconds, 0–60) against number of workers (0–100), comparing Hadoop and Skywriting]
• Smith-Waterman data flow

• Parallel Smith-Waterman
• Parallel Smith-Waterman
  [Chart: time (seconds, 0–350) against number of tasks (1 to 10,000, log scale)]
• Future work: manycore
  • Trade-offs are different
    – Centralised master may become a bottleneck
    – Switch to local work-queues and work-stealing
    – Distributed scoreboards for futures
    – Optimised interpreter/compilation?
  • Multi-scale hybrid approaches
    – Multiple cores
    – Multiple machines
    – Multiple clouds...
• Future work: message-passing
  • Language designed for batch processing
    – However, batches may be infinitely long!
  • Can we apply it to streaming data?
    – Log files
    – Financial reports
    – Sensor data
  • Can we include data dependencies?
    – High- and low-frequency streams
• Conclusions
  • Turing-complete programming language for cloud computing
  • Runs real jobs with low overhead
  • Lots more still to do!
• Questions?
  • Email
    – Derek.Murray@cl.cam.ac.uk
  • Project website
    – http://www.cl.cam.ac.uk/netos/skywriting/