Programming the cloud

with Skywriting"

         Derek Murray"
Outline"
•  Existing parallel programming models"

•  Introduction to Skywriting"

•  Ongoing work"
SMP programming"
•  Symmetric multiprocessing"
  –  All cores share the same address space"
  –  Usually assumes cache-coh...
Distributed programming"
•  Shared nothing*"
  –  Processors communicate by message passing"
  –  Standard assumption for ...
Task parallelism"
•  Master-worker architecture"
  –  Master maintains task queue"
  –  Workers execute independent tasks ...
Task graph"

     A runs before B"

A"                      B"

     B depends on A"
MapReduce"
•  Two kinds of task: map and reduce"
•  User-specified map and reduce functions"
  –  map() a record to a list ...
Dryad"
•  Task graph is first class"
  –  Vertices run arbitrary code"
  –  Multiple inputs and inputs"
  –  Channels speci...
Architecture"

               Code"

 Driver
      Results"
program"
while	
  (!converged)	
  
	
  	
  	
  	
  do	
  work	
  in	
  parallel;	
  
Existing systems"

                        Code"

     Driver
           Results"
    program"
                        Cod...
Existing systems"

                        Code"

     Driver
           Results"
    program"
                        Cod...
Skywriting"
                  while	
  (…)	
  
                  	
  doStuff();	
  
                       Code"



      ...
Spawning a Skywriting task"

function	
  f(arg1,	
  arg2)	
  {	
  …	
  }	
  

result	
  =	
  spawn(f,	
  [arg1,	
  arg2]);...
Building a task graph"
function	
  f(x,	
  y)	
  {	
  …	
  }	
  
function	
  g(x,	
  y)	
  {	
  …	
  }	
             f	
  ...
Iterative algorithm"
current	
  =	
  …;	
  
do	
  {	
  
	
  	
  	
  	
  prev	
  =	
  current;	
  
	
  	
  	
  	
  a	
  =	
...
Iterative algorithm"
f	
      f	
     f	
  


         g	
             h	
  


f	
      f	
     f	
  
Aside: recursive algorithm"
function	
  f(x)	
  {	
  
	
  	
  	
  	
  if	
  (/*	
  x	
  is	
  small	
  enough	
  */)	
  {	...
Performance case studies"
•  All experiments used Amazon EC2"
  –  m1.small instances, running Ubuntu 8.10"


•  Microbenc...
Job creation overhead"
                      60"
Overhead (seconds)"



                      50"
                      40...
Smith-Waterman data flow"
Parallel Smith-Waterman"
Parallel Smith-Waterman"
                  350"
                  300"
Time (seconds)"




                  250"
        ...
Future work: manycore"
•  Trade-offs are different"
  –  Centralised master may become a bottleneck"
  –  Switch to local ...
Future work: message-passing"
•  Language designed for batch processing"
  –  However, batches may be infinitely long!"
•  ...
Conclusions"
•  Turing-complete programming language
   for cloud computing"

•  Runs real jobs with low overhead"

•  Lot...
Questions?"
•  Email"
  –  Derek.Murray@cl.cam.ac.uk"
•  Project website"
  –  http://www.cl.cam.ac.uk/netos/skywriting/	
...
Upcoming SlideShare
Loading in …5
×

Programming the cloud with Skywriting

2,020 views
1,907 views

Published on

An introduction to cloud programming models and the Skywriting project. Talk originally given at the University of Cambridge, on 11th June 2010.

More information about the Skywriting project can be found here: http://www.cl.cam.ac.uk/netos/skywriting/

Published in: Technology
0 Comments
0 Likes
Statistics
Notes
  • Be the first to comment

  • Be the first to like this

No Downloads
Views
Total views
2,020
On SlideShare
0
From Embeds
0
Number of Embeds
14
Actions
Shares
0
Downloads
27
Comments
0
Likes
0
Embeds 0
No embeds

No notes for slide

Programming the cloud with Skywriting

  1. 1. Programming the cloud
 with Skywriting" Derek Murray"
  2. 2. Outline" •  Existing parallel programming models" •  Introduction to Skywriting" •  Ongoing work"
  3. 3. SMP programming" •  Symmetric multiprocessing" –  All cores share the same address space" –  Usually assumes cache-coherency" •  Multi-threaded programs" –  Threads created dynamically" –  Shared, writable data structures" –  Synchronisation" •  Atomic compare-and-swap" •  Mutexes, Semaphores" •  {Software, Hardware} Transactional Memory"
  4. 4. Distributed programming" •  Shared nothing*" –  Processors communicate by message passing" –  Standard assumption for large supercomputers" –  ...and data centres" –  ...and recent manycore machines (e.g. Intel SCC)" •  Explicit messages" –  MPI, Pregel, Actor-based" •  Implicit messages: task parallelism" –  MapReduce, Dryad"
  5. 5. Task parallelism" •  Master-worker architecture" –  Master maintains task queue" –  Workers execute independent tasks in parallel" •  Fault tolerance" –  Re-execute tasks on failed workers" –  Speculatively re-execute “slow” tasks" •  Load balancing" –  Workers consume tasks at their own rate" –  Task granularity must be optimised"
  6. 6. Task graph" A runs before B" A" B" B depends on A"
  7. 7. MapReduce" •  Two kinds of task: map and reduce" •  User-specified map and reduce functions" –  map() a record to a list of key-value pairs" –  reduce() a key and the list of related values" •  Tasks apply functions to data partitions" M" R" M" R" M" R"
  8. 8. Dryad" •  Task graph is first class" –  Vertices run arbitrary code" –  Multiple inputs and inputs" –  Channels specify data transport" •  Graph must be acyclic and finite" –  Permits topological sorting" –  Prevents unbounded iteration"
  9. 9. Architecture" Code" Driver
 Results" program"
  10. 10. while  (!converged)          do  work  in  parallel;  
  11. 11. Existing systems" Code" Driver
 Results" program" Code" Driver program" Results" while  (…)   Code"  submitJob();   Results" Code" Results"
  12. 12. Existing systems" Code" Driver
 Results" program" Code" Driver program" while  (…)    submitJob();  
  13. 13. Skywriting" while  (…)    doStuff();   Code" Results" •  Turing-complete language for job specification" •  Whole job executed on the cluster"
  14. 14. Spawning a Skywriting task" function  f(arg1,  arg2)  {  …  }   result  =  spawn(f,  [arg1,  arg2]);   //  result  is  a  “future”   value_of_result  =  *result;  
  15. 15. Building a task graph" function  f(x,  y)  {  …  }   function  g(x,  y)  {  …  }   f   function  h(x,  y)  {  …  }   a   a   a  =  spawn(f,  [7,  8]);   g   g   b  =  spawn(g,  [a,  0]);   c  =  spawn(g,  [a,  1]);   d  =  spawn(h,  [b,  c]);   b   c   return  d;   h   d  
  16. 16. Iterative algorithm" current  =  …;   do  {          prev  =  current;          a  =  spawn(f,  [prev,  0]);          b  =  spawn(f,  [prev,  1]);          c  =  spawn(f,  [prev,  2]);          current  =  spawn(g,  [a,  b,  c]);          done  =  spawn(h,  [current]);   while  (!*done);    
  17. 17. Iterative algorithm" f   f   f   g   h   f   f   f  
  18. 18. Aside: recursive algorithm" function  f(x)  {          if  (/*  x  is  small  enough  */)  {                  return  /*  do  something  with  x  */;          }  else  {                  x_lo  =  /*  bottom  half  of  x  */;                  x_hi  =  /*  top  half  of  x  */;                  return  [spawn(f,  [x_lo]),                                  spawn(f,  [x_hi])];          }   }  
  19. 19. Performance case studies" •  All experiments used Amazon EC2" –  m1.small instances, running Ubuntu 8.10" •  Microbenchmark" •  Smith-Waterman"
  20. 20. Job creation overhead" 60" Overhead (seconds)" 50" 40" 30" Hadoop" 20" Skywriting" 10" 0" 0" 20" 40" 60" 80" 100" Number of workers"
  21. 21. Smith-Waterman data flow"
  22. 22. Parallel Smith-Waterman"
  23. 23. Parallel Smith-Waterman" 350" 300" Time (seconds)" 250" 200" 150" 100" 50" 0" 1" 10" 100" 1000" 10000" Number of tasks"
  24. 24. Future work: manycore" •  Trade-offs are different" –  Centralised master may become a bottleneck" –  Switch to local work-queues and work-stealing" –  Distributed scoreboards for futures" –  Optimised interpreter/compilation?" •  Multi-scale hybrid approaches" –  Multiple cores" –  Multiple machines" –  Multiple clouds..."
  25. 25. Future work: message-passing" •  Language designed for batch processing" –  However, batches may be infinitely long!" •  Can we apply it to streaming data?" –  Log files" –  Financial reports" –  Sensor data" •  Can we include data dependencies?" –  High- and low-frequency streams"
  26. 26. Conclusions" •  Turing-complete programming language for cloud computing" •  Runs real jobs with low overhead" •  Lots more still to do!"
  27. 27. Questions?" •  Email" –  Derek.Murray@cl.cam.ac.uk" •  Project website" –  http://www.cl.cam.ac.uk/netos/skywriting/  

×