Composing and scaling data platforms


  1. Composing and Scaling Data Platforms. Rahul Kumar
  2. Talk Highlights: Data Representation, Architecture, Parallelism
  3. As software engineers, we are inevitably affected by the tools we surround ourselves with. Languages, frameworks, and processes all act to shape the software we build.
  4. Likewise, databases, which have trodden a very specific path, inevitably affect the way we treat mutability and share state in our applications.
  5. Today’s data platforms range greatly in complexity, from simple caching layers or polyglot persistence right through to wholly integrated data pipelines. There are many paths, and they go to many different places. The aim of this talk is to explain how and why some of these popular approaches work. This talk is based on Ben Stopford’s presentation: http://www.benstopford.com/2015/04/28/elements-of-scale-composing-and-scaling-data-platforms/
  6. Computers work best with sequential workloads. When we’re dealing with data, we’re really just arranging locality: locality to the CPU, and locality to the other data we need.
  7. Accessing data sequentially is an important component of this. Computers are simply good at sequential operations, because sequential operations can be predicted.
  8. Random vs. Sequential Addressing. If you’re reading data from disk sequentially, it will be pre-fetched into the disk buffer, the page cache, and the different levels of CPU cache. But this does little to help the addressing of data at random, be it in main memory, on disk, or over the network. In fact, pre-fetching actually hinders random workloads, as the various caches and the front-side bus fill with data that is unlikely to be used.
  9. Streaming data sequentially from disk can actually outperform randomly addressed main memory. So disk may not always be quite the tortoise we think it is, at least not if we can arrange sequential access.
  10. We want to keep writes and reads sequential, as this works well with the hardware. We can append writes to the end of the file efficiently. We can read by scanning the file in its entirety. Any processing we wish to do can happen as the data streams through the CPU. We might filter, aggregate, or even do something more complex.
  22. Parallelism
  29. Architecture
  38. Thank You
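
The append-and-scan pattern described on slide 10 (append writes to the end of a file, read by scanning the whole file, and process records as they stream past) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not code from the talk: the function names (`append_record`, `scan`, `total_for`, `demo`), the `key,value` line format, and the temporary file are all hypothetical choices made for the example.

```python
import os
import tempfile


def append_record(path, record):
    """Append one record as a line at the end of the file (a sequential write)."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(record + "\n")


def scan(path):
    """Yield every record in file order by reading the file from start to end."""
    with open(path, "r", encoding="utf-8") as f:
        for line in f:
            yield line.rstrip("\n")


def total_for(path, key):
    """Filter and aggregate while the records stream through:
    sum the integer values of all records whose key matches."""
    total = 0
    for record in scan(path):
        k, value = record.split(",", 1)  # assumed "key,value" format
        if k == key:
            total += int(value)
    return total


def demo():
    """Write three records to a temporary log, then aggregate key 'a'."""
    fd, path = tempfile.mkstemp(suffix=".log")
    os.close(fd)
    try:
        for rec in ["a,1", "b,5", "a,2"]:
            append_record(path, rec)
        return total_for(path, "a")
    finally:
        os.remove(path)
```

Nothing here seeks within the file: writes only ever go to the end, and reads only ever move forward, which is the access pattern the slides argue suits the hardware. The cost is that every read touches the whole file, so this sketch makes no attempt at indexing.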
