Cloud Camp Chicago Dec 2012 Slides

Slides from the December 2012 Cloud Camp Chicago, including the talks from our speakers: Dave Falck, Model Metrics: node.js on AWS; Paul Mantz, CohesiveFT: Working with APIs; Bob Chojnacki, Jellyvision Labs: Hadoop on AWS; Karl Zimmerman, Steadfast: Keep control with the Private Cloud


  1. Welcome to Cloud Chicago (sponsor and host logos). Live tweet on the second screen by using: #cloudcamp @cloudcamp_chi. Thursday, December 13, 2012
  2. Agenda
     • 6:00pm Registration, Food, Drinks and Networking
     • 6:30 Opening Remarks, Patrick Kerpan, CohesiveFT
     • 6:45 Lightning Talks
       – Dave Falck, Model Metrics: node.js on AWS
       – Paul Mantz, CohesiveFT: Working with APIs
       – Bob Chojnacki, Jellyvision Labs: Hadoop on AWS
       – Karl Zimmerman, Steadfast: Keep control with the Private Cloud
     • 7:45 Unpanel: "Who's in Control of Your Cloud? Security and Visibility", emceed by Mike Dorosh, IBM & Patrick Kerpan, CohesiveFT
     • 8:30 Breakout Sessions
     • 9:00 Wrap Up - Drinks, anyone?
  3. Dave Falck, Customer Solutions Engineer
  4. Node.js + AWS, @davidfalck
  5. Why the Node.js Buzz?
     * LinkedIn's entire mobile software stack is built in Node
     * Why? Scale.
     * Huge performance gains compared to what they were using before (Ruby on Rails)
     * Went from running 15 servers with 15 instances (virtual servers) on each physical machine to just four instances that can handle double the traffic
  6. What is Node.js?
     * JavaScript platform based on Google Chrome's V8 JS engine
     * Ryan Dahl (Joyent)
     * Event-driven, non-blocking I/O model to allow your applications to scale while keeping you from having to deal with threads, polling, timeouts, and event loops
     * FAST
     * Used for real-time, data-intensive apps (mobile!)
     * POPULAR
  7. Node.js on GitHub
  8. Hello World
       // Minimal HTTP server: answers every request with plain text.
       var http = require('http');
       http.createServer(function (req, res) {
         res.writeHead(200, {'Content-Type': 'text/plain'});
         res.end('Hello World\n');
       }).listen(1337, '127.0.0.1');
  9. What makes Node.js so fast?
     * Thread-based networking is inefficient and difficult
     * Node shows much better memory efficiency under high loads than systems which allocate 2 MB thread stacks for each connection
     * Users of Node are free from worries of dead-locking the process (*there are no locks*)
     * Almost no function in Node directly performs I/O, so the process never blocks
     * Because nothing blocks, less-than-expert programmers are able to develop fast systems
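The non-blocking claim on this slide can be illustrated with a minimal sketch. It assumes a current Node.js runtime and uses the built-in `fs.stat` call; the `order` array is only a name introduced here to record what ran when. It is an illustration, not code from the talk.

```javascript
const fs = require('fs');

// Records the order in which the two code paths run.
const order = [];

// Start an asynchronous file-system call. Node returns immediately and
// invokes the callback later, once the result is ready.
fs.stat(process.cwd(), function (err, stats) {
  if (err) throw err;
  order.push('stat finished');
  console.log(order.join(' -> '));
});

// Reached before the callback above runs, because fs.stat did not block.
order.push('main script continued');
```

The main script finishes its own work first and the callback runs afterwards, which is why a single thread can keep many connections in flight without per-connection thread stacks.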
  10. Under the Node.js hood: JavaScript?
  11. Under the Node.js hood
      * JavaScript!
        * Platform independent
        * Easy to use
        * Ubiquitous
      * Google Chrome's V8 JavaScript engine
        * Translates JS into machine code (not interpreted)
  12. When not to use Node.js
      * Node.js is not ideal for CPU-intensive jobs like sorting, transformations, number crunching, analytics…
      * Traditional CRUD web apps that need to be highly concurrent: performance degrades when the data needs to be transformed…
        * You can offload processing to another language that is better at making use of the CPU
      * Cultural fit? Too new? You decide…
  13. Node.js + AWS
      * Dec 6th: AWS released a developer preview of Node.js libraries to access AWS:
        * DynamoDB
        * S3
        * EC2
        * SWF
      * Allows you to manage parallel calls to several AWS web services
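As a rough sketch of what "parallel calls to several web services" looks like in Node, the example below starts three requests at once and waits for all of them. It does not use the AWS SDK: `fakeServiceCall` and `callAllServices` are hypothetical stand-ins that resolve after a timer, and the sketch assumes a current Node.js runtime with native promises.

```javascript
// Hypothetical stand-in for an SDK request: resolves after delayMs.
function fakeServiceCall(name, delayMs) {
  return new Promise(function (resolve) {
    setTimeout(function () {
      resolve(name + ' ok');
    }, delayMs);
  });
}

// Start all three requests before waiting on any of them, so the total
// wait is roughly the slowest call rather than the sum of all three.
function callAllServices() {
  return Promise.all([
    fakeServiceCall('DynamoDB', 30),
    fakeServiceCall('S3', 10),
    fakeServiceCall('EC2', 20)
  ]);
}

callAllServices().then(function (results) {
  // Results arrive in the order the calls were listed, not finish order.
  console.log(results);
});
```

With a real client library, each stand-in would be replaced by an actual request; the pattern of starting everything first and collecting results afterwards stays the same.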
  14. Node.js + Other Clouds
      * Azure
      * Joyent
      * EngineYard
      * Heroku
  15. More info
      * http://nodejs.org
      * http://en.wikipedia.org/wiki/Nodejs
      * http://aws.typepad.com/aws/2012/12/aws-sdk-for-nodejs-now-available-in-preview-form.html
      * http://www.jamesward.com/2011/06/21/getting-started-with-node-js-on-the-cloud/
      * http://venturebeat.com/2011/08/16/linkedin-node/
  16. Paul Mantz, Software Engineer
  17. APIs in Cloud Environments, Paul Mantz. Copyright CohesiveFT - Dec 13, 2012
  18. API Command-Line Clients: Benefits to Creating API Command-Line Clients
      • Lowers barrier of entry
      • Familiar to technical consumers
      • Advanced usage cases
      • Integrates into existing toolsets
  19. API Command-Line Clients: Excellent Internal Developer Tool
      • Excellent for testing and rapid development
      • Useful operations tool
  20. API Command-Line Clients: Reference Implementation
      • Gives developers an example to integrate the API
      • Helps users model workflows
      • DSL
  21. API Command-Line Clients: Excellent Demo Tool
      • Quick installation, often one file
  22. Bob Chojnacki, Programmer
  23. Big Data in the Cloud: A Journey into the unknown
  24. Who Jellyvision is and why analytics are important to us
      • We create interactive experiences
        – Desktop
        – Mobile
      • …which ask questions, inform people, generate leads
      • "Virtual Advisors"
      • We also collect analytics in real time to generate reports about:
        – How people answered a question
        – Where they dropped out
        – Lots of impressive stats!
  25. The Problem
      • Longer-term projects and high-volume projects are causing MySQL to burst at the seams
      • Some types of reports take too long, or cause MySQL to crash if we include too much data
      • In all fairness, we could probably tune MySQL, throw it on bigger servers, add more memory
      • Diminishing returns
      • MySQL is fine for collecting the data…
  26. The Solution
      • Hadoop!
      • Why Hadoop? Lots of possibilities out there, but which one to use? Cassandra, CouchDB, Hadoop, Membase, MongoDB, Neo4j, …
      • Big Data meetups tended to have lots of people using Hadoop
      • And I knew others using it.
      • And Hortonworks had a fancy point-and-click solution I could use to get started quickly
  27. Options with options
      • Now that I picked Hadoop, I had several options, and options within options, to use to analyze my data:
        – Hive, Pig, MapReduce, Java, R
      • I knew Java
      • MapReduce seemed to make sense
      • I'll probably play with Hive and Pig next
  28. It's All About The Data
      • Visit data
      • Event data
      • Denormalization of data
      • Generated a ton of fake data:
        – Started with 600K visits, 3M events
        – Moved up to 1.8M visits, 60M events
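To make the MapReduce idea concrete for this kind of denormalized event data, here is a small in-memory sketch of the map, shuffle, and reduce steps. It is not the Java job from the talk and does not use any Hadoop API; the record fields (`visitId`, `type`, `question`, `answer`) and all function names are assumptions made up for illustration.

```javascript
// Hypothetical denormalized event records, one object per event.
const events = [
  { visitId: 'v1', type: 'answer', question: 'q1', answer: 'yes' },
  { visitId: 'v2', type: 'answer', question: 'q1', answer: 'no' },
  { visitId: 'v3', type: 'answer', question: 'q1', answer: 'yes' },
  { visitId: 'v3', type: 'dropout', question: 'q2' }
];

// Map phase: emit a [key, 1] pair for every answer event, nothing otherwise.
function mapEvent(event) {
  if (event.type !== 'answer') return [];
  return [[event.question + ':' + event.answer, 1]];
}

// Shuffle: group the emitted values by key.
function shuffle(pairs) {
  const groups = new Map();
  pairs.forEach(function (pair) {
    if (!groups.has(pair[0])) groups.set(pair[0], []);
    groups.get(pair[0]).push(pair[1]);
  });
  return groups;
}

// Reduce phase: sum the values collected for one key.
function reduceCounts(values) {
  return values.reduce(function (sum, value) { return sum + value; }, 0);
}

// Run the three phases in sequence over an array of records.
function countAnswers(records) {
  const pairs = [];
  records.forEach(function (record) {
    mapEvent(record).forEach(function (pair) { pairs.push(pair); });
  });
  const result = {};
  shuffle(pairs).forEach(function (values, key) {
    result[key] = reduceCounts(values);
  });
  return result;
}

console.log(countAnswers(events));
```

On a cluster the map and reduce steps run on different machines over file splits, but the shape of the computation (emit keyed pairs, group by key, aggregate) is the same as in this single-process sketch.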
  29. Make it so
      • First experience: Hortonworks Virtual Sandbox
        – Single node AMI at Amazon
        – Hadoop 1.0
        – 600K visits, 3M events
      • On our existing platform we needed to break reports up into smaller chunks for some data because MySQL could not handle it.
      • Results! What would have taken hours took only 5 minutes on a single-node Hadoop "cluster"
      • In reality, some of the queries I could also run with command-line tools (wc, grep, awk) on the data considerably faster than even Hadoop.
      • Important lessons learned so far:
        – Think outside the RDBMS: they are great, but they may not make sense for all types of data
  30. Looking at more real data
      • Now, let's generate data that is much closer to some of our products
      • Instead of one question and answer, how about 15 questions? Adding in some other events gives a total of 34 events.
      • Throw in some people returning, some of them multiple times
      • Throw in some people who don't start the conversation, etc.
      • Run my little auto-data-generator and BOOM! 20 million events and 4.4GB later I have my data…
      • …which took up too much disk space to run on the demo system I was using. Might as well turbo-charge this puppy...
  31. More disk space!
      • Full install of Hadoop (Hortonworks HDP)
      • Single node
      • 600K visits, 20M events
        – 6m 29s, ~30s after map phase completed
      • 1.8M visits, 60M events
        – 18m 3s, ~90s after map phase completed
  32. More nodes
      • 3 nodes: 11m
      • 4 nodes: 9m 16s
      • Yay! Nodes!
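The timings on the last two slides can be turned into rough throughput and speedup figures. The single-node numbers come straight from the slides. The speedup part rests on an assumption the slides do not state: that the 3-node and 4-node runs processed the same 1.8M-visit, 60M-event data set as the 18m 3s single-node run. The function names are introduced here for illustration.

```javascript
// Convert a "minutes, seconds" wall-clock time to seconds.
function toSeconds(minutes, seconds) {
  return minutes * 60 + seconds;
}

// Single-node timings quoted on the "More disk space!" slide.
const singleNode = [
  { events: 20e6, seconds: toSeconds(6, 29) },
  { events: 60e6, seconds: toSeconds(18, 3) }
];

// Events processed per second of wall-clock time, rounded.
function throughput(run) {
  return Math.round(run.events / run.seconds);
}

// ASSUMPTION: the multi-node runs used the 60M-event data set, so the
// 18m 3s single-node run is the baseline for comparison.
const baselineSeconds = toSeconds(18, 3);
const multiNode = [
  { nodes: 3, seconds: toSeconds(11, 0) },
  { nodes: 4, seconds: toSeconds(9, 16) }
];

// How many times faster a run is than the single-node baseline.
function speedup(run) {
  return baselineSeconds / run.seconds;
}

singleNode.forEach(function (run) {
  console.log(run.events + ' events: ' + throughput(run) + ' events/s');
});
multiNode.forEach(function (run) {
  console.log(run.nodes + ' nodes: ' + speedup(run).toFixed(2) + 'x vs. one node');
});
```

If the assumption holds, adding nodes helps but well short of linearly, which fits the caveat that this was a weekend experiment rather than a tuned cluster.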
  33. Caveats
      • Not using Hadoop to its fullest / basically a weekend job
      • Algorithms employed in this example probably won't end up in a book alongside Knuth's
  34. Next steps
      • Make sure results on real data line up
      • Integrate with team to generate reports they need
  35. End stuff
      • Thanks to the folks at Hortonworks who answered my frantic and spastic questions.
  36. Karl Zimmerman, President
  37. Keep Your Control. Private Cloud with Karl Zimmerman, CEO of Steadfast.
  38. Private Cloud: What do we mean? Private cloud is a form of cloud computing where the customer has some control/ownership of the service implementation. It is a scalable, elastic IaaS solution based on cloud computing but with more control over resources.
  39. Private Cloud: What are the advantages?
      • Security
      • Availability
      • No vendor lock-in
      • Ease of management
  40. Private Cloud: Security
      • Dedicated & segregated resources
      • More options to integrate with existing security
  41. Private Cloud: Availability
      • Understanding and control of the infrastructure
      • Get the resources you need, when you need them
      • You're not subject to the whims of other users
  42. Private Cloud: Vendor Lock-In
      • No "secret sauce."
      • Utilize true open source
  43. Private Cloud: Management
      • Easier to find employees with general IT knowledge
      • Utilize a broader array of tools and software
      • Get support/assistance from multiple levels
  44. Private Cloud: To Summarize. Private cloud can deliver what you need out of a public cloud while giving you more control. Concerns about losing control over security and availability, and issues like vendor lock-in and management, vanish into thin air like, well, a cloud. And the fact that it doesn't have to cost you more is a plus, too.
  45. Unpanel: "Who's in Control of Your Cloud? Security and Visibility". Emceed by: Mike Dorosh, Program Manager, Cloud Technical Partnerships, IBM & Patrick Kerpan, CEO, CohesiveFT
  46. #cloudcamp @cloudcamp_chi