Finding Out More with Data Analytics and AWS

Published in: Technology

  1. 1. Finding out more from your advertising campaigns using data analytics
  2. 2. 1 instance for 100 hours
  3. 3. 1 instance for 100 hours Video encoding Customer activity analysis Twitter parsing …
  4. 4. 1 instance for 100 hours = 100 instances for 1 hour
  5. 5. Why would you need 100 instances? (or even more)
  6. 6. Big Data
  7. 7. Many definitions. Very large volume with low density of information. Three V’s: Velocity, Variety, Volume. Social interactions and web activity data. Amongst many others…
  8. 8. Storage Big Data Compute Unconstrained data growth. 95% of the 1.2 zettabytes of data in the digital universe is unstructured. 70% of this is user-generated content. Unstructured data growth is explosive, with estimates of compound annual growth (CAGR) at 62% from 2008 to 2012. Source: IDC. (Chart scale: GB, TB, PB, EB, ZB.)
  9. 9. Storage Big Data Compute Where does it come from? Web sites: Blogs/Reviews/Emails/Pictures. Social Graphs: Facebook, Linked-in, Contacts. Application server logs: Web sites, games. Sensor data: Weather, water, smart grids. Images/videos: Traffic, security cameras. Twitter: 50m tweets/day, 1,400% growth per year.
  10. 10. Why now?
  11. 11. Storage Big Data Compute Why now? Web sites: Blogs/Reviews/Emails/Pictures. Social Graphs: Facebook, Linked-in, Contacts. Application server logs: Web sites, games. Sensor data: Weather, water, smart grids. Images/videos: Traffic, security cameras. Twitter: 50m tweets/day, 1,400% growth per year.
  12. 12. Storage Big Data Compute Why now? Mobile connected world (more people using, easier to collect). Web sites: Blogs/Reviews/Emails/Pictures. Social Graphs: Facebook, Linked-in, Contacts. Application server logs: Web sites, games. Sensor data: Weather, water, smart grids. Images/videos: Traffic, security cameras. Twitter: 50m tweets/day, 1,400% growth per year.
  13. 13. Storage Big Data Compute Why now? More aspects of data (variety, depth, location, frequency). Web sites: Blogs/Reviews/Emails/Pictures. Social Graphs: Facebook, Linked-in, Contacts. Application server logs: Web sites, games. Sensor data: Weather, water, smart grids. Images/videos: Traffic, security cameras. Twitter: 50m tweets/day, 1,400% growth per year.
  14. 14. Storage Big Data Compute Why now? Possible to understand (not just answer specific questions). Web sites: Blogs/Reviews/Emails/Pictures. Social Graphs: Facebook, Linked-in, Contacts. Application server logs: Web sites, games. Sensor data: Weather, water, smart grids. Images/videos: Traffic, security cameras. Twitter: 50m tweets/day, 1,400% growth per year.
  15. 15. What’s different?
  16. 16. We can collect more
  17. 17. There is more
  18. 18. And data has gravity…
  19. 19. Storage Big Data Compute Data has gravity. [Diagram: App, Data, App] http://blog.mccrory.me/2010/12/07/data-gravity-in-the-clouds/
  20. 20. Storage Big Data Compute …and inertia at volume… [Diagram: Data] http://blog.mccrory.me/2010/12/07/data-gravity-in-the-clouds/
  21. 21. Storage Big Data Compute …easier to move applications to the data. [Diagram: Data] http://blog.mccrory.me/2010/12/07/data-gravity-in-the-clouds/
  22. 22. Cloud has the power to process
  23. 23. Storage Big Data Compute Bring compute capacity to the data. Personal: Very large dataset seeks strong & consistent compute for short term relationship, possibly longer. GSOH a plus. aws.amazon.com
  24. 24. Storage Big Data Compute From one instance…
  25. 25. Storage Big Data Compute …to thousands
  26. 26. Storage Big Data Compute and back again…
  27. 27. The revolution
  28. 28. have data
  29. 29. have data, can store
  30. 30. have data, can store, can analyse
  31. 31. economically
  32. 32. fast
  33. 33. Who is your customer really? What do people really like? What is happening socially with your products? How do people really use your products?
  34. 34. 34
  35. 35. Lesson 1: don’t leave your Amazon account logged in at home. Lesson 2: use the data you have to drive proactive marketing.
  36. 36. 1 instance for 100 hours = 100 instances for 1 hour
  37. 37. Small instance = $8
  38. 38. Amazon Elastic MapReduce
  39. 39. But what is it?
  40. 40. A framework: splits data into pieces, lets processing occur, gathers the results
  41. 41. [Architecture diagram] Input data: S3 + DynamoDB. Code: Elastic MapReduce. Name node, elastic cluster, HDFS. Output: S3 + SimpleDB. Queries + BI: via JDBC, Pig, Hive.
  42. 42. Very large click log (e.g. TBs)
  43. 43. Very large click log (e.g. TBs). Lots of actions by John Smith.
  44. 44. Very large click log (e.g. TBs). Lots of actions by John Smith. Split the log into many small pieces.
  45. 45. Very large click log (e.g. TBs). Lots of actions by John Smith. Split the log into many small pieces. Process in an EMR cluster.
  46. 46. Very large click log (e.g. TBs). Lots of actions by John Smith. Split the log into many small pieces. Process in an EMR cluster. Aggregate the results from all the nodes.
  47. 47. Very large click log (e.g. TBs). Lots of actions by John Smith. Split the log into many small pieces. Process in an EMR cluster. Aggregate the results from all the nodes. What John Smith did.
  48. 48. Very large click log (e.g. TBs). What John Smith did. Insight in a fraction of the time.
  49. 49. 1 instance for 100 hours = 100 instances for 1 hour
  50. 50. Small instance = $8
  51. 51. 1 instance for 1,000 hours = 1,000 instances for 1 hour
  52. 52. Small instance = $80
  53. 53. Features powered by Amazon Elastic MapReduce: People Who Viewed this Also Viewed; Review highlights; Auto complete as you type on search; Search spelling suggestions; Top searches; Ads. 200 Elastic MapReduce jobs per day. Processing 3TB of data.
  54. 54. Data Analytics. 3.5 billion records; 71 million unique cookies; 1.7 million targeted ads required per day. Execute batch processing data sets ranging in size from dozens of Gigabytes to Terabytes. Building in-house infrastructure to analyze these click stream datasets requires investment in expensive “headroom” to handle peak demand. “Our first client campaign experienced a 500% increase in their return on ad spend from a similar campaign a year before.” Example: user recently purchased a sports movie and is searching for video games, so receives a Targeted Ad (1.7 Million per day).
  55. 55. Want to try some of this?
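
The instance-hour arithmetic on slides 4, 36 to 37 and 49 to 52 can be checked with a few lines of Python. This is only a minimal sketch. The rate of 8 cents per instance-hour is an assumption inferred from the slides' "$8" for 100 instance-hours, not a quoted AWS price, and the function name `cluster_cost_cents` is hypothetical. It also assumes the job parallelises perfectly and ignores billing granularity.

```python
# Assumption: 8 cents per instance-hour, inferred from "$8 for 100 instance-hours".
RATE_CENTS_PER_INSTANCE_HOUR = 8


def cluster_cost_cents(instances: int, hours: int,
                       rate_cents: int = RATE_CENTS_PER_INSTANCE_HOUR) -> int:
    """Cost in cents when billing is per instance-hour (hypothetical helper)."""
    return instances * hours * rate_cents


# Same number of instance-hours, same cost, very different wall-clock time.
one_slow = cluster_cost_cents(instances=1, hours=100)
many_fast = cluster_cost_cents(instances=100, hours=1)
print(one_slow / 100, many_fast / 100)            # dollars for 100 instance-hours
print(cluster_cost_cents(1000, 1) / 100)          # dollars for 1,000 instance-hours
```

Working in whole cents keeps the comparison exact. The point of the slides is that cost depends only on the product of instances and hours, so the cluster size can be chosen for speed.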

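
The split, process and gather flow described on slides 40 and 42 to 48 can be sketched on a single machine. This is an illustration of the idea only, not how Amazon Elastic MapReduce is implemented or invoked. The log format (`user,action`), the sample data and the function names `split_log`, `count_actions` and `summarise` are all assumptions made for the example.

```python
from collections import Counter
from functools import reduce


def split_log(lines, pieces):
    """Split the log into at most `pieces` chunks of roughly equal size."""
    size = max(1, -(-len(lines) // pieces))  # ceiling division
    return [lines[i:i + size] for i in range(0, len(lines), size)]


def count_actions(piece, user):
    """Process one piece: count each action performed by `user`."""
    counts = Counter()
    for line in piece:
        who, action = line.split(",", 1)
        if who == user:
            counts[action] += 1
    return counts


def summarise(lines, user, pieces=4):
    """Gather: merge the per-piece counts into one result."""
    partials = [count_actions(p, user) for p in split_log(lines, pieces)]
    return reduce(lambda a, b: a + b, partials, Counter())


# Hypothetical click log; a real one would be terabytes spread across many nodes.
log = [
    "john.smith,view:camera",
    "jane.doe,view:tent",
    "john.smith,view:camera",
    "john.smith,buy:camera",
    "jane.doe,buy:tent",
]
print(summarise(log, "john.smith"))
```

Each piece is processed independently of the others, which is what lets a cluster hand the pieces to separate nodes; the result does not depend on how many pieces the log is split into.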