Move Beyond ETL: Tapping the True Business Value of Hadoop

Published in: Technology, Business

  1. 1. Photo Credit: http://www.crosseyedlife.com/teaching-resources/
  2. 2. Menu Who am I? Early adopters of Hadoop Next generation use cases Changing big data architectures Art of the possible My request Questions Appetizer Main Dessert
  3. 3. Who am I? Google, Software Engineer Personalized Search Personalized Recommendations WibiData, CTO Real-time Personalization Platform Customer Use Cases
  4. 4. EARLY ADOPTERS OF HADOOP
  5. 5. Early Adopter Early Majority
  6. 6. Collect Everything Keep Everything Ask Anything
  7. 7. Collect Everything Collect Everything Collect Everything Collect Everything Collect Everything Maybe I should, too? Keep Everything
  8. 8. Blind Spots 1. New, high-value use cases 2. Architectural changes to support broader use cases 3. The ultimate strategic goals of early adopters
  9. 9. NEXT GENERATION USE CASES Blind Spot Number 1
  10. 10. Recommendations
  11. 11. Recommendations
  12. 12. Search
  13. 13. Prediction and Prevention
  14. 14. Targeted Offers
  15. 15. Customer Experience Optimization
  16. 16. Clearly, early adopters have moved beyond ETL.
  17. 17. Life After ETL Understanding 360-degree customer views Visualization Graphs Exploration Trends Customer segmentation ROI Prediction Action Recommendations Prevention Mobile Offers Recommendations Localization Search Personalization
  18. 18. Evolution of Enterprise Data Collect Organize Understand Act Understand Understand
  19. 19. CHANGING ARCHITECTURE Blind Spot Number 2
  20. 20. Sometimes, supporting a new use case requires a different architecture.
  21. 21. Evolution of Enterprise Data Collect Organize Understand Act Collect Organize Understand
  22. 22. Key Ingredients Data Consolidation Organization Experimentation Try something! Rapid iteration Tuning Deployment Evaluation Real time Required to Understand Required to Act
  23. 23. Web Web Web HDFS Logs Txns POS Third Party Data 1. Collect
  24. 24. MapReduce Web Web Web HDFS Logs Txns POS Third Party Data 1. Collect 2. Organize Data Warehouse
  25. 25. Web Web Web HDFS POS Third Party 1. Collect 2. Organize 3. Understand MapReduce Data Warehouse
  26. 26. Web Web Web HDFS POS Third Party 1. Collect 2. Organize 3. Understand 4. Act MapReduce HBase Data Warehouse
  27. 27. Key Ingredients Data Consolidation Organization Experimentation Try something! Rapid iteration Tuning Deployment Evaluation Real time Required to Understand Required to Act Did we get any of these?
  28. 28. Early Adopter Migration Strategies Add serving capability Key-value store Indexing Add stream processing Storm Samza Lambda architecture Add both
  29. 29. Web Web Web HDFS POS Third Party 1. Collect 2. Organize 3. Understand 4. Act MapReduce HBase Data Warehouse HBase Storm Query Batch Serving Speed
  30. 30. Key Ingredients Data Consolidation Organization Experimentation Try something! Rapid iteration Tuning Deployment Evaluation Real time Required to Understand Required to Act Did we get any of these?
  31. 31. Web Web Web HDFS POS Third Party 1. Collect 2. Organize 3. Understand 4. Act MapReduce Data Warehouse HBase Storm Query Batch Serving Speed
  32. 32. ART OF THE POSSIBLE Blind Spot Number 3 Photo credit: http://mediahub.olive.co.uk/blog/the-art-of-the-possible
  33. 33. You can’t build a data platform to solve a problem you haven’t identified yet.
  34. 34. What’s Next? Collect Organize Understand Act ?
  35. 35. What’s Next? Collect Organize Understand Act
  36. 36. Where is the Value? Collect Organize Understand Act 0% 20% 40% 60% 80% 100% Collect Organize Understand Act
  37. 37. “As the amount of data goes up, the importance of human judgment should go down” - Andrew McAfee HBR Blog
  38. 38. Question Hypothesis Prediction Testing Analysis Hire smarter people Faster EDW Hire smarter people Faster Deployment Faster EDW Testing
  39. 39. What does this all mean? The real value is in next-generation “action” use cases. The architecture for “action” is different. Design for your problem, since you don’t know the art of the possible. Requirements first, then technology.
  40. 40. My Request Stop building faster data warehouses. You already understand your data. Turn your understanding into action.
  41. 41. Questions? Garrett Wu http://www.wibidata.com gwu@wibidata.com
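
The Batch, Serving, and Speed labels on slides 29 and 31 correspond to the layers of the Lambda architecture named on slide 28. As a rough illustration only, the sketch below shows how a query might merge a precomputed batch view with a real-time view. Everything in it is a hypothetical, in-memory stand-in: the names `batch_view`, `SpeedView`, and `query` and the sample events are assumptions, not taken from the deck, and no Hadoop, HBase, or Storm API is used.

```python
from collections import Counter


def batch_view(events):
    """Batch layer stand-in: recompute per-customer event counts from the
    full historical dataset (the role MapReduce over HDFS plays on the slides)."""
    return Counter(customer for customer, _item in events)


class SpeedView:
    """Speed layer stand-in: incrementally count events that arrived after
    the last batch run (the role Storm plays on the slides)."""

    def __init__(self):
        self.counts = Counter()

    def update(self, event):
        customer, _item = event
        self.counts[customer] += 1


def query(customer, batch, speed):
    """Serving layer stand-in: answer a query by merging the batch view
    with the real-time view. A Counter returns 0 for unseen keys."""
    return batch[customer] + speed.counts[customer]


# Hypothetical sample data: (customer, item) purchase events.
historical = [("alice", "book"), ("bob", "lamp"), ("alice", "pen")]
batch = batch_view(historical)

speed = SpeedView()
speed.update(("alice", "mug"))  # event not yet absorbed by a batch run

print(query("alice", batch, speed))  # 3
```

In a real deployment the batch view would be periodically recomputed and the speed view reset to cover only events newer than that run; the sketch omits that hand-off.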
