
Bigger Data For Your Budget

How to turn your Big Data into Big Insights without breaking the bank.

Published in: Technology

Bigger Data For Your Budget

  1. Bigger Data For Your Budget: How to turn your Big Data into Big Insights without breaking the bank. Dave Porter, SproutCore Architect, Appnovation (davep@appnovation.com). Canadian headquarters: 152 West Hastings Street, Vancouver BC, V6B 1G8. United States office: 3414 Peachtree Road, #1600, Atlanta, Georgia, 30326-1164. United Kingdom office: 3000 Hillswood Drive, Hillswood Business Park, Chertsey KT16 0RS, UK. www.appnovation.com, info@appnovation.com
  2. Speakers: John Kreisa, VP Marketing, Hortonworks; Dave Porter, SproutCore Architect, Appnovation Technologies.
  3. Appnovation is one of the world's top open source development shops.
  4. Locations. Vancouver office: 152 West Hastings Street, Vancouver BC, V6B 1G8. Atlanta office: 3414 Peachtree Road, #1600, Atlanta, Georgia, 30326-1164. London office: 3000 Hillswood Drive, Hillswood Business Park, Chertsey KT16 0RS, UK.
  5. (No text on this slide.)
  6. Bigger Data For Your Budget
  7. What is Big Data? Databases; server logs; raw transactional data; human-quality input.
  8. Where is it coming from? Website traffic patterns; financial transactions; science; people.
  9. (No text on this slide.)
  10. The promise of Big Data: curing cancer; beating XDR-TB; finding Earth 2.0 in outer space; seeing deeper into your business.
  11. (No text on this slide.)
  12. What can Big Data do for me? Retail inventory system.
  13. What can Big Data do for me? Retail inventory system: overnight batch cycle.
  14. What can Big Data do for me? Retail inventory system: hourly cycle.
  15. The Big Data challenges: collecting & storing; processing & analyzing.
  16. The Big Data challenges: collecting & storing … on expensive hardware; processing & analyzing … with expensive software.
  17. Bigger Data For Your Budget
  18. Bigger Data For Your Budget: open source software, running on commodity hardware.
  19. Bigger Data For Your Budget
  20. Hadoop: Bigger Data For Your Budget. Gnomes … with flashlights (and notepads).
  21. Hadoop: Bigger Data For Your Budget
  22. A Brief History of Apache Hadoop (© Hortonworks Inc. 2013). Timeline axis: 2004, 2006, 2008, 2010, 2012, 2013. Milestones: Apache project established; Yahoo! begins to operate at scale; Hortonworks Data Platform; Enterprise Hadoop. Focus on innovation (2005): Yahoo! creates team under E14 to work on Hadoop. Focus on operations (2008): Yahoo team extends focus to operations to support multiple projects & growing clusters. Stability (2011): Hortonworks created to focus on "Enterprise Hadoop"; starts with 24 key Hadoop engineers from Yahoo.
  23. Hortonworks Snapshot. We develop, distribute and support the ONLY 100% open source Enterprise Hadoop distribution. We distribute the only 100% open source Enterprise Hadoop distribution: Hortonworks Data Platform; we engineer, test & certify HDP for enterprise usage; we employ the core architects, builders and operators of Apache Hadoop; we drive innovation within Apache Software Foundation projects; we are uniquely positioned to deliver the highest quality of Hadoop support; we enable the ecosystem to work better with Hadoop. Endorsed by strategic partners. Headquarters: Palo Alto, CA. Employees: 180+ and growing. Investors: Benchmark, Index, Yahoo.
  24. Hortonworks Process for Enterprise Hadoop. Upstream community projects: Apache Hadoop, Apache Pig, Apache Hive, Apache HBase, Apache HCatalog, Apache Ambari, other Apache projects. Upstream steps: design & develop, test & patch, release. Downstream enterprise product: Hortonworks Data Platform. Downstream steps: design & develop, integrate & test, package & certify, distribute. Flows between the two: stable project releases; fixed issues. No lock-in: integrated, tested & certified distribution lowers risk by ensuring close alignment with Apache projects. Virtuous cycle when development & fixed issues are done upstream & stable project releases flow downstream.
  25. Enhancing the Core of Apache Hadoop. Hadoop Core: distributed storage & processing. Platform Services: enterprise readiness. Deliver high-scale storage & processing with enterprise-ready platform services. Unique focus areas: bigger, faster, more flexible (continued focus on speed & scale and enabling near-real-time apps); tested & certified at scale (run ~1300 system tests on large Yahoo clusters for every release); enterprise-ready services (high availability, disaster recovery, snapshots, security, …). Hortonworkers are the architects, operators, and builders of core Hadoop.
  26. Data Services for Full Data Lifecycle. Data Services: store, process and access data. Hadoop Core: distributed storage & processing. Platform Services: enterprise readiness. Provide data services to store, process & access data in many ways. Unique focus areas: Apache HCatalog (metadata services for consistent table access to Hadoop data); Apache Hive (explore & process Hadoop data via SQL & ODBC-compliant BI tools). Hortonworks enables Hadoop data to be accessed via existing tools & systems.
  27. Operational Services for Ease of Use. Operational Services: manage & operate at scale. Data Services: store, process and access data. Hadoop Core: distributed storage & processing. Platform Services: enterprise readiness. Include complete operational services for productive operations & management. Unique focus area: Apache Ambari (provision, manage & monitor a cluster; complete REST APIs to integrate with existing operational tools; job & task visualizer to diagnose issues). Only Hortonworks provides a complete open source Hadoop management tool.
  28. Deployable Across a Range of Options. Hortonworks Data Platform (HDP) layers: Operational Services (manage & operate at scale); Data Services (store, process and access data); Hadoop Core (distributed storage & processing); Platform Services (enterprise readiness). Deployment options: OS, cloud, VM, appliance. Only Hortonworks allows you to deploy seamlessly across any deployment option: Linux & Windows; Azure, Rackspace & other clouds; virtual platforms; big data appliances.
  29. HDP: Enterprise Hadoop Distribution. Hortonworks Data Platform (HDP) layers: Operational Services (manage & operate at scale); Data Services (store, process and access data); Hadoop Core (distributed storage & processing); Platform Services (enterprise readiness). Deployment options: OS, cloud, VM, appliance. Hortonworks Data Platform (HDP), Enterprise Hadoop: the ONLY 100% open source and complete distribution; enterprise grade, proven and tested at scale; ecosystem endorsed to ensure interoperability.
  30. Existing Data Architecture. Applications: business analytics, custom applications, enterprise applications. Data systems: traditional repos (RDBMS, EDW, MPP). Data sources: traditional sources (RDBMS, OLTP, OLAP); OLTP, POS systems. Operational tools: manage & monitor. Dev & data tools: build & test.
  31. An Emerging Data Architecture. Applications: business analytics, custom applications, enterprise applications. Data systems: traditional repos (RDBMS, EDW, MPP); Hortonworks Data Platform. Data sources: traditional sources (RDBMS, OLTP, OLAP); new sources (web logs, email, sensor data, social media); mobile data; OLTP, POS systems. Operational tools: manage & monitor. Dev & data tools: build & test.
  32. Interoperating With Your Tools. Diagram labels: applications; data systems; traditional repos; dev & data tools; operational tools; Viewpoint; Microsoft Applications; Hortonworks Data Platform. Data sources: traditional sources (RDBMS, OLTP, OLAP); new sources (web logs, email, sensor data, social media); mobile data; OLTP, POS systems.
  33. Hadoop Patterns of Use. Big Data: transactions, interactions, observations. Hortonworks Data Platform patterns: refine, explore, enrich. Business case.
  34. Operational Data Refinery. Patterns: refine, explore, enrich. Collect data and apply a known algorithm to it in a trusted operational process. Step 1, Capture: capture all data. Step 2, Process: parse, cleanse, apply structure & transform. Step 3, Exchange: push to existing data warehouse for use with existing analytic tools. Applications: business analytics, custom applications, enterprise applications. Data systems: traditional repos (RDBMS, EDW, MPP); Hortonworks Data Platform. Data sources: traditional sources (RDBMS, OLTP, OLAP); new sources (web logs, email, sensor data, social media).
  35. Big Data Exploration & Visualization. Patterns: refine, explore, enrich. Collect data and perform iterative investigation for value. Step 1, Capture: capture all data. Step 2, Process: parse, cleanse, apply structure & transform. Step 3, Exchange: explore and visualize with analytics tools supporting Hadoop. Applications: business analytics. Data systems: traditional repos (RDBMS, EDW, MPP); Hortonworks Data Platform. Data sources: traditional sources (RDBMS, OLTP, OLAP); new sources (web logs, email, sensor data, social media).
  36. Application Enrichment. Patterns: refine, explore, enrich. Collect data, analyze and present salient results for online apps. Step 1, Capture: capture all data. Step 2, Process: parse, cleanse, apply structure & transform. Step 3, Exchange: incorporate data directly into applications. Applications: custom applications, enterprise applications. Data systems: traditional repos (RDBMS, EDW, MPP); NoSQL; Hortonworks Data Platform. Data sources: traditional sources (RDBMS, OLTP, OLAP); new sources (web logs, email, sensor data, social media).
  37. Speakers: John Kreisa, VP Marketing, Hortonworks; Dave Porter, SproutCore Architect, Appnovation Technologies.
  38. Next Steps. Learn: Hortonworks.com/sandbox, Hortonworks.com/hadoop-training. Blog: Appnovation.com/Blog. @Appnovation, @hortonworks, @hortonworks_U. DaveP@Appnovation.com, JKriesa@Hortonworks.com
  39. Thank You For Your Participation! Canadian headquarters: 152 West Hastings Street, Vancouver BC, V6B 1G8. United States office: 3414 Peachtree Road, #1600, Atlanta, Georgia, 30326-1164. United Kingdom office: 3000 Hillswood Drive, Hillswood Business Park, Chertsey KT16 0RS, UK. www.appnovation.com, info@appnovation.com
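
The Operational Data Refinery slide (34) names three steps: capture all data, process it (parse, cleanse, apply structure & transform), and exchange the result with an existing data warehouse. The deck shows no code, so the following is only a minimal single-machine sketch of that flow in plain Python, not how Hadoop or HDP implements it. The log layout, the sample records, and the function names `capture`, `process`, and `exchange` are all assumptions made for illustration.

```python
import csv
import io
import re
from collections import Counter

# Assumed (hypothetical) log layout: "<ip> <ISO timestamp> <path> <status>"
LOG_LINE = re.compile(r"^(\S+) (\S+) (\S+) (\d{3})$")


def capture():
    """Step 1 (Capture): keep every raw record, including malformed ones."""
    return [
        "10.0.0.1 2013-05-01T10:00:00 /products/42 200",
        "10.0.0.2 2013-05-01T10:00:05 /products/42 200",
        "garbage line with no structure",
        "10.0.0.1 2013-05-01T10:01:00 /cart 500",
        "10.0.0.3 2013-05-01T10:02:00 /products/7 200",
    ]


def process(raw_lines):
    """Step 2 (Process): parse, cleanse, apply structure and transform."""
    hits = Counter()
    for line in raw_lines:
        match = LOG_LINE.match(line.strip())
        if match is None:
            continue  # cleanse: drop lines that do not parse
        _ip, _timestamp, path, status = match.groups()
        if status == "200":
            hits[path] += 1  # transform: count successful hits per path
    return sorted(hits.items())


def exchange(rows):
    """Step 3 (Exchange): emit CSV text that a warehouse loader could ingest."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["path", "successful_hits"])
    writer.writerows(rows)
    return buffer.getvalue()


refined = process(capture())
warehouse_csv = exchange(refined)
print(warehouse_csv)
```

On a real cluster the capture step would land raw files in distributed storage and the process step would run as a distributed job; the sketch only mirrors the order of the three steps.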
