Hadoop's Opportunity to Power Next-Generation Architectures


Shaun Connolly's presentation/keynote at Hadoop Summit 2012.


Hadoop's Opportunity to Power Next-Generation Architectures Presentation Transcript

  • 1. Hadoop’s Opportunity to Power Next-Generation Architectures. Shaun Connolly, Hortonworks Strategy. June 13, 2012
  • 2. How many people are lucky enough to say that they were at the forefront of something big?
  • 3. Transactions, Interactions, Observations
  • 4. Big Data = Transactions + Interactions + Observations. [Graphic: four nested tiers along an axis of increasing data variety and complexity: ERP (megabytes), CRM (gigabytes), WEB (terabytes), BIG DATA (petabytes). Labels in the graphic: User Generated Content; Sensors / RFID / Devices; Mobile Web; Sentiment; Social Interactions & Feeds; User Click Stream; Spatial & GPS Coordinates; Web logs; A/B testing; External Demographics; Offer history; Dynamic Pricing; Affiliate Networks; Business Data Feeds; Segmentation; Search Marketing; HD Video, Audio, Images; Offer details; Behavioral Targeting; Speech to Text; Purchase detail; Purchase record; Customer Touches; Product/Service Logs; Dynamic Funnels; Payment record; SMS/MMS; Support Contacts.] Source: Contents of above graphic created in partnership with Teradata, Inc.
  • 5. There is still work to be done to ensure HADOOP powers the BIG DATA WAVE
  • 6. Many Communities Must Work As One (Open Source, Vendors, End Users): Be diligent stewards of the open source core; Be tireless innovators beyond the core; Provide robust data platform services & open APIs; Enable ecosystem at each layer of the stack; Make platform enterprise-ready & easy to use
  • 7. Top 10 Influencers of the Decade: 1. Google 2. Apple 3. Apache Software Foundation 4. Microsoft 5. Linux Foundation 6. Eclipse Foundation 7. Twitter 8. Free Software Foundation 9. Android Project 10. VMware. Source: SD Times, http://www.sdtimes.com/link/36666
  • 8. Top 10 Influencers of the Decade: #3. Source: SD Times, http://www.sdtimes.com/link/36666
  • 9. Diligent Stewards & Tireless Innovators: Pig, Avro, Hive, Cascading, HBase, Accumulo, Zookeeper, Whirr, HCatalog, Chukwa, Ambari, Snappy, Sqoop, Spark, Oozie, HAMA, Giraph, Flume, OpenMPI, Mahout. Timeline: 1.0, 2.0, Beyond
  • 10. [Integrating Hadoop with existing IT investments is vitally important.] Larry Feinsmith
  • 11. Connecting Transactions + Interactions + Observations. Data sources: Audio, Video, Images; Docs, Text, XML; Web Logs, Clicks; Social, Graph, Feeds; Sensors, Devices, RFID; Spatial, GPS; Events, Other. Components: Big Data Refinery; Business Transactions & Interactions (Web, Mobile, CRM, ERP, SCM, …); Business Intelligence & Analytics (Dashboards, Reports, Visualization, …). Steps: 1. Classic ETL processing 2. Store, aggregate, and transform multi-structured data to unlock value 3. Share refined data and runtime models 4. Retain runtime models and historical data for ongoing refinement & analysis 5. Retain historical data to unlock additional value
  • 12. Next-Generation Big Data Architecture. Data sources: Audio, Video, Images; Docs, Text, XML; Web Logs, Clicks; Social, Graph, Feeds; Sensors, Devices, RFID; Spatial, GPS; Events, Other. Components: Big Data Refinery; Business Transactions & Interactions (Web, Mobile, CRM, ERP, SCM, …); Business Intelligence & Analytics (Dashboards, Reports, Visualization, …). Data store labels: SQL, NoSQL, NewSQL; EDW, MPP, NewSQL. Arrows powered by ETL, data movement, and data integration technologies
  • 13. Data Services & Open APIs are Vital. Without HCatalog: raw Hadoop data, inconsistent metadata, tool-specific access. With HCatalog: table access, aligned metadata, RESTful API. Apache HCatalog: Hadoop’s centralized metadata service. Provide consistent metadata and data models across tools; Share data as tables in and out of HDFS; Enable flexible, thin-client access via RESTful APIs
  • 14. Data Services & Open APIs In Action. Inputs to the Big Data Refinery: Website Interactions (Web Logs), Order DB data, Customer DB data. Steps: 1. Web Log files via WebHDFS APIs 2. Customer & Order data via Talend & HCatalog for schema 3. Process, analyze, and join data via Talend, Pig, & HCatalog 4. Analyze website visits by the type of end results
  • 15. Let’s Head to the Demo Kitchen
  • 16. Ecosystem Completes the Puzzle: Applications, Business Tools, & Dev Tools; Data Management & Movement; Infrastructure & Systems Management
  • 17. Solution Architectures: Make Hadoop Enterprise-Ready & Easy to Use. Applications, Business Tools, & Dev Tools; Data Management & Movement; Infrastructure & Systems Management
  • 18. Our Opportunity…and Our Role. By the end of 2015, more than half the world’s data will be processed by Apache Hadoop. 1. Be diligent stewards of the open source core 2. Be tireless innovators beyond the core 3. Provide robust data platform services & open APIs 4. Enable the ecosystem at each layer of the stack 5. Make the platform enterprise-ready & easy to use
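
Slide 14 brings web log files into the refinery through the WebHDFS APIs. As a rough illustration only (it is not taken from the presentation), the Python sketch below shows how such a request URL might be put together. The host name, the port 50070, the user name, and the log paths are assumed placeholders, and `webhdfs_url` is a hypothetical helper; only the `/webhdfs/v1/<path>?op=...` layout and the `OPEN` and `LISTSTATUS` operations come from the WebHDFS REST API itself.

```python
from typing import Dict, Optional
from urllib.parse import quote, urlencode


def webhdfs_url(host: str, port: int, path: str, op: str,
                params: Optional[Dict[str, str]] = None) -> str:
    """Build a WebHDFS REST URL (hypothetical helper, no network access).

    `path` is an absolute HDFS path such as "/logs/web/access.log".
    `op` is a WebHDFS operation name such as "OPEN" or "LISTSTATUS".
    """
    if not path.startswith("/"):
        raise ValueError("HDFS path must be absolute")
    query = {"op": op}
    if params:
        query.update(params)
    # quote() leaves "/" intact and escapes characters that are unsafe in a URL path.
    return f"http://{host}:{port}/webhdfs/v1{quote(path)}?{urlencode(query)}"


if __name__ == "__main__":
    # Assumed placeholder cluster and paths, for illustration only.
    print(webhdfs_url("namenode.example.com", 50070, "/logs/web", "LISTSTATUS"))
    print(webhdfs_url("namenode.example.com", 50070,
                      "/logs/web/access.log", "OPEN",
                      {"user.name": "analyst"}))
```

A client would issue an HTTP GET against the resulting URL. The helper only assembles strings, so it can be checked without a running cluster.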