The Next Generation of Big Data Analytics
 


Apache Hadoop has evolved rapidly to become a leading platform for managing and processing big data. If your organization is examining how to use Hadoop to store, transform, and refine large volumes of multi-structured data, please join us for this session, where we will discuss the emergence of "big data" and opportunities for deriving business value, the evolution of Apache Hadoop and its future directions, the essential components required in a Hadoop-powered platform, and solution architectures that integrate Hadoop with existing data discovery and data warehouse platforms.


Presentation Transcript

  • The Next Generation of Big Data Analytics, August 22, 2012. © Hortonworks Inc. 2012 © 2012 Teradata Corporation
  • Today’s Speakers
      • Jim Walker, Dir. Product Marketing, Hortonworks
      • Cesar Rojas, Dir. Solutions Marketing, Teradata Aster
      • Eric Linden, Technical Marketing, Teradata Aster
  • Big Data Changes the Game (diagram)
      • Transactions + Interactions + Observations = BIG DATA
      • Data volume tiers: ERP (Megabytes), CRM (Gigabytes), WEB (Terabytes), BIG DATA (Petabytes)
      • Horizontal axis: Increasing Data Variety and Complexity
      • Data types shown: Mobile Web; Sentiment; User Click Stream; SMS/MMS; Speech to Text; Social Interactions & Feeds; Web logs; Spatial & GPS Coordinates; A/B testing; Sensors / RFID / Devices; Behavioral Targeting; Business Data Feeds; Dynamic Pricing; Segmentation; External Demographics; Search Marketing; Customer Touches; User Generated Content; Affiliate Networks; Purchase detail; Support Contacts; HD Video, Audio, Images; Dynamic Funnels; Purchase record; Offer details; Offer history; Product/Service Logs; Payment record
  • Next Generation Data Architecture Drivers
      • Business Drivers
          • Enable new business models & drive faster growth (20%+)
          • Find insights for competitive advantage & optimal returns
      • Technical Drivers
          • Data continues to grow exponentially
          • Data is increasingly everywhere and in many formats
          • Legacy solutions unfit for new requirements growth
      • Financial Drivers
          • Cost of data systems, as % of IT spend, continues to grow
          • Cost advantages of commodity hardware & open source
  • Fueling Adoption of the Next Generation: Empower the Ecosystem
      • Apache Hadoop has to just work with what you already have
      • Apache Hadoop must be a seamless part of holistic data management strategy
      • Leverage existing assets and tools… Extend them with new and powerful data
      • Diagram labels: Data Platform Services & Open APIs; Hortonworks Data Platform
  • Hortonworks Data Platform (HDP)
      • Simplify deployment to get started quickly and easily
      • Monitor, manage any size cluster with familiar console and tools
      • Only platform to include data integration services to interact with any data source
      • Metadata services open the platform for integration with existing applications
      • Dependable high availability
      • Hortonworks Data Platform delivers enterprise grade functionality on a proven Apache Hadoop distribution to ease management, simplify use and ease integration into the enterprise architecture
      • The only 100% open source data platform for Apache Hadoop
  • Shift in Paradigm
      • Classic BI: Structured & Repeatable Analysis
          • Business determines what questions to ask
          • IT structures the data to answer those questions
          • SQL: Performance & Structure
          • “Capture only what’s needed”
      • Big Data Analytics: Multi-structured & Iterative Analysis
          • IT delivers platform for storing, refining, & analyzing all data sources
          • Business explores data for questions worth answering
          • MapReduce: Processing Flexibility
          • “Capture in case it’s needed”
  • Transactions + Interactions + Observations (diagram, six numbered flows)
      • Data sources: Audio, Video, Images; Docs, Text, XML; Web Logs, Clicks; Social, Graph, Feeds; Sensors, Devices, RFID; Spatial, GPS; Events, Other
      • Components: Business Transactions & Interactions (Web, Mobile, CRM, ERP, SCM, …); Big Data Refinery; Data Discovery & Investigative Analytics; Business Intelligence & Analytics (Dashboards, Reports, Visualization, …)
      • Flow annotations: Classic ETL processing; Store, aggregate, and transform multi-structured data to unlock value; Share refined data & runtime models; Interactive data exploration; Retain runtime models and historical data for ongoing refinement & analysis; Retain historical data to unlock additional value
  • Unified Big Data Architecture (diagram)
      • Users: Engineers, Data Scientists, Quants, Business Analysts
      • Tools: Java, C/C++, Pig, Python, R, SAS, SQL, Excel, BI, Visualization, etc.
      • Platform layers: Discovery Platform; Integrated Data Warehouse; Capture, Store, Refine
      • Sources of data: Audio/Video, Images, Text, Web & Social, Machine Logs, CRM, SCM, ERP
  • Next Generation Big Data Analytics: The Data Discovery Cycle (diagram)
      • Cycle stages: Analytical Idea; Zero-ETL Data Load/Integration; SQL & non-SQL Analysis; Evaluate Results; Operationalize or Move On
      • Connected systems: Operational DB or EDW
  • Key Elements of a Data Discovery Platform
      1. Highly Efficient & Performant Big Data Platform That Allows Quick Iterations
      2. Hybrid Capabilities that Provide both Legacy (SQL, BI) and New (MapReduce) Interfaces
      3. Significant Out-of-the-Box Analytical Apps that Minimize Development
      • Democratize Big Data & Maximize Enterprise Adoption
  • Teradata Aster Data Discovery Platform
      • Users: Analysts, Customers, Business Users, Data Scientists
      • Top of stack: Your Analytics & Advanced Reporting Applications
      • Develop layer: Pattern Matching, Graph, Statistical, ELT; Java, C, Python, Perl …
      • Process layer: SQL, SQL-MapReduce; Platform Services (e.g. query planning, dynamic workload management, security …)
      • Store layer: Relational Row, Relational Column, External HDFS Data (Using SQL-H and HCatalog)
      • Features listed alongside the stack:
          • 50+ pre-built analytic modules
          • Visual IDE; develop apps in hours
          • Many programming languages
          • SQL-MapReduce framework
          • Analyze both non-relational + relational data
          • Linear, incremental scalability
          • Commodity-hardware based
          • Software only, cloud, or appliance
          • Relational-data architecture can be extended for non-relational types
  • Aster MapReduce Portfolio: the App Store of Big Data
      • 50+ out-of-the-box SQL-MapReduce analytic applications
      • Path Analysis: Discover patterns in rows of sequential data
      • Text Analysis: Derive patterns and extract features in textual data
      • Statistical Analysis: High-performance processing of common statistical calculations
      • Segmentation: Discover natural groupings of data points
      • Marketing Analytics: Analyze customer interactions to optimize marketing decisions
      • Data Transformation: Transform data for more advanced analysis
  • Aster SQL-H Enables Data Discovery on Hadoop Data
      • Aster SQL-H™: A Business User’s Bridge to Analyze Hadoop Data
      • Aster SQL-H gives analysts and data scientists a better way to analyze data stored cheaply in Hadoop
          • Allow standard ANSI SQL access to Hadoop data
          • Leverage existing BI tool investments
          • Enable 50+ prebuilt SQL-MapReduce Apps and IDE
          • Improve self-sufficiency for analysts going against Hadoop
  • Analyst Point of View (diagram)
      • Users: Engineers, Data Scientists, Quants, Business Analysts
      • Tools: Java, C/C++, Pig, Python, R, SAS, SQL, Excel, BI, Visualization, etc.
      • Layers: MapReduce (Processing); Discovery Platform; Active Data Warehouse; Database and Analytic Processing Layer; Data Storage and Refining
      • Gap 1: Analysts
      • Gap 2: File system lacks optimizers, data locality, indexes
      • Sources of data: Audio/Video, Images, Text, Web & Social, Machine Logs, CRM, SCM, ERP
  • Analyst’s Goal: Get Insights from Data in Hadoop (diagram)
      • Users: Engineers, Data Scientists, Quants, Business Analysts
      • Custom Code and Development: MR, Pig, Hive; IT is the optimizer
      • Aster MapReduce Portfolio: SQL & SQL-MapReduce on the Teradata Aster Discovery Platform
      • Teradata Analytics Portfolio: SQL on the Teradata IDW
  • Analytics on Hadoop Data with Aster SQL-H (diagram)
      • Users: Engineers, Data Scientists, Quants, Business Analysts
      • Diagram labels: Aster MapReduce Portfolio; Teradata Analytics Portfolio; SQL-H; SQL & MapReduce; SQL & SQL-MapReduce; SQL; Teradata Aster Discovery Platform; Teradata IDW
  • Aster SQL-H™ Integration with HCatalog
      • Aster Layer (SQL-H): Aster is the execution layer; all analytical processing is done with Aster SQL-MapReduce functions (no Hive or Hadoop-MR)
      • Hadoop: Data Filtering; MR, Hive, Pig; HCatalog; HDFS
      • HCatalog is the metadata repository
      • HDFS is the data repository
  • When to Use What?
      • The best approach by workload and data type
      • Processing as a Function of Schema Requirements by Data Type
      • Workload columns: Low Cost Storage & Retention | Loading and Refining (Data Pre-Processing, Prep, Cleansing; Transformations) | Reporting | Analytics (User-driven, interactive)
      • Stable Schema: Teradata / Hadoop; Teradata; Teradata; Teradata; Teradata (SQL analytics)
      • Evolving Schema: Hadoop; Aster / Hadoop; Aster (joining with structured data); Aster; Aster (SQL + MapReduce Analytics)
      • Format, No Schema: Hadoop; Hadoop; Hadoop; Aster (MapReduce Analytics)
  • Customer Churn Prevention
      • Challenge
          • Know when churn will occur
          • Data Mining tools predict probability but do not identify cause events
      • With Hadoop
          • Capture, retention and transformation of customer images (e.g. checks) and customer voice records
      • With Aster & Teradata
          • SQL-MapReduce listens and predicts the customer churn event
          • Identifies all interaction patterns prior to acquisition or attrition
      • Business Impact
          • 10-300x less effort to pinpoint a customer in the middle of a decision
      • Figure: Cross-Channel Customer Interactions (17,000 Customers, 1 Month)
  • More Accurate Customer Churn Prevention (diagram)
      • Hadoop captures, stores and transforms images and call records
      • Aster does path and sentiment analysis with multi-structured data
      • Diagram labels: Social & Web data; Multi-Structured Raw Data; Call Data; Call Center Voice Records; Image Data; Images & Documents; Hadoop; Capture, Retention & Transformation Layer; Aster Discovery Platform; Analytic Results; Dimensional Data; Analysis + Marketing Automation (Customer Retention Campaign)
      • Traditional Data Flow: Data Sources; ETL Tools; Teradata Integrated DW
  • Aster-Hadoop Integration Demo: Churn Attrition
  • Use Cases: Optimize Outcomes at Scale
      • Media: optimize Content
      • Intelligence: optimize Detection
      • Investment: optimize Algorithms
      • Advertising: optimize Performance
      • Fraud: optimize Prevention
      • Regulation: optimize Compliance
      • Retail / Wholesale: optimize Inventory turns
      • Manufacturing: optimize Supply chains
      • Healthcare: optimize Patient outcomes
      • Education: optimize Learning outcomes
      • Government: optimize Citizen services
      • Source: Geoffrey Moore. Hadoop Summit 2012 keynote presentation.
  • Why Hortonworks and Teradata
      • Familiar business analysis on Apache Hadoop big data
          • 50+ advanced SQL-MapReduce functions (Aster MapReduce Portfolio)
          • SQL-MapReduce development environment to build more functions
      • Straightforward Database to Apache Hadoop Integration
          • ANSI SQL-based interface to standard HCatalog metadata/schema in Hadoop
      • Interoperability with existing ecosystem & skillsets
          • BI tools (Tableau, MicroStrategy, Cognos), ETL tools, SQL analysts & existing applications
      • Ease of maintenance, skillset and tools compliant
          • Leverage existing DBA skill-sets without additional overhead
          • Apache Ambari provides management and monitoring of the Hadoop cluster and integrates with current administration tools
  • Learn More
      • Big Analytics Best Practices: www.asterdata.com/BigAnalyticsSeries
      • Apache Hadoop & the Big Data Refinery: www.hortonworks.com
      • Jim Walker, Hortonworks: jim@hortonworks.com
      • Cesar Rojas, Teradata Aster: cesar.rojas@teradata.com
      • Eric Linden, Teradata Aster: eric.linden@teradata.com
      • Twitter: @hortonworks @asterdata @jaymce
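
The "Customer Churn Prevention" slide says path analysis "identifies all interaction patterns prior to acquisition or attrition". As a rough illustration of that idea only, the plain-Python sketch below counts the short sequences of interactions that precede a churn event for each customer. It is not Aster's SQL-MapReduce implementation; the function name `paths_before_event`, the `window` parameter, the event names, and the sample records are all hypothetical and chosen for illustration.

```python
from collections import Counter, defaultdict


def paths_before_event(events, target="churn", window=3):
    """Count the interaction sequences that immediately precede a target event.

    `events` is an iterable of (customer_id, timestamp, event_type) tuples.
    For every occurrence of `target`, the up-to-`window` events that the same
    customer generated just before it are recorded as one pattern.
    """
    # Group interactions by customer (assumed schema, see lead-in).
    by_customer = defaultdict(list)
    for customer_id, timestamp, event_type in events:
        by_customer[customer_id].append((timestamp, event_type))

    patterns = Counter()
    for history in by_customer.values():
        history.sort()  # order each customer's interactions by timestamp
        types = [event_type for _, event_type in history]
        for i, event_type in enumerate(types):
            if event_type == target:
                preceding = tuple(types[max(0, i - window):i])
                if preceding:
                    patterns[preceding] += 1
    return patterns


# Hypothetical cross-channel interaction records.
sample_events = [
    ("c1", 1, "web_login"),
    ("c1", 2, "fee_complaint_call"),
    ("c1", 3, "branch_visit"),
    ("c1", 4, "churn"),
    ("c2", 1, "web_login"),
    ("c2", 2, "fee_complaint_call"),
    ("c2", 3, "branch_visit"),
    ("c2", 5, "churn"),
    ("c3", 1, "web_login"),
    ("c3", 2, "deposit"),
]

top_patterns = paths_before_event(sample_events).most_common(1)
```

With the sample data, the most frequent pattern is the login, complaint call, branch visit sequence shared by the two customers who churned. At the scale the slides describe, the same grouping and ordering would run inside the analytic platform rather than in a single Python process.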