Left Brain, Right Brain: How to Unify Enterprise Analytics
 

The Briefing Room with Robin Bloor and Teradata
Live Webcast on Jan. 29, 2013

Despite its name, effective Data Science requires a certain amount of artistic flair. Analysts must be creative about how and where they find the insights that will drive business value. One classic roadblock to that kind of frictionless process? Programming. Not everyone can code Java, which makes the unstructured domain of Hadoop quite challenging for the average business analyst.

Check out the slides from this episode of the Briefing Room to hear veteran Analyst Dr. Robin Bloor explain how a new generation of analytical platforms will solve the complexity of unifying structured and unstructured data. He'll be briefed by Steve Wooledge of Teradata Aster who will tout his company's Big Data Appliance, which leverages the SQL-H bridge, an innovation designed to connect Hadoop with SQL.

Visit: http://www.insideanalysis.com

    Presentation Transcript

    • The Briefing Room
    • Welcome. Host: Eric Kavanagh (eric.kavanagh@bloorgroup.com). Twitter Tag: #briefr
    • Mission
        • Reveal the essential characteristics of enterprise software, good and bad
        • Provide a forum for detailed analysis of today's innovative technologies
        • Give vendors a chance to explain their product to savvy analysts
        • Allow audience members to pose serious questions... and get answers!
    • January: Big Data. February: Analytics. March: Open Source. April: Intelligence.
    • Big Data: New Sources, New Insights, New Challenges
    • Analyst: Robin Bloor. Robin Bloor is Chief Analyst at The Bloor Group (robin.bloor@bloorgroup.com).
    • Teradata Aster
        • Teradata is known for its data analytics solutions, with a focus on integrated data warehousing, big data analytics and business applications
        • It offers a broad suite of technology platforms and solutions, data management applications, and data mining capabilities
        • Teradata Aster is its MapReduce platform for big data analytics on multi-structured data
    • Steve Wooledge: Steve is Senior Director of Product Marketing for Teradata Aster and has 10 years of industry experience.
    • Bringing Big Data into the Light: Teradata Big Analytics Appliance. Steve Wooledge, Sr. Director, Product Marketing, Teradata Aster. January 2013
    • Topics
        • What Is Different about Big Data Analytics?
        • Making Big Analytics & Discovery Fast and Easy
        • Teradata Aster Big Analytics Appliance
        Confidential and proprietary. Copyright © 2012 Teradata Corporation.
    • What Is Different about Big Analytics and Discovery?
    • The Lytro and Big Data. Confidential and proprietary. Copyright © 2013 Teradata Corporation.
    • “Interactive, Living Pictures”
    • See Your Business in High-Definition: Big Analytics & Discovery Unlocks Hidden Value
        • Classic BI (Structured & Repeatable Analysis): Business determines what questions to ask; IT structures the data to answer those questions. “Capture only what’s needed”
        • Big Data Analytics (Multi-structured & Iterative Analysis): IT delivers a platform for storing, refining, and analyzing all data sources; Business explores data for questions worth answering. “Capture in case it’s needed”
    • Iterative Analytics Accelerates Discovery: 5x Faster Discovery Process with Aster (Hours vs. Days)
        • Analytical Idea
        • Zero-ETL Data Load/Integration from Operational DB or EDW
        • SQL and non-SQL Analysis
        • Evaluate Results
        • Operationalize or Move On
    • Need for a Unified Data Architecture for New Insights: Enabling Any User for Any Data Type from Data Capture to Analysis
        • Tools: Java, C/C++, Python, R, SAS, SQL, Excel, BI, Visualization
        • Layers: Reporting and Execution in the Enterprise; Discover and Explore; Capture, Store and Refine
        • Sources: Audio/Video, Images, Docs, Text, Web & Social, Machine Logs, CRM, SCM, ERP
    • Big Data Comes with BIG HEADACHES
        • “Even free software like Hadoop is causing companies to spend more money… Many CIOs believe data is inexpensive because storage has become inexpensive. But data is inherently messy (it can be wrong, it can be duplicative, and it can be irrelevant), which means it requires handling, which is where the real expenses come in.” Source: The Wall Street Journal. “CIOs’ Big Problem with Big Data”. Aug 2012
        • “Through 2015, 85% of Fortune 500 organizations will be unable to exploit big data for competitive advantage.” Source: Gartner. “Information Innovation: Innovation Key Initiative Overview”. April 2012
    • Unified Data Architecture
        • Users: Data Scientists, Quants, Customers / Partners, Front-Line Workers, Engineers, Business Analysts, Executives, Operational Systems
        • Tools: Languages, Math & Stats, Data Mining, Business Intelligence, Applications
        • Big Data Analytics: Discovery Platform, Integrated Data Warehouse
        • Big Data Management: Capture | Store | Refine
        • Sources: Audio & Video, Images, Text, Web & Social, Machine Logs, CRM, SCM, ERP
    • Teradata Unified Data Architecture
        • Users: Data Scientists, Quants, Customers / Partners, Front-Line Workers, Engineers, Business Analysts, Executives, Operational Systems
        • Tools: Languages, Math & Stats, Data Mining, Business Intelligence, Applications
        • Platforms: Discovery Platform, Integrated Data Warehouse, Capture | Store | Refine
        • Sources: Audio & Video, Images, Text, Web & Social, Machine Logs, CRM, SCM, ERP
    • Teradata Unified Data Architecture (with management and connectors)
        • Users: Data Scientists, Quants, Customers / Partners, Front-Line Workers, Engineers, Business Analysts, Executives, Operational Systems
        • Tools: Languages, Math & Stats, Data Mining, Business Intelligence, Applications
        • Management: Viewpoint, Support
        • Platforms: Discovery Platform (Aster), Integrated Data Warehouse (Teradata), Capture | Store | Refine (Hadoop)
        • Connectors and loaders: Aster Teradata Connector, SQL-H, Aster Connector for Hadoop, Teradata Connector for Hadoop, Aster Loader, Teradata Loader
        • Sources: Audio & Video, Images, Text, Web & Social, Machine Logs, CRM, SCM, ERP
    • Shift from a Single Platform to an Ecosystem: “Big Data requirements are solved by a range of platforms including analytical databases, discovery platforms and NoSQL solutions beyond Hadoop.” Source: “Big Data Comes of Age”. EMA and 9sight Consulting. Nov 2012.
    • How Does Big Analytics and Discovery Add Business Value?
    • Customer Behavior Analysis
        • Tools: BI Tools, Database Tools, Monitoring Tools
        • Data sources: Email Correspondence Data, Online Banking Data, Store Vision Platform Data, Branch Teller Data, Call Center Data, Customer Profile Data, Customer Survey Data
    • Events Preceding Account Closure
    • Events Preceding Account Closure: Interactive Analytics, Reducing the “Noise” to Find the “Signal”
        SELECT *
        FROM npath (
          ON ( SELECT … WHERE u.event_description IN (
                 SELECT aper.event FROM attrition_paths_event_rank aper
                 ORDER BY aper.count DESC LIMIT 10) … )
          PATTERN ((OTHER|EVENT){1,20}$)
          SYMBOLS (…)
          RESULT (…)
        )) n;
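
The idea behind the nPath query on the slide above (collect the run of up to 20 events that ends each customer's path to account closure, then look at the most common paths) can be sketched outside the database. The following is a minimal Python illustration, not Aster's implementation: the function names, the `ACCOUNT_CLOSED` event label, and the `(customer_id, timestamp, event)` tuple layout are all assumptions made for the example.

```python
from collections import Counter, defaultdict


def paths_before_closure(events, closing_event="ACCOUNT_CLOSED", max_len=20):
    """Map each customer to the (up to max_len) events preceding their first closure.

    `events` is an iterable of (customer_id, timestamp, event_name) tuples.
    Customers who never close, or who close with no prior events, are omitted,
    mirroring the {1,20} quantifier in the slide's pattern.
    """
    by_customer = defaultdict(list)
    for customer_id, timestamp, event in events:
        by_customer[customer_id].append((timestamp, event))

    paths = {}
    for customer_id, rows in by_customer.items():
        rows.sort()  # order each customer's events by timestamp
        names = [event for _, event in rows]
        if closing_event in names:
            end = names.index(closing_event)
            if end > 0:
                paths[customer_id] = tuple(names[max(0, end - max_len):end])
    return paths


def top_paths(paths, n=10):
    """Return the n most frequent paths with their counts."""
    return Counter(paths.values()).most_common(n)


# Hypothetical sample data: two customers follow the same path to closure.
events = [
    ("c1", 1, "FEE_CHARGED"), ("c1", 2, "CALL_CENTER_COMPLAINT"), ("c1", 3, "ACCOUNT_CLOSED"),
    ("c2", 1, "FEE_CHARGED"), ("c2", 2, "CALL_CENTER_COMPLAINT"), ("c2", 3, "ACCOUNT_CLOSED"),
    ("c3", 1, "ONLINE_LOGIN"),
]
paths = paths_before_closure(events)
```

Calling `top_paths(paths, 1)` on the sample data returns the shared fee-then-complaint path with a count of 2. The slide's query additionally restricts input to the ten most frequent events to reduce noise; that filter is left out here for brevity.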
    • How Do We Make Big Analytics & Discovery Possible?
    • Key Requirements of a Discovery Platform
        1. Highly Efficient & Performant Big Data Platform That Allows Quick Iterations
        2. Hybrid Capabilities that support SQL, statistics, and new MapReduce analytics
        3. Significant Out-of-the-Box Analytical Functions that Minimize Development
        Goal: Democratize Big Data & Maximize Enterprise Adoption
    • Teradata Aster Big Analytics Appliance: First Deeply Integrated SQL, MapReduce and Hadoop Appliance
        Unique features
        1. Integrated, modular Aster Database and 100% Open-Source Hortonworks HDP
        2. First and only ANSI SQL & HCatalog integration via SQL-H™
        3. Industry’s only ANSI-standard SQL & MapReduce integration via SQL-MapReduce®
        4. Industry’s most manageable & supportable Apache Hadoop appliance via Teradata Viewpoint™ & TVI™
        5. Most complete MapReduce App Portfolio with 70+ pre-built MapReduce functions
        6. Fully engineered and supported by Teradata, with Level-4 support by Hortonworks world-class Hadoop team
        Benefits
        • Leverage existing investments in standard BI, ETL tools & people with SQL skills
        • Industry’s highest performance platform for Big Analytics
        • Lowest TCO (technology + people), highest ROI, and fastest time to value
    • Teradata Aster Analytics Portfolio: The App Store of Big Data
        • Path Analysis: Discover Patterns in Rows of Sequential Data
        • Text Analysis: Derive Patterns and Extract Features in Textual Data
        • Statistical Analysis: High-Performance Processing of Common Statistical Calculations
        • Segmentation: Discover Natural Groupings of Data Points
        • Marketing Analytics: Analyze Customer Interactions to Optimize Marketing Decisions
        • Data Transformation: Transform Data for More Advanced Analysis
    • Unified Big Data Analytics Architecture: Integrated Analytics and Navigation
        • Inputs: Multi-Structured Data, Unstructured Data, Social Media (Facebook, Twitter, Pinterest)
        • Components: Big Analytics Appliance (Discovery Platform), Teradata IDW, BI Tools, SQL, ETL
        • Insights: Behavior, Sentiments, Revenue
        • Flow: Iterative Information Discovery, Operationalized Analytics, Best Decision Possible
    • Teradata Aster Big Analytics Appliance: Solution and Value Add
        • Solution stack: SQL, BI Tools, Analytic Apps, Hadoop Tools; Common Management, Troubleshooting, and Support (NEW); Aster MapReduce Portfolio of Functions; Hive, Pig, SQL-H (NEW), SQL-MapReduce; Aster Database and Hadoop (NEW); InfiniBand (40 GB/s) Interconnect Fabric; Big Analytics Appliance Hardware (NEW)
        • Single vendor for lowest TCO
        • Common system management tools
        • Supports standard BI and ETL tools
        • Use Hadoop tools like Hive and Pig
        • Analytics Library w/ 70+ functions
        • SQL interface to MapReduce and Hadoop
        • Pre-tuned HDFS and MapReduce parameters for Big Data workloads
        • Store and manage data in Apache Hadoop or Aster Database
        • Processing, storage, and networking designed for Big Data workloads
        • 40 GB/s InfiniBand network
    • ESG Benchmark Report Summary: Third Party Validation of Aster and Hadoop “Fit”
        Scope
        • Identical hardware for Aster and Hadoop
        • Clickstream, sentiment, and traditional retail data
        • Compare “time to insight” and “time to develop”
        Results
        • Discovery Process: Aster 5x Faster
        • Analytics: Aster 35x Faster (range: 4–416x)
        • Development: Aster 3x Faster
        • Loading: Hadoop 1.8x Faster
        • Transforms: Hadoop 1.3x Faster
        • Discovery Cycle-Time (Development + Execution Time): Hadoop MapReduce 32 Hours, Aster SQL-MapReduce 6 Hours (Aster 5x Faster)
        Full report available at www.asterdata.com/esg
    • Comparing Advanced Analytic Development and Execution. Example: Determine Spikes in Hourly Pageviews
        Apache Hadoop
        • Develop: Write Java MR job to group records by time and pagename and find all pages <100 pageviews/hr; sort by the yy/mm/dd and hour fields; Java reduce phase to place all same-keyed records into temporary arrays; compute counts for low/high/low hourly page views over a sequence of rows; create custom partitioner, custom grouping comparator, and custom key comparator
        • Execute: Execute each Mapper and Reducer; multiple passes of data; save output to flat files, making it unstructured; no relational semantics, preventing use of DB interfaces (e.g. ODBC/JDBC); retrieve results with other tools (e.g., SSH/FTP)
        • Development Time: 4 hours. Execution Time: 149 seconds
        Teradata Aster
        • Develop: Use Aster nPath; input parameters in SQL as regular expressions
        • Execute: Single pass of the data; SQL handles group-by, counts, sorts; MapReduce performs regular pattern matching
        • Outputs written to relational table data; use SQL or BI tools to visualize results
        • Development Time: 1 hour (4x faster). Execution Time: 3 seconds (50x faster)
        Quotes
        • “By using SQL-MapReduce, Aster takes fewer steps to develop analytics”
        • “This is also why the execution in Aster is much faster.”
        • “Rather than using MapReduce processing for each step in the analysis, SQL is used in place of a Map (or Reduce) phase and MapReduce is used only in steps that cannot be expressed in SQL.”
        • “Map or Reduce requires shuffling and produces higher latency than SQL”
        Source: Enterprise Strategy Group, Lab Validation Report, September 2012
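
To make the benchmark task on the slide above concrete, here is a small Python sketch of "low/high/low" spike detection over hourly pageview counts. It is only an illustration of the logic, not the Hadoop or Aster code that ESG measured; the function names and the `low`/`high` thresholds are hypothetical, and hours with no recorded views are treated as zero.

```python
from collections import Counter


def hourly_counts(pageviews):
    """Count views per (page, hour); each pageview is a (page, hour_index) pair."""
    return Counter(pageviews)


def find_spikes(counts, low=100, high=1000):
    """Return sorted (page, hour) pairs where a high hour sits between two low hours.

    `low` and `high` are illustrative thresholds, not values from the report.
    """
    by_page = {}
    for (page, hour), n in counts.items():
        by_page.setdefault(page, {})[hour] = n

    spikes = []
    for page, hours in by_page.items():
        for hour, n in hours.items():
            before = hours.get(hour - 1, 0)  # missing hours count as zero views
            after = hours.get(hour + 1, 0)
            if n >= high and before < low and after < low:
                spikes.append((page, hour))
    return sorted(spikes)


# Hypothetical counts: "home" spikes at hour 1; "about" stays high for two hours.
counts = {
    ("home", 0): 10, ("home", 1): 1500, ("home", 2): 20,
    ("about", 1): 1200, ("about", 2): 1300,
}
spikes = find_spikes(counts)
```

Only the `home` page at hour 1 qualifies, because the two busy hours on `about` are adjacent to each other rather than bracketed by quiet hours. In the slide's Aster version, the grouping and counting would be done in SQL and the low/high/low sequence matched by nPath in a single pass.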
    • Teradata Aster Big Analytics Appliance: Key Innovations
    • Aster SQL-H™ (NEW): A Business User’s Bridge to Analyze Hadoop Data. Aster SQL-H Gives Analysts and Data Scientists a Better Way to Analyze Data Stored in Hadoop
        • Allow standard ANSI SQL access to Hadoop data
        • Leverage existing BI tool and enable self service
        • Enable 50+ prebuilt SQL-MapReduce Apps and IDE
        • Diagram: Aster SQL-H with Data Filtering on top of Hadoop (MR, Hive, Pig, HCatalog); Hadoop Layer: HDFS
    • Hortonworks Data Platform: Enterprise-Ready Hadoop. The ONLY 100% open source data platform for Hadoop
        • Tightly aligned with core Apache code lines
        • All code committed back to open source
        • Engineered integration with Teradata Viewpoint and Ambari
        • HCatalog: centralized metadata services for easy data sharing
        • Dependable full stack high availability
        • Capacity scheduler for better multi-tenancy
        • Intuitive graphical data integration tools
    • Teradata Viewpoint Integration: Easier, Faster, and Better System Management. Common Management Console for Aster, Teradata and Apache Hadoop
        • Aster-Specific Portlets: Aster Node Monitoring, Aster Completed Processes
        • Query Portlets: Query Monitor
        • Admin Portlets: Teradata System, Roles Manager
        • Trend/Visualization Portlets: Capacity Heat Map, Metrics Graph, Metrics Analysis
        • Other Portlets: System Health, Canary queries, Aster Alerting
    • Teradata Vital Infrastructure (TVI): Integrated hardware & software solution for systems management. Proactive Reliability, Availability, and Manageability
        • 1U server virtualizes system and cabinet management software
        • Server Management VMS: Cabinet Management Interface Controller (CMIC); Service Work Station (SWS); automatically installed on base/first cabinet
        • VMS allows full rack solutions without additional cabinet for traditional SWS
        • Eliminates need for expansion racks, reducing customers’ floor space & energy costs
        • Supports Teradata hardware and Aster/Hadoop software
        • TVI Support for Aster and Hadoop
        • 62–70% of Incidents Discovered through TVI
    • How Can You Get Started? Aster Express
    • Making it easy to try Aster Big Analytics Solutions: Aster Express, Aster Live, Aster Big Analytics Appliance
    • Aster Express Tutorials Make it Easy to Start: www.asterdata.com/asterexpress
    • Teradata Aster Big Analytics Appliance Summary: Bring Big Data to Life with Big Analytics & Discovery
        • Industry’s first unified big analytics appliance
        • Unified interfaces for iterative SQL and MapReduce analytics
        • Teradata-trusted reliability, availability & manageability
        • Easy to deploy, manage & use
        Get Started Now! asterdata.com/AsterExpress
    • When to Use Which? The best approach by workload and data type. Processing as a Function of Schema Requirements and Stage of Data Pipeline (with example workloads)
        • Pipeline stages: Low Cost Storage and Fast Loading; Data Pre-Processing, Refining, Cleansing; “Simple math at scale” (Score, filter, sort, avg., count...); Joins, Unions, Aggregates; Analytics (Iterative and data mining); Reporting
        • Stable Schema (Financial Analysis, Ad-Hoc/OLAP, Enterprise-Wide BI and Reporting, Spatial/Temporal, Active Execution): Teradata/Hadoop for storage and loading, Teradata for the remaining stages
        • Evolving Schema (Interactive Data Discovery, Web Clickstream, Set-Top Box Analysis, CDRs, Sensor Logs, JSON): Hadoop, Aster / Hadoop, and Aster (SQL + MapReduce Analytics)
        • Format, No Schema (Social Feeds, Text, Image Processing, Audio/Video Storage and Refining, Storage and Batch Transformations): Hadoop and Aster (MapReduce Analytics)
    • When to Use Which? The best approach by workload and data type. Processing as a Function of Schema Requirements and Stage of Data Pipeline
        • Pipeline stages: Low Cost Storage and Fast Loading; Data Pre-Processing, Refining, Cleansing; “Simple math at scale” (Score, filter, sort, avg., count...); Joins, Unions, Aggregates; Analytics (Iterative and data mining); Reporting
        • Stable Schema: Teradata/Hadoop for storage and loading, Teradata for the remaining stages
        • Evolving Schema: Hadoop, Aster / Hadoop, and Aster (SQL + MapReduce Analytics)
        • Format, No Schema: Hadoop and Aster (MapReduce Analytics)
    • Ease of Development and Reuse. Analytic Foundation: 70+ out-of-the-box modules. Business-ready SQL-MapReduce Functions
        Path Analysis (Discover patterns in rows of sequential data)
        • nPath: complex sequential analysis for time series analysis and behavioral pattern analysis
        • Sessionization: identifies sessions from time series data in a single pass over the data
        • Attribution: operator to help ad networks and websites to distribute “credit”
        Statistical Analysis (High-performance processing of common statistical calculations)
        • Histogram: function to provide capability of generating
        • Decision Trees: native implementation of parallel random forests
        • Approximate percentiles and distinct counts: calculate percentiles and counts within specific variance
        • Correlation: calculation that characterizes the strength of the relation between different columns
        • Regression: performs linear or logistic regression between an output variable and a set of input variables
        • Averages: calculate moving, weighted, exponential or volume-weighted averages over a window of data
        Relational Analysis (Discover important relationships among data)
        • Graph analysis: finds shortest path from a distinct node to all other nodes in a graph
        • Tokenization: splits strings into individual words to assist text processing
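
As an illustration of what the Sessionization module listed above does conceptually (assigning session numbers to time series data in a single pass), here is a minimal Python sketch. It is not the Aster function: the name `sessionize`, the 1800-second inactivity timeout, and the assumption that each user's clicks arrive in timestamp order are all choices made for the example.

```python
def sessionize(clicks, timeout=1800):
    """Tag each click with a per-user session number in one pass.

    `clicks` is an iterable of (user, timestamp_seconds) pairs in which each
    user's timestamps are non-decreasing. A new session starts when the gap
    since that user's previous click exceeds `timeout` seconds.
    """
    last_seen = {}   # user -> timestamp of previous click
    session_no = {}  # user -> current session number
    tagged = []
    for user, ts in clicks:
        if user not in last_seen or ts - last_seen[user] > timeout:
            session_no[user] = session_no.get(user, -1) + 1
        last_seen[user] = ts
        tagged.append((user, ts, session_no[user]))
    return tagged


# Hypothetical clickstream: u1 returns after a 40-minute gap.
clicks = [("u1", 0), ("u1", 600), ("u1", 3000), ("u2", 100)]
tagged = sessionize(clicks)
```

In the sample, the third click by `u1` falls 2400 seconds after the second, so it opens session 1, while `u2` starts its own session 0. Keeping only the last timestamp and session number per user is what allows the work to be done in one pass.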
    • Ease of Development and Reuse. Analytic Foundation: 50+ out-of-the-box modules. SQL-MapReduce Analytic Functions
        Text Analysis (Derive patterns in textual data)
        • Text Processing: counts occurrences of words, identifies roots, & tracks relative positions of words & multi-word phrases
        • Text Partition: analyzes text data over multiple rows
        • Levenshtein Distance: computes the distance between two words
        Cluster Analysis (Discover natural groupings of data points)
        • k-Means: clusters data into a specified number of groupings
        • Canopy: partitions data into overlapping subsets within which k-means is performed
        • Minhash: buckets highly-dimensional items for cluster analysis
        • Basket analysis: creates configurable groupings of related items from transaction records in single pass
        • Collaborative Filter: predicts the interests of a user by collecting interest information from many users
        Data Transformation (Transform data for more advanced analysis)
        • Unpack: extracts nested data for further analysis
        • Pack: compresses multi-column data into a single column
        • Antiselect: returns all columns except for specified column
        • Multicase: case statement that supports row match for multiple cases
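
The Levenshtein Distance module above computes the edit distance between two words. For readers unfamiliar with the measure, the standard dynamic-programming formulation can be sketched in a few lines of Python. This shows the textbook algorithm only, and says nothing about how the Aster function is implemented or invoked.

```python
def levenshtein(a, b):
    """Return the minimum number of single-character insertions, deletions,
    or substitutions needed to turn string `a` into string `b`."""
    # prev[j] holds the distance between the processed prefix of `a` and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty string
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute (or match)
            ))
        prev = curr
    return prev[-1]
```

For example, `levenshtein("kitten", "sitting")` evaluates to 3 (two substitutions and one insertion). Keeping only two rows of the table keeps memory proportional to the length of the second word.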
    • Perceptions & Questions. Analyst: Robin Bloor
    • The Bloor Group
    • Big Data Is About Analytics
        Data ain’t what it used to be:
        • Machine generated data (logs)
        • Web data
        • Social media data
        • Public data services
        • Supply chain data
        • Real-time data flows
        The analogy of strip-mining is relevant because the scale of data analytics has expanded dramatically
    • The Data Analytics Issue
    • What Hadoop Is NOT
        • A multiuser highly tuned engine
        • An analytics platform
        • A solution
        But it IS: a useful, flexible and very economic data store, with plug-ins
    • About Data Analytics
        • It is all about TIME TO INSIGHT, as long as that is followed by action
        • Fast time to insight requires FLEXIBLE management of high performance data flows, for the benefit of the data analyst
        • The data analyst needs to be able to MARSHAL the data
        • Then maybe, just maybe, he will deserve the title of DATA SCIENTIST
    • Clearly the Teradata Aster Big Analytics Appliance is a powerful data flow engine, so:
        • How does Aster Data achieve its performance lift with MapReduce?
        • How is it most usually deployed?
        • Can it do data cleansing in flight?
        • Can it perform analytic tasks?
    • Further questions
        • Why an appliance? What is gained and what is sacrificed?
        • Which sectors/businesses do you expect to be able to make best use of this technology?
        • Which companies/products do you regard as competitors (either direct or near)?
        • Which companies/products do you partner with?
        • How does the appliance fit in the cloud?
    • Upcoming Topics. This month: Big Data. February: Analytics. March: Open Source. April: Intelligence. www.insideanalysis.com
    • Thank You for Your Attention