Technology Strategies for Big Data Analytics

SAS presentation delivered at the Big Analytics Roadshow 2012 in New York City on December 12, 2012

Presentation title: Technology Strategies for Big Data Analytics, by Bernard Blais, Global Strategist and Principal Manager, SAS

The exploding volume, complexity and velocity of big data present an increasing challenge to organizations, but also a significant opportunity to derive valuable insights. As organizations are tasked with managing massive data sets, it’s clear that the value of big data will be derived from the analytics that can be performed on it. Analytics is the key to identifying patterns, managing risks and tackling previously unsolvable problems. This presentation provides an overview of how to comprehensively tackle big data, including emerging strategies for information management, analytics, and high-performance analytics.

Published in: Technology

  1. TECHNOLOGY STRATEGIES FOR BIG DATA ANALYTICS. Bernard Blais, Principal, Global Technology Practice. Copyright © 2012, SAS Institute Inc. All rights reserved.
  2. THE CHALLENGE? Volume, Variety, Velocity. Data size: today vs. the future.
  3. TECHNOLOGY CHECKLIST FOR BIG DATA ANALYTICS
      • A flexible architecture that supports many data types and usage patterns
      • Upstream use of analytics to optimize data relevance
      • Real-time visualization and advanced analytics to accelerate understanding and action
      • Collaborative approaches to align Business and IT executives
  4. THE ANALYTICS LIFECYCLE. How can you create competitive advantage?
      Stages: Identify / Formulate Problem; Data Preparation; Data Exploration; Transform & Select; Build Model; Validate Model; Deploy Model; Evaluate / Monitor Results.
      Roles:
      • Business Manager: Domain Expert; Makes Decisions; Evaluates Processes and ROI
      • Data Scientist: Data Exploration; Data Visualization
      • Data Miner / Statistician: Exploratory Analysis; Descriptive Segmentation; Predictive Modeling
      • IT Systems / Management: Model Validation; Model Deployment; Model Monitoring; Data Preparation
  5. HIGH-PERFORMANCE ANALYTICS: KEY COMPONENTS
  6. HIGH-PERFORMANCE ANALYTICS: KEY COMPONENTS. How can you create competitive advantage?
      Stages: Identify / Formulate Problem; Data Preparation; Data Exploration; Transform & Select; Build Model; Validate Model; Deploy Model; Evaluate / Monitor Results.
      Roles:
      • Business Manager: Domain Expert; Makes Decisions; Evaluates Processes and ROI
      • Data Scientist: Data Exploration; Data Visualization
      • Data Miner / Statistician: Exploratory Analysis; Descriptive Segmentation; Predictive Modeling
      • IT Systems / Management: Model Validation; Model Deployment; Model Monitoring; Data Preparation
  7. HIGH-PERFORMANCE ANALYTICS: KEY COMPONENTS. Core opportunity:
      • Prepare (Grid Computing / In Memory): bigger data
      • Develop (In Memory): better results
      • Deploy (In Database / In Memory): faster decisions
  8. HIGH-PERFORMANCE ANALYTICS: SAS® GRID COMPUTING
  9. HIGH-PERFORMANCE ANALYTICS: SAS® IN-DATABASE
  10. HOW DO WE MANAGE DATA IN THE PHYSICAL WORLD? 1. Acquire. 2. Determine Relevance. 3. Store. Destinations shown: Trash, Cache, Storage.
  11. HOW DO WE MANAGE INFORMATION IN THE IT WORLD?
      • Relevance is traditionally determined at query time: “Acquire, Store, Analyze”
      • A Big Data Analytics strategy requires a new approach: “Stream it, Score it, Store it”
      Diagram labels: Users; Systems; Queries; Data Acquisition; Data Transformations; Data Normalization; Data.
  12. INFORMATION MANAGEMENT: STREAM IT, SCORE IT, STORE IT. Diagram labels: Raw Data; Relevant Data; Low Cost Storage; Enterprise Data; Decisions / Actions.
  13. CUSTOMER CASE STUDY: TRADITIONAL ANALYTICS PROCESS. 3 HRS. Stages: Data Exploration; Model Development; Model Deployment.
  14. CUSTOMER CASE STUDY: HIGH-PERFORMANCE ANALYTICS PROCESS
      Past Approach:
      • Daily process begins with flat file creation at 6:30am; SLA delivered at ~9:30am.
      • File transferred to SQL Server, limited to ~350K customer records based on specific criteria.
      • 300 step process to support data mining life cycle.
      • 30 MINUTES TO SCORE ~350K customers
      In-Database Approach:
      • Daily process begins at 4:00am with EDW load.
      • All business operational data loaded directly to EDW. No flat file or intermediate processing is needed.
      • 10 step process
      • Scoring and customer selection done in-database against ALL customer rows
      • 4 MINUTES TO SCORE ~40M customers
      Value:
      - Scope of customer analysis: 350K vs. 40M
      - Monthly collections: $1M-$3M per month
      Additional callout: 12 minutes
  15. HIGH-PERFORMANCE ANALYTICS: SAS® IN-MEMORY ANALYTICS
  16. IN-MEMORY ARCHITECTURE: EXPLORATION AND VISUALIZATION. > 1.1 BILLION RECORDS; 10 SECONDS.
  17. IN-MEMORY ARCHITECTURE: MODEL DEVELOPMENT & DEPLOYMENT. 5½ HRS; 82 SECONDS.
  18. CUSTOMER CASE STUDY: Customer Segmentation. Tailored and Real-time Marketing Campaigns. Billions of Purchase Transactions.
  19. CUSTOMER CASE STUDY: TRADITIONAL ANALYTICS PROCESS. 167 Hours. Stages: Data Exploration; Model Development; Model Deployment.
  20. CUSTOMER CASE STUDY: IN-MEMORY ANALYTICS PROCESS. 167 Hours; 84 SECONDS. Stages: Data Exploration; Model Development; Model Deployment. Bottom-line Impact: Tens of Millions of Dollars.
  21. SAS HIGH-PERFORMANCE ANALYTICS
  22. SAS HIGH-PERFORMANCE ANALYTICS
  23. SAS HIGH-PERFORMANCE ANALYTICS
  24. BEST PRACTICE: Business Analytics Maturity Assessment
      Overview: Two-day on-site discovery session focused on understanding the client’s business and IT objectives, key initiatives, existing information management and analytics architecture, top challenges, and priorities.
      Process:
      • Review current business requirements, timeframes, critical success factors, and key business metrics (e.g. customer retention, customer acquisition).
      • Review operational data sources to support business priorities.
      • Review analytical priorities, strategy, process, and gaps.
      Deliverables:
      • Technology roadmap to optimize the client’s current and future IT-enabled analytical process.
      • Projected high-level ROI analysis resulting from proposed analytical architecture and process improvements.
  25. SAS: PROVEN VALUE PROPOSITION ACROSS MULTIPLE INDUSTRIES (industry; use case; value)
      • Financial Services; Risk Management; 356X faster risk calculations
      • Public Sector; Revenue Leakage; better able to audit, detect issues pre-refund
      • Telco; Campaign Optimization; 15% better campaign response rates, coupon redemption rate +15%
      • Retail; Inventory Management; markdown optimization from 30 hours to 2 hours
      • Services Company; Promotions Management; more precise than competition, faster in/out markets
  26. RETAIL USE CASE: In-database Model Scoring. 4½ HRS to 60 SECONDS.
      Overview:
      • The largest customer behavior marketing company in the world, Catalina Marketing analyzes and predicts shoppers’ buying behaviors to generate customized point-of-sale color coupons, advertisements and informational messages for retail stores and pharmacies nationwide.
      Process and Deliverables:
      • Leveraging in-database scoring, automated the execution of scoring models against their entire 140 million consumer database.
      Impact:
      • Catalina Marketing has reduced its model-scoring times from 4.5 hours to around 60 seconds using SAS Scoring Accelerator. As a result, it is able to use more complex, varied models to obtain analytical results faster for more efficient, reliable decisions, improving brand performance on behalf of its food, drug, and mass advertising and marketing partners.
      • Implementation of marketing campaigns in days vs. more than 1 month before.
  27. FINANCIAL SERVICES USE CASE: Credit Risk on Banking Data. 3 MINUTES.
      Overview:
      • Data source: bank loan portfolio covering 3 million loans; 5,000 stress scenarios; 40 time horizons; transition matrix approach.
      Process and Deliverables:
      • Estimates of credit losses under stress over multiple horizons.
      • Completed compute time: under 3 minutes.
      Impact:
      • Fast estimates of credit losses under stress over multiple horizons enable the Bank to make changes to lending practices throughout the day.
  28. PUBLIC SECTOR USE CASE: Text Mining on Unstructured Data. 5½ HRS to 82 SECONDS.
      Overview:
      • USA’s National Highway Traffic Safety Administration
      • 700,000 accident reports on vehicle makes and models, manufacturing date, purchase date, failures, mileage, number of cylinders, etc.; car components, accident information, etc.
      Process and Deliverables:
      • Text mining on accident reports. Analyze, understand, validate and predict contents.
      • Report on content categorization. Text mining process runs in 1 minute 22 seconds on a High Performance Analytics Server, instead of in 5½ hours on a regular server.
      Impact:
      • 99% time improvement means the whole process can now be considered an ITERATIVE, DYNAMIC process.
      • Analyst can run it 20 times before lunch, each time fine-tuning the model and improving the output, instead of maybe twice during the whole week.
  29. UTILITIES USE CASE: Forecasting on Smart Meter Data. 12 records vs. 30,000 records.
      Overview:
      • Oklahoma Gas & Electric Company (OG&E) serves nearly 800,000 customers in Oklahoma and western Arkansas. It was named the 2011 Utility of the Year.
      • Forecast energy demand with SAS Analytics, plan for future changes to its energy portfolio and optimize programs that encourage wiser use of energy.
      Process and Deliverables:
      • Use smart meter data coming from customers every 15 minutes (versus once a month) to create and measure the effectiveness of programs that reduce energy consumption.
      Impact:
      • What previously took one to three days can now be done in a matter of hours.
      • We’ve gone from receiving 12 records for each customer to over 30,000 records per year.
  30. CONCLUSION: What High-Performance Analytics Really Means
      It’s not just about incredible speed, it’s also about:
      • Confidence: No more sampling, subsetting, summarizing
      • Accuracy: More complex models, more variables
      • Efficiency: Leverage the Analytical Brain on valuable tasks
      • Agility: Adapt and (re)Act faster
  31. TECHNOLOGY CHECKLIST FOR BIG DATA ANALYTICS
      • A flexible architecture that supports many data types and usage patterns
      • Upstream use of analytics to optimize data relevance
      • Real-time visualization and advanced analytics to accelerate understanding and action
      • Collaborative approaches to align Business and IT executives
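
Slides 11 and 12 contrast "Acquire, Store, Analyze" with "Stream it, Score it, Store it", where relevance is decided as data arrives rather than at query time. The slides give no implementation, so the following is only a minimal Python sketch of that idea. The function name, the `toy_score` rule, the threshold, and the choice to archive every raw record in low-cost storage while forwarding only relevant records are all assumptions made for illustration.

```python
from typing import Callable, Dict, Iterable, List, Tuple

Record = Dict[str, float]


def stream_score_store(
    stream: Iterable[Record],
    score: Callable[[Record], float],
    threshold: float,
) -> Tuple[List[Record], List[Record]]:
    """Score each record as it arrives and route it before any query runs."""
    relevant_store: List[Record] = []   # stands in for enterprise data
    low_cost_store: List[Record] = []   # stands in for low-cost storage
    for record in stream:                   # "Stream it"
        low_cost_store.append(record)       # assumption: all raw data is archived
        if score(record) >= threshold:      # "Score it"
            relevant_store.append(record)   # "Store it" where decisions are made
    return relevant_store, low_cost_store


def toy_score(record: Record) -> float:
    """Hypothetical relevance score: larger amounts count as more relevant."""
    return record["amount"] / 1000.0


events = [
    {"id": 1, "amount": 50.0},
    {"id": 2, "amount": 900.0},
    {"id": 3, "amount": 1200.0},
]
relevant, raw = stream_score_store(iter(events), toy_score, threshold=0.8)
print(len(relevant), "relevant of", len(raw), "raw records")
```

In a real deployment the scoring function would be a trained model and the two lists would be separate storage tiers; the point of the sketch is only that scoring happens upstream of storage.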
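
Several slides quote before-and-after run times (slides 20, 25, 26 and 28). As a rough cross-check, the Python sketch below converts those pairs into a speed-up factor and a percentage reduction. The helper names are hypothetical, and the script assumes each pair of figures describes the same workload, with slide 20 read as a reduction from 167 hours to 84 seconds.

```python
HOUR = 3600  # seconds


def speedup(before_s: float, after_s: float) -> float:
    """How many times faster the new run is."""
    return before_s / after_s


def percent_reduction(before_s: float, after_s: float) -> float:
    """Share of the original run time that was eliminated, in percent."""
    return 100.0 * (1.0 - after_s / before_s)


# (before, after) in seconds, taken from the figures quoted on the slides
cases = {
    "Text mining, slide 28": (5.5 * HOUR, 82),
    "In-database scoring, slide 26": (4.5 * HOUR, 60),
    "In-memory process, slide 20": (167 * HOUR, 84),
    "Markdown optimization, slide 25": (30 * HOUR, 2 * HOUR),
}

for name, (before, after) in cases.items():
    print(
        f"{name}: {speedup(before, after):,.0f}x faster, "
        f"{percent_reduction(before, after):.2f}% less time"
    )
```

The text-mining case works out to a reduction of more than 99%, which agrees with the "99% time improvement" stated on slide 28.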
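
Slide 27 names a "transition matrix approach" for estimating credit losses over 40 time horizons but gives no details. The Python sketch below shows one common form of that technique, in which a loan's state distribution is multiplied by a transition matrix once per period. It is not the bank's or SAS's actual model: the three states and every probability in `BASE` and `STRESS` are hypothetical values chosen only to make the example run.

```python
from typing import List

Matrix = List[List[float]]

# Hypothetical states: 0 = performing, 1 = delinquent, 2 = default (absorbing)
BASE: Matrix = [
    [0.95, 0.04, 0.01],
    [0.30, 0.55, 0.15],
    [0.00, 0.00, 1.00],
]
STRESS: Matrix = [
    [0.90, 0.07, 0.03],
    [0.20, 0.55, 0.25],
    [0.00, 0.00, 1.00],
]


def step(dist: List[float], matrix: Matrix) -> List[float]:
    """Advance a state distribution by one period (row vector times matrix)."""
    n = len(matrix)
    return [sum(dist[i] * matrix[i][j] for i in range(n)) for j in range(n)]


def default_curve(start: List[float], matrix: Matrix, horizons: int) -> List[float]:
    """Cumulative default probability at each horizon from 1 to `horizons`."""
    dist = list(start)
    curve = []
    for _ in range(horizons):
        dist = step(dist, matrix)
        curve.append(dist[-1])
    return curve


performing = [1.0, 0.0, 0.0]
base_curve = default_curve(performing, BASE, horizons=40)
stress_curve = default_curve(performing, STRESS, horizons=40)
print(f"Horizon 40 default probability: base {base_curve[-1]:.3f}, stress {stress_curve[-1]:.3f}")

# Size of the workload quoted on the slide: loans x scenarios x horizons
combinations = 3_000_000 * 5_000 * 40
print(f"{combinations:,} loan-scenario-horizon combinations")
```

The last line only multiplies the three counts from the slide; whether every combination is evaluated individually in the production system is not stated in the source.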
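
Slide 29 says that moving from monthly reads to 15-minute smart meter reads took OG&E from 12 records to over 30,000 records per customer per year. The short Python check below reproduces that arithmetic. The function name is hypothetical, a 365-day year with no missed readings is assumed, and the customer count uses the slide's "nearly 800,000" as a round upper bound.

```python
MINUTES_PER_DAY = 24 * 60


def readings_per_year(interval_minutes: int, days: int = 365) -> int:
    """Number of meter readings per customer per year at a fixed interval."""
    return (MINUTES_PER_DAY // interval_minutes) * days


monthly_reads = 12                          # one reading per month
smart_meter_reads = readings_per_year(15)   # one reading every 15 minutes
customers = 800_000                         # "nearly 800,000" on the slide, used as an upper bound

print(f"Per customer per year: {smart_meter_reads:,} vs. {monthly_reads}")
print(f"Growth factor: {smart_meter_reads // monthly_reads:,}x")
print(f"Across all customers: up to {smart_meter_reads * customers:,} records per year")
```

Four readings an hour for a full year gives 35,040 records per customer, which is consistent with the slide's "over 30,000".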