Overview - IBM Big Data Platform

Transcript

  • 1. Overview - Big Data & Analytics
    Vikas K Manoria, Technical Consultant – Big Data & Analytics, vmanoria@in.ibm.com
    © 2013 IBM Corporation
  • 2. IBM Big Data & Analytics – Agenda
    What is Big Data?
      – Concepts
      – Characteristics
    Business Motivation
      – Big Data Challenges
      – How Big Data Impacts Every Aspect of Your Business
      – A Big Data Journey
    IBM Big Data Platform
      – InfoSphere Data Explorer
      – InfoSphere BigInsights
      – IBM PureData Systems, InfoSphere Warehouse
      – InfoSphere Streams
    Big Data Use Cases
    Get Started
  • 3. What is Big Data?
    – All kinds of data
    – Large volumes
    – Valuable insight, but difficult to extract
    – May be extremely time sensitive
    Big Data is a hot topic because technology makes it possible to analyze ALL available data.
    “Big data technologies describe a new generation of technologies and architectures, designed to economically extract value from very large volumes of a wide variety of data, by enabling high velocity capture, discovery and/or analysis.” Source: Matt Eastwood, IDC
  • 4. Characteristics of Big Data: V4 = Volume, Velocity, Variety, Veracity
    – Volume: cost-efficiently processing the growing Volume (50x growth from 2010 to 2020, to 35 ZB)
    – Velocity: responding to the increasing Velocity (30 billion RFID sensors and counting)
    – Variety: collectively analyzing the broadening Variety (80% of the world’s data is unstructured)
    – Veracity: establishing the Veracity of big data sources (1 in 3 business leaders don’t trust the information they use to make decisions)
  • 5. Information is at the Center of a New Wave of Opportunity… And Organizations Need Deeper Insights
    – 44x as much data and content over the coming decade: 800,000 petabytes in 2009 to 35 zettabytes in 2020
    – 80% of the world’s data is unstructured
    – 1 in 3 business leaders frequently make decisions based on information they don’t trust, or don’t have
    – 1 in 2 business leaders say they don’t have access to the information they need to do their jobs
    – 83% of CIOs cited “Business intelligence and analytics” as part of their visionary plans to enhance competitiveness
    – 60% of CEOs need to do a better job capturing and understanding information rapidly in order to make swift business decisions
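The 44x figure on slide 5 can be checked from the two endpoints the slide quotes. This is a minimal sketch, assuming decimal units (1 zettabyte = 1,000,000 petabytes); the variable names are illustrative and not from the deck.

```python
# Assumption: decimal (SI) units, so 1 ZB = 10**21 bytes and 1 PB = 10**15 bytes.
PB_PER_ZB = 10**6

data_2009_pb = 800_000          # 800,000 petabytes (slide figure for 2009)
data_2020_pb = 35 * PB_PER_ZB   # 35 zettabytes (slide figure for 2020)

# Ratio of the 2020 volume to the 2009 volume.
growth = data_2020_pb / data_2009_pb
print(f"growth factor: {growth:.2f}x")  # prints "growth factor: 43.75x"
```

The result, 43.75, rounds to the 44x quoted on the slide.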
  • 6. Merging the Traditional and Big Data Approaches
    Traditional Approach: Structured & Repeatable Analysis
      – Business users determine what question to ask
      – IT structures the data to answer that question
      – Examples: monthly sales reports, profitability analysis, customer surveys
    Big Data Approach: Iterative & Exploratory Analysis
      – IT delivers a platform to enable creative discovery
      – Business explores what questions could be asked
      – Examples: brand sentiment, product strategy, maximum asset utilization
  • 7. Imagine the Possibilities of Harnessing Your Data Resources
    Big data challenges exist in every organization today:
    – Retailer reduces time to run queries by 80% to optimize inventory
    – Stock exchange cuts queries from 26 hours to 2 minutes on 2 PB
    – Government cuts acoustic analysis from hours to 70 milliseconds
    – Utility avoids power failures by analyzing 10 PB of data in minutes
    – Telco analyses streaming network data to reduce hardware costs by 90%
    – Hospital analyses streaming vitals to detect illness 24 hours earlier
  • 8. Leveraging Big Data Requires Multiple Platform Capabilities
    – Understand and navigate federated big data sources: Federated Discovery and Navigation
    – Manage and store huge volume of any data: Hadoop File System, MapReduce
    – Structure and control data: Data Warehousing
    – Manage streaming data: Stream Computing
    – Analyze unstructured data: Text Analytics Engine
    – Integrate and govern all data sources: Integration, Data Quality, Security, ILM, MDM
  • 9. IBM’s Business-centric Big Data Platform
    Enables you to start with a critical business need and expand the foundation for future requirements
    – “Big data” isn’t just a technology; it’s a business strategy for capitalizing on information resources
    – Getting started is crucial
    – Success at each entry point is accelerated by products within the big data platform
    – Build the foundation for future requirements by expanding further into the big data platform
  • 10. A Big Data Journey: Anticipating and Improving Customer Interactions
    Financial and tax preparation software and services; $4.15B revenue in 2012
    Project 1: Big Data Foundation
      – Data Warehousing, Data Quality, Customer Data Hub
      – Single view of the customer
    Project 2: Analytics
      – Customer behavior and segmentation analysis
      – Reduced customer churn 10%
      – $10M new revenue in 12 months
    Project 3: Unstructured Data Analytics
      – Social media analysis, Log Analysis, Text Analytics
      – Augment customer profiles with new data sources
      – Data warehouse cost optimization
      – Data Exploration
    Project 4: Real Time Analytics
      – No latency analytics
      – Real time behavior prediction
      – Real time customer segmentation
  • 11. IBM Big Data Platform
    Platform components:
      – Visualization & Discovery: gather, extract and explore data using best of breed visualization
      – Accelerators: speed time to value with analytic and application accelerators
      – Hadoop System: cost-effectively analyze petabytes of structured and unstructured information
      – Stream Computing: analyze streaming data and large data bursts for real-time insights
      – Data Warehouse: deliver deep insight with advanced in-database analytics and operational analytics
      – Information Integration & Governance: govern data quality and manage information lifecycle
      – Contextual Discovery: index and federated discovery for contextual collaborative insights
      – Systems Management; Applications & Development
    Surrounding layers: Solutions; Analytics and Decision Management; Big Data Platform and Application Frameworks; Big Data Infrastructure; Cloud | Mobile | Security
  • 12. An example of the big data platform in practice
    – Ingestion and Real-time Analytic Zone: Streams, Connectors
    – Landing and Analytics Sandbox Zone: Hadoop, MapReduce, Hive/HBase, Col Stores, documents in variety of formats
    – Warehousing Zone: Enterprise Warehouse, Data Marts
    – Analytics and Reporting Zone: BI & Reporting, Predictive Analytics, Visualization & Discovery
    – Metadata and Governance Zone: ETL, MDM, Data Governance
  • 13. Technology Example: Integrate big data sources with enterprise data
    – IBM Business Analytics: SPSS Modeler, Cognos RTM, Cognos Insight, Cognos BI, Cognos Consumer Insight
    – Analytics functions shown: Real-time Analytics, Predictive, Export and Explore, Social Media Analysis, Reporting / Analysis, Dashboards
    – IBM Big Data Platform: InfoSphere BigInsights, PureData Systems
    – Data inputs: Data In-Motion, Data At-Rest, Other Sources
  • 14. Big Data Key Use Cases
    – Big Data Exploration: find, visualize, understand all big data to improve decision making
    – Enhanced 360° View of the Customer: extend existing customer views (MDM, CRM, etc.) by incorporating additional internal and external information sources
    – Operations Analysis: analyze a variety of machine data for improved business results
    – Data Warehouse Augmentation: integrate big data and data warehouse capabilities to increase operational efficiency
    – Security/Intelligence Extension: lower risk, detect fraud and monitor cyber security in real-time
  • 15. Big Difference: Schema on Run
    – Regular database (schema on load): raw data → schema to filter → storage (pre-filtered data)
    – Big Data (Hadoop) (schema on run): raw data → storage (unfiltered, raw data) → schema to filter → output
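The contrast on slide 15 can be sketched in a few lines of plain Python. This is an illustration only, not Hadoop or any IBM API: the sample records, `apply_schema`, `query_total`, and `query_notes` are all hypothetical names made up for the example.

```python
import json

# Hypothetical raw input: three JSON lines, one of which does not fit the schema.
RAW_EVENTS = [
    '{"user": "a", "amount": 12.5, "country": "IN"}',
    '{"user": "b", "amount": "n/a", "note": "free text"}',
    '{"user": "c", "amount": 7, "country": "US"}',
]

def apply_schema(line):
    """Return a (user, amount) tuple, or None if the record does not fit."""
    record = json.loads(line)
    try:
        return record["user"], float(record["amount"])
    except (KeyError, TypeError, ValueError):
        return None

# Schema on load: filter first, then store only the rows that fit.
schema_on_load_store = [row for row in map(apply_schema, RAW_EVENTS) if row is not None]

# Schema on run: store every raw line, apply a schema only when a query runs.
schema_on_run_store = list(RAW_EVENTS)

def query_total(raw_store):
    """Apply the (user, amount) schema at query time and sum the amounts."""
    rows = (apply_schema(line) for line in raw_store)
    return sum(row[1] for row in rows if row is not None)

def query_notes(raw_store):
    """A later question the load-time schema never anticipated."""
    records = (json.loads(line) for line in raw_store)
    return [record["note"] for record in records if "note" in record]
```

With schema on load, the second record is discarded before storage, so its `note` field can never be queried. With schema on run, `query_notes(schema_on_run_store)` can still reach it, at the cost of parsing raw data on every query.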
  • 16. IBM Big Data & Analytics
  • 17. BigInsights Enterprise Edition
    Legend: Open Source, IBM
    Connectivity and Integration: Streams, Netezza, Text processing engine and library, JDBC, Flume
    Infrastructure: Jaql, Hive, Pig, HBase, MapReduce, HDFS, ZooKeeper, Indexing, Lucene, Adaptive MapReduce, Oozie, Text compression, Enhanced security, Flexible scheduler
    Optional IBM and partner offerings
    Analytics and discovery “Apps”: DB2, BigSheets, Web Crawler, Distrib file copy, DB export, Boardreader, DB import, Ad hoc query, Machine learning, Data processing, . . .
    Administrative and development tools
      Web console
        • Monitor cluster health, jobs, etc.
        • Add / remove nodes
        • Start / stop services
        • Inspect job status
        • Inspect workflow status
        • Deploy applications
        • Launch apps / jobs
        • Work with distrib file system
        • Work with spreadsheet interface
        • Support REST-based API
        • . . .
      R
      Eclipse tools
        • Text analytics
        • MapReduce programming
        • Jaql, Hive, Pig development
        • BigSheets plug-in development
        • Oozie workflow generation
      Integrated installer
    Other labels: IBM Cognos BI, GPFS (EAP), Accelerator for machine data analysis, Accelerator for social data analysis, Guardium, DataStage, Data Explorer, Sqoop, HCatalog
  • 18. Stream Computing Represents a Paradigm Shift
    Traditional Computing
      – Historical fact finding
      – Find and analyze information stored on disk
      – Batch paradigm, pull model
      – Query-driven: submits queries to static data
    Stream Computing (Real-time Analytics)
      – Current fact finding
      – Analyze data in motion, before it is stored
      – Low latency paradigm, push model
      – Data driven: bring data to the analytics
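The pull and push models on slide 18 can be contrasted with a small sketch. It is plain Python, not InfoSphere Streams or SPL; `READINGS`, `THRESHOLD`, `pull_query`, and `PushAnalytic` are hypothetical names chosen for illustration.

```python
from typing import Callable, List

READINGS = [3, 9, 4, 12, 7]   # hypothetical sensor values
THRESHOLD = 8                 # hypothetical alert level

def pull_query(store: List[int]) -> List[int]:
    """Pull model: data is stored first, then a query scans the static store."""
    return [reading for reading in store if reading > THRESHOLD]

class PushAnalytic:
    """Push model: each reading is handed to the analytic as it arrives."""

    def __init__(self, on_alert: Callable[[int], None]) -> None:
        self.on_alert = on_alert

    def on_tuple(self, reading: int) -> None:
        # The decision is made per reading, before anything is stored.
        if reading > THRESHOLD:
            self.on_alert(reading)

alerts: List[int] = []
analytic = PushAnalytic(alerts.append)
for reading in READINGS:      # this loop stands in for a live feed
    analytic.on_tuple(reading)
```

Both paths flag the same readings here. The difference is timing: `pull_query` answers only when someone asks and only over what has already been stored, while `PushAnalytic.on_tuple` reacts the moment each reading arrives.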
  • 19. Big Data in real-time with InfoSphere Streams
    Operations shown: Modify, Filter / Sample, Classify, Fuse, Annotate, Score, Windowed Aggregates, Analyze
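Three of the operations named on slide 19 (filter, classify, windowed aggregate) can be chained as a pipeline. The sketch below uses plain Python generators and is not the InfoSphere Streams operator API; `filter_op`, `classify_op`, `windowed_mean`, and the thresholds are assumptions made for the example.

```python
from collections import deque
from typing import Iterable, Iterator, Tuple

def filter_op(stream: Iterable[float], minimum: float) -> Iterator[float]:
    """Filter: drop tuples below a minimum value."""
    for value in stream:
        if value >= minimum:
            yield value

def classify_op(stream: Iterable[float], high: float) -> Iterator[Tuple[float, str]]:
    """Classify: attach a label to each tuple."""
    for value in stream:
        yield value, ("high" if value >= high else "normal")

def windowed_mean(stream: Iterable[Tuple[float, str]], size: int) -> Iterator[Tuple[str, float]]:
    """Windowed aggregate: mean over a sliding window of the last `size` values."""
    window = deque(maxlen=size)   # oldest value is evicted automatically
    for value, label in stream:
        window.append(value)
        yield label, sum(window) / len(window)

source = [5, -1, 20, 8, 30]   # hypothetical input tuples
pipeline = windowed_mean(classify_op(filter_op(source, minimum=0), high=10), size=2)
results = list(pipeline)
```

Because each stage is a generator, a tuple flows through all three stages before the next one is read, which mirrors the tuple-at-a-time processing the slide describes. The window holds only the last `size` values, so memory use stays bounded however long the stream runs.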
  • 20. Analytic Accelerators Designed for Velocity (and Variety)
    – Mining in Microseconds (included with Streams)
    – Image & Video (Open Source)
    – Simple & Advanced Text (included with Streams) (IBM Research) (Open Source UIMA), e.g. (listen, verb), (radio, noun)
    – Acoustic (IBM Research) (Open Source)
    – Geospatial (IBM Research)
    – Predictive (IBM Research)
    – Advanced Mathematical Models (IBM Research)
    – Statistics (included with Streams)
  • 21. Putting it all together … end-to-end big data solution
    – Components: Streaming Data Sources, InfoSphere Streams, InfoSphere BigInsights, InfoSphere Warehouse, Netezza Appliance, IBM SPSS, IBM Cognos
    – Stages shown: Discover, Model, Visualize & Publish, Score, Measure
  • 22. IBM Big Data & Analytics
  • 23. IBM Big Data & Analytics
  • 24. IBM Big Data & Analytics
  • 25. Cognos Business Intelligence optimized for Big SQL
    – Big SQL enables the Cognos BI server to delegate many types of analytical computations to BigInsights MapReduce processing, instead of computing them locally at a performance cost as it would with Hive
    – Faster response times due to increased opportunity for query processing to occur closer to the data
    – Not hindered by the latency and other limitations of querying Hadoop via Hive
    Diagram labels: Cognos BI Server (Explore & Analyze; Report & Act); SQL Interface via JDBC; Hive; InfoSphere BigInsights: Application (Map-Reduce), Storage (HBase, HDFS)
  • 26. Power Performance – Cognos BI + DB2 BLU
    – 38x average acceleration of database queries for reporting (2)
    – Faster cube load*, faster DB query*
    Diagram labels: Dynamic Query, Compatible Query, Dynamic Cubes, DB2 with BLU (columns C1 to C8), Cognos BI + DB2 BLU
    2. Based on internal tests.
  • 27. IBM PureData Systems: Meeting Big Data Challenges – Fast and Easy!
    – System for Transactions: for apps like E-commerce… Database cluster services optimized for transactional throughput and scalability
    – System for Analytics: for apps like Customer Analysis… Data warehouse services optimized for high-speed, peta-scale analytics and simplicity
    – System for Operational Analytics: for apps like Real-time Fraud Detection… Operational data warehouse services optimized to balance high performance analytics and real-time operational throughput
    – System for Hadoop: for Exploratory Analysis & Queryable Archive. Hadoop data services optimized for big data analytics and online archive with appliance simplicity
  • 28. Use Cases for a Big Data Platform
    Innovate New Products at Speed and Scale
      – Social media: product/brand sentiment analysis
      – Brand strategy
      – Market analysis
      – RFID tracking & analysis
      – Transaction analysis to create insight-based product/service offerings
    Know Everything about your Customer
      – Social media customer sentiment analysis
      – Promotion optimization
      – Segmentation
      – Customer profitability
      – Click-stream analysis
      – CDR processing
      – Multi-channel interaction analysis
      – Loyalty program analytics
      – Churn prediction
    Run Zero Latency Operations
      – Smart Grid/meter management
      – Distribution load forecasting
      – Sales reporting
      – Inventory & merchandising optimization
      – Options trading
      – ICU patient monitoring
      – Disease surveillance
      – Transportation network optimization
      – Store performance
      – Environmental analysis
      – Experimental research
    Instant Awareness of Risk and Fraud
      – Multimodal surveillance
      – Cyber security
      – Fraud modeling & detection
      – Risk modeling & management
      – Regulatory reporting
    Exploit Instrumented Assets
      – Network analytics
      – Asset management and predictive issue resolution
      – Website analytics
      – IT log analysis
  • 29. Every Industry can Leverage Big Data and Analytics
    – Insurance: 360° View of Domain or Subject; Catastrophe Modeling; Fraud & Abuse
    – Banking: Optimizing Offers and Cross-sell; Customer Service and Call Center Efficiency
    – Telco: Pro-active Call Center; Network Analytics; Location Based Services
    – Energy & Utilities: Smart Meter Analytics; Distribution Load Forecasting/Scheduling; Condition Based Maintenance
    – Media & Entertainment: Business process transformation; Audience & Marketing Optimization
    – Retail: Actionable Customer Insight; Merchandise Optimization; Dynamic Pricing
    – Travel & Transport: Customer Analytics & Loyalty Marketing; Predictive Maintenance Analytics
    – Consumer Products: Shelf Availability; Promotional Spend Optimization; Merchandising Compliance
    – Government: Civilian Services; Defense & Intelligence; Tax & Treasury Services
    – Healthcare: Measure & Act on Population Health Outcomes; Engage Consumers in their Healthcare
    – Automotive: Advanced Condition Monitoring; Data Warehouse Optimization
    – Life Sciences: Increase visibility into drug safety and effectiveness
    – Chemical & Petroleum: Operational Surveillance, Analysis & Optimization; Data Warehouse Consolidation, Integration & Augmentation
    – Aerospace & Defense: Uniform Information Access Platform; Data Warehouse Optimization
    – Electronics: Customer/Channel Analytics; Advanced Condition Monitoring
  • 30. Clients Achieve Breakthrough Outcomes With IBM’s Big Data Platform
    Columns: Imperative, Primary Capability, Business Value
    Imperatives: Run Zero Latency Operations; Instant Awareness of Risk and Fraud; Exploit Instrumented Assets; Know Everything about your Customers
    Primary capability and business value:
      – InfoSphere BigInsights: reduce maintenance costs and differentiate by optimal turbine placement
      – PureData for Analytics: analysis time on 2 PB of data cut from 26 hours to 2 minutes
      – PureData for Analytics: increased network availability by identifying and fixing holes
      – InfoSphere Data Explorer: provide single point of access to disparate data sources; secure single point of access to all enterprise data (Aircraft Manufacturer)
      – InfoSphere Streams: analyzed call records to drive real-time promotions & reduce churn
  • 31. A Catalyst for ISV and Partner Innovation
    Traditional approach, followed by transformational outcome:
    – Customer segmentation based on loyalty data → Capturing information from all interactions to improve customer lifetime value
    – Managing rising cost of care → Combining data from hundreds of hospitals to improve results across the healthcare continuum
    – Historical analysis of subscriber data → 2 million events analyzed per minute, delivering real-time insight to mobile operators
    – Anti-corruption and bribery compliance program → Use Big Data analytics to prioritize and isolate areas of risk or rogue activity
    – Manual supply chain integration → Provide visibility, analysis and reporting across the entire supply chain (planning -> execution)
    – Treat-first, seek-payment-later and write off bad debt → Measure and predict patient payment behavior, reduce risk from bad debt and boost collection rates
    – Random parking meter patrols & search for open spots → Analyzing parking systems to maximize revenue & improve the parking experience in cities
  • 32. Get started!
    Phases: Think Big; Pick your Spot; Execute and Deliver Value
    – Identify and prioritize business use cases
    – New insights and new possibilities
    – New revenue opportunities
    – Process and performance improvement
    – Evolve your existing analytics capabilities
    – Build or acquire new skills required
    – Measure and communicate success
    – Ensure that the business is engaged
    – Agree on the key measures for success
  • 33. Thank You
    April 24, 2014. © 2013 IBM Corporation