NATC 2013 - Big Data in Real World by Chandra Kallur, IBM



  1. Big Data in the Real World
     Chandra S Kallur, Service Area Leader, Business Analytics and Optimization
     December 8, 2013
     © 2013 IBM Corporation
  2. Agenda
     • Big Data – Myths & Truths
     • The Big Data Strategy
     • Examples of Big Data Instantiation in the real world
     • Future of Big Data
     • What can Big Data do for your Organization?
  3. Big Data Myths
     • Myth: Big Data is only about unstructured data.
       False! Most projects include structured information sources.
     • Myth: Big Data projects are expensive.
       False! You should start small, and projects should be ROI positive.
     • Myth: Big Data technologies make traditional databases and warehouses obsolete.
       False! Databases and warehouses remain a vital part of analytic solutions.
     • Myth: Big Data technologies require BIG datasets.
       False! Flexibility, not data size, is the most important aspect.
  4. Big Data: Is It Only For A Few Industries?
     False?
  5. The Big Data Strategy: Move the Analytics Closer to the Data
     New analytic applications drive the requirements for a big data platform:
     • Integrate and manage the full variety, velocity and volume of data
     • Apply advanced analytics to information in its native form
     • Visualize all available data for ad-hoc analysis
     • Development environment for building new analytic applications
     • Workload optimization and scheduling
     • Security and Governance
  6. T-Mobile uses big data to optimize network performance and reduce costs
     • Needed a solution to store and analyze two years' worth of Call Detail Records (CDRs), switch, billing and network event data for over 30 million subscribers to identify and address network bottlenecks
     • Analyzes over 17 billion events per day to provide over 1,300 users with network Quality of Experience (QoE) analytics, traffic engineering, dropped session analytics, and voice and data session analytics
     • Business users can perform ad-hoc network and traffic analysis to identify performance issues in seconds and address them faster
  7. Ufone uses real-time analytics to reduce customer churn
     Need
     • Difficulty in managing marketing campaigns
     • No direct ability to correlate campaigns with earned business
     • Execute a successful marketing campaign based on real-time customer insights
     Benefits
     • Analyzed customer call detail records (CDRs) and created customer profile segmentation
     • Data is streamed and analyzed in real time, so offers reach clients in a timely manner
     • Campaign response time improved from 25% to 50%, CDR analysis time fell from 1 day to 30 seconds, and customer churn was reduced by 15 to 20%
  8. A European utility uses streams and predictive analytics to create accurate estimates of demand to fully capture and optimize the use of distributed generation resources
     Need
     • Real-time, scalable and accurate forecasts at a very low level of locality
     • Very high number of forecast models automatically updated with limited user interaction
     • Incorporate local, diverse information, such as local weather conditions or events
     • Simulation for test and what-if analysis on huge amounts of data
     Benefits
     • Accuracy: 20% improvement over industry and academic state of the art. Validated onsite with real consumption data
     • Performance: hundreds of thousands of time series processed on an IBM Blade server
     • Abrupt changes in demand were resolved with network reconfigurations
  9. Optimizing capital investments based on double-digit petabyte analysis
     • Model the weather to optimize placement of turbines, maximizing power generation and longevity
       • Modeling based on a global 1x1 kilometer grid with hundreds of variables
       • Time-to-analysis curve flattened from 3 weeks to 3 days!
     • Build models to cover forecasting and real-time operation of power generation units
       • Wind turbine sensor data collection to store and understand petabytes of actual operating results once the turbine is in production
       • Scope includes service intervals, mean time to failure, and optimization of turbine interaction with wind conditions
  10. A large U.S. regulated energy provider deploys condition-based maintenance to assess natural gas pipeline risks
     Need
     • Correlate data from multiple sources into one actionable platform, using the information to better plan and deploy inspection, detection, maintenance, repair and replacement resources and personnel
     Benefits
     • Unified source of truth by integrating data from:
       • GIS, EAM, historians
       • Corrosion history, drawings, cathodic protection
       • External data sources such as weather and soil
     • Analytics-driven condition-based assessment
     • Estimates of mean residual life and true asset age
     • The ability to associate asset condition with failure probability and mitigation actions
     • Identify prescriptive options on assets
  11. TerraEchos identifies and classifies potential security threats – miles away
     Need
     • More secure facilities
     • A U.S. high-security facility needed a physical intrusion detection system able to detect, classify, locate and track potential threats, above and below ground
     Benefits
     • Because the solution captures and transmits in real time, security personnel have unprecedented insight into any event, even when the disturbance is miles away, and can take appropriate action
  12. University of Ontario Institute of Technology (UOIT) Detects Neonatal Patient Symptoms Sooner
     Capabilities Utilized: Stream Computing
     • Performing real-time analytics using physiological data from neonatal babies
     • Continuously correlates data from medical monitors to detect subtle changes and alert hospital staff sooner
     • Early warning gives caregivers the ability to proactively deal with complications
     Results
     • Helps detect life-threatening conditions up to 24 hours sooner
     • Lower morbidity and improved patient care
  13. Future of Big Data – Cognitive computing at play
     (1) Understands natural language and human speech
     (2) Generates and evaluates hypotheses for better outcomes (figure values: 99%, 60%, 10%)
     (3) Adapts and learns from user selections and responses
  14. Big Data and Watson
     • Big Data technology is used to build Watson's knowledge base
     • Watson can consume insights from Big Data for advanced analysis
     • Watson uses the Apache Hadoop open framework to distribute the workload for loading information into memory
     Diagram labels:
     • Approx. 200M pages of text (to compete on Jeopardy!); Watson's Memory
     • CRM Data, Social Media, POS Data; InfoSphere BigInsights
     • Distilled Insight: spending habits, social relationships, buying trends
     • Advanced search and analysis
  15. DeepQA: The Technology Behind Watson
     Massively Parallel Probabilistic Evidence-Based Architecture
     • One Jeopardy! question can take 2 hours on a single 2.6 GHz core. Optimized and scaled out on a 2,880-core IBM HPC using UIMA-AS, Watson answers in 2-6 seconds.
     • Generates and scores many hypotheses using a combination of thousands of Natural Language Processing, Information Retrieval, Machine Learning and Reasoning algorithms. These gather, evaluate, weigh and balance different types of evidence to deliver the answer with the best support it can find.
     Pipeline stages (from the diagram): Question; Question & Topic Analysis; Question Decomposition; Hypothesis Generation; Hypothesis and Evidence Scoring; Synthesis; Final Confidence Merging & Ranking; Answer & Confidence
     Diagram annotations: Multiple Interpretations; 100s sources; 100s Possible Answers; 1000's of Pieces of Evidence; 100,000's scores from many simultaneous Text Analysis Algorithms
  16. Oncology Diagnosis and Treatment Demonstration
     IBM Watson Oncology Advisor
     IBM Confidential: References to potential future products are subject to the Important Disclaimer provided earlier in the presentation
  22. What can Big Data do for your organization?
     Act on Deeper Customer Insight
     • Social media customer sentiment analysis
     • Promotion optimization
     • Segmentation
     • Customer profitability
     • Click-stream analysis
     • CDR processing
     • Multi-channel interaction analysis
     • Loyalty program analytics
     • Churn prediction
     Create Innovative New Products
     • Social media product/brand sentiment analysis
     • Brand strategy
     • Market analysis
     • RFID tracking & analysis
     • Transaction analysis to create insight-based product/service offerings
     Optimize your Operational Processes
     • Smart Grid/meter management
     • Supply chain optimization
     • Sales reporting
     • Inventory & merchandising optimization
     • Options trading
     • ICU patient monitoring
     • Disease surveillance
     • Transportation network optimization
     • Store performance
     • Environmental analysis
     • Experimental research
     Prevent Fraud and Reduce Risk
     • Multimodal surveillance
     • Cyber security
     • Fraud modeling & detection
     • Risk modeling & management
     • Regulatory reporting
     Proactively Maintain your Assets
     • Network analytics
     • Asset management and predictive issue resolution
     • Website analytics
     • IT log analysis
  23. Questions?
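
The DeepQA slide (slide 15) describes a flow of hypothesis generation, evidence scoring, and final confidence merging and ranking. The Python sketch below is only a toy illustration of that general shape, not Watson's implementation: the corpus `SOURCES`, the word-overlap scorer, and the function names are all hypothetical stand-ins chosen for this example.

```python
from collections import defaultdict

# Hypothetical toy corpus: (candidate answer, supporting passage).
SOURCES = [
    ("Toronto", "Toronto is the largest city in Canada"),
    ("Ottawa", "Ottawa is the capital city of Canada"),
    ("Ottawa", "The Canadian parliament sits in Ottawa, the capital"),
]


def tokenize(text):
    """Lower-case the text and return its set of words, stripped of punctuation."""
    return {word.strip(".,?!").lower() for word in text.split()}


def generate_hypotheses(question, sources):
    """Hypothesis generation: keep every candidate whose passage shares a word with the question."""
    question_words = tokenize(question)
    return [(answer, passage) for answer, passage in sources
            if question_words & tokenize(passage)]


def score_evidence(question, passage):
    """Evidence scoring (toy): fraction of question words that appear in the passage."""
    question_words = tokenize(question)
    return len(question_words & tokenize(passage)) / len(question_words)


def merge_and_rank(question, sources):
    """Merge evidence scores per answer, normalise them to confidences, rank best first."""
    totals = defaultdict(float)
    for answer, passage in generate_hypotheses(question, sources):
        totals[answer] += score_evidence(question, passage)
    norm = sum(totals.values())
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    return [(answer, score / norm) for answer, score in ranked]


if __name__ == "__main__":
    for answer, confidence in merge_and_rank("What is the capital of Canada?", SOURCES):
        print(f"{answer}: {confidence:.2f}")
```

Summing scores per answer lets several weak pieces of evidence outweigh one stronger but isolated passage, which mirrors the slide's point about weighing and balancing different types of evidence. A real system would use many independent scorers rather than a single word-overlap measure.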