Cognitive computing big_data_statistical_analytics


Some Smarter Analytics Innovation Trends - Cognitive Computing, Big Data and
Statistical Analytics

Published in: Technology


  1. December 2013. Some Smarter Analytics Innovation Trends: Cognitive Computing, Big Data and Statistical Analytics. A talk with students of the University of Bari (Italy), Computing Science Department, Knowledge Bases and Data Mining (Basi di Conoscenza e Data Mining) course. Pietro Leo, IBM GBS Executive Architect, Member of IBM Academy of Technology Leadership Team. @pieroleo www.linkedin.com/in/pieroleo © 2012 IBM Corporation
  2. December 2012. My Personal IT Mind-Map. Nodes: Data Models; World Instrumentation; eBusiness; Services/Legacy Applications; Enterprise Data Storage (IMS, DBMS, etc.); Pervasive Computing; Portals, Webization; Internet of Things; Big Data (structured & unstructured); Social-ization; App-ization; Virtualization; Web-App-ization; Cloud Computing; Cloud Services; IT Consumerization/BYOD; Cognitive Computing; Workload-Optimized Computing; Business Analytics Optimization; Mobile Computing; Parallel Computing; Data Warehousing/Business Intelligence; Computing Models, Architectures & Styles; Social Business & Analytics; Information-based Intelligence; Mobility. Legend: links between nodes denote a conceptual connection, evolution path, cause-effect relation, etc.
  3. Agenda: Research Overview and Grand Challenges. 1 Cognitive Systems Era (Data Centric ← Beyond Big Data; Statistical Analytics ← Beyond Machine Learning). 2 Cognitive Systems Strategic challenges for Our Organizations. 3 Statistical Analytics Strategy. 4 Examples of Statistical Analytics Problems & Benefits.
  4. IBM: Continually Looking Forward. C-suite Studies; Executive Exchange: http://www-935.ibm.com/services/c-suite/insights/index.html; IBM Institute for Business Value; IBM Global Technology Outlook; Smarter Planet.
  5. Nothing Is Changing More than IT … The way it’s accessed: ubiquitously. The way it’s applied: for insight. The way it’s architected: integrated and flexible.
  6. Grand Challenges are the trigger of new changes… Timeline: IBM Is Founded (1911); The IBM Punched Card (1920); RAMAC (1954); FORTRAN (1957); IBM 1401: The Mainframe (1959); Magnetic Stripe Technology (1969); Universal Product Code (UPC) barcode (1973); The PC (1981); Scanning Tunneling Microscope (1986); Optimizing the Food Chain (1988); e-business (1990s); Linux (2000); The Globally Integrated Enterprise (2006); Breaking the Petaflop Barrier (2008); The DNA Transistor (2009); Smarter Planet (2008); A Computer Called Watson (2011).
  7. Ultimately Leading to Tremendous New Value: Provide New Types of Insights.
  8. Agenda: 1 Cognitive Systems Era (Data Centric ← Beyond Big Data; Statistical Analytics ← Beyond Machine Learning). 2 Cognitive Systems Strategic challenges for Our Organizations. 3 Statistical Analytics Strategy. 4 Examples of Statistical Analytics Problems & Benefits.
  9. Eras of computing. [Chart: Computer Intelligence versus Time, showing the Tabulating Systems Era, the Programmable Systems Era and the Cognitive Systems Era.]
  10. Cognitive Systems. Programmable Systems Era: 1. Processor-centric; 2. Fixed calculation; 3. Scale up/out; 4. Manual systems management. Cognitive Systems Era: 1. Data-centric; 2. Statistical analytics; 3. Scale in; 4. Automated systems/workload management.
  12. Data-Centric: Big Data, this is just the beginning. [Chart: Computer Intelligence and Percentage of uncertain data versus Time, across the Tabulating Systems Era, the Programmable Systems Era and the Cognitive Systems Era.]
  13. Data-centric models are driving us to a new era of computing. Volume: terabytes to exabytes of existing data to process. Variety: structured, semi-structured, unstructured, text & multimedia. Velocity: streaming data, milliseconds to seconds to respond. Veracity: uncertainty from inconsistency, ambiguities, etc. Data sources shown: Traditional Enterprise Data; Content Data; Social Data from and about People; Physical Sensors & Streams (chart labels: <20%, >80%).
  14. Big data is a business priority, inspiring new models and processes for organizations, and even entire industries.
  15. Statistical analytics: Develop tools that augment human intelligence and productivity. [Chart: Computer Intelligence versus Time across the Tabulating, Programmable and Cognitive Systems Eras, contrasting Information-based Intelligence with the Artificial Intelligence Strong Approach (Surpass Humans in Intelligence). Callout: The Singularity! Kurzweil > 2045: The Year Man Becomes Immortal.]
  16. Strong Approach: Early efforts approached AI based on programming logic, reasoning, planning, learning. A number of government-supported academic efforts in the 1960s and 1970s, primarily in the US (MIT, Stanford, etc.) and UK. Many felt that the problem was the speed of machines, and therefore that machines would catch up with human intelligence within a generation based on advances in technology. Fifth Generation Project: major Japanese effort in the 1980s to leap ahead of the US in computer development by creating a new generation of intelligent, reasoning machines. All these efforts failed: they grossly underestimated the difficulty of developing machines exhibiting human intelligence. Information-based Intelligence Approach: Statistical, brute force approach based on analyzing vast amounts of information using powerful computers and sophisticated algorithms. Scales very nicely: the more information you have, the more powerful the computer, the more sophisticated the analytical algorithms . . . the better the results. Data & Knowledge Integration: the more insights you have, the more methods and approaches you have, the more longitudinal abilities you have to generate points of view … the more effective the final result will be. Originated in science, especially high energy physics. Statistical Analytics milestones: Data mining (mainly from the 1990s); Deep Blue (1997); Watson (2011).
  17. Agenda: 1 Cognitive Systems Era (Data Centric ← Beyond Big Data; Statistical Analytics ← Beyond Machine Learning). 2 Cognitive Systems Strategic challenges for Our Organizations. 3 Statistical Analytics Strategy. 4 Examples of Statistical Analytics Problems & Benefits.
  18. Statistical Analytics challenges for Our Organizations: From Data to Insight to Context. Not about bigger or faster data from any one source… It’s about fusing data and analytics from 100s-1000s of sources. Analyze Structured and Unstructured Data and Integrate Insights. Channels: Analyst; Social; Web/digital; From the Field; Contact Center Interactions. These capabilities exist today: High Value Context Requires a Wide Variety of High-V Data Sources.
  19. Cognitive Systems Strategic challenges for Our Organizations: Create an integrated view from Data & Content coming from ALL data channels, including social business. Data channels: Analysts/Cases; From the Field; Interactions; Web/digital; Social. Data & Content: structured, semi-structured and unstructured (agent/case data, observation data, call logs, transcripts, emails…, web logs). Layers: Big Data & Business Analytics (Integrate and Analyze Structured and Unstructured Data); Organization / Enterprise Insights; Distribution & Utilization. Capabilities listed: Crime Intelligence; Statistical Reports; Predictive Models; Alerts & warning generation; Analytics Reports; Geo-spatial Display; Relation Resolution; Deep Text analytics; Identity Resolution.
  20. Analytics challenge: Fusion reduces uncertainty by constructing context. Required: tight integration to maximize context discovery. Required: common practices followed by multiple standards for representing uncertain data and uncertainty of all types, provenance, and lineage and other metadata. Required: common APIs to enable sharing across the uncertainty management pipeline. No such common practices, standards or APIs exist today. [Diagram: Credit and Loyalty Data feed FUSION, which finds Michael (San Jose, CA). Labels: Mother; Son; Birthday; Buying a DSLR today; $560 or $999; NY; Customer at Mall; Customer in Store #42; In-Store Pricing and Discounts. Stages: Data; Fact Discovery; Influencers & Intent; Sense Making; Spatial Reasoning; Temporal Reasoning; Correlation; Corroboration (Evidence Combination); etc. Goal: Maximum Context, Minimum Uncertainty.]
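The slide names "Corroboration (Evidence Combination)" but does not say how evidence is combined. As one possible illustration, the sketch below uses the odds form of Bayes' rule under a naive independence assumption. This technique is swapped in for illustration and is not taken from the deck; the function name `fuse`, the prior and the likelihood ratios are all hypothetical.

```python
from typing import Iterable


def fuse(prior: float, likelihood_ratios: Iterable[float]) -> float:
    """Combine independent evidence sources in odds form (naive Bayes assumption).

    prior: prior probability of the hypothesis, strictly between 0 and 1.
    likelihood_ratios: one ratio P(evidence | H) / P(evidence | not H) per source.
    Returns the posterior probability of the hypothesis.
    """
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must be strictly between 0 and 1")
    odds = prior / (1.0 - prior)
    for ratio in likelihood_ratios:
        odds *= ratio  # each corroborating source multiplies the odds
    return odds / (1.0 + odds)


# Hypothetical numbers: a 10% prior that the customer intends to buy today,
# then one source alone versus two corroborating sources.
one_source = fuse(0.10, [4.0])
two_sources = fuse(0.10, [4.0, 6.0])
print(f"one source: {one_source:.3f}, two sources: {two_sources:.3f}")
```

With these made-up inputs the posterior rises as a second source corroborates the first, which is the sense in which fusion reduces uncertainty. Real sources are rarely independent, so the result should be read as an optimistic bound.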
  21. The value of analytics grows by incorporating new sources of data, composing a variety of analytic techniques, spanning organizational silos, and enabling iterative, user-driven interaction. [Chart: sources and types of data (from structured or standardized to new format or usage of data) versus scope of decision (low to high). Points plotted: Sales-based demand forecasting; Price-based (own & competitors) demand forecasting; Segmentation-based market impact estimates; Intent-to-buy trends; Multi-modal demand forecasting.]
  22. Analytics toolkits will be expanded to support ingestion and interpretation of unstructured data, and enable adaptation and learning. Learn: Adaptive Analysis (responding to context); Continual Analysis (responding to local change/feedback). Decide and Act: Optimization under Uncertainty (quantifying or mitigating risk); Optimization (decision complexity, solution speed). Understand and Predict: Predictive Modeling (causality, probabilistic, confidence levels); Simulation (high fidelity, games, data farming); Forecasting (larger data sets, nonlinear regression); Alerts (rules/triggers, context sensitive, complex events). Report: Query/Drill Down (in-memory data, fuzzy search, geo spatial); Ad hoc Reporting (query by example, user defined reports); Standard Reporting (real time, visualizations, user interaction). Collect and Ingest/Interpret: Entity Resolution (people, roles, locations, things); Relationship, Feature Extraction (rules, semantic inferencing, matching); Annotation and Tokenization (automated, crowd sourced). Decide what to count; enable accurate counting. Side labels: New Methods; Traditional; New Data. In the context of the decision process. Extended from: Competing on Analytics, Davenport and Harris, 2007.
  23. Analytics solution development requires several interacting design steps. Design steps shown: Data Acquisition; Filtering and Extraction; Data Evaluation and Fusion; Algorithm Composition and Invention; Core Analytics; Business Rules Engine; Validation; Testing and Execution Optimization; Composition and Packaging; Deployment. Data types: streaming data; text data; multi-dimensional; time series; geo spatial; video & image; relational; social network. Analytics: data mining & statistics; optimization & simulation; semantic analysis; fuzzy matching; network algorithms; new algorithms.
  24. Agenda: 1 Cognitive Systems Era (Data Centric ← Beyond Big Data; Statistical Analytics ← Beyond Machine Learning). 2 Cognitive Systems Strategic challenges for Our Organizations. 3 Statistical Analytics Strategy. 4 Examples of Statistical Analytics Problems & Benefits.
  25. Statistical Analytics Strategy. Stages: Content Access & Integration; Content & Data Organization; Insight Analytics; Distribution & Utilization. Content states: Disorganized and/or Siloed Content; Organized Content; Investigation (added value from Content and Data); Knowledge Accumulation & Distribution. From the chaos to the order; new visibility and knowledge; insight generation and investigation support.
  26. A full set of functional capabilities is needed to support a Statistical Analytics Strategy. Stages: Content Access & Integration; Content & Data Organization; Insight Analytics; Distribution & Utilization. Capabilities listed: Natural LP; Information Extraction; Social Media Analytics; Content Management; Broadcast News Monitoring; Image Analytics; Advanced Analytics; Advanced User Profiles; Process Management; Enterprise Search; Deep Question & Answer; Content Federation & Mining; Content Analytics; Predictive Analytics; Reporting & Dashboards; Entity Resolution & Relation Discovery; Master Data Management; Business Rules; Network Visualization; Advanced Case Management; Content Classification; Standard data warehouse models; Advanced Big Data models (streams and restful data). Content states: Disorganized and/or Siloed Content; Organized Content; Investigation (added value from Content and Data); Knowledge Accumulation & Distribution.
  27. Agenda: 1 Cognitive Systems Era (Data Centric ← Beyond Big Data; Statistical Analytics ← Beyond Machine Learning). 2 Cognitive Systems Strategic challenges for Our Organizations. 3 Statistical Analytics Strategy. 4 Examples of Statistical Analytics Problems & Benefits.
  28. Examples of Statistical Analytics Benefits. Retail Banking Customer Care. Analyzing: call logs, internal and external media, claims. For: buyer behavior. Benefits: improve customer satisfaction, marketing campaigns, find new revenue opportunities. Retail Customer Care. Analyzing: call logs, online media. For: brand reputation management. Benefits: improve customer satisfaction, marketing campaigns. Healthcare Analytics. Analyzing: care records. For: clinical analysis; treatment protocol optimization. Benefits: better management of chronic diseases; optimized drug formularies; improved patient outcomes. Crime Analytics. Analyzing: police records, emergency calls… For: rapid crime solving & crime trend analysis. Benefits: safer communities & optimized force deployment. Insurance Fraud. Analyzing: insurance claims. For: detecting fraudulent activity & patterns. Benefits: reduced losses, faster detection, more efficient claims processes. Automotive Quality Insight. Analyzing: tech notes, call logs, online media. For: brand reputation management. Benefits: reduce warranty costs, improve customer satisfaction, marketing campaigns.
  29. Agenda: Ongoing Research Project with University of Bari: Recognise a “Complex Event” from Social Media Data. Students: Francesco Tangari, Rocco Caruso.
  30. Dipartimento di Informatica, Università degli Studi di Bari. Research challenge and its business value (1/2). A complex event has a defined spatio-temporal connotation: it involves one or more individuals (Who) that organize and/or participate and/or are followers to set up a defined action (What) in a defined location, real or virtual (Where), in a given moment (When). A “flash mob” is an example of a complex event; other examples are strikes, sport events, protests, etc. Complex Event profile and its dimensions. Who (People attributes): who is planning? Who is going to participate/attend? Who is interested and follows? Which is the network created around the event? Where (Spatial): where is it located? It can be a square, a station, a virtual place, etc. where everyone can see the event. What (Argument): what was planned for the event? Which is the topic and the motivation? E.g. people will dance, will freeze, etc. When (Temporal): the date and the time at which the event will take place, and the date and the time at which the event preparation will take place. Source: A Statistical Approach to Predicting Flash Mobs from Social Networks (Un approccio Statistico per la Predizione di Flashmob da Reti Sociali).
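The Who / What / Where / When dimensions described on this slide map naturally onto a small record type. The sketch below is only an illustration: the class name `ComplexEventProfile`, its field names and the sample values are hypothetical and do not come from the deck.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional


@dataclass
class ComplexEventProfile:
    """Hypothetical container for the four dimensions of a complex event."""

    who: List[str] = field(default_factory=list)  # organizers, participants, followers
    what: Optional[str] = None                    # planned action, topic or motivation
    where: Optional[str] = None                   # real or virtual location
    when: Optional[datetime] = None               # date and time of the event

    def missing_dimensions(self) -> List[str]:
        """Return the names of the dimensions that are still unknown."""
        missing = []
        if not self.who:
            missing.append("who")
        for name in ("what", "where", "when"):
            if getattr(self, name) is None:
                missing.append(name)
        return missing

    def is_complete(self) -> bool:
        """A profile is complete once all four dimensions are filled in."""
        return not self.missing_dimensions()


# A partially reconstructed profile: people and action known, place and time not yet.
profile = ComplexEventProfile(who=["@organizer"], what="people will dance")
print(profile.missing_dimensions())
```

Tracking which dimensions are still missing makes it easy to decide when enough of the profile is known to raise an alert.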
  31. Research challenge and its business value (2/2), in the case of predicting a Flash Mob. Leverage social media data and generate insights about complex, business-relevant phenomena by connecting the dots. Ex. 1: Knowing that a flash mob will be used for the promotion of a new product, a firm competing in the same market can organize a counter-action. Ex. 2: A law enforcement organization, knowing that a flash mob will be organized for a political purpose or for a demonstration, can effectively relocate its forces.
  32. Information about the event is spread over a number of social media channels: an example of Flash Mob organization dynamics.
  33. General System Context: First Prototype & Experimentation based on the Twitter channel. Components: Twitter Channel (*); Data Access & Basic Feature Extraction (tokenization, hashtags, URLs, geotags, social network metadata, etc.); Information Extraction (POS, named entities: person, organization, locations, data, etc.; high-level concepts, wikification, etc.); Social Network Analysis (clique, relevant nodes, PageRank nodes, etc.); Event Prediction & Alerting (clustering, incremental clustering, burst recognition…); Flash Mob Profile Reconstruction & Alerts (Who, What, Where, When for each flash mob); Analytics Consumers (What's up App for smartphone, social analytics client, etc.). Legend: implemented path; planned integration. Acquiring tweets including the #flashmob hashtag and/or the keyword “flashmob” and/or “flash mob”. (*) In our vision a number of “channels” should provide data to the system, such as Facebook, YouTube, etc., as well as other social analytics applications such as IBM COBRA or CCO, etc.
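The acquisition filter and the basic feature extraction step named on this slide (tokenization, hashtags, URLs) can be sketched with the standard library alone. This is a minimal illustration and not the prototype's code: the regular expressions, the function names `is_candidate` and `extract_features`, and the sample tweet are assumptions.

```python
import re
from typing import Dict, List

# Hypothetical patterns; real tweet parsing has more edge cases than these cover.
URL_RE = re.compile(r"https?://\S+")
HASHTAG_RE = re.compile(r"#(\w+)")
MENTION_RE = re.compile(r"@(\w+)")
KEYWORD_RE = re.compile(r"flash\s?mob", re.IGNORECASE)  # "flashmob" or "flash mob"


def is_candidate(text: str) -> bool:
    """Keep a tweet if it contains #flashmob, "flashmob" or "flash mob"."""
    return KEYWORD_RE.search(text) is not None


def extract_features(text: str) -> Dict[str, List[str]]:
    """Extract URLs, hashtags, mentions and lowercase word tokens from a tweet."""
    urls = URL_RE.findall(text)
    stripped = URL_RE.sub(" ", text)  # remove URLs before tokenizing
    return {
        "urls": urls,
        "hashtags": [h.lower() for h in HASHTAG_RE.findall(stripped)],
        "mentions": [m.lower() for m in MENTION_RE.findall(stripped)],
        "tokens": re.findall(r"\w+", stripped.lower()),
    }


tweet = "Join the #FlashMob at Central Station 6pm! http://example.com/x @anna"
if is_candidate(tweet):
    features = extract_features(tweet)
    print(features["hashtags"], features["urls"], features["tokens"])
```

Geotags and social network metadata, which the slide also lists, would come from the tweet's structured fields rather than from its text, so they are left out here.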
  34. Working on real data and applying the prediction model. Period: 1 Jan to 29 Feb. Alerts/Clusters = 59. Analyzed 5148 English-language tweets that included the word or the hashtag “flashmob”. Generated in total 59 Flash Mob alerts (clusters) involving 1267 tweets. 20 alerts correctly aggregated data about 20 flash mobs, with an accuracy of about 100%.
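A few ratios follow directly from the counts reported on this slide. The short calculation below uses only those four numbers. The slide does not define how its accuracy figure was computed, so these ratios are derived descriptive figures and not a reconstruction of that metric; the variable names are illustrative.

```python
# Counts as reported on the slide.
total_tweets = 5148       # analyzed English-language tweets
alerts = 59               # generated alerts (clusters)
clustered_tweets = 1267   # tweets involved in the alerts
confirmed_alerts = 20     # alerts that aggregated data about real flash mobs

share_clustered = clustered_tweets / total_tweets   # fraction of tweets in any alert
tweets_per_alert = clustered_tweets / alerts        # average cluster size
share_confirmed = confirmed_alerts / alerts         # fraction of alerts matched to a flash mob

print(f"tweets in alerts: {share_clustered:.1%}")
print(f"average tweets per alert: {tweets_per_alert:.1f}")
print(f"alerts matched to a flash mob: {share_confirmed:.1%}")
```

About a quarter of the analyzed tweets end up in some alert, with roughly 21 tweets per alert on average, and about a third of the alerts correspond to the 20 flash mobs mentioned.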
  35. In a new research phase we are now extending the prediction model to reconstruct the main Complex Event attributes. Architecture components: Streaming JSON; HDFS; Hadoop; Hive DW; NLP annotation tools {1..N}; Extractor; DBSCAN clustering; Complex Event Profiler; Complex Event Attributes.
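The architecture names DBSCAN as the clustering step but gives no parameters or feature space. The sketch below is a minimal, single-machine DBSCAN over 2-D points, meant only to show how density-based clustering separates dense groups from noise. The Euclidean distance, the `eps` and `min_pts` values and the sample points are assumptions, and the Hadoop-based setup from the slide is not reproduced.

```python
from math import hypot
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, float]
NOISE = -1  # label for points that belong to no cluster


def region_query(points: Sequence[Point], i: int, eps: float) -> List[int]:
    """Indices of all points within distance eps of point i (including i itself)."""
    px, py = points[i]
    return [j for j, (qx, qy) in enumerate(points) if hypot(px - qx, py - qy) <= eps]


def dbscan(points: Sequence[Point], eps: float, min_pts: int) -> List[int]:
    """Return one cluster label per point; NOISE marks outliers."""
    labels: List[Optional[int]] = [None] * len(points)  # None means unvisited
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        cluster += 1  # point i is a core point: start a new cluster
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins but does not expand
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is also a core point: keep expanding
    return [label if label is not None else NOISE for label in labels]


# Hypothetical 2-D features: two dense groups and one isolated point.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
labels = dbscan(points, eps=1.5, min_pts=3)
print(labels)
```

DBSCAN does not need the number of clusters in advance and labels sparse points as noise, which suits a setting where the number of events is unknown and many tweets belong to no event. The quadratic neighbor search used here is fine for a sketch but would need an index or a distributed implementation at scale.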
  36. Agenda: Wrap-up!
  37. My Personal IT Mind-Map (repeat of slide 2).
  38. Thank you!
