Fast Data Mining: Real Time Knowledge Discovery for Predictive Decision Making

Fast Data is a different approach to Big Data: it manages large quantities of “in-flight” data to help organizations get a jump on business-critical decisions. The difference between Big Data and Fast Data is comparable to the difference between waiting for a movie to download from an online store and playing a DVD instantly.

Data Mining is the process of extracting information from a data set and transforming it into an understandable structure, in order to deliver predictive, advanced analytics to enterprises and operational environments.
The combination of Fast Data and Data Mining is changing the “rules”.

Usage Rights

© All Rights Reserved

Presentation Transcript

  • Fast Data Mining: Real Time Knowledge Discovery for Predictive Decision Making
    Nino Guarnacci, nino.guarnacci@oracle.com
    Copyright © 2013, Oracle and/or its affiliates. All rights reserved.
  • Data Explosion
    Web & social networks experienced it first… (Infographic by Go-gulf.com)
  • … but enterprises are now facing it too
    ▪ Services and web transaction data (to refine recommendations, detect trends, etc.)
    ▪ “Sensor” data: GPS in mobile phones, RFIDs, NFC, smart meters, etc.
    ▪ Log file monitoring and analysis
    ▪ Security monitoring
    Utilities deploying smart meters? 200x information flowing to the data center!
  • Figures on the slide: 93%, 89%, 67%
    ▪ Executives who would grade themselves C or lower in preparedness
    ▪ Executives who say drawing intelligence from data is top priority
    ▪ Executives who believe their organization is losing revenue as a result of not being able to fully leverage information
    Source: Oracle Research Study - From Overload to Impact: An Industry Scorecard on Big Data Business Challenges, July 2012
  • Obstacles to Managing Data Faster: the Latency Gap
    While ensuring accuracy, efficiency, and scale
    ▪ Fragmented event entities
    ▪ The gap
    Chart: Business Value against Action Time, with the milestones Business event, Data captured, Analysis completed, Action taken
    Source: Richard Hackathorn’s Components of Action Time
  • What is Fast Data? Turning High Velocity Data into Value
    ▪ It’s about getting more from in-flight data
    ▪ It’s about faster action, faster insights
    ▪ It’s about running your business in real time
  • Oracle Fast Data Approach: Filter, Move, Transform, Analyze, and Act at High Velocity
    Stages: FILTER & CORRELATE; MOVE & TRANSFORM; ANALYZE; ACT
  • Oracle Fast Data Approach: Filter, Move, Transform, Analyze, and Act at High Velocity
    FILTER & CORRELATE: Real Time Streams Information (Network Status, In-Memory Data Grid)
    ▪ Parallel multiple streams: JMS, files, Coherence, DB, ...
    ▪ Different object types: text, Java objects, ...
    ▪ High throughput for data aggregation and event querying
    Coherence Data Grid holds the data and computes in parallel
  • Oracle Fast Data Approach: Filter, Move, Transform, Analyze, and Act at High Velocity
    Event streams, one per event type
    EPN (Event Processing Network) elements: Adapter, Channel, Processor, Cache, POJO, JSON, HTTP Pub/Sub
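The way adapters, channels and processors chain together in an event processing network can be pictured with plain Python generators. This is only a rough sketch of the idea and not Oracle Event Processing code: the function names, the event fields and the sample records are all hypothetical, and this toy channel drains its input streams one after another rather than interleaving them.

```python
from typing import Callable, Dict, Iterable, Iterator, List

Event = Dict[str, object]


def adapter(raw_records: Iterable[Dict[str, object]], event_type: str) -> Iterator[Event]:
    """Adapter (hypothetical): wraps raw input records as typed events."""
    for record in raw_records:
        yield {"type": event_type, **record}


def channel(*streams: Iterable[Event]) -> Iterator[Event]:
    """Channel (hypothetical): carries events downstream, one input stream after another."""
    for stream in streams:
        yield from stream


def processor(events: Iterable[Event], predicate: Callable[[Event], bool]) -> Iterator[Event]:
    """Processor (hypothetical): keeps only the events that satisfy the rule."""
    return (event for event in events if predicate(event))


def run_network() -> List[Event]:
    """Wire two adapters into one channel and filter on a made-up office code."""
    traces = adapter([{"id": "A1", "office": "MZ"}, {"id": "A2", "office": "RM"}], "trace")
    alerts = adapter([{"id": "A3", "office": "MZ"}], "alert")
    merged = channel(traces, alerts)
    return list(processor(merged, lambda event: event["office"] == "MZ"))
```

Because every stage is lazy, an event flows through the whole chain before the next one is read, which is the property that makes this shape suitable for in-flight data.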
  • Oracle Event Processing: SLA Detection, Pattern Matching
    Inputs: STREAMS, DATABASE, SPATIAL, TIME WINDOW
    Trace event arriving on the stream (the slide shows several copies of it):
        <TRACE>
          <ID_TRACED_ENTITY>HH310665064IT</ID_TRACED_ENTITY>
          <TRACED_ENTITY>PACCO</TRACED_ENTITY>
          <WHAT_HAPPENED>ESI_SDA</WHAT_HAPPENED>
          <WHEN_HAPPENED>2013-09-12</WHEN_HAPPENED>
          <WHERE_HAPPENED_DETAIL>
            <OFFICE>
              <WHERE_DESCRIPTION>MONZA</WHERE_DESCRIPTION>
              <WHERE_ID>MZ</WHERE_ID>
            </OFFICE>
          </WHERE_HAPPENED_DETAIL>
        </TRACE>
    Match pattern (pseudo-query):
        SELECT M.SLA_VIOLATED
        FROM TRACE IN CHANNEL, ENTITIES, SPATIAL CONTEXT
        MATCH_RECOGNIZE (
          MEASURES SLA_VIOLATED
          PATTERN (A B)
          DEFINE
            A (DELIVERY TIME - NOW) < 2 DAYS
            B DISTANCE BETWEEN (LOCATION, DESTINATION) > 600 KM
        ) as M
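The A-then-B pattern in the pseudo-query above can be approximated in ordinary Python. This is a minimal sketch under stated assumptions, not the CQL that the slide abbreviates: the `Trace` fields, the per-parcel partitioning, and the use of great-circle (haversine) distance are all assumptions introduced here for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from math import asin, cos, radians, sin, sqrt
from typing import Dict, Iterable, List, Tuple

Point = Tuple[float, float]  # (latitude, longitude) in degrees


@dataclass
class Trace:
    """Hypothetical flattened form of one trace event."""
    parcel_id: str
    seen_at: datetime
    location: Point
    destination: Point
    delivery_due: datetime


def haversine_km(a: Point, b: Point) -> float:
    """Great-circle distance in km on a sphere of radius 6371 km."""
    lat1, lon1, lat2, lon2 = map(radians, (a[0], a[1], b[0], b[1]))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(h))


def is_a(trace: Trace) -> bool:
    """Condition A: fewer than 2 days left before delivery is due."""
    return trace.delivery_due - trace.seen_at < timedelta(days=2)


def is_b(trace: Trace) -> bool:
    """Condition B: still more than 600 km from the destination."""
    return haversine_km(trace.location, trace.destination) > 600.0


def sla_violations(traces: Iterable[Trace]) -> List[str]:
    """Report a parcel each time an A event is immediately followed by a B event."""
    violated: List[str] = []
    previous: Dict[str, Trace] = {}  # last event seen per parcel (assumed partitioning)
    for trace in traces:
        earlier = previous.get(trace.parcel_id)
        if earlier is not None and is_a(earlier) and is_b(trace):
            violated.append(trace.parcel_id)
        previous[trace.parcel_id] = trace
    return violated
```

Keeping only the previous event per parcel is enough for a two-step pattern; longer patterns would need a small state machine per partition.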
  • Oracle Event Processing: SLA Detection, Filtering & Correlation
    ▪ Aggregate and correlate the received filter-events
    ▪ Partition probable SLA violations by trip path
    Pseudo-query over the SLA_VIOLATED events produced by the pattern-matching query:
        ISTREAM(
          SELECT COUNT(*), START_OFFICE, WHERE_HAPPEND, LATITUDE, LONGITUDE
          FROM SPATIAL_CONTEXT, SLA_VIOLATED_OUT_CHANNEL
          PARTITION BY START_OFFICE, WHERE_HAPPENED
          WITHIN 1 HOUR
          GROUP BY START_OFFICE
          HAVING COUNT(*) > 5
        )
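The correlation step on this slide (count probable violations per start office over one hour and react when the count exceeds 5) can be sketched as a sliding-window counter. The class and method names below are hypothetical, and the sketch simplifies the streaming semantics: it assumes timestamps arrive in order and only expires old events when a new event for the same office arrives.

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta
from typing import DefaultDict, Deque, Optional


class ViolationCounter:
    """Hypothetical sliding-window count of SLA violations per start office."""

    def __init__(self, window: timedelta = timedelta(hours=1), threshold: int = 5) -> None:
        self.window = window
        self.threshold = threshold
        self._seen: DefaultDict[str, Deque[datetime]] = defaultdict(deque)

    def add(self, start_office: str, at: datetime) -> Optional[int]:
        """Record one violation; return the window count if it exceeds the threshold."""
        timestamps = self._seen[start_office]
        timestamps.append(at)
        # Drop events that are strictly older than the window.
        while at - timestamps[0] > self.window:
            timestamps.popleft()
        count = len(timestamps)
        return count if count > self.threshold else None
```

A deque per office keeps both the append and the expiry cheap, so memory is bounded by the number of violations that fall inside one window.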
  • Oracle Fast Data Approach: Filter, Move, Transform, Analyze, and Act at High Velocity
    Real-time stream analysis: correlate events from different sources, then manage and use them as relational data over windows and slides.
  • What is Oracle Data Mining?
    Automatically sifting through large amounts of data to find previously hidden patterns, discover valuable new insights and make predictions
    ▪ Identify the most important factors (Attribute Importance)
    ▪ Predict customer behavior (Classification)
    ▪ Predict or estimate a value (Regression)
    ▪ Find profiles of targeted people or items (Decision Trees)
    ▪ Segment a population (Clustering)
    ▪ Find fraudulent or “rare events” (Anomaly Detection)
    ▪ Determine co-occurring items in a “basket” (Associations)
  • Data Mining Provides Better Information, Valuable Insights and Predictions
    Chart: Cell Phone Churners vs. Loyal Customers (axes: Income, Customer Months)
    Insight & Prediction:
    ▪ Segment #1: IF CUST_MO > 14 AND INCOME < $90K, THEN Prediction = Cell Phone Churner, Confidence = 100%, Support = 8/39
    ▪ Segment #3: IF CUST_MO > 7 AND INCOME < $175K, THEN Prediction = Cell Phone Churner, Confidence = 83%, Support = 6/39
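The slide does not define Confidence and Support. One reading that fits its numbers is that support is the share of all customers who satisfy the IF part, and confidence is the share of those customers who actually churn (83% is roughly 5 out of 6). The sketch below computes both under that assumption; the function name and the record layout are hypothetical.

```python
from fractions import Fraction
from typing import Callable, Dict, Sequence, Tuple

Customer = Dict[str, float]


def rule_quality(
    customers: Sequence[Customer],
    condition: Callable[[Customer], bool],
    is_churner: Callable[[Customer], bool],
) -> Tuple[Fraction, Fraction]:
    """Return (confidence, support) for the rule 'IF condition THEN churner'.

    Assumed definitions: support = matching records / all records,
    confidence = matching churners / matching records.
    """
    matched = [c for c in customers if condition(c)]
    if not matched:
        raise ValueError("rule matches no records, confidence is undefined")
    hits = sum(1 for c in matched if is_churner(c))
    return Fraction(hits, len(matched)), Fraction(len(matched), len(customers))
```

Exact fractions are used instead of floats so that a ratio such as 6/39 can be compared without rounding concerns.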
  • A Real Fraud Example
    My credit card statement: can you see the fraud?
        Date      Time       Category  Merchant      Amount
        May 22    1:14 PM    FOOD      Monaco Café   $127.38
        May 22    7:32 PM    WINE      Wine Bistro   $28.00
        …
        June 14   2:05 PM    MISC      Mobil Mart    $75.00
        June 14   2:06 PM    MISC      Mobil Mart    $75.00
        June 15   11:48 AM   MISC      Mobil Mart    $75.00
        June 15   11:49 AM   MISC      Mobil Mart    $75.00
        May 28    6:31 PM    WINE      Acton Shop    $31.00
        May 29    8:39 PM    FOOD      Crossroads    $128.14
        June 16   11:48 AM   MISC      Mobil Mart    $75.00
        June 16   11:49 AM   MISC      Mobil Mart    $75.00
    Callouts on the slide: Total purchases exceed the time-period average; Gas station?; Monaco?; All the same $75 amount?; Pairs of $75?
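The “Pairs of $75?” callout suggests a simple hand-written check: flag consecutive charges with the same merchant and amount made within a few minutes of each other. The sketch below applies that check to the statement rows from the slide. It is a plain heuristic, not the anomaly-detection model the talk goes on to describe; the year 2013, the 5-minute gap and all names are assumptions.

```python
from datetime import datetime, timedelta
from typing import List, NamedTuple, Sequence, Tuple


class Charge(NamedTuple):
    when: datetime
    merchant: str
    amount: float


def statement() -> List[Charge]:
    """Rows from the slide; the year is not shown there, 2013 is assumed."""
    rows = [
        (5, 22, 13, 14, "Monaco Café", 127.38),
        (5, 22, 19, 32, "Wine Bistro", 28.00),
        (5, 28, 18, 31, "Acton Shop", 31.00),
        (5, 29, 20, 39, "Crossroads", 128.14),
        (6, 14, 14, 5, "Mobil Mart", 75.00),
        (6, 14, 14, 6, "Mobil Mart", 75.00),
        (6, 15, 11, 48, "Mobil Mart", 75.00),
        (6, 15, 11, 49, "Mobil Mart", 75.00),
        (6, 16, 11, 48, "Mobil Mart", 75.00),
        (6, 16, 11, 49, "Mobil Mart", 75.00),
    ]
    return [Charge(datetime(2013, mo, d, h, mi), shop, amt) for mo, d, h, mi, shop, amt in rows]


def repeated_charge_pairs(
    charges: Sequence[Charge], max_gap: timedelta = timedelta(minutes=5)
) -> List[Tuple[Charge, Charge]]:
    """Pairs of consecutive charges with the same merchant and amount within max_gap."""
    ordered = sorted(charges, key=lambda c: c.when)
    pairs = []
    for first, second in zip(ordered, ordered[1:]):
        same_charge = first.merchant == second.merchant and first.amount == second.amount
        if same_charge and second.when - first.when <= max_gap:
            pairs.append((first, second))
    return pairs
```

A rule like this has to be thought up and maintained by a person, which is the contrast with automated model building that the talk draws.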
  • “Essentially, all models are wrong, … but some are useful.”
    - George Box (one of the most influential statisticians of the 20th century and a pioneer in the areas of quality control, time series analysis, design of experiments and Bayesian inference)
  • You Can Think of It Like This…
    Traditional SQL
    ▪ “Human-driven” queries
    ▪ Domain expertise
    ▪ Any “rules” must be defined and managed
    ▪ SQL queries: SELECT, DISTINCT, AGGREGATE, WHERE, AND OR, GROUP BY, ORDER BY, RANK
    + Oracle Data Mining
    ▪ Automated knowledge discovery, model building and deployment
    ▪ Domain expertise to assemble the “right” data to mine
    ▪ ODM “Verbs”: PREDICT, DETECT, CLUSTER, CLASSIFY, REGRESS, PROFILE, IDENTIFY FACTORS, ASSOCIATE
  • Real-time Prediction for a Customer
    ▪ On-the-fly, single-record apply with new data (e.g. from call center)
        Select prediction_probability(CLAS_DT_5_2, 'Yes'
          USING 7800 as bank_funds, 125 as checking_amount, 20 as credit_balance,
                55 as age, 'Married' as marital_status,
                250 as MONEY_MONTLY_OVERDRAWN, 1 as house_ownership)
        from dual;
    Channel labels on the slide: Social, Call, Branch, ECM, BI, Get, Web, Email, CRM, Mobile
  • Predictive and Recommendation Analytics: Real Time Data Mining Modeling with Streaming Events
    Combine real-time event streaming data technologies with the industry-leading Oracle historical data mining (Oracle Data Mining):
    ▪ Rich set of algorithms for data mining
    ▪ Predict customer behavior
    ▪ Find profiles of targeted people or items, and determine important relationships
    ▪ Immediately predict trends and themes for data in motion
    ▪ Respond to prevent business threats and take advantage of opportunities
  • Acting. Oracle Data Mining: Technology Behind the America’s Cup Win
    ▪ “The USA holds 250 sensors to collect raw data: pressure sensors on the wing; angle sensors on the adjustable trailing edge of the wing sail to monitor the effectiveness of each adjustment, allowing the crew to ascertain the amount of lift it’s generating; and fiber-optic strain sensors on the mast and wing to allow maximum thrust without overbending them.
    ▪ But collecting data was only the beginning. ORACLE Racing also had to manage that data, analyze it, and present useful results…”
    Source: http://www.sail-world.com/USA/Americas-Cup:-Oracle-Data-Mining-supports-crew-and-BMW-ORACLE-Racing/68834
  • Fast Data Mining Demo: Fraud Prediction in action…
    ▪ Extract knowledge starting from a CSV file
    ▪ Execute Anomaly Detection mining on stored data
    ▪ Put in place a real-time event processing flow
    ▪ Consume events from the In-Memory Data Grid
    ▪ Obtain fraud predictions instantly from streaming data
  • Q&A
  • Thanks!
    Fast Data Mining: Real Time Knowledge Discovery for Predictive Decision Making
    Nino Guarnacci, nino.guarnacci@oracle.com