Copyright © 2018, Oracle and/or its affiliates. All rights reserved. |
Autonomous Data Warehouse
Oracle Machine Learning
Oracle Analytics Cloud
A Data Model Approach to Performing Pattern Analysis
Shankar Somayajula
shankar.somayajula@oracle.com
Feb 25th, 2020
Agenda
• Pattern Analysis Data Model (PADM) as an extension to the analytical star schema
• Pattern/MB (Market Basket) Rule definition
• SQL Pattern Matching
• Market Basket BI application/use case
• Demo / Screenshots
• Benefits of Pattern Analysis and other possibilities
• Q&A
Confidential – © 2020 Oracle Internal
Finding Patterns in Data
Typical use cases in today’s world of fast exploration of data
• Financial Services: Money Laundering, Fraud, Tracking Stock Market
• Law & Order: Monitoring Suspicious Activities
• Retail: Returns Fraud, Buying Patterns, Sessionization
• Telcos: Money Laundering, SIM Card Fraud, Call Quality
• Utilities: Network Analysis, Fraud, Unusual Usage
(Figure label: Lots of Data)
Typical Pattern Matching Use Cases
• Sessionization
– Input data: Weblogs
– Pattern: continuous clicks by the same user
– Result: Generate reports on number of distinct sessions, average page views per session, etc.
• Fraud
– Input data: Credit card transactions
– Pattern: two transactions in different locations within a short period of time
– Result: Find cases in which a credit card may have been used fraudulently, since a physical person cannot be in two places at once
• In-game purchases
– Input data: Games logs
– Pattern: events leading up to an in-game purchase
– Result: Detect common sequences of events that result in an in-game purchase
• Fraud (mobiles)
– Input data: CDR logs
– Pattern: SIM card being used in multiple handsets
– Result: Flag individual SIM cards being used by multiple handsets within a specified time period
• Stock market analysis
– Input data: Ticker logs
– Pattern: Track possible fraudulent linked patterns of behavior
– Result: Track known patterns of behavior such as head and shoulders, triangles, channels and wedges
Typical Pattern Matching Use Cases (cont.)
• Auditing/Compliance
– Input data: Application logs
– Pattern: Analyze changes to secure customer data
– Result: Find instances where an operator has made suspect modifications to secure client data
• Money laundering
– Input data: Transaction logs
– Pattern: Search for small transfers within a time window followed by a large transfer within “x” days of the last small transfer
– Result: Detect suspicious money transfer pattern for an account and report account, date of first small transfer, date of last large transfer
• Call service quality
– Input data: CDR logs
– Pattern: Search for dropped/reconnected calls
– Result: Identify how many times calls were restarted in a session, total effective call duration and total interrupted duration
• Login security
– Input data: Application logs
– Pattern: Search for attempted logins
– Result: Identify attempts to gain access to an application/schema that can be linked to hackers or inappropriate access
PADM (Pattern Analysis Data Model) Evolution
• OAC/OBIEE Business Model
• Typical MB (Market Basket) analysis involves extraction of MB Rules/Patterns from Trx (transaction) data.
• MB Rules are qualified with default MB KPIs.
• BI schema for ad hoc reporting/analysis can involve source Trx data analysis as well as pattern/MB Rule analysis (disjoint).
[Diagram entities: Store, Customer, Channel, Promotion, Product, MB Rule Trx, MB Prod, MB OML KPIs, MB Rules, MB Trx]
PADM Evolution (cont.)
• BI schema for ad hoc reporting/analysis can involve source Trx data analysis as well as pattern/MB Rule analysis.
• Add a Model dimension for analysis context.
[Diagram entities: MB Rule Trx, MB Prod, MB OML KPIs, MB Rules, MB Trx KPIs, MB Model]
PADM Evolution (cont.)
• Advanced BI schema to support ad hoc reporting/analysis of MB Rules/Patterns across the whole dataset or split by attribute fields, as well as against a source Trx subset of interest.
• Model for analysis context.
[Diagram entities: MB Rule Trx, MB Prod, MB OML KPIs, MB Rules, MB Trx KPIs, MB Model, MB Rule KPIs]
PADM Evolution (cont.)
• OAC/OBIEE Business Model
• MB KPIs (Model - Rule): Dataset, all Trx
[Diagram entities: MB Rules, MB Model, MB Rule KPIs]
PADM Evolution (cont.)
• OAC/OBIEE Business Model
• MB KPIs (Model - Rule - Trx): Data subset, Partition, Deep dives
[Diagram entities: MB Rules, MB Model, MB Rule KPIs, MB Rule Trx]
Patterns – Some examples
• Complete Dataset (DS)
• Find big dark red panels (here, brown = red)
Credits: 1. Photo by Markus Spiske on Unsplash
Patterns – Some examples (cont.)
• Complete Dataset (DS)
– Assume each horizontal row is a set/transaction of ordered events (natural order of events)
• Find a combination of a large blue and a large red panel (here, brown = red)
• Find combination: large dark blue and large pink
• Find combination: large dark blue followed by large pink
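The difference between the last two searches (a combination in any order versus one panel followed by another) can be sketched in a few lines of Python. This is only an illustration under assumed names: the panel labels and the helpers `contains_combination` and `contains_followed_by` are hypothetical and are not part of the deck or of Oracle SQL Pattern Matching.

```python
from typing import List, Sequence


def contains_combination(row: Sequence[str], items: Sequence[str]) -> bool:
    """True if every item occurs somewhere in the row, in any order."""
    return set(items).issubset(row)


def contains_followed_by(row: Sequence[str], first: str, second: str) -> bool:
    """True if `first` occurs and `second` occurs at a later position."""
    events = list(row)
    if first not in events:
        return False
    start = events.index(first)
    return second in events[start + 1:]


# Each inner list is one horizontal row of panels, read in its natural order.
rows: List[List[str]] = [
    ["dark_blue", "green", "pink"],
    ["pink", "yellow", "dark_blue"],
    ["green", "yellow", "red"],
]

combo = [contains_combination(r, ["dark_blue", "pink"]) for r in rows]
ordered = [contains_followed_by(r, "dark_blue", "pink") for r in rows]
print(combo, ordered)
```

The second row contains both panels but in the wrong order, so it satisfies the combination search and fails the "followed by" search.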
Global Models
• Complete Dataset (DS)
Pattern Definition
• Complete Dataset (DS)
• Global Pattern: p, q => c … {If (p,q) THEN (c)}
– Global KPIs
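The slide does not list the Global KPIs themselves. A minimal sketch, assuming they include the standard association-rule measures (support, confidence and lift) computed over the complete dataset for a rule p, q => c; the function name `rule_kpis` and the toy transactions are hypothetical.

```python
from typing import Dict, FrozenSet, Iterable, List


def rule_kpis(
    transactions: List[FrozenSet[str]],
    antecedent: Iterable[str],
    consequent: str,
) -> Dict[str, float]:
    """Support, confidence and lift of `antecedent => consequent`."""
    ante = frozenset(antecedent)
    n = len(transactions)
    n_ante = sum(1 for t in transactions if ante <= t)
    n_cons = sum(1 for t in transactions if consequent in t)
    n_both = sum(1 for t in transactions if ante <= t and consequent in t)

    support = n_both / n if n else 0.0
    confidence = n_both / n_ante if n_ante else 0.0
    # Lift compares the rule's confidence with the baseline rate of the consequent.
    lift = confidence / (n_cons / n) if n_cons else 0.0
    return {"support": support, "confidence": confidence, "lift": lift}


# Toy complete dataset (DS): five transactions.
TRANSACTIONS = [
    frozenset({"p", "q", "c"}),
    frozenset({"p", "q", "c"}),
    frozenset({"p", "q"}),
    frozenset({"c"}),
    frozenset({"p"}),
]

kpis = rule_kpis(TRANSACTIONS, {"p", "q"}, "c")
print(kpis)
```

Because the rule is scored against every transaction, these values play the role of the global KPIs; restricting `transactions` to a subset gives the same KPIs for that subset.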
[Figure labels: Model 1, Model 2, Model 3, Model 4]
Partitioned Models
• DB/Star Schema/Analysis Container (Host), MB Model (Context), MB Rules and MB KPIs
– Lab-like environment with multiple models in play
[Figure labels: Model 1, Model 2, Model 3, Model 4; MB Model partitioned by Country (say)]
Pattern Definition
• Complete Dataset (DS) split by Country: {(C1), (C2), (C3)}
• Partitioned Pattern: p, q => c … {If (p,q) THEN (c)}
• Partition - country=C1 … p, q => c
• Partition - country=C2 … NA (Knowledge Discovery), Available (via SQL)
• Partition - country=C3 … p, q => c
[Figure labels: Model 3, Model 4; dataset partitioned along Geography by Country Name(s) C1, C2, C3]
Pattern Definition
• Pattern or MB Rule
– IF antecedents ((optional) set of logical Partitions, set of products/items)
– THEN consequent (single product/item)
• Complete Dataset (DS):
• E.g.: Complete Dataset (DS) split by Country and Year: {(C1, Y1), (C1, Y2), (C2, Y1), (C2, Y2), (C3, Y1), (C3, Y2)}
[Figure: dataset partitioned along Time by Year(s) Y1, Y2 and along Geography by Country Name(s) C1, C2, C3]
• Pattern: country=C2 (LP), year=Y1 (LP), p, q => c
– Logical Partition (Part KPIs): {(C2, Y1)}
– Non-Partition (NP KPIs): {(C1, Y1), (C1, Y2), (C2, Y2), (C3, Y1), (C3, Y2)}
– Global KPIs: {(C1, Y1), (C1, Y2), (C2, Y1), (C2, Y2), (C3, Y1), (C3, Y2)}
– Core pattern: p, q => c
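The three KPI scopes for a pattern with logical partitions can be sketched as three filters over the same transactions. This is an assumption-laden illustration, not the deck's implementation: the row layout `(country, year, items)`, the helper names `score` and `scoped_kpis`, and the choice of support and confidence as the KPIs are all hypothetical.

```python
from typing import Dict, FrozenSet, List, Set, Tuple

# (country, year, items in the transaction)
Row = Tuple[str, str, FrozenSet[str]]


def score(baskets: List[FrozenSet[str]], antecedent: FrozenSet[str], consequent: str) -> Dict[str, float]:
    """Support and confidence of the core pattern over the given baskets."""
    n = len(baskets)
    n_ante = sum(1 for b in baskets if antecedent <= b)
    n_both = sum(1 for b in baskets if antecedent <= b and consequent in b)
    return {
        "support": n_both / n if n else 0.0,
        "confidence": n_both / n_ante if n_ante else 0.0,
    }


def scoped_kpis(rows: List[Row], lp: Set[Tuple[str, str]], antecedent: FrozenSet[str], consequent: str):
    """Score the core pattern inside the logical partition, outside it, and globally."""
    part = [items for country, year, items in rows if (country, year) in lp]
    non_part = [items for country, year, items in rows if (country, year) not in lp]
    everything = [items for _, _, items in rows]
    return {
        "partition": score(part, antecedent, consequent),
        "non_partition": score(non_part, antecedent, consequent),
        "global": score(everything, antecedent, consequent),
    }


ROWS: List[Row] = [
    ("C2", "Y1", frozenset({"p", "q", "c"})),
    ("C2", "Y1", frozenset({"p", "q", "c"})),
    ("C2", "Y1", frozenset({"p", "q"})),
    ("C1", "Y1", frozenset({"p", "q"})),
    ("C1", "Y2", frozenset({"p", "c"})),
    ("C3", "Y2", frozenset({"p", "q", "c"})),
]

# Pattern: country=C2 (LP), year=Y1 (LP), p, q => c
result = scoped_kpis(ROWS, {("C2", "Y1")}, frozenset({"p", "q"}), "c")
print(result)
```

Comparing the partition values with the non-partition and global values shows whether the core pattern behaves differently inside (C2, Y1) than elsewhere.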
• Pattern: country=C2 (LP), year=Y1 (LP), p, q => c
– Logical Partition (Part KPIs): {(C2, Y1)}
• Core pattern: p, q => c
– The pattern's Logical Partitions can act as filters (performant)
• Not concerned with KPIs at Global or NP levels
• Can be highly selective
Pattern Definition
• Complete Dataset (DS)
• Pattern: country=C2, year=Y1, p, q => c
– Logical Partition (Part KPIs) : No LP, hence Full DS
– Non-Partition (NP KPIs): NA
– Global KPIs: Full DS
• Core pattern: C2, Y1, p, q => c
Examples of MB Rules/Insights
• (diapers) => (beer)
• (peanutButter, jelly) => (bread)
• Many ways to improve traditional MB
– Multiple levels of a dimension … SKU to Sub-Category to Category (ideally at the same time)
– Add additional dimensions: Trx/dimensional attributes as tags
• Multidimensional rules with artificial/virtual products give a richer picture …
– (Item=X, isOver18=TRUE, isNewCustomer=TRUE) => (Item=Y)
– (buyerAge >= 63, loyaltyAge >= 2) => (toothBrushBuy >= 2)
– age(X, "20...29"), income(X, "52k...58k") => buys(X, "iPad")
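One way to picture artificial/virtual products is to turn transaction attributes into extra basket items, so that a rule can mix real items and tags. The sketch below assumes a hypothetical transaction layout (a dict with an `items` list plus attribute keys) and hypothetical helpers `to_basket`, `rule_holds` and `age_band`; none of these names come from the deck.

```python
from typing import Dict, FrozenSet, Iterable, List


def age_band(age: int) -> str:
    """Bucket a numeric age into a ten-year band such as '20...29'."""
    low = (age // 10) * 10
    return f"{low}...{low + 9}"


def to_basket(trx: Dict[str, object], tag_keys: Iterable[str]) -> FrozenSet[str]:
    """Combine real items and attribute tags into one set of 'products'."""
    basket = {f"Item={item}" for item in trx["items"]}
    for key in tag_keys:
        if key in trx:
            # Attribute values become artificial products, e.g. isOver18=TRUE.
            basket.add(f"{key}={str(trx[key]).upper()}")
    return frozenset(basket)


def rule_holds(basket: FrozenSet[str], antecedent: List[str], consequent: str) -> bool:
    """True if the basket contains every antecedent and the consequent."""
    return set(antecedent) <= basket and consequent in basket


trx = {"items": ["X", "Y"], "isOver18": True, "isNewCustomer": True}
basket = to_basket(trx, ["isOver18", "isNewCustomer"])

# (Item=X, isOver18=TRUE, isNewCustomer=TRUE) => (Item=Y)
holds = rule_holds(basket, ["Item=X", "isOver18=TRUE", "isNewCustomer=TRUE"], "Item=Y")
print(sorted(basket), holds, age_band(24))
```

Once attributes are expressed as items, the same rule-mining and rule-matching machinery applies to them without special cases.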
Data Model can handle multiple Datasets and multiple models within a Dataset
• DB/Star Schema/Analysis Container (Host), MB Model (Context), MB Rules and MB KPIs
– Lab-like environment with multiple models in play
[Figure labels: Trx Dataset #1 (SS1, SS #1), Trx Dataset #2 (SH2, SS #2)]
Credits: 1. Photo by Markus Spiske on Unsplash, 2. Photo by Andrew Ridley on Unsplash
MB Rules >> Patterns >> Insights … #1a
• MB Rules
– IF antecedents (set of products/items)
– THEN consequent (single product/item)
– This is extracted from an Association Rule (AR) model after running the Apriori algorithm on the input transactional data
– It is possible to store the MB Rule in many ways. E.g. for rule "b, p, r => c", we can store the rule in the following ways:
MB Rules >> Patterns >> Insights … #1b
• MB Rules
– IF antecedents ((optional) set of logical Partitions, set of products/items)
– THEN consequent (single product/item)
– It is possible to store the MB Rule in many ways. E.g. for rule "country=C2 (LP), year=Y1 (LP), b, p, r => c", we can store the rule in the following ways:
MB Rules >> Patterns >> Insights … #2
• There are a lot of MB Rules, and not all patterns are useful.
• Taking an MB Rule and analyzing it in different contexts is typically an offline exercise
– Typically this would involve a lot of offline actions/modeling exercises to look at the transactional dataset from different perspectives
– [Image from Frinkiac]
– Well, there is a way … and that’s where SQL Pattern Matching comes in.
MB Rules >> Patterns >> Insights … #3
• Make it possible to act on the MB Rule independent of the entire model
– The output of a modeling exercise is a lot of MB Rules, but not all patterns are useful.
– Association Rules model based patterns (i.e. MB Rules) are independent of each other, which allows focused analysis. Unlike a Decision Tree based rule or a Clustering model, we can zoom in on a set of rules or even a single rule of interest and analyze it without affecting the rest of the patterns/rules.
– We have ways to identify "interesting" rules using technical criteria/KPIs, but context/business exigency trumps technical analysis.
– Multiple MB Models can be in play at the same time, working on the same input transactional dataset but baking business context into the model. E.g. analyze product purchase patterns with model1, analyze mode-of-payment choices for products/product categories in model2, and analyze the behavior of customer segments (say, newly signed-up customers or customers responding to a marketing campaign) in model3.
MB Rules >> Patterns >> Insights … #4
• There are a lot of MB Rules, and not all patterns are useful.
• Taking an MB Rule and analyzing it in different contexts is typically an offline exercise
[Figure labels: Model 1, Model 2, Model 3, Model 4; Rule 1, Rule 2, …, Rule N; Rule 101, Rule 102]
Credits: 1. Photo by Zhifei Zhou on Unsplash, 2. Photo by Niklas Hamann on Unsplash
MB Rules >> Patterns >> Insights … #5
• Allow for what-if actions on MB Rules/Patterns
– [Image from Frinkiac]
– SQL tools allow what-if analysis and let end users perform what-if actions via BI tools.
Demo
• Autonomous Data Warehouse (ADW) … Oracle 18c database
• Oracle Machine Learning (OML) bundled/packaged with ADW
• Oracle Analytics Cloud (OAC)
– Many advanced features of the solution leverage the rpd (data modeling layer) component of OAC
– KPI calculations and deep dives on demand need the modeling layer (rpd or equivalent)
Why Data Model Solution instead of hand-crafted SQL?
• Pattern Matching SQL benefits from using a fixed pattern for matching.
– We can write SQL for a single Rule … to match against a dataset (many Trx)
• E.g. for rule "p, q, r => a" we use …
PATTERN ( permute(p,q,r) | a )
DEFINE
p as (mb_prod_id = 'p'),
q as (mb_prod_id = 'q'),
r as (mb_prod_id = 'r'),
a as (mb_prod_id = 'a')
– When we need to match many patterns (say, act on a whole AR model with 100+ rules of varying sizes), each against a Trx dataset, we should define the patterns via metadata/component structures.
PATTERN ((apli|bpli|opli)*)
DEFINE
apli as (mb_comp_li = 'ap'),
bpli as (mb_comp_li = 'bp'),
opli as (mb_comp_li = 'op')
• Same SQL for any pattern => allows integration into ETL, or use in a SQL view to match dynamically via SQL queries (issued by BI tools).
[Callouts: Metadata based pattern, SQL; Data driven pattern, Dyn SQL]
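The idea of one generic matcher driven by rule metadata, instead of one hand-written query per rule, can be sketched outside the database as follows. This is a Python analogy rather than Oracle SQL Pattern Matching: the component table layout, the role names `antecedent` and `consequent`, and the helpers `load_rules` and `match_rules` are all assumptions, and the match ignores event order (much as a permutation pattern would).

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple

# Hypothetical rule metadata: one row per rule component.
RULE_COMPONENTS: List[Tuple[str, str, str]] = [
    ("r1", "antecedent", "p"),
    ("r1", "antecedent", "q"),
    ("r1", "antecedent", "r"),
    ("r1", "consequent", "a"),
    ("r2", "antecedent", "b"),
    ("r2", "consequent", "c"),
]


def load_rules(components: Iterable[Tuple[str, str, str]]) -> Dict[str, Dict[str, Set[str]]]:
    """Group component rows into {rule_id: {'antecedent': {...}, 'consequent': {...}}}."""
    rules: Dict[str, Dict[str, Set[str]]] = defaultdict(
        lambda: {"antecedent": set(), "consequent": set()}
    )
    for rule_id, role, item in components:
        rules[rule_id][role].add(item)
    return dict(rules)


def match_rules(rules: Dict[str, Dict[str, Set[str]]], transactions: Dict[int, Set[str]]) -> Dict[str, List[int]]:
    """For every rule, list the transactions containing all of its components."""
    hits: Dict[str, List[int]] = {}
    for rule_id, parts in rules.items():
        needed = parts["antecedent"] | parts["consequent"]
        hits[rule_id] = sorted(
            trx_id for trx_id, items in transactions.items() if needed <= items
        )
    return hits


TRANSACTIONS: Dict[int, Set[str]] = {
    1: {"p", "q", "r", "a"},
    2: {"q", "p", "a"},
    3: {"b", "c", "p"},
    4: {"r", "q", "p", "a", "b", "c"},
}

hits = match_rules(load_rules(RULE_COMPONENTS), TRANSACTIONS)
print(hits)
```

Adding a rule means adding rows to the metadata, not writing new matching code, which is the property the slide attributes to the metadata-driven SQL.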
Why Data Model Solution?
• Expand the definition of "MB Products" to cover other dimensions (channel, city, country, dayname, timeofday, ...) as artificial products
• Design-time/ETL/offline modeling decisions can be deferred to online analysis for more interactive/dynamic analysis and BI dashboard-time decisions
• Possible to model complex behavior for analysis (sometimes needs an extra ETL step, but we get full analytics capability thereafter)
– "Avid" Reader/Browser
– "Very Active/Interested in product: X" during Sale/Holiday
– No/Regular/Aggressive Treatment of Patients and its effect on outcomes
– Use a datapoint or (set of) Trx as source for pattern definition (What If) … (no ETL)
Why Data Model Solution? (cont.)
• Use case: Classification models (single-row Trx) can be coerced into the master-detail multi-row format needed for SQL Pattern Matching.
– A Decision Tree or any other classification model (Logistic Regression, Random Forest based models, as well as other models built using NN, CNN etc.) can be analyzed using the True Positive (TP) pattern.
– Confusion Matrix KPIs like Accuracy, Recall, Precision etc. can be calculated and recreated at Model/Global level. As shown, the same can also be done for logical Partition(s).
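The confusion-matrix KPIs named above follow directly from counts of true/false positives and negatives, and the same function can be applied to the whole scored dataset or to one logical partition. A minimal sketch with hypothetical names (`confusion_kpis`, the `(country, actual, predicted)` row layout and the sample rows are assumptions, not taken from the deck):

```python
from typing import Dict, List, Tuple


def confusion_kpis(pairs: List[Tuple[bool, bool]]) -> Dict[str, float]:
    """Accuracy, precision and recall from (actual, predicted) pairs."""
    tp = sum(1 for actual, pred in pairs if actual and pred)
    tn = sum(1 for actual, pred in pairs if not actual and not pred)
    fp = sum(1 for actual, pred in pairs if not actual and pred)
    fn = sum(1 for actual, pred in pairs if actual and not pred)
    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# (country, actual label, predicted label) for each scored row.
SCORED: List[Tuple[str, bool, bool]] = [
    ("C1", True, True),
    ("C1", False, True),
    ("C1", False, False),
    ("C2", True, False),
    ("C2", True, True),
    ("C2", True, True),
]

# Model/Global level: every scored row.
global_kpis = confusion_kpis([(a, p) for _, a, p in SCORED])
# Logical partition level: only rows where country = C2.
c2_kpis = confusion_kpis([(a, p) for country, a, p in SCORED if country == "C2"])
print(global_kpis, c2_kpis)
```

The partition-level result can differ noticeably from the global one, which is the reason to compute both.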
Summary
• Pattern Analysis (Earlier)
– Pattern Discovery via OAA/ODM
– Model used to extract Rules and core KPIs
– No way to score Rules (need to rebuild)
– Patterns of special interest (anomalous/obscure) cannot be found unless model settings are relaxed. If we relax the criteria, we get those patterns but also many, many more.
• Pattern Analysis via Data Model (This Solution)
– Pattern Discovery via OML (no change)
– Rules/KPIs extracted into a Data Model, allowing for BI/ad hoc analysis
– Post-processing to set up the analysis context (superset of analysis dimensions/attributes)
– SQL approach allows
• New KPIs: KPIs of a statistical nature as well as KPIs related to business needs (as elaborate as needed)
• Scoring against new data: patterns can degrade in performance
• Scoring/tracking patterns against specific Trx subsets of interest
• Ad hoc BI/exploratory data analysis of patterns
• Special patterns of interest (fraud use cases) with very low support can also be found as well as analyzed (what-if)
• 2 independent ways to get MB KPIs: ETL + DB/BI (faster) or DB View + DB/BI (slower, on demand)
• Useful?
• Very little shown of ADW/OML currently (end goal); SQL Developer is used for most DB actions
• Need more details on Market Basket Analysis (MBA)? SQL Pattern Matching? A 40-minute talk precludes giving much introduction to the material.
AnDSummit2020 Session Pattern Analysis Data Model

More Related Content

What's hot

Introduction to Property Graph Features (AskTOM Office Hours part 1)
Introduction to Property Graph Features (AskTOM Office Hours part 1) Introduction to Property Graph Features (AskTOM Office Hours part 1)
Introduction to Property Graph Features (AskTOM Office Hours part 1) Jean Ihm
 
Gain Insights with Graph Analytics
Gain Insights with Graph Analytics Gain Insights with Graph Analytics
Gain Insights with Graph Analytics Jean Ihm
 
Information Exploitation at BBN
Information Exploitation at BBNInformation Exploitation at BBN
Information Exploitation at BBNPlamen Petrov
 
Practical Distributed Machine Learning Pipelines on Hadoop
Practical Distributed Machine Learning Pipelines on HadoopPractical Distributed Machine Learning Pipelines on Hadoop
Practical Distributed Machine Learning Pipelines on HadoopDataWorks Summit
 
Graph Analytics for big data
Graph Analytics for big dataGraph Analytics for big data
Graph Analytics for big dataSigmoid
 
Vectorized Deep Learning Acceleration from Preprocessing to Inference and Tra...
Vectorized Deep Learning Acceleration from Preprocessing to Inference and Tra...Vectorized Deep Learning Acceleration from Preprocessing to Inference and Tra...
Vectorized Deep Learning Acceleration from Preprocessing to Inference and Tra...Databricks
 
rasdaman: from barebone Arrays to DataCubes
rasdaman: from barebone Arrays to DataCubesrasdaman: from barebone Arrays to DataCubes
rasdaman: from barebone Arrays to DataCubesEUDAT
 
Machine learning with Spark
Machine learning with SparkMachine learning with Spark
Machine learning with SparkKhalid Salama
 
Serverless data pipelines gcp
Serverless data pipelines gcpServerless data pipelines gcp
Serverless data pipelines gcpCatherine Kimani
 
Large Scale Fuzzy Name Matching with a Custom ML Pipeline in Batch and Stream...
Large Scale Fuzzy Name Matching with a Custom ML Pipeline in Batch and Stream...Large Scale Fuzzy Name Matching with a Custom ML Pipeline in Batch and Stream...
Large Scale Fuzzy Name Matching with a Custom ML Pipeline in Batch and Stream...Databricks
 
Bi on Big Data - Strata 2016 in London
Bi on Big Data - Strata 2016 in LondonBi on Big Data - Strata 2016 in London
Bi on Big Data - Strata 2016 in LondonDremio Corporation
 
Building A Hybrid Warehouse: Efficient Joins between Data Stored in HDFS and ...
Building A Hybrid Warehouse: Efficient Joins between Data Stored in HDFS and ...Building A Hybrid Warehouse: Efficient Joins between Data Stored in HDFS and ...
Building A Hybrid Warehouse: Efficient Joins between Data Stored in HDFS and ...Yuanyuan Tian
 
Data Analytics with R and SQL Server
Data Analytics with R and SQL ServerData Analytics with R and SQL Server
Data Analytics with R and SQL ServerStéphane Fréchette
 
詹剑锋:Big databench—benchmarking big data systems
詹剑锋:Big databench—benchmarking big data systems詹剑锋:Big databench—benchmarking big data systems
詹剑锋:Big databench—benchmarking big data systemshdhappy001
 
Graph Analytics in Spark
Graph Analytics in SparkGraph Analytics in Spark
Graph Analytics in SparkPaco Nathan
 
Big Data LDN 2016: All data is equal – but some data is more equal than others
Big Data LDN 2016: All data is equal – but some data is more equal than othersBig Data LDN 2016: All data is equal – but some data is more equal than others
Big Data LDN 2016: All data is equal – but some data is more equal than othersMatt Stubbs
 

What's hot (20)

Introduction to Property Graph Features (AskTOM Office Hours part 1)
Introduction to Property Graph Features (AskTOM Office Hours part 1) Introduction to Property Graph Features (AskTOM Office Hours part 1)
Introduction to Property Graph Features (AskTOM Office Hours part 1)
 
Gain Insights with Graph Analytics
Gain Insights with Graph Analytics Gain Insights with Graph Analytics
Gain Insights with Graph Analytics
 
Information Exploitation at BBN
Information Exploitation at BBNInformation Exploitation at BBN
Information Exploitation at BBN
 
Practical Distributed Machine Learning Pipelines on Hadoop
Practical Distributed Machine Learning Pipelines on HadoopPractical Distributed Machine Learning Pipelines on Hadoop
Practical Distributed Machine Learning Pipelines on Hadoop
 
Graph Analytics for big data
Graph Analytics for big dataGraph Analytics for big data
Graph Analytics for big data
 
Vectorized Deep Learning Acceleration from Preprocessing to Inference and Tra...
Vectorized Deep Learning Acceleration from Preprocessing to Inference and Tra...Vectorized Deep Learning Acceleration from Preprocessing to Inference and Tra...
Vectorized Deep Learning Acceleration from Preprocessing to Inference and Tra...
 
SPD and KEA: HDF5 based file formats for Earth Observation
SPD and KEA: HDF5 based file formats for Earth ObservationSPD and KEA: HDF5 based file formats for Earth Observation
SPD and KEA: HDF5 based file formats for Earth Observation
 
rasdaman: from barebone Arrays to DataCubes
rasdaman: from barebone Arrays to DataCubesrasdaman: from barebone Arrays to DataCubes
rasdaman: from barebone Arrays to DataCubes
 
Machine learning with Spark
Machine learning with SparkMachine learning with Spark
Machine learning with Spark
 
Machine Learning with Apache Spark
Machine Learning with Apache SparkMachine Learning with Apache Spark
Machine Learning with Apache Spark
 
Incorporating ISO Metadata Using HDF Product Designer
Incorporating ISO Metadata Using HDF Product DesignerIncorporating ISO Metadata Using HDF Product Designer
Incorporating ISO Metadata Using HDF Product Designer
 
AnzoGraph DB - SPARQL 101
AnzoGraph DB - SPARQL 101AnzoGraph DB - SPARQL 101
AnzoGraph DB - SPARQL 101
 
Serverless data pipelines gcp
Serverless data pipelines gcpServerless data pipelines gcp
Serverless data pipelines gcp
 
Large Scale Fuzzy Name Matching with a Custom ML Pipeline in Batch and Stream...
Large Scale Fuzzy Name Matching with a Custom ML Pipeline in Batch and Stream...Large Scale Fuzzy Name Matching with a Custom ML Pipeline in Batch and Stream...
Large Scale Fuzzy Name Matching with a Custom ML Pipeline in Batch and Stream...
 
Bi on Big Data - Strata 2016 in London
Bi on Big Data - Strata 2016 in LondonBi on Big Data - Strata 2016 in London
Bi on Big Data - Strata 2016 in London
 
Building A Hybrid Warehouse: Efficient Joins between Data Stored in HDFS and ...
Building A Hybrid Warehouse: Efficient Joins between Data Stored in HDFS and ...Building A Hybrid Warehouse: Efficient Joins between Data Stored in HDFS and ...
Building A Hybrid Warehouse: Efficient Joins between Data Stored in HDFS and ...
 
Data Analytics with R and SQL Server
Data Analytics with R and SQL ServerData Analytics with R and SQL Server
Data Analytics with R and SQL Server
 
詹剑锋:Big databench—benchmarking big data systems
詹剑锋:Big databench—benchmarking big data systems詹剑锋:Big databench—benchmarking big data systems
詹剑锋:Big databench—benchmarking big data systems
 
Graph Analytics in Spark
Graph Analytics in SparkGraph Analytics in Spark
Graph Analytics in Spark
 
Big Data LDN 2016: All data is equal – but some data is more equal than others
Big Data LDN 2016: All data is equal – but some data is more equal than othersBig Data LDN 2016: All data is equal – but some data is more equal than others
Big Data LDN 2016: All data is equal – but some data is more equal than others
 


AnDSummit2020 Session Pattern Analysis Data Model

  • 1.
  • 2. Copyright © 2018, Oracle and/or its affiliates. All rights reserved. | Autonomous Data Warehouse Oracle Machine Learning Oracle Analytics Cloud A Data Model Approach to performing Pattern Analysis Shankar Somayajula shankar.somayajula@oracle.com Feb 25th, 2020
  • 3. Copyright © 2017, Oracle and/or its affiliates. All rights reserved. | • Pattern Analysis Data Model … as extension to analytical star schema • Pattern/MB Rule Definition • SQL Pattern Matching • Market Basket BI Application/usecase • Demo / Screenshots • Benefits of Pattern Analysis – other possibilities • Q&A 3 Agenda
  • 4. Confidential – © 2020 Oracle Internal
  • 5. Finding Patterns in Data: typical use cases in today’s world of fast exploration of data (lots of data) • Financial Services: money laundering, fraud, tracking stock market • Law & Order: monitoring suspicious activities • Retail: returns fraud, buying patterns, sessionization • Telcos: money laundering, SIM card fraud, call quality • Utilities: network analysis, fraud, unusual usage
  • 6. Typical Pattern Matching Use Cases (use case: input data; pattern; result)
      – Sessionization: weblogs; continuous clicks by the same user; generate reports on number of distinct sessions, average page views per session, etc.
      – Fraud: credit card transactions; two transactions in different locations within a short period of time; find cases in which a credit card may have been used fraudulently, since a physical person cannot be in two places at once
      – In-game purchases: game logs; events leading up to an in-game purchase; detect common sequences of events that result in an in-game purchase
      – Fraud (mobiles): CDR logs; SIM card being used in multiple handsets; flag individual SIM cards being used by multiple handsets within a specified time period
      – Stock market analysis: ticker logs; track possible fraudulent linked patterns of behavior; track known patterns of behavior such as head and shoulders, triangles, channels and wedges
  • 7. Typical Pattern Matching Use Cases (use case: input data; pattern; result)
      – Auditing/Compliance: application logs; analyze changes to secure customer data; find instances where an operator has made suspect modifications to secure client data
      – Money laundering: transaction logs; search for small transfers within a time window followed by a large transfer within “x” days of the last small transfer; detect a suspicious money transfer pattern for an account and report the account, date of first small transfer, and date of last large transfer
      – Call service quality: CDR logs; search for dropped/reconnected calls; identify how many times calls were restarted in a session, total effective call duration and total interrupted duration
      – Login security: application logs; search for attempted logins; identify attempts to gain access to an application/schema that can be linked to hackers or inappropriate access
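The money-laundering row in the use-case table can be sketched in a few lines of Python. This is only an illustration of the pattern's logic, not the SQL used in the session: the function name `find_suspicious`, the amount thresholds, the minimum count of small transfers, and the day limits are all assumptions chosen for the example.

```python
from datetime import date

def find_suspicious(transfers, small_max=2000, large_min=10000,
                    min_small=3, window_days=30, follow_days=10):
    """Scan one account's transfers, given as (date, amount) sorted by date.

    Returns (date of first small transfer, date of large transfer) for every
    run of small transfers that is followed by a large transfer.
    All thresholds are hypothetical defaults.
    """
    hits = []
    run = []  # dates of recent small transfers
    for day, amount in transfers:
        if amount <= small_max:
            # Keep only small transfers inside the time window ending today.
            run = [d for d in run if (day - d).days <= window_days]
            run.append(day)
        elif amount >= large_min:
            # A large transfer closes the pattern if it follows soon enough.
            if len(run) >= min_small and (day - run[-1]).days <= follow_days:
                hits.append((run[0], day))
            run = []
        # Mid-sized transfers are ignored in this sketch.
    return hits

example = [
    (date(2020, 1, 1), 500),
    (date(2020, 1, 5), 900),
    (date(2020, 1, 9), 700),
    (date(2020, 1, 12), 15000),
]
print(find_suspicious(example))
```

The result pairs match what the table asks to report per account: the date of the first small transfer and the date of the large transfer.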
  • 8. PADM (Pattern Analysis Data Model) Evolution • OAC/OBIEE Business Model • Typical market basket (MB) analysis involves extraction of MB Rules/Patterns from transaction (Trx) data. • MB Rules are qualified with default MB KPIs • BI schema for ad hoc reporting/analysis can involve source Trx data analysis as well as pattern/MB Rule analysis (disjoint) (diagram labels: Store Customer Channel Promotion Product MB Rule Trx MB Prod MB OML KPIs MB Rules MB Trx)
  • 9. PADM Evolution • OAC/OBIEE Business Model • Typical MB involves extraction of MB Rules/Patterns from Trx data. • MB Rules are qualified with default MB KPIs • BI schema for ad hoc reporting/analysis can involve source Trx data analysis as well as pattern/MB Rule analysis • Add a Model dimension for analysis context. (diagram labels: MB Rule Trx MB Prod MB OML KPIs MB Rules MB Trx KPIs MB Model)
  • 10. PADM Evolution • OAC/OBIEE Business Model • Typical MB involves extraction of MB Rules/Patterns from Trx data. • MB Rules are qualified with default MB KPIs • Advanced BI schema to support ad hoc reporting/analysis of MB Rules/Patterns across the whole dataset or split by attribute fields, as well as against the source Trx subset of interest. • Model for analysis context. (diagram labels: MB Rule Trx MB Prod MB OML KPIs MB Rules MB Trx KPIs MB Model MB Rule KPIs)
  • 11. PADM Evolution • OAC/OBIEE Business Model • MB KPIs (Model – Rule): Dataset, All Trx (diagram labels: MB Rules MB Model MB Rule KPIs)
  • 12. PADM Evolution • OAC/OBIEE Business Model • MB KPIs (Model – Rule – Trx): Data Subset, Partition, Deep dives (diagram labels: MB Rules MB Model MB Rule KPIs MB Rule Trx)
  • 13. Patterns – Some examples • Complete Dataset (DS) • Credits: 1. Photo by Markus Spiske on Unsplash
  • 14. Patterns – Some examples • Complete Dataset (DS) • Find Big Dark Red panels (here, brown = red)
  • 15. Patterns – Some examples • Complete Dataset (DS) • Find Big Dark Red panels (here, brown = red)
  • 16. Patterns – Some examples • Complete Dataset (DS) – Assume each horizontal row is a set/transaction of ordered events • Find a large Blue and a large Red combination of panels (here, brown = red) (diagram label: natural order of events)
  • 17. Patterns – Some examples • Complete Dataset (DS) – Assume each horizontal row is a set/transaction • Find combination: large Dark Blue and large Pink (diagram label: natural order of events)
  • 18. Patterns – Some examples • Complete Dataset (DS) – Assume each horizontal row is a set/transaction • Find combination: large Dark Blue followed by large Pink (diagram label: natural order of events)
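The difference between the last two examples (a combination of panels versus one panel followed by another) is the difference between a set test and an order-aware test on each row. A minimal Python sketch, assuming each row is a list of event labels in their natural order and that "followed by" does not require the two events to be adjacent; the labels and function names are made up for the illustration:

```python
def contains_all(events, items):
    """Unordered combination: every item occurs somewhere in the row."""
    return set(items) <= set(events)

def contains_in_order(events, items):
    """Ordered pattern: items occur in the given order (gaps allowed)."""
    remaining = iter(events)
    # `item in remaining` consumes the iterator up to the first match,
    # so each later item must be found after the previous one.
    return all(item in remaining for item in items)

row = ["pink_large", "green_small", "dark_blue_large"]

print(contains_all(row, ["dark_blue_large", "pink_large"]))       # combination
print(contains_in_order(row, ["dark_blue_large", "pink_large"]))  # followed by
```

For this row the combination test succeeds while the ordered test fails, because the pink panel comes before the dark blue one.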
  • 19. Global Models • Complete Dataset (DS)
  • 20. Pattern Definition • Complete Dataset (DS) • Global Pattern: p, q => c … {IF (p, q) THEN (c)} – Global KPIs (diagram labels: Model 1, Model 2, Model 3, Model 4)
  • 21. Partitioned Models • DB/Star Schema/Analysis Container (Host), MB Model (Context), MB Rules and MB KPIs – Lab-like environment for multiple models being in play (diagram labels: Model 1, Model 2, Model 3, Model 4; MB Model partitioned by Country (say))
  • 22. Pattern Definition • Complete Dataset (DS) split by Country: {(C1), (C2), (C3)} • Partitioned Pattern: p, q => c … {IF (p, q) THEN (c)} • For partition country=C1 … p, q => c (diagram: dataset partitioned along Geography by Country Name (C1, C2, C3); Model 3, Model 4)
  • 23. Pattern Definition • Complete Dataset (DS) split by Country: {(C1), (C2), (C3)} • Partitioned Pattern: p, q => c … {IF (p, q) THEN (c)} • Partition country=C1 … p, q => c • Partition country=C2 … NA (Knowledge Discovery), Available (via SQL) (diagram: dataset partitioned along Geography by Country Name (C1, C2, C3); Model 3, Model 4)
  • 24. Pattern Definition • Complete Dataset (DS) split by Country: {(C1), (C2), (C3)} • Partitioned Pattern: p, q => c … {IF (p, q) THEN (c)} • Partition country=C1 … p, q => c • Partition country=C2 … • Partition country=C3 … p, q => c (diagram: dataset partitioned along Geography by Country Name (C1, C2, C3); Model 3, Model 4)
  • 25. Pattern Definition • Pattern or MB Rule – IF antecedents ((optional) set of logical partitions, set of products/items) – THEN consequent (single product/item) • Complete Dataset (DS):
  • 26. Pattern Definition • Pattern or MB Rule – IF antecedents ((optional) set of logical partitions, set of products/items) – THEN consequent (single product/item) • E.g.: Complete Dataset (DS) split by Country and Year: {(C1, Y1), (C1, Y2), (C2, Y1), (C2, Y2), (C3, Y1), (C3, Y2)} (diagram: dataset partitioned along Time by Year (Y1, Y2) and along Geography by Country Name (C1, C2, C3))
  • 27. Pattern Definition • Pattern or MB Rule – IF antecedents ((optional) set of logical partitions, set of products/items) – THEN consequent (single product/item) • Complete Dataset (DS) split by Country and Year: {(C1, Y1), (C1, Y2), (C2, Y1), (C2, Y2), (C3, Y1), (C3, Y2)} • Pattern: country=C2 (LP), year=Y1 (LP), p, q => c – Logical Partition (Part KPIs): {(C2, Y1)} – Non-Partition (NP KPIs): {(C1, Y1), (C1, Y2), (C2, Y2), (C3, Y1), (C3, Y2)} – Global KPIs: {(C1, Y1), (C1, Y2), (C2, Y1), (C2, Y2), (C3, Y1), (C3, Y2)} – Core pattern: p, q => c (diagram: dataset partitioned along Time by Year (Y1, Y2) and along Geography by Country Name (C1, C2, C3))
  • 28. Pattern Definition • Complete Dataset (DS) split by Country and Year: {(C1, Y1), (C1, Y2), (C2, Y1), (C2, Y2), (C3, Y1), (C3, Y2)} • Pattern: country=C2 (LP), year=Y1 (LP), p, q => c – Logical Partition (Part KPIs): {(C2, Y1)}
  • 29. Pattern Definition • Complete Dataset (DS) split by Country and Year: {(C1, Y1), (C1, Y2), (C2, Y1), (C2, Y2), (C3, Y1), (C3, Y2)} • Pattern: country=C2 (LP), year=Y1 (LP), p, q => c – Logical Partition (Part KPIs): {(C2, Y1)} – Non-Partition (NP KPIs): {(C1, Y1), (C1, Y2), (C2, Y2), (C3, Y1), (C3, Y2)}
  • 30. Pattern Definition • Complete Dataset (DS) split by Country and Year: {(C1, Y1), (C1, Y2), (C2, Y1), (C2, Y2), (C3, Y1), (C3, Y2)} • Pattern: country=C2 (LP), year=Y1 (LP), p, q => c – Logical Partition (Part KPIs): {(C2, Y1)} – Non-Partition (NP KPIs): {(C1, Y1), (C1, Y2), (C2, Y2), (C3, Y1), (C3, Y2)} – Global KPIs: {(C1, Y1), (C1, Y2), (C2, Y1), (C2, Y2), (C3, Y1), (C3, Y2)}
  • 31. Pattern Definition • Complete Dataset (DS) split by Country and Year: {(C1, Y1), (C1, Y2), (C2, Y1), (C2, Y2), (C3, Y1), (C3, Y2)} • Pattern: country=C2 (LP), year=Y1 (LP), p, q => c – Logical Partition (Part KPIs): {(C2, Y1)} • Core pattern: p, q => c – The pattern's logical partitions can act as filters (performant) • Not concerned with KPIs at Global or NP levels • Can be highly selective
  • 32. Pattern Definition • Complete Dataset (DS) split by Country and Year: {(C1, Y1), (C1, Y2), (C2, Y1), (C2, Y2), (C3, Y1), (C3, Y2)} • Pattern: country=C2 (LP), year=Y1 (LP), p, q => c – Logical Partition (Part KPIs): {(C2, Y1)} – Non-Partition (NP KPIs): {(C1, Y1), (C1, Y2), (C2, Y2), (C3, Y1), (C3, Y2)} • Core pattern: p, q => c
  • 33. Pattern Definition • Complete Dataset (DS) split by Country and Year: {(C1, Y1), (C1, Y2), (C2, Y1), (C2, Y2), (C3, Y1), (C3, Y2)} • Pattern: country=C2 (LP), year=Y1 (LP), p, q => c – Logical Partition (Part KPIs): {(C2, Y1)} – Non-Partition (NP KPIs): {(C1, Y1), (C1, Y2), (C2, Y2), (C3, Y1), (C3, Y2)} – Global KPIs: {(C1, Y1), (C1, Y2), (C2, Y1), (C2, Y2), (C3, Y1), (C3, Y2)} • Core pattern: p, q => c
  • 34. Pattern Definition • Complete Dataset (DS) • Pattern: country=C2, year=Y1, p, q => c – Logical Partition (Part KPIs): no LP, hence full DS – Non-Partition (NP KPIs): NA – Global KPIs: full DS • Core pattern: C2, Y1, p, q => c (diagram labels: Time Year, Country Name)
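The three KPI scopes used in the Pattern Definition slides (logical partition, non-partition, global) can be illustrated with a small Python sketch. The slides do not list which KPIs are computed, so support and confidence, the usual association-rule measures, are used here as an assumption; the function names and the transaction layout (a dict with `country`, `year` and an `items` set) are also hypothetical.

```python
def rule_kpis(transactions, antecedent, consequent):
    """Support and confidence of `antecedent => consequent` over one scope."""
    n = len(transactions)
    with_ante = [t for t in transactions if antecedent <= t["items"]]
    with_both = [t for t in with_ante if consequent in t["items"]]
    return {
        "trx": n,
        "support": len(with_both) / n if n else None,
        "confidence": len(with_both) / len(with_ante) if with_ante else None,
    }

def scoped_kpis(transactions, partition, antecedent, consequent):
    """Evaluate the core pattern inside, outside, and across the partition."""
    def in_partition(t):
        return all(t[key] == value for key, value in partition.items())

    inside = [t for t in transactions if in_partition(t)]
    outside = [t for t in transactions if not in_partition(t)]
    return {
        "partition": rule_kpis(inside, antecedent, consequent),
        "non_partition": rule_kpis(outside, antecedent, consequent),
        "global": rule_kpis(transactions, antecedent, consequent),
    }

trx = [
    {"country": "C2", "year": "Y1", "items": {"p", "q", "c"}},
    {"country": "C2", "year": "Y1", "items": {"p", "q"}},
    {"country": "C1", "year": "Y1", "items": {"p", "q", "c"}},
    {"country": "C3", "year": "Y2", "items": {"x"}},
]
kpis = scoped_kpis(trx, {"country": "C2", "year": "Y1"}, {"p", "q"}, "c")
for scope, values in kpis.items():
    print(scope, values)
```

Passing an empty partition dict puts every transaction in the partition scope, which mirrors the case where the pattern has no logical partition and the partition KPIs cover the full dataset.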
  • 35. Examples of MB Rules/Insights
    • (diapers) => (beer)
    • (peanutButter, jelly) => (bread)
    • Many ways to improve traditional MB
      – Multiple levels of a dimension: SKU to Sub-Category to Category (ideally at the same time)
      – Add additional dimensions
      – Trx/Dimensional attributes as tags
    Multidimensional rules with artificial/virtual products give a richer picture:
    • (Item=X, isOver18=TRUE, isNewCustomer=TRUE) => (Item=Y)
    • (buyerAge >= 63, loyaltyAge >= 2) => (toothBrushBuy >= 2)
    • age(X, "20...29"), income(X, "52k...58k") => buys(X, "iPad")
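One way to picture the "artificial/virtual products" idea is to turn transaction and customer attributes into extra basket items before rules are mined or matched. The sketch below is an assumption-laden illustration: the field names (`items`, `buyer_age`, `is_new_customer`) and the helpers `to_basket` and `rule_matches` are hypothetical.

```python
def to_basket(trx):
    """Flatten one transaction into a set of real and virtual items."""
    basket = {f"Item={item}" for item in trx["items"]}
    # Attributes become tags that sit in the basket next to real products.
    basket.add(f"isNewCustomer={'TRUE' if trx['is_new_customer'] else 'FALSE'}")
    basket.add(f"isOver18={'TRUE' if trx['buyer_age'] >= 18 else 'FALSE'}")
    return basket


def rule_matches(basket, antecedent, consequent):
    """True when every antecedent item and the consequent are in the basket."""
    return antecedent <= basket and consequent in basket


# Hypothetical transaction with two products and two customer attributes.
trx = {"items": ["X", "Y"], "buyer_age": 34, "is_new_customer": True}
basket = to_basket(trx)

# (Item=X, isOver18=TRUE, isNewCustomer=TRUE) => (Item=Y)
matched = rule_matches(
    basket,
    {"Item=X", "isOver18=TRUE", "isNewCustomer=TRUE"},
    "Item=Y",
)
```

Because the tags are ordinary set members, the rule-matching logic does not need to know which items are real products and which are virtual.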
  • 36. Data Model can handle multiple Datasets and multiple models within a Dataset
    • DB/Star Schema/Analysis Container (Host), MB Model (Context), MB Rules and MB KPIs
      – Lab-like environment for multiple models being in play
    [Diagram: Trx Dataset #1 (SS1, SS #1); Trx Dataset #2 (SH2, SS #2)]
    Credits: 1. Photo by Markus Spiske on Unsplash, 2. Photo by Andrew Ridley on Unsplash
  • 37. MB Rules >> Patterns >> Insights … #1a
    • MB Rules
      – IF antecedents (set of products/items)
      – THEN consequent (single product/item)
      – The rule is extracted from an Association Rule (AR) model after running the Apriori algorithm on the input transactional data
      – The MB Rule can be stored in many ways. For example, the rule "b, p, r => c" can be stored in the following ways:
    [Figure: rule storage layouts]
  • 38. MB Rules >> Patterns >> Insights … #1b
    • MB Rules
      – IF antecedents ((optional) set of Logical Partitions, set of products/items)
      – THEN consequent (single product/item)
      – The MB Rule can be stored in many ways. For example, the rule "country=C2 (LP), year=Y1 (LP), b, p, r => c" can be stored in the following ways:
    [Figure: rule storage layouts]
  • 39. MB Rules >> Patterns >> Insights … #2
    • There are a lot of MB Rules, and not all patterns are useful.
    • Taking an MB Rule and analyzing it in different contexts is typically an offline exercise
      – Typically this involves a lot of offline actions/modeling exercises to look at the transactional dataset from different perspectives
      – [Image: meme from Frinkiac]
      – Well, there is a way … and that's where SQL Pattern Matching comes in.
  • 40. MB Rules >> Patterns >> Insights … #3
    • Make it possible to act on an MB Rule independently of the entire model
      – The output of a modeling exercise is a lot of MB Rules, but not all patterns are useful.
      – Patterns from an Association Rules model (i.e. MB Rules) are independent of each other, which allows focused analysis. Unlike a Decision Tree based rule or a Clustering model, we can zoom in on a set of rules, or even a single rule of interest, and analyze it without affecting the rest of the patterns/rules.
      – We have ways to identify "interesting" rules using technical criteria/KPIs, but context/business exigency trumps technical analysis.
      – Multiple MB Models can be in play at the same time, working on the same input transactional dataset but each baking a different business context into the model. E.g. analyze product purchase patterns with model1; analyze mode-of-payment choices for products/product categories in model2; analyze the behavior of customer segments (say, newly signed-up customers or customers responding to a marketing campaign) in model3; etc.
  • 41. MB Rules >> Patterns >> Insights … #4 (cont.)
    • There are a lot of MB Rules, and not all patterns are useful.
    • Taking an MB Rule and analyzing it in different contexts is typically an offline exercise
    [Diagram: Model 1, Model 2, Model 3, Model 4; Rule 1, Rule 2, …, Rule N]
    Credits: 1. Photo by Zhifei Zhou on Unsplash, 2. Photo by Niklas Hamann on Unsplash
  • 43. MB Rules >> Patterns >> Insights … #4 (cont.)
    • There are a lot of MB Rules, and not all patterns are useful.
    • Taking an MB Rule and analyzing it in different contexts is typically an offline exercise
    [Diagram: Model 1, Model 2, Model 3, Model 4; Rule …, Rule N, Rule 101, Rule 102]
  • 45. MB Rules >> Patterns >> Insights … #5
    • Allow for what-if actions on MB Rules/Patterns
      – [Image: meme from Frinkiac]
      – SQL tools allow what-if analysis ... and let end users perform what-if actions via BI tools.
  • 46. Demo
    • Autonomous Data Warehouse (ADW) … Oracle 18c database
    • Oracle Machine Learning (OML) bundled/packaged with ADW
    • Oracle Analytics Cloud (OAC)
      – Many advanced features of the solution leverage the rpd (data modeling layer) component of OAC
      – On-demand KPI calculations and deep dives need the modeling layer (rpd or equivalent)
  • 47. Why a Data Model Solution instead of hand-crafted SQL?
    • Pattern Matching SQL benefits from using a fixed pattern for matching.
      – We can write SQL for a single rule and match it against a dataset (many Trx)
        • For example, for rule "p, q, r => a" we use:
          PATTERN ( permute(p,q,r) | a )
          DEFINE p as (mb_prod_id = 'p'), q as (mb_prod_id = 'q'), r as (mb_prod_id = 'r'), a as (mb_prod_id = 'a')
      – When we need to match many patterns (say, act on a whole AR model with 100+ rules of varying sizes), each against a Trx dataset, we should define the patterns via metadata/component structures:
          PATTERN ((apli|bpli|opli)*)
          DEFINE apli as (mb_comp_li = 'ap'), bpli as (mb_comp_li = 'bp'), opli as (mb_comp_li = 'op')
    • Same SQL for any pattern => allows integration into ETL, or use in a SQL view to match dynamically via SQL queries issued by BI tools. Metadata-based pattern, SQL data-driven pattern, dynamic SQL.
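The contrast between one hand-written pattern per rule and a single metadata-driven matcher can be sketched in Python. This is a set-based analogue, not a reproduction of the MATCH_RECOGNIZE query: the rule-component rows, the `antecedent`/`consequent` role labels, and the helper names are assumptions for illustration and are not meant to mirror the `mb_comp_li` codes on the slide.

```python
from collections import defaultdict

# Hypothetical rule metadata: one row per rule component (rule, role, product).
RULE_COMPONENTS = [
    ("R1", "antecedent", "p"),
    ("R1", "antecedent", "q"),
    ("R1", "antecedent", "r"),
    ("R1", "consequent", "a"),
    ("R2", "antecedent", "diapers"),
    ("R2", "consequent", "beer"),
]

# Hypothetical transaction lines: (trx_id, product).
TRX_LINES = [
    (1, "q"), (1, "p"), (1, "x"), (1, "r"), (1, "a"),
    (2, "p"), (2, "q"), (2, "a"),
    (3, "beer"), (3, "diapers"),
]


def load_rules(components):
    """Group component rows into {rule_id: {role: set of products}}."""
    rules = defaultdict(lambda: {"antecedent": set(), "consequent": set()})
    for rule_id, role, product in components:
        rules[rule_id][role].add(product)
    return dict(rules)


def matching_trx(rules, trx_lines):
    """Return, per rule, the sorted transaction ids that contain the full rule.

    The same code handles rules of any size because the pattern comes
    from data, not from the text of the matcher.
    """
    baskets = defaultdict(set)
    for trx_id, product in trx_lines:
        baskets[trx_id].add(product)
    result = {}
    for rule_id, parts in rules.items():
        needed = parts["antecedent"] | parts["consequent"]
        result[rule_id] = sorted(
            trx_id for trx_id, items in baskets.items() if needed <= items
        )
    return result


matches = matching_trx(load_rules(RULE_COMPONENTS), TRX_LINES)
```

Adding a rule here means adding rows to `RULE_COMPONENTS`, with no change to the matching code, which is the property that makes the approach usable from ETL or a view.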
  • 48. Why a Data Model Solution?
    • Expand the definition of "MB Products" to cover other dimensions (channel, city, country, dayname, timeofday, ...) as artificial products
    • Design-time/ETL/offline modeling decisions can be deferred to online analysis, allowing more interactive/dynamic analysis and decisions made at BI dashboard time
    • Possible to model complex behavior for analysis (this sometimes needs an extra ETL step, but we get full analytics capability thereafter)
      – "Avid" Reader/Browser
      – "Very Active/Interested in product: X" during Sale/Holiday
      – No/Regular/Aggressive treatment of patients and its effect on outcomes
      – Use a datapoint or a (set of) Trx as the source for a pattern definition (what-if) … (no ETL)
  • 49. Why a Data Model Solution?
    • Use case: Classification models (single-row Trx) can be coerced into the master-detail, multi-row format needed for SQL Pattern Matching.
      – A Decision Tree or any other classification model (Logistic Regression, Random Forest based models, as well as other models built using NN, CNN, etc.) can be analyzed using the True Positive (TP) pattern.
      – Confusion Matrix KPIs like Accuracy, Recall, Precision, etc. can be calculated and recreated at the Model/Global level. As shown, the same can also be done for Logical Partition(s).
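The confusion-matrix KPIs mentioned above can be sketched in Python for both the global level and a single partition. The scored rows, the use of country as the partitioning attribute, and the helper name `confusion_kpis` are assumptions made for this illustration.

```python
def confusion_kpis(rows):
    """Compute accuracy, precision and recall from (actual, predicted) pairs."""
    tp = fp = tn = fn = 0
    for actual, predicted in rows:
        if predicted and actual:
            tp += 1
        elif predicted and not actual:
            fp += 1
        elif actual:
            fn += 1
        else:
            tn += 1
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Hypothetical scored rows: (country, actual label, predicted label).
SCORED = [
    ("C1", True, True),
    ("C1", False, True),
    ("C1", False, False),
    ("C1", True, False),
    ("C2", True, True),
    ("C2", True, True),
    ("C2", False, False),
    ("C2", False, True),
]

# Global KPIs use every row; partition KPIs use only the rows for country C2.
global_kpis = confusion_kpis([(a, p) for _, a, p in SCORED])
c2_kpis = confusion_kpis([(a, p) for c, a, p in SCORED if c == "C2"])
```

The partition-level figures differ from the global ones in this toy data, which is the kind of difference a per-partition view is meant to expose.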
  • 50. Summary
    • Pattern Analysis (Earlier)
      – Pattern discovery via OAA/ODM
      – Model used to extract Rules and core KPIs
      – No way to score Rules (need to rebuild the model)
      – Patterns of special interest (anomalous/obscure) cannot be found unless model settings are relaxed. If we relax the criteria, we get those patterns but also many more.
    • Pattern Analysis via Data Model (This Solution)
      – Pattern discovery via OML (no change)
      – Rules/KPIs extracted into a Data Model, allowing for BI/ad hoc analysis
      – Post-processing to set up the analysis context (superset of analysis dimensions/attributes)
      – SQL approach allows
        • New KPIs: KPIs of a statistical nature as well as KPIs related to business needs (as elaborate as needed)
        • Scoring against new data: patterns can degrade in performance
        • Scoring/tracking patterns against specific Trx subsets of interest
        • Ad hoc BI/exploratory data analysis of patterns
        • Special patterns of interest (fraud use cases) with very low support can be found as well as analyzed (what-if)
        • Two independent ways to compute MB KPIs: ETL + DB/BI (faster) or DB View + DB/BI (slower, on demand)
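The "scoring against new data" point can be illustrated by recomputing a rule's KPIs on a fresh set of transactions and comparing them with the original figures. The sketch below is hypothetical throughout: the baskets, the helper `rule_kpis`, and the use of lift (confidence divided by the consequent's base rate) as an added KPI are assumptions for the example.

```python
def rule_kpis(baskets, antecedent, consequent):
    """Return support, confidence and lift of antecedent => consequent."""
    n = len(baskets)
    ante = sum(1 for b in baskets if antecedent <= b)
    cons = sum(1 for b in baskets if consequent in b)
    both = sum(1 for b in baskets if antecedent <= b and consequent in b)
    support = both / n if n else 0.0
    confidence = both / ante if ante else 0.0
    base_rate = cons / n if n else 0.0
    lift = confidence / base_rate if base_rate else 0.0
    return {"support": support, "confidence": confidence, "lift": lift}


# Hypothetical baskets the rule was originally discovered on.
ORIGINAL = [{"p", "q", "c"}, {"p", "q", "c"}, {"x"}, {"y"}]
# Hypothetical newer baskets, scored without rebuilding any model.
NEW_DATA = [{"p", "q", "c"}, {"p", "q"}, {"c"}, {"x"}]

before = rule_kpis(ORIGINAL, {"p", "q"}, "c")
after = rule_kpis(NEW_DATA, {"p", "q"}, "c")
degraded = after["lift"] < before["lift"]
```

In this toy data the rule's lift drops from the original baskets to the new ones, which is the kind of degradation the slide refers to.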
  • 51. Q&A
    • Useful?
    • Very little of ADW/OML (the end goal) is shown currently; SQL Developer is used for most DB actions
    • Need more details on Market Basket Analysis (MBA)? SQL Pattern Matching? A 40-minute talk precludes giving much introduction to the material.