Oracle Advanced Analytics:
insurance claim fraud detection
Oracle Innovation Days 2015, Riga
• Established in November, 2007
• 100+ employees
• Customers in the Nordics, Latvia, Russia and the USA
• Provides systems integration services (CRM, Decision Support Systems)
• Develops original products (Micromiles, Debessmana)
Who we are
• Defining needs
• Collecting data
• Generating and evaluating options
• Selecting the best possible option
• Applying and using
• Getting feedback and following up
The Decision-Making Process Is …
Data Mining is
• the computational process of discovering
patterns in large data sets
• Knowledge Discovery in Databases
What is Data Mining?
Financial Services
- Credit risk analysis
- Cross-LOB up-selling
- Fraud detection
- Retail banking personalization
- “Best customer” prediction & profiling
Retail
- Product recommendations
- Customer segmentation
- Customer profiling
- Market Basket Analysis
Telecommunications
- Churn prevention
- Social network analysis
- Network monitoring
- Customer handling time reduction
Transportation and logistics
- Anticipate bottlenecks
- Proactive resource planning
- Improved preventative maintenance strategies
Data Mining use cases
Cross Industry Standard Process for Data Mining (CRISP-DM)
Business Understanding
• Business Objectives
• Success Criteria
• Project plan
• Deliverables
Data Understanding
• Initial Data Collection
• Data Description
• Data Exploration
Data Preparation
• Data cleaning
• Sampling
• Normalization
• Feature Selection
Modeling
• Select modeling techniques
• Build/train model
• Prediction
Evaluation
• Model validation
• Review results
• Success criteria evaluation
Deployment
• Results visualization
• Report creation
Business Understanding
Fraud detection analysis for insurance claims
(car insurance)
Business Objectives
The goal of this analysis is to create a tool that helps identify fraudulent claims in auto insurance (KASKO)
Deliverables
• Possible fraud prediction
• Descriptive analysis
Data Understanding
Initial Data Collection
250 attributes
404 k claims
4% fraud
Source: Oracle Siebel CRM
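The figures on this slide imply a strongly imbalanced data set. As a rough sketch (the function names are illustrative and not part of the original analysis), the class counts and the accuracy of a trivial "always normal" model can be worked out as follows:

```python
# Figures taken from the slide: 404 k claims, 4% labelled as fraud.
TOTAL_CLAIMS = 404_000
FRAUD_RATE = 0.04


def class_counts(total: int, fraud_rate: float) -> tuple[int, int]:
    """Return (fraud, normal) counts implied by a total and a fraud rate."""
    fraud = round(total * fraud_rate)
    return fraud, total - fraud


def majority_baseline_accuracy(fraud_rate: float) -> float:
    """Accuracy of a trivial model that labels every claim as normal."""
    return 1.0 - fraud_rate


fraud, normal = class_counts(TOTAL_CLAIMS, FRAUD_RATE)
print(f"fraud claims:  {fraud}")
print(f"normal claims: {normal}")
print(f"always-normal baseline accuracy: {majority_baseline_accuracy(FRAUD_RATE):.0%}")
```

With only about 16 thousand fraud cases among 404 thousand claims, a model that never flags anything would already be right 96% of the time, so raw accuracy alone says little about fraud detection quality on data like this.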
Data preprocessing
Activities:
• normalization
• imputing missing data
• attribute selection
• stratified sampling
• 70% training dataset
• 30% test dataset
Final data set
150 of 250 attributes selected
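The preprocessing activities listed above can be sketched in plain Python. This is an illustration under stated assumptions, not the Oracle Data Miner workflow used in the project: the helper names, the mean-imputation choice, the min-max scaling choice and the toy data are all hypothetical. The split shown here preserves the original class proportions; the slides do not say whether the classes were additionally rebalanced.

```python
import random
from collections import defaultdict


def stratified_split(rows, label_of, train_fraction=0.7, seed=0):
    """Split rows into train/test so each class keeps the same proportion."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for row in rows:
        by_class[label_of(row)].append(row)
    train, test = [], []
    for members in by_class.values():
        rng.shuffle(members)  # shuffle within the class before cutting
        cut = round(len(members) * train_fraction)
        train.extend(members[:cut])
        test.extend(members[cut:])
    return train, test


def impute_mean(values):
    """Replace None with the mean of the observed values (one possible choice)."""
    observed = [v for v in values if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in values]


def min_max_normalize(values):
    """Rescale values linearly to the [0, 1] range."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Toy data: 1000 hypothetical claims with a 4% fraud share, as in the slides.
claims = [{"id": i, "fraud": i < 40} for i in range(1000)]
train, test = stratified_split(claims, lambda c: c["fraud"])
print("train size:", len(train), "fraud in train:", sum(c["fraud"] for c in train))
print("test size: ", len(test), "fraud in test: ", sum(c["fraud"] for c in test))
```

Splitting each class separately is what keeps the 4% fraud share roughly the same in both the training and the test set; a plain random split could leave the smaller set with too few fraud cases to evaluate on.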
Data Mining techniques
• Classification
• Clustering
Data mining tools: Oracle Data Miner
Modeling
Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |
– In-database data mining algorithms and open source R algorithms
– SQL, PL/SQL, R languages
– Scalable, parallel in-database execution
– Workflow GUI and IDEs
– Integrated component of Database
– Enables enterprise analytical applications
Key Features
Oracle Advanced Analytics
Fastest Way to Deliver Scalable Enterprise-wide Predictive Analytics
OBIEE
Oracle Database Enterprise Edition
Oracle Advanced Analytics Architecture
Oracle Advanced Analytics
Native SQL Data Mining/Analytic Functions + High-performance R Integration for Scalable, Distributed, Parallel Execution
SQL Developer, Applications, R Client
Function | Algorithms | Applicability
Classification | Logistic Regression (GLM); Decision Trees; Naïve Bayes; Support Vector Machines (SVM) | Classical statistical technique; Popular / Rules / transparency; Embedded app; Wide / narrow data / text
Regression | Linear Regression (GLM); Support Vector Machine (SVM) | Classical statistical technique; Wide / narrow data / text
Anomaly Detection | One Class SVM | Unknown fraud cases or anomalies
Attribute Importance | Minimum Description Length (MDL); Principal Components Analysis (PCA) | Attribute reduction; Reduce data noise
Association Rules | Apriori | Market basket analysis / Next Best Offer
Clustering | Hierarchical k-Means; Hierarchical O-Cluster; Expectation-Maximization Clustering (EM) | Product grouping / Text mining; Gene and protein analysis
Feature Extraction | Nonnegative Matrix Factorization (NMF); Singular Value Decomposition (SVD) | Text analysis / Feature reduction
Oracle Advanced Analytics
In-Database Data Mining Algorithms—SQL & R & GUI Access
• Automated data preprocessing (normalizing, cleaning)
• Workflow-type modeling
• Build several models in parallel
Modeling
Classification modeling using Oracle Data Miner
Models comparison and validation (confusion matrix)
Classification modeling evaluation
Model | Actual value | Predicted Y | Predicted N | Accuracy
SVM | Y | 66% | 34% | 69%
SVM | N | 29% | 71% |
DT | Y | 66% | 34% | 66%
DT | N | 33% | 67% |
GLM | Y | 70% | 30% | 70%
GLM | N | 30% | 70% |
Where
Y – Fraud cases
N – Normal cases
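To make the numbers in the confusion matrix easier to interpret, the sketch below recomputes overall accuracy from the per-class rates. It assumes the rows are normalized by actual class (each row sums to 100%), and the function names are illustrative. The mean of the two per-class rates (68.5%, 66.5% and 70%) lies close to the reported accuracy, which would be consistent with an evaluation set containing roughly equal numbers of fraud and normal cases; the slides do not state this, so treat it as an assumption.

```python
# Per-class rates read from the confusion matrix (rows = actual class).
# recall_fraud  = share of actual fraud cases predicted as fraud
# recall_normal = share of actual normal cases predicted as normal
MODELS = {
    "SVM": {"recall_fraud": 0.66, "recall_normal": 0.71},
    "DT": {"recall_fraud": 0.66, "recall_normal": 0.67},
    "GLM": {"recall_fraud": 0.70, "recall_normal": 0.70},
}


def accuracy(recall_fraud, recall_normal, fraud_share):
    """Overall accuracy for a data set with the given share of fraud cases."""
    return fraud_share * recall_fraud + (1 - fraud_share) * recall_normal


def precision(recall_fraud, recall_normal, fraud_share):
    """Share of flagged claims that are actually fraud."""
    true_pos = fraud_share * recall_fraud
    false_pos = (1 - fraud_share) * (1 - recall_normal)
    return true_pos / (true_pos + false_pos)


for name, r in MODELS.items():
    balanced = accuracy(r["recall_fraud"], r["recall_normal"], fraud_share=0.5)
    at_4_percent = precision(r["recall_fraud"], r["recall_normal"], fraud_share=0.04)
    print(f"{name}: accuracy on a balanced set = {balanced:.1%}, "
          f"precision at 4% fraud share = {at_4_percent:.1%}")
```

If the same per-class rates held on data with the original 4% fraud share, only about 8 to 9% of flagged claims would be actual fraud, which is why such a model is better suited to prioritizing claims for manual review than to automatic rejection.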
Cluster evaluation
% of fraud vs normal cases
The top left quadrant is our goal
Cluster analysis OBIEE dashboard
Fraudulent claims prediction
Output:
- List of possible fraudulent cases
- Probabilities
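As a minimal sketch of how such an output could be produced from model scores, the snippet below ranks claims by predicted fraud probability and keeps those above a cut-off. The `ScoredClaim` type, the claim IDs, the probabilities and the 0.5 threshold are all hypothetical; the slides do not describe the actual scoring or thresholding logic.

```python
from typing import NamedTuple


class ScoredClaim(NamedTuple):
    claim_id: str
    fraud_probability: float  # model output in [0, 1]


def flag_possible_fraud(scored, threshold=0.5, top_n=None):
    """Return claims at or above the threshold, most suspicious first."""
    flagged = [c for c in scored if c.fraud_probability >= threshold]
    flagged.sort(key=lambda c: c.fraud_probability, reverse=True)
    return flagged if top_n is None else flagged[:top_n]


# Hypothetical scores for three claims.
scored = [
    ScoredClaim("C-001", 0.12),
    ScoredClaim("C-002", 0.91),
    ScoredClaim("C-003", 0.57),
]

for claim in flag_possible_fraud(scored):
    print(f"{claim.claim_id}\t{claim.fraud_probability:.2f}")
```

Sorting by probability lets investigators work from the most suspicious claim downwards, and `top_n` can cap the list at whatever the review team can handle.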
Contacts
• Web: www.ideaportriga.lv
• Blog: blog.ideaportriga.lv
• Email: jurijs.jefimovs@ideaportriga.lv
• LinkedIn: lv.linkedin.com/in/jurijsj
Find out more
Q&A
