
IPR Oracle Innovation Days 2015



  1. Oracle Advanced Analytics: insurance claim fraud detection. Oracle Innovation Days 2015, Riga
  2. Who we are • Established in November 2007 • 100+ employees • Customers in the Nordics, Latvia, Russia and the USA • Provides systems integration services (CRM, Decision Support Systems) • Develops original products (Micromiles, Debessmana)
  3. The decision-making process is … • Defining needs • Collecting data • Generating and evaluating options • Selecting the best possible option • Applying and using • Getting feedback and following up
  4. What is Data Mining? Data mining is • the computational process of discovering patterns in large data sets • also referred to as Knowledge Discovery in Databases (KDD)
  5. Data Mining use cases
     Financial Services: credit risk analysis; cross-LOB up-selling; fraud detection; retail banking personalization; “best customer” prediction and profiling
     Retail: product recommendations; customer segmentation; customer profiling; market basket analysis
     Telecommunications: churn prevention; social network analysis; network monitoring; customer handling time reduction
     Transportation and logistics: anticipating bottlenecks; proactive resource planning; improved preventative maintenance strategies
  6. Cross Industry Standard Process for Data Mining (CRISP-DM)
     Business Understanding: business objectives; success criteria; project plan; deliverables
     Data Understanding: initial data collection; data description; data exploration
     Data Preparation: data cleaning; sampling; normalization; feature selection
     Modeling: select modeling techniques; build/train model; prediction
     Evaluation: model validation; review results; success criteria evaluation
     Deployment: results visualization; report creation
  7. Business Understanding: fraud detection analysis for insurance claims (car insurance). Business objectives: the goal of this analysis is to create a tool that helps identify fraudulent claims in auto insurance (KASKO). Deliverables: • Possible fraud prediction • Descriptive analysis
  8. Data Understanding. Initial data collection: 250 attributes; 404 k claims; 4% fraud (chart: fraud vs. normal claims). Source: Oracle Siebel CRM
  9. Data preprocessing. Activities: • normalization • imputing missing data • attribute selection • stratified sampling • 70% training dataset • 30% test dataset. Final data set: 150 of 250 attributes selected
  10. Modeling. Data mining techniques: • Classification • Clustering. Data mining tool: Oracle Data Miner
  11. Oracle Advanced Analytics: Fastest Way to Deliver Scalable Enterprise-wide Predictive Analytics. Key features: • In-database data mining algorithms and open source R algorithms • SQL, PL/SQL, R languages • Scalable, parallel in-database execution • Workflow GUI and IDEs • Integrated component of Database • Enables enterprise analytical applications. (Copyright © 2014 Oracle and/or its affiliates. All rights reserved.)
  12. Oracle Advanced Analytics Architecture. Components shown: Oracle Database Enterprise Edition; Oracle Advanced Analytics (native SQL data mining/analytic functions + high-performance R integration for scalable, distributed, parallel execution); SQL Developer; R Client; OBIEE; Applications
  13. Oracle Advanced Analytics: In-Database Data Mining Algorithms (SQL, R and GUI access)
      Classification. Algorithms: Logistic Regression (GLM); Decision Trees; Naïve Bayes; Support Vector Machines (SVM). Applicability: classical statistical technique; popular / rules / transparency; embedded app; wide / narrow data / text
      Regression. Algorithms: Linear Regression (GLM); Support Vector Machine (SVM). Applicability: classical statistical technique; wide / narrow data / text
      Anomaly Detection. Algorithms: One Class SVM. Applicability: unknown fraud cases or anomalies
      Attribute Importance. Algorithms: Minimum Description Length (MDL); Principal Components Analysis (PCA). Applicability: attribute reduction; reduce data noise
      Association Rules. Algorithms: Apriori. Applicability: market basket analysis / Next Best Offer
      Clustering. Algorithms: Hierarchical k-Means; Hierarchical O-Cluster; Expectation-Maximization Clustering (EM). Applicability: product grouping / text mining; gene and protein analysis
      Feature Extraction. Algorithms: Nonnegative Matrix Factorization (NMF); Singular Value Decomposition (SVD). Applicability: text analysis / feature reduction
  14. Modeling: classification modeling using Oracle Data Miner • Automated data preprocessing (normalizing, cleaning) • Workflow-type modeling • Several models built in parallel
  15. Classification modeling evaluation: model comparison and validation (confusion matrix)
      Model  Actual  Predicted Y  Predicted N  Accuracy
      SVM    Y       66%          34%          69%
             N       29%          71%
      DT     Y       66%          34%          66%
             N       33%          67%
      GLM    Y       70%          30%          70%
             N       30%          70%
      Where Y = fraud cases, N = normal cases
  16. Cluster evaluation: % of fraud vs. normal cases. The top-left quadrant is our goal
  17. Cluster analysis: OBIEE dashboard
  18. Fraudulent claims prediction. Output: • List of possible fraudulent cases • Probabilities
  19. Find out more. Contacts: • Web: • Blog: • Email: • LinkedIn:
  20. Q&A
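
The data preprocessing and classification evaluation slides describe a stratified 70%/30% train/test split and a confusion-matrix comparison of models. Those two steps can be sketched in plain Python as below. This is a minimal illustration, not the presenters' Oracle Data Miner workflow: the function names `stratified_split`, `confusion_rates` and `balanced_accuracy`, the synthetic claims (4% labeled fraud, matching the fraud rate quoted in the deck), and the reading of "accuracy" as the mean of the per-class rates are all assumptions.

```python
import random
from collections import defaultdict


def stratified_split(records, label_of, train_fraction=0.7, seed=0):
    """Split records into (train, test), keeping class proportions in both parts."""
    by_class = defaultdict(list)
    for rec in records:
        by_class[label_of(rec)].append(rec)

    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train, test = [], []
    for label in sorted(by_class):
        group = by_class[label]
        rng.shuffle(group)
        cut = round(len(group) * train_fraction)
        train.extend(group[:cut])
        test.extend(group[cut:])
    return train, test


def confusion_rates(actual, predicted, classes=("Y", "N")):
    """Row-normalised confusion matrix.

    rates[a][p] is the share of cases with actual class a predicted as p.
    """
    counts = {a: {p: 0 for p in classes} for a in classes}
    for a, p in zip(actual, predicted):
        counts[a][p] += 1

    rates = {}
    for a in classes:
        total = sum(counts[a].values())
        rates[a] = {p: (counts[a][p] / total if total else 0.0) for p in classes}
    return rates


def balanced_accuracy(rates):
    """Mean of the diagonal of a row-normalised confusion matrix (assumed metric)."""
    return sum(rates[c][c] for c in rates) / len(rates)


if __name__ == "__main__":
    # Synthetic claims: 40 of 1000 (4%) are labeled fraud ("Y"). Hypothetical data.
    claims = [{"id": i, "fraud": "Y" if i < 40 else "N"} for i in range(1000)]
    train, test = stratified_split(claims, lambda c: c["fraud"])
    print(len(train), len(test))

    # Rates reported in the deck for the SVM model, used here only as input.
    svm = {"Y": {"Y": 0.66, "N": 0.34}, "N": {"Y": 0.29, "N": 0.71}}
    print(balanced_accuracy(svm))
```

For the SVM rates in the deck (66% for Y, 71% for N), the mean of the diagonal is 68.5%, which is close to the 69% reported. The slides do not state how accuracy was actually computed, so this reading remains an assumption.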