ERP Centric Data Mining and KD

Transcript

  • 1. ERP Centric Data Mining and Knowledge Discovery
    Naeem Hashmi, Chief Technology Officer, Information Frameworks
    E-mail: nhashmi@infoframeworks.com Web: http://infoframeworks.com
    Webcast - searchsap.com, September 10, 2002
  • 2. About the Speaker: Naeem Hashmi
    • Founder and CTO of Information Frameworks; an author, speaker and world-renowned expert on emerging Information Architectures, Integration and Business Intelligence Technologies.
    • Author of the best-selling book
      • SAP Business Information Warehouse for SAP, 2000
    • Technical Editor
      • SAP BW Certification Guide, authored by Catherine Roze, 2002
    • Contributing Author, SAP BW Handbook, 2002
    • Member of Intelligent ERP magazine's board of editors and a frequent speaker at IT industry conferences including SAP TechEd, ASUG, Oracle Open World, DCI, The ERP World, Data Mining and the Data Warehouse Institute.
    • 25+ years of experience in emerging Information Technology research, development, and management; Information Architectures; Enterprise Application Integration; e-business; ERP applications; Data Warehousing; Data Mining; CRM; Internet, Object and Client/Server Technologies; and Strategic Consulting.
    • Email: nhashmi@infoframeworks.com URL: http://infoframeworks.com
      • Tel: 603-432-4550
  • 3. Agenda
    • Data Mining and Knowledge Discovery Basics
    • ERP Vendors and Data Mining Solutions
    • Data Mining in SAP Business Information Warehouse
    • Pros and Cons of ERP-centric Data Mining
    • Q&A
  • 4. Agenda
    • Data Mining and Knowledge Discovery Basics
    • ERP Vendors and Data Mining Solutions
    • Data Mining in SAP Business Information Warehouse
    • Pros and Cons of ERP-centric Data Mining
    • Q&A
  • 5. What Is Data Mining and Knowledge Discovery?
    • Data Mining is a tactical process that uses mathematical algorithms to sift through large data stores and extract data patterns, models and rules
    • Knowledge Discovery is the process of identifying and understanding potentially useful hidden anomalies, trends and patterns. Data mining is an integral part of the knowledge discovery process
  • 6. Data Mining and Statistics?
    • Data mining sounds very similar to regression analysis, but its approach and purpose are quite different
      • Statistical methods test a hypothesis against a data set
      • Data mining starts from the data sets and constructs a hypothesis
  • 7. Data Mining - Present State: Application Domains
    Source: http://www.kdnuggets.com/polls/
  • 8. Data Mining Methodologies
    • Business Understanding
    • Data Understanding
    • Data Preparation
    • Modeling
    • Evaluation
    • Deployment
    CRoss Industry Standard Process for Data Mining (CRISP-DM): a six-step process
    Sources: http://www.kdnuggets.com/polls/ and http://www.crisp-dm.org/
  • 9. Data Mining Process
    • Business Understanding
    • Data Understanding
    • Data Preparation
    • Modeling
    • Evaluation
    • Deployment
    CRoss Industry Standard Process (CRISP) for Data Mining
    Data Understanding and Data Preparation (Data Warehouse) will initially take about 60% to 80% of the data mining project time
    Source: http://www.crisp-dm.org/
  • 10. Data Mining - Tools and Data Formats
    Chart labels: Domains; 57% Flat files; 37% Proprietary; 27% DBMS
    Source: http://www.kdnuggets.com/polls/
  • 11. Data Mining Technology
    Techniques:
    • Visualization: Use human pattern recognition capabilities
    • Statistics: Applying statistical techniques to predict
    • Decision Trees: Building scripts based on historic data
    • Association Rules (Rule Induction): Reasoning from specific facts to reach a hypothesis
    • Clustering: Finding and visualizing groups of facts that were not previously known
    • Neural Networks: Learning how to solve problems based on examples
    • K-Nearest Neighbor: Classification by looking at similar data
    • Genetic Algorithms: Survival of the fittest
    • …
    Usage: Discover, Understand, Predict
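To make one of the techniques above concrete, here is a minimal sketch of K-Nearest Neighbor classification ("classification by looking at similar data"). The customer records, the two features (age and monthly spend) and the `knn_predict` helper are all hypothetical and chosen only for illustration; they do not come from the presentation or from any vendor product.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # `train` is a list of (features, label) pairs; distance is Euclidean.
    nearest = sorted(train, key=lambda row: math.dist(row[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical customers: (age, monthly spend) -> whether they left or stayed
train = [
    ((25, 40.0), "leave"),
    ((27, 35.0), "leave"),
    ((30, 45.0), "leave"),
    ((52, 120.0), "stay"),
    ((48, 110.0), "stay"),
    ((55, 130.0), "stay"),
]

print(knn_predict(train, (29, 42.0)))
print(knn_predict(train, (50, 115.0)))
```

In practice the features would usually be scaled first, since a raw Euclidean distance lets the column with the largest numeric range dominate the vote.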
  • 12. Data Mining Models: Two Types of Data Mining Models
    • Prediction Models (Prediction and Classification)
      • Regression algorithms
        • Neural Networks, Rule Induction
        • Predict Numerical Outcome
      • Classification algorithms
        • CHAID, discriminant analysis
        • Predict Symbolic Outcome
    • Descriptive Models (Grouping & Associations)
      • Clustering/Grouping algorithms
        • K-means, Kohonen, Factor Analysis
      • Association algorithms
        • Apriori, Sequence
  • 13. Traditional DM vendors
    • SPSS Clementine
    • SAS Enterprise Miner
    • IBM Intelligent Miner
    • Salford CART/MARTS
    • … more
  • 14. Database Vendors – DM within the Products
    • Data Mining Engine in Oracle 9i
      • Oracle 9i consists of three key products
        • Oracle9i Database, Oracle9i Application Server, Oracle9i Developer Suite
    • IBM Intelligent Miner into DB2
    • TeraMiner into Teradata
    • Microsoft – SQL Server 2000
    • When you implement DM functionality inside a DBMS, you are tied to a specific database engine, which limits flexibility in a typical enterprise application landscape (a heterogeneous environment).
  • 15. Data Mining Standards
    • PMML - Predictive Model Markup Language
    • OLE DB for Data Mining
    • Java Data Mining API
    • Other data exchange standards for analytics that need data mining extensions
      • CWM: Common Warehouse Metamodel
      • XML/A: XML for Analytics
      • CPEX: Customer Profile EXchange
      • xCIL: Extensible Customer Information Language
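As a rough illustration of why a model exchange standard such as PMML matters, the sketch below reads a small PMML-style tree model with Python's standard XML parser and scores a record with it, without knowing which tool produced the model. The document is deliberately simplified and has not been validated against the PMML schema; the field names, the model name and the `score` helper are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Simplified, PMML-style tree model (illustrative only, not schema-validated).
PMML_DOC = """
<PMML version="2.0">
  <DataDictionary numberOfFields="2">
    <DataField name="profession" optype="categorical"/>
    <DataField name="churn" optype="categorical"/>
  </DataDictionary>
  <TreeModel modelName="churn_tree" functionName="classification">
    <MiningSchema>
      <MiningField name="profession"/>
      <MiningField name="churn" usageType="predicted"/>
    </MiningSchema>
    <Node score="stay">
      <True/>
      <Node score="leave">
        <SimplePredicate field="profession" operator="equal" value="LABOURER"/>
      </Node>
    </Node>
  </TreeModel>
</PMML>
"""

def score(node, record):
    """Follow the first child whose equality predicate matches, else use this node's score."""
    for child in node.findall("Node"):
        pred = child.find("SimplePredicate")
        if (pred is not None and pred.get("operator") == "equal"
                and record.get(pred.get("field")) == pred.get("value")):
            return score(child, record)
    return node.get("score")

root = ET.fromstring(PMML_DOC.strip())
tree_root = root.find("TreeModel/Node")

print(score(tree_root, {"profession": "LABOURER"}))
print(score(tree_root, {"profession": "ENGINEER"}))
```

Only the `equal` operator is handled here; a real consumer would need to support the full set of predicates and model types the standard defines.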
  • 16. Agenda
    • Data Mining and Knowledge Discovery Basics
    • ERP Vendors and Data Mining Solutions
    • Data Mining in SAP Business Information Warehouse
    • Pros and Cons of ERP-centric Data Mining
    • Q&A
  • 17. Enterprise Applications Landscape
    • ERP Solutions
      • Oracle
      • PeopleSoft
      • SAP
    • ERP vendors have extended the scope of their applications far beyond traditional ERP functions to a wide array of business solutions such as:
      • Customer Relationships Management
      • Business Intelligence
      • Enterprise Portals
    • Siebel
    • Oracle Business Intelligence Solution
    • PeopleSoft Enterprise Performance Management
    • SAP Business Information Warehouse
  • 18. Oracle Business Intelligence Solution
    • Business Processes (Pre-Built Portlets)
    • Response to Lead (27)
    • Lead to Quote (56)
    • Quote to Order (15)
    • Order to Cash (34)
    • Demand to Build (40)
    • Procure to Pay (28)
    • Revenue to Compensation (29)
    • Expiration to Renewal (33)
    • Issue to Resolution (51)
    • HR Family (43)
    Source: Oracle
    • Oracle 9i DM Integration
    • Oracle Marketing Online for Campaign Management
    • Oracle9iAS Personalization
    • iStore
    • more to come…
    Oracle 9i Business Intelligence components: Oracle9iDS Warehouse Builder, Oracle9iAS Discoverer, Oracle9iDS Reports, Oracle9iAS Portal, Oracle9iAS Clickstream Intelligence, Oracle9iAS Personalization, Oracle9i Data Mining, Oracle9iDS Business Intelligence Beans
  • 19. PeopleSoft Business Intelligence Solution
    • Customer Profitability
    • Finance
    • Workforce Analytics
    • Supply Chain Management Process
    • Workforce Rewards
    • Enrollment Management
    • Retail Merchandise
    • Project Analysis
    • Student Administration
    • Balanced Scorecard
    • Employee Scorecard
    • Customer Scorecard
    • Vendor Scorecard
    Enterprise Performance Management (EPM). Courtesy: eBusiness Advantage Inc. (www.ebizadvan.com)
    • CRM Prospect Analysis
    • CRM Marketing Analysis
    • CRM Sales Effectiveness
    • CRM Service Effectiveness
    Data Mining Capabilities: No word on PeopleSoft data mining tools/technologies for predictive analytics (home grown, acquired or 3rd-party products). No response from PeopleSoft contacts.
  • 20. SAP Business Intelligence Solution
    • SAP CRM
      • Campaign management
      • Opportunity analytics
      • Customer behavior modeling
    • SAP SCM
      • Demand planning
      • Spend optimization
      • SCOR KPIs
    • SAP Financials, Human Capital Management
      • SEM
      • Balanced scorecard
      • Planning
      • Economic profit
      • Benchmarking
      • Employee turnover & retention
      • Corporate investment management
    420+ InfoCubes, 1,700+ Queries (Source: SAP)
    • Closed loop platform capabilities
      • Drill-through (report-report i/f)
      • Remote cubes (read through)
      • Real-time data warehousing
      • Data mining
      • Write back to operational system
    • SAP Portals
      • E-commerce analysis
    • SAP Markets, Procurement
      • Bidding, pattern-based offering
      • Activity reporting, service analytics
    90 ODS Objects Business Information Warehouse
  • 21. CRM Vendors – Data Mining Integration
    • Oracle CRM
      • Pre 9i Darwin
      • Post 9i ODM
    • RightPoint and E.piphany
    • SPSS and Siebel
    • SAP CRM
      • Native Data Mining built in SAP BW - Database Independent
      • Interface between IBM Intelligent Miner and SAP BW
    • PeopleSoft CRM
      • No official data mining product or vendor solution
      • Waiting for their response on what they have
  • 22. Agenda
    • Data Mining and Knowledge Discovery Basics
    • ERP Vendors and Data Mining Solutions
    • Data Mining in SAP Business Information Warehouse
    • Pros and Cons of ERP-centric Data Mining
    • Q&A
  • 23. SAP BW 3.0b Data Mining Implementation
    • Currently for Customer Subject Area
    • Algorithms Supported
      • Decision Trees
      • Scoring
      • Clustering/Segmentation
      • Association
    • Data Mining process
      • Model definition
      • Training the model
      • Performing prediction using the training results
      • Uploading the results back into BW
      • Utilizing the mining results (on the operational side)
      • SAPGUI is the Interface to the Data Mining modeling and analysis
    No Extensive Data Staging
  • 24. Modeling a Decision Tree: Create a mining model (Source: SAP)
    • Model columns
    • Specifying the column parameters
    • Specifying the values in case the original values in the column are to be treated differently
    • Indicating the prediction column
    • Indicating the key column
    • The nature of the column content
    • Data type of the column
  • 25. Modeling a Decision Tree: Specify Model Parameters (Source: SAP)
    • Use a portion (%) of the data for training or the whole data set for training
    • Size of the window (such as 10%)
    • The number of repeats with different samples
    • Stop training when the number of cases under the given node is less than or equal to the specified value
    • Stop training when the accuracy is greater than or equal to the expected accuracy
    • If the tree is too big, prune the tree without violating the expected accuracy
    • Use the information gain threshold to check the relevance
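The stopping parameters on this slide can be illustrated with a small, generic decision tree trainer. This is a sketch under stated assumptions, not SAP BW's implementation: the parameter names `min_cases`, `target_accuracy` and `min_gain` are invented here to mirror the minimum-cases, expected-accuracy and information-gain-threshold settings, the customer rows are hypothetical, and window sampling and pruning are left out.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def information_gain(rows, labels, column):
    """Entropy before splitting on `column` minus the weighted entropy after."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[column], []).append(label)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

def train_tree(rows, labels, columns, min_cases=2, target_accuracy=0.95, min_gain=0.01):
    majority, count = Counter(labels).most_common(1)[0]
    leaf = {"predict": majority, "cases": len(labels), "accuracy": count / len(labels)}
    # Stop when the node is small enough or already accurate enough.
    if len(labels) <= min_cases or leaf["accuracy"] >= target_accuracy or not columns:
        return leaf
    best = max(columns, key=lambda c: information_gain(rows, labels, c))
    # Information gain threshold: do not split on a column that is not relevant enough.
    if information_gain(rows, labels, best) < min_gain:
        return leaf
    remaining = [c for c in columns if c != best]
    children = {}
    for value in {row[best] for row in rows}:
        subset = [(r, l) for r, l in zip(rows, labels) if r[best] == value]
        children[value] = train_tree(
            [r for r, _ in subset], [l for _, l in subset],
            remaining, min_cases, target_accuracy, min_gain,
        )
    return {"split": best, "children": children}

def predict(tree, row):
    # Walk down until a leaf is reached; unseen values raise KeyError in this sketch.
    while "split" in tree:
        tree = tree["children"][row[tree["split"]]]
    return tree["predict"]

# Hypothetical training data: did the customer leave or stay?
rows = [
    {"profession": "LABOURER", "region": "north"},
    {"profession": "LABOURER", "region": "south"},
    {"profession": "LABOURER", "region": "north"},
    {"profession": "ENGINEER", "region": "north"},
    {"profession": "ENGINEER", "region": "south"},
    {"profession": "ENGINEER", "region": "south"},
]
labels = ["leave", "leave", "leave", "stay", "stay", "stay"]

tree = train_tree(rows, labels, ["profession", "region"])
print(tree["split"])
print(predict(tree, {"profession": "LABOURER", "region": "south"}))
```

Raising `min_cases` or lowering `target_accuracy` makes the tree stop earlier and stay smaller, which is the same trade-off the parameter screen exposes.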
  • 26. Modeling a Decision Tree: Create a training source and map the model columns (Source: SAP)
    • BW Query
    • Runtime parameters for query
    • Model columns
    • Selected source columns
    • Mapping between model column and source column
  • 27. SAP BW Data Mining – Process Steps (Source: SAP)
    • Create a mining model
    • Train the model
    • Predictions using Training results
    • Using the data mining results against BW Query
  • 28. Viewing Decision Tree Training Results (Source: SAP)
    • This decision tree predicts whether the customer has left or is still “on board”
    • The chance of a customer leaving is 70.7% if the profession is “LABOURER”
    • Chart shows the distribution at the selected node
      • 28/41 customers are likely to leave
      • 13/41 customers are likely to stay
    • Out of a total of 705 cases, 41 cases are covered under this node
  • 29. Data Mining – Decision Trees: results uploaded into BW, then BEX for further analysis (Source: SAP)
  • 30. Data Mining – Association (Source: SAP)
    • Create an Association model
    • Define Model Columns
    • Train the model
    • Predictions using Training results
    • Using the data mining results against BW Query
  • 31. Data Mining – Association Source: SAP
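For readers unfamiliar with association models, the sketch below shows the two measures such models are built on, support and confidence, computed for item pairs only. It is not the full Apriori algorithm and not SAP BW's implementation; the baskets, thresholds and the `pair_rules` helper are hypothetical.

```python
from collections import Counter
from itertools import combinations

def pair_rules(baskets, min_support=0.4, min_confidence=0.7):
    """Return (lhs, rhs, support, confidence) rules for item pairs above both thresholds."""
    n = len(baskets)
    item_counts = Counter(item for b in baskets for item in set(b))
    pair_counts = Counter(
        pair for b in baskets for pair in combinations(sorted(set(b)), 2)
    )
    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n  # share of all baskets that contain both items
        if support < min_support:
            continue
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_counts[lhs]  # share of lhs baskets that also contain rhs
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return sorted(rules)

# Hypothetical purchase baskets
baskets = [
    ["printer", "paper", "toner"],
    ["printer", "toner"],
    ["paper", "pens"],
    ["printer", "paper", "toner"],
    ["pens"],
]

rules = pair_rules(baskets)
for lhs, rhs, support, confidence in rules:
    print(f"{lhs} -> {rhs}: support={support:.2f}, confidence={confidence:.2f}")
```

With these baskets only the printer/toner pair clears both thresholds, which is the kind of rule a cross-selling analysis would pick up.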
  • 32. Data Mining – Cluster Analysis (Source: SAP)
    • Create a Cluster model
    • Train the model
    • Predictions using Training results
    • Using the data mining results against BW Query
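As a generic illustration of what training a cluster model involves, here is a minimal k-means sketch, k-means being one of the clustering algorithms named earlier in the deck. The slides do not say which algorithm SAP BW uses, so this is an assumption for illustration only; the points and starting centroids are hypothetical, and the starting centroids are passed in explicitly to keep the run deterministic.

```python
import math

def kmeans(points, centroids, iterations=10):
    """Alternate between assigning points to the nearest centroid and re-centering."""
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Move each centroid to the mean of its members; keep it if the cluster is empty.
        centroids = [
            tuple(sum(axis) / len(members) for axis in zip(*members)) if members else c
            for members, c in zip(clusters, centroids)
        ]
    return centroids, clusters

# Hypothetical customers described by two numeric attributes
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 9.0), (8.5, 9.5)]

centroids, clusters = kmeans(points, centroids=[(1.0, 1.0), (9.0, 9.0)])
print(centroids)
print([len(c) for c in clusters])
```

Each resulting cluster is a customer segment; the segment label per customer is the kind of result that could then be loaded back into the warehouse for further analysis.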
  • 33. Viewing Cluster Analysis Results (Source: SAP)
  • 34. Viewing Cluster Analysis Results: uploaded into BW, then BEX for further analysis (Source: SAP)
  • 35. SAP Data Mining
    • Good attempt to implement a few Data Mining algorithms
    • Very traditional Data Mining approach
    • Requires a well-versed statistician or data mining expert to model and interpret the results
    • Source: BEX Query – a big limitation for DM
    • Weak visualization
    • BEX for additional discovery (slicing and dicing)
  • 36. SAP BW - IBM Intelligent Miner
    IBM Intelligent Miner is designed to:
    • Copy data from SAP BW to IBM Intelligent Miner
      • Results of reports in BW – Modeling in Business Explorer Analyzer
      • Data direct from InfoCubes (for cross-selling analysis)
      • Descriptions, hierarchies
    • Load results data from IBM IM back into SAP BW
      • Results of segmentation can be loaded as master data or hierarchies
    • Data transport is designed through Wizards in SAP BW
      • Possible to get a good view of Intelligent Miner results from SAP BW
  • 37. Agenda
    • Data Mining and Knowledge Discovery Basics
    • ERP Vendors and Data Mining Solutions
    • Data Mining in SAP Business Information Warehouse
    • Pros and Cons of ERP-centric Data Mining
    • Q&A
  • 38. ERPs and Data Mining: Good and the Bad News
    • Good News
      • Known Business Processes
      • Few data Sources
      • Improved Data Quality
      • Metadata Integration
      • Near real-time data mining
      • Closed-loop Knowledge Discovery
      • Consistent Infrastructure
    • Bad News
      • Complex Data Structures
      • Performance
      • Availability
      • Very few Data Mining algorithms - Today
    • Business Understanding
    • Data Understanding
    • Data Preparation
    • Modeling
    • Evaluation
    • Deployment
    CRISP-DM
  • 39. Data Mining Process and ERP Data Mining
    • Business Understanding
    • Data Understanding
    • Data Preparation
    • Modeling
    • Evaluation
    • Deployment
    Good News for Future Business Applications: will reduce data mining project time by up to 50% (diagram labels: Business Understanding, Data Understanding, Data Preparation, Deployment)
    Source: http://www.crisp-dm.org/
  • 40. Agenda
    • Data Mining and Knowledge Discovery Basics
    • ERP Vendors and Data Mining Solutions
    • Data Mining in SAP Business Information Warehouse
    • Pros and Cons of ERP-centric Data Mining
    • Q&A
  • 41. INFORMATION FRAMEWORKS Technology/Solution Assessment Product Strategy Solution Strategy Product Positioning Competitive Analysis Software product architecture Marketing Strategy Product Performance and Benchmarking Consulting Hardware Configuration Market Research Market Assessment Competitive Analysis Technology due Seminars Webinars Keynotes Panel Moderator Publications Hands-on training Conferences Executive and Senior IT Management Consulting Enterprise Information Architectures (EIA) Business Case Development Information Architecture Application Deployment Architectures implementation Legacy Application Migration Strategies ERP Application deployment strategies Enterprise Applications Integration (EAI) Architectures, Service Modeling and design, EAI technology assessment Tools and Technology Assessment Vendor Selection and Assessment Conference Room Pilot implementation Business Intelligence and Portals Architectures, Methodologies Tool/technology/Vendor assessment and selection Data Warehouse, Data Marts, Analytics, Information Delivery Deployment Architectures Business Intelligence and eBusiness Integration architectures Portals Strategies, Business case, Assessment, Architectures, Modeling, Planning and knowledge Transfer KNOWLEDGE TRANSFER INFORMATION TECHNOLOGY ORGANIZATION SOFTWARE AND SOLUTION VENDORS INFORMATION TECHNOLOGY INVESTORS http://infoframeworks.com
  • 42. Questions
    Naeem Hashmi, Chief Technology Officer, September 10, 2002
    Email: nhashmi@infoframeworks.com Web Site: http://infoframeworks.com Tel: 603-432-4550