Data mining (prefinals)

Upload Details

Uploaded as Microsoft PowerPoint

Usage Rights

© All Rights Reserved

Data mining (prefinals) Presentation Transcript

  • Data Mining ADDBASE
  • What is data mining? The process of extracting valid, previously unknown, comprehensible, and actionable information from large databases and using it to make crucial business decisions. It starts by developing a representation of sample data, which is then extended to larger sets of data, working on the premise that the larger data has a structure similar to the
  • Data mining Applications It is almost applicable in all areas whether it is for business or for science. Provides different purpose and benefits depending where this technique is applied.
  • Data mining Applications: Retail/Marketing. Identifying buying patterns of customers. Finding associations among customer demographic characteristics. Predicting response to mailing campaigns. Market basket analysis.
  • Data mining Applications: Banking. Detecting patterns of fraudulent credit card use. Identifying loyal customers. Predicting customers likely to change their credit card affiliation. Determining credit card spending by customer groups.
  • Data mining Applications: Insurance. Claims analysis. Predicting which customers will buy new policies. Medicine. Characterizing patient behavior to predict surgery visits. Identifying successful medical therapies for different illnesses.
  • Data mining Operations: The 4 main operations of data mining are predictive modeling, database segmentation, link analysis, and deviation detection.
  • Data mining Operations Predictive modeling  Based observations to form a model of the important characteristics of some phenomenon. Database segmentation  Is about partitioning of database into an unknown number of segments or clusters of similar records.
  • Data mining Operations Link analysis  Based on links called associations between the individual records and set of records in a database. Deviation detection  Newest data mining operation  Often a source of true discovery because it identifies outliers which express deviation.
  • Data mining Process Cross-IndustryStandard Process for Data Mining (CRISP-DM)  Specifies a data of data mining process model that is not specific to any industry tool.  Involved from unknown knowledge discovery processes used widely in industry and in direct response to user requirements.
  • Data mining Process (cont…) Major objectives of this specification are to make large data mining projects run more efficiently as well as to make them cheaper, more reliable and more manageable. A hierarchy process model
  • Data mining Process (cont…) The process is divided into 6 different generic phases ranging from business understanding to deployment of project result. The phases of CRISP-DM model are:  Business understanding  Data understanding  Data preparation  Modeling
  • Data mining Process (cont…)  Evaluation  Deployment Business understanding  This phase is focuses on understanding the project objectives and requirements from the business point of view. Data understanding  This phase includes task for initial collection of the data and is concerned with establishing the main characteristics
  • Data mining Process (cont…) Data preparation  This phase involves all the activities for constructing the final data set on which modeling tools can be applied directly. Modeling  This phase is the actual data mining operation and involves selecting modeling techniques, selecting modeling parameters and assessing the model created.
  • Data mining Process (cont…) Evaluation  This phase validates the model from the data analysis point of view.  The model and the steps in modeling are verified within the context of achieving the business goals. Deployment  This phase is all about generating report or as complex as implementing repeatable data mining processing across the enterprise.