Data Mining



  1. 1. DATA MINING Dayanand Academy of Management Studies
  2. 2. Contents: 1. Data Mining Introduction; 2. Data Mining Procedures; 3. Data Mining Techniques; 4. Data Mining Applications
  3. 3. Data Mining: Introduction
  4. 4. Introduction: What is Data Mining? • Data mining is the process of extracting meaningful information from data warehouses. The extracted information can be used for maximizing profit, fraud detection, marketing, and scientific research.
  5. 5. Data Warehouses: According to Stanford University, "A Data Warehouse is a repository of integrated information, available for queries and analysis. Data and information are extracted from heterogeneous sources as they are generated. This makes it much easier and more efficient to run queries over data that originally came from different sources."
  6. 6. Data Mining Steps: First Step: Problem Definition. Second Step: Data Gathering. Third Step: Model Building. Fourth Step: Knowledge Deployment.
  7. 7. Data Mining Procedures. Problem Definition: A data mining project starts with understanding the objectives and requirements of the business project. The project must first be specified from a business point of view. It can then be formulated as a data mining problem, and a preliminary plan can be developed. Data Gathering & Preparation: This task involves data collection and exploration. Preparation can include removing unnecessary information, detecting duplicate data, and supplying new information.
  8. 8. Data Mining Procedures. Model Building and Evaluation: In this phase, various modeling techniques are applied to build a model that is likely to satisfy the requirements. The model is then evaluated against the originally stated project goal. Knowledge Deployment: Knowledge deployment is the use of data mining within a target environment. In the deployment phase, insight and actionable information are derived from data.
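
The Data Gathering & Preparation step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the records, the field names, the `UNNECESSARY_FIELDS` set, and the spending threshold are all hypothetical and do not come from the slides.

```python
# Hypothetical customer records; field names and values are made up for illustration.
raw_records = [
    {"id": 1, "name": "Asha", "city": "Kanpur", "session_token": "x1", "spend": 120.0},
    {"id": 2, "name": "Ravi", "city": "Delhi", "session_token": "x2", "spend": 80.0},
    {"id": 1, "name": "Asha", "city": "Kanpur", "session_token": "x9", "spend": 120.0},
]

# Assumption: this field carries no information that the mining task needs.
UNNECESSARY_FIELDS = {"session_token"}


def prepare(records):
    """Drop unneeded fields, skip duplicate records, and add a derived field."""
    seen = set()
    prepared = []
    for record in records:
        # 1. Remove unnecessary information.
        cleaned = {k: v for k, v in record.items() if k not in UNNECESSARY_FIELDS}
        # 2. Detect duplicates: records are duplicates if all remaining fields match.
        key = tuple(sorted(cleaned.items()))
        if key in seen:
            continue
        seen.add(key)
        # 3. Supply new information: a derived flag (the threshold is arbitrary).
        cleaned["high_spender"] = cleaned["spend"] >= 100.0
        prepared.append(cleaned)
    return prepared


prepared_records = prepare(raw_records)
for row in prepared_records:
    print(row)
```

The third raw record differs from the first only in the dropped field, so it is treated as a duplicate once that field is removed.
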
  9. 9. History of Data Mining Techniques. 1950: • Neural Networks • Clustering. 1960's: • Decision Trees. 1980's: • Support Vector Machine. 1999: • Cross Industry Standard Process for Data Mining (CRISP-DM 1.0). 2004: • Java Data Mining Package (JDM 1.0).
  10. 10. Data Mining: Neural Networks
  11. 11. Neural Networks: • Neural networks are non-linear statistical data modeling tools. They can be used to model complex relationships between inputs and outputs or to find patterns in data. Using neural networks as a tool, data warehousing firms extract information from datasets in the process known as data mining. • A neural network is a technique derived from artificial intelligence research that uses generalized regression and provides methods to carry it out. • It is self-adapting and learns from the data it is given.
  12. 12. Processing of Neural Networks: • Input data is presented to the network and propagated forward until it reaches the output layer. In supervised learning, the predicted output is subtracted from the actual output to calculate an error value for the network. • Backpropagation then uses this error to adjust the network's weights. Once backpropagation has finished, the forward process starts again, and this cycle continues until the error between predicted and actual outputs is minimized.
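
The forward pass, error calculation, and weight update cycle described above can be sketched for the simplest possible case: a single sigmoid neuron, where backpropagation reduces to one gradient step per weight. The logical OR training set, the learning rate, and the epoch count are assumptions chosen for illustration, not values from the slides.

```python
import math

# Toy training set for logical OR (illustrative data, not from the slides).
SAMPLES = [((0.0, 0.0), 0.0), ((0.0, 1.0), 1.0), ((1.0, 0.0), 1.0), ((1.0, 1.0), 1.0)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(weights, bias, inputs):
    """Forward pass: weighted sum of the inputs passed through a sigmoid."""
    return sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)


def mean_squared_error(weights, bias):
    return sum((t - forward(weights, bias, x)) ** 2 for x, t in SAMPLES) / len(SAMPLES)


def train(epochs=5000, learning_rate=0.5):
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for inputs, target in SAMPLES:
            predicted = forward(weights, bias, inputs)
            error = target - predicted  # actual output minus predicted output
            # Error signal passed back through the sigmoid (its derivative is p * (1 - p)).
            delta = error * predicted * (1.0 - predicted)
            # Adjust each weight in proportion to its input, then repeat the forward pass.
            weights = [w + learning_rate * delta * x for w, x in zip(weights, inputs)]
            bias += learning_rate * delta
    return weights, bias


initial_error = mean_squared_error([0.0, 0.0], 0.0)
weights, bias = train()
final_error = mean_squared_error(weights, bias)
predictions = [round(forward(weights, bias, x)) for x, _ in SAMPLES]
print(initial_error, final_error, predictions)
```

A network with hidden layers repeats the same idea layer by layer, passing the error signal backwards from the output layer to earlier layers.
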
  13. 13. Data Mining: Clustering
  14. 14. Clustering: Clustering is used to segment the data. Clustering models segment records into groups whose members are similar to each other and distinct from the members of other groups. Typical applications of clustering include online document classification and clustering web log data to discover groups of similar access patterns. Pattern recognition, spatial data analysis, and image processing are other applications in scientific areas.
  15. 15. Clustering
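
As one concrete way to segment records into groups of similar members, here is a minimal k-means sketch. The slides do not name a clustering algorithm, so k-means is an assumed choice, and the points and starting centroids are hypothetical.

```python
import math

# Hypothetical 2-D points forming two visibly separate groups.
POINTS = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]


def nearest(point, centroids):
    """Index of the centroid closest to the point (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: math.dist(point, centroids[i]))


def mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(coords) / len(points) for coords in zip(*points))


def k_means(points, centroids, max_iterations=100):
    centroids = list(centroids)
    for _ in range(max_iterations):
        # Assignment step: attach every point to its nearest centroid.
        labels = [nearest(p, centroids) for p in points]
        # Update step: move each centroid to the mean of its members.
        updated = []
        for i, old in enumerate(centroids):
            members = [p for p, label in zip(points, labels) if label == i]
            updated.append(mean(members) if members else old)
        if updated == centroids:  # centroids stopped moving, so we are done
            break
        centroids = updated
    return labels, centroids


labels, centroids = k_means(POINTS, centroids=[POINTS[0], POINTS[3]])
print(labels, centroids)
```

Seeding the centroids with one point from each visible group keeps the example deterministic; real implementations usually pick the starting centroids randomly or with a seeding heuristic.
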
  16. 16. Data Mining: Decision Trees
  17. 17. Decision Trees: The Decision Tree algorithm is based on conditional probabilities. Decision trees generate rules. A rule is a conditional statement that can easily be understood by humans and easily used within a database to identify a set of records. The Decision Tree algorithm produces accurate and interpretable models with relatively little user intervention. The algorithm can be used for both binary and multi-class classification problems.
  18. 18. Decision Trees: Node 1 shows married persons, and 0 describes single persons. Node 1 has 712 records (cases). Of these, 382 have a target of 0 (not likely to increase spending), and 330 have a target of 1 (likely to increase spending).
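
The counts quoted for node 1 can be turned into the conditional probabilities that the rule for that node is based on. The sketch below uses only the numbers from the slide; the Gini impurity at the end is an assumed choice of impurity measure, since the slides do not say which one the algorithm uses.

```python
# Counts quoted on the slide for node 1.
node_counts = {0: 382, 1: 330}  # target value -> number of records

total = sum(node_counts.values())
# Conditional probability of each target value, given that a record falls in node 1.
probabilities = {target: count / total for target, count in node_counts.items()}
# The node's rule predicts the most probable target value.
predicted_target = max(probabilities, key=probabilities.get)
# Gini impurity is an assumed measure; the slides do not name a split criterion.
gini = 1.0 - sum(p ** 2 for p in probabilities.values())

print(f"records in node: {total}")
for target, p in probabilities.items():
    print(f"P(target={target} | node 1) = {p:.3f}")
print(f"rule prediction: target={predicted_target}, Gini impurity = {gini:.3f}")
```

Because the two targets are almost evenly split, the node's prediction is only weakly supported, which is what an impurity close to its two-class maximum of 0.5 indicates.
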
  19. 19. Data Mining: Support Vector Machines
  20. 20. Support Vector Machine: An optimally defined surface. Linear or non-linear in the input space. Linear in a high-dimensional feature space, which is defined by a kernel function. An SVM fits a hyperplane so that the largest margin is formed between two classes of vectors while minimizing the effects of classification errors, so that records can be classified into groups.
  21. 21. Support Vector Machine
  22. 22. Support Vector Machines Are Used For: classification, regression, and both unsupervised and supervised learning.
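
The idea of fitting a hyperplane with the largest margin while limiting the effect of classification errors is usually written as the soft-margin optimization problem below. The notation is an assumption, since the slides give no formulas: $x_i$ are the training vectors, $y_i \in \{-1, +1\}$ their class labels, $\phi$ the map into the feature space, $\xi_i$ the slack allowed for misclassified points, and $C$ the penalty on that slack.

```latex
\min_{w,\,b,\,\xi}\ \frac{1}{2}\lVert w\rVert^{2} + C\sum_{i=1}^{n}\xi_i
\quad\text{subject to}\quad
y_i\bigl(w^{\top}\phi(x_i) + b\bigr) \ge 1 - \xi_i,\qquad
\xi_i \ge 0,\qquad i = 1,\dots,n
```

The margin between the two classes has width $2/\lVert w\rVert$, so minimizing $\lVert w\rVert^{2}$ maximizes the margin. The kernel function $K(x_i, x_j) = \phi(x_i)^{\top}\phi(x_j)$ lets the feature space be used without computing $\phi$ explicitly.
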
  23. 23. Data Mining: JAVA Data Mining
  24. 24. JAVA Data Mining
  25. 25. Facilities by JDM 1.0 Package
  26. 26. Parallel Processing: Resources: R1, R2, R3, R4. Processors: P1, P2, P3, P4. Output: O1, O2, O3, O4.
  27. 27. Distributed Computing: Processors: P1, P2, P3, P4. HTTP Request. Resources: R1, R2, R3, R4.
  28. 28. Data Mining: Cross Industry Standard Process for Data Mining
  29. 29. CRISP-DM 1.0: Business Understanding, Data Understanding, Data Preparation, Data Modeling, Data Evaluation, Deployment.
  30. 30. Data Mining: Data Mining Applications
  31. 31. Data Mining Applications: Online Searching, Business, Science, Spatial Data Mining, Security, Marketing.
  32. 32. Data Mining Applications. BUSINESS PERSPECTIVE: Data mining helps businesses extract information from resources such as print media, television, the internet, and investments. Data mining tools predict future trends and behavior, allowing businesses to make proactive, knowledge-driven decisions that increase the company's revenue and profit. SCIENTIFIC PERSPECTIVE: This perspective describes how techniques from data mining can be used to address and resolve modern problems in science and engineering domains.
  33. 33. Data Mining Applications. SECURITY PERSPECTIVE: Data mining can prevent or detect fraud, such as a falsely reported geographical domain, and can identify stolen credit cards from transaction history. It can help make online transactions more secure and reliable by analyzing previous transaction records. SPATIAL DATA MINING: Geo-marketing companies segment customers by spatial location by mining purchase and subscription history.
  34. 34. WEBSITE PROMOTION: A website owner can attract more visitors by mining visitor data and then modifying the site's layout on the basis of the extracted information.