A review on data mining

897
-1

Published on

this presentation provides information for data mining, process, methods,applications and future scope.

Published in: Technology
0 Comments
2 Likes
Statistics
Notes
  • Be the first to comment

No Downloads
Views
Total Views
897
On Slideshare
0
From Embeds
0
Number of Embeds
0
Actions
Shares
0
Downloads
48
Comments
0
Likes
2
Embeds 0
No embeds

No notes for slide

A review on data mining

  1. 1. A REVIEW ON DATA MINING1 Er.Nancy, perusing M-tech in CSE (2012-14), GNDEC, Ludhiana, India.
  2. 2. ABSTRACT Data Mining is an approach to discover or extract knowledge from Databases. Generally to store databases, enterprises make data warehouses and data marts. Data warehouses and data marts contain large mountains of data. The mountains represent valuable resource to the enterprise. Due to extracting knowledge from large data warehouses or depositories, data mining plays great role in various fields of machine learning. Data mining is used in education, medical, scientific, business fields. Various algorithms and programs are 2 used for data mining approach.[1]
  3. 3. INTRODUCTION Now the world of information technology, all things are become automated. Information technology is used in every field of human life such as business, engineering, medical, mathematical, scientific. All these fields of human life have lead to the large volume of data storage in various formats such as records, files, documents, images, sounds, recordings, videos and many new data. Collection of related data is also known as database. To extract correct data from large databases, the proper mechanism is used, that mechanism is also known as DATA MINING. It is a knowledge discovery technique (KDD).[2] 3
  4. 4. PROCESS (1) Selection (2) Pre-processing (3) Transformation (4) Data Mining (5) Interpretation/Evaluation.[2,3] 4
  5. 5. 5
  6. 6. PROCEDURE OF DATA MINING 1. Data cleaning: It is also known as data cleansing; in this phase noise data and irrelevant data are removed from the collection. 2. Data integration: In this stage, multiple data sources, often heterogeneous, are combined in a common source. 3. Data selection: The data relevant to the analysis is decided on and retrieved from the data collection. 4. Data transformation: It is also known as data consolidation; in this phase the selected data is transformed into forms appropriate for the mining procedure. 6
  7. 7. CONTD……. 5. Data mining: It is the crucial step in which clever techniques are applied to extract potentially useful patterns. 6. Pattern evaluation: In this step, interesting patterns representing knowledge are identified based on given measures. 7. Knowledge representation: It is the final phase in which the discovered knowledge is visually presented to the user. This essential step uses visualization techniques to help users understand and interpret the data mining results.[4,2] 7
  8. 8. DATA MINING LIFE CYCLE The Cross Industry Standard Process for Data Mining (CRISP-DM) which defines six phases: (1) Business Understanding (2) Data Understanding (3) Data Preparation (4) Modeling (5) Evaluation (6) Deployment.[6] 8
  9. 9. CONTD…… 1. Business Understanding: This phase focuses on understanding the project objectives and requirements from a business perspective, then converting this knowledge into a data mining problem definition and a preliminary plan designed to achieve the objectives. 2. Data Understanding: It starts with an initial data collection, to get familiar with the data, to identify data quality problems, to discover first insights into the data or to detect interesting subsets to form hypotheses for hidden information. 3. Data Preparation: It covers all activities to construct the final dataset from the initial raw data. 9
  10. 10. CONTD…… 4. Modeling: In this phase, various modeling techniques are selected and applied and their parameters are calibrated to optimal values. 5. Evaluation: In this stage the model is thoroughly evaluated and reviewed. The steps executed to construct the model to be certain it properly achieves the business objectives. 6. Deployment: The purpose of the model is to increase knowledge of the data, the knowledge gained will need to be organized and presented in a way that the 10 customer can use it.[6,8]
  11. 11. TYPES OF DATA MINING SYSTEM Classification of data mining systems according to the type of data source mined: This classification is according to the type of data handled such as spatial data, multimedia Data, time-series data, text data, World Wide Web, etc. Classification of data mining systems according to the data model: This classification based on the data model involved such as relational database, object-oriented database, Data warehouse, transactional database, etc. Classification of data mining systems according to the kind of knowledge discovered: This classification based on the kind of knowledge discovered or data mining functionalities, such as characterization, discrimination, association, classification, clustering, etc 11
  12. 12. CONTD…… Classification of data mining systems according to mining techniques used: This classification is according to the data analysis approach used such as machine learning, Neural networks, genetic algorithms, statistics, visualization, database oriented or data warehouse-oriented, etc.[5,2] 12
  13. 13. DATA MINING METHODS On-Line Analytical  Factor analysis Processing (OLAP)  Neural Networks Classification  Regression analysis Association Rule Mining  Structured data analysis Temporal Data Mining  Sequence mining Time Series Analysis  Text mining Spatial Mining,  Drug Discovery Anomaly/outlier/change  Exploratory data analysis detection  Predictive analytics Association rule learning  Web Mining Cluster analysis  Data analysis etc.[1,8] 13 Decision trees
  14. 14. DATA MINING APPLICATIONS Games Business Science and Engineering Human rights Spatial data mining.[5] 14
  15. 15. CHALLENGES Sensor data mining Pattern mining Visual data mining Subject based data mining Music data mining The Digital Library retrieves.[1,2,5] 15
  16. 16. CONCLUSION In this paper I briefly reviewed data mining, background of data mining, methods, life cycle model, and types of data mining, application of it. This review will be helpful to you to easily understand what data mining is, previously used data mining techniques. Now days use data mining techniques and also process of data mining. In this I completely, explain applications and types of data mining. [1,2,4,5] 16
  17. 17. FUTURE SCOPE Data mining is a very wide concept. It contains many concepts such as various algorithms to extract knowledge from large databases. Soft Computing techniques like Fuzzy logic, Neural Networks and Genetic Programming which contains Complex data objects Includes high dimensional, high speed data streams, sequence, noise in the time series, graph, Multi instance objects, Multi represented objects and temporal data etc. These are used in Business, Web, Medical diagnosis, Scientific and Research analysis fields (bio, remote sensing etc…), Social networking etc.[1,7] 17
  18. 18. REFERENCES 1. WIKIPEDIA.Com 2. Mr. S. P. Deshpande1 and Dr. V. M. Thakar International Journal of Distributed and Parallel systems (IJDPS) Vol.1, No.1, September 2010 DOI. 3. A Review on Data mining from Past to the Future. Venkatadri.M Research Scholar, Dept. of Computer Science, Dravidian University, India. Lokanatha C. Reddy Professor, Dept. of Computer Science, International Journal of Computer Applications (0975 – 8887) Volume 15– No.7, February 2011. 18
  19. 19. CONTD….. 4. Data Mining for High Performance Data Cloud using Association Rule Mining.1 T.V.Mahendra 2N.Deepika 3N.Keasava Rao Professor & HOD, IT, Narayana Engg. College, Nellore, AP, India.2 Sr.Assistant Professor, Dept. of ISE, New Horizon College of Engineering, Bangalore, India 3 Associate Professor, IT, Narayana Engg. College, Gudur, AP, India. 5. Data Mining and KDD: A Shifting Mosaic By Joseph M. Firestone, Ph.D. White Paper No. Two March 12, 1997 the Idea. 19
  20. 20. CONTD…. 6. Data mining and static’s: what’s the connection? Jerome h. Friedman. 7.www.Google.Com. 8.www. IEEEXPLORE.Com. 20
  21. 21. THANKS21

×