Introduction to data mining.
Data mining is the practice of examining large pre-existing databases in order to generate new information.
Samrat Tayade, TE IT - ARMIET COLLEGE.
3. Knowledge of databases
• Database: A database is an organized collection of data, generally
stored and accessed electronically from a computer system.
Where databases are more complex, they are often developed
using formal design and modeling techniques.
• Database Management System (DBMS): software used to
– add, remove, and update records
– retrieve data that match certain criteria
– cross-reference data in different tables
– perform complex aggregate calculations
• A database table consists of columns (attributes) and rows (records).
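The DBMS operations listed above can be sketched with Python's built-in sqlite3 module. This is a minimal illustration, not a recommended schema: the table names (customers, orders), the column names, and the sample rows are all assumptions made up for the example.

```python
import sqlite3

def demo_dbms():
    # In-memory database, so nothing is written to disk.
    conn = sqlite3.connect(":memory:")
    cur = conn.cursor()

    # Hypothetical tables: columns are attributes, rows are records.
    cur.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
    cur.execute(
        "CREATE TABLE orders ("
        "id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL)"
    )

    # Add records.
    cur.executemany(
        "INSERT INTO customers (id, name) VALUES (?, ?)",
        [(1, "Asha"), (2, "Ravi"), (3, "Meera")],
    )
    cur.executemany(
        "INSERT INTO orders (customer_id, amount) VALUES (?, ?)",
        [(1, 250.0), (1, 100.0), (2, 75.0), (3, 40.0)],
    )

    # Update a record.
    cur.execute("UPDATE orders SET amount = 80.0 WHERE customer_id = 2")

    # Remove records.
    cur.execute("DELETE FROM orders WHERE customer_id = 3")
    cur.execute("DELETE FROM customers WHERE id = 3")

    # Retrieve data that match a criterion.
    big_orders = cur.execute(
        "SELECT id, amount FROM orders WHERE amount > 90 ORDER BY id"
    ).fetchall()

    # Cross-reference two tables (JOIN) and aggregate (SUM per customer).
    totals = cur.execute(
        "SELECT c.name, SUM(o.amount) "
        "FROM customers c JOIN orders o ON o.customer_id = c.id "
        "GROUP BY c.name ORDER BY c.name"
    ).fetchall()

    conn.close()
    return big_orders, totals
```

Calling demo_dbms() returns the orders above the threshold and the per-customer totals, so each bullet in the list maps to one SQL statement.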
4. Data warehousing
Data warehousing is the process of constructing and using a data
warehouse. A data warehouse is constructed by integrating data from
multiple heterogeneous sources, and it supports analytical reporting,
structured and/or ad hoc queries, and decision making. Data warehousing
involves data cleaning, data integration, and data consolidation.
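The three steps named above (cleaning, integration, consolidation) can be sketched in plain Python. This is a toy illustration under stated assumptions: the two sources (store_rows, web_rows), their layouts, and the cleaning rules are hypothetical, and a real warehouse would use dedicated ETL tooling.

```python
from collections import defaultdict

def clean(name, amount):
    """Data cleaning: normalize the name, parse the amount, reject bad rows."""
    if name is None or amount is None:
        return None
    name = str(name).strip().lower()
    if not name:
        return None
    try:
        amount = float(amount)
    except ValueError:
        return None  # e.g. "n/a" cannot be parsed as a number
    return name, amount

def build_warehouse(store_rows, web_rows):
    # Data integration: map both source layouts onto one (name, amount) schema.
    unified = [(r.get("customer"), r.get("sales")) for r in store_rows]
    unified += [(name, amount) for name, amount in web_rows]

    # Data consolidation: one total per customer across all sources.
    totals = defaultdict(float)
    for name, amount in unified:
        cleaned = clean(name, amount)
        if cleaned is not None:
            totals[cleaned[0]] += cleaned[1]
    return dict(totals)

# Two heterogeneous (hypothetical) sources: dicts with string amounts,
# and tuples with numeric amounts.
store_rows = [
    {"customer": " Asha ", "sales": "250"},
    {"customer": "Ravi", "sales": "n/a"},
]
web_rows = [("asha", 100.0), ("ravi", 75.0), (None, 10.0)]

warehouse = build_warehouse(store_rows, web_rows)
```

Rows with a missing name or an unparseable amount are dropped during cleaning, so only valid records reach the consolidated totals.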
5. OLAP
Online Analytical Processing Server (OLAP) is based on the
multidimensional data model. It allows managers and analysts to gain
insight into information through fast, consistent, and interactive
access.
TYPES OF OLAP SERVERS:
1. Relational OLAP (ROLAP)
2. Multidimensional OLAP (MOLAP)
3. Hybrid OLAP (HOLAP)
4. Specialized SQL Servers
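The multidimensional data model behind OLAP can be pictured as a cube whose cells are indexed by dimension values. The sketch below is an assumption-laden toy, not how any OLAP server stores data: the dimensions (region, product, quarter), the sales figures, and the helper names roll_up and slice_cube are invented for illustration.

```python
from collections import defaultdict

# Hypothetical dimensions of the cube; each cell key follows this order.
DIMENSIONS = ("region", "product", "quarter")

# Each cell maps one combination of dimension values to a measure (units sold).
cube = {
    ("West", "Laptop", "Q1"): 120,
    ("West", "Phone", "Q1"): 80,
    ("East", "Laptop", "Q1"): 60,
    ("East", "Laptop", "Q2"): 90,
}

def roll_up(cube, keep):
    """Aggregate the measure over every dimension not listed in `keep`."""
    idx = [DIMENSIONS.index(d) for d in keep]
    out = defaultdict(int)
    for key, value in cube.items():
        out[tuple(key[i] for i in idx)] += value
    return dict(out)

def slice_cube(cube, dimension, value):
    """Keep only the cells where one dimension has a fixed value."""
    i = DIMENSIONS.index(dimension)
    return {k: v for k, v in cube.items() if k[i] == value}

by_region = roll_up(cube, ["region"])        # totals per region
q1_only = slice_cube(cube, "quarter", "Q1")  # sub-cube for the first quarter
```

Roll-up and slice are two of the interactive operations an analyst would typically run against such a cube.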
6. Content
• What is data mining
• Kinds of knowledge to be mined
• Technologies used
• Major issues in data mining
7. What is data mining
• The practice of examining large pre-existing databases in order to generate new
information.
• Data mining is defined as extracting information from huge sets of data. In other
words, data mining is the procedure of mining knowledge from data.
• The information or knowledge so extracted can be used for any of the following
applications:
1. Market Analysis
2. Fraud Detection
3. Customer Retention
4. Production Control
5. Science Exploration
8. Kinds of knowledge to be mined
• The kind of knowledge to be mined refers to the kind of functions to be performed.
• These functions are:
I. Characterization
II. Discrimination
III. Association and Correlation Analysis
IV. Classification
V. Prediction
VI. Clustering
VII. Outlier Analysis
VIII. Evolution Analysis
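As one concrete instance of the functions above, association analysis is commonly described with two measures: support (how often an itemset appears) and confidence (how often the consequent appears when the antecedent does). The sketch below is a minimal illustration with made-up shopping baskets; the function names and data are assumptions, and the other functions in the list are not shown.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)

def confidence(transactions, antecedent, consequent):
    """Estimate of P(consequent | antecedent) for the rule antecedent -> consequent.

    Assumes the antecedent occurs in at least one transaction; otherwise
    the division below would raise ZeroDivisionError.
    """
    both = support(transactions, set(antecedent) | set(consequent))
    return both / support(transactions, antecedent)

# Hypothetical market-basket data.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

s = support(transactions, {"bread", "milk"})       # 2 of 4 baskets
c = confidence(transactions, {"bread"}, {"milk"})  # 2 of the 3 bread baskets
```

A rule such as bread -> milk is usually reported only when both measures exceed thresholds chosen by the analyst.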
9. Kinds of data mined
1. Flat Files
2. Relational Databases
3. Data Warehouse
4. Transactional Databases
5. Multimedia Databases
6. Spatial Databases
7. Time Series Databases
8. World Wide Web (WWW)