Introduction to Data Mining
• What is Data Mining?
• Related technologies
• Data Mining techniques
• Data Mining Goals
• Stages of data mining process
• Knowledge representation methods
• Applications
What is Data Mining?
• Data mining is the process of extracting information from huge sets of
data to identify patterns, trends, and useful data that allow a business
to make data-driven decisions.
• Data mining is the act of automatically searching large stores
of information to find trends and patterns that go beyond simple
analysis procedures.
• Data Mining is a process used by organizations to extract
specific data from huge databases to solve business problems.
It primarily turns raw data into useful information.
• Data mining uses complex mathematical algorithms to segment
data and evaluate the probability of future events. Data mining
is also often called Knowledge Discovery in Databases (KDD).
Related Technologies
Data mining is related to many concepts. We briefly
introduce each concept and indicate how it is related to
data mining.
• Machine Learning
• DBMS
• OLAP
• Statistics
Machine Learning
• Machine learning is the area of AI that examines how to write programs that
can learn.
• In data mining, machine learning is often used for prediction or classification.
• Applications that typically use machine learning techniques include speech
recognition, training moving robots, classification of astronomical structures,
and game playing.
• When machine learning is applied to data mining tasks, a model is used to
represent the data (such as a graphical structure like a neural network or a
decision tree).
• During the learning process, a sample of the database is used to train the
system to properly perform the desired task.
• Then the system is applied to the general database to actually perform the
task.
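The two steps above (train on a sample, then apply to the general database) can be sketched as follows. This is a minimal illustration, not a method taken from the slides: the records, the income rule that generates the labels, and the helper names `train_stump` and `predict` are all assumptions, and the model is a one-level decision tree chosen only because it is short.

```python
# Hypothetical "database": (annual_income, buys_premium) records.
# The rule that generates the labels is an assumption made for this demo.
database = [(income, income >= 60000) for income in range(20000, 100001, 1000)]

def train_stump(sample):
    """Learn a one-level decision tree: predict True when income >= threshold."""
    best_threshold, best_errors = None, len(sample) + 1
    for threshold, _ in sample:  # candidate split points come from the sample
        errors = sum((income >= threshold) != label for income, label in sample)
        if errors < best_errors:
            best_threshold, best_errors = threshold, errors
    return best_threshold

def predict(threshold, income):
    """Apply the learned rule to one record."""
    return income >= threshold

# Learning step: train on a sample of the database (every 7th record).
sample = database[::7]
threshold = train_stump(sample)

# Application step: apply the trained model to the general database.
predictions = [predict(threshold, income) for income, _ in database]
correct = sum(p == label for p, (_, label) in zip(predictions, database))
accuracy = correct / len(database)
print(f"learned threshold: {threshold}, accuracy on full database: {accuracy:.3f}")
```

Because the sample does not contain every record, the learned threshold is slightly off and a few records in the full database are misclassified, which is the usual cost of generalizing from a sample.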
Machine Learning
• Machine learning algorithms are commonly divided into two main types:
1. Unsupervised Learning
2. Supervised Learning
1. Unsupervised Machine Learning:
Unsupervised learning does not depend on labeled training data. Instead, it
uses techniques such as clustering and association to discover structure in
the data directly.
2. Supervised Machine Learning:
Supervised learning is a learning process in which we teach or train the
machine using data that is well labeled, meaning that the data is already
marked with the correct responses. After that, the machine is provided with
new sets of data, and the supervised learning algorithm uses what it learned
from the training data to predict a result for them.
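The contrast between the two types can be sketched in a few lines. This is only an illustration under stated assumptions: the ages, the labels "young" and "senior", and the helper names are invented, k-means stands in for "clustering", and a nearest-class-mean rule stands in for a supervised learner.

```python
# Hypothetical one-dimensional data: customer ages (values invented for the demo).
ages = [18, 21, 22, 25, 60, 62, 65, 70]

def k_means_1d(points, centers, iterations=10):
    """Unsupervised: group points around centers without using any labels."""
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        # Move each center to the mean of its cluster (keep it if the cluster is empty).
        centers = [sum(c) / len(c) if c else centers[i] for i, c in enumerate(clusters)]
    return centers, clusters

centers, clusters = k_means_1d(ages, centers=[20.0, 30.0])
print("clusters found without labels:", clusters)

# Supervised: the same ages, but each one is marked with the correct response.
labeled = [(18, "young"), (21, "young"), (22, "young"), (25, "young"),
           (60, "senior"), (62, "senior"), (65, "senior"), (70, "senior")]

def train_class_means(examples):
    """Learn one mean per label from the labeled training data."""
    groups = {}
    for value, label in examples:
        groups.setdefault(label, []).append(value)
    return {label: sum(values) / len(values) for label, values in groups.items()}

def classify(class_means, value):
    """Predict the label whose mean is closest to the new value."""
    return min(class_means, key=lambda label: abs(value - class_means[label]))

class_means = train_class_means(labeled)
print("prediction for a new record (age 30):", classify(class_means, 30))
```

The clustering step never sees a label and only reports groups; the supervised step needs the labels up front but can then name the class of a new record.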
OLAP
• OLAP stands for Online Analytical Processing.
• OLAP systems are targeted to provide more complex query
results than traditional OLTP or database systems.
• OLAP is performed on data warehouses or data marts. The
primary goal of OLAP is to support the ad hoc querying needed by
decision support systems (DSS).
• The multidimensional view of data is fundamental to OLAP
applications.
• OLAP tools can be classified as ROLAP or MOLAP.
• ROLAP: Relational OLAP
• MOLAP: Multidimensional OLAP
OLAP operations
There are several types of OLAP operations supported by OLAP tools:
• A simple query may look at a single cell within the cube [Figure (a)].
• Slice: Look at a subcube to get more specific information. This is performed
by selecting on one dimension. As seen in Figure (c), this is looking at a
portion of the cube.
• Dice: Look at a subcube by selecting on two or more dimensions. This can be
performed by a slice on one dimension and then rotating the cube to select
on a second dimension, as shown in Figure (d).
• Roll up (dimension reduction, aggregation): Roll up allows the user to ask
questions that move up an aggregation hierarchy. Figure (b) represents a roll
up from (a).
• Drill down: Drill down allows a user to get more detailed fact information by
navigating lower in the aggregation hierarchy. Figure (a) represents a drill
down from (b).
• Visualization: Visualization allows OLAP users to actually "see" the results of
an operation.
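The operations listed above can be sketched on a small in-memory cube. This is a hedged illustration rather than how an OLAP tool is implemented: the three dimensions (product, region, quarter), the sales figures, and the helper names `select` and `roll_up` are assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical sales cube with three dimensions: (product, region, quarter).
DIMENSIONS = ("product", "region", "quarter")
cube = {
    ("laptop", "north", "Q1"): 10, ("laptop", "north", "Q2"): 12,
    ("laptop", "south", "Q1"): 7,  ("laptop", "south", "Q2"): 9,
    ("phone", "north", "Q1"): 20,  ("phone", "north", "Q2"): 25,
    ("phone", "south", "Q1"): 15,  ("phone", "south", "Q2"): 18,
}

def select(cube, **fixed):
    """Keep the cells whose coordinates match every fixed dimension value."""
    return {
        cell: value for cell, value in cube.items()
        if all(cell[DIMENSIONS.index(dim)] == wanted for dim, wanted in fixed.items())
    }

def roll_up(cube, keep):
    """Aggregate (sum) over every dimension that is not listed in `keep`."""
    totals = defaultdict(int)
    for cell, value in cube.items():
        key = tuple(cell[DIMENSIONS.index(dim)] for dim in keep)
        totals[key] += value
    return dict(totals)

single_cell = cube[("phone", "north", "Q1")]                   # simple query: one cell
slice_q1 = select(cube, quarter="Q1")                          # slice: select on one dimension
dice_north_q1 = select(cube, region="north", quarter="Q1")     # dice: select on two dimensions
by_product = roll_up(cube, keep=("product",))                  # roll up: totals per product
by_product_region = roll_up(cube, keep=("product", "region"))  # drill down: add region detail

print(single_cell, len(slice_q1), dice_north_q1)
print(by_product)
print(by_product_region)
```

Here drill down is expressed as rolling up to a finer level (product and region instead of product alone), which mirrors the idea that the two operations move in opposite directions along the same aggregation hierarchy.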
DBMS
• A database is a collection of data usually associated with some
organization or enterprise.
• Schema
– e.g. (ID,Name,Address,Salary,JobNo) may be the schema for a
personnel database.
• A database management system (DBMS) is the software used to access a
database.
• A data model is used to describe the data, attributes, and relationships
among them.
– e.g. the ER (entity-relationship) model.
DBMS
• Transaction
• Query:
SELECT Name
FROM T
WHERE Salary > 100000
• A major difference between data mining queries and those of database
systems is the output.
• Basic database queries always output either a subset of the database or
aggregates of the data. A data mining query outputs a KDD object.
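The query above can be tried with Python's built-in `sqlite3` module. This is a small sketch under assumptions: the table name `T` and the schema come from the slides, but the column types, the four rows, and the `ORDER BY ID` clause (added only to make the output order predictable) are invented for the example.

```python
import sqlite3

# In-memory database using the personnel schema (ID, Name, Address, Salary, JobNo).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE T (ID INTEGER PRIMARY KEY, Name TEXT, Address TEXT, "
    "Salary INTEGER, JobNo INTEGER)"
)

# Hypothetical rows; names, addresses, and salaries are made up.
rows = [
    (1, "Asha", "12 Hill Rd", 120000, 10),
    (2, "Ben", "4 Lake St", 85000, 20),
    (3, "Chen", "9 Park Ave", 150000, 10),
    (4, "Dina", "31 Mill Ln", 100000, 30),
]
conn.executemany("INSERT INTO T VALUES (?, ?, ?, ?, ?)", rows)

# Output type 1: a subset of the database.
subset = [name for (name,) in conn.execute(
    "SELECT Name FROM T WHERE Salary > 100000 ORDER BY ID"
)]

# Output type 2: an aggregate of the data.
average_salary = conn.execute("SELECT AVG(Salary) FROM T").fetchone()[0]

print("names with Salary > 100000:", subset)
print("average salary:", average_salary)
conn.close()
```

Both results are either rows that already exist in the table or a summary computed from them, which is the kind of output the slide attributes to basic database queries.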
Statistics
• Simple statistical concepts such as determining a data distribution and
calculating a mean and a variance can be viewed as data mining techniques.
• Statistical inference: Generalizing a model created from a sample of the
data to the entire dataset.
• Exploratory Data Analysis:
– Data can actually drive the creation of the model
– Opposite of traditional statistical view.
• Statistics research has produced many of the proposed data mining
algorithms.
• A key difference between data mining and statistics is that data mining is
targeted at business users rather than statisticians.
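The mean, variance, and sample-to-dataset generalization mentioned above can be illustrated with Python's standard `statistics` module. This is a minimal sketch: the dataset values and the choice of taking every third record as the sample are assumptions made for the demo.

```python
import statistics

# Hypothetical full dataset (for example, order values); numbers invented for the demo.
dataset = [12, 15, 11, 18, 20, 14, 16, 13, 19, 17, 15, 16]

# A sample of the data: here simply every third record.
sample = dataset[::3]

# Statistics computed from the sample only.
sample_mean = statistics.mean(sample)
sample_variance = statistics.variance(sample)        # sample variance (n - 1 denominator)

# The same quantities computed on the entire dataset, for comparison.
population_mean = statistics.mean(dataset)
population_variance = statistics.pvariance(dataset)  # population variance (n denominator)

print("sample:", sample)
print("mean estimated from sample:", sample_mean, "| mean of full dataset:", population_mean)
print("variance estimated from sample:", round(sample_variance, 3),
      "| variance of full dataset:", round(population_variance, 3))
```

Statistical inference treats the sample figures as estimates of the full-dataset figures; how close they land depends on how the sample was drawn and how large it is.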
Goals of Data Mining
• Data mining is one of the most useful techniques that help
entrepreneurs, researchers, and individuals extract valuable
information from huge sets of data.
• The main goals of a data mining system are to:
– Store and manage the data in a multidimensional database system.
– Provide data access to business analysts and information
technology professionals.
– Analyze the data with application software.
– Present the data in a useful format, such as a graph or table.
