WHAT IS DATA MINING 
ELEMENTS 
TECHNIQUES 
APPLICATIONS
Data mining (knowledge discovery from data) 
Extraction of interesting (non-trivial, implicit, previously unknown
and potentially useful) patterns or knowledge from huge amounts of
data.
Alternative names 
Knowledge discovery (mining) in databases (KDD), knowledge 
extraction, data/pattern analysis, data archeology, data dredging, 
information harvesting, business intelligence, etc.
Data Mining: Core of the Knowledge Discovery Process (KDD)
Data Relationships
• Sequential Patterns
• Clusters
Data Mining Techniques
• Decision Trees
• Neural Networks
• Regression
• Association Rules
• Nearest Neighbor Method
• Genetic Algorithm
• Artificial Intelligence
DATA RELATIONSHIPS
Sequential Patterns
• Finding statistically relevant patterns between data
examples where the values are delivered in a sequence.
• Problems addressed
Building efficient databases and indexes for sequence
information, extracting frequently occurring patterns,
comparing sequences, recovering missing sequence
members.
• Application (retail environment):
Anticipating customer behavior to predict future
customer purchasing habits.
Increasing profit and decreasing cost through proper management of
shelf space allocation and product display.
DATA RELATIONSHIPS
Sequential Patterns: Example [figure]
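As a rough illustration of extracting frequently occurring patterns from sequences, the sketch below counts ordered item pairs (a bought before b) across customer purchase sequences and keeps those that reach a minimum support. The function name, the sample purchases and the 0.5 threshold are all made up for illustration; real sequential pattern miners handle longer patterns and much larger data.

```python
from collections import Counter
from itertools import combinations


def frequent_ordered_pairs(sequences, min_support):
    """Return {(a, b): support} for pairs where a occurs before b
    in at least `min_support` (a fraction) of the sequences."""
    counts = Counter()
    for seq in sequences:
        # combinations() preserves input order, so (a, b) means a came first.
        # A set is used so each pair counts at most once per sequence.
        pairs_in_seq = {(a, b) for a, b in combinations(seq, 2) if a != b}
        counts.update(pairs_in_seq)
    total = len(sequences)
    return {pair: n / total for pair, n in counts.items() if n / total >= min_support}


# Hypothetical purchase histories, one ordered list per customer.
purchases = [
    ["phone", "case", "charger"],
    ["phone", "charger"],
    ["laptop", "mouse"],
    ["phone", "case"],
]

patterns = frequent_ordered_pairs(purchases, min_support=0.5)
for (first, later), support in sorted(patterns.items()):
    print(f"{first} -> {later}: support {support:.2f}")
```

A retailer could read a pattern such as "phone, then case" as a hint for what to recommend or display next to a product.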
DATA RELATIONSHIPS
Clusters
Placing data elements into related groups without advance
knowledge of the group definitions.
Popular clustering techniques: K-means,
Expectation Maximization (EM)
Problems addressed
• Find natural groupings
• Preprocess data to identify homogeneous groups
on which to build supervised models
• Anomaly detection
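A minimal sketch of K-means, one of the clustering techniques named above, is given here in plain Python. It assumes 2-D points, squared Euclidean distance, a fixed number of iterations, and a naive initialisation that takes the first k points as starting centroids; these choices and the sample data are illustrative assumptions, not part of the original slides.

```python
def kmeans(points, k, iterations=10):
    """Group points into k clusters; returns (centroids, clusters)."""
    # Naive initialisation (assumption): use the first k points as centroids.
    centroids = [points[i] for i in range(k)]
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            distances = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        # Update step: move each centroid to the mean of its members.
        for i, members in enumerate(clusters):
            if members:
                centroids[i] = tuple(sum(coord) / len(members) for coord in zip(*members))
    return centroids, clusters


# Hypothetical data: two visibly separate groups of 2-D points.
points = [(1.0, 1.0), (8.0, 8.0), (1.5, 2.0), (9.0, 9.5), (1.0, 0.5), (8.5, 9.0)]
centroids, clusters = kmeans(points, k=2)
for centre, members in zip(centroids, clusters):
    print(centre, members)
```

No group labels are supplied; the groups emerge from the distances alone, which is what "without advance knowledge of the group definitions" refers to.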
DATA RELATIONSHIPS
Clusters
Application:
• Plant and animal ecology
Make spatial and temporal comparisons of
communities of organisms in heterogeneous environments.
• Medical imaging
Differentiate between different types
of tissue and blood in a three-dimensional image.
• Business and marketing
Partition the general population of consumers for use
in market segmentation, product positioning, new product
development and selecting test markets.
Decision Trees
• In the decision tree technique, the root of the tree is a simple
question or condition that has multiple answers.
• Each answer then leads to a further set of questions or conditions that
narrow down the data, until we can make the final decision based
on it.
• For example, we can use the following decision tree to determine
whether or not to play tennis:
• Starting at the root node, if the outlook
is overcast then we should
definitely play tennis.
• If it is rainy, we should only play
tennis if the wind is weak.
• If it is sunny, we should play
tennis only if the humidity is
normal.
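The play-tennis rules above can be written directly as nested conditions. This is a hand-coded sketch of the tree described on this slide, not a learned model; the function name and the string values ("overcast", "rainy", "sunny", "weak", "normal") are assumed encodings.

```python
def play_tennis(outlook, humidity, wind):
    """Walk the decision tree from the root (outlook) to a yes/no leaf."""
    if outlook == "overcast":
        return True                    # overcast: always play
    if outlook == "rainy":
        return wind == "weak"          # rainy: play only if the wind is weak
    if outlook == "sunny":
        return humidity == "normal"    # sunny: play only if humidity is normal
    raise ValueError(f"unknown outlook: {outlook!r}")


print(play_tennis("overcast", humidity="high", wind="strong"))
print(play_tennis("rainy", humidity="normal", wind="strong"))
print(play_tennis("sunny", humidity="normal", wind="weak"))
```

In practice a decision tree algorithm derives such questions from training data instead of having them written by hand.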
Neural Networks
• A set of connected input/output units in which each connection has a
weight associated with it. During the learning phase, the network learns by
adjusting the weights so as to be able to predict the correct class labels
of the input tuples.
• Well suited for continuous-valued inputs and outputs.
• Used to extract patterns and detect trends that are too complex to be
noticed by humans.
• Neural networks are best at identifying patterns or trends in data and
are well suited for prediction or forecasting needs.
Example:
Handwritten character recognition, training a computer to
pronounce English text, and many real-world business problems.
Neural networks have already been successfully applied in many industries.
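To make "learning by adjusting weights" concrete, here is a minimal sketch of a single artificial neuron (a perceptron) trained on the logical AND function. It is far simpler than the multi-layer networks used in practice; the learning rate, epoch count, step activation and training data are illustrative assumptions.

```python
def predict(weights, bias, inputs):
    """Step activation: output 1 if the weighted sum is positive, else 0."""
    activation = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if activation > 0 else 0


def train_perceptron(samples, epochs=20, learning_rate=1):
    """Adjust weights whenever the predicted class label is wrong."""
    weights = [0] * len(samples[0][0])
    bias = 0
    for _ in range(epochs):
        for inputs, target in samples:
            error = target - predict(weights, bias, inputs)
            # Nudge each weight in the direction that reduces the error.
            weights = [w + learning_rate * error * x for w, x in zip(weights, inputs)]
            bias += learning_rate * error
    return weights, bias


# Training tuples: (inputs, correct class label) for logical AND.
and_gate = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train_perceptron(and_gate)
predictions = [predict(weights, bias, inputs) for inputs, _ in and_gate]
print(weights, bias, predictions)
```

After a few passes over the data the weights stop changing, because every training tuple is already classified correctly.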
Regression
• Regression techniques can be adapted for prediction.
• Regression analysis can be used to model the relationship between
one or more independent variables and dependent variables. In data
mining, the independent variables are attributes already known and
the response (dependent) variables are what we want to predict.
• However, simple regression models often perform poorly where the
response depends on complex interactions among many variables, as with
sales volumes, stock prices and product failure rates.
Types of regression methods
• Linear Regression
• Multivariate Linear Regression
• Nonlinear Regression
• Multivariate Nonlinear Regression
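As an illustration of the simplest case, linear regression with one independent variable, the sketch below fits a line by ordinary least squares and uses it to predict a new value. The variable names and the toy data (which lie exactly on y = 2x + 1) are made up for the example.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope is the covariance of x and y divided by the variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: known attribute (x) and the response to predict (y).
ad_spend = [1.0, 2.0, 3.0, 4.0]
units_sold = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(ad_spend, units_sold)
predicted = slope * 5.0 + intercept
print(slope, intercept, predicted)
```

Multivariate and nonlinear regression follow the same idea of minimising prediction error, but with more inputs or a curved model.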
Association Rules
• In association, a pattern is discovered based on a relationship
between items in the same transaction.
• E.g. rule form: “Body → Head [support, confidence]”
Application
Retailers use the association technique to research
customers’ buying habits. Based on historical sales data, retailers might
find that customers often buy crisps when they buy beer, and
therefore they can put beer and crisps next to each other to save time
for customers and increase sales.
Types
• Multilevel association rule
• Multidimensional association rule
• Quantitative association rule
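The two numbers attached to a rule can be computed as follows. Support is the fraction of transactions containing every item of the rule; confidence is the fraction of transactions containing the body that also contain the head. The sketch assumes these standard definitions, and the five baskets are invented for illustration.

```python
def support(transactions, items):
    """Fraction of transactions that contain all of the given items."""
    items = set(items)
    return sum(1 for t in transactions if items <= t) / len(transactions)


def confidence(transactions, body, head):
    """Of the transactions containing `body`, the fraction that also contain `head`."""
    return support(transactions, set(body) | set(head)) / support(transactions, body)


# Hypothetical shopping baskets, one set of items per transaction.
baskets = [
    {"beer", "crisps"},
    {"beer", "crisps", "bread"},
    {"beer", "milk"},
    {"bread", "milk"},
    {"beer", "crisps", "milk"},
]

rule_support = support(baskets, {"beer", "crisps"})
rule_confidence = confidence(baskets, {"beer"}, {"crisps"})
print(f"beer -> crisps [support={rule_support:.2f}, confidence={rule_confidence:.2f}]")
```

Here three of the five baskets contain both items, and three of the four beer baskets also contain crisps.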
• Study of frequent flyer data from an Indian airline
• Data selected and prepared: 3 most common sectors flown & points
redeemed for.
(Note: incomplete/inaccurate data supplied by airlines)
• Data mining results:
  ◦ Patterns about customers flying between metropolitan cities
  ◦ Customers who flew Mumbai-Delhi also flew other sectors
    such as Mumbai-Chennai, Mumbai-Kolkata and Mumbai-Bangalore.
  ◦ Customers flying Bangalore-Hyderabad also flew Delhi-Bangalore
  ◦ Those who flew Bagdogra-Guwahati did not fly back; instead they flew to
    Delhi
• Banking information systems contain huge volumes of data, both
operational and historical.
• Data mining can assist critical decision-making processes in a bank.
• Areas of application:
  ◦ Marketing
  ◦ Risk management and default detection
  ◦ Fraud detection
  ◦ Customer relationship management
  ◦ Money laundering detection

Data Mining : Concepts
