Tanmay Sethi 
IT 4th Year 
08220803111
What is Data Mining? 
Data mining [1] refers to the extraction of 
hidden information from large databases. 
Data mining and KDD, i.e. knowledge 
discovery in databases, are often used 
interchangeably, but data mining is just one 
step in the KDD process. 
What is KDD? 
KDD [1] refers to the broad process of finding 
knowledge in data and is the "high-level" 
application of particular data mining methods.
NEED OF KDD 
The traditional method of turning data into knowledge relied on manual 
analysis and interpretation. E.g., health care centers may make 
resources available according to the currently spreading 
disease. 
But now, due to the increased size of databases, the traditional method 
is impractical. 
For example, suppose a database has N records and M attributes. 
Manual analysis is a waste of time and energy once N and M 
grow large, say 100 or so.
KDD PROCESS 
KDD is the nontrivial process of identifying valid, novel, 
potentially useful, and ultimately understandable patterns 
in data. 
The term process implies that KDD comprises many steps, 
which involve data preparation, search for patterns, 
knowledge evaluation, and refinement. 
By nontrivial, we mean that some search or inference is 
involved; that is, it is not a straightforward computation of 
predefined quantities like computing the average value of a 
set of numbers.
Steps in KDD Process[2] 
STEPS- 
Understanding Goals 
Data Selection 
Data Preprocessing 
Data Mining 
Pattern Recognition 
Interpretation / Evaluation 
Knowledge discovery 
Figure 1-KDD Process
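The steps above can be sketched as a chain of small functions. This is only an illustrative sketch: the records, the function names, and the "above average" pattern are all assumptions made for the example, and the mining step is deliberately a trivial stand-in for a real data mining method.

```python
from statistics import mean

# Hypothetical raw records: (roll_no, subject, marks); None marks a missing value.
RAW = [
    (101, "maths", 72),
    (102, "maths", None),
    (103, "maths", 65),
    (101, "physics", 58),
    (103, "physics", 91),
]

def select(records, subject):
    """Data selection: keep only the records relevant to the goal."""
    return [r for r in records if r[1] == subject]

def preprocess(records):
    """Data preprocessing: drop records with missing marks."""
    return [r for r in records if r[2] is not None]

def mine(records):
    """Data mining (stand-in): find students scoring above the subject mean."""
    avg = mean(r[2] for r in records)
    return [r[0] for r in records if r[2] > avg]

def evaluate(pattern):
    """Interpretation / evaluation: state the pattern in readable form."""
    return f"Students above the subject average: {sorted(pattern)}"

def kdd(records, subject):
    """Run the steps in order for one goal (one subject)."""
    return evaluate(mine(preprocess(select(records, subject))))

print(kdd(RAW, "maths"))
```

Each function maps to one step in the list, so a real mining method could replace `mine` without touching the other steps.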
KNOWLEDGE DISCOVERY IN THE REAL 
WORLD[3] 
KDD has a wide range of real-world applications, including 
business, artificial intelligence, health care, science, marketing, finance, 
fraud detection, manufacturing, telecommunications, Internet agents, 
and many more. 
APPLICATIONS IN EDUCATIONAL INSTITUTES/SCHOOLS 
GROUPING OF STUDENTS 
Clustering is used to group similar 
students into a cluster. 
Figure 2-Cluster of students
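As a rough illustration of grouping similar students, here is a minimal k-means sketch. The slides do not name a specific clustering algorithm, so k-means is an assumption, as are the student data (attendance %, average marks) and the choice of starting centroids.

```python
from math import dist

# Hypothetical students: (attendance %, average marks).
STUDENTS = [(90, 85), (88, 80), (92, 90), (55, 40), (60, 45), (50, 35)]

def kmeans(points, centroids, iterations=10):
    """Plain k-means: assign each point to its nearest centroid, then recompute."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Recompute each centroid as the mean of its cluster;
        # keep the old centroid if a cluster ends up empty.
        centroids = [
            tuple(sum(c) / len(cluster) for c in zip(*cluster)) if cluster else old
            for cluster, old in zip(clusters, centroids)
        ]
    return clusters, centroids

# Start from two hand-picked students as initial centroids (an assumption).
clusters, centroids = kmeans(STUDENTS, [STUDENTS[0], STUDENTS[3]])
for group, center in zip(clusters, centroids):
    print(center, group)
```

With this toy data the students separate into a high-attendance, high-marks group and a low-attendance, low-marks group.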
PREDICTING THE REGISTRATION OF STUDENTS IN AN EDUCATIONAL 
PROGRAMME 
Classification and prediction are used for better assessment, evaluation, 
planning, and decision making, so that universities and schools 
can allocate resources more effectively. 
Figure 3- Prediction of number of girls this year
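One simple way to predict this year's registrations is a least-squares trend line over past years. This is a sketch under stated assumptions: the yearly counts are made up, and a straight-line fit is only one of many prediction techniques the slide could be referring to.

```python
# Hypothetical registration counts for four past years (year index, count).
YEARS = [1, 2, 3, 4]
REGISTRATIONS = [40, 44, 50, 54]

def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept

slope, intercept = fit_line(YEARS, REGISTRATIONS)
predicted = slope * 5 + intercept  # forecast for year 5
print(f"Predicted registrations for year 5: {round(predicted, 1)}")
```

A fitted slope of about 4.8 here means roughly five more registrations per year, which is the kind of figure an institute could use when planning resources.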
PREDICTING STUDENT'S PERFORMANCE 
Decision trees and classification help an 
instructor assess the quality of students by 
conducting an online discussion among a 
group of students and using 
indicators such as the time difference 
between posts, the frequency distribution of the 
postings, the duration between postings and 
replies, etc. 
DETECTING CHEATING IN AN EXAMINATION 
Examinations are useful to evaluate 
students’ knowledge. 
The models generated use data 
comprising different students' 
personalities and the common practices 
students use to cheat to obtain a better 
grade on these exams. 
Figure 4-To assess the quality of 
students
IDENTIFYING ABNORMAL/ ERRONEOUS VALUES 
The data stored in databases may contain abnormal, erroneous, 
incomplete, or exceptional values, which may confuse the analysis process. As a 
result, the accuracy of the discovered patterns can be poor. 
Abnormalities in a student's marks may be due to a software fault, negligence by the 
data entry operator, or an extraordinary performance by the student in a 
particular subject. 
Figure 5-Abnormal result of a student 
The student in subject 4 with roll no 104 will be detected as an exception.
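A minimal sketch of how such an exception could be flagged automatically is shown here. The marks table is hypothetical (the figure's actual values are not reproduced), and the median-based rule with threshold `k` is an assumption chosen because it works on small classes; the slides do not say which detection method is used.

```python
from statistics import median

# Hypothetical marks: roll no -> marks in subjects 1..4.
MARKS = {
    101: [70, 65, 72, 62],
    102: [55, 60, 58, 58],
    103: [80, 75, 78, 65],
    104: [66, 70, 68, 5],
    105: [60, 62, 64, 60],
}

def find_exceptions(marks, k=5.0):
    """Flag (roll_no, subject) pairs far from the subject median.

    A mark is flagged when its distance from the median exceeds
    k times the median absolute deviation (MAD) of that subject.
    """
    flagged = []
    n_subjects = len(next(iter(marks.values())))
    for s in range(n_subjects):
        column = {roll: row[s] for roll, row in marks.items()}
        med = median(column.values())
        mad = median(abs(v - med) for v in column.values())
        for roll, value in column.items():
            if mad > 0 and abs(value - med) > k * mad:
                flagged.append((roll, s + 1))
    return flagged

print(find_exceptions(MARKS))
```

A flagged value still needs human interpretation: the rule cannot tell a data entry error from a genuinely unusual performance.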
FUTURE SCOPE OF KDD[4] 
Although no human being can foretell the future, we believe that there are 
plenty of interesting new challenges ahead of us, and quite a few of them 
cannot be foreseen at the current point in time. Here we describe one area of 
future scope for KDD: computer chess. 
EDUCATIONAL CHESS PROGRAMS 
There could be a program that analyzes a 
certain position or an entire game on an abstract 
strategic level, tries to understand your opponent’s 
and your own plans, and provides suggestions on 
different ways to proceed. 
Figure 6-Educational 
chess programs
TOURNAMENT PREPARATION 
Another possible application for KDD in chess is to let the player know his 
strengths and weaknesses by providing statistics of his wins and losses 
depending on his opening move or some other specific move. 
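The statistics described above could be computed along these lines. The game list and result labels are hypothetical, and this sketch only counts outcomes per opening move; it is not taken from any particular chess database format.

```python
from collections import defaultdict

# Hypothetical game log for one player: (opening move, result).
GAMES = [
    ("e4", "win"),
    ("e4", "loss"),
    ("d4", "win"),
    ("e4", "win"),
    ("d4", "loss"),
    ("d4", "loss"),
    ("c4", "draw"),
]

def win_rate_by_opening(games):
    """Return the fraction of games won for each opening move."""
    totals = defaultdict(int)
    wins = defaultdict(int)
    for opening, result in games:
        totals[opening] += 1
        if result == "win":
            wins[opening] += 1
    return {opening: wins[opening] / totals[opening] for opening in totals}

rates = win_rate_by_opening(GAMES)
for opening, rate in sorted(rates.items(), key=lambda item: item[1], reverse=True):
    print(f"{opening}: {rate:.0%} wins")
```

Sorting by win rate puts the player's strongest openings first and the ones needing preparation last.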
INCREASING PLAYING STRENGTH 
Incorporating additional knowledge 
into computer chess programs can 
lead to significant increases in playing 
strength. 
In Figure 7, in a particular situation, the 
FILTZ chess game algorithm gave 
another explanation, which was much 
more complex than what a normal human 
being would do in such a situation. 
Figure 7-Zugzwang situation
REFERENCES 
[1] Paulraj Ponniah, "Data Warehousing Fundamentals for IT 
Professionals", Wiley, pp. 400–402, 2010 
[2] Oded Maimon, Lior Rokach, "Introduction to Knowledge Discovery in 
Databases", Department of Industrial Engineering, 2012 
[3] Manoj Bala, "Study of Applications of Data Mining Techniques in 
Education", IJRST, Vol. No. 1, Issue No. IV, Jan-Mar, 2012 
[4] Johannes Fürnkranz, "Knowledge Discovery in Chess 
Databases", Austrian Research Institute for Artificial Intelligence, 2001
Application of KDD & its future scope