2. What is Data Mining?
Data mining [1] refers to the extraction of
hidden information from large databases.
Data mining and KDD (knowledge
discovery in databases) are often used
interchangeably, but data mining is just one
step in the KDD process.
What is KDD?
KDD [1] refers to the broad process of finding
knowledge in data, and it is the "high-level"
application of particular data mining methods.
3. NEED OF KDD
Traditionally, turning data into knowledge relied on manual
analysis and interpretation. For example, health care centers may
make resources available according to the disease that is
currently spreading.
As databases have grown in size, this traditional method has
become impractical.
For example, suppose a database holds N records, each with
M attributes.
Manual analysis wastes time and energy once N and M become
large, say 100 or more.
4. KDD PROCESS
KDD is the nontrivial process of identifying valid, novel,
potentially useful, and ultimately understandable patterns
in data.
The term process implies that KDD comprises many steps,
which involve data preparation, search for patterns,
knowledge evaluation, and refinement.
By nontrivial, we mean that some search or inference is
involved; that is, it is not a straightforward computation of
predefined quantities like computing the average value of a
set of numbers.
6. KNOWLEDGE DISCOVERY IN THE REAL
WORLD[3]
KDD has a wide range of real-world applications, including
business, artificial intelligence, health care, science, marketing, finance,
fraud detection, manufacturing, telecommunications, and Internet agents.
APPLICATIONS IN EDUCATIONAL INSTITUTES/SCHOOLS
GROUPING OF STUDENTS
Clustering is used to group students with
similar characteristics into the same cluster.
Figure 2-Cluster of students
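As a minimal sketch of how such grouping might work, the block below runs a small k-means clustering in plain Python. The student features (attendance percentage and average mark), the data values, and the choice of k-means itself are assumptions made for illustration; the slides do not say which clustering method or attributes were used.

```python
import math

def kmeans(points, centroids, iterations=10):
    """Tiny k-means: assign points to the nearest centroid, then recompute centroids."""
    labels = []
    for _ in range(iterations):
        # Assignment step: index of the closest centroid for each point.
        labels = [
            min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its assigned points.
        new_centroids = []
        for c in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                new_centroids.append(tuple(sum(v) / len(members) for v in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # leave an empty cluster where it was
        if new_centroids == centroids:
            break  # converged: centroids no longer move
        centroids = new_centroids
    return labels, centroids

# Hypothetical students: (attendance %, average mark)
students = [(95, 88), (90, 92), (92, 85), (60, 45), (55, 50), (58, 40)]

# Seed the two clusters with one student from each apparent group (an assumption).
labels, centroids = kmeans(students, centroids=[students[0], students[3]])
print(labels)
```

With this toy data the first three students fall into one cluster and the last three into the other. Real student data would need feature scaling and a more careful choice of the number of clusters.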
7. PREDICTING THE REGISTRATION OF STUDENTS IN AN EDUCATIONAL
PROGRAMME
Classification and prediction are used for better assessment, evaluation,
planning, and decision making in universities and schools, so that they
can allocate resources more effectively.
Figure 3- Prediction of number of girls this year
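One simple way to produce such a prediction is to fit a straight-line trend to past registration counts and extrapolate one year ahead. The sketch below does this with ordinary least squares. The years and counts are invented for illustration, and the linear trend is an assumed technique, not necessarily the method behind Figure 3.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical registration history: number of girls enrolled per year.
years = [2008, 2009, 2010, 2011]
girls = [120, 130, 140, 150]

slope, intercept = fit_line(years, girls)
prediction = slope * 2012 + intercept  # extrapolate to the next year
print(round(prediction))
```

Because the invented history grows by exactly 10 per year, the fitted line predicts 160 registrations for 2012. Real enrolment data is rarely this regular, so a practical model would include more attributes and some measure of uncertainty.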
8. PREDICTING STUDENT'S PERFORMANCE
Decision trees and other classification
methods help an instructor assess the quality
of students. The instructor conducts an online
discussion among a group of students and
uses indicators such as the time difference
between posts, the frequency distribution of
postings, and the duration between postings
and replies.
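The indicators mentioned above could be derived from a discussion log before any classifier is trained. The following sketch computes two of them per student: the number of posts and the mean time gap between consecutive posts. The student names, timestamps, and log layout are hypothetical.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical discussion log: (student, timestamp of post)
posts = [
    ("asha", "2012-01-10 09:00"),
    ("asha", "2012-01-10 09:30"),
    ("asha", "2012-01-10 11:00"),
    ("ravi", "2012-01-10 09:15"),
    ("ravi", "2012-01-12 09:15"),
]

def posting_indicators(posts):
    """Return post count and mean gap (minutes) between consecutive posts per student."""
    times = defaultdict(list)
    for student, stamp in posts:
        times[student].append(datetime.strptime(stamp, "%Y-%m-%d %H:%M"))

    indicators = {}
    for student, stamps in times.items():
        stamps.sort()
        # Gaps between each post and the next one, in minutes.
        gaps = [(b - a).total_seconds() / 60 for a, b in zip(stamps, stamps[1:])]
        indicators[student] = {
            "posts": len(stamps),
            "mean_gap_minutes": sum(gaps) / len(gaps) if gaps else None,
        }
    return indicators

indicators = posting_indicators(posts)
print(indicators)
```

Feature rows like these could then be fed to a decision tree learner together with a known quality label for each student.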
DETECTING CHEATING IN AN EXAMINATION
Examinations are used to evaluate
students’ knowledge.
The generated models use data on
students’ personalities and on the
common practices students use to
cheat in order to obtain a better
grade in these exams.
Figure 4-To assess the quality of
students
9. IDENTIFYING ABNORMAL/ ERRONEOUS VALUES
The data stored in databases may contain abnormal, erroneous,
incomplete, or exceptional values, which may confuse the analysis process. As a
result, the accuracy of the discovered patterns can be poor.
Abnormalities in a student’s marks may be due to a software fault, negligence
by the data entry operator, or an extraordinary performance by the student in a
particular subject.
Figure 5-Abnormal result of a student
The mark of the student with roll no. 104 in subject 4 will be detected as an exception.
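A hedged sketch of how such an exception might be flagged: the block below applies a modified z-score based on the median and the median absolute deviation (MAD), which tends to work better than the mean and standard deviation on small samples. The marks are invented, since the data in Figure 5 is not reproduced here, and the 3.5 threshold is a commonly used rule of thumb rather than anything stated in the slides.

```python
from statistics import median

def find_exceptions(marks, threshold=3.5):
    """Flag roll numbers whose mark has a modified z-score above the threshold."""
    values = list(marks.values())
    med = median(values)
    mad = median(abs(v - med) for v in values)  # median absolute deviation
    if mad == 0:
        return []  # no spread at all, so nothing can be called an exception
    return [
        roll for roll, mark in marks.items()
        if abs(0.6745 * (mark - med) / mad) > threshold
    ]

# Hypothetical marks in subject 4, keyed by roll number.
subject_4 = {101: 62, 102: 58, 103: 65, 104: 5, 105: 60}

exceptions = find_exceptions(subject_4)
print(exceptions)
```

With these invented marks only roll no. 104 is flagged. Whether a flagged value is an error or a genuine extraordinary performance still has to be checked by a person.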
10. FUTURE SCOPE OF KDD [4]
Although no one can foretell the future, we believe that there are
plenty of interesting new challenges ahead of us, and quite a few of them
cannot be foreseen at this point in time. Here we describe one area of
future scope for KDD: computer chess.
EDUCATIONAL CHESS PROGRAMS
A program could analyze a certain position or an
entire game on an abstract strategic level, try to
understand your opponent’s and your own plans,
and provide suggestions on different ways to
proceed.
Figure 6-Educational
chess programs
11. TOURNAMENT PREPARATION
Another possible application of KDD in chess is to show a player his
strengths and weaknesses by providing statistics of his wins and losses,
broken down by opening move or some other specific move.
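Such statistics amount to grouping game results by opening move. The sketch below tallies wins, losses, and draws per opening and derives a win rate. The game records and the move notation are hypothetical; a real system would read them from a game database.

```python
from collections import defaultdict

# Hypothetical game records for one player: (opening move, result)
games = [
    ("e4", "win"), ("e4", "win"), ("e4", "loss"),
    ("d4", "loss"), ("d4", "draw"),
    ("c4", "win"),
]

def results_by_opening(games):
    """Count wins, losses, and draws for each opening move."""
    table = defaultdict(lambda: {"win": 0, "loss": 0, "draw": 0})
    for opening, result in games:
        table[opening][result] += 1
    return {opening: dict(counts) for opening, counts in table.items()}

def win_rate(counts):
    """Fraction of games won for one opening."""
    total = sum(counts.values())
    return counts["win"] / total if total else 0.0

stats = results_by_opening(games)
for opening, counts in stats.items():
    print(opening, counts, round(win_rate(counts), 2))
```

An opening with a low win rate over many games would point to a weakness worth preparing for before a tournament.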
INCREASING PLAYING STRENGTH
Incorporating additional knowledge
into computer chess programs can
lead to significant increases in playing
strength.
In Figure 7, in a particular situation, the
FILTZ chess game algorithm gave
another explanation, which was much
more complex than what a normal human
being would do in such a situation.
Figure 7-Zugzwang situation
12. REFERENCES
[1] Paulraj Ponniah, "Data Warehousing Fundamentals for IT
Professionals", Wiley, pp. 400–402, 2010
[2] Oded Maimon, Lior Rokach, "Introduction to Knowledge Discovery in
Databases", Department of Industrial Engineering, 2012
[3] Manoj Bala, "Study of Applications of Data Mining Techniques in
Education", Vol. No. 1, Issue No. IV, Jan-Mar, (IJRST), 2012
[4] Johannes Fürnkranz, "Knowledge Discovery in Chess
Databases", Austrian Research Institute for Artificial Intelligence, 2001