
Data Mining - Petra University



  1. Petra University (Private Accredited University)
     Faculty of Information Technology
     Department of Computer Science

     Course Title: Data Mining
     Course No.: 602386
     Credit Hrs: 3
     Prerequisite: 601281 (DB I)
     Year (semester): 2009-1
     Lec./Lab. Credit: Lecture: 3, Lab: 0

     Instructor Name: Dr. Bassam Haddad
     Office: 7312, Ext. 340
     e-mail/Web Site:
     Office Hours:
     Coordinator: Dr. Bassam Haddad

     Text Book: Introduction to Data Mining, Pang-Ning Tan, M. Steinbach and V. Kumar, Pearson Education, Inc., 2006

     Course Description: This course presents an introduction to the fundamental concepts and techniques of Data Mining. Areas covered include Types of Data and their Pre-processing, Predictive Modelling, Association Analysis, Clustering and Anomaly Detection, and their applications.

     Aims: The main goal of this course is to provide students with theoretical and practical knowledge of Data Mining technology and its fields, by acquiring the conceptual knowledge and background for the most important topics of Data Mining basics and algorithms.

     Objectives:
     • Understanding the meaning of Data Mining and how it can be utilized in solving real problems.
     • Understanding the significance of the Types of Data and Data pre-processing for Data Mining techniques.
     • Recognizing the feasibility of a Data Mining solution for a specific problem.
     • Recognizing different Data Mining techniques and knowing when each strategy is appropriate.
     • Applying basic statistical and other methods in evaluating the results of a Data Mining solution.
     • Understanding the concepts behind Predictive Modelling, such as classification, its techniques such as Decision trees, and alternative techniques such as Rule-Based classifiers and probabilistic approaches.
     • Learning Descriptive Data Mining tasks to derive patterns, such as associative analysis, clustering and anomaly detection.
     • Recognizing the areas for intelligent agents in solving problems in combination with Data Mining strategies.
     • Developing a general awareness of the structure of Data warehousing.
     • Understanding what on-line analytical processing is and how it can be applied to analyze data.

     Intended Learning Outcomes:
     Successful completion of this course should lead to the following learning outcomes:

     A- Knowledge and Understanding (students should):
     A1. Identify the different application areas of Data Mining.
     A2. Understand the fundamentals and algorithms of Data Mining, such as Predictive Modelling and Descriptive Methods, necessary for Data Mining solutions.
  2. A3. Develop an overview of the concepts involved in Data Mining, including Types of Data, Classification, Association Analysis, Clustering and Anomaly Detection, and their advanced approaches such as Artificial Neural Networks and Expert Systems.

     B- Intellectual Skills (Student should be able to):
     B1. Analyze, compare and criticize the different Data Mining techniques.
     B2. Synthesize modified algorithms from existing ones.
     B3. Evaluate the basic concepts of Data Mining vs. advanced techniques, and Data Mining as a confluence of many disciplines (Statistics, AI, Machine Learning, Pattern Recognition, Database Technology, and parallel computing).

     C- Subject Specific Skills (Student should be able to):
     C1. Solve a problem requiring an appropriate Data Mining strategy.
     C2. Learn the essentials of Data Mining based on some practical data analysis tools.
     C3. Write a report on a selected Data Mining area.

     D- Transferable Skills (Student should be able to):
     D1. Work in a group in order to design and implement solutions to several Data Mining problems.
     D2. Conduct research projects and present results.
     D3. Deploy communication skills.
     D4. Deploy proper report writing skills.

     Teaching and Learning Methods:

     Interactive lectures (ILOs: A1, A2, A3, C2)
     Lectures on major concepts and issues: interactive lectures with videos and PowerPoint slides are conducted, with the lecturer explaining and illustrating the concepts. Students will be invited to share their views and experience in applying the concepts.

     Group Projects and Presentations (ILOs: B1, B2, B3, C1, C2, C3, D1, D2, D3, D4)
     Students will work on a course project (2 to 3 students in a group). Starting from the second week of classes, each group will submit a short proposal of their project, including the names of the team members. Once the project is approved by the instructor, the group submits a more extended proposal which includes the role of each team member, a time plan, and the tools and applications that will be employed in the project. Each group will submit their project with a presentation at the end of the semester.

     Online search / research and short presentations (ILOs: C2, C3, D1, D3)
     Each student will be required to search the net for a new topic that relates to this course. A one-page summary of this topic is to be submitted along with a 10-minute presentation.

     Textbook Problems (ILOs: A1, A2, A3)
     Problems have been selected for in-class illustration of certain concepts and applications. Additional textbook problems have been assigned for students to practice and gain a better understanding of the concepts discussed. Homework assignments will be collected for grading.

     Outside-classroom activities (ILOs: B3, C1, D2, D4)
     Students are required to schedule meetings with their groups, and to document the results of such meetings.

     AI Lab (ILOs: C1, C2, C3)
     Students are required to visit the AI lab and to experiment with Prolog and the Expert System Shells available.
  3. Course Contents:

     Week | Topics | Topic Details | Reference (chapter) | Assessment
     1 | Introduction and Motivation | What is Data Mining? The foundations of Data Mining; Data Mining and Knowledge Discovery; The Origins of Data Mining; Application Areas in AI. | Chp. 1, Notes |
     2 | First View of Data Mining | What can Computer learn; Supervised Learning: Decision tree, Unsupervised Clustering, Expert System vs. Data Mining, Case Study | Chp. 1; Ref. 1, Chap. 1; Notes |
     2 | Types of Data | The Types of Data, Attributes and Measurement, Different Types of Attributes (Qualitative and Quantitative) | Chp. 2, Notes |
     3 | Data Sets | Types of Data Sets, general Characteristics of Data Sets, Record Data, Graph-based Data, Ordered Data (Sequential Data, Temporal, Spatial) | Chp. 2, Notes |
     4 | Data Quality | Data Quality, Data Collection errors; Noise, missing and inconsistent values, Redundancy. Data Preprocessing; Aggregation, Sampling, Dimension reduction, Feature Creation, Discretization and Binarization | Chp. 2, Notes |
     5 | Measures of Similarity | Similarity and Dissimilarity for simple attributes, Dissimilarities between Data Objects, Similarities between Data Objects | Chp. 2, Notes | TEST 1
     5 | Exploring Data | A general introduction to basic Statistics (Frequency, Measures of Location, Measures of Spread; Range). Visualization and Representations. | Chp. 3, Notes |
     7 | On-Line Analytical Analysis (OLAP) and Multidimensional Data Analysis | Representation of Multidimensional Data (Examples); Analyzing Multidimensional Data | Chp. 3, Notes |
     8 | The Data Warehouse | Data Modelling and normalization, The Relational Model, Data Warehouse Design, Structuring Data Warehouse | Chp. 3; Ref. 1, Chap. 6; Notes |
     9 | Predictive Modelling | Classification, Basic Concepts, Decision Tree Induction, Building a Decision Tree | Chp. 4, Chap. 5; Ref. 1, Chap. 3; Notes |
     10 | Advanced Classification | Rule-Based Classifier, Bayesian Classifier | Chp. 5 | TEST 2
     11 | Advanced Classification | Nearest-Neighbour Classifiers, Artificial Neural Networks | Chp. 8 |
     12 | Descriptive Data Mining | Association Analysis Basic Concepts; Association Rule Discovery, Frequent Itemset generation, Confidence and Support, Example | Chp. 6; Ref.1. 3; Notes |
     13 | Cluster Analysis | Basic Concepts, Types of Clustering, K-means, Example | Chp. 9; Ref.1 3; Notes |
     14 | Anomaly Detection | Basic Concepts, Causes of Anomalies, Statistical Approach, Detecting Outliers | Chp. 10 |
     15 | Final Exam | | | FINAL EXAM
  4. CONTINUAL COURSE QUALITY IMPROVEMENT

     The following measures are taken seriously to continuously improve the quality of the course:
     • Student Feedback: Using the University Student Evaluation and the IT faculty Special Evaluation Form to provide the instructor and department with feedback.
     • Peer Visitation: Feedback from faculty members with similar specialization.
     • Course Coordinator: Participates in course updates and monitors course progress.
     • Internal Examiner: Feedback pertaining to course outline, exams and projects, course objectives and ILOs.
     • External Examiner: Feedback pertaining to course outline, exams and projects, course objectives and ILOs.
     • ACM, AIS, and AITP Curriculum Guidelines.
     • MOH Guidelines for Standard Efficiency Exams.

     Assessment and Grade Distribution
     (Assessment; ILOs; Requirement for Grading / Due Date; Points; Total)

     I. Group Work (ILOs: B1, B2, B3, C1, C2, C3, D1, D2, D3, D4): 15%
        Project Proposal + written Report: 10%
        Presentation, Power Point Slides: 5%

     II. Individual Work (ILOs: A1, A2, A3, B1, B2, B3, C1, C2, C3): 85%
        Attendance, Participation, and Homework (Chapter Homework, Discussions, Short Presentations): 5%
        Quizzes: Unannounced short quizzes.
        First Exam: 15%
           Covers topics: Introduction, Types of Data, Data Pre-processing, Data Sets and Data Quality, and Measures of Similarity.
           5 to 7 Multiple Choice Questions worth 25% of the exam grade. Four to five Essay Questions worth 75% of the exam grade.
        Second Exam: 15%
           Covers topics: Exploring Data, On-Line Analytical Analysis (OLAP) and Multidimensional Data Analysis, The Data Warehouse, Basic Predictive Modelling, Rule-Based Classifier, Bayesian Classifier.
           5 to 7 Multiple Choice Questions worth 25% of the exam grade. Four to five Essay Questions worth 75% of the exam grade.
        A Comprehensive Final Examination: 50%
           Covers all topics.
           10 to 15 Multiple Choice Questions worth 25% of the exam grade. Five to six Essay Questions worth 75% of the exam grade.

     TOTAL: 100%

     * Make-up exams will be offered for valid reasons. They may differ from regular exams in content and format.

     References:
     1. Data Mining: A Tutorial-Based Primer, Richard J. Roiger and M. W. Geatz, Pearson Education, 2005, Low Price Edition
     2. Insight into Data Mining, Theory and Practice, K.P. Soman, S. Diwakar and V. Ajay, Prentice Hall of India, 2006
     3. Principles of Data Mining, David Hand, H. Mannila and P. Smyth, Prentice Hall of India, 2004
     4. Data Mining: Next Generation Challenges and Future Directions, H. Kargupta, A. Joshi, K. Sivakumar, and Y. Yesha, MIT Press, 2004
  5. 5. Lecturer’s Notes

     COURSE POLICIES
     • The University Regulations on academic dishonesty will be strictly enforced! Please check the University Statement on plagiarism.
     • Make-up Exams: Only students with valid excuses are allowed to take make-up exams. All excuses must be signed by the Faculty Dean. The student is responsible for arranging an exam date with his/her instructor before the next regular exam takes place.
     • All assignments and class work must be submitted by the specified due date. No late work will be accepted.
     • The attendance policy will be strictly enforced (refer to the Student's Handbook).
     • No make-up for quizzes under any circumstances.

     Last updated by Dr. Bassam Haddad, 29-9-2009

     Approved by (Name; Date; Signature):
     Course Coordinator: Dr. Bassam Haddad, 29-9-2009
     Curriculum Committee:
     Quality Assurance Committee:
     Faculty Dean: