Course Syllabus


Published on

  • Be the first to comment

  • Be the first to like this

No Downloads
Total views
On SlideShare
From Embeds
Number of Embeds
Embeds 0
No embeds

No notes for slide

Course Syllabus

  1. 1. ICE0403: Data & Text Mining Fall 2007 Instructor Dr. Ji-Ae Shin Email ( The best to reach me is via emails) Office F612 (x 6172) Objective of the Course To study machine learning techniques which are commonly used to extract information/patterns in (textual) data, such as Inductive Logic Programming, Clustering, Statistical Logic Networks, and Decision Trees. Prerequisites Algorithms (A-); Artificial Intelligence (A-); Discrete Mathematics I; Probability & Statistics; Scripting Languages References 1. (Textbook) I. Witten & E. Frank, Data Mining (2nd ed.), Elsevier, 2005 2. (Textbook) T. Mitchell, Machine Learning, McGraw-Hill, 1997 3. D. J. Hand, H. Mannila and P. Smyth. Principles of Data Mining (Adaptive Computation and Machine Learning), MIT Press, 2001 4. T. Hastie, R. Tibshirani, & J. H. Friedman, The Elements of Statistical Learning : Data Mining, Inference, and Prediction Springer Verlag, 2001. 5. R. O. Duda, P. E. Hart, & D. G. Stork, Pattern Classification Wiley- Interscience, 2000. 6. P. Langley, Elements of Machine Learning, Morgan Kaufman Publishers, San Fancisco, CA, 1995. 7. S. M. Weiss & C. A. Kulikowski, Computer Systems that Learn, Morgan Kaufman Publishers, San Fancisco, CA, 1991. 8. W. Shavlik & T. G. Dietterich (Eds.), Readings in Machine Learning, Morgan Kaufman Publishers, San Fancisco, CA, 1990. Grading Policy
  2. 2. Homework (3: Theory & Practice) 36% Term Project 24% Class Participation 10% Final Exam 30% Topics in Consideration 1. Introduction Definition of learning systems. Goals and applications of machine learning. Aspects of developing a learning system: training data, concept representation, function approximation. 2. Inductive Classification The concept learning task. Concept learning as search through a hypothesis space. General-to-specific ordering of hypotheses. Finding maximally specific hypotheses. Version spaces and the candidate elimination algorithm. Learning conjunctive concepts. The importance of inductive bias. 3. Decision Tree Learning Representing concepts as decision trees. Recursive induction of decision trees. Picking the best splitting attribute: entropy and information gain. Searching for simple trees and computational complexity. Occam's razor. Overfitting, noisy data, and pruning. 4. Experimental Evaluation of Learning Algorithms Measuring the accuracy of learned hypotheses. Comparing learning algorithms: cross-validation, learning curves, and statistical hypothesis testing. 5. Rule Learning: Propositional and First-Order Translating decision trees into rules. Heuristic rule induction using separate and conquer and information gain. First-order Horn-clause induction (Inductive Logic Programming) and Foil. Learning recursive rules. Inverse resolution, Golem, and Progol. 6. Artificial Neural Networks (if time permits) Neurons and biological motivation. Linear threshold units. Perceptrons:
  3. 3. representational limitation and gradient descent training. Multilayer networks and backpropagation. Hidden layers and constructing intermediate, distributed representations. Overfitting, learning network structure, recurrent networks. 7. Support Vector Machines Maximum margin linear separators. Quadractic programming solution to finding maximum margin separators. Kernels for learning non-linear functions. 8. Bayesian Learning Probability theory and Bayes rule. Naive Bayes learning algorithm. Parameter smoothing. Bayes nets and Markov nets for representing dependencies. 9. Instance-Based Learning Constructing explicit generalizations versus comparing to past specific examples. k-Nearest-neighbor algorithm. Case-based learning. 10. Text Classification (if time permits) Bag of words representation. Vector space model and cosine similarity. Relevance feedback and Rocchio algorithm. Versions of nearest neighbor and Naive Bayes for text. 11. Clustering and Unsupervised Learning Learning from unclassified data. Clustering. Hierarchical Aglomerative Clustering. k-means partitional clustering. Expectation maximization (EM) for soft clustering. Semi-supervised learning with EM using labeled and unlabled data. 12. Language Learning (if time permits) Classification problems in language: word-sense disambiguation, sequence labeling. Hidden Markov models (HMM's). Veterbi algorithm for determining most-probable state sequences. Forward-backward EM algorithm for training the parameters of HMM's. Use of HMM's for speech recognition, part-of-speech tagging, and information extraction. Conditional random fields (CRF's). Probabilistic context-free grammars (PCFG). Parsing and learning with PCFGs. Lexicalized PCFGs.
  4. 4. 13. Using Prior Knowledge in Learning (if time permits) Chapter 11, Chapter 12. Explanation-based learning. Learning in planning and problem-solving. Knowledge-based learning and theory refinement. Transfer learning.