This document provides an overview and schedule for a Machine Learning course (LIX004M5). Over eight weeks (September 1 - October 24) it covers a range of machine learning techniques, including decision trees, instance-based learning, Bayesian learning, and reinforcement learning. Students complete 3 lab assignments and a final project; the labs use the Weka machine learning software and focus on evaluation, decision trees, and classification. The course aims to introduce machine learning approaches and their applications while providing hands-on experience through practical assignments.
1. General Info

Machine Learning, LIX004M5: Overview and Introduction
Jörg Tiedemann (tiedeman@let.rug.nl)
Informatiekunde, Rijksuniversiteit Groningen

• instructor: Jörg Tiedemann (j.tiedemann@rug.nl), Harmoniegebouw, room 1311-429
• lab assistant: Çağrı Çöltekin (c.coltekin@rug.nl)
• prerequisites: open to students in Computer Science, Artificial Intelligence and Information Science, 2nd-year students or higher
• background: programming ability, elementary statistics
• schedule: September 1 - October 24
  • lectures: Mondays 9:15-11
  • labs: Fridays 9-12 (2 groups?)
• 5 ECTS

General Info (cont'd)

• Website: http://www.let.rug.nl/~tiedeman/ml08
• Examination: 3 obligatory lab assignments; present and report final project (50%); written exam (50%)
• Exam: Friday, October 24, 9-12 (AZERN)
• Literature: Tom Mitchell, Machine Learning. New York: McGraw-Hill, 1997; additional on-line literature (links available from the course website)
Purpose of this course

• Introduction to machine learning techniques
• Discussion of several machine learning approaches
• Examples and applications in various fields
• Practical assignments
  • using Weka - a machine learning package implemented in Java
  • some theoretical questions
  • independent group work on final project

Preliminary Program - Lectures

We will only manage to look at a selection of the topics in the book:

1. Organization, Introduction (Ch.1, Ch.2)
2. Decision Trees (Ch.3)
3. Instance-Based Learning (Ch.8)
4. Bayesian Learning & EM (Ch.6)
5. Rule Induction & Reinforcement Learning (Ch.13)
6. Sequential Data & Markov Models (Ch.9)
7. Presentations of Final Projects

Preliminary Program - Labs

6 lab sessions, 3 short lab reports, 1 final project

• evaluation in ML (Ch.5), introduction to topics for the final project, getting started with WEKA (report 1)
• select & start with final project
• decision trees & instance-based learning (report 2)
• work on final project
• classification & model comparison (report 3)
• work on final project
What is Machine Learning?

Machine Learning is
• the study of algorithms that
• improve their performance
• at some task
• with experience

... just like a human being ... (?)

What is all the hype about ML?

"Every time I fire a linguist the performance of the recognizer goes up"

(probably) said by Fred Jelinek (IBM speech group) in the 80s, quoted by, e.g., Jurafsky and Martin, Speech and Language Processing.

Why machine learning?

data mining: pattern recognition, knowledge discovery, use historical data to improve future decisions, prediction (classification, regression), data description (clustering, summarization, visualization)

complex applications: we cannot program them by hand, (efficient) processing of complex signals

self-customizing programs: automatic adjustments according to usage, dynamic systems
2. Typical Data Mining Task

Given:
• 9714 patient records, each describing a pregnancy and birth
• Each patient record contains 215 features

Learn to predict:
• Classes of future patients at high risk for Emergency Cesarean Section

Pattern Recognition

Object detection

Classification

Personal home page? Company website? Educational site?
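A task like the Cesarean-risk prediction above is a classification problem over feature vectors. As a minimal sketch, here is a pure-Python nearest-neighbour classifier (one of the instance-based methods covered later in the course); the toy records and feature names below are invented for illustration, while the real data set has 9714 records with 215 features each.

```python
import math

def nearest_neighbour(train, query):
    """Predict the class of `query` as the class of the closest training record."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    features, label = min(train, key=lambda rec: dist(rec[0], query))
    return label

# invented (features, class) pairs: (age, previous_births, weeks_pregnant) -> risk
train = [
    ((25, 0, 38), "low-risk"),
    ((41, 3, 33), "high-risk"),
    ((30, 1, 39), "low-risk"),
    ((39, 2, 31), "high-risk"),
]

print(nearest_neighbour(train, (40, 2, 32)))  # closest to a high-risk record
```

The "learning" here is trivial (memorize the data); the generalization happens at prediction time via the distance function.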
Complex applications

Robots playing football in RoboCup:
colour classification (DT, NN), player positioning (RL), behaviors (RL, GA), team strategy adaptation (mixture of experts), ball kicking (GA), ...

http://www.robocup.org/
http://sserver.sourceforge.net/SIG-learn/

Automatic customization

• spam filtering, sorting data

Machine learning is growing

many more applications:
• speech recognition
• machine translation
• robot control
• financial data analysis and market predictions
• handwriting recognition
• data clustering and visualization
• pattern recognition in genetics (e.g. DNA sequences)
• ...
Questions to ask

Learning = improve with experience at some task

• What experience?
• What exactly should be learned?
• How shall it be represented?
• What specific algorithm to learn it?

Goal: handle unseen data correctly according to the task (use your knowledge inferred from experience!)

What experience?

• What do we know already about the task and possible solutions? (prior knowledge)
• What kind of data do we have available?
• How much data do we need and how clean does it have to be? (training examples)
• What are the discriminative features? How are they connected with each other (dependencies)?
• Is a "teacher" available (→ supervised learning) or not (→ unsupervised learning)? How expensive is labeling?

What exactly should be learned?

Outcome of the target function:
• boolean (→ concept learning)
• discrete values (→ classification)
• real values (→ regression)

many machine learning tasks are classification tasks ...
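The three outcome types above can be made concrete with toy target functions; everything in this sketch (the functions, their rules, and the example document) is invented purely to illustrate the distinction.

```python
def is_spam(words):            # boolean outcome -> concept learning
    return "viagra" in words

def language(words):           # discrete outcome -> classification
    return "nl" if "de" in words and "het" in words else "en"

def reading_time(words):       # real-valued outcome -> regression
    return len(words) / 3.5    # assumed reading rate: 3.5 words per second

doc = ["het", "boek", "ligt", "op", "de", "tafel"]
print(is_spam(doc), language(doc), round(reading_time(doc), 2))
```

In a real system these functions are unknown; the learner must approximate them from labeled examples.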
3. How shall it be represented?

Model selection:
• symbolic representation (e.g. with rules, trees)
• subsymbolic representation (neural networks, SVMs)

Do we want to restrict the space of possible solutions? (→ restriction bias)
Do we want to prefer certain models? (→ preference bias)

What algorithm to learn it?

Learning means approximating the real (unknown) target function according to our experience (e.g. observed training examples)

→ Learning = search for a "good" hypothesis/model

Inductive learning as search

Inductive learning: infer a model from training data

example: concept learning
• a set of instances X with attributes a1..an
• Hypotheses H: set of functions h : X → {0, 1}
• Representation: e.g. conjunction of constraints
• Training examples D: a sequence of positive and negative examples of the unknown target function c : X → {0, 1}

what we want: a hypothesis h such that h(x) = c(x) for all x ∈ X
what we can observe: a hypothesis ĥ such that ĥ(x) = c(x) for all x ∈ D
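The "conjunction of constraints" representation can be sketched directly: each hypothesis constrains every attribute either to one specific value or to "?" (any value is acceptable), in the style of Mitchell's concept-learning chapter. The attributes and values below are invented for illustration.

```python
def h(hypothesis, instance):
    """Evaluate a conjunction-of-constraints hypothesis on an instance.

    Returns 1 (positive) iff every constraint is "?" or matches the
    corresponding attribute value, and 0 otherwise.
    """
    return int(all(c == "?" or c == a for c, a in zip(hypothesis, instance)))

# invented instances with attributes (sky, temperature, wind)
x1 = ("sunny", "warm", "strong")
x2 = ("rainy", "cold", "strong")

hyp = ("sunny", "?", "?")      # "positive iff the sky is sunny"
print(h(hyp, x1), h(hyp, x2))  # 1 0
```

Note how the representation itself limits which functions X → {0, 1} can be expressed at all; this is the restriction bias discussed below.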
Instances & Hypotheses

Which one is better?

Consistent(h, D) ≡ (∀⟨x, c(x)⟩ ∈ D) h(x) = c(x)

version space: VS_{H,D} ≡ {h ∈ H | Consistent(h, D)}

Inductive bias

• corresponds to prior knowledge about data and task (a priori assumptions)
• depends on learning algorithm and model representation

Restriction bias:
• hypothesis space is restricted (also: language bias)

Preference bias:
• prefer certain hypotheses (usually more general ones)

Why do we need inductive bias?
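The Consistent(h, D) and VS_{H,D} definitions above can be computed directly for a small hypothesis space. This sketch enumerates all conjunctions of constraints over two binary attributes and keeps exactly the hypotheses consistent with the training data; the attributes and examples are invented for illustration.

```python
from itertools import product

def h(hypothesis, instance):
    """Conjunction of constraints: 1 iff every constraint is "?" or matches."""
    return int(all(c == "?" or c == a for c, a in zip(hypothesis, instance)))

def consistent(hypothesis, D):
    """Consistent(h, D): h(x) = c(x) for every <x, c(x)> pair in D."""
    return all(h(hypothesis, x) == cx for x, cx in D)

# H: each of the two attributes is constrained to 0, 1, or "?" (don't care)
H = list(product((0, 1, "?"), repeat=2))

# D: <instance, label> pairs drawn from an unknown target concept
D = [((1, 0), 1), ((1, 1), 1), ((0, 1), 0)]

# the version space: all hypotheses in H consistent with D
VS = [hyp for hyp in H if consistent(hyp, D)]
print(VS)  # [(1, '?')]: "positive iff the first attribute is 1"
```

With more data the version space shrinks; here the three examples already pin it down to a single hypothesis.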
Learning Models

Learning means approximating the real (unknown) target function according to our experience (e.g. observed training examples)

→ Learning = search for a "good" hypothesis/model

Which one is better? For example:
• decision trees & information gain
• Bayesian techniques & maximum likelihood estimations
• least mean square algorithm, gradient search
• expectation maximization
• maximum entropy
• minimum description length
• reinforcement learning
• genetic algorithms, simulated annealing

What would be a model without inductive bias?

The roots of ML

Artificial intelligence: use prior knowledge and training data to guide learning as a search problem
Bayesian methods: probabilistic classifiers, probabilistic reasoning
Statistics: data description, estimation of probability distributions, evaluation, confidence
Information theory: entropy, information content, code optimisation and the minimum description length principle
Computational complexity theory: trade-off between model (learning) complexity and performance
Psychology and neurobiology: response improvement with practice, ideas that led to artificial neural networks
Philosophy: Occam's razor (simple is best)
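The first entry in the list of learning models, decision trees & information gain, rests on the information-theoretic notion of entropy mentioned above. A minimal sketch, using an invented toy data set:

```python
import math

def entropy(labels):
    """Shannon entropy of a list of class labels, in bits."""
    n = len(labels)
    probs = (labels.count(v) / n for v in set(labels))
    return -sum(p * math.log2(p) for p in probs)

def information_gain(examples, feature):
    """Entropy reduction from partitioning (features, label) pairs on `feature`."""
    labels = [lab for _, lab in examples]
    gain = entropy(labels)
    for value in set(f[feature] for f, _ in examples):
        subset = [lab for f, lab in examples if f[feature] == value]
        gain -= len(subset) / len(examples) * entropy(subset)
    return gain

# invented examples: does the "windy" feature predict the "play" label?
examples = [({"windy": True}, "no"), ({"windy": True}, "no"),
            ({"windy": False}, "yes"), ({"windy": False}, "yes")]
print(information_gain(examples, "windy"))  # 1.0: the split is perfect
```

A decision-tree learner greedily picks the feature with the highest information gain at each node; this choice is itself a preference bias.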
4. Take-home messages

• ML = algorithms that learn from experience
• generalize instead of memorize
• different types of inductive bias
• ML has many fascinating sub-tasks

Enjoy working with learning systems!

What's next?

This week: Read ch. 1, ch. 2 & ch. 5 of Mitchell and look at the exercises
Lab on Friday: Evaluation in ML, introduction of final projects, small exercises
Next week: Decision trees, ch. 3, lab session on Decision Trees & IBL