1.
What is Machine Learning?
Types of Problems and Situations
Content of the Course
Documentation
Statistical Machine Learning from Data
Introduction to Machine Learning
Samy Bengio
IDIAP Research Institute, Martigny, Switzerland, and
Ecole Polytechnique Fédérale de Lausanne (EPFL), Switzerland
bengio@idiap.ch
http://www.idiap.ch/~bengio
November 30, 2005
Samy Bengio Statistical Machine Learning from Data 1
2.
1 What is Machine Learning?
2 Types of Problems and Situations
3 Content of the Course
4 Documentation
4.
What is Machine Learning? (Graphical View)
5.
What is Machine Learning?
Learning is an essential human property
Learning means changing in order to do better (according to a given criterion) when a similar situation arises
Learning IS NOT learning by heart
Any computer can learn by heart; the difficulty is to generalize a behavior to a novel situation
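As a minimal sketch of the distinction above (not part of the original slides), the snippet below contrasts a lookup table, which learns by heart, with a fitted line, which generalizes. The toy rule y = 2x, the data, and all function names are assumptions chosen for illustration.

```python
# Toy task (an assumption for illustration): the hidden rule is y = 2 * x.
train = {1: 2, 2: 4, 3: 6, 4: 8}


def rote_learner(x):
    """Learning by heart: a lookup table that only knows the training inputs."""
    return train.get(x)  # None for any input not seen during training


def fit_slope(data):
    """Least-squares slope for a line through the origin, y = w * x."""
    numerator = sum(x * y for x, y in data.items())
    denominator = sum(x * x for x in data)
    return numerator / denominator


w = fit_slope(train)


def generalizing_learner(x):
    """Applies the fitted relation, so it also answers for unseen inputs."""
    return w * x


print(rote_learner(3), generalizing_learner(3))    # an input seen in training
print(rote_learner(10), generalizing_learner(10))  # a novel input
```

The lookup table has nothing to say about the novel input, while the fitted line still produces a prediction. Whether that prediction is good depends on how well the chosen family of relations matches the problem.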
6.
Why is Learning Difficult?
Given a finite amount of training data, you have to derive a relation for an infinite domain
In fact, there is an infinite number of such relations
How should we draw the relation?
7.
Why is Learning Difficult? (2)
Which relation is the most appropriate?
8.
Why is Learning Difficult? (3)
... the hidden test points...
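To make the point about infinitely many relations concrete, here is a small sketch (an illustration, not taken from the slides). It builds a hypothetical family of cubics that all pass exactly through the same three training points, yet disagree at a hidden test point. The training points, the parameter `c`, and the test input 3.0 are all assumptions.

```python
# Three training points (an assumption for illustration).
train = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]


def make_relation(c):
    """Return f(x) = x + c * (x - 0)(x - 1)(x - 2).

    The product term vanishes at every training input, so each value of c
    gives a different relation with zero training error.
    """
    def f(x):
        bump = 1.0
        for xi, _ in train:
            bump *= (x - xi)
        return x + c * bump
    return f


# Three members of the family; any real c would fit the data equally well.
candidates = {c: make_relation(c) for c in (0.0, 1.0, -2.5)}

for c, f in candidates.items():
    train_error = sum((f(x) - y) ** 2 for x, y in train)
    print(f"c={c:+.1f}  training error={train_error:.1f}  f(3.0)={f(3.0):+.1f}")
```

Because `c` can be any real number, the training set alone cannot single out one relation. Only the hidden test points reveal which candidates generalize.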
9.
Occam’s Razor Principle
William of Occam: monk who lived in the 14th century
Principle of Parsimony:
One should not increase, beyond what is necessary, the number of entities required to explain anything
When many solutions are available for a given problem, we should select the simplest one
But what do we mean by simple?
We will use prior knowledge of the problem at hand to define what a simple solution is
Example of a prior: smoothness
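One possible way to turn the smoothness prior into a selection rule is sketched below. This is an assumption made for illustration, not the course's definition of simplicity. Several candidate relations fit the same three training points exactly, and the one with the lowest roughness is selected. Roughness is measured here as the sum of squared second finite differences on a grid; the candidate family, the grid, and the function names are hypothetical.

```python
def make_relation(c):
    """Cubic through (0, 0), (1, 1), (2, 2) for any c (hypothetical family)."""
    return lambda x: x + c * x * (x - 1.0) * (x - 2.0)


def roughness(f, lo=0.0, hi=3.0, steps=30):
    """Sum of squared second finite differences of f on a regular grid.

    A straight line scores (numerically) zero; wigglier curves score higher.
    """
    h = (hi - lo) / steps
    xs = [lo + i * h for i in range(steps + 1)]
    total = 0.0
    for i in range(1, steps):
        second_diff = (f(xs[i - 1]) - 2.0 * f(xs[i]) + f(xs[i + 1])) / (h * h)
        total += second_diff ** 2
    return total


# All candidates have zero training error, so the prior alone decides.
candidates = [-2.5, 0.0, 1.0]
simplest = min(candidates, key=lambda c: roughness(make_relation(c)))
print("smoothest candidate: c =", simplest)
```

A different prior would rank the same candidates differently, which is why the definition of "simple" has to come from knowledge of the problem.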
10.
Learning as a Search Problem
(Figure labels: initial solution; set of solutions chosen a priori; optimal solution; set of solutions compatible with training set)
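The search view can be sketched in a few lines. The slide does not say which search procedure is meant, so plain gradient descent is used here as an assumption. The set of solutions chosen a priori is taken to be the lines y = w * x, the initial solution is w = 0, and the search moves toward the value of w with the lowest training error. The data, learning rate, and step count are illustrative choices.

```python
# Training set (an assumption for illustration).
train = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2)]


def training_error(w):
    """Mean squared error of the candidate solution y = w * x."""
    return sum((w * x - y) ** 2 for x, y in train) / len(train)


def gradient(w):
    """Derivative of the training error with respect to w."""
    return sum(2.0 * x * (w * x - y) for x, y in train) / len(train)


w = 0.0               # initial solution
learning_rate = 0.05  # step size, chosen by hand for this toy problem
for step in range(200):
    w -= learning_rate * gradient(w)  # move to a nearby, better solution

print("solution found:", w, "training error:", training_error(w))
```

In this toy case the search reaches the least-squares slope. With a richer set of solutions, the search may stop at a solution that is only locally optimal.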
14.
Types of Problems
There are 3 kinds of problems:
regression, classification, density estimation
(Figure: density P(X) plotted against X)
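The three kinds of problems can each be illustrated with a tiny one-dimensional model. The sketch below is an illustration under stated assumptions, not the course's own code: a least-squares line for regression, a nearest-class-mean rule for classification, and a Gaussian fitted by maximum likelihood for density estimation. All data and function names are hypothetical.

```python
import math


def fit_line(xs, ys):
    """Regression: predict a real value y from x with a least-squares line."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx


def fit_nearest_mean(xs, labels):
    """Classification: assign x to the class whose mean is closest."""
    means = {}
    for label in set(labels):
        members = [x for x, lab in zip(xs, labels) if lab == label]
        means[label] = sum(members) / len(members)
    return lambda x: min(means, key=lambda label: abs(x - means[label]))


def fit_gaussian(xs):
    """Density estimation: model P(X) with a maximum-likelihood Gaussian."""
    n = len(xs)
    mu = sum(xs) / n
    var = sum((x - mu) ** 2 for x in xs) / n
    return lambda x: (math.exp(-(x - mu) ** 2 / (2.0 * var))
                      / math.sqrt(2.0 * math.pi * var))


slope, intercept = fit_line([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
classify = fit_nearest_mean([0.1, 0.4, 2.9, 3.3], ["a", "a", "b", "b"])
density = fit_gaussian([1.0, 2.0, 3.0])

print(slope, intercept)              # regression output: a real-valued relation
print(classify(1.0), classify(2.5))  # classification output: a class label
print(density(2.0), density(0.0))    # density output: an estimate of P(X)
```

The outputs differ in kind: a real number, a discrete label, and a density value for the input itself.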
15.
Types of Learning
Supervised learning:
The training data contains the desired behavior (desired class, outcome, etc.)
Reinforcement learning:
The training data contains partial targets (for instance, simply whether the machine did well or not)
Unsupervised learning:
The training data is raw: no class or target is given
There is often a hidden goal in that task (compression, maximum likelihood, etc.)
16.
Applications
Vision Processing
Face detection/verification
Handwriting recognition
Speech Processing
Phoneme/Word/Sentence/Person recognition
Others
Indexing: Google, text mining, information retrieval
Finance: asset prediction, portfolio and risk management
Telecom: traffic prediction
Data mining: make use of huge datasets kept by large corporations
Games: Backgammon, Go
Control: robots
... and plenty of others of course!
18.
Content of the Course
Theoretical Issues
What are the theoretical foundations for statistical learning?
How can we measure the expected performance of a model?
Modeling Issues
Models specialized for classification, regression, distributions, sequences, images, etc.
For each model, we need to devise a training algorithm
Others
Other practical issues, such as feature selection, parameter sharing, etc.
Laboratories
About one third of the course will consist of practical laboratories, using the Python programming language
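The question of how to measure the expected performance of a model, raised under Theoretical Issues above, can be illustrated with a held-out test set. This is a minimal sketch of one common approach, assuming synthetic data generated from y = 3x plus Gaussian noise. The split sizes, the random seed, and the function names are illustrative, and the course may well use other estimators.

```python
import random

rng = random.Random(0)  # fixed seed so the run is repeatable

# Synthetic data (an assumption): y = 3 * x plus small Gaussian noise.
inputs = [i / 10.0 for i in range(50)]
data = [(x, 3.0 * x + rng.gauss(0.0, 0.1)) for x in inputs]

# Hold out part of the data; it plays the role of unseen situations.
rng.shuffle(data)
train, test = data[:40], data[40:]


def fit_slope(pairs):
    """Least-squares slope for a line through the origin, y = w * x."""
    return sum(x * y for x, y in pairs) / sum(x * x for x, _ in pairs)


def mean_squared_error(w, pairs):
    """Average squared prediction error of y = w * x on the given pairs."""
    return sum((w * x - y) ** 2 for x, y in pairs) / len(pairs)


w = fit_slope(train)  # the model only ever sees the training part
print("training error:", mean_squared_error(w, train))
print("held-out error:", mean_squared_error(w, test))
```

The held-out error is the more honest estimate of expected performance, because the model was never fitted to those points.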
20.
Journals and Conferences
Journals:
Journal of Machine Learning Research
Neural Computation
IEEE Transactions on Neural Networks
IEEE Transactions on Pattern Analysis and Machine
Intelligence
Conferences:
NIPS: Neural Information Processing Systems
COLT: Computational Learning Theory
ICML: International Conference on Machine Learning
21.
Books and Lecture Notes
Books:
C. Bishop. Neural Networks for Pattern Recognition, 1995.
V. Vapnik. The Nature of Statistical Learning Theory, 1995.
T. Hastie, R. Tibshirani, J. Friedman. The Elements of Statistical Learning, 2001.
B. Schölkopf, A. J. Smola. Learning with Kernels, 2002.
Other lecture notes: (some are in French...)
Bengio, Y.: http://www.iro.umontreal.ca/~pift6266/A03/
Kegl, B.: http://www.iro.umontreal.ca/~kegl/ift6266/
Jordan, M.: http://www.cs.berkeley.edu/~jordan/courses/281A-fall04/
LeCun, Y.: http://www.cs.nyu.edu/~yann/2005f-G22-2565-001/
22.
Electronic Resources
Search engines:
NIPS online: http://nips.djvuzone.org
Citeseer: http://citeseer.ist.psu.edu/
Google Scholar: http://scholar.google.com/
Machine learning libraries:
Torch: http://www.Torch.ch
Lush: http://lush.sf.net