SlideShare a Scribd company logo
1 of 41
Download to read offline
Machine Learning for Language
Technology
Lecture 2: Basic Concepts
Marina Santini
Department of Linguistics and Philology
Uppsala University, Uppsala, Sweden
Autumn 2014
Acknowledgement: Thanks to Prof. Joakim Nivre for course design and material
Outline
• Definition of Machine Learning
• Type of Machine Learning:
– Classification
– Regression
– Supervised Learning
– Unsupervised Learning
– Reinforcement Learning
• Supervised Learning:
– Supervised Classification
– Training set
– Hypothesis class
– Empirical error
– Margin
– Noise
– Inductive bias
– Generalization
– Model assessment
– Cross-Validation
– Classification in NLP
– Types of Classification
Lecture 2: Basic Concepts 2
What is Machine Learning
• Machine learning is programming computers to
optimize a performance criterion for some task
using example data or past experience
• Why learning?
– No known exact method – vision, speech recognition,
robotics, spam filters, etc.
– Exact method too expensive – statistical physics
– Task evolves over time – network routing
• Compare:
– No need to use machine learning for computing
payroll… we just need an algorithm
Lecture 2: Basic Concepts 3
Machine Learning – Data Mining –
Artificial Intelligence – Statistics
• Machine Learning: creation of a model that uses training data or
past experience
• Data Mining: application of learning methods to large datasets (ex.
physics, astronomy, biology, etc.)
– Text mining = machine learning applied to unstructured textual data
(ex. sentiment analyisis, social media monitoring, etc. Text Mining,
Wikipedia)
• Artificial intelligence: a model that can adapt to a changing
environment.
• Statistics: Machine learning uses the theory of statistics in building
mathematical models, because the core task is making inference from a
sample.
Lecture 2: Basic Concepts 4
The bio-cognitive analogy
• Imagine that a learning algorithm as a single neuron.
• This neuron receives input from other neurons, one
for each input feature.
• The strength of these inputs are the feature values.
• Each input has a weight and the neuron simply sums
up all the weighted inputs.
• Based on this sum, the neuron decides whether to
“fire” or not. Firing is interpreted as being a positive
example and not firing is interpreted as being a
negative example.
Lecture 2: Basic Concepts 5
Elements of Machine Learning
1. Generalization:
– Generalize from specific examples
– Based on statistical inference
2. Data:
– Training data: specific examples to learn from
– Test data: (new) specific examples to assess performance
3. Models:
– Theoretical assumptions about the task/domain
– Parameters that can be inferred from data
4. Algorithms:
– Learning algorithm: infer model (parameters) from data
– Inference algorithm: infer predictions from model
Lecture 2: Basic Concepts 6
Types of Machine Learning
• Association
• Supervised Learning
– Classification
– Regression
• Unsupervised Learning
• Reinforcement Learning
Lecture 2: Basic Concepts 7
Learning Associations
• Basket analysis:
P (Y | X ) probability that somebody who buys
X also buys Y where X and Y are
products/services
Example: P ( chips | beer ) = 0.7
Lecture 2: Basic Concepts 8
Classification
Lecture 2: Basic Concepts 9
• Example: Credit
scoring
• Differentiating
between low-risk and
high-risk customers
from their income and
savings
Discriminant: IF income > θ1 AND savings > θ2
THEN low-risk ELSE high-risk
Classification in NLP
• Binary classification:
– Spam filtering (spam vs. non-spam)
– Spelling error detection (error vs. non error)
• Multiclass classification:
– Text categorization (news, economy, culture, sport, ...)
– Named entity classification (person, location,
organization, ...)
• Structured prediction:
– Part-of-speech tagging (classes = tag sequences)
– Syntactic parsing (classes = parse trees)
Lecture 2: Basic Concepts 10
Regression
• Example:
Price of used car
• x : car attributes
y : price
y = g (x | q )
g ( ) model,
q parameters
Lecture 2: Basic Concepts
11
y = wx+w0
Uses of Supervised Learning
• Prediction of future cases:
– Use the rule to predict the output for future inputs
• Knowledge extraction:
– The rule is easy to understand
• Compression:
– The rule is simpler than the data it explains
• Outlier detection:
– Exceptions that are not covered by the rule, e.g., fraud
Lecture 2: Basic Concepts 12
Unsupervised Learning
• Finding regularities in data
• No mapping to outputs
• Clustering:
– Grouping similar instances
• Example applications:
– Customer segmentation in CRM
– Image compression: Color quantization
– NLP: Unsupervised text categorization
Lecture 2: Basic Concepts 13
Reinforcement Learning
• Learning a policy = sequence of
outputs/actions
• No supervised output but delayed reward
• Example applications:
– Game playing
– Robot in a maze
– NLP: Dialogue systems
Lecture 2: Basic Concepts 14
Supervised Classification
• Learning the class C of a “family car” from
examples
– Prediction: Is car x a family car?
– Knowledge extraction: What do people expect from a
family car?
• Output (labels):
Positive (+) and negative (–) examples
• Input representation (features):
x1: price, x2 : engine power
Lecture 2: Basic Concepts 15
Training set X

X  {xt
,rt
}t1
N

r 
1 if x is positive
0 if x is negative



Lecture 2: Basic Concepts
16

x 
x1
x2






Hypothesis class H

p1  price  p2 AND e1  engine power  e2 
Lecture 2: Basic Concepts 17
Empirical (training) error

h(x) 
1 if h says x is positive
0 if h says x is negative




E(h | X)  1 h xt
  rt
 t1
N

Lecture 2: Basic Concepts 18
Empirical error of h on X:
S, G, and the Version Space
Lecture 2: Basic Concepts 19
most specific hypothesis, S
most general hypothesis, G
h  H, between S and G is
consistent [E( h | X) = 0] and
make up the version space
Margin
• Choose h with largest margin
Lecture 2: Basic Concepts 20
Noise
Unwanted anomaly in data
• Imprecision in input attributes
• Errors in labeling data points
• Hidden attributes (relative to H)
Consequence:
• No h in H may be consistent!
Lecture 2: Basic Concepts 21
Noise and Model Complexity
Arguments for simpler model (Occam’s razor principle):
1. Easier to make predictions
2. Easier to train (fewer parameters)
3. Easier to understand
4. Generalizes better (if data is noisy)
Lecture 2: Basic Concepts 22
Inductive Bias
• Learning is an ill-posed problem
– Training data is never sufficient to find a unique
solution
– There are always infinitely many consistent
hypotheses
• We need an inductive bias:
– Assumptions that entail a unique h for a training set X
1. Hypothesis class H – axis-aligned rectangles
2. Learning algorithm – find consistent hypothesis with max-
margin
3. Hyperparameters – trade-off between training error and
margin
Lecture 2: Basic Concepts 23
Model Selection and Generalization
• Generalization – how well a model performs
on new data
– Overfitting: H more complex than C
– Underfitting: H less complex than C
Lecture 2: Basic Concepts 24
Triple Trade-Off
• Trade-off between three factors:
1. Complexity of H, c(H)
2. Training set size N
3. Generalization error E on new data
• Dependencies:
– As N, E
– As c(H), first E and then E
Lecture 2: Basic Concepts 25
Model Selection  Generalization Error
• To estimate generalization error, we need data unseen
during training:
• Given models (hypotheses) h1, ..., hk induced from the
training set X, we can use E(hi | V ) to select the
model hi with the smallest generalization error
Lecture 2: Basic Concepts 26

ˆE  E(h | V)  1 h xt
  rt
 t1
M


V  {xt
,rt
}t1
M
 X
Model Assessment
• To estimate the generalization error of the best
model hi, we need data unseen during training
and model selection
• Standard setup:
1. Training set X (50–80%)
2. Validation (development) set V (10–25%)
3. Test (publication) set T (10–25%)
• Note:
– Validation data can be added to training set before
testing
– Resampling methods can be used if data is limited
Lecture 2: Basic Concepts 27
Cross-Validation
121
31
2
2
2
32
1
1
1



K
K
K
K
K
K
XXXTXV
XXXTXV
XXXTXV




Lecture 2: Basic Concepts 28
• K-fold cross-validation: Divide X into X1, ..., XK
• Note:
– Generalization error estimated by means across K folds
– Training sets for different folds share K–2 parts
– Separate test set must be maintained for model
assessment
Bootstrapping
3680
1
1 1
.





 
e
N
N
Lecture 2: Basic Concepts 29
• Generate new training sets of size N from X by random
sampling with replacement
• Use original training set as validation set (V = X )
• Probability that we do not pick an instance after N
draws
that is, only 36.8% of instances are new!
Measuring Error
• Error rate = # of errors / # of instances = (FP+FN) / N
• Accuracy = # of correct / # of instances = (TP+TN) / N
• Recall = # of found positives / # of positives = TP / (TP+FN)
• Precision = # of found positives / # of found = TP / (TP+FP)
Lecture 2: Basic Concepts 30
Statistical Inference
• Interval estimation to quantify the precision of
our measurements
• Hypothesis testing to assess whether
differences between models are statistically
significant
Lecture 2: Basic Concepts 31

m 1.96

N
e01  e10 1 
2
e01  e10
~ X1
2
Supervised Learning – Summary
• Training data + learner  hypothesis
– Learner incorporates inductive bias
• Test data + hypothesis  estimated generalization
– Test data must be unseen
Lecture 2: Basic Concepts 32
Anatomy of a Supervised Learner
(Dimensions of a supervised machine learning algorithm)
• Model:
• Loss function:
• Optimization procedure:

g x |q 

E q | X  L rt
,g xt
|q  t

Lecture 2: Basic Concepts
33

q*  arg min
q
E q | X 
Supervised Classification: Extension
34
• Divide instances into (two or more) classes
– Instance (feature vector):
• Features may be categorical or numerical
– Class (label):
– Training data:
• Classification in Language Technology
– Spam filtering (spam vs. non-spam)
– Spelling error detection (error vs. no error)
– Text categorization (news, economy, culture, sport, ...)
– Named entity classification (person, location, organization, ...)

X  {xt
,yt
}t1
N

x  x1, , xm

y
Lec 2: Decision Trees - Nearest Neighbors
NLP: Classification (i)
NLP: Classification (ii)
NLP: Classification (iii)
Types of Classification (i)
Types of Classification (ii)
Reading
• Alpaydin (2010): chs 1-2; 19
• Daume’ III (2012): ch 4: only 4.5-4.6
Lecture 2: Basic Concepts 40
End of Lecture 2
Lecture 2: Basic Concepts 41

More Related Content

What's hot

Types of Machine Learning
Types of Machine LearningTypes of Machine Learning
Types of Machine LearningSamra Shahzadi
 
Regularization in deep learning
Regularization in deep learningRegularization in deep learning
Regularization in deep learningKien Le
 
Artificial Neural Networks - ANN
Artificial Neural Networks - ANNArtificial Neural Networks - ANN
Artificial Neural Networks - ANNMohamed Talaat
 
An Introduction to Supervised Machine Learning and Pattern Classification: Th...
An Introduction to Supervised Machine Learning and Pattern Classification: Th...An Introduction to Supervised Machine Learning and Pattern Classification: Th...
An Introduction to Supervised Machine Learning and Pattern Classification: Th...Sebastian Raschka
 
Machine learning Lecture 1
Machine learning Lecture 1Machine learning Lecture 1
Machine learning Lecture 1Srinivasan R
 
2.17Mb ppt
2.17Mb ppt2.17Mb ppt
2.17Mb pptbutest
 
Semantic net in AI
Semantic net in AISemantic net in AI
Semantic net in AIShahDhruv21
 
Recurrent Neural Networks. Part 1: Theory
Recurrent Neural Networks. Part 1: TheoryRecurrent Neural Networks. Part 1: Theory
Recurrent Neural Networks. Part 1: TheoryAndrii Gakhov
 
Machine Learning: Introduction to Neural Networks
Machine Learning: Introduction to Neural NetworksMachine Learning: Introduction to Neural Networks
Machine Learning: Introduction to Neural NetworksFrancesco Collova'
 
Deep Learning Frameworks slides
Deep Learning Frameworks slides Deep Learning Frameworks slides
Deep Learning Frameworks slides Sheamus McGovern
 
Deep neural networks
Deep neural networksDeep neural networks
Deep neural networksSi Haem
 
Machine Learning
Machine LearningMachine Learning
Machine LearningRahul Kumar
 
Machine Learning presentation.
Machine Learning presentation.Machine Learning presentation.
Machine Learning presentation.butest
 
Unit 1 - ML - Introduction to Machine Learning.pptx
Unit 1 - ML - Introduction to Machine Learning.pptxUnit 1 - ML - Introduction to Machine Learning.pptx
Unit 1 - ML - Introduction to Machine Learning.pptxjawad184956
 
Evaluating hypothesis
Evaluating  hypothesisEvaluating  hypothesis
Evaluating hypothesisswapnac12
 
Data Mining: Text and web mining
Data Mining: Text and web miningData Mining: Text and web mining
Data Mining: Text and web miningDataminingTools Inc
 

What's hot (20)

Types of Machine Learning
Types of Machine LearningTypes of Machine Learning
Types of Machine Learning
 
Regularization
RegularizationRegularization
Regularization
 
Regularization in deep learning
Regularization in deep learningRegularization in deep learning
Regularization in deep learning
 
Artificial Neural Networks - ANN
Artificial Neural Networks - ANNArtificial Neural Networks - ANN
Artificial Neural Networks - ANN
 
Information Extraction
Information ExtractionInformation Extraction
Information Extraction
 
An Introduction to Supervised Machine Learning and Pattern Classification: Th...
An Introduction to Supervised Machine Learning and Pattern Classification: Th...An Introduction to Supervised Machine Learning and Pattern Classification: Th...
An Introduction to Supervised Machine Learning and Pattern Classification: Th...
 
Machine learning Lecture 1
Machine learning Lecture 1Machine learning Lecture 1
Machine learning Lecture 1
 
2.17Mb ppt
2.17Mb ppt2.17Mb ppt
2.17Mb ppt
 
Gradient descent method
Gradient descent methodGradient descent method
Gradient descent method
 
Semantic net in AI
Semantic net in AISemantic net in AI
Semantic net in AI
 
Recurrent Neural Networks. Part 1: Theory
Recurrent Neural Networks. Part 1: TheoryRecurrent Neural Networks. Part 1: Theory
Recurrent Neural Networks. Part 1: Theory
 
Machine Learning: Introduction to Neural Networks
Machine Learning: Introduction to Neural NetworksMachine Learning: Introduction to Neural Networks
Machine Learning: Introduction to Neural Networks
 
Deep Learning Frameworks slides
Deep Learning Frameworks slides Deep Learning Frameworks slides
Deep Learning Frameworks slides
 
Deep neural networks
Deep neural networksDeep neural networks
Deep neural networks
 
Text Mining
Text MiningText Mining
Text Mining
 
Machine Learning
Machine LearningMachine Learning
Machine Learning
 
Machine Learning presentation.
Machine Learning presentation.Machine Learning presentation.
Machine Learning presentation.
 
Unit 1 - ML - Introduction to Machine Learning.pptx
Unit 1 - ML - Introduction to Machine Learning.pptxUnit 1 - ML - Introduction to Machine Learning.pptx
Unit 1 - ML - Introduction to Machine Learning.pptx
 
Evaluating hypothesis
Evaluating  hypothesisEvaluating  hypothesis
Evaluating hypothesis
 
Data Mining: Text and web mining
Data Mining: Text and web miningData Mining: Text and web mining
Data Mining: Text and web mining
 

Viewers also liked

EASA Part 66 Module 5.10 : Fibre Optic
EASA Part 66 Module 5.10 : Fibre OpticEASA Part 66 Module 5.10 : Fibre Optic
EASA Part 66 Module 5.10 : Fibre Opticsoulstalker
 
DETERMINATION OF THERMAL CONDUCTIVITY OF OIL
DETERMINATION OF THERMAL CONDUCTIVITY OF OILDETERMINATION OF THERMAL CONDUCTIVITY OF OIL
DETERMINATION OF THERMAL CONDUCTIVITY OF OILSamuel Alexander
 
Review journal Acoustic –essential requirement for public building”
Review journal Acoustic –essential requirement for public building”Review journal Acoustic –essential requirement for public building”
Review journal Acoustic –essential requirement for public building”Sayed Umam
 
Ruby & Nd YAG laser By Sukdeep Singh
Ruby & Nd YAG laser By Sukdeep SinghRuby & Nd YAG laser By Sukdeep Singh
Ruby & Nd YAG laser By Sukdeep SinghSukhdeep Bisht
 
B.Tech sem I Engineering Physics U-I Chapter 1-Optical fiber
B.Tech sem I Engineering Physics U-I Chapter 1-Optical fiber B.Tech sem I Engineering Physics U-I Chapter 1-Optical fiber
B.Tech sem I Engineering Physics U-I Chapter 1-Optical fiber Abhi Hirpara
 
determination of thermal conductivity
determination of thermal conductivitydetermination of thermal conductivity
determination of thermal conductivityAnjali Sudhakar
 
Geiger muller counter
Geiger muller counterGeiger muller counter
Geiger muller counterAvi Dhawal
 
Acoustic Problems in ADM Building
Acoustic Problems in ADM BuildingAcoustic Problems in ADM Building
Acoustic Problems in ADM Buildingyanying
 
Mechanical Design Presentation
Mechanical Design PresentationMechanical Design Presentation
Mechanical Design Presentationraytec
 
Transmission Of Heat
Transmission Of HeatTransmission Of Heat
Transmission Of Heatscotfuture
 
Basic Engineering Design (Part 2): Researching the Need
Basic Engineering Design (Part 2): Researching the NeedBasic Engineering Design (Part 2): Researching the Need
Basic Engineering Design (Part 2): Researching the NeedDenise Wilson
 
Thermal Conductivity
Thermal ConductivityThermal Conductivity
Thermal Conductivitytalatameen42
 
Concepts in engineering design
Concepts in engineering designConcepts in engineering design
Concepts in engineering designMITS Gwalior
 
physics b.tech. 1st sem fibre optics,u 4
physics b.tech. 1st sem fibre optics,u 4physics b.tech. 1st sem fibre optics,u 4
physics b.tech. 1st sem fibre optics,u 4Kumar
 
The Design Cycle @DISK Introduction
The Design Cycle @DISK IntroductionThe Design Cycle @DISK Introduction
The Design Cycle @DISK IntroductionSean Thompson 
 

Viewers also liked (20)

EASA Part 66 Module 5.10 : Fibre Optic
EASA Part 66 Module 5.10 : Fibre OpticEASA Part 66 Module 5.10 : Fibre Optic
EASA Part 66 Module 5.10 : Fibre Optic
 
ruby laser
ruby laser ruby laser
ruby laser
 
Solid state detector mamita
Solid state detector mamitaSolid state detector mamita
Solid state detector mamita
 
DETERMINATION OF THERMAL CONDUCTIVITY OF OIL
DETERMINATION OF THERMAL CONDUCTIVITY OF OILDETERMINATION OF THERMAL CONDUCTIVITY OF OIL
DETERMINATION OF THERMAL CONDUCTIVITY OF OIL
 
Review journal Acoustic –essential requirement for public building”
Review journal Acoustic –essential requirement for public building”Review journal Acoustic –essential requirement for public building”
Review journal Acoustic –essential requirement for public building”
 
Ruby & Nd YAG laser By Sukdeep Singh
Ruby & Nd YAG laser By Sukdeep SinghRuby & Nd YAG laser By Sukdeep Singh
Ruby & Nd YAG laser By Sukdeep Singh
 
B.Tech sem I Engineering Physics U-I Chapter 1-Optical fiber
B.Tech sem I Engineering Physics U-I Chapter 1-Optical fiber B.Tech sem I Engineering Physics U-I Chapter 1-Optical fiber
B.Tech sem I Engineering Physics U-I Chapter 1-Optical fiber
 
determination of thermal conductivity
determination of thermal conductivitydetermination of thermal conductivity
determination of thermal conductivity
 
Geiger muller counter
Geiger muller counterGeiger muller counter
Geiger muller counter
 
Acoustic Problems in ADM Building
Acoustic Problems in ADM BuildingAcoustic Problems in ADM Building
Acoustic Problems in ADM Building
 
Laser physics
Laser   physicsLaser   physics
Laser physics
 
Mechanical Design Presentation
Mechanical Design PresentationMechanical Design Presentation
Mechanical Design Presentation
 
Transmission Of Heat
Transmission Of HeatTransmission Of Heat
Transmission Of Heat
 
Transfer of heat
Transfer of heatTransfer of heat
Transfer of heat
 
Basic Engineering Design (Part 2): Researching the Need
Basic Engineering Design (Part 2): Researching the NeedBasic Engineering Design (Part 2): Researching the Need
Basic Engineering Design (Part 2): Researching the Need
 
Thermal Conductivity
Thermal ConductivityThermal Conductivity
Thermal Conductivity
 
Concepts in engineering design
Concepts in engineering designConcepts in engineering design
Concepts in engineering design
 
physics b.tech. 1st sem fibre optics,u 4
physics b.tech. 1st sem fibre optics,u 4physics b.tech. 1st sem fibre optics,u 4
physics b.tech. 1st sem fibre optics,u 4
 
GEIGER MULLER COUNTER
GEIGER MULLER COUNTER GEIGER MULLER COUNTER
GEIGER MULLER COUNTER
 
The Design Cycle @DISK Introduction
The Design Cycle @DISK IntroductionThe Design Cycle @DISK Introduction
The Design Cycle @DISK Introduction
 

Similar to Lecture 2 Basic Concepts in Machine Learning for Language Technology

Fundementals of Machine Learning and Deep Learning
Fundementals of Machine Learning and Deep Learning Fundementals of Machine Learning and Deep Learning
Fundementals of Machine Learning and Deep Learning ParrotAI
 
Machine Learning: Foundations Course Number 0368403401
Machine Learning: Foundations Course Number 0368403401Machine Learning: Foundations Course Number 0368403401
Machine Learning: Foundations Course Number 0368403401butest
 
Machine Learning: Foundations Course Number 0368403401
Machine Learning: Foundations Course Number 0368403401Machine Learning: Foundations Course Number 0368403401
Machine Learning: Foundations Course Number 0368403401butest
 
Machine Learning: Foundations Course Number 0368403401
Machine Learning: Foundations Course Number 0368403401Machine Learning: Foundations Course Number 0368403401
Machine Learning: Foundations Course Number 0368403401butest
 
Data Science and Machine Learning with Tensorflow
 Data Science and Machine Learning with Tensorflow Data Science and Machine Learning with Tensorflow
Data Science and Machine Learning with TensorflowShubham Sharma
 
Machine Learning part 2 - Introduction to Data Science
Machine Learning part 2 -  Introduction to Data Science Machine Learning part 2 -  Introduction to Data Science
Machine Learning part 2 - Introduction to Data Science Frank Kienle
 
Introduction to machine learning-2023-IT-AI and DS.pdf
Introduction to machine learning-2023-IT-AI and DS.pdfIntroduction to machine learning-2023-IT-AI and DS.pdf
Introduction to machine learning-2023-IT-AI and DS.pdfSisayNegash4
 
06-01 Machine Learning and Linear Regression.pptx
06-01 Machine Learning and Linear Regression.pptx06-01 Machine Learning and Linear Regression.pptx
06-01 Machine Learning and Linear Regression.pptxSaharA84
 
林守德/Practical Issues in Machine Learning
林守德/Practical Issues in Machine Learning林守德/Practical Issues in Machine Learning
林守德/Practical Issues in Machine Learning台灣資料科學年會
 
Introduction to Machine Learning Aristotelis Tsirigos
Introduction to Machine Learning Aristotelis Tsirigos Introduction to Machine Learning Aristotelis Tsirigos
Introduction to Machine Learning Aristotelis Tsirigos butest
 
Machine Learning Unit 1 Semester 3 MSc IT Part 2 Mumbai University
Machine Learning Unit 1 Semester 3  MSc IT Part 2 Mumbai UniversityMachine Learning Unit 1 Semester 3  MSc IT Part 2 Mumbai University
Machine Learning Unit 1 Semester 3 MSc IT Part 2 Mumbai UniversityMadhav Mishra
 
know Machine Learning Basic Concepts.pdf
know Machine Learning Basic Concepts.pdfknow Machine Learning Basic Concepts.pdf
know Machine Learning Basic Concepts.pdfhemangppatel
 
Machine Learning and Artificial Neural Networks.ppt
Machine Learning and Artificial Neural Networks.pptMachine Learning and Artificial Neural Networks.ppt
Machine Learning and Artificial Neural Networks.pptAnshika865276
 

Similar to Lecture 2 Basic Concepts in Machine Learning for Language Technology (20)

Fundementals of Machine Learning and Deep Learning
Fundementals of Machine Learning and Deep Learning Fundementals of Machine Learning and Deep Learning
Fundementals of Machine Learning and Deep Learning
 
Week 1.pdf
Week 1.pdfWeek 1.pdf
Week 1.pdf
 
Machine Learning: Foundations Course Number 0368403401
Machine Learning: Foundations Course Number 0368403401Machine Learning: Foundations Course Number 0368403401
Machine Learning: Foundations Course Number 0368403401
 
Machine Learning: Foundations Course Number 0368403401
Machine Learning: Foundations Course Number 0368403401Machine Learning: Foundations Course Number 0368403401
Machine Learning: Foundations Course Number 0368403401
 
Machine Learning: Foundations Course Number 0368403401
Machine Learning: Foundations Course Number 0368403401Machine Learning: Foundations Course Number 0368403401
Machine Learning: Foundations Course Number 0368403401
 
Data Science and Machine Learning with Tensorflow
 Data Science and Machine Learning with Tensorflow Data Science and Machine Learning with Tensorflow
Data Science and Machine Learning with Tensorflow
 
Machine Learning part 2 - Introduction to Data Science
Machine Learning part 2 -  Introduction to Data Science Machine Learning part 2 -  Introduction to Data Science
Machine Learning part 2 - Introduction to Data Science
 
Introduction to data mining and machine learning
Introduction to data mining and machine learningIntroduction to data mining and machine learning
Introduction to data mining and machine learning
 
MLE.pdf
MLE.pdfMLE.pdf
MLE.pdf
 
Introduction to machine learning-2023-IT-AI and DS.pdf
Introduction to machine learning-2023-IT-AI and DS.pdfIntroduction to machine learning-2023-IT-AI and DS.pdf
Introduction to machine learning-2023-IT-AI and DS.pdf
 
Launching into machine learning
Launching into machine learningLaunching into machine learning
Launching into machine learning
 
06-01 Machine Learning and Linear Regression.pptx
06-01 Machine Learning and Linear Regression.pptx06-01 Machine Learning and Linear Regression.pptx
06-01 Machine Learning and Linear Regression.pptx
 
林守德/Practical Issues in Machine Learning
林守德/Practical Issues in Machine Learning林守德/Practical Issues in Machine Learning
林守德/Practical Issues in Machine Learning
 
Machine learning
Machine learningMachine learning
Machine learning
 
Introduction to Machine Learning Aristotelis Tsirigos
Introduction to Machine Learning Aristotelis Tsirigos Introduction to Machine Learning Aristotelis Tsirigos
Introduction to Machine Learning Aristotelis Tsirigos
 
Machine Learning Unit 1 Semester 3 MSc IT Part 2 Mumbai University
Machine Learning Unit 1 Semester 3  MSc IT Part 2 Mumbai UniversityMachine Learning Unit 1 Semester 3  MSc IT Part 2 Mumbai University
Machine Learning Unit 1 Semester 3 MSc IT Part 2 Mumbai University
 
know Machine Learning Basic Concepts.pdf
know Machine Learning Basic Concepts.pdfknow Machine Learning Basic Concepts.pdf
know Machine Learning Basic Concepts.pdf
 
Machine Learning and Artificial Neural Networks.ppt
Machine Learning and Artificial Neural Networks.pptMachine Learning and Artificial Neural Networks.ppt
Machine Learning and Artificial Neural Networks.ppt
 
module 6 (1).ppt
module 6 (1).pptmodule 6 (1).ppt
module 6 (1).ppt
 
Unit-1.ppt
Unit-1.pptUnit-1.ppt
Unit-1.ppt
 

More from Marina Santini

Can We Quantify Domainhood? Exploring Measures to Assess Domain-Specificity i...
Can We Quantify Domainhood? Exploring Measures to Assess Domain-Specificity i...Can We Quantify Domainhood? Exploring Measures to Assess Domain-Specificity i...
Can We Quantify Domainhood? Exploring Measures to Assess Domain-Specificity i...Marina Santini
 
Towards a Quality Assessment of Web Corpora for Language Technology Applications
Towards a Quality Assessment of Web Corpora for Language Technology ApplicationsTowards a Quality Assessment of Web Corpora for Language Technology Applications
Towards a Quality Assessment of Web Corpora for Language Technology ApplicationsMarina Santini
 
A Web Corpus for eCare: Collection, Lay Annotation and Learning -First Results-
A Web Corpus for eCare: Collection, Lay Annotation and Learning -First Results-A Web Corpus for eCare: Collection, Lay Annotation and Learning -First Results-
A Web Corpus for eCare: Collection, Lay Annotation and Learning -First Results-Marina Santini
 
An Exploratory Study on Genre Classification using Readability Features
An Exploratory Study on Genre Classification using Readability FeaturesAn Exploratory Study on Genre Classification using Readability Features
An Exploratory Study on Genre Classification using Readability FeaturesMarina Santini
 
Lecture: Semantic Word Clouds
Lecture: Semantic Word CloudsLecture: Semantic Word Clouds
Lecture: Semantic Word CloudsMarina Santini
 
Lecture: Ontologies and the Semantic Web
Lecture: Ontologies and the Semantic WebLecture: Ontologies and the Semantic Web
Lecture: Ontologies and the Semantic WebMarina Santini
 
Lecture: Summarization
Lecture: SummarizationLecture: Summarization
Lecture: SummarizationMarina Santini
 
Lecture: Question Answering
Lecture: Question AnsweringLecture: Question Answering
Lecture: Question AnsweringMarina Santini
 
IE: Named Entity Recognition (NER)
IE: Named Entity Recognition (NER)IE: Named Entity Recognition (NER)
IE: Named Entity Recognition (NER)Marina Santini
 
Lecture: Vector Semantics (aka Distributional Semantics)
Lecture: Vector Semantics (aka Distributional Semantics)Lecture: Vector Semantics (aka Distributional Semantics)
Lecture: Vector Semantics (aka Distributional Semantics)Marina Santini
 
Lecture: Word Sense Disambiguation
Lecture: Word Sense DisambiguationLecture: Word Sense Disambiguation
Lecture: Word Sense DisambiguationMarina Santini
 
Semantic Role Labeling
Semantic Role LabelingSemantic Role Labeling
Semantic Role LabelingMarina Santini
 
Semantics and Computational Semantics
Semantics and Computational SemanticsSemantics and Computational Semantics
Semantics and Computational SemanticsMarina Santini
 
Lecture 9: Machine Learning in Practice (2)
Lecture 9: Machine Learning in Practice (2)Lecture 9: Machine Learning in Practice (2)
Lecture 9: Machine Learning in Practice (2)Marina Santini
 
Lecture 8: Machine Learning in Practice (1)
Lecture 8: Machine Learning in Practice (1) Lecture 8: Machine Learning in Practice (1)
Lecture 8: Machine Learning in Practice (1) Marina Santini
 
Lecture 5: Interval Estimation
Lecture 5: Interval Estimation Lecture 5: Interval Estimation
Lecture 5: Interval Estimation Marina Santini
 
Lecture 4 Decision Trees (2): Entropy, Information Gain, Gain Ratio
Lecture 4 Decision Trees (2): Entropy, Information Gain, Gain RatioLecture 4 Decision Trees (2): Entropy, Information Gain, Gain Ratio
Lecture 4 Decision Trees (2): Entropy, Information Gain, Gain RatioMarina Santini
 

More from Marina Santini (20)

Can We Quantify Domainhood? Exploring Measures to Assess Domain-Specificity i...
Can We Quantify Domainhood? Exploring Measures to Assess Domain-Specificity i...Can We Quantify Domainhood? Exploring Measures to Assess Domain-Specificity i...
Can We Quantify Domainhood? Exploring Measures to Assess Domain-Specificity i...
 
Towards a Quality Assessment of Web Corpora for Language Technology Applications
Towards a Quality Assessment of Web Corpora for Language Technology ApplicationsTowards a Quality Assessment of Web Corpora for Language Technology Applications
Towards a Quality Assessment of Web Corpora for Language Technology Applications
 
A Web Corpus for eCare: Collection, Lay Annotation and Learning -First Results-
A Web Corpus for eCare: Collection, Lay Annotation and Learning -First Results-A Web Corpus for eCare: Collection, Lay Annotation and Learning -First Results-
A Web Corpus for eCare: Collection, Lay Annotation and Learning -First Results-
 
An Exploratory Study on Genre Classification using Readability Features
An Exploratory Study on Genre Classification using Readability FeaturesAn Exploratory Study on Genre Classification using Readability Features
An Exploratory Study on Genre Classification using Readability Features
 
Lecture: Semantic Word Clouds
Lecture: Semantic Word CloudsLecture: Semantic Word Clouds
Lecture: Semantic Word Clouds
 
Lecture: Ontologies and the Semantic Web
Lecture: Ontologies and the Semantic WebLecture: Ontologies and the Semantic Web
Lecture: Ontologies and the Semantic Web
 
Lecture: Summarization
Lecture: SummarizationLecture: Summarization
Lecture: Summarization
 
Relation Extraction
Relation ExtractionRelation Extraction
Relation Extraction
 
Lecture: Question Answering
Lecture: Question AnsweringLecture: Question Answering
Lecture: Question Answering
 
IE: Named Entity Recognition (NER)
IE: Named Entity Recognition (NER)IE: Named Entity Recognition (NER)
IE: Named Entity Recognition (NER)
 
Lecture: Vector Semantics (aka Distributional Semantics)
Lecture: Vector Semantics (aka Distributional Semantics)Lecture: Vector Semantics (aka Distributional Semantics)
Lecture: Vector Semantics (aka Distributional Semantics)
 
Lecture: Word Sense Disambiguation
Lecture: Word Sense DisambiguationLecture: Word Sense Disambiguation
Lecture: Word Sense Disambiguation
 
Lecture: Word Senses
Lecture: Word SensesLecture: Word Senses
Lecture: Word Senses
 
Sentiment Analysis
Sentiment AnalysisSentiment Analysis
Sentiment Analysis
 
Semantic Role Labeling
Semantic Role LabelingSemantic Role Labeling
Semantic Role Labeling
 
Semantics and Computational Semantics
Semantics and Computational SemanticsSemantics and Computational Semantics
Semantics and Computational Semantics
 
Lecture 9: Machine Learning in Practice (2)
Lecture 9: Machine Learning in Practice (2)Lecture 9: Machine Learning in Practice (2)
Lecture 9: Machine Learning in Practice (2)
 
Lecture 8: Machine Learning in Practice (1)
Lecture 8: Machine Learning in Practice (1) Lecture 8: Machine Learning in Practice (1)
Lecture 8: Machine Learning in Practice (1)
 
Lecture 5: Interval Estimation
Lecture 5: Interval Estimation Lecture 5: Interval Estimation
Lecture 5: Interval Estimation
 
Lecture 4 Decision Trees (2): Entropy, Information Gain, Gain Ratio
Lecture 4 Decision Trees (2): Entropy, Information Gain, Gain RatioLecture 4 Decision Trees (2): Entropy, Information Gain, Gain Ratio
Lecture 4 Decision Trees (2): Entropy, Information Gain, Gain Ratio
 

Recently uploaded

How to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptxHow to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptxmanuelaromero2013
 
microwave assisted reaction. General introduction
microwave assisted reaction. General introductionmicrowave assisted reaction. General introduction
microwave assisted reaction. General introductionMaksud Ahmed
 
Mastering the Unannounced Regulatory Inspection
Mastering the Unannounced Regulatory InspectionMastering the Unannounced Regulatory Inspection
Mastering the Unannounced Regulatory InspectionSafetyChain Software
 
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...EduSkills OECD
 
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdfssuser54595a
 
Concept of Vouching. B.Com(Hons) /B.Compdf
Concept of Vouching. B.Com(Hons) /B.CompdfConcept of Vouching. B.Com(Hons) /B.Compdf
Concept of Vouching. B.Com(Hons) /B.CompdfUmakantAnnand
 
Science 7 - LAND and SEA BREEZE and its Characteristics
Science 7 - LAND and SEA BREEZE and its CharacteristicsScience 7 - LAND and SEA BREEZE and its Characteristics
Science 7 - LAND and SEA BREEZE and its CharacteristicsKarinaGenton
 
Incoming and Outgoing Shipments in 1 STEP Using Odoo 17
Incoming and Outgoing Shipments in 1 STEP Using Odoo 17Incoming and Outgoing Shipments in 1 STEP Using Odoo 17
Incoming and Outgoing Shipments in 1 STEP Using Odoo 17Celine George
 
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptxPOINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptxSayali Powar
 
Alper Gobel In Media Res Media Component
Alper Gobel In Media Res Media ComponentAlper Gobel In Media Res Media Component
Alper Gobel In Media Res Media ComponentInMediaRes1
 
CARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptxCARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptxGaneshChakor2
 
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...Marc Dusseiller Dusjagr
 
_Math 4-Q4 Week 5.pptx Steps in Collecting Data
_Math 4-Q4 Week 5.pptx Steps in Collecting Data_Math 4-Q4 Week 5.pptx Steps in Collecting Data
_Math 4-Q4 Week 5.pptx Steps in Collecting DataJhengPantaleon
 
mini mental status format.docx
mini    mental       status     format.docxmini    mental       status     format.docx
mini mental status format.docxPoojaSen20
 
Class 11 Legal Studies Ch-1 Concept of State .pdf
Class 11 Legal Studies Ch-1 Concept of State .pdfClass 11 Legal Studies Ch-1 Concept of State .pdf
Class 11 Legal Studies Ch-1 Concept of State .pdfakmcokerachita
 
Presiding Officer Training module 2024 lok sabha elections
Presiding Officer Training module 2024 lok sabha electionsPresiding Officer Training module 2024 lok sabha elections
Presiding Officer Training module 2024 lok sabha electionsanshu789521
 
Introduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptxIntroduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptxpboyjonauth
 
The basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxThe basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxheathfieldcps1
 
Accessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactAccessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactdawncurless
 

Recently uploaded (20)

How to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptxHow to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptx
 
microwave assisted reaction. General introduction
microwave assisted reaction. General introductionmicrowave assisted reaction. General introduction
microwave assisted reaction. General introduction
 
Mastering the Unannounced Regulatory Inspection
Mastering the Unannounced Regulatory InspectionMastering the Unannounced Regulatory Inspection
Mastering the Unannounced Regulatory Inspection
 
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
 
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
 
Concept of Vouching. B.Com(Hons) /B.Compdf
Concept of Vouching. B.Com(Hons) /B.CompdfConcept of Vouching. B.Com(Hons) /B.Compdf
Concept of Vouching. B.Com(Hons) /B.Compdf
 
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
 
Science 7 - LAND and SEA BREEZE and its Characteristics
Science 7 - LAND and SEA BREEZE and its CharacteristicsScience 7 - LAND and SEA BREEZE and its Characteristics
Science 7 - LAND and SEA BREEZE and its Characteristics
 
Incoming and Outgoing Shipments in 1 STEP Using Odoo 17
Incoming and Outgoing Shipments in 1 STEP Using Odoo 17Incoming and Outgoing Shipments in 1 STEP Using Odoo 17
Incoming and Outgoing Shipments in 1 STEP Using Odoo 17
 
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptxPOINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
 
Alper Gobel In Media Res Media Component
Alper Gobel In Media Res Media ComponentAlper Gobel In Media Res Media Component
Alper Gobel In Media Res Media Component
 
CARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptxCARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptx
 
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...
 
_Math 4-Q4 Week 5.pptx Steps in Collecting Data
_Math 4-Q4 Week 5.pptx Steps in Collecting Data_Math 4-Q4 Week 5.pptx Steps in Collecting Data
_Math 4-Q4 Week 5.pptx Steps in Collecting Data
 
mini mental status format.docx
mini    mental       status     format.docxmini    mental       status     format.docx
mini mental status format.docx
 
Class 11 Legal Studies Ch-1 Concept of State .pdf
Class 11 Legal Studies Ch-1 Concept of State .pdfClass 11 Legal Studies Ch-1 Concept of State .pdf
Class 11 Legal Studies Ch-1 Concept of State .pdf
 
Presiding Officer Training module 2024 lok sabha elections
Presiding Officer Training module 2024 lok sabha electionsPresiding Officer Training module 2024 lok sabha elections
Presiding Officer Training module 2024 lok sabha elections
 
Introduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptxIntroduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptx
 
The basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxThe basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptx
 
Accessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactAccessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impact
 

Lecture 2 Basic Concepts in Machine Learning for Language Technology

  • 1. Machine Learning for Language Technology Lecture 2: Basic Concepts Marina Santini Department of Linguistics and Philology Uppsala University, Uppsala, Sweden Autumn 2014 Acknowledgement: Thanks to Prof. Joakim Nivre for course design and material
  • 2. Outline • Definition of Machine Learning • Type of Machine Learning: – Classification – Regression – Supervised Learning – Unsupervised Learning – Reinforcement Learning • Supervised Learning: – Supervised Classification – Training set – Hypothesis class – Empirical error – Margin – Noise – Inductive bias – Generalization – Model assessment – Cross-Validation – Classification in NLP – Types of Classification Lecture 2: Basic Concepts 2
  • 3. What is Machine Learning • Machine learning is programming computers to optimize a performance criterion for some task using example data or past experience • Why learning? – No known exact method – vision, speech recognition, robotics, spam filters, etc. – Exact method too expensive – statistical physics – Task evolves over time – network routing • Compare: – No need to use machine learning for computing payroll… we just need an algorithm Lecture 2: Basic Concepts 3
  • 4. Machine Learning – Data Mining – Artificial Intelligence – Statistics • Machine Learning: creation of a model that uses training data or past experience • Data Mining: application of learning methods to large datasets (ex. physics, astronomy, biology, etc.) – Text mining = machine learning applied to unstructured textual data (ex. sentiment analyisis, social media monitoring, etc. Text Mining, Wikipedia) • Artificial intelligence: a model that can adapt to a changing environment. • Statistics: Machine learning uses the theory of statistics in building mathematical models, because the core task is making inference from a sample. Lecture 2: Basic Concepts 4
  • 5. The bio-cognitive analogy • Imagine that a learning algorithm as a single neuron. • This neuron receives input from other neurons, one for each input feature. • The strength of these inputs are the feature values. • Each input has a weight and the neuron simply sums up all the weighted inputs. • Based on this sum, the neuron decides whether to “fire” or not. Firing is interpreted as being a positive example and not firing is interpreted as being a negative example. Lecture 2: Basic Concepts 5
  • 6. Elements of Machine Learning 1. Generalization: – Generalize from specific examples – Based on statistical inference 2. Data: – Training data: specific examples to learn from – Test data: (new) specific examples to assess performance 3. Models: – Theoretical assumptions about the task/domain – Parameters that can be inferred from data 4. Algorithms: – Learning algorithm: infer model (parameters) from data – Inference algorithm: infer predictions from model Lecture 2: Basic Concepts 6
  • 7. Types of Machine Learning • Association • Supervised Learning – Classification – Regression • Unsupervised Learning • Reinforcement Learning Lecture 2: Basic Concepts 7
  • 8. Learning Associations • Basket analysis: P (Y | X ) probability that somebody who buys X also buys Y where X and Y are products/services Example: P ( chips | beer ) = 0.7 Lecture 2: Basic Concepts 8
  • 9. Classification Lecture 2: Basic Concepts 9 • Example: Credit scoring • Differentiating between low-risk and high-risk customers from their income and savings Discriminant: IF income > θ1 AND savings > θ2 THEN low-risk ELSE high-risk
  • 10. Classification in NLP • Binary classification: – Spam filtering (spam vs. non-spam) – Spelling error detection (error vs. non error) • Multiclass classification: – Text categorization (news, economy, culture, sport, ...) – Named entity classification (person, location, organization, ...) • Structured prediction: – Part-of-speech tagging (classes = tag sequences) – Syntactic parsing (classes = parse trees) Lecture 2: Basic Concepts 10
  • 11. Regression • Example: Price of used car • x : car attributes y : price y = g (x | q ) g ( ) model, q parameters Lecture 2: Basic Concepts 11 y = wx+w0
  • 12. Uses of Supervised Learning • Prediction of future cases: – Use the rule to predict the output for future inputs • Knowledge extraction: – The rule is easy to understand • Compression: – The rule is simpler than the data it explains • Outlier detection: – Exceptions that are not covered by the rule, e.g., fraud Lecture 2: Basic Concepts 12
  • 13. Unsupervised Learning • Finding regularities in data • No mapping to outputs • Clustering: – Grouping similar instances • Example applications: – Customer segmentation in CRM – Image compression: Color quantization – NLP: Unsupervised text categorization Lecture 2: Basic Concepts 13
  • 14. Reinforcement Learning • Learning a policy = sequence of outputs/actions • No supervised output but delayed reward • Example applications: – Game playing – Robot in a maze – NLP: Dialogue systems Lecture 2: Basic Concepts 14
  • 15. Supervised Classification • Learning the class C of a “family car” from examples – Prediction: Is car x a family car? – Knowledge extraction: What do people expect from a family car? • Output (labels): Positive (+) and negative (–) examples • Input representation (features): x1: price, x2 : engine power Lecture 2: Basic Concepts 15
  • 16. Training set X  X  {xt ,rt }t1 N  r  1 if x is positive 0 if x is negative    Lecture 2: Basic Concepts 16  x  x1 x2      
  • 17. Hypothesis class H  p1  price  p2 AND e1  engine power  e2  Lecture 2: Basic Concepts 17
  • 18. Empirical (training) error  h(x)  1 if h says x is positive 0 if h says x is negative     E(h | X)  1 h xt   rt  t1 N  Lecture 2: Basic Concepts 18 Empirical error of h on X:
  • 19. S, G, and the Version Space Lecture 2: Basic Concepts 19 most specific hypothesis, S most general hypothesis, G h  H, between S and G is consistent [E( h | X) = 0] and make up the version space
  • 20. Margin • Choose h with largest margin Lecture 2: Basic Concepts 20
  • 21. Noise Unwanted anomaly in data • Imprecision in input attributes • Errors in labeling data points • Hidden attributes (relative to H) Consequence: • No h in H may be consistent! Lecture 2: Basic Concepts 21
  • 22. Noise and Model Complexity Arguments for simpler model (Occam’s razor principle): 1. Easier to make predictions 2. Easier to train (fewer parameters) 3. Easier to understand 4. Generalizes better (if data is noisy) Lecture 2: Basic Concepts 22
  • 23. Inductive Bias • Learning is an ill-posed problem – Training data is never sufficient to find a unique solution – There are always infinitely many consistent hypotheses • We need an inductive bias: – Assumptions that entail a unique h for a training set X 1. Hypothesis class H – axis-aligned rectangles 2. Learning algorithm – find consistent hypothesis with max- margin 3. Hyperparameters – trade-off between training error and margin Lecture 2: Basic Concepts 23
  • 24. Model Selection and Generalization • Generalization – how well a model performs on new data – Overfitting: H more complex than C – Underfitting: H less complex than C Lecture 2: Basic Concepts 24
  • 25. Triple Trade-Off • Trade-off between three factors: 1. Complexity of H, c(H) 2. Training set size N 3. Generalization error E on new data • Dependencies: – As N, E – As c(H), first E and then E Lecture 2: Basic Concepts 25
  • 26. Model Selection  Generalization Error • To estimate generalization error, we need data unseen during training: • Given models (hypotheses) h1, ..., hk induced from the training set X, we can use E(hi | V ) to select the model hi with the smallest generalization error Lecture 2: Basic Concepts 26  ˆE  E(h | V)  1 h xt   rt  t1 M   V  {xt ,rt }t1 M  X
  • 27. Model Assessment • To estimate the generalization error of the best model hi, we need data unseen during training and model selection • Standard setup: 1. Training set X (50–80%) 2. Validation (development) set V (10–25%) 3. Test (publication) set T (10–25%) • Note: – Validation data can be added to training set before testing – Resampling methods can be used if data is limited Lecture 2: Basic Concepts 27
  • 28. Cross-Validation 121 31 2 2 2 32 1 1 1    K K K K K K XXXTXV XXXTXV XXXTXV     Lecture 2: Basic Concepts 28 • K-fold cross-validation: Divide X into X1, ..., XK • Note: – Generalization error estimated by means across K folds – Training sets for different folds share K–2 parts – Separate test set must be maintained for model assessment
  • 29. Bootstrapping 3680 1 1 1 .        e N N Lecture 2: Basic Concepts 29 • Generate new training sets of size N from X by random sampling with replacement • Use original training set as validation set (V = X ) • Probability that we do not pick an instance after N draws that is, only 36.8% of instances are new!
  • 30. Measuring Error • Error rate = # of errors / # of instances = (FP+FN) / N • Accuracy = # of correct / # of instances = (TP+TN) / N • Recall = # of found positives / # of positives = TP / (TP+FN) • Precision = # of found positives / # of found = TP / (TP+FP) Lecture 2: Basic Concepts 30
  • 31. Statistical Inference • Interval estimation to quantify the precision of our measurements • Hypothesis testing to assess whether differences between models are statistically significant Lecture 2: Basic Concepts 31  m 1.96  N e01  e10 1  2 e01  e10 ~ X1 2
  • 32. Supervised Learning – Summary • Training data + learner  hypothesis – Learner incorporates inductive bias • Test data + hypothesis  estimated generalization – Test data must be unseen Lecture 2: Basic Concepts 32
  • 33. Anatomy of a Supervised Learner (Dimensions of a supervised machine learning algorithm) • Model: • Loss function: • Optimization procedure:  g x |q   E q | X  L rt ,g xt |q  t  Lecture 2: Basic Concepts 33  q*  arg min q E q | X 
  • 34. Supervised Classification: Extension 34 • Divide instances into (two or more) classes – Instance (feature vector): • Features may be categorical or numerical – Class (label): – Training data: • Classification in Language Technology – Spam filtering (spam vs. non-spam) – Spelling error detection (error vs. no error) – Text categorization (news, economy, culture, sport, ...) – Named entity classification (person, location, organization, ...)  X  {xt ,yt }t1 N  x  x1, , xm  y Lec 2: Decision Trees - Nearest Neighbors
  • 40. Reading • Alpaydin (2010): chs 1-2; 19 • Daume’ III (2012): ch 4: only 4.5-4.6 Lecture 2: Basic Concepts 40
  • 41. End of Lecture 2 Lecture 2: Basic Concepts 41