Lecture 1: Introduction to Machine Learning
By: Amir El-Ghamry
Course Information
• Instructor: Amir El-Ghamry
– amir_nabil@mans.edu.eg
– amirelghamry@email.arizona.edu
– amirnabilsaleh@gmail.com
– Facebook: @amir.n.saleh
– Twitter: @AmirNabilSaleh
– Instagram: @amirnabilsaleh
Course Information
• About Me:
– Bachelor of Computer Science, Mansoura University, 2006
– Grade: Excellent
– Rank: First
– Master's degree, 2012
– Ph.D. degree, 2019
Course Information
• About Me:
– Demonstrator at the CS Department, 2007–2012
– Assistant Lecturer at the CS Department, 2012–2019
– Joint supervision mission to the USA, 2016–2018 (University of Texas at Dallas)
Course Information
• About Me:
– Member of the Open Event Data Alliance (OEDA) research project at the University of Texas at Dallas, 2017–2018
– Consultant for the Minerva research project at the University of Arizona, 2019–present
Syllabus
• Basics of Statistical Learning
– Loss functions, bias-variance tradeoff, overfitting, cross-validation
• Supervised Learning
– Nearest Neighbour, Naïve Bayes, Logistic Regression, Support Vector Machines, Neural Networks, Decision Trees
Syllabus
• Unsupervised Learning
– Clustering: k-means
– Dimensionality reduction: PCA
• Advanced Topics
– Convolutional neural networks
Syllabus
• You will learn about the methods you have heard about
• You will understand algorithms, theory, applications, and implementations
• It's going to be FUN and HARD WORK :)
Prerequisites
• Probability and Statistics
• Calculus and Linear Algebra
• Algorithms
• Programming
– Python, or your language of choice for the project
• Ability to deal with abstract mathematical concepts
Textbook
• We will have lecture notes.
• Reference books:
– [Online] Machine Learning: A Probabilistic Perspective, Kevin Murphy
– [Free PDF from the author's webpage] Bayesian Reasoning and Machine Learning, David Barber
http://web4.cs.ucl.ac.uk/staff/D.Barber/pmwiki/pmwiki.php?n=Brml.HomePage
– Fundamentals of Neural Networks, Laurene Fausett
Grading
• Homeworks (%)
• Final project (%)
• Midterm (%)
• Final (%)
• Class participation (%)
Let's start with Fruit Classification
Consider a program that takes an image file as input and outputs the type of fruit.
Fruit Classification
Consider that you have two classes: Apple and Orange.
Traditional Programming
Manual rule: compare the number of orange and green pixels and calculate the ratio.
Traditional Programming
Data + Program → Computer → Output
Data: an image file
Program: the manual rule (ratio of orange to green pixels)
Output: is it an apple or an orange?
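As a sketch, such a hand-written rule might look like the following (the color thresholds and the use of the Pillow library are assumptions for illustration, not the lecture's actual program):

```python
from PIL import Image

def classify_fruit(path):
    # Hand-written rule: count roughly-orange vs. roughly-green pixels
    # and compare the counts. The thresholds are arbitrary guesses.
    img = Image.open(path).convert("RGB")
    orange = green = 0
    for r, g, b in img.getdata():
        if r > 200 and 80 < g < 180 and b < 100:
            orange += 1
        elif g > 120 and r < 120 and b < 120:
            green += 1
    return "orange" if orange > green else "apple"
```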
Messy Real World
Your rules start to break. Why?
1. What about grayscale (black and white) images?
2. What about an image with no apples or oranges in it?
Manual – Rules Solutions
Problem of traditional programming
OK – what if there is a new problem?
You would need to change the rules.
Traditional programming is not suitable.
Machine Learning
We need an algorithm that figures out the rules for us, so we don't have to write them by hand.
Machine Learning
Data + Output → Computer → Program
Data: examples (e.g., fruit images)
Output: their labels
Program: the rules (a learned classifier)
Classifier
A classifier is a function that takes some data as input and assigns a label (class) to it as output.
Examples: a fruit classifier, an e-mail classifier (see the sketch below).
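In code, a classifier is simply a function from input data to a label. A toy e-mail example (the keyword rule is purely illustrative):

```python
def classify_email(text):
    # Assign one of two labels to the input; the rule itself is a placeholder.
    return "spam" if "free money" in text.lower() else "not spam"

print(classify_email("Claim your FREE MONEY now!"))  # -> 'spam'
```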
Learning
Input (image) → Output (Label): several apple images, each labeled "Apple".
Learning
Learning Data
• Training set
– Used to fit (train) the model parameters
• Validation set
– Used to check performance on independent data and tune parameters
• Test set
– Used for the final evaluation of performance after all parameters are fixed
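A minimal sketch of producing these three splits with scikit-learn (the toy dataset and the 60/20/20 ratio are assumptions):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Toy dataset: 100 examples with 2 features.
X, y = make_classification(n_samples=100, n_features=2, n_informative=2,
                           n_redundant=0, random_state=0)

# Hold out 20% as the test set, then split the rest 75/25 into train/validation.
X_rest, X_test, y_rest, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X_rest, y_rest, test_size=0.25, random_state=0)

print(len(X_train), len(X_val), len(X_test))  # 60 20 20
```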
Learning Types
Supervised Learning
A technique that builds the classifier automatically.
• Creates the classifier by finding patterns in the images
Pipeline: Collect training data → Train classifier → Make predictions
Collecting Training Data
Take a description of each fruit image as input; these descriptions are called features.
Collecting Training Data
Training data:
• Features: Weight, Texture
• Output: Label
Collecting Training Data
Example: one row of the training data (see the sketch below).
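A minimal sketch of the whole pipeline in scikit-learn, using the Weight and Texture features (the numbers, the smooth/bumpy encoding, and the choice of a decision tree are assumptions for illustration):

```python
from sklearn import tree

# Training data: one row per fruit.
# Features: [weight in grams, texture], with texture encoded as 0 = smooth, 1 = bumpy.
features = [[140, 0], [130, 0], [150, 1], [170, 1]]
labels = ["apple", "apple", "orange", "orange"]

# Train the classifier on the collected data.
clf = tree.DecisionTreeClassifier()
clf = clf.fit(features, labels)

# Make a prediction for a new, unseen fruit: 160 g and bumpy.
print(clf.predict([[160, 1]]))  # expected: ['orange']
```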
Collecting Training Data
Note: the more training data you have, the better the classifier you can create.
Collecting Training Data
Note: you can also use the raw image itself as the features.
Changing the Training data
Classifying cars based on horsepower and number of seats.
Machine Learning is better
• You can create a new classifier for a new problem just by changing the training data, instead of writing rules for each problem (see the sketch below)
• The learning code is reusable
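For example, the very same training code classifies cars once the training data changes (all numbers below are made up for illustration):

```python
from sklearn import tree

# Features: [horsepower, number of seats]; labels are the car types.
features = [[300, 2], [450, 2], [200, 8], [150, 9]]
labels = ["sports car", "sports car", "minivan", "minivan"]

clf = tree.DecisionTreeClassifier().fit(features, labels)
print(clf.predict([[160, 8]]))  # expected: ['minivan']
```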
Many types of Classifiers
• Artificial Neural Network (ANN)
• Support vector machine (SVM)
• Decision tree
• Random forest
• Naïve Bayes
• … etc
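In a library such as scikit-learn these classifiers are largely interchangeable. A sketch swapping the earlier decision tree for k-nearest neighbours on the same assumed fruit data:

```python
from sklearn.neighbors import KNeighborsClassifier

features = [[140, 0], [130, 0], [150, 1], [170, 1]]  # [grams, 0 = smooth / 1 = bumpy]
labels = ["apple", "apple", "orange", "orange"]

# Only the classifier line changes; data and prediction code stay the same.
clf = KNeighborsClassifier(n_neighbors=3).fit(features, labels)
print(clf.predict([[160, 1]]))  # expected: ['orange']
```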
Good features
A classifier is only as good as the features you provide.
So: coming up with good features is one of the most important jobs in machine learning.
Example - Good feature
Good features
We can use two features.
Good features – sample population
• A population of 1,000 dogs
• Number of greyhounds = 500
• Number of Labradors = 500
• We draw a histogram of their heights (see the sketch below)
• Greyhound color: red
• Labrador color: blue
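A sketch of how such a histogram could be generated (the height distributions, averaging 28 inches for greyhounds and 24 inches for Labradors, are assumptions for illustration):

```python
import numpy as np
import matplotlib.pyplot as plt

greyhounds, labs = 500, 500
# Assumed heights: normally distributed around 28 in and 24 in.
grey_height = 28 + 4 * np.random.randn(greyhounds)
lab_height = 24 + 4 * np.random.randn(labs)

plt.hist([grey_height, lab_height], stacked=True, color=["red", "blue"])
plt.xlabel("Height (inches)")
plt.show()
```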
Good features – sample population
Good features – sample population
If we want to predict the breed of dogs with heights of 20, 25, and 35 inches:
• A dog at 20 inches → Labrador
• A dog at 35 inches → greyhound
Good features – sample population
• A dog at 25 inches → ????
The probabilities that it is a greyhound or a Labrador are very close.
That means: height is a useful feature, but not a perfect one.
Good features
Conclusion: in machine learning we need multiple features for better results.
If you want to know which features to use, do a thought experiment.
How many features
Other possible features: hair length, speed, weight.
Good features – Eye color
The eye-color distribution is roughly equal for both breeds.
Useless features – Eye color
• The eye color feature tells us nothing about the type of dog
• Useless features can hurt classifier accuracy
Independent features
• Independent features each give a different type of information
• Remove highly correlated (redundant) features (see the sketch below)
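A minimal sketch of spotting a redundant feature with pandas (the toy data is made up; height in inches and in centimetres are perfectly correlated by construction):

```python
import pandas as pd

df = pd.DataFrame({
    "height_in": [20, 25, 30, 35],
    "height_cm": [50.8, 63.5, 76.2, 88.9],  # same information as height_in
    "weight_lb": [30, 60, 55, 70],
})

# Absolute pairwise correlations between features.
print(df.corr().abs())
# height_in and height_cm correlate at ~1.0, so one of them can be dropped.
df = df.drop(columns=["height_cm"])
```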
Easy to understand features
• Imagine we want to predict how many days it takes to mail a letter between two different cities
Easy to understand features
• Easy feature: the distance between the cities in miles
Easy to understand features
• Complex feature: the cities' locations given by longitude and latitude
So: simple relationships are easier to learn (see the sketch below).
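To see why, note that the single "miles" feature can be computed from the two coordinate pairs; handing the model the distance directly spares it from learning that geometry itself. A sketch using the standard haversine formula (the coordinates are just examples):

```python
from math import radians, sin, cos, asin, sqrt

def distance_miles(lat1, lon1, lat2, lon2):
    # Haversine great-circle distance; 3959 is Earth's radius in miles.
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 3959 * asin(sqrt(a))

# Complex features: two (latitude, longitude) pairs. Simple feature: one number.
print(distance_miles(30.04, 31.24, 31.04, 31.38))  # Cairo -> Mansoura, roughly 70 miles
```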
Features conclusion
Ideal features are:
1. Informative
2. Independent
3. Simple
The End
Any questions?
