2. Table of Contents:
• What is Machine Learning?
• How does it work?
• An example of Machine Learning
• Why is it required?
• Various Machine Learning algorithms
  - Collaborative filtering
• Applications
• References
3. What is Machine Learning?
Machine Learning is a subfield of Artificial Intelligence
concerned with algorithms that allow computers to learn.
It is the ability of a machine to improve its own performance
through repetition and experience.
4. “Machine Learning is the field of study that gives computers the
ability to learn without being explicitly programmed.” - Arthur
Samuel, 1959.
“A computer program is said to learn from experience E with
respect to some task T and some performance measure P, if its
performance on T, as measured by P, improves with experience
E.” – Tom Mitchell, Carnegie Mellon University
6. How does it work?
By implementing algorithms that can learn from and
make predictions on data.
Such algorithms operate by building a model from
example inputs in order to make data-driven predictions
or decisions, rather than following strictly static program
instructions.
7. Machine Learning algorithms are often categorized as:
• Supervised machine learning: The program is “trained” on a
pre-defined set of “training examples”, which then facilitates its
ability to reach an accurate conclusion when given new data.
• Unsupervised machine learning: The program is given a collection
of data and must find patterns and relationships within it.
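The distinction can be made concrete with a toy sketch (all data and function names below are made up for illustration): a supervised learner fits a rule from labeled examples, while an unsupervised learner must group unlabeled points on its own.

```python
# Supervised: each training example comes with a label.
labeled = [(1, 'small'), (2, 'small'), (8, 'large'), (9, 'large')]

def predict(x, examples):
    # Label a new point with the label of its nearest training example.
    nearest = min(examples, key=lambda ex: abs(ex[0] - x))
    return nearest[1]

# Unsupervised: only the raw points are given; group them by proximity.
points = [1, 2, 8, 9]

def cluster(points, gap=3):
    # Start a new group whenever consecutive (sorted) points are far apart.
    groups, current = [], [points[0]]
    for p in points[1:]:
        if p - current[-1] > gap:
            groups.append(current)
            current = [p]
        else:
            current.append(p)
    groups.append(current)
    return groups

print(predict(1.5, labeled))  # 'small'
print(cluster(points))        # [[1, 2], [8, 9]]
```

The supervised learner needed the labels to answer; the unsupervised one recovered the same two groups from the data's structure alone.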
8. In the majority of supervised learning applications, the
ultimate goal is to develop a finely tuned predictor
function h(x) (sometimes called the “hypothesis”).
A very simple predictor function can be of the form:
h(x) = θ0 + θ1 x
where θ0 and θ1 are constants. Our goal is to find the
best values of θ0 and θ1 to make our predictor work as
well as possible.
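In Python, this predictor is a one-liner; the parameter values passed in below are arbitrary placeholders:

```python
def h(x, theta0, theta1):
    # Simple linear predictor: h(x) = θ0 + θ1·x
    return theta0 + theta1 * x

print(h(10, 2.0, 0.5))  # 7.0
```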
9. A simple machine learning example:
predicting an employee's satisfaction rating from their salary.
10. First we have to initialize our predictor h(x) with some reasonable values of θ0 and θ1.
Now our predictor looks like this when placed over our training set:
h(x) = 12.00 + 0.20 x
11. If we ask this predictor for the satisfaction of an employee making $60k, it would predict
a rating of 12.00 + 0.20 × 60 = 24, a terrible guess, because the machine doesn't know much yet.
12. So now, let's train this predictor on all the salaries from our training set. After
one pass, the predictor function may look like this:
h(x) = 13.12 + 0.62 x
13. If we repeat this process, say, 1000 times, we end up with a much better predictor:
h(x) = 15.54 + 0.75 x
14. Now if we ask the machine again for the satisfaction rating of the employee who
makes $60k, it will predict a rating of roughly 60.
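The three stages of the predictor above can be checked directly; the parameter values come from the slides, and x is the salary in thousands of dollars:

```python
def h(x, theta0, theta1):
    # Linear predictor from the slides: h(x) = θ0 + θ1·x
    return theta0 + theta1 * x

salary = 60  # $60k

initial = h(salary, 12.00, 0.20)   # first guess
one_pass = h(salary, 13.12, 0.62)  # after one pass over the training set
trained = h(salary, 15.54, 0.75)   # after ~1000 repetitions

print(round(initial, 2), round(one_pass, 2), round(trained, 2))  # 24.0 50.32 60.54
```

Each round of training nudges θ0 and θ1 so that the prediction for $60k climbs from a useless 24 to roughly 60.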
15. Why is it required?
• To deal with complex real-world problems.
• To produce models that can analyse bigger, more complex data
and deliver faster, more accurate results, even at very large
scale.
• To solve problems that are too complex to handle with
explicitly programmed, hand-written rules alone.
16. This is possible because of:
• Cheaper and more powerful computational processing
available these days.
• Growing volumes and varieties of available data.
• Affordable data storage.
18. Collaborative filtering:
A technique used by some recommender systems.
Also called social filtering, because it filters information based
on the recommendations of other people.
Used by online shopping sites like Amazon to suggest
products to a user, based on preference information collected
from many users.
19. An example of collaborative filtering:
Suppose you want a movie recommendation from your friends,
but your friends may not have the same “taste” in movies as you.
As more and more options become available, it becomes less practical
to decide what you want by asking a small group of people, since they
may not be aware of all the options. This is why a set of techniques
called collaborative filtering was developed.
The algorithm consists of:
• Collecting the preferences
• Finding similar users
  1. Euclidean Distance Score
  2. Pearson Correlation Score
• Ranking the critics
• Recommending items
20. Collecting preferences:
Collect the preferences from your friends, i.e., the ratings they have given to
different movies, and store them in a dictionary (in Python) or a map (in C++):
# A dictionary of movie critics and their ratings of a small set of movies
critics = {
    'Ranjan': {'Dabang': 2.5, 'Tashan': 3.5, 'Just My Luck': 3.0,
               'Superman Returns': 3.5, 'Dilwale': 1.5,
               'The Night Listener': 3.0},
    'Govind': {'Dabang': 3.0, 'Tashan': 3.5, 'Just My Luck': 1.5,
               'Superman Returns': 5.0, 'The Night Listener': 3.0,
               'Dilwale': 3.5},
    'Shankul': {'Dabang': 4.5, 'Tashan': 3.0,
                'Superman Returns': 3.5, 'The Night Listener': 4.0},
    'Ashish': {'Tashan': 3.5, 'Just My Luck': 3.0,
               'The Night Listener': 4.5, 'Superman Returns': 4.0,
               'Dilwale': 2.5},
    'Niraj': {'Dabang': 3.0, 'Tashan': 4.0, 'Just My Luck': 2.0,
              'Superman Returns': 3.0, 'The Night Listener': 3.0,
              'Dilwale': 2.0},
    'Shubham': {'Dabang': 3.0, 'Tashan': 4.0, 'The Night Listener': 3.0,
                'Superman Returns': 5.0, 'Dilwale': 3.5},
    'Manish': {'Tashan': 4.5, 'Dilwale': 1.0, 'Superman Returns': 4.0}}
21. Finding similar users:
• Determine how similar people are in their tastes.
• Compare each person with every other person and
calculate a similarity score.
• Two ways to do this: Euclidean distance and Pearson
correlation.
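The Euclidean distance score is worked through on the next slides. The Pearson correlation score, which corrects for critics who consistently rate higher or lower than everyone else, can be sketched as follows; this follows the standard Pearson formula rather than code from the slides, and the example data is made up:

```python
from math import sqrt

def sim_pearson(prefs, p1, p2):
    # Pearson correlation score over the items both people have rated.
    shared = [item for item in prefs[p1] if item in prefs[p2]]
    n = len(shared)
    if n == 0:
        return 0
    sum1 = sum(prefs[p1][it] for it in shared)
    sum2 = sum(prefs[p2][it] for it in shared)
    sum1sq = sum(prefs[p1][it] ** 2 for it in shared)
    sum2sq = sum(prefs[p2][it] ** 2 for it in shared)
    psum = sum(prefs[p1][it] * prefs[p2][it] for it in shared)
    # Pearson r = covariance / (product of standard deviations)
    num = psum - (sum1 * sum2 / n)
    den = sqrt((sum1sq - sum1 ** 2 / n) * (sum2sq - sum2 ** 2 / n))
    if den == 0:
        return 0
    return num / den

prefs = {'A': {'x': 1.0, 'y': 2.0, 'z': 3.0},
         'B': {'x': 2.0, 'y': 4.0, 'z': 6.0}}
print(sim_pearson(prefs, 'A', 'B'))  # 1.0 (perfectly correlated tastes)
```

The score ranges from -1 (opposite tastes) through 0 (no relationship) to 1 (identical taste, even if the absolute ratings differ).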
22. Euclidean Distance Score:
[Scatter plot: the critics (Manish, Shankul, Shubham, Niraj, Govind, Ranjan)
plotted in “preference space”, with one movie's rating on each axis (Dabang
and Dilwale), both axes running from 0 to 5.]
23. Calculate the Euclidean distance between every pair of points.
For example, to calculate the Euclidean distance between Manish and Niraj
along the Tashan and Dilwale axes (their ratings are 4.5 vs 4.0 and 1.0 vs 2.0):
float euclid_distance = sqrt( pow(4.5 - 4.0, 2) + pow(1.0 - 2.0, 2) );
This formula calculates the distance, which will be smaller for people who
are more similar. However, you need a function that gives higher values for
people who are similar. This can be done by adding 1 to the distance (to
prevent a division-by-zero error) and inverting it:
float req_dist = 1 / (euclid_distance + 1);
So now, req_dist gives values between 0 and 1, where a value of 1 means
the two people have identical preferences.
24. # Returns a distance-based similarity score for person1 and person2
def similarity(prefs, person1, person2):
    # Get the list of shared items
    si = {}
    for item in prefs[person1]:
        if item in prefs[person2]:
            si[item] = 1
    # If they have no ratings in common, return 0
    if len(si) == 0:
        return 0
    # Add up the squares of all the differences
    sum_of_squares = sum([pow(prefs[person1][item] - prefs[person2][item], 2)
                          for item in prefs[person1] if item in prefs[person2]])
    return 1 / (1 + sum_of_squares)
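As a quick check, the function can be run on the Manish/Niraj pair from the critics dictionary (their ratings are reproduced here so the snippet is self-contained):

```python
def similarity(prefs, person1, person2):
    # Distance-based similarity score, as on slide 24.
    shared = [item for item in prefs[person1] if item in prefs[person2]]
    if not shared:
        return 0
    sum_of_squares = sum(pow(prefs[person1][it] - prefs[person2][it], 2)
                         for it in shared)
    return 1 / (1 + sum_of_squares)

critics = {
    'Manish': {'Tashan': 4.5, 'Dilwale': 1.0, 'Superman Returns': 4.0},
    'Niraj': {'Dabang': 3.0, 'Tashan': 4.0, 'Just My Luck': 2.0,
              'Superman Returns': 3.0, 'The Night Listener': 3.0,
              'Dilwale': 2.0}}

# Shared movies: Tashan, Dilwale, Superman Returns
# sum_of_squares = 0.25 + 1.0 + 1.0 = 2.25, so score = 1 / 3.25
score = similarity(critics, 'Manish', 'Niraj')
print(round(score, 4))  # 0.3077
```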
25. Ranking the critics:
Score everyone against a given person and find the closest matches. In this
case, you're interested in learning which movie critics have tastes similar to
yours, so that you know whose advice to take when deciding on a movie.
# Returns the best matches for person from the prefs dictionary.
# Number of results and similarity function are optional params;
# the default is the distance-based score defined on the previous slide.
def topMatches(prefs, person, n=7, sim=similarity):
    scores = [(sim(prefs, person, other), other)
              for other in prefs if other != person]
    # Sort the list so the highest scores appear at the top
    scores.sort()
    scores.reverse()
    return scores[0:n]
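A runnable sketch of the ranking step, using a tiny made-up preference set with a hypothetical 'You' entry added for illustration:

```python
def similarity(prefs, person1, person2):
    # Distance-based similarity score from the earlier slide.
    shared = [item for item in prefs[person1] if item in prefs[person2]]
    if not shared:
        return 0
    sum_of_squares = sum(pow(prefs[person1][it] - prefs[person2][it], 2)
                         for it in shared)
    return 1 / (1 + sum_of_squares)

def topMatches(prefs, person, n=7, sim=similarity):
    # Score everyone against `person` and return the n closest matches.
    scores = [(sim(prefs, person, other), other)
              for other in prefs if other != person]
    scores.sort(reverse=True)
    return scores[:n]

prefs = {
    'You': {'Tashan': 4.0, 'Dilwale': 2.0},
    'Manish': {'Tashan': 4.5, 'Dilwale': 1.0},
    'Niraj': {'Tashan': 4.0, 'Dilwale': 2.0},
    'Govind': {'Tashan': 3.5, 'Dilwale': 3.5}}

# Niraj rates both movies exactly as 'You' do, so he tops the ranking.
print(topMatches(prefs, 'You', n=2))
```

`sort(reverse=True)` is equivalent to the slide's `sort()` followed by `reverse()`: both put the highest similarity scores first.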
26. Recommending items:
Critic     Similarity
Govind     0.99
Ranjan     0.98
Ashish     0.96
Shankul    0.92
Shubham    0.88
Manish     0.85
Niraj      0.84
This table suggests that you
should take Govind's advice on
what movie to watch.
27. Applications
A few examples of machine learning applications that you may be familiar with:
• The self-driving Google car.
• Search engines.
• Online recommendations like those from Amazon and Netflix -
machine learning applications for everyday life.
• Facebook's News Feed, which uses machine learning to personalize each
member's feed.
• Knowing what customers are saying about you on Twitter - machine
learning combined with linguistic rule creation.
28. Applications of Machine Learning in various domains:
• Web search
• Computational biology
• Finance
• E-commerce
• Space exploration
• Robotics
• Information extraction
• Social networks
• Debugging
29. Some other applications:
If you want to predict the questions that will be asked in
your semester examinations, you can simply run a
machine learning algorithm with the previous years'
questions as the input data, and then have fun.
30. Some popular books on Machine Learning:
Programming Collective Intelligence
- O’Reilly Media, Inc.
Machine Learning for Hackers
- O’Reilly Media, Inc.
31. References:
• Wikipedia - https://en.wikipedia.org/wiki/Machine_learning
• Book - Programming Collective Intelligence, O'Reilly Media, Inc.