2. Big Data
Widespread use of personal computers and
wireless communication leads to “big data”
We are both producers and consumers of data
Data is not random; it has structure, e.g.,
customer behavior
We need “big theory” to extract that structure from
data for
(a) Understanding the process
(b) Making predictions for the future
Dr.Ranjitha M, Kristu Jayanti College
3. Why “Learn”?
Machine learning is programming computers to
optimize a performance criterion using example
data or past experience.
There is no need to “learn” to calculate payroll
Learning is used when:
Human expertise does not exist (navigating on
Mars),
Humans are unable to explain their expertise
(speech recognition)
Solution changes in time (routing on a computer
network)
Solution needs to be adapted to particular cases
(user biometrics)
4. What We Talk About When
We Talk About “Learning”
Learning general models from data of particular
examples
Data is cheap and abundant (data warehouses,
data marts); knowledge is expensive and scarce.
Example in retail: Customer transactions to
consumer behavior:
People who bought “Blink” also bought “Outliers”
(www.amazon.com)
Build a model that is a good and useful
approximation to the data.
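The retail example above can be sketched as a simple co-purchase count. This is only an illustration: the transactions are invented, and the function names (co_purchase_counts, also_bought) are hypothetical, not taken from any real recommender system.

```python
from collections import Counter
from itertools import combinations

# Hypothetical transactions, invented for illustration only.
transactions = [
    {"Blink", "Outliers"},
    {"Blink", "Outliers", "The Tipping Point"},
    {"Blink", "The Tipping Point"},
    {"Outliers"},
    {"Blink", "Outliers"},
]

def co_purchase_counts(baskets):
    """Count how often each unordered pair of items shares a basket."""
    counts = Counter()
    for basket in baskets:
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return counts

def also_bought(item, baskets):
    """Rank other items by the estimated P(other item | item)."""
    pair_counts = co_purchase_counts(baskets)
    n_item = sum(1 for basket in baskets if item in basket)
    scores = {}
    for (a, b), n in pair_counts.items():
        if item == a:
            scores[b] = n / n_item
        elif item == b:
            scores[a] = n / n_item
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

recommendations = also_bought("Blink", transactions)
print(recommendations)
```

The ranked list is a very small "model" of consumer behavior: a useful approximation of the data, not an exact description of every customer.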
5. Data Mining
Retail: Market basket analysis, Customer
relationship management (CRM)
Finance: Credit scoring, fraud detection
Manufacturing: Control, robotics, troubleshooting
Medicine: Medical diagnosis
Telecommunications: Spam filters, intrusion
detection
Bioinformatics: Motifs, alignment
Web mining: Search engines
...
6. Basic Difference Between ML and
Traditional Programming
Traditional Programming: We feed in DATA
(input) + PROGRAM (logic), run it on the machine,
and get the output.
Machine Learning: We feed in DATA (input)
+ OUTPUT, run it on the machine during training,
and the machine creates its own
PROGRAM (logic), which can be evaluated during
testing.
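A minimal sketch of this contrast, assuming a temperature-conversion task chosen purely for illustration. In the traditional version the logic is written by hand; in the ML version a least-squares line fit recovers the logic from input/output pairs. The function names and data are hypothetical.

```python
def traditional_program(celsius):
    """Traditional programming: the logic (formula) is written by hand."""
    return 1.8 * celsius + 32.0

def fit_line(xs, ys):
    """Machine learning: least-squares fit of y = w*x + b from examples."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    w = sxy / sxx
    b = mean_y - w * mean_x
    return w, b

# Training: DATA (input) + OUTPUT are given, the logic is not.
train_inputs = [-10.0, 0.0, 10.0, 20.0, 30.0]
train_outputs = [14.0, 32.0, 50.0, 68.0, 86.0]
w, b = fit_line(train_inputs, train_outputs)

def learned_program(celsius):
    """The 'program' created by the machine from the examples."""
    return w * celsius + b

# Testing: compare both programs on inputs not used during training.
for c in [37.0, 100.0]:
    print(c, traditional_program(c), learned_program(c))
```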
7. What exactly does learning
mean for a computer?
A computer is said to be learning
from experience with respect to some class
of tasks if its performance in a given task
improves with that experience.
A computer program is said to learn from
experience E with respect to some class of
tasks T and performance measure P, if its
performance at tasks in T, as measured by P,
improves with experience E.
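The definition can be made concrete with a small sketch. It assumes a toy setup chosen only for illustration: task T is predicting y for a given x, experience E is a set of stored labeled examples, and performance P is the mean absolute error on fixed test points (lower error means better performance). The learner is a nearest-neighbor lookup; all names are hypothetical.

```python
def target(x):
    """The unknown relationship the learner should approximate."""
    return x * x

def make_experience(n_examples):
    """Experience E: n evenly spaced labeled examples on [0, 1]."""
    xs = [i / (n_examples - 1) for i in range(n_examples)]
    return [(x, target(x)) for x in xs]

def predict(experience, x):
    """Task T: predict y for x using the nearest stored example."""
    _, nearest_y = min(experience, key=lambda example: abs(example[0] - x))
    return nearest_y

def performance(experience, test_points):
    """Performance P: mean absolute error on the test points."""
    errors = [abs(predict(experience, x) - target(x)) for x in test_points]
    return sum(errors) / len(errors)

test_points = [i / 100 for i in range(101)]
sizes = [2, 3, 5, 9, 17]
errors = [performance(make_experience(n), test_points) for n in sizes]

for n, err in zip(sizes, errors):
    print(f"E = {n:2d} examples, P (mean abs error) = {err:.4f}")
```

In this sketch the error shrinks each time the amount of experience grows, which is what "learning" means in the definition above.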
8. How Does ML Work?
Gathering past data in any form suitable for processing.
The better the quality of data, the more suitable it will
be for modeling
Data Processing – The collected data is in raw form
and needs to be pre-processed.
Example:
Some tuples may have missing values for certain
attributes. In this case, the missing values have to be
filled with suitable values before machine learning or
any other form of data mining can be performed.
9. How Does ML Work?
Missing values for numerical attributes, such as the price
of a house, may be replaced with the mean value of
the attribute, whereas missing values for categorical
attributes may be replaced with the mode (the most
frequent value) of the attribute.
The appropriate replacement depends on the type of filter used.
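A minimal sketch of mean and mode imputation using only Python's standard library. The housing records and the column names ("price", "type") are invented for illustration, and None is assumed to mark a missing value.

```python
from statistics import mean, mode

# Hypothetical housing records; None marks a missing value.
records = [
    {"price": 250000.0, "type": "apartment"},
    {"price": None, "type": "house"},
    {"price": 350000.0, "type": None},
    {"price": 300000.0, "type": "house"},
]

def impute(rows, numerical, categorical):
    """Return copies of rows with missing values filled in.

    Numerical columns are filled with the column mean,
    categorical columns with the column mode.
    """
    fill = {}
    for col in numerical:
        fill[col] = mean(r[col] for r in rows if r[col] is not None)
    for col in categorical:
        fill[col] = mode(r[col] for r in rows if r[col] is not None)
    return [
        {k: (fill[k] if v is None and k in fill else v) for k, v in r.items()}
        for r in rows
    ]

cleaned = impute(records, numerical=["price"], categorical=["type"])
for row in cleaned:
    print(row)
```

The original records are left unchanged, so the raw and the pre-processed data can still be compared.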
If the data is in the form of text or images, it must be
converted to numerical form, such as a list, array, or
matrix.
In short, the data must be made relevant and consistent,
and converted into a format that the machine can
process.
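One common way to turn text into numbers is a bag-of-words count matrix. The sketch below is a simplified illustration (whitespace tokenization, lowercase only); the documents and function names are hypothetical.

```python
def build_vocabulary(documents):
    """Map each distinct lowercase word to a column index (sorted order)."""
    words = sorted({w for doc in documents for w in doc.lower().split()})
    return {w: i for i, w in enumerate(words)}

def to_count_vector(document, vocabulary):
    """Convert one document into a list of word counts."""
    vector = [0] * len(vocabulary)
    for w in document.lower().split():
        if w in vocabulary:
            vector[vocabulary[w]] += 1
    return vector

# Hypothetical documents, invented for illustration only.
documents = ["free offer free prize", "meeting at noon", "free meeting"]
vocabulary = build_vocabulary(documents)
matrix = [to_count_vector(d, vocabulary) for d in documents]

print(vocabulary)
for row in matrix:
    print(row)
```

Each row of the matrix is one document and each column is one word, so the text is now in a form a learning algorithm can process.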
10. How Does ML Work?
Divide the input data into training, cross-validation, and test sets.
A commonly used ratio between the respective sets is 6:2:2.
Build models with suitable algorithms and techniques on the
training set.
Test the model on data that was not fed to
it during training, and evaluate its performance
using metrics such as precision, recall, and F1 score.
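The steps above can be sketched end to end. This is a minimal illustration under stated assumptions: the data set is synthetic, the model is a single threshold placed midway between the class means, and all function names are hypothetical. In practice the cross-validation set is used to compare or tune models before the test set is touched; here its score is only reported.

```python
import random

def split_622(examples, seed=0):
    """Shuffle, then split into training / cross-validation / test (6:2:2)."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    n_train = len(shuffled) * 6 // 10
    n_val = len(shuffled) * 2 // 10
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test

def fit_threshold(train):
    """Build the model: threshold midway between the two class means."""
    positives = [x for x, y in train if y == 1]
    negatives = [x for x, y in train if y == 0]
    return (sum(positives) / len(positives) + sum(negatives) / len(negatives)) / 2

def precision_recall_f1(y_true, y_pred):
    """Precision, recall, and F1 score for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Synthetic data: feature x in 0..99, label 1 when x >= 50.
data = [(x, 1 if x >= 50 else 0) for x in range(100)]
train_set, val_set, test_set = split_622(data)
threshold = fit_threshold(train_set)

def evaluate(examples):
    """Score the fitted threshold model on examples it was not trained on."""
    y_true = [y for _, y in examples]
    y_pred = [1 if x >= threshold else 0 for x, _ in examples]
    return precision_recall_f1(y_true, y_pred)

print("cross-validation (precision, recall, F1):", evaluate(val_set))
print("test (precision, recall, F1):", evaluate(test_set))
```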
Pre-requisites to learn ML:
Linear Algebra
Statistics and Probability
Calculus
Graph theory
Programming Skills – Languages such as Python, R, MATLAB, C++, or
Octave
11. What is Machine Learning?
Optimize a performance criterion using
example data or past experience.
Role of Statistics: Inference from a sample
Role of Computer science: Efficient algorithms
to
Solve the optimization problem
Representing and evaluating the model for
inference
12. Definition of machine
learning
Arthur Samuel, an early American leader in the field of computer gaming
and artificial intelligence, coined the term “Machine Learning” in 1959
while at IBM. He defined machine learning as “the field of study that
gives computers the ability to learn without being explicitly
programmed.” However, there is no universally accepted definition for
machine learning. Different authors define the term differently.
Two more definitions follow.
1. Machine learning is programming computers to optimize a performance
criterion using example data or past experience. We have a model
defined up to some parameters, and learning is the execution of a
computer program to optimize the parameters of the model using the
training data or past experience. The model may be predictive to make
predictions in the future, or descriptive to gain knowledge from data, or
both.
2. The field of study known as machine learning is concerned with the
question of how to construct computer programs that automatically
improve with experience.
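The first definition ("a model defined up to some parameters", optimized using training data) can be illustrated with a short sketch. It assumes a linear model y = w*x + b, mean squared error as the performance criterion, and plain gradient descent as the optimizer. These are choices made for illustration, not the only way to learn; the data and the learning-rate and step values are hypothetical.

```python
def predict(params, x):
    """The model, defined up to its parameters (w, b)."""
    w, b = params
    return w * x + b

def mean_squared_error(params, data):
    """Performance criterion: average squared prediction error."""
    return sum((predict(params, x) - y) ** 2 for x, y in data) / len(data)

def train(data, learning_rate=0.05, steps=2000):
    """Learning: adjust (w, b) by gradient descent on the training data."""
    w, b = 0.0, 0.0
    n = len(data)
    for _ in range(steps):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b

# Hypothetical training data generated from y = 2x + 1.
training_data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]

initial_error = mean_squared_error((0.0, 0.0), training_data)
learned_params = train(training_data)
final_error = mean_squared_error(learned_params, training_data)

print("learned (w, b):", learned_params)
print("error before:", initial_error, "after:", final_error)
print("prediction for x = 10:", predict(learned_params, 10.0))
```

The learned model is predictive (it can be applied to new x values) and also descriptive (the parameters summarize the relationship in the data).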
13. Learning
Definition of learning (recall from the earlier slide,
Lecture 1):
A computer program is said to learn from experience E
with respect to some class of tasks T and performance
measure P, if its performance at tasks in T, as measured
by P, improves with experience E.
i) Handwriting recognition learning problem
• Task T: Recognising and classifying handwritten
words within images
• Performance measure P: Percent of words correctly classified
• Training experience E: A dataset of handwritten words
with given classifications
14. Learning
ii) A robot driving learning problem
• Task T: Driving on highways using vision
sensors
• Performance measure P: Average distance
traveled before an error
• Training experience E: A sequence of images
and steering commands recorded while
observing a human driver
15. Learning
iii) A chess learning problem
• Task T: Playing chess
• Performance measure P: Percent of games won
against opponents
• Training experience E: Playing practice games
against itself
Definition
A computer program which learns from
experience is called a machine learning program
or simply a learning program. Such a program is
sometimes also referred to as a learner.