UNIT 1
Introduction
KCS 055/ KOE 073: Machine Learning Introduction
Anurag Malik
(Associate Prof. CS & E)
CS & E Dept. M.I.T Moradabad
B.Tech V/ VII CS /ME
Recommended Books:
1. Tom M. Mitchell, "Machine Learning", McGraw-Hill Education (India) Private Limited, 2013.
2. Ethem Alpaydin, "Introduction to Machine Learning" (Adaptive Computation and Machine Learning), The MIT Press, 2004.
3. Stephen Marsland, "Machine Learning: An Algorithmic Perspective", CRC Press, 2009.
4. C. Bishop, "Pattern Recognition and Machine Learning", Berlin: Springer-Verlag.
October 14, 2023
Syllabus KCS 055
What is Learning?
 “Learning denotes changes in a system that ... enable a system to do the
same task … more efficiently the next time.” - Herbert Simon
• Learning is the process of acquiring new understanding, knowledge,
behaviors, skills, values, attitudes and preferences.
• The ability to learn is possessed by humans, animals, and some machines.
 “Learning is making useful changes in our minds.” - Marvin Minsky
• Some learning is immediate, induced by a single event (e.g. being burned
by a hot stove), but much skill and knowledge accumulates from repeated
experiences.
Types of Learning
1. Visual (Spatial): Information is represented with images and spatial layouts,
which helps students focus on meaning. Suits fields such as
architecture, engineering, project management, or design.
2. Aural (Auditory-Musical): If you need someone to
tell you something out loud to understand it, you are an
auditory learner. Suits roles such as musician, recording engineer,
speech pathologist, or language teacher.
3. Verbal (Linguistic): People who find it easier to express
themselves by writing or speaking can be regarded as verbal learners.
4. Physical (Kinesthetic): In this style, learning happens
when the learner carries out a physical activity, rather
than listening to a lecture or watching a demonstration.
Types of Learning (Cont…)
5. Logical (Mathematical): If you like using your
brain for logical and mathematical reasoning,
you're a logical learner. You easily recognise patterns
and can connect seemingly unrelated concepts.
Suits fields such as scientific research, accountancy, bookkeeping,
or computer programming.
6. Social (Interpersonal): If you are at your best when socializing
and communicating with people, both verbally and
non-verbally, you are a social learner.
People often come to you to be listened to and to ask for
advice. Suits fields such as counseling, teaching, training and coaching,
sales, politics, and human resources, among others.
Related Fields
Machine learning is primarily concerned with the accuracy and
effectiveness of the computer system.
Related fields shown around machine learning in the slide diagram: psychological models, data mining, cognitive science, decision theory, information theory, databases, neuroscience, statistics, evolutionary models, control theory.
Well-Posed Learning Problems
• Learning can be defined through a computer program that improves its
performance at some task through experience.
• Definition of Learning: A computer program is said to learn from
experience E with respect to some class of tasks T and performance
measure P, if its performance at tasks in T, as measured by P, improves
with experience E.
• Some examples of well-posed learning problems:
  ◦ Learning to play checkers
  ◦ Learning to recognize spoken words (SPHINX System)
  ◦ Learning to drive an autonomous vehicle (ALVINN System)
  ◦ Learning to classify new astronomical structures
  ◦ Predicting recovery rates of pneumonia patients
  ◦ Detecting fraudulent use of credit cards
Well – Posed Learning Problems
 Three features: the class of tasks, the measure of performance to be
improved, and the source of experience.
 A checkers learning problem:
 Task T: playing checkers
 Performance measure P: percent of games won against opponents
 Training experience E: playing practice games against itself
 We can specify many learning problems in this fashion, such as learning
to recognize handwritten words, or learning to drive a robotic automobile
autonomously.
 A handwriting recognition learning problem:
 Task T: recognizing and classifying handwritten words within images
 Performance measure P: percent of words correctly classified
 Training experience E: a database of handwritten words with given
classifications
Well – Posed Learning Problems
 A robot driving learning problem:
Task T: driving on public four-lane highways using vision
sensors
Performance measure P: average distance traveled before
an error (as judged by human overseer)
Training experience E: a sequence of images and steering
commands recorded while observing a human driver
DESIGNING A LEARNING SYSTEM
1. Choosing the Training Experience
2. Choosing the Target Function
3. Choosing a Representation for the Target
Function
4. Choosing a Function Approximation Algorithm
5. The Final Design
Designing a Learning System
• While designing a learning system, various design issues and approaches
must be considered.
1. Choosing the Training Experience: The first design choice we face is to
choose the type of training experience from which our system will
learn. The type of training experience available can have a significant
impact on success or failure of the learner.
 One key attribute is whether the training experience provides direct or
indirect feedback regarding the choices made by the performance system.
 A second important attribute of the training experience is the degree to
which the learner controls the sequence of training examples.
 A third important attribute of the training experience is how well it
represents the distribution of examples over which the final system
performance P must be measured.
Designing a Learning System
A checkers learning problem:
Task T: Playing checkers (draughts)
Performance Measures P: percent of games won in world tournament
Training Experience E: games played against itself
What experience?
What exactly should be learned?
How shall it be represented?
What specific algorithm to learn it?
Direct versus Indirect Learning
1. Individual checkers board states and correct
move for each
2. Move sequences and final outcomes of various
games played
Credit assignment problem - the degree to which
each move in the sequence deserves credit or
blame for the final outcome - game can be lost
even when early moves are optimal, if these are
followed later by poor moves or vice versa
Teacher or not?
Degree to which learner controls the sequence of training examples
1. Teacher selects informative board states & provides the correct
moves
2. For each proposed board state the learner finds particularly
confusing it asks the teacher for correct move
3. Learner may have complete control as it does when it learns by
playing itself with no teacher - learner may choose between
experimenting with novel board states or honing its skill by
playing minor variations of promising lines of play
1. Choose Training Experience
How well training experience represents the distribution of examples over
which the final system performance P must be measured
P is the percent of games won in the world tournament, so there is an obvious danger when E
consists only of games played against itself (probably can’t get the world
champion to teach the computer!)
Most current theories of machine learning assume that the distribution of
training examples is identical to the distribution of test examples
It is IMPORTANT to keep in mind that this assumption must often be
violated in practice.
E: play games against itself (advantage of getting a lot of data this way)
2. Choose a Target Function
The next design choice is to determine exactly what type of knowledge
will be learned and how this will be used by the performance
program.
ChooseMove: B -> M, where B is the set of legal board states and M is the set of legal
moves (hopefully the output is the “best” legal move)
Alternatively, a function V: B -> ℝ, which maps each board state in B to a real value,
where higher scores are assigned to better board states
Given V, use the legal moves to generate every successor board state,
then use V to choose the best successor and therefore the best legal move
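A minimal sketch of choosing a move with an evaluation function V. The helpers `legal_moves` and `apply_move` are hypothetical names that the slides do not define, and the toy "board" is just an integer so that the example runs on its own.

```python
from typing import Callable, Iterable, TypeVar

Board = TypeVar("Board")
Move = TypeVar("Move")


def choose_move(
    board: Board,
    legal_moves: Callable[[Board], Iterable[Move]],
    apply_move: Callable[[Board, Move], Board],
    V: Callable[[Board], float],
) -> Move:
    """Return the legal move whose successor board state scores highest under V."""
    return max(legal_moves(board), key=lambda m: V(apply_move(board, m)))


# Toy stand-in for checkers (an assumption for illustration only):
# a "board" is an integer, a move adds -1, +1 or +2, and V prefers values near 10.
def toy_moves(b):
    return [-1, 1, 2]


def toy_apply(b, m):
    return b + m


def toy_V(b):
    return -abs(10 - b)


best = choose_move(7, toy_moves, toy_apply, toy_V)
```

With this toy V the move +2 is selected, because its successor state 9 is the closest to 10.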
Choose a Target Function II
Let us define the target value V(b) for an
arbitrary board state b in B, as follows
V(b) = 100, if b is a final board state that is won
V(b) = -100, if b is a final board state that is lost
V(b) = 0, if b is a final board state that is a draw
V(b) = V(b´), if b is not a final state where b´ is
the best final board state starting from b
assuming both players play optimally
3. Choosing a Representation for the Target
Function
• Given the ideal target function V, we will choose a representation that the
learning system will use to describe the function V' that it will learn.
• The function V' will be calculated as a linear combination of the following
board features:
  ◦ x1: the number of black pieces on the board
  ◦ x2: the number of red pieces on the board
  ◦ x3: the number of black kings on the board
  ◦ x4: the number of red kings on the board
  ◦ x5: the number of black pieces threatened by red (which can be captured
on red's next turn)
  ◦ x6: the number of red pieces threatened by black
3. Choosing a Representation for the
Target Function
• Thus, the learning program will represent V'(b) as a linear
function of the form:
  V'(b) = w0 + w1x1 + w2x2 + w3x3 + w4x4 + w5x5 + w6x6
• Here w0 through w6 are numerical coefficients, or weights,
to be chosen by the learning algorithm. The weights w1 through w6
determine the relative importance of the board features x1 through x6,
and w0 provides an additive constant.
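The linear form can be evaluated directly. The sketch below assumes the features x1..x6 have already been extracted from the board; the weight values are hypothetical and chosen only for illustration.

```python
def v_hat(weights, features):
    """Linear evaluation V'(b) = w0 + w1*x1 + ... + w6*x6.

    weights  -- sequence (w0, w1, ..., w6)
    features -- sequence (x1, ..., x6) extracted from board state b
    """
    if len(weights) != len(features) + 1:
        raise ValueError("need one weight per feature plus the bias w0")
    return weights[0] + sum(w * x for w, x in zip(weights[1:], features))


# Hypothetical weights, not learned values.
w = [0.0, 1.0, -1.0, 3.0, -3.0, -0.5, 0.5]
# x1=3 black pieces, x2=0 red pieces, x3=1 black king, all other features 0.
b = [3, 0, 1, 0, 0, 0]
score = v_hat(w, b)
```

Here the score is 3 * 1.0 + 1 * 3.0 = 6.0; a learning algorithm would adjust the weights rather than fix them by hand.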
Design So Far
T: Checkers
P: percent of games won in world tournament
E: games played against self
V: Board -> ℝ
Target Function Representation:
V´(b) = w0 + w1x1 + w2x2 + w3x3 + w4x4 + w5x5 + w6x6
4. Choose Function Approximation Algorithm
• In order to learn the approximation V' of the target function we require a set of training
examples, each describing a specific board state b and the training
value Vtrain(b) for b.
• In other words, each training example is an ordered pair of the form
<b, Vtrain(b)>.
• For instance, the following training example describes a board state b in
which black has won the game (note x2 = 0 indicates that red has no
remaining pieces) and for which the target function value Vtrain(b) is
therefore +100.
<(x1=3, x2=0, x3=1, x4=0, x5=0, x6=0), +100>
a) Estimating Training Values:
b) Adjusting the weights:
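The slide only names these two sub-steps. In Mitchell's textbook (the first recommended book), the training value of an intermediate state is estimated as Vtrain(b) = V'(Successor(b)), and the weights are adjusted with the LMS rule wi <- wi + eta * (Vtrain(b) - V'(b)) * xi. A minimal sketch of the weight update follows; the learning rate `eta` and the number of repetitions are assumptions.

```python
def v_hat(weights, features):
    """Linear evaluation V'(b) = w0 + w1*x1 + ... + w6*x6."""
    return weights[0] + sum(w * x for w, x in zip(weights[1:], features))


def lms_update(weights, features, v_train, eta=0.01):
    """One LMS step: w_i <- w_i + eta * (Vtrain(b) - V'(b)) * x_i, with x_0 = 1."""
    error = v_train - v_hat(weights, features)
    xs = [1.0] + list(features)  # x_0 = 1 pairs with the bias weight w0
    return [w + eta * error * x for w, x in zip(weights, xs)]


# Repeatedly present the training example from the slide: black has won, value +100.
weights = [0.0] * 7
example = ([3, 0, 1, 0, 0, 0], 100.0)
for _ in range(200):
    weights = lms_update(weights, *example)
prediction = v_hat(weights, example[0])
```

After enough updates the prediction for this board state approaches +100. Weights of features that are zero in the example (x2, x4, x5, x6) are left unchanged, since each update is proportional to the feature value.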
5. The Final Design
• The final design of our checkers learning system can be naturally described by four
distinct program modules that represent the central components in many learning
systems. These four modules are:
1. Performance System: Solves the performance task using the learned target function(s). It
takes an instance of a new problem as input and produces a trace of its solution (history) as output.
2. Critic: Takes the history of the problem as input and produces a set of training examples of the
target function as output.
3. Generalizer: Takes training examples as input and produces an estimate of the target function
as its output hypothesis. It generalizes from specific training examples, hypothesizing a
general function that covers all examples.
4. Experiment Generator: Takes the current hypothesis (currently learned function) as
input and outputs a new problem (i.e., initial board state) for the Performance System to
explore. Its role is to pick new practice problems that will maximize the learning rate
of the overall system.
5. The Final Design
Fig. Final Design of checkers learner problem
What is Machine Learning?
 A branch of artificial intelligence, concerned with the
design and development of algorithms that allow
computers to evolve behaviors based on empirical data.
 As intelligence requires knowledge, it is necessary for
the computers to acquire knowledge.
 “Machine learning refers to a system capable of the
autonomous acquisition and integration of knowledge.”
 https://www.youtube.com/watch?v=Cx5aNwnZYDc
 https://www.youtube.com/watch?v=YhSeTEumjVA
 https://www.youtube.com/watch?v=ZoemTySxFso
Machine Learning Paradigms
 rote learning
 learning by being told (advice-taking)
 learning from examples (induction)
 learning by analogy
 speed-up learning
 concept learning
 clustering
 discovery
Why Machine Learning?
 No human experts
 industrial/manufacturing control
 mass spectrometer analysis, drug design, astronomic discovery
 Black-box human expertise
 face/handwriting/speech recognition
 driving a car, flying a plane
 Rapidly changing phenomena
 credit scoring, financial modeling
 diagnosis, fraud detection
 Need for customization/personalization
 personalized news reader
 movie/book recommendation
 Recent progress in algorithms and theory
 Growing Flood of online data
 Computational power is available
AI vs ML vs DL
S. No. | Data Science | Machine Learning
1. | Data Science is a field about processes and systems to extract knowledge from structured and semi-structured data. | Machine Learning is a field of study that gives computers the capability to learn without being explicitly programmed.
2. | Needs the entire analytics universe. | Combination of machine and data science.
3. | Branch that deals with data. | Machines utilize data science techniques to learn about the data.
4. | Data in Data Science may or may not have come from a machine or mechanical process. | It uses various techniques like regression and supervised clustering.
5. | Data Science is a broader term that not only focuses on algorithms and statistics but also takes care of data processing. | It is focused only on algorithms and statistics.
6. | It is a broad term for multiple disciplines. | It fits within data science.
7. | Includes many operations, such as data gathering, data cleaning, data manipulation, etc. | It is of three types: Unsupervised learning, Reinforcement learning, Supervised learning.
8. | Example: Netflix uses Data Science technology. | Example: Facebook uses Machine Learning technology.
Tools used for AI,ML and Deep Learning
1. TensorFlow
• TensorFlow is an open source software library for numerical computation using data flow graphs.
It was developed by engineers and researchers working on the Google
Brain Team. Its flexible architecture lets you deploy computation to multiple GPUs or CPUs
in a server, mobile device, or desktop using a single API.
2. IBM Watson
• IBM has been a pioneer in the field of Artificial Intelligence and has worked on this technology for a very long time.
The company has its own AI platform named Watson, which houses numerous AI tools for both business
users and developers. Watson is available as a set of open APIs through which users can access many starter
kits and sample code. Users can use them to build virtual agents and cognitive search engines. Watson also
offers a chatbot building platform that is aimed at beginners and requires
little machine learning skill.
3. Caffe
• Caffe is a deep learning framework written in C++ that has been developed with modularity, expression, and speed in
mind. Caffe focuses on convolutional networks for computer vision
applications.
4. Deeplearning4j
• Deeplearning4j is described as the first open-source, commercial-grade, distributed deep learning library
developed for Scala and Java. Its easy-to-use infrastructure makes it attractive for non-researchers. A
notable feature of DL4J is that it can import neural net models from many major frameworks via
Keras, including Theano, Caffe, and TensorFlow.
5. Torch
• Torch is also an open source machine learning library, used by many large IT organizations
including Yandex, IBM, Idiap Research Institute, and Facebook AI Research Group. It can also be described
as a scientific computing framework and scripting language based on the Lua programming language.
After its successful use on web platforms, Torch has also been extended for use on iOS and
Android.
Learning System Model
Diagram labels: Input Samples, Learning Method, System, Training, Testing.
Training and Testing
Diagram labels: Universal set (unobserved); Training set (observed), obtained by data acquisition; Testing set (unobserved), encountered in practical usage.
Training and Testing
• Training is the process of making the system able to learn.
• No free lunch rule: no learning method works best on every problem, so some assumptions are needed:
  ◦ The training set and testing set come from the same distribution
  ◦ The learner must make some assumptions, or have some bias
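A minimal sketch of separating observed data into a training set and a held-out testing set. Shuffling before the split is one simple way to make both parts resemble the same distribution; the 25% test fraction and the fixed seed are assumptions.

```python
import random


def train_test_split(samples, test_fraction=0.25, seed=0):
    """Shuffle a copy of the samples and split it into (train, test) lists."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


data = list(range(20))  # placeholder samples
train, test = train_test_split(data)
```

The system is fitted on `train` only, and `test` is used afterwards to estimate how it will behave on unobserved data.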
Algorithms
• Supervised learning
• Unsupervised learning
EXAMPLES OF ML
•Personalization: Online services like Amazon/Netflix
use AI to personalize our experience. They learn from
our own and other users' previous purchases and recommend
relevant content for us.
•Image recognition: ML can be used for face detection
in an image. There is a separate category for each
person in a database of several people.
•Medical diagnoses: ML is trained to recognize
cancerous tissues.
•Speech Recognition: The translation of spoken words
into text. It is used in voice searches and more. Voice
user interfaces include voice dialing, call routing, and
appliance control. (See also natural language processing.)
•Data mining: The application of ML methods to large
databases.
•Fraud detection: Banks use AI to detect unusual
activity on our accounts. Unexpected activity, such as
foreign transactions, could be flagged by the algorithm.
DATA MINING (KDD)
KDD Process
Selection: Obtain data from various sources.
Preprocessing: Cleanse data.
Transformation: Convert to common format. Transform
to new format.
Data Mining: Obtain desired results.
Interpretation/Evaluation: Present results to user in
meaningful manner.
KDD Process: Several Key Steps
Many people treat data mining as a synonym for another popularly used term, Knowledge
Discovery from Data, or KDD. Alternatively, others view data mining as simply an essential
step in the process of knowledge discovery. Knowledge discovery as a process
is depicted in Figure 1.4 and consists of an iterative sequence of the following steps:
1. Data cleaning (to remove noise and inconsistent data)
2. Data integration (where multiple data sources may be combined)
3. Data selection (where data relevant to the analysis task are retrieved from the
database)
4. Data transformation (where data are transformed or consolidated into forms
appropriate for mining by performing summary or aggregation operations, for instance)
5. Data mining (an essential process where intelligent methods are applied in order to
extract data patterns)
6. Pattern evaluation (to identify the truly interesting patterns representing knowledge
based on some interestingness measures)
7. Knowledge presentation (where visualization and knowledge representation
techniques are used to present the mined knowledge to the user)
Steps 1 to 4 are different forms of data preprocessing, where the data are prepared for
mining. The data mining step may interact with the user or a knowledge base. The
interesting patterns are presented to the user and may be stored as new knowledge in the
knowledge base.
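The KDD steps can be sketched as a small pipeline. Everything below is a toy assumption: the records are hypothetical, and simple frequency counting stands in for a real mining method.

```python
from collections import Counter

raw_sources = [
    [" Milk", "bread ", None],          # source 1 (hypothetical records)
    ["MILK", "eggs", "bread", "milk"],  # source 2
]

# Selection / integration: gather the records from all sources.
selected = [item for source in raw_sources for item in source]
# Preprocessing (cleaning): drop missing values.
cleaned = [item for item in selected if item is not None]
# Transformation: convert to a common format.
transformed = [item.strip().lower() for item in cleaned]
# Data mining: frequency counting stands in for a mining method here.
patterns = Counter(transformed)
# Interpretation / evaluation: keep items seen at least twice, most frequent first.
report = [(item, n) for item, n in patterns.most_common() if n >= 2]
```

The threshold of two occurrences plays the role of an interestingness measure: only patterns that pass it are presented to the user.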
•Email filtering: Email services use AI to filter
incoming emails. Users can train their spam filters
by marking emails as spam.
•Prediction: ML can be used in prediction systems.
For example, to compute the probability that a loan applicant
will default, the system needs to
classify the available data into groups.
•Computer vision, Computational biology, Robot
control, Handwriting recognition
History of ML
• 1950 – Alan Turing creates the “Turing Test” to determine if a computer has
real intelligence. To pass the test, a computer must be able to fool a human into
believing it is also human.
• 1952 – Arthur Samuel wrote the first computer learning program. The program
played the game of checkers, and the IBM computer improved at the game the more it
played, studying which moves made up winning strategies and incorporating those
moves into its program.
• 1957 – Frank Rosenblatt designed the first neural network for computers (the
perceptron), which simulates the thought processes of the human brain.
• 1967 – The “nearest neighbor” algorithm was written, allowing computers to
begin using very basic pattern recognition. This could be used to map a route for
traveling salesmen, starting at a random city but ensuring they visit all cities during a
short tour.
• 1979 – Students at Stanford University invent the “Stanford Cart”, which can
navigate obstacles in a room on its own.
• 1981 – Gerald DeJong introduces the concept of Explanation Based Learning
(EBL), in which a computer analyses training data and creates a general rule it can
follow by discarding unimportant data.
History of ML
 1985 — Terry Sejnowski invents NetTalk, which learns to pronounce words
the same way a baby does.
 1990s — Work on machine learning shifts from a knowledge-driven
approach to a data-driven approach. Scientists begin creating programs for
computers to analyze large amounts of data and draw conclusions — or “learn” —
from the results.
 1997 — IBM’s Deep Blue beats the world champion at chess.
 2006 — Geoffrey Hinton coins the term “deep learning” to explain new
algorithms that let computers “see” and distinguish objects and text in images
and videos.
 2010 — The Microsoft Kinect can track 20 human features at a rate of 30
times per second, allowing people to interact with the computer via movements
and gestures.
 2011 — IBM’s Watson beats its human competitors at Jeopardy.
 2011 — Google Brain is developed, and its deep neural network can learn to
discover and categorize objects much the way a cat does.
History of ML
 2012 – Google’s X Lab develops a machine learning algorithm that is able to
autonomously browse YouTube videos to identify the videos that contain cats.
 2014 – Facebook develops DeepFace, a software algorithm that is able to
recognize or verify individuals on photos to the same level as humans can.
 2015 – Amazon launches its own machine learning platform.
 2015 – Microsoft creates the Distributed Machine Learning Toolkit, which
enables the efficient distribution of machine learning problems across multiple
computers.
 2015 – Over 3,000 AI and Robotics researchers, endorsed by Stephen
Hawking, Elon Musk and Steve Wozniak (among many others), sign an open letter
warning of the danger of autonomous weapons which select and engage targets
without human intervention.
 2016 – Google’s artificial intelligence algorithm beats a professional player at
the Chinese board game Go, which is considered the world’s most complex board
game and is many times harder than chess. The AlphaGo algorithm developed by
Google DeepMind managed to win five games out of five in the Go competition.
Some Issues in Machine Learning
 What algorithms can approximate functions well (and when)?
 How does number of training examples influence accuracy?
 How does complexity of hypothesis representation impact it?
 How does noisy data influence accuracy?
 What are the theoretical limits of learnability?
 How can prior knowledge of learner help?
 What clues can we get from biological learning systems?
 How can systems alter their own representations?
 Understanding Which Processes Need Automation.
 Lack of Quality Data.
 Inadequate Infrastructure.
 Implementation.
 Lack of Skilled Resources.
TRADITIONAL PROGRAMMING VS ML
Machine Learning Approaches
TYPES OF ML
Using data for answering questions
Training Predicting
Supervised vs. Unsupervised Learning
• Supervised learning (classification)
  ◦ Supervision: The training data (observations, measurements, etc.)
are accompanied by labels indicating the class of the observations
  ◦ New data is classified based on the training set
  ◦ No new class is generated
• Unsupervised learning (clustering)
  ◦ The class labels of the training data are unknown
  ◦ Given a set of measurements, observations, etc., the aim is to
establish the existence of classes or clusters in the data
  ◦ New classes can be generated.
1. SUPERVISED LEARNING
• Supervised Learning Algorithms are the ones that involve direct
supervision of the operation.
• The developer labels sample data and sets strict boundaries within which the
algorithm operates.
• Learn through examples of which we know the desired output (what we
want to predict).
• The primary purpose of supervised learning is to scale the scope of data
and to make predictions of unavailable, future or unseen data based on
labeled sample data
1. SUPERVISED LEARNING
• It is a spoon-fed version of machine learning:
  ◦ you select what kind of information (samples) to “feed” the
algorithm;
  ◦ you specify what kind of results are desired (for example “yes/no” or
“true/false”).
• Example:
 Is this a cat or a dog?
 Are these emails spam or not?
 Predict the market value of houses, given the square meters, number
of rooms, neighborhood, etc.
TYPES OF SUPERVISED LEARNING
• Classification separates the data, Regression fits the data.
TYPES OF SUPERVISED LEARNING
I. Classification (Categorical Target Variable)
• Classification is the process in which incoming data is labeled
based on past data samples: the algorithm is trained on manually labeled examples to
recognize certain types of objects and categorize them
accordingly.
• The system has to know how to differentiate types of
information and perform optical character, image, or binary
recognition (whether a particular bit of data is compliant or
non-compliant with specific requirements, in a “yes” or
“no” manner).
• e.g. Medical Imaging.
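A minimal classification sketch using a 1-nearest-neighbor rule, one of the simplest supervised classifiers. The labeled samples below are hypothetical and only illustrate predicting a categorical target from past examples.

```python
import math


def nearest_neighbor_label(train, query):
    """Return the label of the training point closest to query (1-NN)."""
    _, label = min(train, key=lambda pair: math.dist(pair[0], query))
    return label


# Hypothetical labeled samples: (feature vector, class label).
train = [((1.0, 1.0), "cat"), ((1.2, 0.8), "cat"),
         ((4.0, 4.2), "dog"), ((3.8, 4.0), "dog")]
predicted = nearest_neighbor_label(train, (3.5, 3.9))
```

The query point lies closest to the "dog" samples, so it receives that label; no new class can appear, because the output is always one of the training labels.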
TYPES OF SUPERVISED LEARNING
II. Regression (Continuous Target Variable)
• Regression is the process of identifying patterns and
calculating the predictions of continuous outcomes.
• The system has to understand the numbers, their values,
grouping (for example, heights and widths), etc.
• eg. Housing Price Prediction
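A minimal regression sketch: an ordinary least squares fit of a straight line to one feature. The areas and prices are hypothetical numbers in arbitrary units, used only to show predicting a continuous target.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical data: floor area in square meters vs. price (arbitrary units).
areas = [50, 60, 80, 100]
prices = [150, 180, 240, 300]
slope, intercept = fit_line(areas, prices)
predicted_price = slope * 70 + intercept
```

Unlike classification, the prediction is a number on a continuous scale, here the estimated price of a 70 square meter house.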
PROS & CONS OF SUPERVISED LEARNING
PROS
• It allows collecting data or producing output from previous experience.
• It is more trustworthy than unsupervised learning, which can be
computationally complex and less accurate in some instances.
CONS
• Concrete examples are required for training classifiers.
• Decision boundaries can be overtrained in the absence of the right examples.
• Classifying big data can be difficult.
EXAMPLE OF SUPERVISED
LEARNING ALGORITHMS
• Linear Regression
• k-Nearest Neighbor
• Naive Bayes
• Decision Trees
• Support Vector Machine (SVM)
• Random Forest
• Neural Networks (Deep learning)
Classification by Decision Tree Induction (DTI)
• DTI is the learning of decision trees from class-labeled training tuples.
• A decision tree is a flowchart-like tree structure, where each internal node
(non-leaf node) denotes a test on an attribute, each branch represents an
outcome of the test, and each leaf node (or terminal node) holds a class
label. The topmost node is the root node.
• Why are DT classifiers so popular?
  ◦ The construction of DT classifiers does not require any domain
knowledge or parameter setting, and is therefore appropriate for
exploratory knowledge discovery.
  ◦ DTs can handle high-dimensional data.
  ◦ Their representation of acquired knowledge in tree form is intuitive
and generally easy for humans to assimilate.
  ◦ They generally have good accuracy.
  ◦ They may be used in medicine, manufacturing and production, financial
analysis, astronomy, and molecular biology.
Output: A Decision Tree for “buys_computer”
age?
  <=30: student?
    no: no
    yes: yes
  31..40: yes
  >40: credit rating?
    excellent: no
    fair: yes
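A decision tree is applied by following one test per level until a leaf is reached. The sketch below assumes the "buys_computer" tree reads as follows: age is the root test; for age <=30 the student test decides (no gives no, yes gives yes); age 31..40 is always yes; for age >40 the credit rating decides (excellent gives no, fair gives yes).

```python
def buys_computer(age, student, credit_rating):
    """Walk the tree: age is the root test, then student or credit rating.

    age           -- "<=30", "31..40" or ">40"
    student       -- "yes" or "no"
    credit_rating -- "fair" or "excellent"
    """
    if age == "<=30":
        return "yes" if student == "yes" else "no"
    if age == "31..40":
        return "yes"
    if age == ">40":
        return "yes" if credit_rating == "fair" else "no"
    raise ValueError(f"unknown age bucket: {age!r}")


label = buys_computer("<=30", "yes", "fair")
```

Each `if` corresponds to a branch out of an internal node, and each returned string is a leaf label, which is why trees are easy for humans to read.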
Bayesian Classification: Why?
 A statistical classifier: performs probabilistic prediction, i.e., predicts
class membership probabilities( that a given tuple belongs to a particular
class)
 Foundation: Based on Bayes’ Theorem given by Thomas Bayes
 Performance: A simple Bayesian classifier, naïve Bayesian classifier,
has comparable performance with decision tree and selected neural
network classifiers.
 Class Conditional Independence : Naïve Bayesian Classifiers assume
that the effect of an attribute value on a given class is independent of the
values of the other attributes. This assumption is called class conditional
independence.
 Incremental: Each training example can incrementally
increase/decrease the probability that a hypothesis is correct — prior
knowledge can be combined with observed data
 Standard: Even when Bayesian methods are computationally
intractable, they can provide a standard of optimal decision making
against which other methods can be measured
 Bayesian Belief Network: are graphical models that allow the
representation of dependencies among subsets of attributes
Naïve Bayesian Classification
• The Naïve Bayes classifier uses all the attributes
• Two assumptions:
  ◦ Attributes are equally important
  ◦ Attributes are statistically independent given the class,
i.e., knowing the value of one attribute
says nothing about the value of another
• The equal-importance and independence assumptions
rarely hold exactly in real-life datasets
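A minimal sketch of a naïve Bayes classifier for categorical attributes, built on the independence assumption above: each class is scored as P(C) times the product of P(xi | C). The toy rows are hypothetical, and no smoothing is applied, so an attribute value never seen with a class drives that class's score to zero.

```python
from collections import Counter, defaultdict


def train_naive_bayes(rows, labels):
    """Count class frequencies and per-class attribute-value frequencies."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (attribute index, class) -> value counts
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            value_counts[(i, label)][value] += 1
    return class_counts, value_counts


def predict(model, row):
    """Pick the class maximizing P(C) * product of P(x_i | C)."""
    class_counts, value_counts = model
    total = sum(class_counts.values())
    scores = {}
    for label, n in class_counts.items():
        score = n / total
        for i, value in enumerate(row):
            score *= value_counts[(i, label)][value] / n
        scores[label] = score
    return max(scores, key=scores.get)


# Hypothetical toy data: (age bucket, student?) -> buys computer?
rows = [("<=30", "no"), ("<=30", "yes"), ("31..40", "no"),
        (">40", "yes"), (">40", "no"), ("<=30", "no")]
labels = ["no", "yes", "yes", "yes", "no", "no"]
model = train_naive_bayes(rows, labels)
prediction = predict(model, ("<=30", "yes"))
```

For the query above, class "no" scores zero because no non-buyer in the toy data is a student, so "yes" is predicted. Practical implementations usually add smoothing to avoid such hard zeros.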
Bayesian Theorem: Basics
• Let X be a data sample (“evidence”): class label is unknown
• Let H be a hypothesis that X belongs to class C
  ◦ E.g., our world of tuples is confined to customers described by the attributes age and
income. X is a 35-year-old customer with an income of $40,000.
• Classification is to determine P(H|X), the posterior probability that the hypothesis holds given
the observed data sample X. P(H|X) reflects the probability that customer X will buy a
computer given that we know the customer’s age and income.
• P(H) (prior probability), the initial probability of the hypothesis
  ◦ E.g., X will buy a computer, regardless of age, income, …
• P(X): prior probability of X, the probability that the sample data is observed (that a person
from our set of customers is 35 years old and earns $40,000)
• P(X|H) (likelihood), the probability of observing the sample X, given that
the hypothesis holds
  ◦ E.g., given that X will buy a computer, the probability that X is 31..40 with medium income
Bayesian Theorem
• Given training data X, the posteriori probability of a hypothesis H,
P(H|X), follows the Bayes theorem:
$$P(H|X) = \frac{P(X|H)\,P(H)}{P(X)}$$
• Informally, this can be written as
posteriori = likelihood x prior / evidence
• Predicts X belongs to Ci iff the probability P(Ci|X) is the
highest among all the P(Ck|X) for all the k classes
• Practical difficulty: requires initial knowledge of many
probabilities, at significant computational cost
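A small numeric illustration of Bayes' theorem, P(H|X) = P(X|H) * P(H) / P(X). The probabilities are hypothetical and are not taken from the slides.

```python
def posterior(prior_h, likelihood_x_given_h, evidence_x):
    """Bayes' theorem: P(H|X) = P(X|H) * P(H) / P(X)."""
    if evidence_x <= 0:
        raise ValueError("P(X) must be positive")
    return likelihood_x_given_h * prior_h / evidence_x


# Hypothetical numbers:
# P(H)   = 0.6   probability that a customer buys a computer
# P(X|H) = 0.2   probability that a buyer is 35 years old with a $40,000 income
# P(X)   = 0.15  probability that any customer matches that profile
p_h_given_x = posterior(0.6, 0.2, 0.15)
```

With these assumed values the posterior is 0.8: observing the customer's profile raises the probability of a purchase from the prior 0.6 to 0.8.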
Artificial Neural Networks
• Artificial Neural Networks (ANN) were started by psychologists and neurobiologists to develop
and test computational analogues of neurons
• Other names:
1. Connectionist learning 2. Prediction by NN 3. Adaptive networks
4. Neural computation 5. Parallel distributed processing 6. Collective computation
• Components of an artificial neural network:
   • Units: A neural network is composed of a number of nodes, or units. A unit is a metaphor for the
nerve cell body
   • Links: Units are connected by links. Links represent synaptic connections from one unit to
another
   • Weight: Each link has a numeric weight
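The components above can be sketched as a single unit that sums its weighted incoming links. The sigmoid activation and the bias term are assumptions added for illustration; the slide names only units, links and weights.

```python
import math


def unit_output(inputs, weights, bias=0.0):
    """Output of one unit: weighted sum of its inputs passed through a sigmoid."""
    # Each link contributes (weight * input) to the unit's total input.
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    # Assumed activation function: logistic sigmoid.
    return 1.0 / (1.0 + math.exp(-total))


# Two incoming links with hypothetical weights.
activation = unit_output(inputs=[1.0, 0.0], weights=[0.8, -0.4])
```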
Genetic Algorithms (GA)
• Genetic algorithm: based on an analogy to biological evolution
• An initial population is created consisting of randomly generated rules
   • Each rule is represented by a string of bits
   • E.g., "if A1 and ¬A2 then C2" can be encoded as 100 (the two leftmost bits stand for A1 and A2, the rightmost bit for the class)
   • If an attribute has k > 2 values, k bits can be used
• Based on the notion of survival of the fittest, a new population is formed to
consist of the fittest rules and their offspring
• The fitness of a rule is represented by its classification accuracy on a set of
training examples
• Offspring are generated by crossover and mutation
• The process continues until a population P evolves in which each rule
satisfies a prespecified fitness threshold
• Slow but easily parallelizable
Genetic Algorithms
 A Genetic Algorithm (GA) is a computational model
consisting of five parts:
 A starting set of individuals, P.
 Crossover: technique to combine two parents to
create offspring.
 Mutation: randomly change an individual.
 Fitness: determine the best individuals.
 Algorithm which applies the crossover and
mutation techniques to P iteratively using the
fitness function to determine the best
individuals in P to keep.
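The five parts listed above can be sketched in a few lines. This is a minimal illustration, not a rule learner: the fitness function simply counts 1 bits as a stand-in for classification accuracy, and the population size, mutation rate and seed are arbitrary assumptions.

```python
import random


def fitness(bits):
    # Hypothetical stand-in for rule accuracy: the number of 1 bits.
    return sum(bits)


def crossover(parent_a, parent_b, rng):
    # Single-point crossover: head of one parent, tail of the other.
    point = rng.randint(1, len(parent_a) - 1)
    return parent_a[:point] + parent_b[point:]


def mutate(bits, rng, rate=0.05):
    # Flip each bit independently with a small probability.
    return [1 - b if rng.random() < rate else b for b in bits]


def evolve(pop_size=20, length=12, generations=40, seed=0):
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(length)]
                  for _ in range(pop_size)]
    history = []  # best fitness seen at the start of each generation
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        history.append(fitness(population[0]))
        survivors = population[: pop_size // 2]  # keep the fittest half
        children = []
        while len(survivors) + len(children) < pop_size:
            p1, p2 = rng.sample(survivors, 2)
            children.append(mutate(crossover(p1, p2, rng), rng))
        population = survivors + children
    return max(population, key=fitness), history


best, history = evolve()
```

Because the fittest half is carried over unchanged, the best fitness never decreases from one generation to the next.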
What is the Support Vector Machine?
• “Support Vector Machine” (SVM) is a
supervised machine learning algorithm that can be
used for both classification and regression
problems. However, it is mostly used for
classification. In the SVM algorithm, we
plot each data item as a point in n-dimensional
space (where n is the number of features)
with the value of each feature being the value of a
particular coordinate. Then, we perform
classification by finding the hyperplane that
differentiates the two classes very well.
• Support vectors are the coordinates of the
individual observations that lie closest to the
separating hyperplane. The SVM classifier is the
frontier (hyperplane/line) that best segregates the two classes.
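To make the hyperplane idea concrete, the sketch below classifies points with an already-known linear hyperplane w·x + b = 0 and measures how far each point lies from it. It does not train an SVM; the weights, bias and points are hypothetical values chosen for illustration.

```python
import math


def decision_value(w, b, x):
    # Signed score: positive on one side of the hyperplane, negative on the other.
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def classify(w, b, x):
    return 1 if decision_value(w, b, x) >= 0 else -1


def distance_to_hyperplane(w, b, x):
    # Geometric distance |w.x + b| / ||w||.
    norm = math.sqrt(sum(wi * wi for wi in w))
    return abs(decision_value(w, b, x)) / norm


w, b = [1.0, 1.0], -3.0            # assumed hyperplane: x1 + x2 = 3
points = [[1.0, 1.0], [0.0, 0.5], [3.0, 3.0], [2.0, 2.0]]
labels = [classify(w, b, p) for p in points]
# The observation nearest the hyperplane plays the role of a support vector.
closest = min(points, key=lambda p: distance_to_hyperplane(w, b, p))
```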
2. UNSUPERVISED LEARNING
• Unsupervised learning feeds on unlabeled data.
• Supervised learning needs to know the desired results in order to
sort the data, whereas in unsupervised machine learning
algorithms the desired results are unknown and yet to
be defined.
• As no teacher is provided, no training is
given to the machine. Therefore the machine is left
to find the hidden structure in unlabeled data by itself.
2. UNSUPERVISED LEARNING
• The unsupervised machine learning algorithm is used
for:
 exploring the structure of the information;
 extracting valuable insights;
 detecting patterns;
 descriptive modeling.
• E.g., I have photos and want to put them in 20 groups.
TYPES OF UNSUPERVISED LEARNING
I. Clustering (target variable not available)
• It is an exploration of data used to segment it into meaningful
groups (i.e., clusters) based on internal patterns, without
prior knowledge of group membership.
• Group membership is defined by the similarity of individual data
objects to one another and their dissimilarity from the rest.
• E.g., customer segmentation: grouping customers by purchasing
behavior.
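The customer segmentation example can be sketched with k-means, one common clustering algorithm. The customer data (visits per month, amount spent) and the two starting centroids below are invented for illustration.

```python
def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, centroids, iterations=10):
    """Plain k-means: assign each point to its nearest centroid, then re-average."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: squared_distance(p, centroids[i]))
            clusters[nearest].append(p)
        # New centroid = mean of the members; an empty cluster keeps its centroid.
        centroids = [
            tuple(sum(coord) / len(members) for coord in zip(*members))
            if members else centroids[i]
            for i, members in enumerate(clusters)
        ]
    return centroids, clusters


# Hypothetical customers: (visits per month, amount spent).
customers = [(1, 20), (2, 25), (1, 22), (10, 200), (12, 220), (11, 210)]
centroids, clusters = kmeans(customers, centroids=[(1, 20), (10, 200)])
```

No labels are supplied; the two groups (occasional and frequent buyers) emerge from the distances alone.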
TYPES OF UNSUPERVISED LEARNING
II. Association (target variable not available)
• An association rule learning problem is one where you want to
discover rules that describe large portions of your data, such as
"people who buy X also tend to buy Y".
• E.g., market basket analysis
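A rule such as "people who buy X also tend to buy Y" is usually scored by its support and confidence. These two measures are not named on the slide; they are the standard ones, and the shopping baskets below are made up for illustration.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    return sum(items <= basket for basket in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimated P(consequent | antecedent) for the rule antecedent -> consequent."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)


# Hypothetical market baskets.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
rule_support = support(baskets, {"bread", "milk"})
rule_confidence = confidence(baskets, {"bread"}, {"milk"})
```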
EXAMPLES OF UNSUPERVISED
LEARNING ALGORITHMS
• PCA
• t-SNE
• k-means
• DBSCAN
• Apriori algorithm
• FP-Growth
Dimensionality reduction: Incoming data often contains a lot of
noise. Machine learning algorithms use dimensionality reduction to
remove this noise while distilling the relevant information.
REINFORCEMENT LEARNING
• Reinforcement learning is about taking suitable actions to
maximize reward in a particular situation.
• It balances exploration and exploitation: an action takes place,
its consequences are observed, and the next action takes
the results of the previous action into account.
• In supervised learning, the training data comes with an answer key,
so the model is trained with the correct answers. In
reinforcement learning there is no answer key; the agent
decides what to do to perform the given task. In the absence of
a training dataset, it is bound to learn from its experience.
REINFORCEMENT LEARNING
• Agent: an assumed entity that performs actions in an
environment to gain some reward.
• Environment: the scenario that an agent has to face and that
gives feedback via a positive or negative reward signal.
• State (s): the current situation returned by the environment.
REINFORCEMENT LEARNING
• Two main types of reward signals are:
   • A positive reward signal encourages continuing a
particular sequence of actions.
   • A negative reward signal penalizes certain activities
and urges the algorithm to correct itself to stop getting penalties.
• However, the function of the reward signal may vary
depending on the nature of the information.
• Overall, the system tries to maximize positive rewards and
minimize negative ones.
• https://www.yotube.com/watch?v=KiHdKynXDtw
REINFORCEMENT LEARNING
• Input: the initial state from which the model will start.
• Output: there are many possible outputs, as there is a variety of solutions to a
particular problem.
• Training: training is based on the input; the model returns a state, and the
user decides whether to reward or punish the model based on its output.
• The model continues to learn.
• The best solution is decided based on the maximum reward.
REINFORCEMENT LEARNING
Practical applications of reinforcement
learning:
• RL can be used in robotics for industrial automation.
• RL can be used in machine learning and data processing.
• RL can be used to create training systems that provide
custom instruction and materials according to the
requirement of students.
REINFORCEMENT LEARNING
There are two important learning models in
reinforcement learning:
• Markov Decision Process
• Q-learning
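As an illustration of Q-learning, the sketch below trains an agent on a tiny corridor of five states where only reaching the rightmost state gives a reward. The environment, learning rate, discount factor and exploration rate are all assumptions made for this example.

```python
import random


def q_learning(n_states=5, episodes=200, alpha=0.5, gamma=0.9,
               epsilon=0.2, seed=0):
    """Tabular Q-learning on a corridor; action 0 = left, action 1 = right."""
    rng = random.Random(seed)
    moves = (-1, +1)
    goal = n_states - 1
    q = [[0.0, 0.0] for _ in range(n_states)]
    for _ in range(episodes):
        state = 0
        while state != goal:
            if rng.random() < epsilon:
                action = rng.randrange(2)            # explore
            else:
                best = max(q[state])                 # exploit, random tie-break
                action = rng.choice([a for a in range(2) if q[state][a] == best])
            next_state = min(max(state + moves[action], 0), goal)
            reward = 1.0 if next_state == goal else 0.0
            # Q-learning update: move towards reward + discounted best next value.
            target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


q_table = q_learning()
policy = ["right" if right > left else "left" for left, right in q_table[:-1]]
```

After training, the greedy policy moves right in every non-goal state, which is the action sequence with the maximum reward.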
APPLICATIONS OF
SUPERVISED LEARNING,
UNSUPERVISED LEARNING
& REINFORCEMENT
LEARNING
TYPES OF ML
UNSUPERVISED LEARNING (Data Driven; Identify Clusters)
• Clustering: SVD, PCA, K-Means
• Dimensionality Reduction: Text Mining, Face Recognition, Big Data Visualization, Image Recognition
• Association Analysis: Apriori, FP-Growth
• Hidden Markov Model
REINFORCEMENT LEARNING (Learn from errors)
• Dynamic Programming
• Monte Carlo Tree Search (MCTS)
• Heuristic Methods
• Q-Learning
• Deep Adversarial Networks
• Temporal Difference (TD)
• Asynchronous Actor-Critic Agents (A3C)
SUPERVISED LEARNING (Task Driven; Predict next values)
• Regression: Linear, Polynomial
• Decision Tree
• Random Forest
• Classification: KNN, Trees, Logistic Regression, Naive Bayes, SVM
STEPS TO SOLVE A MACHINE
LEARNING PROBLEM
1. Data Gathering: Collect data from various sources
2. Data Preprocessing: Clean data to have homogeneity
3. Feature Engineering: Make your data more useful
4. Algorithm Selection & Training: Select the right machine learning model
5. Making Predictions: Evaluate the model
1. Data Gathering
• Might depend on human work-
 Manual labeling for supervised learning.
 Domain knowledge. Maybe even experts.
• May come for free, or “sort of”
 E.g., Machine Translation.
• The more the better: Some algorithms need large amounts
of data to be useful (e.g., neural networks).
• Quantity and quality of data dictate model accuracy.
2. Data Preprocessing
• Is there anything wrong with the data?
   • Missing values
   • Outliers
   • Bad encoding (for text)
   • Wrongly-labeled examples
   • Biased data: do I have many more
samples of one class than the rest?
• Do I need to fix or remove data?
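Two of the checks above can be sketched directly. Filling a missing value with the column mean is only one possible repair and is an assumption here, as are the sample values.

```python
from collections import Counter


def fill_missing_with_mean(values):
    """Replace None entries with the mean of the known values."""
    known = [v for v in values if v is not None]
    mean = sum(known) / len(known)
    return [mean if v is None else v for v in values]


def class_balance(labels):
    """Share of each class, to spot heavily imbalanced (biased) data."""
    counts = Counter(labels)
    return {label: count / len(labels) for label, count in counts.items()}


ages = fill_missing_with_mean([1.0, None, 3.0])
balance = class_balance(["spam", "ham", "ham", "ham"])
```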
3. Feature Engineering
• A feature is an individual measurable property of a
phenomenon being observed.
• Our inputs are represented by a set of features.
• To classify spam email, features could be:
 Number of words that have been ch4ng3d like this.
 Language of the email (0=English, 1=Spanish).
 Number of emojis.
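The spam features listed above could be computed roughly as follows. This is a crude sketch: the regular expression treats any word mixing letters and digits as "changed", the emoji test only covers one Unicode range, and the language code is passed in rather than detected.

```python
import re

# A word containing at least one letter and at least one digit, e.g. "ch4ng3d".
OBFUSCATED_WORD = re.compile(r"\b(?=\w*[A-Za-z])(?=\w*\d)\w+\b")


def count_obfuscated_words(text):
    return len(OBFUSCATED_WORD.findall(text))


def count_emojis(text):
    # Rough heuristic: code points in the main emoji and pictograph blocks.
    return sum(1 for ch in text if 0x1F300 <= ord(ch) <= 0x1FAFF)


def spam_features(text, language_code):
    """Feature vector: [obfuscated words, language (0=English, 1=Spanish), emojis]."""
    return [count_obfuscated_words(text), language_code, count_emojis(text)]


features = spam_features("Fr33 m0ney now \U0001F600\U0001F600", language_code=0)
```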
3. Feature Engineering
• Extract more information from existing data
   • Make it more useful
   • With good features, most algorithms can learn faster
• Requires thought and knowledge of the data
• Two steps:
   • Variable transformation (e.g., dates into weekdays,
normalizing)
   • Feature creation (e.g., n-grams for texts, whether a word is
capitalized to detect names, etc.)
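The two steps can be illustrated with three small helpers. Min-max scaling is used as one possible form of normalizing, and the inputs are arbitrary examples.

```python
from datetime import date

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def weekday_name(d):
    # Variable transformation: a date becomes a weekday category.
    return WEEKDAYS[d.weekday()]


def min_max_normalize(values):
    # Variable transformation: rescale values into the range [0, 1].
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def word_ngrams(text, n=2):
    # Feature creation: consecutive word n-grams from a text.
    words = text.split()
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]


day = weekday_name(date(2023, 10, 14))
scaled = min_max_normalize([10, 20, 30])
bigrams = word_ngrams("machine learning is fun")
```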
4. Algorithm Selection & Training
• Supervised
   • Linear classifier
   • Naive Bayes
   • Support Vector Machines (SVM)
   • Decision Tree
   • Random Forests
   • k-Nearest Neighbors
   • Neural Networks (Deep learning)
• Unsupervised
   • PCA
   • t-SNE
   • k-means
   • DBSCAN
   • Apriori algorithm
   • FP-Growth
• Reinforcement
   • SARSA-λ
   • Q-Learning
   • Markov Decision Process
4. Algorithm Selection & Training
• Goal of training: making the correct prediction as often as
possible.
• Incremental improvement:
   • Use of metrics for evaluating performance and comparing
solutions.
   • Hyperparameter tuning (a hyperparameter is a parameter whose value
is used to control the learning process).
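Metrics and hyperparameter tuning fit together as in the sketch below: accuracy on a validation set is the metric, and the number of neighbours k in a k-nearest-neighbours classifier is the hyperparameter being tuned. The one-dimensional data, including one deliberately noisy training point, is invented for illustration.

```python
from collections import Counter


def knn_predict(train, x, k):
    """Majority label among the k training points nearest to x (1-D features)."""
    neighbors = sorted(train, key=lambda item: abs(item[0] - x))[:k]
    return Counter(label for _, label in neighbors).most_common(1)[0][0]


def accuracy(train, validation, k):
    # Metric: fraction of validation examples predicted correctly.
    correct = sum(knn_predict(train, x, k) == y for x, y in validation)
    return correct / len(validation)


def tune_k(train, validation, candidates):
    # Hyperparameter tuning: try each candidate k, keep the most accurate.
    scores = {k: accuracy(train, validation, k) for k in candidates}
    return max(scores, key=scores.get), scores


train = [(1.0, "a"), (2.0, "a"), (3.0, "a"), (4.5, "b"),
         (7.0, "b"), (8.0, "b"), (9.0, "b")]
validation = [(1.5, "a"), (4.0, "a"), (8.5, "b")]
best_k, scores = tune_k(train, validation, candidates=[1, 3])
```

With k = 1 the noisy point at 4.5 causes one validation error; k = 3 outvotes it, so the larger value is selected.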
5. Making Predictions
Type of data in clustering analysis
 Interval-Scaled Attributes
 Binary Attributes
 Nominal Attributes
 Ordinal Attributes
 Ratio-Scaled Attributes
 Attributes of Mixed Type
Data Types
Interval-Scaled Attributes
• Continuous measurements on a roughly linear scale
• Example: height and weight
   • Height: the scale ranges over the metre or foot scale. Heights need to be
standardized, as different scales can be used to express the same absolute
measurement.
   • Weight: the scale ranges over the kilogram or pound scale.
[Figure: weight scale marked from 20kg to 120kg]
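The need to standardize can be shown with z-scores: after standardization, the same heights expressed in metres and in centimetres give identical values. Using the standard deviation is an assumption here; other spread measures, such as the mean absolute deviation, are also used.

```python
import math


def standardize(values):
    """Z-score each value: subtract the mean, divide by the standard deviation."""
    mean = sum(values) / len(values)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    return [(v - mean) / std for v in values]


heights_m = [1.5, 1.6, 1.7, 1.8]
heights_cm = [150.0, 160.0, 170.0, 180.0]
z_m = standardize(heights_m)
z_cm = standardize(heights_cm)
```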
Binary Variables
• A contingency table for binary data:

                 Object j
                 1       0       sum
Object i    1    a       b       a+b
            0    c       d       c+d
          sum    a+c     b+d     p

• Distance measure for symmetric binary variables:
$$d(i,j) = \frac{b + c}{a + b + c + d}$$
• Distance measure for asymmetric binary variables:
$$d(i,j) = \frac{b + c}{a + b + c}$$
• Jaccard coefficient (similarity measure for asymmetric binary variables):
$$sim_{Jaccard}(i,j) = \frac{a}{a + b + c}$$
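The three measures follow directly from the counts a, b, c and d of the contingency table. The two binary vectors below are made-up objects used only to exercise the formulas.

```python
def contingency(x, y):
    """Counts (a, b, c, d) for binary vectors x (object i) and y (object j)."""
    a = sum(1 for xi, yi in zip(x, y) if xi == 1 and yi == 1)
    b = sum(1 for xi, yi in zip(x, y) if xi == 1 and yi == 0)
    c = sum(1 for xi, yi in zip(x, y) if xi == 0 and yi == 1)
    d = sum(1 for xi, yi in zip(x, y) if xi == 0 and yi == 0)
    return a, b, c, d


def symmetric_distance(x, y):
    a, b, c, d = contingency(x, y)
    return (b + c) / (a + b + c + d)


def asymmetric_distance(x, y):
    a, b, c, _ = contingency(x, y)   # 0/0 matches (d) are ignored
    return (b + c) / (a + b + c)


def jaccard_similarity(x, y):
    a, b, c, _ = contingency(x, y)
    return a / (a + b + c)


obj_i = [1, 0, 1, 0, 0, 0]
obj_j = [1, 0, 0, 0, 1, 0]
counts = contingency(obj_i, obj_j)
```

Note that the asymmetric distance and the Jaccard coefficient always sum to 1.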
Nominal / Categorical Variables
• A generalization of the binary variable in that it can take more
than 2 states, e.g., red, yellow, blue, green
• Method 1: Simple matching
   • m: # of matches, p: total # of variables
$$d(i,j) = \frac{p - m}{p}$$
• Method 2: use a large number of binary variables
   • creating a new binary variable for each of the M nominal
states
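Both methods are short to sketch. The objects and the list of colour states below are illustrative assumptions.

```python
def simple_matching_distance(x, y):
    """Method 1: d(i, j) = (p - m) / p for two objects with nominal attributes."""
    p = len(x)                                   # total number of variables
    m = sum(1 for a, b in zip(x, y) if a == b)   # number of matches
    return (p - m) / p


def one_hot(value, states):
    """Method 2: one binary variable per nominal state."""
    return [1 if value == state else 0 for state in states]


dist = simple_matching_distance(["red", "small", "round"],
                                ["red", "large", "round"])
encoded = one_hot("blue", ["red", "yellow", "blue", "green"])
```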
October 14, 2023 Data Mining: Concepts and Techniques 101
Ratio-Scaled Variables
• Ratio-scaled variable: a positive measurement on a nonlinear
scale, approximately at exponential scale, such as $Ae^{Bt}$ or $Ae^{-Bt}$
• Methods:
   • treat them like interval-scaled variables: not a good
choice! (why? the scale can be distorted)
   • apply a logarithmic transformation:
$y_{if} = \log(x_{if})$
   • treat them as continuous ordinal data and treat their rank as
interval-scaled
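The effect of the logarithmic transformation can be checked numerically: values growing as $Ae^{Bt}$ become evenly spaced after taking logs, so they can then be treated as interval-scaled. The constants A = 2 and B = 0.5 are arbitrary choices for this sketch.

```python
import math


def log_transform(values):
    """Apply y = log(x) to each positive ratio-scaled value."""
    return [math.log(v) for v in values]


A, B = 2.0, 0.5
x = [A * math.exp(B * t) for t in range(4)]   # exponential growth
y = log_transform(x)
steps = [later - earlier for earlier, later in zip(y, y[1:])]
```

Each step in `y` equals B, i.e. the transformed values lie on a linear scale.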
S.No | Machine Learning | Deep Learning
1. | Machine Learning is a superset of Deep Learning. | Deep Learning is a subset of Machine Learning.
2. | Machine Learning typically works on structured data. | Deep Learning represents data differently, using artificial neural networks (ANN).
3. | Machine Learning is an evolution of AI. | Deep Learning is an evolution of Machine Learning; essentially, it is how deep the machine learning is.
4. | Typically works with thousands of data points. | Works with big data: millions of data points.
5. | Outputs: a numerical value, such as a classification or a score. | Outputs: anything from numerical values to free-form elements, such as free text and sound.
6. | Uses various types of automated algorithms that turn into model functions and predict future actions from data. | Uses a neural network that passes data through processing layers to interpret data features and relations.
7. | Algorithms are directed by data analysts to examine specific variables in data sets. | Algorithms are largely self-directed in their data analysis once they are put into production.
8. | Machine Learning is widely used to stay competitive and learn new things. | Deep Learning solves complex machine learning problems.
ML Unit 1 CS.ppt

  • 1. UNIT 1 Introduction 1. Tom M. Mitchell,―Machine Learning, McGraw-Hill Education (India) Private Limited, 2013. 2. Ethem Alpaydin,―Introduction to Machine Learning (Adaptive Computation and Machine Learning), The MIT Press 2004. 3. Stephen Marsland, ―Machine Learning: An Algorithmic Perspective, CRC Press, 2009. 4. Bishop, C., Pattern Recognition and Machine Learning. Berlin: Springer- Verlag. 1 KCS 055/ KOE 073: Machine Learning Introduction Anurag Malik (Associate Prof. CS & E) CS & E Dept. M.I.T Moradabad B.Tech V/ VII CS /ME Recommended Books: October 14, 2023
  • 4. What is Learning?  “Learning denotes changes in a system that ... enable a system to do the same task … more efficiently the next time.” - Herbert Simon • Learning is the process of acquiring new understanding, knowledge, behaviors, skills, values, attitudes and preferences. • The ability to learn is possessed by humans, animals, and some machines.  “Learning is making useful changes in our minds.” - Marvin Minsky • Some learning is immediate, induced by a single event (e.g. being burned by a hot stove), but much skill and knowledge accumulates from repeated experiences. October 14, 2023 4
  • 5. Types of Learning 1. Visual (Spatial) :By representing information and with images, students are able to focus on meaning, such as architecture, engineering, project management, or design. 2. Aural (Auditory-Musical): If you need someone to tell you something out loud to understand it, you are an auditory learner. such as musician, recording engineer, speech pathologist, or language teacher. 3. Verbal (Linguistic): People who find it easier to express themselves by writing or speaking can be regarded as a verbal learner. 4. Physical (Kinesthetic) :In this style, learning happens when the learner carries out a physical activity, rather than listening to a lecture or watching a demonstration. 5 October 14, 2023
  • 6. Types of Learning (Cont…) 5. Logical (Mathematical) :When you like using your brain for logical and mathematical reasoning, you’re a logical learner. You easily recognise patterns and can connect seemingly meaningless concepts easily. such as scientific research, accountancy, bookkeeping or computer programming. 6. Social (Interpersonal) : If you’re at best in socializing and communicating with people, both verbally and non-verbally, this is what you are; a social learner. People often come to you to listen and ask for advice. counseling, teaching, training and coaching, sales, politics, and human resources among others. 6 October 14, 2023
  • 7. Related Fields Machine learning is primarily concerned with the accuracy and effectiveness of the computer system. psychological models data mining cognitive science decision theory information theory databases machine learning neuroscience statistics evolutionary models control theory 7
  • 8. Well – Posed Learning Problems  Learning can be defined through a computer program that improves its performance at some task through experience.  Definition of Learning: A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.  Lets have some examples of Well Posed Learning Problems  Learn to Play Checkers  Learn to recognize spoken words (SPHINX System)  Learning to drive an autonomous vehicle (ALVINN System)  Learning to classify new astronomical structures  Predict recovery rates of pneumonia patients  Detect fraudulent use of credit cards 8 October 14, 2023
  • 9. Well – Posed Learning Problems  Three features: the class of tasks, the measure of performance to be improved, and the source of experience.  A checkers learning problem:  Task T: playing checkers  Performance measure P: percent of games won against opponents  Training experience E: playing practice games against itself  We can specify many learning problems in this fashion, such as learning to recognize handwritten words, or learning to drive a robotic automobile autonomously.  A handwriting recognition learning problem:  Task T: recognizing and classifying handwritten words within images  Performance measure P: percent of words correctly classified  Training experience E: a database of handwritten words with given classifications 9 October 14, 2023
  • 10. Well – Posed Learning Problems  A robot driving learning problem: Task T: driving on public four-lane highways using vision sensors Performance measure P: average distance traveled before an error (as judged by human overseer) Training experience E: a sequence of images and steering commands recorded while observing a human driver 10 October 14, 2023
  • 11. 11 DESIGNING A LEARNING SYSTEM 1. Choosing the Training Experience 2. Choosing the Target Function 3. Choosing a Representation for the Target Function 4. Choosing a Function Approximation Algorithm 5. The Final Design October 14, 2023
  • 12. Designing a Learning System • While designing a learning system, various design issues and approaches must be considered. 1. Choosing the Training Experience: The first design choice we face is to choose the type of training experience from which our system will learn. The type of training experience available can have a significant impact on the success or failure of the learner. • One key attribute is whether the training experience provides direct or indirect feedback regarding the choices made by the performance system. • A second important attribute of the training experience is the degree to which the learner controls the sequence of training examples. • A third important attribute of the training experience is how well it represents the distribution of examples over which the final system performance P must be measured.
  • 13. 13 Designing a Learning System A checkers learning problem: Task T: Playing checkers (draughts) Performance Measures P: percent of games won in world tournament Training Experience E: games played against itself What experience? What exactly should be learned? How shall it be represented? What specific algorithm to learn it? October 14, 2023
  • 14. 14 Direct versus Indirect Learning 1. Individual checkers board states and correct move for each 2. Move sequences and final outcomes of various games played Credit assignment problem - the degree to which each move in the sequence deserves credit or blame for the final outcome - game can be lost even when early moves are optimal, if these are followed later by poor moves or vice versa October 14, 2023
  • 15. 15 Teacher or not? Degree to which learner controls the sequence of training examples 1. Teacher selects informative board states & provides the correct moves 2. For each proposed board state the learner finds particularly confusing it asks the teacher for correct move 3. Learner may have complete control as it does when it learns by playing itself with no teacher - learner may choose between experimenting with novel board states or honing its skill by playing minor variations of promising lines of play October 14, 2023
  • 16. 1. Choose Training Experience How well does the training experience represent the distribution of examples over which the final system performance P must be measured? P is the percent of games won in the world tournament, so there is an obvious danger when E consists only of games played against itself (we probably can’t get the world champion to teach the computer!). Most current theories of machine learning assume that the distribution of training examples is identical to the distribution of test examples. It is important to keep in mind that this assumption must often be violated in practice. E: play games against itself (with the advantage of getting a lot of data this way).
  • 17. 2. Choose a Target Function The next design choice is to determine exactly what type of knowledge will be learned and how this will be used by the performance program. ChooseMove: B -> M, where B is any legal board state and M is a legal move (hopefully the “best” legal move). Alternatively, a function V: B -> ℝ, which maps from B to some real value, where higher scores are assigned to better board states. Now use the legal moves to generate every subsequent board state and use V to choose the best one, and therefore the best legal move.
  • 18. 18 Choose a Target Function II Let us define the target value V(b) for an arbitrary board state b in B, as follows V(b) = 100, if b is a final board state that is won V(b) = -100, if b is a final board state that is lost V(b) = 0, if b is a final board state that is a draw V(b) = V(b´), if b is not a final state where b´ is the best final board state starting from b assuming both players play optimally October 14, 2023
  • 19. 3. Choosing a Representation for the Target Function • Given the ideal target function V, we choose a representation that the learning system will use to describe the function V' that it will learn. • The function V' will be calculated as a linear combination of the following board features: • x1: the number of black pieces on the board • x2: the number of red pieces on the board • x3: the number of black kings on the board • x4: the number of red kings on the board • x5: the number of black pieces threatened by red (which can be captured on red's next turn) • x6: the number of red pieces threatened by black
  • 20. 3. Choosing a Representation for the Target Function • Thus, the learning program will represent V'(b) as a linear function of the form: • V'(b) = w0 + w1x1 + w2x2 + w3x3 + w4x4 + w5x5 + w6x6 • where w0 through w6 are numerical coefficients, or weights, to be chosen by the learning algorithm. Each weight wi determines the relative importance of board feature xi, and w0 is an additive constant.
  • 21. Design So Far T: Checkers P: percent of games won in world tournament E: games played against self V: Board -> ℝ Target Function Representation: V´(b) = w0 + w1x1 + w2x2 + w3x3 + w4x4 + w5x5 + w6x6
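The linear representation above can be sketched in a few lines of Python. This is only an illustration: the function name `v_hat` and the weight values are assumptions made for the example, not values given in the slides.

```python
from typing import Sequence

def v_hat(weights: Sequence[float], features: Sequence[float]) -> float:
    """Evaluate V'(b) = w0 + w1*x1 + ... + w6*x6 for one board state."""
    if len(weights) != len(features) + 1:
        raise ValueError("expected one more weight than features (w0 is the constant term)")
    w0, rest = weights[0], weights[1:]
    return w0 + sum(w * x for w, x in zip(rest, features))

# Hypothetical weights w0..w6, chosen only for illustration.
weights = [0.5, 1.0, -1.0, 2.0, -2.0, -0.5, 0.5]
# Board features x1..x6: 3 black pieces, 0 red pieces, 1 black king, nothing else.
board = [3, 0, 1, 0, 0, 0]
score = v_hat(weights, board)
```

A learner would compare `score` across the board states reachable by each legal move and pick the move leading to the highest value.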
  • 22. 4. Choose Function Approximation Algorithm • In order to learn the target function V we require a set of training examples, each describing a specific board state b and the training value Vtrain(b) for b. • In other words, each training example is an ordered pair of the form <b, Vtrain(b)>. • For instance, the following training example describes a board state b in which black has won the game (note x2 = 0 indicates that red has no remaining pieces) and for which the target function value Vtrain(b) is therefore +100: <(x1=3,x2=0,x3=1,x4=0,x5=0,x6=0),+100> The algorithm then has two steps: a) Estimating Training Values b) Adjusting the weights
  • 23. 5. The Final Design • The final design of our checkers learning system can be naturally described by four distinct program modules that represent the central components in many learning systems. These four modules are: 1. Performance System: Solves the performance task using the learned target function(s). It takes an instance of a new problem as input and produces a trace of its solution (history) as output. 2. Critic: Takes the history of the problem as input and produces a set of training examples of the target function as output. 3. Generalizer: Takes training examples as input and produces an estimate of the target function as its output hypothesis. It generalizes from specific training examples, hypothesizing a general function that covers all examples. 4. Experiment Generator: Takes the current hypothesis (currently learned function) as input and outputs a new problem (i.e., initial board state) for the Performance System to explore. Its role is to pick new practice problems that will maximize the learning rate of the overall system.
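The slide names the weight-adjustment step without giving a rule. A minimal sketch, assuming the LMS (least mean squares) rule from Mitchell's textbook, w_i ← w_i + η (Vtrain(b) − V'(b)) x_i, is shown below. The learning rate `eta`, the number of passes, and the helper names are assumptions for illustration. In Mitchell's treatment the training value for a non-final state is estimated as Vtrain(b) ← V'(Successor(b)); here only the single won position from the slide is used.

```python
def v_hat(weights, features):
    """Linear evaluation V'(b) = w0 + sum_i w_i * x_i."""
    return weights[0] + sum(w * x for w, x in zip(weights[1:], features))

def lms_update(weights, features, v_train, eta=0.01):
    """One LMS step: w_i <- w_i + eta * (V_train(b) - V'(b)) * x_i, with x_0 = 1."""
    error = v_train - v_hat(weights, features)
    xs = [1.0] + list(features)  # x_0 = 1 pairs with the constant term w0
    return [w + eta * error * x for w, x in zip(weights, xs)]

weights = [0.0] * 7                      # start with all weights at zero
example = ([3, 0, 1, 0, 0, 0], 100.0)    # the won position from the slide
errors = []
for _ in range(200):                     # repeated passes over the one example
    features, v_train = example
    errors.append(abs(v_train - v_hat(weights, features)))
    weights = lms_update(weights, features, v_train)
```

Each step shrinks the error on this example by a constant factor, so `errors` decreases steadily toward zero. With many different board states the same update is applied example by example.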
  • 24. 5. The Final Design Fig. Final Design of checkers learner problem 24 October 14, 2023
  • 25. What is Machine Learning?  A branch of artificial intelligence, concerned with the design and development of algorithms that allow computers to evolve behaviors based on empirical data.  As intelligence requires knowledge, it is necessary for the computers to acquire knowledge.  “Machine learning refers to a system capable of the autonomous acquisition and integration of knowledge.”  https://www.youtube.com/watch?v=Cx5aNwnZYDc  https://www.youtube.com/watch?v=YhSeTEumjVA  https://www.youtube.com/watch?v=ZoemTySxFso October 14, 2023 25
  • 26. Machine Learning Paradigms  rote learning  learning by being told (advice-taking)  learning from examples (induction)  learning by analogy  speed-up learning  concept learning  clustering  discovery 26 October 14, 2023
  • 27. Why Machine Learning?  No human experts  industrial/manufacturing control  mass spectrometer analysis, drug design, astronomic discovery  Black-box human expertise  face/handwriting/speech recognition  driving a car, flying a plane  Rapidly changing phenomena  credit scoring, financial modeling  diagnosis, fraud detection  Need for customization/personalization  personalized news reader  movie/book recommendation  Recent progress in algorithms and theory  Growing Flood of online data  Computational power is available October 14, 2023 27
  • 28. AI vs ML vs DL October 14, 2023 28
  • 30. S. No. | Data Science | Machine Learning
      1. | Data Science is a field about processes and systems to extract knowledge from structured and semi-structured data. | Machine Learning is a field of study that gives computers the capability to learn without being explicitly programmed.
      2. | Needs the entire analytics universe. | A combination of machines and data science.
      3. | The branch that deals with data. | Machines utilize data science techniques to learn about the data.
      4. | Data in Data Science may or may not have evolved from a machine or mechanical process. | It uses various techniques like regression and supervised clustering.
      5. | Data Science, as a broader term, focuses not only on algorithms and statistics but also takes care of data processing. | It is focused only on algorithms and statistics.
      6. | It is a broad term for multiple disciplines. | It fits within data science.
      7. | It has many operations, such as data gathering, data cleaning, and data manipulation. | It is of three types: unsupervised learning, reinforcement learning, supervised learning.
      8. | Example: Netflix uses Data Science technology. | Example: Facebook uses Machine Learning technology.
  • 31. Tools used for AI,ML and Deep Learning October 14, 2023 31
  • 32. Continue…… 1. TensorFlow • TensorFlow is an open-source software library for numerical computation using data flow graphs. It was developed by engineers and researchers on the Google Brain team. Its flexible architecture allows you to deploy computation to multiple GPUs or CPUs in a server, mobile device, or desktop using a single API. 2. IBM Watson • IBM has been a pioneer in the field of Artificial Intelligence and has worked on this technology for a very long time. The company has its own AI platform, Watson, which houses numerous AI tools for both business users and developers. Watson is available as a set of open APIs, through which users can access starter kits and sample code and use them to build virtual agents and cognitive search engines. Watson also offers a chatbot-building platform that is aimed at beginners and requires little machine learning skill. 3. Caffe • Caffe is a C++ deep learning framework developed with modularity, expression, and speed in mind. Caffe focuses on convolutional networks for computer vision applications.
  • 33. Continue…… 4. Deeplearning4j • Deeplearning4j is described as the first open-source, commercial-grade, distributed deep learning library developed for Scala and Java. Its easy-to-use infrastructure makes it well suited to non-researchers. A notable quality of DL4J is that it can import neural net models from many major frameworks via Keras, including Theano, Caffe, and TensorFlow. 5. Torch • Torch is also an open-source machine learning library, used by many large organizations including Yandex, IBM, Idiap Research Institute, and Facebook AI Research Group. It can also be described as a scientific computing framework and scripting language based on the Lua programming language. After its use on web platforms, Torch has also been extended for use on iOS and Android.
  • 35. Training and Testing Training set (observed) Universal set (unobserved) Testing set (unobserved) Data acquisition Practical usage October 14, 2023 35
  • 36. Training and Testing  Training is the process of making the system able to learn.  No free lunch rule:  Training set and testing set come from the same distribution  Need to make some assumptions or bias October 14, 2023 36
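The assumption that training and testing sets come from the same distribution is commonly approximated by shuffling one pool of examples and splitting it. A minimal sketch follows; the function name, the 25% test fraction, and the seed are assumptions for illustration.

```python
import random

def train_test_split(examples, test_fraction=0.25, seed=0):
    """Shuffle a copy of the examples and split it into (train, test)."""
    rng = random.Random(seed)          # fixed seed so the split is repeatable
    shuffled = list(examples)
    rng.shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

data = list(range(20))                 # stand-in for 20 observed examples
train, test = train_test_split(data)
```

The model is fitted on `train` only, and `test` is held back to estimate how it behaves on unobserved data.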
  • 38. EXAMPLES OF ML •Personalization: Online services like Amazon/Netflix use AI to personalize our experience. They learn from our own and other users' previous purchases and recommend relevant content for us. •Image recognition: ML can be used for face detection in an image. There is a separate category for each person in a database of several people. •Medical diagnoses: ML is trained to recognize cancerous tissues.
  • 39. •Speech Recognition: It is the translation of spoken words into text. It is used in voice searches and more. Voice user interfaces include voice dialing, call routing, and appliance control. (Also natural language processing.) •Data mining: The application of ML methods to large databases. •Fraud detection: Banks use AI to detect strange activity on our account. Unexpected activity, such as foreign transactions, could be flagged by the algorithm.
  • 41. KDD Process October 14, 2023 41 Selection: Obtain data from various sources. Preprocessing: Cleanse data. Transformation: Convert to common format. Transform to new format. Data Mining: Obtain desired results. Interpretation/Evaluation: Present results to user in meaningful manner.
  • 42. KDD Process: Several Key Steps Many people treat data mining as a synonym for another popularly used term, Knowledge Discovery from Data, or KDD. Alternatively, others view data mining as simply an essential step in the process of knowledge discovery. Knowledge discovery as a process is depicted in Figure 1.4 and consists of an iterative sequence of the following steps: 1. Data cleaning (to remove noise and inconsistent data) 2. Data integration (where multiple data sources may be combined) 3. Data selection (where data relevant to the analysis task are retrieved from the database) 4. Data transformation (where data are transformed or consolidated into forms appropriate for mining by performing summary or aggregation operations, for instance) 5. Data mining (an essential process where intelligent methods are applied in order to extract data patterns) 6. Pattern evaluation (to identify the truly interesting patterns representing knowledge, based on some interestingness measures) 7. Knowledge presentation (where visualization and knowledge representation techniques are used to present the mined knowledge to the user) Steps 1 to 4 are different forms of data preprocessing, where the data are prepared for mining. The data mining step may interact with the user or a knowledge base. The interesting patterns are presented to the user and may be stored as new knowledge in the knowledge base.
  • 43. •Email filtering: Email services use AI to filter incoming emails. Users can train their spam filters by marking emails as spam. •Prediction: ML can be used in prediction systems. In the loan example, to compute the probability of a default, the system first needs to classify the available data into groups. •Computer vision, Computational biology, Robot control, Handwriting recognition
  • 45. History of ML • 1950 – Alan Turing creates the “Turing Test” to determine if a computer has real intelligence. To pass the test, a computer must be able to fool a human into believing it is also human. • 1952 – Arthur Samuel writes the first computer learning program. The program played the game of checkers, and the IBM computer improved at the game the more it played, studying which moves made up winning strategies and incorporating those moves into its program. • 1957 – Frank Rosenblatt designs the first neural network for computers (the perceptron), which simulates the thought processes of the human brain. • 1967 – The “nearest neighbor” algorithm is written, allowing computers to begin using very basic pattern recognition. This could be used to map a route for traveling salesmen, starting at a random city but ensuring they visit all cities during a short tour. • 1979 – Students at Stanford University invent the “Stanford Cart”, which can navigate obstacles in a room on its own. • 1981 – Gerald DeJong introduces the concept of Explanation Based Learning (EBL), in which a computer analyses training data and creates a general rule it can follow by discarding unimportant data.
  • 46. History of ML  1985 — Terry Sejnowski invents NetTalk, which learns to pronounce words the same way a baby does.  1990s — Work on machine learning shifts from a knowledge-driven approach to a data-driven approach. Scientists begin creating programs for computers to analyze large amounts of data and draw conclusions — or “learn” — from the results.  1997 — IBM’s Deep Blue beats the world champion at chess.  2006 — Geoffrey Hinton coins the term “deep learning” to explain new algorithms that let computers “see” and distinguish objects and text in images and videos.  2010 — The Microsoft Kinect can track 20 human features at a rate of 30 times per second, allowing people to interact with the computer via movements and gestures.  2011 — IBM’s Watson beats its human competitors at Jeopardy.  2011 — Google Brain is developed, and its deep neural network can learn to discover and categorize objects much the way a cat does. 46 October 14, 2023
  • 47. History of ML  2012 – Google’s X Lab develops a machine learning algorithm that is able to autonomously browse YouTube videos to identify the videos that contain cats.  2014 – Facebook develops DeepFace, a software algorithm that is able to recognize or verify individuals on photos to the same level as humans can.  2015 – Amazon launches its own machine learning platform.  2015 – Microsoft creates the Distributed Machine Learning Toolkit, which enables the efficient distribution of machine learning problems across multiple computers.  2015 – Over 3,000 AI and Robotics researchers, endorsed by Stephen Hawking, Elon Musk and Steve Wozniak (among many others), sign an open letter warning of the danger of autonomous weapons which select and engage targets without human intervention.  2016 – Google’s artificial intelligence algorithm beats a professional player at the Chinese board game Go, which is considered the world’s most complex board game and is many times harder than chess. The AlphaGo algorithm developed by Google DeepMind managed to win five games out of five in the Go competition. 47 October 14, 2023
  • 48. Some Issues in Machine Learning  What algorithms can approximate functions well (and when)?  How does number of training examples influence accuracy?  How does complexity of hypothesis representation impact it?  How does noisy data influence accuracy?  What are the theoretical limits of learnability?  How can prior knowledge of learner help?  What clues can we get from biological learning systems?  How can systems alter their own representations?  Understanding Which Processes Need Automation.  Lack of Quality Data.  Inadequate Infrastructure.  Implementation.  Lack of Skilled Resources. 48 October 14, 2023
  • 49. TRADITIONAL PROGRAMMING VS ML 49 October 14, 2023
  • 52. TYPES OF ML Using data for answering questions Training Predicting 52 October 14, 2023
  • 53. Supervised vs. Unsupervised Learning • Supervised learning (classification) • Supervision: The training data (observations, measurements, etc.) are accompanied by labels indicating the class of the observations • New data is classified based on the training set • No new class is generated • Unsupervised learning (clustering) • The class labels of the training data are unknown • Given a set of measurements, observations, etc., the aim is to establish the existence of classes or clusters in the data • New classes can be generated.
  • 54. 1. SUPERVISED LEARNING • Supervised learning algorithms are the ones that involve direct supervision of the operation. • The developer labels sample data and sets strict boundaries within which the algorithm operates. • The algorithm learns through examples for which we know the desired output (what we want to predict). • The primary purpose of supervised learning is to scale the scope of data and to make predictions about unavailable, future, or unseen data based on labeled sample data.
  • 55. 1. SUPERVISED LEARNING • It is a spoon-fed version of machine learning: • you select what kind of information (samples) to “feed” the algorithm; • you select what kind of results are desired (for example “yes/no” or “true/false”). • Example: • Is this a cat or a dog? • Are these emails spam or not? • Predict the market value of houses, given the square meters, number of rooms, neighborhood, etc.
  • 59. TYPES OF SUPERVISED LEARNING 59 • Classification separates the data, Regression fits the data. October 14, 2023
  • 60. TYPES OF SUPERVISED LEARNING I. Classification (Categorical Target Variable) • Classification is the process of labeling incoming data based on past data samples; the algorithm is trained on manually labeled examples to recognize certain types of objects and categorize them accordingly. • The system has to know how to differentiate types of information and perform optical character, image, or binary recognition (whether a particular bit of data is compliant or non-compliant with specific requirements, in a “yes” or “no” manner). • e.g. Medical Imaging.
  • 61. TYPES OF SUPERVISED LEARNING II. Regression (Continuous Target Variable) • Regression is the process of identifying patterns and calculating the predictions of continuous outcomes. • The system has to understand the numbers, their values, grouping (for example, heights and widths), etc. • eg. Housing Price Prediction 61 October 14, 2023
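As a small illustration of regression on a continuous target, the sketch below fits a straight line by ordinary least squares. The housing numbers are made up for the example and the function name is an assumption. Real price prediction would use more features and real data.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept (one feature)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Made-up data: area in square meters and price in thousands,
# generated from price = 2 * area + 50 so the fit is easy to check.
areas = [50, 70, 90, 110]
prices = [150, 190, 230, 270]
slope, intercept = fit_line(areas, prices)
predicted = slope * 100 + intercept    # predicted price for a 100 m^2 house
```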
  • 62. PROS & CONS OF SUPERVISED LEARNING • PROS • It allows you to collect data and produce output based on previous experience. • It is more trustworthy than unsupervised learning, which can be computationally complex and less accurate in some instances. • CONS • Concrete examples are required for training classifiers. • Decision boundaries can be overtrained in the absence of the right examples. • Difficulty in classifying big data.
  • 63. EXAMPLE OF SUPERVISED LEARNING ALGORITHMS • Linear Regression • k-Nearest Neighbor • Naive Bayes • Decision Trees • Support Vector Machine (SVM) • Random Forest • Neural Networks (Deep learning) 63 October 14, 2023
  • 64. Classification by Decision Tree Induction (DTI) • DTI is the learning of decision trees from class-labeled training tuples. • A decision tree is a flowchart-like tree structure, where each internal node (non-leaf node) denotes a test on an attribute, each branch represents an outcome of the test, and each leaf node (or terminal node) holds a class label. The topmost node is the root node. • Why are DT classifiers so popular? • The construction of DT classifiers does not require any domain knowledge or parameter setting, and therefore is appropriate for exploratory knowledge discovery. • DTs can handle high-dimensional data. • Their representation of acquired knowledge in tree form is intuitive and generally easy for humans to assimilate. • In general, they have good accuracy. • They may be used in medicine, manufacturing, production, financial analysis, astronomy, and molecular biology.
  • 65. Output: A Decision Tree for “buys_computer” [Figure: decision tree. Root test: age? with branches <=30, 31..40, >40; further tests: student? (no / yes) and credit rating? (fair / excellent); leaves: yes / no]
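A decision tree of this kind can be read as nested if-tests. The sketch below assumes the layout of the widely reproduced textbook "buys_computer" tree (age <= 30 tests student, 31..40 is always "yes", > 40 tests credit rating). The slide transcript does not preserve which leaf hangs off which branch, so treat the branch assignments as an assumption.

```python
def buys_computer(age: int, student: bool, credit_rating: str) -> str:
    """Classify one customer by walking the tree from the root test on age."""
    if age <= 30:
        # Assumed branch: young customers are split by the student? test.
        return "yes" if student else "no"
    if age <= 40:
        # Assumed branch: the 31..40 group is a leaf labeled "yes".
        return "yes"
    # Assumed branch: customers over 40 are split by the credit rating? test.
    return "yes" if credit_rating == "fair" else "no"

label = buys_computer(age=25, student=True, credit_rating="fair")
```

Each internal node becomes one condition, each branch one outcome of that condition, and each return statement one leaf holding a class label.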
  • 66. October 14, 2023 66 Bayesian Classification: Why?  A statistical classifier: performs probabilistic prediction, i.e., predicts class membership probabilities( that a given tuple belongs to a particular class)  Foundation: Based on Bayes’ Theorem given by Thomas Bayes  Performance: A simple Bayesian classifier, naïve Bayesian classifier, has comparable performance with decision tree and selected neural network classifiers.  Class Conditional Independence : Naïve Bayesian Classifiers assume that the effect of an attribute value on a given class is independent of the values of the other attributes. This assumption is called class conditional independence.  Incremental: Each training example can incrementally increase/decrease the probability that a hypothesis is correct — prior knowledge can be combined with observed data  Standard: Even when Bayesian methods are computationally intractable, they can provide a standard of optimal decision making against which other methods can be measured  Bayesian Belief Network: are graphical models that allow the representation of dependencies among subsets of attributes
  • 67. Naïve Bayesian Classification • The Naïve Bayes classifier uses all the attributes • Two assumptions: – Attributes are equally important – Attributes are statistically independent, i.e., knowing the value of one attribute says nothing about the value of another • The equal-importance and independence assumptions are never exactly correct in real-life datasets
  • 68. Bayesian Theorem: Basics • Let X be a data sample (“evidence”): its class label is unknown • Let H be a hypothesis that X belongs to class C • E.g. our world of tuples is confined to customers described by the attributes age and income. X is a 35-year-old customer with an income of $40,000. • Classification is to determine P(H|X), the probability that the hypothesis holds given the observed data sample X. P(H|X) reflects the probability that customer X will buy a computer given that we know the customer’s age and income. • P(H) (prior probability), the initial probability • E.g., X will buy a computer, regardless of age, income, … • P(X): prior probability of X, i.e. the probability that the sample data is observed (that a person from our set of customers is 35 years old and earns $40,000) • P(X|H) (likelihood, i.e. the probability of X conditioned on H), the probability of observing the sample X, given that the hypothesis holds • E.g., given that X will buy a computer, the probability that X is 31..40 with medium income
  • 69. Bayesian Theorem • Given training data X, the posteriori probability of a hypothesis H, P(H|X), follows Bayes' theorem:
      \[ P(H \mid X) = \frac{P(X \mid H)\, P(H)}{P(X)} \]
    • Informally, this can be written as posteriori = likelihood × prior / evidence • Predicts that X belongs to Ci iff the probability P(Ci|X) is the highest among all the P(Ck|X) for all the k classes • Practical difficulty: requires initial knowledge of many probabilities, at significant computational cost
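Bayes' theorem, posteriori = likelihood × prior / evidence, can be worked through numerically. The probabilities below are made up for illustration and do not come from the slides. The evidence P(X) is obtained from the law of total probability over H and not-H.

```python
def posterior(prior_h, likelihood_x_given_h, evidence_x):
    """Bayes' theorem: P(H|X) = P(X|H) * P(H) / P(X)."""
    if evidence_x <= 0:
        raise ValueError("P(X) must be positive")
    return likelihood_x_given_h * prior_h / evidence_x

# Made-up numbers: H = "customer buys a computer", X = the observed age/income profile.
p_h = 0.4                 # prior P(H)
p_x_given_h = 0.3         # likelihood P(X|H)
p_x_given_not_h = 0.1     # P(X|not H)

# Evidence via total probability: P(X) = P(X|H)P(H) + P(X|not H)P(not H)
p_x = p_x_given_h * p_h + p_x_given_not_h * (1 - p_h)
p_h_given_x = posterior(p_h, p_x_given_h, p_x)
```

With these numbers the evidence is 0.18 and the posterior is 0.12 / 0.18, about 0.67, so observing X raises the probability of H above its prior of 0.4.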
  • 70. October 14, 2023 70 Artificial Neural Networks  Artificial Neural Networks (ANN) Started by psychologists and neurobiologists to develop and test computational analogues of neurons  Other names: 1.Connectionist learning 2.Prediction by N N 3. Adaptive networks, 4. Neural computation 5.Parallel distributed processing 6. Collective computation  Artificial neural networks components:  Units : A neural network is composed of a number of nodes, or units. It is Metaphor for nerve cell body  Links: Units connected by links. Links represent synaptic connections from one unit to another  Weight : Each link has a numeric weight
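The unit, link, and weight vocabulary can be illustrated with a single threshold unit. This is a minimal sketch: the step activation, the bias term, and the AND-gate weights are assumptions chosen for the example and are not specified in the slides.

```python
def unit_output(inputs, weights, bias):
    """One unit: weighted sum over incoming links, then a step activation."""
    total = bias + sum(w * x for w, x in zip(weights, inputs))
    return 1 if total > 0 else 0

# Hypothetical weights that make the unit behave like a logical AND of two inputs.
weights = [1.0, 1.0]
bias = -1.5
outputs = [unit_output([a, b], weights, bias) for a in (0, 1) for b in (0, 1)]
```

Here `outputs` lists the unit's response to the inputs (0,0), (0,1), (1,0), (1,1) in that order. Only the last input pair pushes the weighted sum above zero.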
  • 71. Genetic Algorithms (GA) • Genetic Algorithm: based on an analogy to biological evolution • An initial population is created consisting of randomly generated rules • Each rule is represented by a string of bits • E.g., if A1 and ¬A2 then C2 can be encoded as 100 • If an attribute has k > 2 values, k bits can be used • Based on the notion of survival of the fittest, a new population is formed to consist of the fittest rules and their offspring • The fitness of a rule is represented by its classification accuracy on a set of training examples • Offspring are generated by crossover and mutation • The process continues until a population P evolves in which each rule satisfies a prespecified fitness threshold • Slow but easily parallelizable
  • 72. October 14, 2023 72 Genetic Algorithms  A Genetic Algorithm (GA) is a computational model consisting of five parts:  A starting set of individuals, P.  Crossover: technique to combine two parents to create offspring.  Mutation: randomly change an individual.  Fitness: determine the best individuals.  Algorithm which applies the crossover and mutation techniques to P iteratively using the fitness function to determine the best individuals in P to keep.
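The five parts listed above (population, crossover, mutation, fitness, and the driving loop) can be sketched on a toy problem. The example below is an assumption-laden illustration: it evolves bit strings toward all ones (the "OneMax" toy problem) instead of classification rules, and the population size, mutation rate, and seed are arbitrary choices.

```python
import random

def fitness(individual):
    """Toy fitness: the number of 1 bits (stands in for rule accuracy)."""
    return sum(individual)

def crossover(a, b, rng):
    """Single-point crossover: prefix of one parent, suffix of the other."""
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:]

def mutate(individual, rng, rate=0.05):
    """Flip each bit independently with a small probability."""
    return [1 - bit if rng.random() < rate else bit for bit in individual]

def evolve(pop_size=20, length=16, generations=40, seed=0):
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
    best_per_generation = []
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        best_per_generation.append(fitness(population[0]))
        survivors = population[: pop_size // 2]      # keep the fittest half unchanged
        children = []
        while len(survivors) + len(children) < pop_size:
            p1, p2 = rng.sample(survivors, 2)
            children.append(mutate(crossover(p1, p2, rng), rng))
        population = survivors + children
    return max(population, key=fitness), best_per_generation

best, history = evolve()
```

Because the fittest half is carried over unchanged, the best fitness recorded in `history` never decreases from one generation to the next.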
  • 73. What is the Support Vector Machine? • “Support Vector Machine” (SVM) is a supervised machine learning algorithm that can be used for both classification and regression challenges. However, it is mostly used in classification problems. In the SVM algorithm, we plot each data item as a point in n-dimensional space (where n is the number of features you have), with the value of each feature being the value of a particular coordinate. Then, we perform classification by finding the hyper-plane that differentiates the two classes very well. • Support vectors are the coordinates of the individual observations that lie closest to the separating hyper-plane. The SVM classifier is the frontier (hyper-plane/line) that best segregates the two classes.
  • 74. 2. UNSUPERVISED LEARNING • Unsupervised learning feeds on unlabeled data. • Supervised learning needs to know the results in order to sort out the data, whereas in unsupervised learning the desired results are unknown and yet to be defined. • As no teacher is provided, no training is given to the machine; the machine has to find the hidden structure in unlabeled data by itself.
  • 75. 2. UNSUPERVISED LEARNING • The unsupervised machine learning algorithm is used for:  exploring the structure of the information;  extracting valuable insights;  detecting patterns;  descriptive modeling. • Eg. I have photos and want to put them in 20 groups. 75 October 14, 2023
  • 76. TYPES OF UNSUPERVISED LEARNING 76 October 14, 2023
  • 77. TYPES OF UNSUPERVISED LEARNING I. Clustering(Target Variable not available) – • It is an exploration of data used to segment it into meaningful groups (i.e., clusters) based on their internal patterns without prior knowledge of group credentials. • The credentials are defined by similarity of individual data objects and also aspects of its dissimilarity from the rest. • eg. Customer segmentation -grouping customers by purchasing behavior. 77 October 14, 2023
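Customer segmentation by similarity can be sketched with k-means on a single feature. The spending values and starting centroids below are made up for illustration, and a one-dimensional version is used only to keep the example short.

```python
def k_means_1d(points, centroids, iterations=10):
    """Plain k-means on 1-D data: assign to nearest centroid, then recompute means."""
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # An empty cluster keeps its previous centroid.
        centroids = [
            sum(c) / len(c) if c else centroids[i] for i, c in enumerate(clusters)
        ]
    return centroids, clusters

# Made-up monthly spend per customer: a low-spend and a high-spend group.
spend = [10, 12, 11, 95, 100, 105]
centroids, clusters = k_means_1d(spend, centroids=[0.0, 50.0])
```

No labels are supplied: the two groups emerge only from how close the values are to each other.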
  • 78. TYPES OF UNSUPERVISED LEARNING II. Association(Target Variable not available) – • An association rule learning problem is where you want to discover rules that describe large portions of your data, such as people that buy X also tend to buy Y. eg. Market Basket Analysis 78 October 14, 2023
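A rule such as "people that buy X also tend to buy Y" is usually scored by its support and confidence. The sketch below computes both on a made-up set of shopping baskets. The basket contents and function names are assumptions for illustration.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(transactions, antecedent, consequent):
    """Support of (antecedent and consequent) divided by support of antecedent.

    Assumes the antecedent occurs in at least one transaction.
    """
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)

# Made-up market baskets.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
rule_support = support(baskets, {"bread", "milk"})
rule_confidence = confidence(baskets, {"bread"}, {"milk"})
```

For the rule bread → milk, two of the four baskets contain both items (support 0.5), and two of the three bread baskets also contain milk (confidence about 0.67).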
  • 79. EXAMPLE OF UNSUPERVISED LEARNING ALGORITHMS •PCA •t-SNE •k-means •DBSCAN •Apriori algorithm • FP – Growth Dimensionality reduction: There is a lot of noise in the incoming data. Machine learning algorithms use dimensionality reduction to remove this noise while distilling the relevant information. 79 October 14, 2023
  • 80. REINFORCEMENT LEARNING • Reinforcement learning is about taking suitable action to maximize reward in a particular situation. • It uses exploration/exploitation. Action takes place, consequences are observed and the next action considers the results of first action. • In supervised learning, training data has answer key with it so the model is trained with correct answer itself. Whereas, in reinforcement learning there is no answer but agent decides what to do to perform given task. In the absence of a training dataset, it is bound to learn from its experience. 80 October 14, 2023
  • 81. REINFORCEMENT LEARNING • The agent is an assumed entity that performs actions in an environment to gain some reward. • The environment is the scenario the agent has to face; it gives feedback via a positive or negative reward signal. • The state (s) is the current situation returned by the environment.
  • 82. REINFORCEMENT LEARNING • The two main types of reward signal are: - A positive reward signal encourages the agent to continue a particular sequence of actions. - A negative reward signal penalizes certain actions and pushes the algorithm to correct its behavior so that it stops incurring penalties. • However, the role of the reward signal may vary depending on the nature of the information. • Overall, the system tries to maximize positive rewards and minimize negative ones. • https://www.yotube.com/watch?v=KiHdKynXDtw
  • 83. REINFORCEMENT LEARNING • Input: the initial state from which the model starts. • Output: there are many possible outputs, as there is a variety of solutions to a particular problem. • Training: training is based on the input; the model returns a state, and the user decides whether to reward or punish the model based on its output. • The model continues to learn. • The best solution is decided based on the maximum reward.
  • 84. REINFORCEMENT LEARNING Practical applications of reinforcement learning: • RL can be used in robotics for industrial automation. • RL can be used in machine learning and data processing. • RL can be used to create training systems that provide custom instruction and materials according to the requirements of students.
  • 85. REINFORCEMENT LEARNING There are two important learning models in reinforcement learning: • Markov Decision Process • Q-learning
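The pieces introduced on the reinforcement learning slides (agent, environment, state, reward signal, exploration versus exploitation) can be put together in a small tabular Q-learning sketch. Everything concrete here is an assumption for illustration: a five-cell corridor as the environment, a reward of +1 at the goal and -0.1 per other move, and the values of `alpha`, `gamma`, and `epsilon`.

```python
import random

N_STATES = 5          # corridor cells 0..4; cell 4 is the goal
ACTIONS = (-1, +1)    # step left or step right

def step(state, action):
    """Hypothetical environment: returns (next_state, reward, done)."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    if next_state == N_STATES - 1:
        return next_state, 1.0, True    # positive reward signal at the goal
    return next_state, -0.1, False      # negative reward signal for any other move

def train(episodes=300, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]   # q[state][action index]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Exploration vs. exploitation (epsilon-greedy choice).
            if rng.random() < epsilon:
                a = rng.randrange(2)
            else:
                a = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = step(state, ACTIONS[a])
            # Q-learning update: move q toward reward + discounted best future value.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q

q_table = train()
policy = ["left" if left > right else "right" for left, right in q_table[:-1]]
```

No answer key is given to the agent; the learned policy (move right in every cell) comes only from the rewards it experienced.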
  • 86. APPLICATIONS OF SUPERVISED LEARNING, UNSUPERVISED LEARNING & REINFORCEMENT LEARNING
  • 88. TYPES OF ML • UNSUPERVISED LEARNING (Data Driven) (Identify Clusters): Clustering (SVD, PCA, K-Means); Dimensionality Reduction (Text Mining, Face Recognition, Big Data Visualization, Image Recognition); Association Analysis (Apriori, FP-Growth); Hidden Markov Model • REINFORCEMENT LEARNING (Learn from errors): Dynamic Programming; Monte Carlo Tree Search (MCTS); Heuristic Methods; Q-Learning; Deep Adversarial Networks; Temporal Difference (TD); Asynchronous Actor-Critic Agents (A3C) • SUPERVISED LEARNING (Task Driven) (Predict next values): Regression (Linear, Polynomial); Decision Tree; Random Forest; Classification (KNN, Trees, Logistic Regression, Naive Bayes, SVM)
  • 89. STEPS TO SOLVE A MACHINE LEARNING PROBLEM • Data Gathering: collect data from various sources. • Data Preprocessing: clean the data to make it homogeneous. • Feature Engineering: make your data more useful. • Algorithm Selection & Training: select the right machine learning model. • Making Predictions: evaluate the model.
  • 90. 1. Data Gathering • Might depend on human work: - Manual labeling for supervised learning. - Domain knowledge, maybe even experts. • May come for free, or “sort of” (e.g., machine translation). • The more the better: some algorithms need large amounts of data to be useful (e.g., neural networks). • The quantity and quality of the data dictate model accuracy.
  • 91. 2. Data Preprocessing • Is there anything wrong with the data? - Missing values - Outliers - Bad encoding (for text) - Wrongly labeled examples - Biased data • Do I have many more samples of one class than of the rest? • Do I need to fix or remove data?
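A few of these checks can be sketched in plain Python. The records, the median-absolute-deviation rule for outliers, and its threshold of 5 are assumptions chosen for this example, not a prescribed method; real projects pick rules that suit their data.

```python
from collections import Counter
from statistics import median

# Hypothetical records: (age, label); None marks a missing value.
records = [(25, "ham"), (31, "ham"), (None, "spam"),
           (29, "ham"), (240, "ham"), (27, "spam")]

ages = [age for age, _ in records if age is not None]
missing = sum(age is None for age, _ in records)

# Flag outliers with the median absolute deviation (MAD),
# which is not distorted by the outliers themselves.
med = median(ages)
mad = median(abs(a - med) for a in ages)
outliers = [a for a in ages if abs(a - med) > 5 * mad]

# Fix/remove: drop outlier rows, fill missing ages with the median of the rest.
clean_median = median(a for a in ages if a not in outliers)
cleaned = [(clean_median if age is None else age, label)
           for age, label in records if age not in outliers]

# Class balance: how many samples does each class have?
class_counts = Counter(label for _, label in cleaned)
```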
  • 92. 3. Feature Engineering • A feature is an individual measurable property of a phenomenon being observed. • Our inputs are represented by a set of features. • To classify spam email, features could be: - Number of words that have been ch4ng3d like this. - Language of the email (0=English, 1=Spanish). - Number of emojis.
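The three spam features listed above could be extracted roughly as follows. The heuristics are assumptions of this sketch: a word counts as "ch4ng3d" if it mixes letters and digits, and emojis are detected by two Unicode ranges that cover many, but not all, emoji.

```python
import re

def spam_features(text, language):
    """Turn one email into [obfuscated word count, language code, emoji count]."""
    words = text.split()
    # Heuristic: a word that mixes letters and digits was probably "ch4ng3d".
    obfuscated = sum(
        1 for w in words
        if re.search(r"[A-Za-z]", w) and re.search(r"\d", w)
    )
    # Heuristic emoji check: two common emoji code-point ranges (not exhaustive).
    emojis = sum(
        1 for ch in text
        if 0x1F300 <= ord(ch) <= 0x1FAFF or 0x2600 <= ord(ch) <= 0x27BF
    )
    language_code = {"English": 0, "Spanish": 1}[language]
    return [obfuscated, language_code, emojis]

# Hypothetical email ending in two party-popper emojis (U+1F389)
features = spam_features("W1n fr33 m0ney now \U0001F389\U0001F389", "English")
```

Each email becomes a short numeric vector, which is the form most learning algorithms expect as input.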
  • 93. 3. Feature Engineering • Extract more information from existing data: - Make it more useful. - With good features, most algorithms can learn faster. • Requires thought and knowledge of the data. • Two steps: - Variable transformation (e.g., dates into weekdays, normalizing). - Feature creation (e.g., n-grams for texts, whether a word is capitalized to detect names, etc.).
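The two steps can be illustrated with small helper functions. The function names and sample inputs are made up for this sketch, and min-max scaling is only one of several ways to normalize.

```python
from datetime import date

def weekday_feature(d):
    """Variable transformation: date -> weekday index (Monday = 0)."""
    return d.weekday()

def min_max_normalize(values):
    """Variable transformation: rescale to [0, 1] (assumes values are not all equal)."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def word_ngrams(text, n=2):
    """Feature creation: consecutive word n-grams."""
    words = text.lower().split()
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]

def capitalized_flags(text):
    """Feature creation: 1 if a word starts with a capital letter, else 0."""
    return [int(w[0].isupper()) for w in text.split()]

weekday = weekday_feature(date(2023, 10, 14))
scaled = min_max_normalize([10, 20, 40])
bigrams = word_ngrams("machine learning is fun")
flags = capitalized_flags("Tom Mitchell wrote a book")
```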
  • 94. 4. Algorithm Selection & Training • Supervised: Linear classifier; Naive Bayes; Support Vector Machines (SVM); Decision Tree; Random Forests; k-Nearest Neighbors; Neural Networks (Deep learning) • Unsupervised: PCA; t-SNE; k-means; DBSCAN; Apriori algorithm; FP-Growth • Reinforcement: SARSA-λ; Q-Learning; Markov Decision Process
  • 95. 4. Algorithm Selection & Training • Goal of training: making the correct prediction as often as possible. • Incremental improvement: - Use metrics to evaluate performance and compare solutions. - Hyperparameter tuning (a hyperparameter is a parameter whose value is used to control the learning process).
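To make the metric and hyperparameter ideas concrete, the sketch below tunes k, the hyperparameter of a k-nearest-neighbors classifier, by comparing accuracy on a held-out validation set. The tiny one-feature data set (with one deliberately mislabeled training point) and the candidate values of k are assumptions for illustration.

```python
from collections import Counter

def knn_predict(train, x, k):
    """Classify x by majority vote among its k nearest training points."""
    nearest = sorted(train, key=lambda item: abs(item[0] - x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def accuracy(train, validation, k):
    """Metric: fraction of validation points classified correctly."""
    hits = sum(knn_predict(train, x, k) == y for x, y in validation)
    return hits / len(validation)

# Hypothetical data: (feature value, class); (1.3, "b") is a noisy label.
train = [(1.0, "a"), (1.2, "a"), (1.4, "a"), (1.3, "b"),
         (3.0, "b"), (3.2, "b"), (3.4, "b")]
validation = [(1.05, "a"), (1.32, "a"), (3.05, "b"), (3.35, "b")]

# Hyperparameter tuning: try several values of k and keep the best-scoring one.
scores = {k: accuracy(train, validation, k) for k in (1, 3, 7)}
best_k = max(scores, key=scores.get)
```

Here k = 1 is misled by the noisy point and k = 7 always predicts the majority class, so the middle value scores best on the validation set.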
  • 97. Types of data in clustering analysis: - Interval-Scaled Attributes - Binary Attributes - Nominal Attributes - Ordinal Attributes - Ratio-Scaled Attributes - Attributes of Mixed Type
  • 98. Data Types: Interval-Scaled Attributes • Continuous measurements on a roughly linear scale. • Example, height scale: 1. The scale ranges over the metre or foot scale. 2. Heights need to be standardized, as different scales can be used to express the same absolute measurement. • Example, weight scale: 1. The scale ranges over the kilogram or pound scale (e.g., 20 kg, 40 kg, 60 kg, 80 kg, 100 kg, 120 kg).
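The need to standardize can be shown with a z-score, which is one common standardization; the slide does not say which method is intended, so this choice is an assumption. The heights are invented. After standardizing, the same people get the same values whether they were measured in metres or in feet.

```python
from statistics import mean, pstdev

def standardize(values):
    """z-score: subtract the mean, then divide by the standard deviation."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]

heights_m = [1.50, 1.60, 1.70, 1.80]            # hypothetical heights in metres
heights_ft = [h * 3.28084 for h in heights_m]   # the same heights in feet

z_m = standardize(heights_m)
z_ft = standardize(heights_ft)                  # matches z_m up to rounding error
```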
  • 99. Binary Variables • A contingency table for binary data:
                     Object j
                     1      0      sum
      Object i  1    a      b      a+b
                0    c      d      c+d
              sum    a+c    b+d    p
    • Distance measure for symmetric binary variables: $d(i,j) = \frac{b+c}{a+b+c+d}$
    • Distance measure for asymmetric binary variables: $d(i,j) = \frac{b+c}{a+b+c}$
    • Jaccard coefficient (similarity measure for asymmetric binary variables): $sim_{Jaccard}(i,j) = \frac{a}{a+b+c}$
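The three measures can be computed directly from the contingency counts: a (both objects are 1), b (only object i is 1), c (only object j is 1), and d (both are 0). The two example vectors are made up.

```python
def contingency(x, y):
    """Count a, b, c, d for two equal-length 0/1 vectors (x = object i, y = object j)."""
    a = sum(1 for u, v in zip(x, y) if u == 1 and v == 1)
    b = sum(1 for u, v in zip(x, y) if u == 1 and v == 0)
    c = sum(1 for u, v in zip(x, y) if u == 0 and v == 1)
    d = sum(1 for u, v in zip(x, y) if u == 0 and v == 0)
    return a, b, c, d

def symmetric_distance(x, y):
    a, b, c, d = contingency(x, y)
    return (b + c) / (a + b + c + d)

def asymmetric_distance(x, y):
    a, b, c, _ = contingency(x, y)   # 0/0 matches (d) are ignored
    return (b + c) / (a + b + c)

def jaccard_similarity(x, y):
    a, b, c, _ = contingency(x, y)
    return a / (a + b + c)

# Hypothetical binary attributes of two objects
obj_i = [1, 0, 1, 0, 0, 0]
obj_j = [1, 0, 0, 1, 0, 0]

d_sym = symmetric_distance(obj_i, obj_j)     # (1 + 1) / 6
d_asym = asymmetric_distance(obj_i, obj_j)   # (1 + 1) / 3
sim_jac = jaccard_similarity(obj_i, obj_j)   # 1 / 3
```

The asymmetric distance and the Jaccard coefficient always add up to 1, since both ignore the d count.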
  • 100. Nominal / Categorical Variables • A generalization of the binary variable in that it can take more than 2 states, e.g., red, yellow, blue, green. • Method 1: simple matching, $d(i,j) = \frac{p-m}{p}$, where m is the number of matches and p is the total number of variables. • Method 2: use a large number of binary variables, creating a new binary variable for each of the M nominal states.
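Both methods are short enough to sketch. The objects and their attribute values below are invented for illustration.

```python
def simple_matching_distance(x, y):
    """Method 1: d(i, j) = (p - m) / p, with m matches out of p variables."""
    p = len(x)
    m = sum(u == v for u, v in zip(x, y))
    return (p - m) / p

def one_hot(value, states):
    """Method 2: one new binary variable per nominal state."""
    return [int(value == s) for s in states]

# Hypothetical objects described by (colour, size, shape)
obj_i = ("red", "small", "round")
obj_j = ("red", "large", "round")

distance = simple_matching_distance(obj_i, obj_j)          # 1 mismatch out of 3
encoded = one_hot("blue", ["red", "yellow", "blue", "green"])
```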
  • 101. Ratio-Scaled Variables (Data Mining: Concepts and Techniques) • Ratio-scaled variable: a positive measurement on a nonlinear scale, approximately at exponential scale, such as $Ae^{Bt}$ or $Ae^{-Bt}$. • Methods: - Treat them like interval-scaled variables: not a good choice, because the scale can be distorted. - Apply a logarithmic transformation, $y_{if} = \log(x_{if})$. - Treat them as continuous ordinal data and treat their ranks as interval-scaled.
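A short sketch shows why the logarithmic transformation helps: values that grow exponentially become evenly spaced after taking logs, so they can then be handled like interval-scaled data. The parameters A = 2 and B = 0.5 are arbitrary choices for this example.

```python
import math

A, B = 2.0, 0.5                                  # hypothetical growth parameters
x = [A * math.exp(B * t) for t in range(5)]      # ratio-scaled measurements A*e^(Bt)
y = [math.log(v) for v in x]                     # y_if = log(x_if)
steps = [b - a for a, b in zip(y, y[1:])]        # every step equals B after the transform

# Alternative method: replace each value by its rank and
# treat the ranks as interval-scaled.
ranks = [sorted(x).index(v) + 1 for v in x]
```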
  • 102. S.No | Machine Learning | Deep Learning
    1. | Machine Learning is a superset of Deep Learning. | Deep Learning is a subset of Machine Learning.
    2. | Machine Learning typically uses structured data, so its data representation is quite different from that of Deep Learning. | Deep Learning represents data quite differently, as it uses artificial neural networks (ANN).
    3. | Machine Learning is an evolution of AI. | Deep Learning is an evolution of Machine Learning; essentially, it describes how deep the machine learning is.
    4. | Typically works with thousands of data points. | Big data: millions of data points.
    5. | Outputs: numerical values, such as a classification or a score. | Outputs: anything from numerical values to free-form elements, such as free text and sound.
    6. | Uses various types of automated algorithms that learn to model functions and predict future actions from data. | Uses neural networks that pass data through processing layers to interpret data features and relations.
    7. | Algorithms are directed by data analysts to examine specific variables in data sets. | Algorithms are largely self-directed on data analysis once they are put into production.
    8. | Machine Learning is widely used to stay competitive and learn new things. | Deep Learning solves complex machine learning problems.