 What is machine learning?
 Learning system model
 Training and testing
 Performance
 Learning techniques
 Machine learning structure
 Machine learning algorithms
 Machine learning applications
 Conclusion
 Machine learning is a type of artificial intelligence
that allows software applications to become more
accurate in predicting outcomes without being
explicitly programmed.
 A branch of artificial intelligence, concerned with
the design and development of algorithms that
allow computers to evolve behaviors based on
empirical data.
 As intelligence requires knowledge, it is necessary
for computers to acquire knowledge.
 Email spam Filtering
 Online Fraud Detection
 Face Recognition
 Search Engine and Result Refining
 Traffic Predictions
 Product Recommendations
 Image Recognition
 Speech Recognition
 Face detection
 Character detection
 Medical diagnosis
 Web Advertising
 There are several factors affecting the performance:
◦ Types of training provided
◦ The form and extent of any initial background knowledge
◦ The type of feedback provided
◦ The learning algorithms used
 Two important factors:
◦ Modeling
◦ Optimization
 Training is the process of making the system able to
learn.
 No free lunch rule:
◦ Training set and testing set come from the same
distribution
◦ Need to make some assumptions or introduce an inductive bias
 The success of a machine learning system also
depends on the algorithms.
 The algorithms control the search to find and
build the knowledge structures.
 The learning algorithms should extract useful
information from training examples.
 Supervised learning categories and
techniques
◦ Linear classifier (numerical functions)
◦ Parametric (Probabilistic functions)
 Naïve Bayes, Gaussian discriminant
analysis (GDA), Hidden Markov models
(HMM), Probabilistic graphical models
◦ Non-parametric (Instance-based functions)
 K-nearest neighbors, Kernel regression,
Kernel density estimation, Local
regression
◦ Non-metric (Symbolic functions)
 Classification and regression tree (CART)
 Techniques:
◦ Perceptron
◦ Logistic regression
◦ Support vector machine (SVM)
◦ Ada-line
◦ Multi-layer perceptron (MLP)
 Using the perceptron learning algorithm (PLA)
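Below is a minimal NumPy sketch of the perceptron learning algorithm (PLA); the toy data, epoch limit, and update loop are illustrative assumptions, not taken from the slides.

```python
import numpy as np

def pla(X, y, max_epochs=100):
    """Learn weights w so that sign([1, x] @ w) matches labels y in {-1, +1}."""
    X = np.hstack([np.ones((len(X), 1)), X])   # prepend a bias column
    w = np.zeros(X.shape[1])
    for _ in range(max_epochs):
        updated = False
        for xi, yi in zip(X, y):
            if np.sign(xi @ w) != yi:          # misclassified point
                w += yi * xi                   # move the boundary toward it
                updated = True
        if not updated:                        # converged: all points correct
            break
    return w

# Toy linearly separable data (assumed for illustration)
X = np.array([[2.0, 3.0], [1.0, 1.5], [-1.0, -2.0], [-2.0, -1.0]])
y = np.array([1, 1, -1, -1])
print("weights:", pla(X, y))
```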
Machine learning splits into two main branches:
◦ Supervised learning: develop a predictive model based on both input and output data (classification, regression)
◦ Unsupervised learning: group and interpret data based only on input data (clustering)
 Classification is a process related to categorization:
the process in which ideas and objects are
recognized, differentiated, and understood.
 Regression is a technique for determining the
statistical relationship between two or more
variables, where a change in the dependent variable
is associated with, and depends on, a change in one
or more independent variables.
 Clustering is the task of grouping a set of objects in
such a way that objects in the same group
(called a cluster) are more similar (in some
sense) to each other than to those in other
groups (clusters).
 It is a main task of exploratory data mining
and a common technique for statistical data
analysis, used in many fields, including machine
learning, pattern recognition, image analysis,
information retrieval, bioinformatics, data
compression, and computer graphics.
 Supervised learning
◦ Prediction
◦ Classification (discrete labels), Regression (real values)
 Unsupervised learning
◦ Clustering
◦ Probability distribution estimation
◦ Finding association (in features)
◦ Dimension reduction
 Semi-supervised learning
 Reinforcement learning
◦ Decision making (robot, chess machine)
Machine learning techniques:
◦ Supervised learning: concerned with classified (labeled) data
◦ Unsupervised learning: concerned with unclassified (unlabeled) data, e.g. clustering
◦ Semi-supervised learning: concerned with a mixture of classified and unclassified data
◦ Reinforcement learning: no predefined data; the agent learns from interaction
 Supervised learning: learning from known,
labeled data to create a model, then
predicting the target class for given input
data.
1. Linear regression & multiple linear
regression
2. Logistic Regression
3. Polynomial Regression
4. Decision trees
5. Support Vector Machine(SVM)
6. K-nearest Neighbors (KNN)
7. Naive Bayes
8. Random Forest
 Linear regression is a basic and commonly used type of
predictive analysis. It models the relationship between a
dependent variable (Y) and one or more independent
variables.
 Simple linear regression: there is only one
input variable (x).
 Multiple linear regression: there are two or more
input variables (e.g. x1, x2), in which case it
is called multiple regression.
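A minimal scikit-learn sketch of both variants; the toy data below is assumed for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Simple linear regression: one input variable x
x = np.array([[1.0], [2.0], [3.0], [4.0]])
y = np.array([2.1, 4.0, 6.2, 7.9])
simple = LinearRegression().fit(x, y)
print("slope:", simple.coef_, "intercept:", simple.intercept_)

# Multiple linear regression: several input variables x1, x2
X = np.array([[1.0, 0.5], [2.0, 1.0], [3.0, 2.5], [4.0, 3.0]])
multi = LinearRegression().fit(X, y)
print("coefficients:", multi.coef_)
```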
 Logistic regression is based on the sigmoid function,
which was developed by statisticians to describe
the properties of population growth in ecology:
rising quickly and maxing out at the carrying
capacity of the environment.
 It is used to find the probability of event
success and event failure.
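A minimal sketch of the sigmoid function and a scikit-learn logistic regression that outputs success/failure probabilities; the toy data is assumed for illustration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def sigmoid(z):
    # Rises quickly, then saturates at 1 (the "carrying capacity" shape)
    return 1.0 / (1.0 + np.exp(-z))

X = np.array([[0.5], [1.0], [1.5], [3.0], [3.5], [4.0]])
y = np.array([0, 0, 0, 1, 1, 1])            # 0 = failure, 1 = success
clf = LogisticRegression().fit(X, y)
print(clf.predict_proba([[2.0]]))            # [P(failure), P(success)]
print(sigmoid(np.array([-2.0, 0.0, 2.0])))
```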
 The cost is minimized in the same way as in linear
regression.
 For example, a cubic fit with one feature x:
h(x) = θ0 + θ1·x + θ2·x² + θ3·x³ (see the sketch below).
 New features are generated by squaring and cubing the
original feature.
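A minimal sketch of that cubic fit: the features x² and x³ are generated from the original feature and an ordinary linear regression is fitted on them. The data and the use of scikit-learn's PolynomialFeatures are illustrative assumptions.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

x = np.linspace(-2, 2, 20).reshape(-1, 1)
y = 1 + 2 * x[:, 0] - x[:, 0] ** 2 + 0.5 * x[:, 0] ** 3   # illustrative target

cubic = PolynomialFeatures(degree=3, include_bias=False)   # builds x, x², x³
X_poly = cubic.fit_transform(x)
model = LinearRegression().fit(X_poly, y)
print("theta:", model.intercept_, model.coef_)             # recovers 1, 2, -1, 0.5
```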
It is mostly used for classification.
Types of decision tree:
1. Categorical variable decision tree: the target
variable is categorical.
2. Continuous variable decision tree: the target
variable is continuous (a small sketch of both types
follows the advantages list below).
 Easy to understand
 Useful in data exploration
 Less data cleaning required
 Data type is not a constraint
 Non-parametric method
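A minimal scikit-learn sketch of the two tree types (categorical vs. continuous target); the feature values and targets are made-up examples.

```python
from sklearn.tree import DecisionTreeClassifier, DecisionTreeRegressor

X = [[25, 40000], [35, 60000], [45, 80000], [20, 20000]]   # e.g. age, income

# Categorical target variable -> classification tree
y_class = ["no", "yes", "yes", "no"]
clf = DecisionTreeClassifier(max_depth=2).fit(X, y_class)
print(clf.predict([[30, 50000]]))

# Continuous target variable -> regression tree
y_cont = [1.2, 2.5, 3.9, 0.8]
reg = DecisionTreeRegressor(max_depth=2).fit(X, y_cont)
print(reg.predict([[30, 50000]]))
```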
 It is a supervised learning algorithm.
 It is mostly used for classification problems.
There are two types of SVM classifiers:
1. Linear SVM
2. Non-linear SVM
 In a linear SVM the data points are separated by
an apparent gap.
 It predicts a straight hyperplane dividing the two
classes.
 This hyperplane is called the maximum-margin
hyperplane.
 In a non-linear SVM the data points are mapped into a
higher-dimensional space.
 There, the kernel trick is used to find the maximum-margin
hyperplane.
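A minimal scikit-learn sketch contrasting a linear SVM with a kernel (non-linear) SVM; the generated blob and circle datasets are illustrative assumptions.

```python
from sklearn.svm import SVC
from sklearn.datasets import make_blobs, make_circles

# Linear SVM: data separated by an apparent gap, straight maximum-margin hyperplane
X_lin, y_lin = make_blobs(n_samples=100, centers=2, random_state=0)
linear_svm = SVC(kernel="linear").fit(X_lin, y_lin)

# Non-linear SVM: the RBF kernel trick handles data that is not linearly separable
X_circ, y_circ = make_circles(n_samples=100, factor=0.3, noise=0.05, random_state=0)
rbf_svm = SVC(kernel="rbf", gamma="scale").fit(X_circ, y_circ)

print(linear_svm.predict(X_lin[:3]), rbf_svm.predict(X_circ[:3]))
```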
 Allows the use of relatively small-parameter
algorithms to redirect a chaotic system to the
target.
 Reduces waiting time for chaotic systems.
 Maintains the performance of systems.
 Face detection
 Text and hypertext categorization
 Classification of images
 Bioinformatics
 Protein fold and remote homology detection
 Handwriting recognition
 Geo and Environmental Sciences
 Generalized predictive control(GPC)
 It is used for both classification and
regression predictive problems.
 It is widely used in industry for classification.
 It predicts the target label by finding the
nearest neighbor class; the closest class is
identified using distance measures such as
Euclidean distance.
 By using a cross-validation technique we can
test the KNN algorithm with different values of K.
 A small value of K means that noise will have a
higher influence on the result, i.e. the
probability of overfitting is very high.
 A large value of K makes the algorithm computationally
expensive and defeats the basic idea behind KNN.
 The KNN classifier is a very simple classifier that
works well on basic recognition problems.
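A minimal sketch of testing KNN with different values of K via cross-validation; the iris dataset and the chosen K values are illustrative assumptions.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
for k in (1, 3, 5, 11, 21):
    knn = KNeighborsClassifier(n_neighbors=k, metric="euclidean")
    scores = cross_val_score(knn, X, y, cv=5)      # 5-fold cross-validation
    print(f"K={k:2d}  mean accuracy={scores.mean():.3f}")
```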
 It is a straightforward and powerful
algorithm for the classification task.
 It works on Bayes' theorem of probability to
predict the class of an unknown data set.
 It is applicable to discrete data.
 Gaussian Naive Bayes is used for continuous values.
 In this classifier the continuous values associated
with each feature are assumed to be
distributed according to a Gaussian
distribution, also called the normal distribution.
 This gives a bell-shaped curve that is symmetric
about the mean of the feature values.
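A minimal Gaussian Naive Bayes sketch; the iris dataset is used only as an illustrative stand-in for continuous-valued features.

```python
from sklearn.datasets import load_iris
from sklearn.naive_bayes import GaussianNB

X, y = load_iris(return_X_y=True)
gnb = GaussianNB().fit(X, y)       # fits a Gaussian per feature and class
print(gnb.predict(X[:3]))          # predicted classes
print(gnb.theta_[:, 0])            # per-class mean of the first feature
```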
 It is used for both classification and
regression problems.
 The algorithm creates a forest from a
number of decision trees.
 The more trees in the forest, the more robust it
becomes; in general, a higher number of trees
gives more accurate results.
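A minimal sketch showing how accuracy tends to stabilise as more trees are added to the forest; the dataset and tree counts are illustrative assumptions.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = load_breast_cancer(return_X_y=True)
for n_trees in (5, 50, 200):
    forest = RandomForestClassifier(n_estimators=n_trees, random_state=0)
    print(n_trees, "trees:", cross_val_score(forest, X, y, cv=5).mean())
```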
 Banking
 Medicine
 Stock market
 E-commerce
 Example: decision trees are tools that create
rules
 Prediction of future cases: Use the rule to
predict the output for future inputs
 Knowledge extraction: The rule is easy to
understand
 Compression: The rule is simpler than the
data it explains
 Outlier detection: Exceptions that are not
covered by the rule, e.g., fraud
 Unsupervised learning: learning from unlabeled
data to differentiate and group the given input
data.
 Learning “what normally happens”
 No output
 Clustering: Grouping similar instances
 Other applications: Summarization,
Association Analysis
 Example applications
◦ Customer segmentation in CRM
◦ Image compression: Color quantization
◦ Bioinformatics: Learning motifs
 Step 1 - exploring data
 Step 2 - training the model
 Step 3 - plotting the model
 Vector quantization - image clustering
 Getting ready
 Step 1 - collecting and describing data
 Step 2 - exploring data
 Step 3 - data cleaning
 Step 4 - visualizing cleaned data
 Step 5 - building the model and visualizing it
 Using the perceptron learning algorithm (PLA)
1. K-means clustering
2. Hierarchical clustering
 K-means is an unsupervised learning algorithm, used
when you have unlabeled data.
 The goal of the algorithm is to find groups
in the data, with the number of groups
represented by the variable K.
 The centroids of the K clusters can be used to
label new data.
 Assume we have inputs x1, x2, x3, …, xn
and a value of K.
 Step 1: Pick K random points as cluster centers,
called centroids.
 Step 2: Assign each xi to the nearest cluster by
calculating its distance to each centroid.
 Step 3: Find the new cluster centers by taking the
average of the assigned points.
 Step 4: Repeat steps 2 and 3 until none
of the cluster assignments change (a NumPy sketch
of these steps follows).
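A minimal NumPy sketch of the four steps above; the random data, K = 3, and the iteration cap are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 2)) + rng.integers(0, 5, size=(300, 1))
K = 3

# Step 1: pick K random data points as the initial centroids
centroids = X[rng.choice(len(X), K, replace=False)]

for _ in range(100):
    # Step 2: assign each xi to its nearest centroid
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    labels = dists.argmin(axis=1)
    # Step 3: new centroid = average of the points assigned to it
    new_centroids = np.array([X[labels == k].mean(axis=0) if np.any(labels == k)
                              else centroids[k] for k in range(K)])
    # Step 4: stop when the assignments (and hence centroids) no longer change
    if np.allclose(new_centroids, centroids):
        break
    centroids = new_centroids

print(centroids)
```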
 Image segmentation
 Clustering gene segmentation data
 News article clustering
 Species clustering
 Anomaly detection
 Hierarchical clustering is a widely used data
analysis tool.
 The idea is to build a binary tree of the data
that successively merges similar groups of
points.
 Visualizing this tree provides a useful
summary of the data.
 Hierarchical clustering only requires a
measure of similarity between groups of
data points.
1. Let X = {x1, x2, x3, ..., xn} be the set of data points.
2. Begin with the disjoint clustering having level L(0) = 0 and
sequence number m = 0.
3. Find the least distance pair of clusters in the current clustering,
say pair (r), (s), according to d[(r),(s)] = min d[(i),(j)] where the
minimum is over all pairs of clusters in the current clustering.
4. Increment the sequence number: m = m +1.Merge clusters (r)
and (s) into a single cluster to form the next clustering m. Set
the level of this clustering to L(m) = d[(r),(s)].
5. Update the distance matrix, D, by deleting the rows and
columns corresponding to clusters (r) and (s) and adding a row
and column corresponding to the newly formed cluster. The
distance between the new cluster, denoted (r,s) and old
cluster(k) is defined in this way: d[(k), (r,s)] = min (d[(k),(r)],
d[(k),(s)]).
6. If all the data points are in one cluster then stop; else repeat
from step 3 (a SciPy sketch of this procedure follows).
Divisive hierarchical clustering is just the reverse of the
agglomerative approach.
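A minimal SciPy sketch of the agglomerative procedure above; single linkage is chosen because it matches the min-distance update in step 5, and the toy data is assumed for illustration.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

rng = np.random.default_rng(1)
# Two well-separated groups of points (illustrative)
X = np.vstack([rng.normal(0, 0.3, (10, 2)), rng.normal(3, 0.3, (10, 2))])

# linkage() repeatedly merges the least-distant pair of clusters;
# each row of Z records (r, s, d[(r),(s)], size of the merged cluster)
Z = linkage(X, method="single")

# Cut the binary tree into 2 clusters
labels = fcluster(Z, t=2, criterion="maxclust")
print(labels)
# scipy.cluster.hierarchy.dendrogram(Z) can be used to visualise the tree
```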
 1) No a priori information about the number
of clusters is required.
 2) Easy to implement, and gives the best result in
some cases.
1. The algorithm can never undo what was done previously.
2. A time complexity of at least O(n² log n) is required,
where n is the number of data points.
3. Based on the type of distance matrix chosen for
merging, different algorithms can suffer from one or
more of the following:
i) Sensitivity to noise and outliers
ii) Breaking large clusters
iii) Difficulty handling clusters of different sizes and
convex shapes
4. No objective function is directly minimized.
5. Sometimes it is difficult to identify the correct
number of clusters from the dendrogram.
 Labeled data is used to help identify that specific
groups of webpage types are present in the
data.
 The algorithm is then trained on unlabeled data to
define the boundaries of those webpage types, and
may even identify new types of webpages that
were unspecified in the existing human-provided
labels.
 Semi-supervised learning falls
between unsupervised learning (without any
labeled training data) and supervised learning (with
completely labeled training data).
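A minimal scikit-learn sketch of semi-supervised learning from a few labeled and many unlabeled points; LabelSpreading, the iris data, and the fraction of hidden labels are illustrative assumptions.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.semi_supervised import LabelSpreading

X, y = load_iris(return_X_y=True)
rng = np.random.default_rng(0)
y_partial = y.copy()
unlabeled = rng.random(len(y)) < 0.8          # hide 80% of the labels
y_partial[unlabeled] = -1                     # -1 marks "unlabeled"

# The model uses both the labeled and the unlabeled points to infer labels
model = LabelSpreading().fit(X, y_partial)
acc = (model.transduction_[unlabeled] == y[unlabeled]).mean()
print("accuracy on the hidden labels:", acc)
```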
 Word sense disambiguation
 Document categorization
 Named entity classification
 Sentiment analysis
 Machine translation
 Computer vision
 Object recognition
 Image segmentation
 Bioinformatics
 Protein function prediction
 Cognitive psychology
 In reinforcement learning, the learner is a decision-
making agent that takes actions in an environment
and receives a reward (or penalty) for its actions
while trying to solve a problem.
 After a set of trial-and-error runs, it should learn the
best policy, which is the sequence of actions that
maximizes the total reward.
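A minimal tabular Q-learning sketch of that trial-and-error loop; the corridor environment, learning rates, and episode count are all illustrative assumptions.

```python
import numpy as np

n_states, goal = 6, 5                 # states 0..5, reward only at state 5
actions = [-1, +1]                    # move left / move right
Q = np.zeros((n_states, len(actions)))
alpha, gamma, epsilon = 0.5, 0.9, 0.1
rng = np.random.default_rng(0)

for episode in range(500):            # repeated trial-and-error runs
    s = 0
    while s != goal:
        # epsilon-greedy action choice: mostly exploit, sometimes explore
        a = rng.integers(2) if rng.random() < epsilon else int(Q[s].argmax())
        s_next = min(max(s + actions[a], 0), n_states - 1)
        r = 1.0 if s_next == goal else 0.0
        # Q-learning update: move Q(s, a) toward reward + discounted future value
        Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])
        s = s_next

print("learned policy:", ["left" if q.argmax() == 0 else "right" for q in Q[:goal]])
```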
 Topics:
◦ Policies: what actions should an agent take in a particular
situation
◦ Utility estimation: how good is a state (used by policy)
 No supervised output but delayed reward
 Credit assignment problem (what was responsible for
the outcome)
 Applications:
◦ Game playing
◦ Robot in a maze
◦ Multiple agents, partial observability, ...
 Step 1 - collecting and describing the data
 Step 2 - exploring the data
 Step 3 - preparing the regression model
 Step 4 - preparing the Markov-switching
model
 Step 5 - plotting the regime probabilities
 Step 6 - testing the Markov switching model
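A minimal sketch of preparing a Markov-switching model and extracting the regime probabilities (steps 4 and 5 above); the synthetic two-regime series and the use of statsmodels are assumptions, since the steps do not name a specific tool.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
# Synthetic series: regime 0 has mean 0, regime 1 has mean 3 (illustrative)
y = np.concatenate([rng.normal(0, 1, 100),
                    rng.normal(3, 1, 100),
                    rng.normal(0, 1, 100)])

model = sm.tsa.MarkovRegression(y, k_regimes=2)   # mean switches between 2 regimes
result = model.fit()
print(result.summary())
# Smoothed regime probabilities, ready to plot against time
print(result.smoothed_marginal_probabilities)
```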
 Finance
 Media and advertising
 Text, speech, and dialog systems
 Health and medicine
 Education and training
 Robotics and industrial automation
 HVAC
 Face detection
 Object detection and recognition
 Image segmentation
 Multimedia event detection
 Economical and commercial usage
 We have given a simple overview of some
techniques and algorithms in machine
learning. Furthermore, more and more
applications are adopting machine learning as a
solution. In the future, machine learning will
play an important role in our daily life.
 [1] W. L. Chao, J. J. Ding, “Integrated Machine
Learning Algorithms for Human Age
Estimation”, NTU, 2011.
 UCI Repository:
http://www.ics.uci.edu/~mlearn/MLRepository.html
 UCI KDD Archive:
http://kdd.ics.uci.edu/summary.data.application.html
 Statlib: http://lib.stat.cmu.edu/
 Delve: http://www.cs.utoronto.ca/~delve/
 Journal of Machine Learning Research
www.jmlr.org
 Machine Learning
 IEEE Transactions on Neural Networks
 IEEE Transactions on Pattern Analysis and
Machine Intelligence
 Annals of Statistics
 Journal of the American Statistical Association