DEPARTMENT OF COMPUTER SCIENCE
AND INFORMATION TECHNOLOGY
presented by
NIBIYA.G
I-Msc(IT)
 SEMINAR TOPIC:
 Support vector machine algorithm
 Classification of association rules
 Support Vector Machine or SVM is one of the
most popular Supervised Learning
algorithms, which is used for Classification as
well as Regression problems.
 However, primarily, it is used for
Classification problems in Machine Learning.
 The goal of the SVM algorithm is to create the
best line or decision boundary that can
segregate n-dimensional space into classes
so that we can easily put the new data point
in the correct category in the future.
 This best decision boundary is called a
hyperplane.
 The extreme data points lying closest to the
hyperplane are called support vectors, and hence
the algorithm is termed Support Vector Machine.
 SVM can be understood with the example that
we used in the KNN classifier.
 Suppose we see a strange cat that also has
some features of dogs; if we want a model that
can accurately identify whether it is a cat or a
dog, such a model can be created using the
SVM algorithm.
 SVM can be of two types:
◦ Linear SVM
◦ Non-linear SVM
 Linear SVM: Linear SVM is used for linearly separable
data, which means if a dataset can be classified into
two classes by using a single straight line, then such
data is termed linearly separable data, and the
classifier used is called a Linear SVM classifier.
 Non-linear SVM: Non-linear SVM is used for non-
linearly separable data, which means if a dataset
cannot be classified by using a straight line, then
such data is termed non-linear data, and the
classifier used is called a Non-linear SVM
classifier.
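The two types above can be demonstrated in a few lines. This is a minimal sketch assuming scikit-learn is available; the datasets (`make_blobs` for linearly separable blobs, `make_circles` for data no straight line can split) are illustrative choices, not part of the slides.

```python
# Sketch: Linear SVM on linearly separable data vs. non-linear (RBF kernel)
# SVM on data that cannot be split by a single straight line.
from sklearn.datasets import make_blobs, make_circles
from sklearn.svm import SVC

# Linearly separable data: two well-separated clusters.
X_lin, y_lin = make_blobs(n_samples=100, centers=2, random_state=0)
linear_svm = SVC(kernel="linear").fit(X_lin, y_lin)
print("Linear SVM accuracy:", linear_svm.score(X_lin, y_lin))

# Non-linear data: concentric circles, not separable by one straight line.
X_non, y_non = make_circles(n_samples=100, factor=0.3, noise=0.05, random_state=0)
rbf_svm = SVC(kernel="rbf").fit(X_non, y_non)
print("RBF SVM accuracy:", rbf_svm.score(X_non, y_non))

# The extreme points that define the hyperplane (the support vectors):
print("Support vectors:", len(linear_svm.support_vectors_))
```

Swapping `kernel="linear"` for `kernel="rbf"` on the circles data is what turns the straight-line boundary into a curved one.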
 Analyzes and predicts customer behaviour
using if/then statements.
 EXAMPLE:
 Bread => Butter
 buys{onions, potatoes} => buys{tomatoes}
Bread => Butter {20%, 45%}
 Bread : Antecedent
 Butter : Consequent
 20% : Support
 45% : Confidence
A => B
 Support denotes the probability that a
transaction contains both A & B.
 Confidence denotes the probability that a
transaction containing A also contains B.
 Consider, in a supermarket:
◦ Total transactions: 100
◦ Transactions containing bread: 20
So, 20/100 * 100 = 20%, which is the support.
Of those 20 transactions, 9 also contain butter.
So, 9/20 * 100 = 45%, which is the confidence.
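The arithmetic above can be reproduced in a short sketch; the counts are the slide's own numbers, and the support here follows the slide's calculation (the fraction of transactions containing bread).

```python
# Support and confidence for the rule Bread => Butter,
# using the supermarket counts from the example.
transactions = 100            # total transactions
with_bread = 20               # transactions containing bread
with_bread_and_butter = 9     # of those, transactions also containing butter

support = with_bread / transactions * 100              # 20/100 * 100 = 20.0%
confidence = with_bread_and_butter / with_bread * 100  # 9/20 * 100 = 45.0%
print(f"support = {support}%, confidence = {confidence}%")
```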
 Single-dimensional rules
With a single predicate or dimension:
bread => butter
 Multi-dimensional rules
With 2 or more predicates or dimensions:
occupation(IT), age(>22) => buys(laptop)
 Hybrid-dimensional rules
With repetitive predicates or dimensions:
time(5 o'clock), buys(tea) => buys(biscuits)
 Web usage mining
 Banking
 Bio-informatics
 Market basket analysis
 Credit/debit card analysis
 Product clustering
 Catalog design
 Apriori algorithm
 Eclat algorithm
 FP-Growth algorithm
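The Apriori idea can be sketched briefly: count itemsets level by level and keep only those meeting a minimum support, using the Apriori property that every subset of a frequent itemset must itself be frequent. The toy transactions below are illustrative, not from the slides.

```python
# Minimal Apriori-style sketch (first two levels only, illustrative).
from itertools import combinations

transactions = [
    {"bread", "butter"},
    {"bread", "butter", "jam"},
    {"bread", "milk"},
    {"milk", "butter"},
]
min_support = 0.5  # keep itemsets appearing in >= 50% of transactions

def support(itemset):
    """Fraction of transactions containing every item in the itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)

# Level 1: frequent single items.
items = {i for t in transactions for i in t}
frequent = [frozenset({i}) for i in items if support({i}) >= min_support]

# Level 2: candidate pairs built only from frequent single items
# (the Apriori pruning step), then filtered by support again.
pairs = {a | b for a, b in combinations(frequent, 2)}
frequent_pairs = [p for p in pairs if support(p) >= min_support]
print(sorted(sorted(p) for p in frequent_pairs))
```

Eclat and FP-Growth find the same frequent itemsets but avoid repeated candidate scans: Eclat intersects per-item transaction-id lists, and FP-Growth compresses the transactions into an FP-tree.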
Datamining & warehouse
