Evaluating multi-class classification using binary SVMs
(IT-642: Course Project)

Vijay T. Raisinghani (rvijay@it.iitb.ac.in, Roll No: 01429703)
Pradeep Jagannath (pradeep@it.iitb.ac.in, Roll No: 00329010)

Abstract

We study how SVM-based binary classifiers can be used for multi-way classification. We present the results of experiments run on various UCI datasets and the KDD intrusion detection dataset, using the SVMLight package. The methods evaluated are 1-versus-1, 1-versus-many, and Error-Correcting Output Coding (ECOC).


1. Main Objectives
- Evaluate three binary classification schemes (1-vs-1, 1-vs-many, ECOC), with SVMLight [SVM02], on various UCI datasets and the KDD intrusion detection dataset.
- Report accuracy and run time for the various methods.

2. Status and other details
- Fully completed
- Percentage contribution of members:
   - Pradeep Jagannath – 50%
   - Vijay T. Raisinghani – 50%
- Total time spent on the project: ??

3. Major stumbling blocks
- SVM parameter estimation. We referred to [Dua01] and other papers (see the section "Related Work") and had discussions with Shantanu Godbole (Ph.D. student, KR School of IT, IIT Bombay) to "estimate" the required kernel parameters. Still, with no prior estimate of the time required for the various datasets, especially the KDD dataset, we had to abort tests that had been running for days.
- The KDD dataset was tested using only 1% of the data (i.e. 50,000 records). The full dataset has 5 million records. Even 10% of the records took a very large amount of time.

4. Introduction
Many supervised learning tasks can be cast as the problem of assigning elements to a finite set of classes or categories. For example, the goal of optical character recognition (OCR) is to determine the digit value (0..9) from its image. A number of other applications also require such multi-way classification, e.g. text and speech categorisation, natural language processing tasks, and gesture and object recognition in machine vision [All00].

In designing machine learning algorithms, it is often easier to first devise algorithms for distinguishing between only two classes [All00]. Ensemble schemes have been proposed which use binary (two-class) classification algorithms to solve K-class classification problems. Decomposing a K-class classification problem into a number of binary classification problems allows an ensemble scheme to model binary class boundaries with much greater flexibility, at a lower computational cost [Goh01].

Three representative ensemble schemes are one per class (1-vs-many), pairwise coupling (1-vs-1), and error-correcting output coding (ECOC) [Goh01].

1. One per class (OPC). This is also known as "one against others." OPC trains K binary classifiers, each of which separates one class from the other (K - 1) classes. Given a point X to classify, the binary classifier with the largest output determines the class of X.
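
A minimal sketch of the OPC scheme in Python. The routine train_binary is a hypothetical stand-in for any binary learner that returns a real-valued scoring function (in our experiments this role was played by SVMLight); it is not part of any particular library.

```python
# One-per-class (1-vs-many) sketch. `train_binary(points, targets)` is a
# hypothetical stand-in returning a callable that maps a point to a
# real-valued output (e.g. an SVM decision value).

def train_opc(points, labels, train_binary):
    """Train one binary classifier per class: class k versus the rest."""
    classifiers = {}
    for k in set(labels):
        targets = [1 if y == k else -1 for y in labels]  # k vs. others
        classifiers[k] = train_binary(points, targets)
    return classifiers

def classify_opc(classifiers, x):
    """The binary classifier with the largest output determines the class."""
    return max(classifiers, key=lambda k: classifiers[k](x))
```
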
2. Pairwise coupling (PWC). PWC constructs K(K-1)/2 pairwise binary classifiers, one for each pair of classes. The classifying decision is made by aggregating the outputs of the pairwise classifiers.
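
A minimal PWC sketch, aggregating the pairwise outputs by simple majority voting (one of several possible aggregation rules; [Mor98] discusses alternatives). train_binary is the same hypothetical stand-in as above.

```python
# Pairwise coupling (1-vs-1) sketch with majority voting.
from collections import Counter
from itertools import combinations

def train_pwc(points, labels, train_binary):
    """Train K(K-1)/2 classifiers, one per unordered pair of classes."""
    classifiers = {}
    for a, b in combinations(sorted(set(labels)), 2):
        pair_points = [x for x, y in zip(points, labels) if y in (a, b)]
        pair_targets = [1 if y == a else -1 for y in labels if y in (a, b)]
        classifiers[(a, b)] = train_binary(pair_points, pair_targets)
    return classifiers

def classify_pwc(classifiers, x):
    """Each pairwise classifier votes for one of its two classes."""
    votes = Counter()
    for (a, b), clf in classifiers.items():
        votes[a if clf(x) > 0 else b] += 1
    return votes.most_common(1)[0][0]
```
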
3. Error-correcting output coding (ECOC). ECOC was first proposed by Dietterich and Bakiri [Die95] to reduce classification error by exploiting the redundancy of the coding scheme. ECOC employs a set of binary classifiers, with each class assigned a codeword, such that the Hamming distance between each pair of codewords is large enough to enable good error correction.
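
A minimal ECOC sketch: one binary classifier is trained per bit position (column) of the code matrix, and a test point is assigned the class whose codeword is nearest, in Hamming distance, to the received bit string. The example code matrix in the final comment is an arbitrary illustration; our actual codewords came from a BCH code (see section 6). train_binary is the same hypothetical stand-in as above.

```python
# Error-correcting output coding (ECOC) sketch.

def train_ecoc(points, labels, code, train_binary):
    """code maps each class to its codeword (a tuple of 0/1 bits).
    One binary classifier is trained per bit position."""
    n_bits = len(next(iter(code.values())))
    classifiers = []
    for bit in range(n_bits):
        targets = [1 if code[y][bit] == 1 else -1 for y in labels]
        classifiers.append(train_binary(points, targets))
    return classifiers

def classify_ecoc(classifiers, code, x):
    """Decode by minimum Hamming distance to the received code."""
    received = [1 if clf(x) > 0 else 0 for clf in classifiers]
    def hamming(codeword):
        return sum(r != c for r, c in zip(received, codeword))
    return min(code, key=lambda k: hamming(code[k]))

# e.g. code = {"a": (0, 0, 0, 0, 0), "b": (0, 1, 1, 1, 1), "c": (1, 0, 1, 1, 0)}
```
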
5. Related Work
[Die95] discusses the use of the ECOC method versus multi-way classification using decision trees. [Gha00] and [Ren01] present the use of ECOC for improving the performance of Naïve Bayes for text classification. [All00], [Wes99] and [Hsu02] propose extensions to SVMs for multi-way classification. [Goh01] provides details of how to boost the output of binary SVMs for image classification. [Mor98] discusses various methods of combining the outputs of 1-vs-1 classifiers.

6. Implementation details
All our test scripts were shell scripts, which invoked SVMLight. Additionally, for ECOC we used a modified form of bch3.c [Zar02], an encoder/decoder program for BCH codes written in C. We modified the program to encode and decode in parts. The program generates the code matrix based on the input set of classes and accepts the 'received' code from our shell scripts for decoding. We hard-coded the other parameters: code length to 31 bits and error-correcting capability to 15 bits. This limited the data length to a maximum of 6 bits, i.e. we could encode a maximum of 64 classes with these settings. This was sufficient for the data sets we used, which had a maximum of 26 classes.
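
Since our scripts simply drove the SVMLight command-line tools, the sketch below shows equivalent calls written in Python. It assumes svm_learn and svm_classify (from the SVMLight package) are on the PATH and that the data files are already in SVMLight's input format; file names are illustrative.

```python
# Driving SVMLight from a script: train a binary SVM, then classify.
# Assumes the SVMLight binaries svm_learn / svm_classify are on the PATH.
import subprocess

def svmlight_learn(train_file, model_file, gamma, c, q=50, n=40):
    """Train a binary SVM with an RBF kernel (-t 2) via svm_learn."""
    subprocess.run(
        ["svm_learn", "-t", "2", "-g", str(gamma), "-c", str(c),
         "-q", str(q), "-n", str(n), train_file, model_file],
        check=True)

def svmlight_classify(test_file, model_file, predictions_file):
    """Write one real-valued output per test example via svm_classify."""
    subprocess.run(
        ["svm_classify", test_file, model_file, predictions_file],
        check=True)

# e.g. for the stratified KDD runs (Figure 2 settings):
# svmlight_learn("kdd.train", "kdd.model", gamma=0.03, c=10)
```
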

7. Experiments and Results
All our tests were run on cygnus (PIII, 3 processors, 512 MB RAM, running Linux kernel 2.4.17, RedHat 7.1).

We experimented with various kernel settings. For some cases the test did not terminate and had to be aborted; for some settings the accuracy from all the methods was very low. We exclude the results for which the test had to be aborted or the accuracy was low for all the methods.

 Data Set                  No. of Classes       Train records   Test records
 iris                      3                    100             50
 dermatology               6                    244             122
 glass                     7                    142             72
 ecoli                     8                    224             112
 yeast                     10                   989             495
 KDD intrusion detection   23 (in train data)   4,898,431       311,029
                           38 (in test data)    (4.9 million)   (0.3 million)
 letter                    26                   15,000          5,000

                       Figure 1: Data sets

The dermatology dataset had missing values in the age attribute. We substituted these with the most frequent value of age.

For the KDD dataset we did the following:
- Reduced the training data to only 1%, i.e. 50,000 records.
- Scanned this 1% training set for duplicates and found 50% duplicates. After eliminating them, the training data had 23,000 records.
- One training record had 55 features while all others had 41. We eliminated this record, although it may not have contributed to any problems.
- Performed feature selection using "Inducer" [MLC++] and C4.5, selecting 16 features from the original set of 41.
- Stratified the de-duplicated file to a maximum of 50 records per class, giving 689 records, in order to run a simpler and faster test.
- Used these 689 records to train with the RBF kernel, with parameters -g 0.03 -c 10 -q 50 -n 40. (The de-duplication and stratification steps are sketched below.)
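
A minimal sketch of the de-duplication and per-class stratification described above, assuming (as in the KDD data) one comma-separated record per line with the class label in the last field; file names are illustrative.

```python
# De-duplicate a sample file, then keep at most `max_per_class` records
# per class. Assumes comma-separated records, class label in the last field.
from collections import defaultdict

def dedup_and_stratify(in_path, out_path, max_per_class=50):
    seen = set()
    kept = defaultdict(int)
    with open(in_path) as fin, open(out_path, "w") as fout:
        for line in fin:
            record = line.strip()
            if not record or record in seen:
                continue  # skip blanks and exact duplicates
            seen.add(record)
            label = record.split(",")[-1]
            if kept[label] < max_per_class:
                kept[label] += 1
                fout.write(record + "\n")

# e.g. dedup_and_stratify("kdd_1pct.data", "kdd_stratified.data")
```
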

In KDD with FSS (feature subset selection), using only these 689 training records and testing on 10 percent of the test file (29,615 records):
- 5,000 records in the test file had class labels that were non-existent in the training data. These 5,000 records directly contributed to classification errors.
- 1-vs-1: almost all records were classified as class 1 (3.7% accuracy).
- 1-vs-many: 13,687 errors (53.9% accuracy).
- ECOC: 12,489 errors (57.8% accuracy).
 Data Set                  Parameters
 iris                      kernel: poly d=3; other params: c=0.001
 dermatology               kernel: RBF g=0.01; other params: c=10
 glass                     kernel: RBF g=0.8; other params: c=10
 ecoli                     kernel: poly d=3; other params: -
 yeast                     kernel: RBF g=10; other params: c=10
 KDD intrusion detection   kernel: RBF g=0.001; other params: c=1, q=50, n=40
 letter                    kernel: RBF g=0.01; other params: c=1, q=50, n=40

        Figure 2: SVM parameters for the various data sets

 Dataset                      1-vs-1    1-vs-many   ECOC
 iris                         94%       98%         98%
 dermatology                  86.9%     89.3%       90.98%
 glass                        66.7%     68%         73.6%
 ecoli                        80.3%     55.3%       73.2%
 yeast                        53.3%     44.6%       55.8%
 KDD intrusion detection*     5.8%      47.5%       63.7%
 KDD intrusion detection++    3.7%      53.9%       57.8%
 letter                       34%       78.5%       88.38%

        Figure 3: Accuracy with the various methods
 * (1% train data, deduplicated = 23,000; 10% test data = 29,615)
 ++ (0.01% train data, deduplicated = 689; 10% test data = 29,615)

The file-conversion and learning times for each method were as follows:

 Dataset                      1-vs-1             1-vs-many          ECOC
                              Convert   Learn    Convert   Learn    Convert   Learn
 iris                         0.09      0.11     0.06      0.17     0.79      1.21
 dermatology                  0.43      3.1      0.29      1.19     2.1       9.79
 glass                        0.29      1.14     0.13      0.38     0.96      3.45
 ecoli                        0.65      1.22     0.19      0.45     1.16      2.53
 yeast                        1.66      9.87     0.6       14.73    4.29      97.1
 KDD intrusion detection*     75.96     6530     44.37     2980     81.25     13675
 KDD intrusion detection++    6.63      101.9    1.44      19.87    1.2       83.96
 letter                       105.4     1609     33.49     972.7    108       9230

 [Chart: #Classes vs accuracy. Accuracy of 1v1, 1vm and ECOC plotted against the number of classes (3, 6, 7, 8, 10, 23, 23, 26).]
 [Chart: convert vs rows/class. File-conversion time for 1v1, 1vm and ECOC plotted against rows per class.]

 [Chart: learn vs rows/class. Learning time for 1v1, 1vm and ECOC plotted against rows per class.]

References

[All00] E. L. Allwein, R. E. Schapire, and Y. Singer. Reducing multiclass to binary: A unifying approach for margin classifiers. Journal of Machine Learning Research, 1:113-141, 2000.

[Goh01] K. Goh, E. Chang, and K. Cheng. SVM binary classifier ensembles for image classification. CIKM'01, November 5-10, 2001, Atlanta, Georgia, USA.

[Die95] T. G. Dietterich and G. Bakiri. Solving multiclass learning problems via error-correcting output codes. Journal of Artificial Intelligence Research, 2:263-286, 1995.

[Gha00] Rayid Ghani. Using error-correcting codes for text classification. In Proceedings of the Seventeenth International Conference on Machine Learning, 2000.

[Ren01] Jason D. M. Rennie. Improving multi-class text classification with naïve Bayes. Master's thesis, Massachusetts Institute of Technology, 2001.

[Hsu02] C.-W. Hsu and C.-J. Lin. A comparison of methods for multi-class support vector machines. IEEE Transactions on Neural Networks, 13:415-425, 2002.

[Wes99] J. Weston. Extensions to the Support Vector Method. PhD thesis, Royal Holloway University of London, 1999.

[Mor98] M. Moreira and E. Mayoraz. Improving pairwise coupling classification with error correcting classifiers. In Proceedings of the Tenth European Conference on Machine Learning, April 1998.

[SVM02] Thorsten Joachims. SVMLight, http://svmlight.joachims.org/, Cornell University, Department of Computer Science.

[Dua01] Kaibo Duan, S. Sathiya Keerthi, and Aun Neow Poo. ICONIP 2001, 8th International Conference on Neural Information Processing, Shanghai, China, November 14-18, 2001.

[MLC++] Silicon Graphics, Inc. MLC++, http://www.sgi.com/tech/mlc/, 2002.

[Zar02] R. Morelos-Zaragoza. BCH codes - The Error Correcting Codes (ECC) Page, http://www.csl.sony.co.jp/person/morelos/ecc/codes.html, 2002.
