Prepared by: Shraddha Mehta
   Weka was developed at the University
    of Waikato in New Zealand.

   Weka is an open-source data mining tool
    developed in Java. It is used for research,
    education, and applications. It runs
    on Windows, Linux and Mac.
   Main features:
    Comprehensive set of data pre-processing
     tools, learning algorithms and evaluation
     methods
    Graphical user interfaces (incl. data
     visualization)
    Environment for comparing learning
     algorithms
   Weka is a collection of machine
    learning algorithms for data mining
    tasks. The algorithms can either be
    applied directly to a dataset (using
    GUI) or called from your own Java
    code (using Weka Java library).
   Weka contains tools for data pre-processing,
    classification, regression,
    clustering, association rules, and
    visualization. It is also well-suited for
    developing new machine learning
    schemes.
[Diagram: Data Mining by Weka]
 Input: raw data
 Weka: pre-processing, classification, regression, clustering,
  association rules, visualization
 Output: result
   There are two main ways to use Weka for your
    data mining tasks.
     Use the Weka graphical user interfaces (GUI)
       The GUI is straightforward and easy to use, but it is
        not flexible: it cannot be called from your
        own application.
 Import the Weka Java library into your own Java
 application.
  Developers can use the Weka Java library
   to develop software, or modify the source code
   to meet special requirements. This is more
   flexible and advanced, but it is not as easy to
   use as the GUI.
   Tools (or functions) in Weka include:

     Data preprocessing (e.g., Data Filters),
     Classification (e.g., BayesNet, KNN, C4.5 Decision Tree,
      Neural Networks, SVM),
     Regression (e.g., Linear Regression, Isotonic Regression, SVM
      for Regression),
     Clustering (e.g., Simple K-means, Expectation Maximization
      (EM)),
     Association rules (e.g., Apriori Algorithm, Predictive Accuracy,
      Confirmation Guided),
     Feature Selection (e.g., Cfs Subset Evaluation, Information Gain,
      Chi-squared Statistic), and
     Visualization (e.g., View different two-dimensional plots of the
      data).
   Weka Data File Format (Input)
   Weka for Data Mining
   Sample Output from Weka (Output)
 The most popular input data format for Weka is ARFF (input data
  files use the “.arff” extension).
FILE FORMAT
@relation RELATION_NAME

@attribute ATTRIBUTE_NAME ATTRIBUTE_TYPE
@attribute ATTRIBUTE_NAME ATTRIBUTE_TYPE
@attribute ATTRIBUTE_NAME ATTRIBUTE_TYPE
@attribute ATTRIBUTE_NAME ATTRIBUTE_TYPE

@data
DATAROW1
DATAROW2
DATAROW3
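
As a rough illustration of the file layout above, the following sketch reads the three sections (@relation, @attribute, @data) from a string. It is a minimal stand-in written in plain Java, not Weka's own loader. The class name ArffSketch and the sample weather data are made up for this example, and the sketch ignores quoting, sparse rows and missing values.

```java
import java.util.ArrayList;
import java.util.List;

/** Minimal, illustrative ARFF reader. Hypothetical class, NOT part of Weka. */
class ArffSketch {
    String relation;                                        // name given after @relation
    final List<String> attributeNames = new ArrayList<>();  // one entry per @attribute line
    final List<String> attributeTypes = new ArrayList<>();  // e.g. "numeric" or "{a, b}"
    final List<String[]> rows = new ArrayList<>();          // comma-separated data rows

    static ArffSketch parse(String text) {
        ArffSketch arff = new ArffSketch();
        boolean inData = false;
        for (String raw : text.split("\\R")) {
            String line = raw.trim();
            if (line.isEmpty() || line.startsWith("%")) continue;  // skip blanks and comments
            String lower = line.toLowerCase();                     // keywords are case-insensitive
            if (inData) {
                String[] values = line.split(",");
                for (int i = 0; i < values.length; i++) values[i] = values[i].trim();
                arff.rows.add(values);
            } else if (lower.startsWith("@relation")) {
                arff.relation = line.substring("@relation".length()).trim();
            } else if (lower.startsWith("@attribute")) {
                // Split into name and type; the type may itself contain spaces.
                String[] parts = line.substring("@attribute".length()).trim().split("\\s+", 2);
                arff.attributeNames.add(parts[0]);
                arff.attributeTypes.add(parts.length > 1 ? parts[1] : "");
            } else if (lower.startsWith("@data")) {
                inData = true;                                     // everything after this is data
            }
        }
        return arff;
    }

    public static void main(String[] args) {
        // Made-up sample file content, for illustration only.
        String sample = String.join("\n",
            "@relation weather",
            "@attribute outlook {sunny, rainy}",
            "@attribute temperature numeric",
            "@data",
            "sunny,85",
            "rainy,70");
        ArffSketch arff = parse(sample);
        System.out.println("Relation:   " + arff.relation);
        System.out.println("Attributes: " + arff.attributeNames);
        System.out.println("Data rows:  " + arff.rows.size());
    }
}
```

In practice the Weka library or GUI reads ARFF files for you; the sketch only shows how the header and data sections relate.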
[Screenshot: Weka GUI. Callouts: different analysis tools/functions;
 different attributes to choose; the value set of the chosen attribute
 and the number of input items with each value]

[Screenshot: Weka GUI, classification algorithms]
   Three sets of classes you may need to use when
    developing your own application
    Classes for Loading Data
    Classes for Classifiers
    Classes for Evaluation
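
The three roles listed above (loading data, a classifier, evaluation) can be sketched in plain Java as follows. Every name here (PipelineSketch, Example, load, classify, leaveOneOutAccuracy) is hypothetical and the toy data is made up; none of it is Weka's API. In Weka itself these roles are filled by library classes such as weka.core.Instances, weka.classifiers.Classifier and weka.classifiers.Evaluation, so check the Weka API documentation for actual usage. The classifier shown is a simple 1-nearest-neighbour rule, chosen only because it is short.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative stand-ins for the three roles. Hypothetical names, not Weka classes. */
class PipelineSketch {

    /** One labelled example: numeric features plus a class label. */
    static final class Example {
        final double[] features;
        final String label;
        Example(double[] features, String label) {
            this.features = features;
            this.label = label;
        }
    }

    /** Role 1, loading data: parse rows of the form "f1,f2,...,label". */
    static List<Example> load(String csvRows) {
        List<Example> data = new ArrayList<>();
        for (String line : csvRows.split("\\R")) {
            if (line.trim().isEmpty()) continue;
            String[] cells = line.trim().split(",");
            double[] x = new double[cells.length - 1];
            for (int i = 0; i < x.length; i++) x[i] = Double.parseDouble(cells[i].trim());
            data.add(new Example(x, cells[cells.length - 1].trim()));
        }
        return data;
    }

    /** Role 2, classifier: 1-nearest-neighbour by squared Euclidean distance. */
    static String classify(List<Example> training, double[] query) {
        String best = null;
        double bestDist = Double.POSITIVE_INFINITY;
        for (Example e : training) {
            double d = 0;
            for (int i = 0; i < query.length; i++) {
                double diff = e.features[i] - query[i];
                d += diff * diff;
            }
            if (d < bestDist) { bestDist = d; best = e.label; }
        }
        return best;
    }

    /** Role 3, evaluation: leave-one-out accuracy of the classifier on the data. */
    static double leaveOneOutAccuracy(List<Example> data) {
        int correct = 0;
        for (int i = 0; i < data.size(); i++) {
            List<Example> training = new ArrayList<>(data);
            Example heldOut = training.remove(i);   // hold out example i, train on the rest
            if (heldOut.label.equals(classify(training, heldOut.features))) correct++;
        }
        return (double) correct / data.size();
    }

    public static void main(String[] args) {
        // Made-up toy data: two well-separated groups labelled "a" and "b".
        List<Example> data = load(
            "1.0,1.0,a\n1.2,0.9,a\n0.8,1.1,a\n5.0,5.0,b\n5.1,4.8,b\n4.9,5.2,b");
        System.out.println("Leave-one-out accuracy: " + leaveOneOutAccuracy(data));
    }
}
```

Keeping the three roles separate means a different classifier or evaluation scheme can be swapped in without touching the loading code, which is the same separation the three sets of classes provide.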
   In sum, the overall goal of Weka is to build a
    state-of-the-art facility for developing machine
    learning (ML) techniques and to allow people to
    apply them to real-world data mining problems.
Thank you
