WEKA
A. Antony Alex, MCA
Dr G R D College of Science – CBE
Tamil Nadu - India
Waikato Environment for Knowledge Analysis
A collection of open source ML algorithms
◦ pre-processing
◦ classifiers
◦ clustering
◦ association rules
It’s a data mining/machine learning tool developed by the
Department of Computer Science, University of Waikato,
New Zealand.
Weka is also a bird found only on the islands of New
Zealand.
Java based
Routines are implemented as classes and logically
arranged in packages
Comes with an extensive GUI
Download and Install WEKA
Website:
http://www.cs.waikato.ac.nz/~ml/weka/index.html
Supports multiple platforms (written in Java):
◦ Windows, Mac OS X and Linux
9/8/2012
Main Features
49 data preprocessing tools
76 classification/regression algorithms
8 clustering algorithms
3 algorithms for finding association rules
15 attribute/subset evaluators + 10 search algorithms for feature selection
Command line interface
• Dataset
• Classifier
• weka.filters
• weka.classifiers
java weka.core.converters.CSVLoader data.csv > data.arff
java weka.core.converters.C45Loader c45_filestem > data.arff
java weka.classifiers.rules.ZeroR -t weather.arff
java weka.classifiers.trees.J48 -t weather.arff
java weka.filters.supervised.attribute.Discretize -i data/iris.arff -o iris-nom.arff -c last
java weka.filters.supervised.attribute.Discretize -i data/cpu.arff -o cpu-classvendor-nom.arff -c first
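The ZeroR scheme invoked above is a baseline: it ignores every attribute and always predicts the most frequent class. The plain-Java sketch below illustrates that idea only. It does not use the WEKA API, and the class name ZeroRSketch and the sample labels are assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ZeroRSketch {
    // Returns the most frequent label, or null for an empty list.
    // On a tie, the label that reaches the top count first wins.
    public static String majorityClass(List<String> labels) {
        Map<String, Integer> counts = new HashMap<>();
        String best = null;
        int bestCount = 0;
        for (String label : labels) {
            int count = counts.merge(label, 1, Integer::sum);
            if (count > bestCount) {
                bestCount = count;
                best = label;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Hypothetical class column; a real run would read it from an ARFF file.
        List<String> labels = List.of("yes", "yes", "no", "yes", "no");
        System.out.println("ZeroR always predicts: " + majorityClass(labels));
    }
}
```

Any real classifier should beat this baseline; if it does not, the attributes carry little usable information about the class.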
Main GUI
Three graphical user interfaces
◦ “The Explorer” (exploratory data analysis)
◦ “The Experimenter” (experimental environment)
◦ “The KnowledgeFlow” (new process model inspired interface)
Explorer: pre-processing the data
Data can be imported from a file in various formats: ARFF, CSV, C4.5, binary
Data can also be read from a URL or from an SQL database (using JDBC)
Pre-processing tools in WEKA are called “filters”
WEKA contains filters for:
◦ Discretization, normalization, resampling, attribute selection, transforming and combining attributes, …
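To make two of the filter types listed above concrete, here is a small plain-Java sketch of min-max normalization and equal-width discretization. It is not WEKA code. Equal-width binning is swapped in for brevity as one simple unsupervised strategy and is not claimed to match any particular WEKA filter. The class name FilterSketch and the method names are assumptions.

```java
import java.util.Arrays;

public class FilterSketch {
    // Rescales values linearly to [0, 1]; a constant column maps to all zeros.
    public static double[] normalize(double[] values) {
        double min = Arrays.stream(values).min().orElse(0.0);
        double max = Arrays.stream(values).max().orElse(0.0);
        double range = max - min;
        double[] out = new double[values.length];
        for (int i = 0; i < values.length; i++) {
            out[i] = range == 0.0 ? 0.0 : (values[i] - min) / range;
        }
        return out;
    }

    // Equal-width binning: maps each value to a bin index in [0, bins - 1].
    public static int[] discretize(double[] values, int bins) {
        double[] scaled = normalize(values);
        int[] out = new int[values.length];
        for (int i = 0; i < values.length; i++) {
            // The maximum value scales to exactly 1.0, so clamp it into the last bin.
            out[i] = Math.min(bins - 1, (int) (scaled[i] * bins));
        }
        return out;
    }

    public static void main(String[] args) {
        double[] cholesterol = {233, 286, 229}; // illustrative numeric column
        System.out.println(Arrays.toString(normalize(cholesterol)));
        System.out.println(Arrays.toString(discretize(cholesterol, 3))); // prints [0, 2, 0]
    }
}
```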
ACCESSING DATABASE
jdbcDriver
jdbcURL
DatabaseUtils.props.hsql - HSQLDB
DatabaseUtils.props.msaccess - MS Access
DatabaseUtils.props.mssqlserver - MS SQL Server
DatabaseUtils.props.mysql - MySQL
DatabaseUtils.props.odbc - ODBC access via ODBC/JDBC bridge
DatabaseUtils.props.oracle - Oracle 10g
DatabaseUtils.props.postgresql - PostgreSQL 7.4
DatabaseUtils.props.sqlite3 - sqlite 3.x
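The files listed above are ordinary Java properties files, and jdbcDriver and jdbcURL are the two keys named on the slide. The sketch below only shows how such key/value text can be read with the standard java.util.Properties class. The driver class and URL values are made-up placeholders, not settings for any real database, and the class name JdbcPropsSketch is an assumption.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

public class JdbcPropsSketch {
    // Parses properties-file text into a Properties object.
    public static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        // Placeholder values (assumptions); substitute the driver class and
        // URL documented for the database actually in use.
        String text =
            "jdbcDriver=org.example.jdbc.Driver\n"
          + "jdbcURL=jdbc:example://localhost/some_database\n";
        Properties props = parse(text);
        System.out.println("driver: " + props.getProperty("jdbcDriver"));
        System.out.println("url:    " + props.getProperty("jdbcURL"));
    }
}
```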
WEKA “flat” files
@relation heart-disease-simplified
@attribute age numeric
@attribute sex { female, male}
@attribute chest_pain_type { typ_angina, asympt, non_anginal, atyp_angina}
@attribute cholesterol numeric
@attribute exercise_induced_angina { no, yes}
@attribute class { present, not_present}
@data
63,male,typ_angina,233,no,not_present
67,male,asympt,286,yes,present
67,male,asympt,229,yes,present
38,female,non_anginal,?,no,not_present
...
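The layout of the ARFF file above can be read mechanically: @attribute lines declare the columns, everything after @data is one comma-separated instance per line, and ? marks a missing value. The plain-Java sketch below illustrates that structure. It is deliberately simplified (no quoted names, no sparse format), it is not WEKA's own loader, and the class name ArffSketch is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

public class ArffSketch {
    // Returns the attribute names in declaration order.
    public static List<String> attributeNames(String arff) {
        List<String> names = new ArrayList<>();
        for (String line : arff.split("\n")) {
            String trimmed = line.trim();
            if (trimmed.toLowerCase().startsWith("@attribute")) {
                // Split into keyword, name, and type; the name is the second token.
                String[] parts = trimmed.split("\\s+", 3);
                if (parts.length >= 2) {
                    names.add(parts[1]);
                }
            }
        }
        return names;
    }

    // Counts the '?' cells (missing values) in the @data section.
    public static int countMissing(String arff) {
        boolean inData = false;
        int missing = 0;
        for (String line : arff.split("\n")) {
            String trimmed = line.trim();
            if (trimmed.isEmpty() || trimmed.startsWith("%")) {
                continue; // skip blank lines and % comments
            }
            if (trimmed.equalsIgnoreCase("@data")) {
                inData = true;
                continue;
            }
            if (!inData) {
                continue;
            }
            for (String cell : trimmed.split(",")) {
                if (cell.trim().equals("?")) {
                    missing++;
                }
            }
        }
        return missing;
    }
}
```

Applied to the heart-disease file above, attributeNames would return the six declared names, and countMissing would count the single ? in the cholesterol column among the rows shown.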
Explorer: building “classifiers”
Classifiers in WEKA are models for predicting
nominal or numeric quantities
Implemented learning schemes include:
◦ Decision trees and lists, instance-based classifiers,
support vector machines, multi-layer perceptrons,
logistic regression, Bayes’ nets, …
“Meta”-classifiers include:
◦ Bagging, boosting, stacking, error-correcting output
codes, locally weighted learning, …
Explorer: clustering data
WEKA contains “clusterers” for finding
groups of similar instances in a dataset
Implemented schemes are:
◦ k-Means, EM, Cobweb, X-means, FarthestFirst
Clusters can be visualized and compared to
“true” clusters
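As an illustration of the first scheme in that list, the following plain-Java sketch runs k-means (Lloyd's iteration) on one-dimensional points. It is not WEKA's implementation: the starting centroids are passed in explicitly so the run is deterministic, and the class name KMeansSketch and the sample points are assumptions.

```java
import java.util.Arrays;

public class KMeansSketch {
    // Index of the centroid closest to point p (lowest index wins ties).
    public static int nearest(double[] centroids, double p) {
        int best = 0;
        for (int c = 1; c < centroids.length; c++) {
            if (Math.abs(p - centroids[c]) < Math.abs(p - centroids[best])) {
                best = c;
            }
        }
        return best;
    }

    // Alternates assignment and centroid update until nothing changes
    // or maxIterations is reached; returns the final centroids.
    public static double[] cluster(double[] points, double[] initial, int maxIterations) {
        double[] centroids = initial.clone();
        for (int iter = 0; iter < maxIterations; iter++) {
            double[] sums = new double[centroids.length];
            int[] counts = new int[centroids.length];
            for (double p : points) {
                int c = nearest(centroids, p);
                sums[c] += p;
                counts[c]++;
            }
            boolean changed = false;
            for (int c = 0; c < centroids.length; c++) {
                if (counts[c] == 0) {
                    continue; // an empty cluster keeps its previous centroid
                }
                double updated = sums[c] / counts[c];
                if (updated != centroids[c]) {
                    centroids[c] = updated;
                    changed = true;
                }
            }
            if (!changed) {
                break;
            }
        }
        return centroids;
    }

    public static void main(String[] args) {
        double[] points = {1, 2, 3, 10, 11, 12}; // hypothetical data
        double[] result = cluster(points, new double[] {0, 5}, 100);
        System.out.println(Arrays.toString(result)); // prints [2.0, 11.0]
    }
}
```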
Explorer: finding associations
WEKA contains an implementation of the
Apriori algorithm for learning association rules
◦ Works only with discrete data
Can identify statistical dependencies between
groups of attributes:
◦ milk, butter ⇒ bread, eggs (with confidence 0.9 and
support 2000)
Apriori can compute all rules that have a given
minimum support and exceed a given
confidence
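The two numbers attached to the rule above can be computed directly. Support is taken here as the number of transactions that contain every item in the rule, and confidence as the fraction of transactions containing the left-hand side that also contain the right-hand side. The plain-Java sketch below computes both; it is not the Apriori search itself or WEKA code, and the class name RuleMetricsSketch and the sample transactions are assumptions.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class RuleMetricsSketch {
    // Number of transactions that contain every item in the given item set.
    public static int support(List<Set<String>> transactions, Set<String> items) {
        int count = 0;
        for (Set<String> transaction : transactions) {
            if (transaction.containsAll(items)) {
                count++;
            }
        }
        return count;
    }

    // confidence(A => B) = support(A and B together) / support(A);
    // defined as 0 here when A never occurs.
    public static double confidence(List<Set<String>> transactions,
                                    Set<String> antecedent,
                                    Set<String> consequent) {
        Set<String> both = new HashSet<>(antecedent);
        both.addAll(consequent);
        int antecedentSupport = support(transactions, antecedent);
        if (antecedentSupport == 0) {
            return 0.0;
        }
        return (double) support(transactions, both) / antecedentSupport;
    }

    public static void main(String[] args) {
        // Hypothetical shopping baskets.
        List<Set<String>> baskets = List.of(
            Set.of("milk", "butter", "bread", "eggs"),
            Set.of("milk", "butter"),
            Set.of("bread"));
        Set<String> lhs = Set.of("milk", "butter");
        Set<String> rhs = Set.of("bread", "eggs");
        System.out.println("support    = " + support(baskets, lhs)); // prints support    = 2
        System.out.println("confidence = " + confidence(baskets, lhs, rhs)); // prints confidence = 0.5
    }
}
```

Apriori itself adds the search on top of these measures: it grows item sets level by level and discards any candidate whose support falls below the minimum.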
Explorer: attribute selection
Panel that can be used to investigate which
(subsets of) attributes are the most predictive
ones
Attribute selection methods contain two parts:
◦ A search method: best-first, forward selection,
random, exhaustive, genetic algorithm, ranking
◦ An evaluation method: correlation-based, wrapper,
information gain, chi-squared, …
Very flexible: WEKA allows (almost) arbitrary
combinations of these two
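Information gain, one of the evaluation methods named above, scores a nominal attribute by how much knowing its value reduces the entropy of the class. The plain-Java sketch below computes it for a single attribute column. It is not WEKA's evaluator, and the class name InfoGainSketch and the toy columns are assumptions.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class InfoGainSketch {
    // Shannon entropy (in bits) of a list of class labels.
    public static double entropy(List<String> labels) {
        Map<String, Integer> counts = new HashMap<>();
        for (String label : labels) {
            counts.merge(label, 1, Integer::sum);
        }
        double h = 0.0;
        for (int count : counts.values()) {
            double p = (double) count / labels.size();
            h -= p * Math.log(p) / Math.log(2);
        }
        return h;
    }

    // Class entropy minus the size-weighted entropy of the partitions
    // induced by the attribute's values.
    public static double infoGain(List<String> attribute, List<String> labels) {
        Map<String, List<String>> partitions = new HashMap<>();
        for (int i = 0; i < labels.size(); i++) {
            partitions.computeIfAbsent(attribute.get(i), k -> new ArrayList<>())
                      .add(labels.get(i));
        }
        double remainder = 0.0;
        for (List<String> part : partitions.values()) {
            remainder += (double) part.size() / labels.size() * entropy(part);
        }
        return entropy(labels) - remainder;
    }

    public static void main(String[] args) {
        // Toy columns: the first attribute separates the classes perfectly,
        // the second carries no information about them.
        List<String> labels = List.of("present", "present", "not_present", "not_present");
        List<String> informative = List.of("yes", "yes", "no", "no");
        List<String> useless = List.of("male", "female", "male", "female");
        System.out.println(infoGain(informative, labels));
        System.out.println(infoGain(useless, labels));
    }
}
```

A ranking search method would simply compute this score for every attribute and sort them from highest to lowest.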
Explorer: data visualization
Visualization very useful in practice: e.g. helps
to determine difficulty of the learning
problem
WEKA can visualize single attributes (1-d)
and pairs of attributes (2-d)
◦ To do: rotating 3-d visualizations (Xgobi-style)
Color-coded class values
“Jitter” option to deal with nominal
attributes (and to detect “hidden” data
points)
“Zoom-in” function
Thank you
