SlideShare a Scribd company logo
1 of 9
Download to read offline
IDP in Educational Technology, 2016.
Exploring WEKA Tool using Data Mining Concepts, is licensed under the Creative
Commons Attribution-ShareAlike 4.0 International License. You are free to use, distribute
and modify it, including for commercial purposes, provided you acknowledge the source and
share-alike.
To view a copy of this license, visit http://creativecommons.org/licenses/by-sa/4.0/
Exploring WEKA Tool using Data
Mining Concepts
Work done as part of AICTE approved FDP on Use of ICT in
Education for Online and Blended Learning
RC1352_004
Team Leader: V.P.Hara Gopal
Team Members: Brahmananda Reddy A
Talluri Sunil Kumar
Sagar Yeruva
E x p l o r i n g W E K A T o o l U s i n g D a t a M i n i n g C o n c e p t s Page 2
TABLE OF CONTENTS
1 INTRODUCTION 3
2 EXPLORING THE EXPLORER 4
3 EXPLORING DATA SETS 6
4 BUILDING A CLASSIFIER 6
5 USING A FILTER 7
6 VISUALIZING YOUR DATA 8
7 REFERENCES 8
8 CONSOLIDATED LOG OF TEAM WORK 9
E x p l o r i n g W E K A T o o l U s i n g D a t a M i n i n g C o n c e p t s Page 3
RC1352_004
V.P.Hara Gopal
Brahmananda Reddy A
Talluri Sunil Kumar
Sagar Yeruva
EXPLORING WEKA USING DATA MINING CONCEPTS
1. INTRODUCTION
Data Mining:
Data mining is about going from data to information, information that can give
you useful predictions.
Weka?
A bird found only in New Zealand.
Data mining workbench
Waikato Environment for Knowledge Analysis(WEKA).
What will you learn?
 Load data into Weka and look at it
 Use filters to preprocess it
 Explore it using interactive visualization
 Apply classification algorithms
 Interpret the output
 Understand evaluation methods and their implications
 Understand various representations for models
 Explain how popular machine learning algorithms work
 Be aware of common pitfalls with data mining
Use Weka on your own data… and understand what you are doing!
Getting Started with WEKA
 Install Weka
 Explore the “Explorer” interface
 Explore some datasets
 Build a classifier
 Interpret the output
 Use filters
 Visualize your data set
E x p l o r i n g W E K A T o o l U s i n g D a t a M i n i n g C o n c e p t s Page 4
2. EXPLORING THE EXPLORER
E x p l o r i n g W E K A T o o l U s i n g D a t a M i n i n g C o n c e p t s Page 5
 Install Weka
 Get datasets
 Open Explorer
 Open a dataset (weather.nominal.arff)
 Look at attributes and their values
 Edit the dataset
 Save it
E x p l o r i n g W E K A T o o l U s i n g D a t a M i n i n g C o n c e p t s Page 6
3. EXPLORING DATA SETS
 The classification problem
 weather.nominal, weather.numeric
 Nominal vs numeric attributes
 ARFF file format
 glass.arff dataset
 Sanity checking attributes
4. BUILDING A CLASSIFIER
Use J48 to analyze the glass dataset
 Open file glass.arff (or leave it open from the last lesson)
 Check the available classifiers
 Choose the J48 decision tree learner (trees>J48)
 Run it
 Examine the output
 Look at the correctly classified instances … and the confusion matrix
Investigate J48
 Open the configuration panel
 Check the More information
 Examine the options
 Use an unpruned tree
 Look at leaf sizes
 Set minNumObj to 15 to avoid small leaves
 Visualize tree using right‐click menu
From C4.5 to J48
 ID3 (1979)
 C4.5 (1993)
 C4.8 (1996?) J48
 C5.0 (commercial)
 Classifiers in Weka
 Classifying the glass dataset
 Interpreting J48 output
 J48 configuration panel
 … option: pruned vs unpruned trees
 … option: avoid small leaves
 J48 ~ C4.5
E x p l o r i n g W E K A T o o l U s i n g D a t a M i n i n g C o n c e p t s Page 7
5. USING A FILTER
Use a filter to remove an attribute
 Open weather.nominal.arff (again!)
 Check the filters
o supervised vs unsupervised
o attribute vs instance
 Choose the unsupervised attribute filter Remove
 Check the More information; look at the options
 Set attributeIndices to 3 and click OK
 Apply the filter
 Recall that you can Save the result
 Press Undo
Remove instances where humidity is high
 Supervised or unsupervised?
 Attribute or instance?
 Look at them
 Select RemoveWithValues
 Set attributeIndex
 Set nominalIndices
 Apply
 Undo
Fewer attributes, better classification!
 Open glass.arff
 Run J48 (trees>J48)
 Remove Fe
 Remove all attributes except RI and MG
 Look at the decision trees
 Use right‐click menu to visualize decision trees
 Filters in Weka
 Supervised vs unsupervised, attribute vs instance
 To find the right one, you need to look!
 Filters can be very powerful
 Judiciously removing attributes can
– improve performance
– increase comprehensibility
E x p l o r i n g W E K A T o o l U s i n g D a t a M i n i n g C o n c e p t s Page 8
6. VISUALIZING YOUR DATA
Open Visualize Panel
 Open iris.arff
 Bring up Visualize panel
 Click one of the plots; examine some instances
 Set x axis to petalwidth and y axis to petallength
 Click on Class colour to change the colour
 Bars on the right change correspond to attributes: click for x axis; right‐click for
y axis
 Jitter slider
 Show Select Instance: Rectangle option
 Submit, Reset, Clear and Save
Visualizing Classification Errors
 Run J48 (trees>J48)
 Visualize classifier errors (from Results list)
 Plot predictedclass against class
 Identify errors shown by confusion matrix
 Get down and dirty with your data
 Visualize it
 Clean it up by deleting outliers
 Look at classification errors
o (there’s a filter that allows you to add classifications as a new attribute)
7. REFERENCES
1. Data Mining: Practical machine learning tools and techniques, by Ian H. Witten,
Eibe Frank and Mark A. Hall. Morgan Kaufmann, 2011.
E x p l o r i n g W E K A T o o l U s i n g D a t a M i n i n g C o n c e p t s Page 9
8. CONSOLIDATED LOG OF TEAM WORK
Activity
Period of
Activity
Team Member
Amount of Time
Spent (in min.)
Discussion 26-06-2016
V.P.Hara Gopal 50
Brahmananda Reddy A 50
Talluri Sunil Kumar 50
Sagar Yeruva 50
Tool Exploration
(Documentation
and Weka Tool)
26-06-2016
&
01-06-2016
V.P.Hara Gopal 45
Brahmananda Reddy A 45
Talluri Sunil Kumar 90
Sagar Yeruva 90
OER Creation
06-07-2016
to
09-07-2016
V.P.Hara Gopal 40
Brahmananda Reddy A 40
Talluri Sunil Kumar 90
Sagar Yeruva 90
OER
Documentation
12-07-2016
to
18-07-2016
V.P.Hara Gopal 90
Brahmananda Reddy A 90
Talluri Sunil Kumar 25
Sagar Yeruva 25
Individual
Reflection (Diary
Logging)
20-07-2016
V.P.Hara Gopal 15
Brahmananda Reddy A 10
Talluri Sunil Kumar 10
Sagar Yeruva 10
OER Evaluation
(Self-through
meeting)
21-07-2016
&
22-07-2016
V.P.Hara Gopal 45
Brahmananda Reddy A 45
Talluri Sunil Kumar 45
Sagar Yeruva 45

More Related Content

Similar to 1352 004 oer submission

data mining with weka application
data mining with weka applicationdata mining with weka application
data mining with weka applicationRezapourabbas
 
Data mining techniques using weka
Data mining techniques using wekaData mining techniques using weka
Data mining techniques using wekarathorenitin87
 
TAO Fayan_ Introduction to WEKA
TAO Fayan_ Introduction to WEKATAO Fayan_ Introduction to WEKA
TAO Fayan_ Introduction to WEKAFayan TAO
 
Introduction to weka
Introduction to wekaIntroduction to weka
Introduction to wekaJK Knowledge
 
Important SAS Tips and Tricks for A Grade
Important SAS Tips and Tricks for A GradeImportant SAS Tips and Tricks for A Grade
Important SAS Tips and Tricks for A GradeLesa Cote
 
Data Mining Lab Manual.docx
Data Mining Lab Manual.docxData Mining Lab Manual.docx
Data Mining Lab Manual.docxpadmavathi95
 
B4UConference_machine learning_deeplearning
B4UConference_machine learning_deeplearningB4UConference_machine learning_deeplearning
B4UConference_machine learning_deeplearningHoa Le
 
Results may vary: Collaborations Workshop, Oxford 2014
Results may vary: Collaborations Workshop, Oxford 2014Results may vary: Collaborations Workshop, Oxford 2014
Results may vary: Collaborations Workshop, Oxford 2014Carole Goble
 
SURVEY ON CLASSIFICATION ALGORITHMS USING BIG DATASET
SURVEY ON CLASSIFICATION ALGORITHMS USING BIG DATASETSURVEY ON CLASSIFICATION ALGORITHMS USING BIG DATASET
SURVEY ON CLASSIFICATION ALGORITHMS USING BIG DATASETEditor IJMTER
 
lab #6
lab #6lab #6
lab #6butest
 
weka-190429184259.pdf
weka-190429184259.pdfweka-190429184259.pdf
weka-190429184259.pdfTeamRebel1
 
Weka presentation
Weka presentationWeka presentation
Weka presentationAbrar ali
 
Analysis using r
Analysis using rAnalysis using r
Analysis using rPriya Mohan
 
AI and ML Skills for the Testing World Tutorial
AI and ML Skills for the Testing World TutorialAI and ML Skills for the Testing World Tutorial
AI and ML Skills for the Testing World TutorialTariq King
 
Machine Learning with WEKA
Machine Learning with WEKAMachine Learning with WEKA
Machine Learning with WEKAbutest
 
Root cause of community problem for this discussion, you will i
Root cause of community problem for this discussion, you will iRoot cause of community problem for this discussion, you will i
Root cause of community problem for this discussion, you will issusere73ce3
 
Sharing massive data analysis: from provenance to linked experiment reports
Sharing massive data analysis: from provenance to linked experiment reportsSharing massive data analysis: from provenance to linked experiment reports
Sharing massive data analysis: from provenance to linked experiment reportsGaignard Alban
 

Similar to 1352 004 oer submission (20)

data mining with weka application
data mining with weka applicationdata mining with weka application
data mining with weka application
 
Data mining techniques using weka
Data mining techniques using wekaData mining techniques using weka
Data mining techniques using weka
 
TAO Fayan_ Introduction to WEKA
TAO Fayan_ Introduction to WEKATAO Fayan_ Introduction to WEKA
TAO Fayan_ Introduction to WEKA
 
Acutate erd pro
Acutate erd proAcutate erd pro
Acutate erd pro
 
Introduction to weka
Introduction to wekaIntroduction to weka
Introduction to weka
 
Important SAS Tips and Tricks for A Grade
Important SAS Tips and Tricks for A GradeImportant SAS Tips and Tricks for A Grade
Important SAS Tips and Tricks for A Grade
 
Wek1
Wek1Wek1
Wek1
 
G046024851
G046024851G046024851
G046024851
 
Data Mining Lab Manual.docx
Data Mining Lab Manual.docxData Mining Lab Manual.docx
Data Mining Lab Manual.docx
 
B4UConference_machine learning_deeplearning
B4UConference_machine learning_deeplearningB4UConference_machine learning_deeplearning
B4UConference_machine learning_deeplearning
 
Results may vary: Collaborations Workshop, Oxford 2014
Results may vary: Collaborations Workshop, Oxford 2014Results may vary: Collaborations Workshop, Oxford 2014
Results may vary: Collaborations Workshop, Oxford 2014
 
SURVEY ON CLASSIFICATION ALGORITHMS USING BIG DATASET
SURVEY ON CLASSIFICATION ALGORITHMS USING BIG DATASETSURVEY ON CLASSIFICATION ALGORITHMS USING BIG DATASET
SURVEY ON CLASSIFICATION ALGORITHMS USING BIG DATASET
 
lab #6
lab #6lab #6
lab #6
 
weka-190429184259.pdf
weka-190429184259.pdfweka-190429184259.pdf
weka-190429184259.pdf
 
Weka presentation
Weka presentationWeka presentation
Weka presentation
 
Analysis using r
Analysis using rAnalysis using r
Analysis using r
 
AI and ML Skills for the Testing World Tutorial
AI and ML Skills for the Testing World TutorialAI and ML Skills for the Testing World Tutorial
AI and ML Skills for the Testing World Tutorial
 
Machine Learning with WEKA
Machine Learning with WEKAMachine Learning with WEKA
Machine Learning with WEKA
 
Root cause of community problem for this discussion, you will i
Root cause of community problem for this discussion, you will iRoot cause of community problem for this discussion, you will i
Root cause of community problem for this discussion, you will i
 
Sharing massive data analysis: from provenance to linked experiment reports
Sharing massive data analysis: from provenance to linked experiment reportsSharing massive data analysis: from provenance to linked experiment reports
Sharing massive data analysis: from provenance to linked experiment reports
 

Recently uploaded

From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...Florian Roscheck
 
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024thyngster
 
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档208367051
 
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /WhatsappsBeautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsappssapnasaifi408
 
Defining Constituents, Data Vizzes and Telling a Data Story
Defining Constituents, Data Vizzes and Telling a Data StoryDefining Constituents, Data Vizzes and Telling a Data Story
Defining Constituents, Data Vizzes and Telling a Data StoryJeremy Anderson
 
RadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdfRadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdfgstagge
 
How we prevented account sharing with MFA
How we prevented account sharing with MFAHow we prevented account sharing with MFA
How we prevented account sharing with MFAAndrei Kaleshka
 
Advanced Machine Learning for Business Professionals
Advanced Machine Learning for Business ProfessionalsAdvanced Machine Learning for Business Professionals
Advanced Machine Learning for Business ProfessionalsVICTOR MAESTRE RAMIREZ
 
Call Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts ServiceCall Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts ServiceSapana Sha
 
Top 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In QueensTop 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In Queensdataanalyticsqueen03
 
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一fhwihughh
 
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdfKantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdfSocial Samosa
 
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...Jack DiGiovanna
 
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPramod Kumar Srivastava
 
2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING
2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING
2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSINGmarianagonzalez07
 
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort servicejennyeacort
 
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样vhwb25kk
 
20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdfHuman37
 
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptxEMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptxthyngster
 

Recently uploaded (20)

From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...
 
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
 
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
 
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /WhatsappsBeautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsapps
 
Defining Constituents, Data Vizzes and Telling a Data Story
Defining Constituents, Data Vizzes and Telling a Data StoryDefining Constituents, Data Vizzes and Telling a Data Story
Defining Constituents, Data Vizzes and Telling a Data Story
 
RadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdfRadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdf
 
How we prevented account sharing with MFA
How we prevented account sharing with MFAHow we prevented account sharing with MFA
How we prevented account sharing with MFA
 
Advanced Machine Learning for Business Professionals
Advanced Machine Learning for Business ProfessionalsAdvanced Machine Learning for Business Professionals
Advanced Machine Learning for Business Professionals
 
Call Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts ServiceCall Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts Service
 
Top 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In QueensTop 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In Queens
 
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
 
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
 
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdfKantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
 
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
 
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
 
2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING
2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING
2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING
 
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
 
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
 
20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf
 
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptxEMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptx
 

1352 004 oer submission

  • 1. IDP in Educational Technology, 2016. Exploring WEKA Tool using Data Mining Concepts, is licensed under the Creative Commons Attribution-ShareAlike 4.0 International License. You are free to use, distribute and modify it, including for commercial purposes, provided you acknowledge the source and share-alike. To view a copy of this license, visit http://creativecommons.org/licenses/by-sa/4.0/ Exploring WEKA Tool using Data Mining Concepts Work done as part of AICTE approved FDP on Use of ICT in Education for Online and Blended Learning RC1352_004 Team Leader: V.P.Hara Gopal Team Members: Brahmananda Reddy A Talluri Sunil Kumar Sagar Yeruva
  • 2. E x p l o r i n g W E K A T o o l U s i n g D a t a M i n i n g C o n c e p t s Page 2 TABLE OF CONTENTS 1 INTRODUCTION 3 2 EXPLORING THE EXPLORER 4 3 EXPLORING DATA SETS 6 4 BUILDING A CLASSIFIER 6 5 USING A FILTER 7 6 VISUALIZING YOUR DATA 8 7 REFERENCES 8 8 CONSOLIDATED LOG OF TEAM WORK 9
  • 3. E x p l o r i n g W E K A T o o l U s i n g D a t a M i n i n g C o n c e p t s Page 3 RC1352_004 V.P.Hara Gopal Brahmananda Reddy A Talluri Sunil Kumar Sagar Yeruva EXPLORING WEKA USING DATA MINING CONCEPTS 1. INTRODUCTION Data Mining: Data mining is about going from data to information, information that can give you useful predictions. Weka? A bird found only in New Zealand. Data mining workbench Waikato Environment for Knowledge Analysis(WEKA). What will you learn?  Load data into Weka and look at it  Use filters to preprocess it  Explore it using interactive visualization  Apply classification algorithms  Interpret the output  Understand evaluation methods and their implications  Understand various representations for models  Explain how popular machine learning algorithms work  Be aware of common pitfalls with data mining Use Weka on your own data… and understand what you are doing! Getting Started with WEKA  Install Weka  Explore the “Explorer” interface  Explore some datasets  Build a classifier  Interpret the output  Use filters  Visualize your data set
  • 4. E x p l o r i n g W E K A T o o l U s i n g D a t a M i n i n g C o n c e p t s Page 4 2. EXPLORING THE EXPLORER
  • 5. E x p l o r i n g W E K A T o o l U s i n g D a t a M i n i n g C o n c e p t s Page 5  Install Weka  Get datasets  Open Explorer  Open a dataset (weather.nominal.arff)  Look at attributes and their values  Edit the dataset  Save it
  • 6. E x p l o r i n g W E K A T o o l U s i n g D a t a M i n i n g C o n c e p t s Page 6 3. EXPLORING DATA SETS  The classification problem  weather.nominal, weather.numeric  Nominal vs numeric attributes  ARFF file format  glass.arff dataset  Sanity checking attributes 4. BUILDING A CLASSIFIER Use J48 to analyze the glass dataset  Open file glass.arff (or leave it open from the last lesson)  Check the available classifiers  Choose the J48 decision tree learner (trees>J48)  Run it  Examine the output  Look at the correctly classified instances … and the confusion matrix Investigate J48  Open the configuration panel  Check the More information  Examine the options  Use an unpruned tree  Look at leaf sizes  Set minNumObj to 15 to avoid small leaves  Visualize tree using right‐click menu From C4.5 to J48  ID3 (1979)  C4.5 (1993)  C4.8 (1996?) J48  C5.0 (commercial)  Classifiers in Weka  Classifying the glass dataset  Interpreting J48 output  J48 configuration panel  … option: pruned vs unpruned trees  … option: avoid small leaves  J48 ~ C4.5
  • 7. E x p l o r i n g W E K A T o o l U s i n g D a t a M i n i n g C o n c e p t s Page 7 5. USING A FILTER Use a filter to remove an attribute  Open weather.nominal.arff (again!)  Check the filters o supervised vs unsupervised o attribute vs instance  Choose the unsupervised attribute filter Remove  Check the More information; look at the options  Set attributeIndices to 3 and click OK  Apply the filter  Recall that you can Save the result  Press Undo Remove instances where humidity is high  Supervised or unsupervised?  Attribute or instance?  Look at them  Select RemoveWithValues  Set attributeIndex  Set nominalIndices  Apply  Undo Fewer attributes, better classification!  Open glass.arff  Run J48 (trees>J48)  Remove Fe  Remove all attributes except RI and MG  Look at the decision trees  Use right‐click menu to visualize decision trees  Filters in Weka  Supervised vs unsupervised, attribute vs instance  To find the right one, you need to look!  Filters can be very powerful  Judiciously removing attributes can – improve performance – increase comprehensibility
  • 8. E x p l o r i n g W E K A T o o l U s i n g D a t a M i n i n g C o n c e p t s Page 8 6. VISUALIZING YOUR DATA Open Visualize Panel  Open iris.arff  Bring up Visualize panel  Click one of the plots; examine some instances  Set x axis to petalwidth and y axis to petallength  Click on Class colour to change the colour  Bars on the right change correspond to attributes: click for x axis; right‐click for y axis  Jitter slider  Show Select Instance: Rectangle option  Submit, Reset, Clear and Save Visualizing Classification Errors  Run J48 (trees>J48)  Visualize classifier errors (from Results list)  Plot predictedclass against class  Identify errors shown by confusion matrix  Get down and dirty with your data  Visualize it  Clean it up by deleting outliers  Look at classification errors o (there’s a filter that allows you to add classifications as a new attribute) 7. REFERENCES 1. Data Mining: Practical machine learning tools and techniques, by Ian H. Witten, Eibe Frank and Mark A. Hall. Morgan Kaufmann, 2011.
  • 9. E x p l o r i n g W E K A T o o l U s i n g D a t a M i n i n g C o n c e p t s Page 9 8. CONSOLIDATED LOG OF TEAM WORK Activity Period of Activity Team Member Amount of Time Spent (in min.) Discussion 26-06-2016 V.P.Hara Gopal 50 Brahmananda Reddy A 50 Talluri Sunil Kumar 50 Sagar Yeruva 50 Tool Exploration (Documentation and Weka Tool) 26-06-2016 & 01-06-2016 V.P.Hara Gopal 45 Brahmananda Reddy A 45 Talluri Sunil Kumar 90 Sagar Yeruva 90 OER Creation 06-07-2016 to 09-07-2016 V.P.Hara Gopal 40 Brahmananda Reddy A 40 Talluri Sunil Kumar 90 Sagar Yeruva 90 OER Documentation 12-07-2016 to 18-07-2016 V.P.Hara Gopal 90 Brahmananda Reddy A 90 Talluri Sunil Kumar 25 Sagar Yeruva 25 Individual Reflection (Diary Logging) 20-07-2016 V.P.Hara Gopal 15 Brahmananda Reddy A 10 Talluri Sunil Kumar 10 Sagar Yeruva 10 OER Evaluation (Self-through meeting) 21-07-2016 & 22-07-2016 V.P.Hara Gopal 45 Brahmananda Reddy A 45 Talluri Sunil Kumar 45 Sagar Yeruva 45