SlideShare a Scribd company logo
1 of 16
Data Mining:
Implementation of Data
Mining Techniques using
RapidMiner software
Prepared by
Mohammed Kharma
Definitions review
• Cluster: A collection of data objects
– similar (or related) to one another within the
same group
– dissimilar (or unrelated) to the objects in other
groups
• Cluster analysis
– Finding similarities between data according to the
characteristics found in the data and grouping
similar data objects into clusters
Clustering Methods
• Partitioning :
– Unsupervised learning algorithms, Construct various
partitions and then evaluate them by some criterion,
e.g., minimizing the sum of square errors
– Typical methods: k-means, k-medoids
• Hierarchical :
– Create a hierarchical decomposition of the set of data
(or objects) using some criterion
– Typical methods: Diana, Agnes, BIRCH, ROCK,
CAMELEON
Illustration & compression of 2
clustering technique using
Rapidminer tool and Java
application
illustrate of 2 clustering technique
using Rapidminer tool and Java
• K-means algorithm:
We performed two test
1. Using java program: program parameters
K = 2;
Data:
22 21
19 20
18 22
1 3
3 2
6
K-means Clustering
• Input: the number of clusters K and the collection of n
instances
• Output: a set of k clusters that minimizes the squared error
criterion
• Method:
– Arbitrarily choose k instances as the initial cluster centers
– Repeat
• (Re)assign each instance to the cluster to which the
instance is the most similar, based on the mean value of
the instances in the cluster
• Update cluster means (compute mean value of the
instances for each cluster)
– Until no change in the assignment
• Squared Error Criterion
– E = ∑i=1 k ∑ pЄCi |p-mi|2
– where mi are the cluster means and p are points in clusters
The result K-Means-java program
The result of K-Means-RapidMiner
The result of K-Means-RapidMiner
Continued-The result of K-Means-
RapidMiner
11
K-medoids
• Input: the number of clusters K and the collection of n
instances
• Output: A set of k clusters that minimizes the sum of the
dissimilarities of all the instances to their nearest medoids
• Method:
– Arbitrarily choose k instances as the initial medoids
– Repeat
• (Re)assign each remaining instance to the cluster with
the nearest medoid
• Randomly select a non-medoid instance, or
• Compute the total cost, S, of swapping Oj with Or
• If S<0 then swap Oj with Or to form the new set of k
medoids
– Until no change
The result of k-medoids-RapidMiner
The result of k-medoids-RapidMiner
Java Live Demo:
http://home.dei.polimi.it/matteucc/Clustering/t
utorial_html/AppletKM.html
Comparison
The results of both algorithms are the same
Both require K to be specified in the
input
K-medoids is less influenced by outliers in the
data
Both methods assign each instance exactly to
one cluster
»Thank you

More Related Content

What's hot

Hanjun Dai, PhD Student, School of Computational Science and Engineering, Geo...
Hanjun Dai, PhD Student, School of Computational Science and Engineering, Geo...Hanjun Dai, PhD Student, School of Computational Science and Engineering, Geo...
Hanjun Dai, PhD Student, School of Computational Science and Engineering, Geo...MLconf
 
How to use Map() Filter() and Reduce() functions in Python | Edureka
How to use Map() Filter() and Reduce() functions in Python | EdurekaHow to use Map() Filter() and Reduce() functions in Python | Edureka
How to use Map() Filter() and Reduce() functions in Python | EdurekaEdureka!
 
Scaling out logistic regression with Spark
Scaling out logistic regression with SparkScaling out logistic regression with Spark
Scaling out logistic regression with SparkBarak Gitsis
 
MLconf NYC Xiangrui Meng
MLconf NYC Xiangrui MengMLconf NYC Xiangrui Meng
MLconf NYC Xiangrui MengMLconf
 
Joey gonzalez, graph lab, m lconf 2013
Joey gonzalez, graph lab, m lconf 2013Joey gonzalez, graph lab, m lconf 2013
Joey gonzalez, graph lab, m lconf 2013MLconf
 
Flexible Memory Allocation in Kinetic Monte Carlo Simulations
Flexible Memory Allocation in Kinetic Monte Carlo SimulationsFlexible Memory Allocation in Kinetic Monte Carlo Simulations
Flexible Memory Allocation in Kinetic Monte Carlo SimulationsAaron Craig
 
Gelly in Apache Flink Bay Area Meetup
Gelly in Apache Flink Bay Area MeetupGelly in Apache Flink Bay Area Meetup
Gelly in Apache Flink Bay Area MeetupVasia Kalavri
 
[Paper reading] L-SHAPLEY AND C-SHAPLEY: EFFICIENT MODEL INTERPRETATION FOR S...
[Paper reading] L-SHAPLEY AND C-SHAPLEY: EFFICIENT MODEL INTERPRETATION FOR S...[Paper reading] L-SHAPLEY AND C-SHAPLEY: EFFICIENT MODEL INTERPRETATION FOR S...
[Paper reading] L-SHAPLEY AND C-SHAPLEY: EFFICIENT MODEL INTERPRETATION FOR S...Daiki Tanaka
 
Dr. Erin LeDell, Machine Learning Scientist, H2O.ai at MLconf SEA - 5/20/16
Dr. Erin LeDell, Machine Learning Scientist, H2O.ai at MLconf SEA - 5/20/16Dr. Erin LeDell, Machine Learning Scientist, H2O.ai at MLconf SEA - 5/20/16
Dr. Erin LeDell, Machine Learning Scientist, H2O.ai at MLconf SEA - 5/20/16MLconf
 
Modelling Accessibility Performance in LTE networks, An Analytics Methodology
Modelling Accessibility Performance in LTE networks, An Analytics MethodologyModelling Accessibility Performance in LTE networks, An Analytics Methodology
Modelling Accessibility Performance in LTE networks, An Analytics Methodologyalien_gmx
 
Unsupervised Learning: Clustering
Unsupervised Learning: Clustering Unsupervised Learning: Clustering
Unsupervised Learning: Clustering Experfy
 
Quick and Heap Sort with examples
Quick and Heap Sort with examplesQuick and Heap Sort with examples
Quick and Heap Sort with examplesBst Ali
 
Generalized Linear Models with H2O
Generalized Linear Models with H2O Generalized Linear Models with H2O
Generalized Linear Models with H2O Sri Ambati
 
Introduction to Data Science
Introduction to Data ScienceIntroduction to Data Science
Introduction to Data ScienceSridhara R
 
0415_seminar_DeepDPG
0415_seminar_DeepDPG0415_seminar_DeepDPG
0415_seminar_DeepDPGHye-min Ahn
 

What's hot (20)

Hanjun Dai, PhD Student, School of Computational Science and Engineering, Geo...
Hanjun Dai, PhD Student, School of Computational Science and Engineering, Geo...Hanjun Dai, PhD Student, School of Computational Science and Engineering, Geo...
Hanjun Dai, PhD Student, School of Computational Science and Engineering, Geo...
 
CS267_Graph_Lab
CS267_Graph_LabCS267_Graph_Lab
CS267_Graph_Lab
 
How to use Map() Filter() and Reduce() functions in Python | Edureka
How to use Map() Filter() and Reduce() functions in Python | EdurekaHow to use Map() Filter() and Reduce() functions in Python | Edureka
How to use Map() Filter() and Reduce() functions in Python | Edureka
 
Scaling out logistic regression with Spark
Scaling out logistic regression with SparkScaling out logistic regression with Spark
Scaling out logistic regression with Spark
 
MLconf NYC Xiangrui Meng
MLconf NYC Xiangrui MengMLconf NYC Xiangrui Meng
MLconf NYC Xiangrui Meng
 
Java - Collections
Java - CollectionsJava - Collections
Java - Collections
 
Joey gonzalez, graph lab, m lconf 2013
Joey gonzalez, graph lab, m lconf 2013Joey gonzalez, graph lab, m lconf 2013
Joey gonzalez, graph lab, m lconf 2013
 
Flexible Memory Allocation in Kinetic Monte Carlo Simulations
Flexible Memory Allocation in Kinetic Monte Carlo SimulationsFlexible Memory Allocation in Kinetic Monte Carlo Simulations
Flexible Memory Allocation in Kinetic Monte Carlo Simulations
 
Gelly in Apache Flink Bay Area Meetup
Gelly in Apache Flink Bay Area MeetupGelly in Apache Flink Bay Area Meetup
Gelly in Apache Flink Bay Area Meetup
 
[Paper reading] L-SHAPLEY AND C-SHAPLEY: EFFICIENT MODEL INTERPRETATION FOR S...
[Paper reading] L-SHAPLEY AND C-SHAPLEY: EFFICIENT MODEL INTERPRETATION FOR S...[Paper reading] L-SHAPLEY AND C-SHAPLEY: EFFICIENT MODEL INTERPRETATION FOR S...
[Paper reading] L-SHAPLEY AND C-SHAPLEY: EFFICIENT MODEL INTERPRETATION FOR S...
 
Array
ArrayArray
Array
 
Dr. Erin LeDell, Machine Learning Scientist, H2O.ai at MLconf SEA - 5/20/16
Dr. Erin LeDell, Machine Learning Scientist, H2O.ai at MLconf SEA - 5/20/16Dr. Erin LeDell, Machine Learning Scientist, H2O.ai at MLconf SEA - 5/20/16
Dr. Erin LeDell, Machine Learning Scientist, H2O.ai at MLconf SEA - 5/20/16
 
Modelling Accessibility Performance in LTE networks, An Analytics Methodology
Modelling Accessibility Performance in LTE networks, An Analytics MethodologyModelling Accessibility Performance in LTE networks, An Analytics Methodology
Modelling Accessibility Performance in LTE networks, An Analytics Methodology
 
Unsupervised Learning: Clustering
Unsupervised Learning: Clustering Unsupervised Learning: Clustering
Unsupervised Learning: Clustering
 
[ppt]
[ppt][ppt]
[ppt]
 
Quick and Heap Sort with examples
Quick and Heap Sort with examplesQuick and Heap Sort with examples
Quick and Heap Sort with examples
 
Generalized Linear Models with H2O
Generalized Linear Models with H2O Generalized Linear Models with H2O
Generalized Linear Models with H2O
 
Introduction to Data Science
Introduction to Data ScienceIntroduction to Data Science
Introduction to Data Science
 
0415_seminar_DeepDPG
0415_seminar_DeepDPG0415_seminar_DeepDPG
0415_seminar_DeepDPG
 
Heapsort quick sort
Heapsort quick sortHeapsort quick sort
Heapsort quick sort
 

Viewers also liked

Introduction to RapidMiner Studio V7
Introduction to RapidMiner Studio V7Introduction to RapidMiner Studio V7
Introduction to RapidMiner Studio V7geraldinegray
 
RapidMiner: Introduction To Rapid Miner
RapidMiner: Introduction To Rapid MinerRapidMiner: Introduction To Rapid Miner
RapidMiner: Introduction To Rapid MinerRapidmining Content
 
Data mining tools
Data mining toolsData mining tools
Data mining toolssuganmca14
 
Slides PAPIs.io'14 RapidMiner
Slides PAPIs.io'14 RapidMinerSlides PAPIs.io'14 RapidMiner
Slides PAPIs.io'14 RapidMinerSabrina Kirstein
 
Hadoop World 2011: Radoop: a Graphical Analytics Tool for Big Data - Gabor Ma...
Hadoop World 2011: Radoop: a Graphical Analytics Tool for Big Data - Gabor Ma...Hadoop World 2011: Radoop: a Graphical Analytics Tool for Big Data - Gabor Ma...
Hadoop World 2011: Radoop: a Graphical Analytics Tool for Big Data - Gabor Ma...Cloudera, Inc.
 
M Chambers and RapidMiner Overview for Babson class
M Chambers and RapidMiner Overview for Babson classM Chambers and RapidMiner Overview for Babson class
M Chambers and RapidMiner Overview for Babson classmcAnalytics99
 
RapidMiner, an entrance to explore MIMIC-III?
RapidMiner, an entrance to explore MIMIC-III?RapidMiner, an entrance to explore MIMIC-III?
RapidMiner, an entrance to explore MIMIC-III?Sven Van Poucke, MD, PhD
 
Data Analytics.01. Data selection and capture
Data Analytics.01. Data selection and captureData Analytics.01. Data selection and capture
Data Analytics.01. Data selection and captureAlex Rayón Jerez
 
Predictive Modelling
Predictive ModellingPredictive Modelling
Predictive ModellingRajiv Advani
 
My First Data Science Project (using Rapid Miner)
My First Data Science Project (using Rapid Miner)My First Data Science Project (using Rapid Miner)
My First Data Science Project (using Rapid Miner)Data Science Thailand
 
Predictive Analytics World Berlin 2016
Predictive Analytics World Berlin 2016 Predictive Analytics World Berlin 2016
Predictive Analytics World Berlin 2016 Rising Media Ltd.
 
Introduction to predictive modeling v1
Introduction to predictive modeling v1Introduction to predictive modeling v1
Introduction to predictive modeling v1Venkata Reddy Konasani
 
Data Mining: Concepts and techniques classification _chapter 9 :advanced methods
Data Mining: Concepts and techniques classification _chapter 9 :advanced methodsData Mining: Concepts and techniques classification _chapter 9 :advanced methods
Data Mining: Concepts and techniques classification _chapter 9 :advanced methodsSalah Amean
 

Viewers also liked (20)

Rapidminer
RapidminerRapidminer
Rapidminer
 
Introduction to RapidMiner Studio V7
Introduction to RapidMiner Studio V7Introduction to RapidMiner Studio V7
Introduction to RapidMiner Studio V7
 
RapidMiner: Introduction To Rapid Miner
RapidMiner: Introduction To Rapid MinerRapidMiner: Introduction To Rapid Miner
RapidMiner: Introduction To Rapid Miner
 
Data mining tools
Data mining toolsData mining tools
Data mining tools
 
Slides PAPIs.io'14 RapidMiner
Slides PAPIs.io'14 RapidMinerSlides PAPIs.io'14 RapidMiner
Slides PAPIs.io'14 RapidMiner
 
Hadoop World 2011: Radoop: a Graphical Analytics Tool for Big Data - Gabor Ma...
Hadoop World 2011: Radoop: a Graphical Analytics Tool for Big Data - Gabor Ma...Hadoop World 2011: Radoop: a Graphical Analytics Tool for Big Data - Gabor Ma...
Hadoop World 2011: Radoop: a Graphical Analytics Tool for Big Data - Gabor Ma...
 
Data mining tools overall
Data mining tools overallData mining tools overall
Data mining tools overall
 
M Chambers and RapidMiner Overview for Babson class
M Chambers and RapidMiner Overview for Babson classM Chambers and RapidMiner Overview for Babson class
M Chambers and RapidMiner Overview for Babson class
 
RapidMiner, an entrance to explore MIMIC-III?
RapidMiner, an entrance to explore MIMIC-III?RapidMiner, an entrance to explore MIMIC-III?
RapidMiner, an entrance to explore MIMIC-III?
 
Data Analytics.01. Data selection and capture
Data Analytics.01. Data selection and captureData Analytics.01. Data selection and capture
Data Analytics.01. Data selection and capture
 
Predictive Modelling
Predictive ModellingPredictive Modelling
Predictive Modelling
 
Predictive Modeling and Analytics select_chapters
Predictive Modeling and Analytics select_chaptersPredictive Modeling and Analytics select_chapters
Predictive Modeling and Analytics select_chapters
 
predictive models
predictive modelspredictive models
predictive models
 
Introduction to Text Classification with RapidMiner Studio 7
Introduction to Text Classification with RapidMiner Studio 7Introduction to Text Classification with RapidMiner Studio 7
Introduction to Text Classification with RapidMiner Studio 7
 
My First Data Science Project (using Rapid Miner)
My First Data Science Project (using Rapid Miner)My First Data Science Project (using Rapid Miner)
My First Data Science Project (using Rapid Miner)
 
Search Twitter with RapidMiner Studio 6
Search Twitter with RapidMiner Studio 6Search Twitter with RapidMiner Studio 6
Search Twitter with RapidMiner Studio 6
 
Predictive Analytics World Berlin 2016
Predictive Analytics World Berlin 2016 Predictive Analytics World Berlin 2016
Predictive Analytics World Berlin 2016
 
Introduction to predictive modeling v1
Introduction to predictive modeling v1Introduction to predictive modeling v1
Introduction to predictive modeling v1
 
Advanced Predictive Modeling with R and RapidMiner Studio 7
Advanced Predictive Modeling with R and RapidMiner Studio 7Advanced Predictive Modeling with R and RapidMiner Studio 7
Advanced Predictive Modeling with R and RapidMiner Studio 7
 
Data Mining: Concepts and techniques classification _chapter 9 :advanced methods
Data Mining: Concepts and techniques classification _chapter 9 :advanced methodsData Mining: Concepts and techniques classification _chapter 9 :advanced methods
Data Mining: Concepts and techniques classification _chapter 9 :advanced methods
 

Similar to Data Mining: Implementation of Data Mining Techniques using RapidMiner software

3.2 partitioning methods
3.2 partitioning methods3.2 partitioning methods
3.2 partitioning methodsKrish_ver2
 
Clustering Theory
Clustering TheoryClustering Theory
Clustering TheorySSA KPI
 
Advanced database and data mining & clustering concepts
Advanced database and data mining & clustering conceptsAdvanced database and data mining & clustering concepts
Advanced database and data mining & clustering conceptsNithyananthSengottai
 
Data mining techniques unit v
Data mining techniques unit vData mining techniques unit v
Data mining techniques unit vmalathieswaran29
 
Sergei Vassilvitskii, Research Scientist, Google at MLconf NYC - 4/15/16
Sergei Vassilvitskii, Research Scientist, Google at MLconf NYC - 4/15/16Sergei Vassilvitskii, Research Scientist, Google at MLconf NYC - 4/15/16
Sergei Vassilvitskii, Research Scientist, Google at MLconf NYC - 4/15/16MLconf
 
Clustering on database systems rkm
Clustering on database systems rkmClustering on database systems rkm
Clustering on database systems rkmVahid Mirjalili
 
26-Clustering MTech-2017.ppt
26-Clustering MTech-2017.ppt26-Clustering MTech-2017.ppt
26-Clustering MTech-2017.pptvikassingh569137
 
CSA 3702 machine learning module 3
CSA 3702 machine learning module 3CSA 3702 machine learning module 3
CSA 3702 machine learning module 3Nandhini S
 
machine learning - Clustering in R
machine learning - Clustering in Rmachine learning - Clustering in R
machine learning - Clustering in RSudhakar Chavan
 
Unsupervised learning Modi.pptx
Unsupervised learning Modi.pptxUnsupervised learning Modi.pptx
Unsupervised learning Modi.pptxssusere1fd42
 
K means clustering algorithm
K means clustering algorithmK means clustering algorithm
K means clustering algorithmDarshak Mehta
 
Cluster Analysis.pptx
Cluster Analysis.pptxCluster Analysis.pptx
Cluster Analysis.pptxRvishnupriya2
 
Chapter 10. Cluster Analysis Basic Concepts and Methods.ppt
Chapter 10. Cluster Analysis Basic Concepts and Methods.pptChapter 10. Cluster Analysis Basic Concepts and Methods.ppt
Chapter 10. Cluster Analysis Basic Concepts and Methods.pptSubrata Kumer Paul
 
3.5 model based clustering
3.5 model based clustering3.5 model based clustering
3.5 model based clusteringKrish_ver2
 
Data Mining Concepts and Techniques, Chapter 10. Cluster Analysis: Basic Conc...
Data Mining Concepts and Techniques, Chapter 10. Cluster Analysis: Basic Conc...Data Mining Concepts and Techniques, Chapter 10. Cluster Analysis: Basic Conc...
Data Mining Concepts and Techniques, Chapter 10. Cluster Analysis: Basic Conc...Salah Amean
 

Similar to Data Mining: Implementation of Data Mining Techniques using RapidMiner software (20)

3.2 partitioning methods
3.2 partitioning methods3.2 partitioning methods
3.2 partitioning methods
 
Clustering Theory
Clustering TheoryClustering Theory
Clustering Theory
 
Advanced database and data mining & clustering concepts
Advanced database and data mining & clustering conceptsAdvanced database and data mining & clustering concepts
Advanced database and data mining & clustering concepts
 
UNIT_V_Cluster Analysis.pptx
UNIT_V_Cluster Analysis.pptxUNIT_V_Cluster Analysis.pptx
UNIT_V_Cluster Analysis.pptx
 
Data mining techniques unit v
Data mining techniques unit vData mining techniques unit v
Data mining techniques unit v
 
ClusetrigBasic.ppt
ClusetrigBasic.pptClusetrigBasic.ppt
ClusetrigBasic.ppt
 
Sergei Vassilvitskii, Research Scientist, Google at MLconf NYC - 4/15/16
Sergei Vassilvitskii, Research Scientist, Google at MLconf NYC - 4/15/16Sergei Vassilvitskii, Research Scientist, Google at MLconf NYC - 4/15/16
Sergei Vassilvitskii, Research Scientist, Google at MLconf NYC - 4/15/16
 
Clustering on database systems rkm
Clustering on database systems rkmClustering on database systems rkm
Clustering on database systems rkm
 
Dataa miining
Dataa miiningDataa miining
Dataa miining
 
26-Clustering MTech-2017.ppt
26-Clustering MTech-2017.ppt26-Clustering MTech-2017.ppt
26-Clustering MTech-2017.ppt
 
CSA 3702 machine learning module 3
CSA 3702 machine learning module 3CSA 3702 machine learning module 3
CSA 3702 machine learning module 3
 
machine learning - Clustering in R
machine learning - Clustering in Rmachine learning - Clustering in R
machine learning - Clustering in R
 
Unsupervised learning Modi.pptx
Unsupervised learning Modi.pptxUnsupervised learning Modi.pptx
Unsupervised learning Modi.pptx
 
K means clustering algorithm
K means clustering algorithmK means clustering algorithm
K means clustering algorithm
 
Cluster Analysis.pptx
Cluster Analysis.pptxCluster Analysis.pptx
Cluster Analysis.pptx
 
Chapter 10. Cluster Analysis Basic Concepts and Methods.ppt
Chapter 10. Cluster Analysis Basic Concepts and Methods.pptChapter 10. Cluster Analysis Basic Concepts and Methods.ppt
Chapter 10. Cluster Analysis Basic Concepts and Methods.ppt
 
3.5 model based clustering
3.5 model based clustering3.5 model based clustering
3.5 model based clustering
 
Data Mining Concepts and Techniques, Chapter 10. Cluster Analysis: Basic Conc...
Data Mining Concepts and Techniques, Chapter 10. Cluster Analysis: Basic Conc...Data Mining Concepts and Techniques, Chapter 10. Cluster Analysis: Basic Conc...
Data Mining Concepts and Techniques, Chapter 10. Cluster Analysis: Basic Conc...
 
Knn 160904075605-converted
Knn 160904075605-convertedKnn 160904075605-converted
Knn 160904075605-converted
 
Machine learning clustering
Machine learning clusteringMachine learning clustering
Machine learning clustering
 

More from Mohammed Kharma

Data Mining Project for student academic specialization and performance
Data Mining Project for student academic specialization and performanceData Mining Project for student academic specialization and performance
Data Mining Project for student academic specialization and performanceMohammed Kharma
 
Cloud Computing Presentation
Cloud Computing PresentationCloud Computing Presentation
Cloud Computing PresentationMohammed Kharma
 
A data mining framework for fraud detection in telecom based on MapReduce (Pr...
A data mining framework for fraud detection in telecom based on MapReduce (Pr...A data mining framework for fraud detection in telecom based on MapReduce (Pr...
A data mining framework for fraud detection in telecom based on MapReduce (Pr...Mohammed Kharma
 
How to speedup GWT compiler
How to speedup GWT compilerHow to speedup GWT compiler
How to speedup GWT compilerMohammed Kharma
 
37 c 551 - reduced changes in the carrier of steganography algorithm
37 c 551 - reduced changes in the carrier of steganography algorithm37 c 551 - reduced changes in the carrier of steganography algorithm
37 c 551 - reduced changes in the carrier of steganography algorithmMohammed Kharma
 
Learning objects and metadata framework - Mohammed Kharma
Learning objects and metadata framework - Mohammed KharmaLearning objects and metadata framework - Mohammed Kharma
Learning objects and metadata framework - Mohammed KharmaMohammed Kharma
 
Mohammed Kharma-A flexible framework for quality assurance and testing of sof...
Mohammed Kharma-A flexible framework for quality assurance and testing of sof...Mohammed Kharma-A flexible framework for quality assurance and testing of sof...
Mohammed Kharma-A flexible framework for quality assurance and testing of sof...Mohammed Kharma
 
Mohammed Kharma - A flexible framework for quality assurance and testing of s...
Mohammed Kharma - A flexible framework for quality assurance and testing of s...Mohammed Kharma - A flexible framework for quality assurance and testing of s...
Mohammed Kharma - A flexible framework for quality assurance and testing of s...Mohammed Kharma
 

More from Mohammed Kharma (8)

Data Mining Project for student academic specialization and performance
Data Mining Project for student academic specialization and performanceData Mining Project for student academic specialization and performance
Data Mining Project for student academic specialization and performance
 
Cloud Computing Presentation
Cloud Computing PresentationCloud Computing Presentation
Cloud Computing Presentation
 
A data mining framework for fraud detection in telecom based on MapReduce (Pr...
A data mining framework for fraud detection in telecom based on MapReduce (Pr...A data mining framework for fraud detection in telecom based on MapReduce (Pr...
A data mining framework for fraud detection in telecom based on MapReduce (Pr...
 
How to speedup GWT compiler
How to speedup GWT compilerHow to speedup GWT compiler
How to speedup GWT compiler
 
37 c 551 - reduced changes in the carrier of steganography algorithm
37 c 551 - reduced changes in the carrier of steganography algorithm37 c 551 - reduced changes in the carrier of steganography algorithm
37 c 551 - reduced changes in the carrier of steganography algorithm
 
Learning objects and metadata framework - Mohammed Kharma
Learning objects and metadata framework - Mohammed KharmaLearning objects and metadata framework - Mohammed Kharma
Learning objects and metadata framework - Mohammed Kharma
 
Mohammed Kharma-A flexible framework for quality assurance and testing of sof...
Mohammed Kharma-A flexible framework for quality assurance and testing of sof...Mohammed Kharma-A flexible framework for quality assurance and testing of sof...
Mohammed Kharma-A flexible framework for quality assurance and testing of sof...
 
Mohammed Kharma - A flexible framework for quality assurance and testing of s...
Mohammed Kharma - A flexible framework for quality assurance and testing of s...Mohammed Kharma - A flexible framework for quality assurance and testing of s...
Mohammed Kharma - A flexible framework for quality assurance and testing of s...
 

Recently uploaded

Spark3's new memory model/management
Spark3's new memory model/managementSpark3's new memory model/management
Spark3's new memory model/managementakshesh doshi
 
Ukraine War presentation: KNOW THE BASICS
Ukraine War presentation: KNOW THE BASICSUkraine War presentation: KNOW THE BASICS
Ukraine War presentation: KNOW THE BASICSAishani27
 
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...Sapana Sha
 
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...soniya singh
 
Data Warehouse , Data Cube Computation
Data Warehouse   , Data Cube ComputationData Warehouse   , Data Cube Computation
Data Warehouse , Data Cube Computationsit20ad004
 
Schema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfSchema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfLars Albertsson
 
Unveiling Insights: The Role of a Data Analyst
Unveiling Insights: The Role of a Data AnalystUnveiling Insights: The Role of a Data Analyst
Unveiling Insights: The Role of a Data AnalystSamantha Rae Coolbeth
 
Call Girls In Noida City Center Metro 24/7✡️9711147426✡️ Escorts Service
Call Girls In Noida City Center Metro 24/7✡️9711147426✡️ Escorts ServiceCall Girls In Noida City Center Metro 24/7✡️9711147426✡️ Escorts Service
Call Girls In Noida City Center Metro 24/7✡️9711147426✡️ Escorts Servicejennyeacort
 
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...dajasot375
 
RadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdfRadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdfgstagge
 
04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationshipsccctableauusergroup
 
100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptxAnupama Kate
 
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝soniya singh
 
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service BhilaiLow Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service BhilaiSuhani Kapoor
 
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdfKantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdfSocial Samosa
 
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfMarket Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfRachmat Ramadhan H
 
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPramod Kumar Srivastava
 

Recently uploaded (20)

Spark3's new memory model/management
Spark3's new memory model/managementSpark3's new memory model/management
Spark3's new memory model/management
 
Ukraine War presentation: KNOW THE BASICS
Ukraine War presentation: KNOW THE BASICSUkraine War presentation: KNOW THE BASICS
Ukraine War presentation: KNOW THE BASICS
 
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
 
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
 
Data Warehouse , Data Cube Computation
Data Warehouse   , Data Cube ComputationData Warehouse   , Data Cube Computation
Data Warehouse , Data Cube Computation
 
Schema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfSchema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdf
 
Unveiling Insights: The Role of a Data Analyst
Unveiling Insights: The Role of a Data AnalystUnveiling Insights: The Role of a Data Analyst
Unveiling Insights: The Role of a Data Analyst
 
Call Girls In Noida City Center Metro 24/7✡️9711147426✡️ Escorts Service
Call Girls In Noida City Center Metro 24/7✡️9711147426✡️ Escorts ServiceCall Girls In Noida City Center Metro 24/7✡️9711147426✡️ Escorts Service
Call Girls In Noida City Center Metro 24/7✡️9711147426✡️ Escorts Service
 
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
 
RadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdfRadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdf
 
04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships
 
100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx
 
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
 
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Kishangarh
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in  KishangarhDelhi 99530 vip 56974 Genuine Escort Service Call Girls in  Kishangarh
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Kishangarh
 
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service BhilaiLow Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
 
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdfKantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
 
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfMarket Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
 
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
 
E-Commerce Order PredictionShraddha Kamble.pptx
E-Commerce Order PredictionShraddha Kamble.pptxE-Commerce Order PredictionShraddha Kamble.pptx
E-Commerce Order PredictionShraddha Kamble.pptx
 
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
 

Data Mining: Implementation of Data Mining Techniques using RapidMiner software

  • 1. Data Mining: Implementation of Data Mining Techniques using RapidMiner software Prepared by Mohammed Kharma
  • 2. Definitions review • Cluster: A collection of data objects – similar (or related) to one another within the same group – dissimilar (or unrelated) to the objects in other groups • Cluster analysis – Finding similarities between data according to the characteristics found in the data and grouping similar data objects into clusters
  • 3. Clustering Methods • Partitioning : – Unsupervised learning algorithms, Construct various partitions and then evaluate them by some criterion, e.g., minimizing the sum of square errors – Typical methods: k-means, k-medoids • Hierarchical : – Create a hierarchical decomposition of the set of data (or objects) using some criterion – Typical methods: Diana, Agnes, BIRCH, ROCK, CAMELEON
  • 4. Illustration & compression of 2 clustering technique using Rapidminer tool and Java application
  • 5. illustrate of 2 clustering technique using Rapidminer tool and Java • K-means algorithm: We performed two test 1. Using java program: program parameters K = 2; Data: 22 21 19 20 18 22 1 3 3 2
  • 6. 6 K-means Clustering • Input: the number of clusters K and the collection of n instances • Output: a set of k clusters that minimizes the squared error criterion • Method: – Arbitrarily choose k instances as the initial cluster centers – Repeat • (Re)assign each instance to the cluster to which the instance is the most similar, based on the mean value of the instances in the cluster • Update cluster means (compute mean value of the instances for each cluster) – Until no change in the assignment • Squared Error Criterion – E = ∑i=1 k ∑ pЄCi |p-mi|2 – where mi are the cluster means and p are points in clusters
  • 8. The result of K-Means-RapidMiner
  • 9. The result of K-Means-RapidMiner
  • 10. Continued-The result of K-Means- RapidMiner
  • 11. 11 K-medoids • Input: the number of clusters K and the collection of n instances • Output: A set of k clusters that minimizes the sum of the dissimilarities of all the instances to their nearest medoids • Method: – Arbitrarily choose k instances as the initial medoids – Repeat • (Re)assign each remaining instance to the cluster with the nearest medoid • Randomly select a non-medoid instance, or • Compute the total cost, S, of swapping Oj with Or • If S<0 then swap Oj with Or to form the new set of k medoids – Until no change
  • 12. The result of k-medoids-RapidMiner
  • 13. The result of k-medoids-RapidMiner
  • 15. Comparison The results of both algorithms are the same Both require K to be specified in the input K-medoids is less influenced by outliers in the data Both methods assign each instance exactly to one cluster