Gaurav R. Handa
TE CMPN-A, 40
Under the Guidance of
Ms. Veena Kulkarni
Thakur College of Engineering and Technology, Shyamnarayan Marg, Thakur Village,
Kandivli (E), Mumbai-101. Year 2013-2014
This presentation describes an implementation of the k-means
algorithm. Along with a brief description of the algorithm,
it provides graphs and arithmetic problems for
better understanding.
It also shows how the k-means algorithm can be implemented
efficiently and discusses the drawbacks of the algorithm.
• Business Intelligence builds on data mining and
databases.
• Business Intelligence enables a business to make
intelligent, fact-based decisions.
• Its analysis techniques are divided into Association Analysis,
Classification, Clustering and Regression.
• Data clustering is a method that groups objects with
similar characteristics into clusters.
• Clustering is further divided into hierarchical,
partitional and density-based methods. K-means is a
partitional clustering algorithm.
• Data mining is the knowledge discovery process of analyzing large
volumes of data from various perspectives and organizing
them into useful information.
• It is the search for valuable information in large volumes of
data in order to identify hidden structures in the data.
K-means is a centroid-based technique in
which each cluster is represented by the centre of the
cluster.
The algorithm aims to minimize an objective
function, specifically a squared error function.
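The squared error function is commonly written as below. The symbols are notation assumed here rather than taken from the slides: k is the number of clusters, C_j is the set of points assigned to cluster j, and \mu_j is the centre (mean) of that cluster.

```latex
J = \sum_{j=1}^{k} \sum_{x \in C_j} \lVert x - \mu_j \rVert^{2},
\qquad
\mu_j = \frac{1}{|C_j|} \sum_{x \in C_j} x
```

As a small arithmetic check on made-up data: for the one-dimensional points 1, 2, 9, 10 split into {1, 2} and {9, 10}, the centres are 1.5 and 9.5, each point lies 0.5 from its centre, and J = 4 × 0.25 = 1.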
Flowchart:
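As a rough textual stand-in for the flow of the algorithm, the usual loop (assign each point to its nearest centre, recompute each centre as the mean of its members, repeat until assignments stop changing) can be sketched as follows. This is a minimal sketch, not the presenter's implementation: the function names, the random choice of initial centres, and the sample points are all assumptions made for illustration.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean_point(points):
    """Component-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def kmeans(points, k, max_iter=100, seed=0):
    """Basic k-means loop; returns (centres, labels)."""
    rng = random.Random(seed)
    # Assumption: initial centres are k input points drawn without replacement.
    centres = rng.sample(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centre.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centres[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # no point changed cluster, so the centres are stable
        labels = new_labels
        # Update step: move each centre to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centre
                centres[j] = mean_point(members)
    return centres, labels


def squared_error(points, centres, labels):
    """Objective value: sum of squared distances to the assigned centres."""
    return sum(squared_distance(p, centres[lab]) for p, lab in zip(points, labels))


# Hypothetical data: two visibly separate groups of three points each.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
centres, labels = kmeans(points, k=2)
sse = squared_error(points, centres, labels)
print(centres, labels, sse)
```

On this toy data the loop separates the three points near the origin from the three points near (8.5, 9); on less well-separated data the result can depend on the initial centres.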
Papers on K-Means:
• “The Uniqueness of a Good Optimum for K-Means”, Marina Meila,
Proceedings of the 23rd International Conference on Machine Learning,
2006. In closely related work on k-means++ (Arthur and Vassilvitskii, 2007),
augmenting k-means with a simple, randomized seeding technique yields an
algorithm that is O(log k)-competitive with the optimal clustering in
expectation; experiments reported there show gains in both speed and accuracy.
• “The Effectiveness of Lloyd-Type Methods for the k-Means Problem”,
Rafail Ostrovsky, Yuval Rabani, Leonard J. Schulman, and Chaitanya
Swamy, FOCS, 2006. Polynomial-time approximation schemes (PTASs)
have been obtained for the k-means clustering problem.
• “Improved Smoothed Analysis of the k-Means Method”, Bodo Manthey
and Heiko Röglin, preprint, 2008. The paper notes that a distinguishing
feature of k-means is its speed in practice. Its worst-case running time,
however, is exponential, leaving a gap between practical and theoretical
performance. The paper aims to narrow this gap.
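The randomized seeding technique mentioned in the list above is usually described as picking each new centre with probability proportional to its squared distance from the nearest centre already chosen. The sketch below illustrates that idea only. The function name `seed_centres` and the sample points are hypothetical, and the code assumes the data contain at least k distinct points.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def seed_centres(points, k, seed=0):
    """Pick k initial centres by squared-distance-weighted random seeding."""
    rng = random.Random(seed)
    centres = [rng.choice(points)]  # first centre: uniform at random
    while len(centres) < k:
        # Squared distance from each point to its nearest chosen centre.
        d2 = [min(squared_distance(p, c) for c in centres) for p in points]
        # Next centre: sampled with probability proportional to d2, so
        # far-away points are favoured; points already chosen have weight 0.
        centres.append(rng.choices(points, weights=d2, k=1)[0])
    return centres


# Hypothetical data: two separate groups of three points each.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
initial = seed_centres(points, k=2)
print(initial)
```

The returned centres would then replace the purely random initial centres in an ordinary k-means loop.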
1. Archaeology
The objective here is to cluster the locations of
archaeological sites and to make inferences about political
history based on the clusters.
With the help of these clusters we can form hypotheses, which
can then be tested by actually visiting the sites.
2. Computational Biology
Here, carp were exposed to different levels of cold, and genes were
clustered based on their response in different tissues.
Green indicates that a gene is under-expressed,
whereas red indicates that it is over-expressed.
The figure shows that there are patterns across the
different tissues.
Clustering is thus a useful tool for representing a large
amount of information in a single plot.
3. Education
This example is taken from the paper “Teachers as Sources of Middle
School Students’ Motivational Identity: Variable Centered
and Person Centered Analytic Approaches”.
In this paper, survey results from 206 students are clustered.
The clusters are used to identify groups that support an
analysis of what affects motivation.
The number of clusters was chosen so as to yield a useful
hypothesis, which can then be verified.
• The number of clusters, K, must be specified in advance.
• It cannot handle noisy data and outliers well (the K-Medoids
algorithm addresses this).
• It is not suitable for discovering clusters with non-convex
shapes.
• It is applicable only when a mean is defined (the K-Modes
algorithm addresses categorical data).
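The sensitivity to outliers listed above comes from using the mean as the cluster centre. The toy comparison below, on made-up one-dimensional data, shows a single extreme value dragging the mean away while the medoid (an actual data point with the smallest total distance to the others, the centre K-Medoids uses) stays put. The helper names are illustrative assumptions.

```python
def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def medoid(values):
    """The data value with the smallest total distance to all values."""
    return min(values, key=lambda v: sum(abs(v - w) for w in values))


cluster = [1.0, 2.0, 3.0, 4.0, 5.0]
with_outlier = cluster + [100.0]  # one hypothetical noisy measurement

print(mean(cluster), medoid(cluster))            # both are 3.0
print(mean(with_outlier), medoid(with_outlier))  # mean is about 19.17, medoid is still 3.0
```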
• The K-means algorithm is a simple yet popular method for
cluster analysis. Its performance depends on the
initialisation and on an appropriate distance measure.
Several variants of K-means address its
weaknesses:
– K-Medoids: resistance to noise and/or outliers
– K-Modes: extension to categorical data clustering
analysis
– CLARA: handling of large data sets
– Mixture models (EM algorithm): handling uncertainty
in cluster membership
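To illustrate the K-Modes idea from the list above: for categorical data a mean is undefined, so a cluster can be summarised by the most frequent value of each attribute, and dissimilarity can be measured by counting mismatched attributes. This is a minimal sketch of those two building blocks only, not a full K-Modes implementation; the records and function names are hypothetical.

```python
from collections import Counter


def mismatch_count(a, b):
    """Number of attributes on which two categorical records differ."""
    return sum(x != y for x, y in zip(a, b))


def mode_record(records):
    """Most frequent value of each attribute, taken column by column."""
    return tuple(Counter(col).most_common(1)[0][0] for col in zip(*records))


# Hypothetical cluster of three categorical records.
records = [
    ("red", "small", "round"),
    ("red", "large", "round"),
    ("blue", "small", "round"),
]
centre = mode_record(records)
distances = [mismatch_count(r, centre) for r in records]
print(centre)     # ('red', 'small', 'round')
print(distances)  # [0, 1, 1]
```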
THANK YOU !!

More Related Content

What's hot

Comparison of Cost Estimation Methods using Hybrid Artificial Intelligence on...
Comparison of Cost Estimation Methods using Hybrid Artificial Intelligence on...Comparison of Cost Estimation Methods using Hybrid Artificial Intelligence on...
Comparison of Cost Estimation Methods using Hybrid Artificial Intelligence on...IJERA Editor
 
Towards reducing the
Towards reducing theTowards reducing the
Towards reducing theIJDKP
 
INTEGRATED ASSOCIATIVE CLASSIFICATION AND NEURAL NETWORK MODEL ENHANCED BY US...
INTEGRATED ASSOCIATIVE CLASSIFICATION AND NEURAL NETWORK MODEL ENHANCED BY US...INTEGRATED ASSOCIATIVE CLASSIFICATION AND NEURAL NETWORK MODEL ENHANCED BY US...
INTEGRATED ASSOCIATIVE CLASSIFICATION AND NEURAL NETWORK MODEL ENHANCED BY US...IJDKP
 
APPLICATION OF ARTIFICIAL NEURAL NETWORKS IN ESTIMATING PARTICIPATION IN ELEC...
APPLICATION OF ARTIFICIAL NEURAL NETWORKS IN ESTIMATING PARTICIPATION IN ELEC...APPLICATION OF ARTIFICIAL NEURAL NETWORKS IN ESTIMATING PARTICIPATION IN ELEC...
APPLICATION OF ARTIFICIAL NEURAL NETWORKS IN ESTIMATING PARTICIPATION IN ELEC...Zac Darcy
 
IRJET- A Survey of Text Document Clustering by using Clustering Techniques
IRJET- A Survey of Text Document Clustering by using Clustering TechniquesIRJET- A Survey of Text Document Clustering by using Clustering Techniques
IRJET- A Survey of Text Document Clustering by using Clustering TechniquesIRJET Journal
 
Effective data mining for proper
Effective data mining for properEffective data mining for proper
Effective data mining for properIJDKP
 
SCCAI- A Student Career Counselling Artificial Intelligence
SCCAI- A Student Career Counselling Artificial IntelligenceSCCAI- A Student Career Counselling Artificial Intelligence
SCCAI- A Student Career Counselling Artificial Intelligencevivatechijri
 
REPRESENTATION OF UNCERTAIN DATA USING POSSIBILISTIC NETWORK MODELS
REPRESENTATION OF UNCERTAIN DATA USING POSSIBILISTIC NETWORK MODELSREPRESENTATION OF UNCERTAIN DATA USING POSSIBILISTIC NETWORK MODELS
REPRESENTATION OF UNCERTAIN DATA USING POSSIBILISTIC NETWORK MODELScscpconf
 
A Survey of Modern Data Classification Techniques
A Survey of Modern Data Classification TechniquesA Survey of Modern Data Classification Techniques
A Survey of Modern Data Classification Techniquesijsrd.com
 
Data Imputation by Soft Computing
Data Imputation by Soft ComputingData Imputation by Soft Computing
Data Imputation by Soft Computingijtsrd
 
Grounded Theory
Grounded TheoryGrounded Theory
Grounded Theorylitdoc1999
 
ESTIMATING PROJECT DEVELOPMENT EFFORT USING CLUSTERED REGRESSION APPROACH
ESTIMATING PROJECT DEVELOPMENT EFFORT USING CLUSTERED REGRESSION APPROACHESTIMATING PROJECT DEVELOPMENT EFFORT USING CLUSTERED REGRESSION APPROACH
ESTIMATING PROJECT DEVELOPMENT EFFORT USING CLUSTERED REGRESSION APPROACHcscpconf
 
Estimating project development effort using clustered regression approach
Estimating project development effort using clustered regression approachEstimating project development effort using clustered regression approach
Estimating project development effort using clustered regression approachcsandit
 
PERFORMANCE ANALYSIS OF HYBRID FORECASTING MODEL IN STOCK MARKET FORECASTING
PERFORMANCE ANALYSIS OF HYBRID FORECASTING MODEL IN STOCK MARKET FORECASTINGPERFORMANCE ANALYSIS OF HYBRID FORECASTING MODEL IN STOCK MARKET FORECASTING
PERFORMANCE ANALYSIS OF HYBRID FORECASTING MODEL IN STOCK MARKET FORECASTINGIJMIT JOURNAL
 
A statistical data fusion technique in virtual data integration environment
A statistical data fusion technique in virtual data integration environmentA statistical data fusion technique in virtual data integration environment
A statistical data fusion technique in virtual data integration environmentIJDKP
 
Re-Mining Association Mining Results Through Visualization, Data Envelopment ...
Re-Mining Association Mining Results Through Visualization, Data Envelopment ...Re-Mining Association Mining Results Through Visualization, Data Envelopment ...
Re-Mining Association Mining Results Through Visualization, Data Envelopment ...ertekg
 
Performance Analysis of Genetic Algorithm as a Stochastic Optimization Tool i...
Performance Analysis of Genetic Algorithm as a Stochastic Optimization Tool i...Performance Analysis of Genetic Algorithm as a Stochastic Optimization Tool i...
Performance Analysis of Genetic Algorithm as a Stochastic Optimization Tool i...paperpublications3
 

What's hot (19)

Comparison of Cost Estimation Methods using Hybrid Artificial Intelligence on...
Comparison of Cost Estimation Methods using Hybrid Artificial Intelligence on...Comparison of Cost Estimation Methods using Hybrid Artificial Intelligence on...
Comparison of Cost Estimation Methods using Hybrid Artificial Intelligence on...
 
Towards reducing the
Towards reducing theTowards reducing the
Towards reducing the
 
50120130406007
5012013040600750120130406007
50120130406007
 
INTEGRATED ASSOCIATIVE CLASSIFICATION AND NEURAL NETWORK MODEL ENHANCED BY US...
INTEGRATED ASSOCIATIVE CLASSIFICATION AND NEURAL NETWORK MODEL ENHANCED BY US...INTEGRATED ASSOCIATIVE CLASSIFICATION AND NEURAL NETWORK MODEL ENHANCED BY US...
INTEGRATED ASSOCIATIVE CLASSIFICATION AND NEURAL NETWORK MODEL ENHANCED BY US...
 
APPLICATION OF ARTIFICIAL NEURAL NETWORKS IN ESTIMATING PARTICIPATION IN ELEC...
APPLICATION OF ARTIFICIAL NEURAL NETWORKS IN ESTIMATING PARTICIPATION IN ELEC...APPLICATION OF ARTIFICIAL NEURAL NETWORKS IN ESTIMATING PARTICIPATION IN ELEC...
APPLICATION OF ARTIFICIAL NEURAL NETWORKS IN ESTIMATING PARTICIPATION IN ELEC...
 
IRJET- A Survey of Text Document Clustering by using Clustering Techniques
IRJET- A Survey of Text Document Clustering by using Clustering TechniquesIRJET- A Survey of Text Document Clustering by using Clustering Techniques
IRJET- A Survey of Text Document Clustering by using Clustering Techniques
 
Effective data mining for proper
Effective data mining for properEffective data mining for proper
Effective data mining for proper
 
SCCAI- A Student Career Counselling Artificial Intelligence
SCCAI- A Student Career Counselling Artificial IntelligenceSCCAI- A Student Career Counselling Artificial Intelligence
SCCAI- A Student Career Counselling Artificial Intelligence
 
REPRESENTATION OF UNCERTAIN DATA USING POSSIBILISTIC NETWORK MODELS
REPRESENTATION OF UNCERTAIN DATA USING POSSIBILISTIC NETWORK MODELSREPRESENTATION OF UNCERTAIN DATA USING POSSIBILISTIC NETWORK MODELS
REPRESENTATION OF UNCERTAIN DATA USING POSSIBILISTIC NETWORK MODELS
 
A Survey of Modern Data Classification Techniques
A Survey of Modern Data Classification TechniquesA Survey of Modern Data Classification Techniques
A Survey of Modern Data Classification Techniques
 
40120130405012
4012013040501240120130405012
40120130405012
 
Data Imputation by Soft Computing
Data Imputation by Soft ComputingData Imputation by Soft Computing
Data Imputation by Soft Computing
 
Grounded Theory
Grounded TheoryGrounded Theory
Grounded Theory
 
ESTIMATING PROJECT DEVELOPMENT EFFORT USING CLUSTERED REGRESSION APPROACH
ESTIMATING PROJECT DEVELOPMENT EFFORT USING CLUSTERED REGRESSION APPROACHESTIMATING PROJECT DEVELOPMENT EFFORT USING CLUSTERED REGRESSION APPROACH
ESTIMATING PROJECT DEVELOPMENT EFFORT USING CLUSTERED REGRESSION APPROACH
 
Estimating project development effort using clustered regression approach
Estimating project development effort using clustered regression approachEstimating project development effort using clustered regression approach
Estimating project development effort using clustered regression approach
 
PERFORMANCE ANALYSIS OF HYBRID FORECASTING MODEL IN STOCK MARKET FORECASTING
PERFORMANCE ANALYSIS OF HYBRID FORECASTING MODEL IN STOCK MARKET FORECASTINGPERFORMANCE ANALYSIS OF HYBRID FORECASTING MODEL IN STOCK MARKET FORECASTING
PERFORMANCE ANALYSIS OF HYBRID FORECASTING MODEL IN STOCK MARKET FORECASTING
 
A statistical data fusion technique in virtual data integration environment
A statistical data fusion technique in virtual data integration environmentA statistical data fusion technique in virtual data integration environment
A statistical data fusion technique in virtual data integration environment
 
Re-Mining Association Mining Results Through Visualization, Data Envelopment ...
Re-Mining Association Mining Results Through Visualization, Data Envelopment ...Re-Mining Association Mining Results Through Visualization, Data Envelopment ...
Re-Mining Association Mining Results Through Visualization, Data Envelopment ...
 
Performance Analysis of Genetic Algorithm as a Stochastic Optimization Tool i...
Performance Analysis of Genetic Algorithm as a Stochastic Optimization Tool i...Performance Analysis of Genetic Algorithm as a Stochastic Optimization Tool i...
Performance Analysis of Genetic Algorithm as a Stochastic Optimization Tool i...
 

Similar to Kmeans

Ensemble based Distributed K-Modes Clustering
Ensemble based Distributed K-Modes ClusteringEnsemble based Distributed K-Modes Clustering
Ensemble based Distributed K-Modes ClusteringIJERD Editor
 
LEARNING MIXTURES OF MARKOV CHAINS FROM AGGREGATE DATA WITH STRUCTURAL CONSTR...
LEARNING MIXTURES OF MARKOV CHAINS FROM AGGREGATE DATA WITH STRUCTURAL CONSTR...LEARNING MIXTURES OF MARKOV CHAINS FROM AGGREGATE DATA WITH STRUCTURAL CONSTR...
LEARNING MIXTURES OF MARKOV CHAINS FROM AGGREGATE DATA WITH STRUCTURAL CONSTR...Nexgen Technology
 
Data Clustering Using Swarm Intelligence Algorithms An Overview
Data Clustering Using  Swarm Intelligence Algorithms  An OverviewData Clustering Using  Swarm Intelligence Algorithms  An Overview
Data Clustering Using Swarm Intelligence Algorithms An OverviewAboul Ella Hassanien
 
A Comparative Study Of Various Clustering Algorithms In Data Mining
A Comparative Study Of Various Clustering Algorithms In Data MiningA Comparative Study Of Various Clustering Algorithms In Data Mining
A Comparative Study Of Various Clustering Algorithms In Data MiningNatasha Grant
 
For iiii year students of cse ML-UNIT-V.pptx
For iiii year students of cse ML-UNIT-V.pptxFor iiii year students of cse ML-UNIT-V.pptx
For iiii year students of cse ML-UNIT-V.pptxSureshPolisetty2
 
IRJET- Improved Model for Big Data Analytics using Dynamic Multi-Swarm Op...
IRJET-  	  Improved Model for Big Data Analytics using Dynamic Multi-Swarm Op...IRJET-  	  Improved Model for Big Data Analytics using Dynamic Multi-Swarm Op...
IRJET- Improved Model for Big Data Analytics using Dynamic Multi-Swarm Op...IRJET Journal
 
Student Performance Evaluation in Education Sector Using Prediction and Clust...
Student Performance Evaluation in Education Sector Using Prediction and Clust...Student Performance Evaluation in Education Sector Using Prediction and Clust...
Student Performance Evaluation in Education Sector Using Prediction and Clust...IJSRD
 
Study and Analysis of K-Means Clustering Algorithm Using Rapidminer
Study and Analysis of K-Means Clustering Algorithm Using RapidminerStudy and Analysis of K-Means Clustering Algorithm Using Rapidminer
Study and Analysis of K-Means Clustering Algorithm Using RapidminerIJERA Editor
 
Textual Data Partitioning with Relationship and Discriminative Analysis
Textual Data Partitioning with Relationship and Discriminative AnalysisTextual Data Partitioning with Relationship and Discriminative Analysis
Textual Data Partitioning with Relationship and Discriminative AnalysisEditor IJMTER
 
K-NN Classifier Performs Better Than K-Means Clustering in Missing Value Imp...
K-NN Classifier Performs Better Than K-Means Clustering in  Missing Value Imp...K-NN Classifier Performs Better Than K-Means Clustering in  Missing Value Imp...
K-NN Classifier Performs Better Than K-Means Clustering in Missing Value Imp...IOSR Journals
 
Exploratory_Analysis_of_Data_ppt.pdf
Exploratory_Analysis_of_Data_ppt.pdfExploratory_Analysis_of_Data_ppt.pdf
Exploratory_Analysis_of_Data_ppt.pdfRushikeshKulkarni71
 
Applications Of Clustering Techniques In Data Mining A Comparative Study
Applications Of Clustering Techniques In Data Mining  A Comparative StudyApplications Of Clustering Techniques In Data Mining  A Comparative Study
Applications Of Clustering Techniques In Data Mining A Comparative StudyFiona Phillips
 
Clustering heterogeneous categorical data using enhanced mini batch K-means ...
Clustering heterogeneous categorical data using enhanced mini  batch K-means ...Clustering heterogeneous categorical data using enhanced mini  batch K-means ...
Clustering heterogeneous categorical data using enhanced mini batch K-means ...IJECEIAES
 
factorization methods
factorization methodsfactorization methods
factorization methodsShaina Raza
 
Distribution Similarity based Data Partition and Nearest Neighbor Search on U...
Distribution Similarity based Data Partition and Nearest Neighbor Search on U...Distribution Similarity based Data Partition and Nearest Neighbor Search on U...
Distribution Similarity based Data Partition and Nearest Neighbor Search on U...Editor IJMTER
 
Knowledge Management in the AI Driven Scintific System
Knowledge Management in the AI Driven Scintific SystemKnowledge Management in the AI Driven Scintific System
Knowledge Management in the AI Driven Scintific SystemSubhasis Dasgupta
 

Similar to Kmeans (20)

Ensemble based Distributed K-Modes Clustering
Ensemble based Distributed K-Modes ClusteringEnsemble based Distributed K-Modes Clustering
Ensemble based Distributed K-Modes Clustering
 
I017235662
I017235662I017235662
I017235662
 
LEARNING MIXTURES OF MARKOV CHAINS FROM AGGREGATE DATA WITH STRUCTURAL CONSTR...
LEARNING MIXTURES OF MARKOV CHAINS FROM AGGREGATE DATA WITH STRUCTURAL CONSTR...LEARNING MIXTURES OF MARKOV CHAINS FROM AGGREGATE DATA WITH STRUCTURAL CONSTR...
LEARNING MIXTURES OF MARKOV CHAINS FROM AGGREGATE DATA WITH STRUCTURAL CONSTR...
 
Data Clustering Using Swarm Intelligence Algorithms An Overview
Data Clustering Using  Swarm Intelligence Algorithms  An OverviewData Clustering Using  Swarm Intelligence Algorithms  An Overview
Data Clustering Using Swarm Intelligence Algorithms An Overview
 
A Comparative Study Of Various Clustering Algorithms In Data Mining
A Comparative Study Of Various Clustering Algorithms In Data MiningA Comparative Study Of Various Clustering Algorithms In Data Mining
A Comparative Study Of Various Clustering Algorithms In Data Mining
 
For iiii year students of cse ML-UNIT-V.pptx
For iiii year students of cse ML-UNIT-V.pptxFor iiii year students of cse ML-UNIT-V.pptx
For iiii year students of cse ML-UNIT-V.pptx
 
IRJET- Improved Model for Big Data Analytics using Dynamic Multi-Swarm Op...
IRJET-  	  Improved Model for Big Data Analytics using Dynamic Multi-Swarm Op...IRJET-  	  Improved Model for Big Data Analytics using Dynamic Multi-Swarm Op...
IRJET- Improved Model for Big Data Analytics using Dynamic Multi-Swarm Op...
 
Student Performance Evaluation in Education Sector Using Prediction and Clust...
Student Performance Evaluation in Education Sector Using Prediction and Clust...Student Performance Evaluation in Education Sector Using Prediction and Clust...
Student Performance Evaluation in Education Sector Using Prediction and Clust...
 
Study and Analysis of K-Means Clustering Algorithm Using Rapidminer
Study and Analysis of K-Means Clustering Algorithm Using RapidminerStudy and Analysis of K-Means Clustering Algorithm Using Rapidminer
Study and Analysis of K-Means Clustering Algorithm Using Rapidminer
 
Textual Data Partitioning with Relationship and Discriminative Analysis
Textual Data Partitioning with Relationship and Discriminative AnalysisTextual Data Partitioning with Relationship and Discriminative Analysis
Textual Data Partitioning with Relationship and Discriminative Analysis
 
K-NN Classifier Performs Better Than K-Means Clustering in Missing Value Imp...
K-NN Classifier Performs Better Than K-Means Clustering in  Missing Value Imp...K-NN Classifier Performs Better Than K-Means Clustering in  Missing Value Imp...
K-NN Classifier Performs Better Than K-Means Clustering in Missing Value Imp...
 
Exploratory_Analysis_of_Data_ppt.pdf
Exploratory_Analysis_of_Data_ppt.pdfExploratory_Analysis_of_Data_ppt.pdf
Exploratory_Analysis_of_Data_ppt.pdf
 
Applications Of Clustering Techniques In Data Mining A Comparative Study
Applications Of Clustering Techniques In Data Mining  A Comparative StudyApplications Of Clustering Techniques In Data Mining  A Comparative Study
Applications Of Clustering Techniques In Data Mining A Comparative Study
 
Clustering heterogeneous categorical data using enhanced mini batch K-means ...
Clustering heterogeneous categorical data using enhanced mini  batch K-means ...Clustering heterogeneous categorical data using enhanced mini  batch K-means ...
Clustering heterogeneous categorical data using enhanced mini batch K-means ...
 
T0 numtq0n tk=
T0 numtq0n tk=T0 numtq0n tk=
T0 numtq0n tk=
 
factorization methods
factorization methodsfactorization methods
factorization methods
 
Disseration_ppt
Disseration_pptDisseration_ppt
Disseration_ppt
 
Distribution Similarity based Data Partition and Nearest Neighbor Search on U...
Distribution Similarity based Data Partition and Nearest Neighbor Search on U...Distribution Similarity based Data Partition and Nearest Neighbor Search on U...
Distribution Similarity based Data Partition and Nearest Neighbor Search on U...
 
Knowledge Management in the AI Driven Scintific System
Knowledge Management in the AI Driven Scintific SystemKnowledge Management in the AI Driven Scintific System
Knowledge Management in the AI Driven Scintific System
 
Nguyen - Science of Information, Computation and Fusion - Spring Review 2013
Nguyen - Science of Information, Computation and Fusion - Spring Review 2013Nguyen - Science of Information, Computation and Fusion - Spring Review 2013
Nguyen - Science of Information, Computation and Fusion - Spring Review 2013
 

More from Gaurav Handa

Gaurav handa - Big Data and Hadoop
Gaurav handa - Big Data and HadoopGaurav handa - Big Data and Hadoop
Gaurav handa - Big Data and HadoopGaurav Handa
 
Gaurav handa - Business Analytics
Gaurav handa - Business AnalyticsGaurav handa - Business Analytics
Gaurav handa - Business AnalyticsGaurav Handa
 
Gaurav handa - Data Visualization
Gaurav handa - Data VisualizationGaurav handa - Data Visualization
Gaurav handa - Data VisualizationGaurav Handa
 
A comparative study of hawk eye and goal line
A comparative study of hawk eye and goal lineA comparative study of hawk eye and goal line
A comparative study of hawk eye and goal lineGaurav Handa
 
Ijca paper template
Ijca paper templateIjca paper template
Ijca paper templateGaurav Handa
 

More from Gaurav Handa (9)

Gaurav handa - Big Data and Hadoop
Gaurav handa - Big Data and HadoopGaurav handa - Big Data and Hadoop
Gaurav handa - Big Data and Hadoop
 
Gaurav handa - Business Analytics
Gaurav handa - Business AnalyticsGaurav handa - Business Analytics
Gaurav handa - Business Analytics
 
Gaurav handa - Data Visualization
Gaurav handa - Data VisualizationGaurav handa - Data Visualization
Gaurav handa - Data Visualization
 
A comparative study of hawk eye and goal line
A comparative study of hawk eye and goal lineA comparative study of hawk eye and goal line
A comparative study of hawk eye and goal line
 
Ijca paper template
Ijca paper templateIjca paper template
Ijca paper template
 
K means report
K means reportK means report
K means report
 
B.E degree
B.E degreeB.E degree
B.E degree
 
Project ISR
Project ISRProject ISR
Project ISR
 
Project WeLike
Project WeLikeProject WeLike
Project WeLike
 

Recently uploaded

Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...amitlee9823
 
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Delhi Call girls
 
Probability Grade 10 Third Quarter Lessons
Probability Grade 10 Third Quarter LessonsProbability Grade 10 Third Quarter Lessons
Probability Grade 10 Third Quarter LessonsJoseMangaJr1
 
BabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptxBabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptxolyaivanovalion
 
Edukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFxEdukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFxolyaivanovalion
 
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...amitlee9823
 
Halmar dropshipping via API with DroFx
Halmar  dropshipping  via API with DroFxHalmar  dropshipping  via API with DroFx
Halmar dropshipping via API with DroFxolyaivanovalion
 
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...amitlee9823
 
Carero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxCarero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxolyaivanovalion
 
Midocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxMidocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxolyaivanovalion
 
ALSO dropshipping via API with DroFx.pptx
ALSO dropshipping via API with DroFx.pptxALSO dropshipping via API with DroFx.pptx
ALSO dropshipping via API with DroFx.pptxolyaivanovalion
 
Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...amitlee9823
 
CebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxCebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxolyaivanovalion
 
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night StandCall Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Standamitlee9823
 
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...amitlee9823
 
Capstone Project on IBM Data Analytics Program
Capstone Project on IBM Data Analytics ProgramCapstone Project on IBM Data Analytics Program
Capstone Project on IBM Data Analytics ProgramMoniSankarHazra
 
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...amitlee9823
 

Recently uploaded (20)

Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
 
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
 
Predicting Loan Approval: A Data Science Project
Predicting Loan Approval: A Data Science ProjectPredicting Loan Approval: A Data Science Project
Predicting Loan Approval: A Data Science Project
 
Probability Grade 10 Third Quarter Lessons
Probability Grade 10 Third Quarter LessonsProbability Grade 10 Third Quarter Lessons
Probability Grade 10 Third Quarter Lessons
 
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE

Kmeans

  • 1. Gaurav R. Handa, TE CMPN-A, 40. Under the guidance of Ms. Veena Kulkarni. Thakur College of Engineering and Technology, Shyamnarayan Marg, Thakur Village, Kandivli (E), Mumbai-101. Year 2013-2014.
  • 2. This presentation shows an implementation of the k-means algorithm. Along with a brief description of the algorithm, it provides graphs and worked arithmetic problems for better understanding. It also shows how the k-means algorithm can be implemented efficiently and discusses its drawbacks.
  • 3.
      • Business Intelligence is a more advanced form of Data Mining and Databases.
      • Business Intelligence enables a business to make intelligent, fact-based decisions.
      • It is divided into Association Analysis, Classification, Clustering and Regression.
      • Data clustering is a method that groups objects with similar characteristics into clusters.
      • Clustering is further divided into Hierarchical, Partitional and Density-based methods. K-means is a partitional clustering algorithm.
  • 4.
  • 5.
      • The knowledge discovery process of analyzing large volumes of data from various perspectives and organizing them into useful information.
      • The search for valuable information in large volumes of data, and the identification of hidden structures in that data.
  • 6. K-means is a centroid-based technique in which each cluster is represented by its centre (the centroid). The algorithm aims to minimize an objective function, specifically the sum of squared errors between each point and the centre of its cluster.
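The centroid-based procedure and its squared-error objective can be sketched as follows. This is a minimal illustration and not the presentation's own implementation: the function names (`squared_distance`, `mean_point`, `k_means`), the choice of initial centroids (k random data points), and the sample points are all assumptions made for the example.

```python
import random

def squared_distance(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def mean_point(points):
    """Component-wise mean (centroid) of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))

def k_means(points, k, max_iter=100, seed=0):
    """Lloyd-style k-means; returns (centroids, labels, sse)."""
    rng = random.Random(seed)
    # Assumption: start from k distinct positions drawn from the data.
    centroids = rng.sample(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centroids are too
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = mean_point(members)
    # Objective: sum of squared errors to the assigned centroid.
    sse = sum(squared_distance(p, centroids[lab])
              for p, lab in zip(points, labels))
    return centroids, labels, sse

# Hypothetical sample data, chosen only to exercise the function.
points = [(1.0, 1.0), (1.5, 2.0), (3.0, 4.0), (5.0, 7.0),
          (3.5, 5.0), (4.5, 5.0), (3.5, 4.5)]
centroids, labels, sse = k_means(points, k=2)
print(centroids, labels, sse)
```

Each pass through the loop can only lower or keep the sum of squared errors, which is why the procedure stops once the assignments no longer change.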
  • 7.
  • 9.
  • 10.
  • 11. Papers on K-Means
      • “The Uniqueness of a Good Optimum for K-Means”, Marina Meila, Proceedings of the 23rd International Conference on Machine Learning, 2006. By augmenting k-means with a simple, randomized seeding technique, the authors obtained an algorithm that is O(log k)-competitive with the optimal clustering, which improves both speed and accuracy.
      • “The Effectiveness of Lloyd-Type Methods for the k-Means Problem”, Rafail Ostrovsky, Yuval Rabani, Leonard J. Schulman, and Chaitanya Swamy, SODA, 2007. Polynomial-time approximation schemes (PTASs) have been obtained for the k-means clustering problem.
      • “Improved Smoothed Analysis of the k-Means Method”, Bodo Manthey and Heiko Röglin, preprint, 2008. The paper notes that one of the distinguishing features of the k-means method is its speed in practice. Its worst-case running time, however, is exponential, leaving a gap between practical and theoretical performance. The paper aims to close this gap.
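A randomized seeding technique of the kind mentioned above is commonly known as k-means++ seeding: each new centre is drawn with probability proportional to its squared distance from the nearest centre already chosen. The sketch below is a minimal illustration under that assumption; the function name `seed_centroids` and the fallback for degenerate data are choices made for this example.

```python
import random

def squared_distance(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def seed_centroids(points, k, seed=0):
    """Pick k initial centres by squared-distance-weighted sampling."""
    rng = random.Random(seed)
    centroids = [rng.choice(points)]  # first centre: uniform at random
    while len(centroids) < k:
        # Squared distance from each point to its nearest chosen centre.
        d2 = [min(squared_distance(p, c) for c in centroids) for p in points]
        if sum(d2) == 0:
            # Every point coincides with a chosen centre; fall back to
            # a uniform pick so the sampling weights are never all zero.
            centroids.append(rng.choice(points))
            continue
        # Far-away points are more likely to become the next centre.
        centroids.append(rng.choices(points, weights=d2, k=1)[0])
    return centroids

# Hypothetical data: two tight groups far apart.
data = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (9.0, 9.0), (9.1, 9.0)]
print(seed_centroids(data, k=2))
```

The returned centres would then replace the random initialisation of an ordinary k-means run; the iteration itself is unchanged.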
  • 12. 1. Archaeology: The objective here is to cluster the locations of archaeological sites and to draw inferences about political history from the clusters. The clusters suggest hypotheses, which can then be tested by actually visiting the sites.
  • 13. 2. Computational Biology: Here, carp were exposed to different levels of cold, and genes were clustered based on their response in different tissues. Green indicates that a gene is under-expressed, whereas red indicates that it is over-expressed. The figure shows distinct patterns in the different tissues. Clustering is thus a useful tool for representing a large amount of information in a single plot.
  • 14.
  • 15. 3. Education: This example is taken from the paper “Teachers as Sources of Middle School Students’ Motivational Identity: Variable Centered and Person Centered Analytic Approaches”. In this paper, the survey results of 206 students are clustered. The clusters are used to identify groups that support an analysis of what affects motivation. The number of clusters was chosen so as to yield a useful hypothesis, which can then be verified.
  • 16.
      • K, the number of clusters, must be specified in advance.
      • Sensitive to noisy data and outliers (addressed by the K-Medoids algorithm).
      • Not suitable for discovering clusters with non-convex shapes.
      • Applicable only when a mean is defined (the K-Modes algorithm covers categorical data).
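The sensitivity to outliers can be seen by comparing the mean of a cluster with its medoid, the representative used by K-Medoids. The following is a small illustrative sketch; the helper names and the sample points are assumptions made for this example.

```python
import math

def mean_point(points):
    """Component-wise mean of a non-empty list of points (k-means centre)."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))

def medoid(points):
    """The data point with the smallest total distance to all the others."""
    return min(points, key=lambda c: sum(math.dist(c, p) for p in points))

# Hypothetical cluster, then the same cluster with one extreme outlier.
cluster = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (2.0, 2.0)]
with_outlier = cluster + [(100.0, 100.0)]

print("mean without outlier:", mean_point(cluster))
print("mean with outlier:   ", mean_point(with_outlier))
print("medoid with outlier: ", medoid(with_outlier))
```

A single distant point drags the mean far away from the original four points, whereas the medoid must be one of the data points and stays inside the original group.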
  • 17. The K-means algorithm is a simple yet popular method for cluster analysis. Its performance depends on the initialisation and on the choice of an appropriate distance measure. Several variants of K-means address its weaknesses:
      • K-Medoids: resistance to noise and outliers
      • K-Modes: extension to categorical data
      • CLARA: handling of large data sets
      • Mixture models (EM algorithm): handling uncertainty in cluster membership
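To illustrate why K-Modes suits categorical data, the sketch below swaps the two ingredients that need a numeric mean: the cluster centre becomes the per-attribute mode, and the distance becomes a count of mismatching attributes. It is a simplified illustration only; the function names and the sample records are assumptions, and a full K-Modes run would alternate these steps in the same way as k-means.

```python
from collections import Counter

def mismatch_distance(a, b):
    """Number of attributes on which two categorical records differ."""
    return sum(x != y for x, y in zip(a, b))

def mode_record(records):
    """Cluster centre for categorical data: the most common value per attribute."""
    return tuple(Counter(column).most_common(1)[0][0]
                 for column in zip(*records))

def nearest_mode(record, modes):
    """Index of the mode that differs from the record in the fewest attributes."""
    return min(range(len(modes)),
               key=lambda j: mismatch_distance(record, modes[j]))

# Hypothetical categorical records: (colour, size).
records = [("red", "small"), ("red", "large"),
           ("blue", "small"), ("red", "small")]
centre = mode_record(records)
print(centre)
print(nearest_mode(("blue", "large"), [centre, ("blue", "large")]))
```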