NAME- SOUMA MAITI
ROLL NO.- 27500120016
REG NO.- 202750100110016
DEPARTMENT- COMPUTER SCIENCE
YEAR- 3RD YEAR(6TH SEMESTER)
SUBJECT NAME- DATA WAREHOUSING
AND DATA MINING
SUBJECT CODE- PEC-IT602B
K-MEANS CLUSTERING ALGORITHM
• THE K-MEANS CLUSTERING ALGORITHM COMPUTES
CENTROIDS AND ITERATES UNTIL THEY STOP
CHANGING, WHICH GIVES A LOCALLY OPTIMAL SET
OF CENTROIDS. IT ASSUMES THAT THE NUMBER OF
CLUSTERS IS ALREADY KNOWN. IT IS ALSO CALLED
A FLAT CLUSTERING ALGORITHM. THE NUMBER OF
CLUSTERS THE ALGORITHM IDENTIFIES IN THE
DATA IS REPRESENTED BY ‘K’ IN K-MEANS.
• IN THIS ALGORITHM, EACH DATA POINT IS
ASSIGNED TO A CLUSTER SO THAT THE SUM OF THE
SQUARED DISTANCES BETWEEN THE DATA POINTS
AND THEIR CLUSTER CENTROIDS IS MINIMIZED.
LESS VARIATION WITHIN A CLUSTER MEANS THE
DATA POINTS IN IT ARE MORE SIMILAR.
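THE QUANTITY BEING MINIMIZED CAN BE COMPUTED DIRECTLY. THE SKETCH BELOW IS ONLY AN ILLUSTRATION: IT ASSUMES SQUARED EUCLIDEAN DISTANCE, AND THE FUNCTION NAMES AND THE FOUR SAMPLE POINTS ARE MADE UP FOR THE EXAMPLE, NOT TAKEN FROM THE SLIDES. IT SHOWS THAT A SENSIBLE ASSIGNMENT GIVES A MUCH SMALLER SUM THAN A POOR ONE.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def within_cluster_ss(points, labels, centroids):
    """Sum of squared distances from each point to its assigned centroid."""
    return sum(squared_distance(p, centroids[k]) for p, k in zip(points, labels))


# Hypothetical data: two tight groups and one centroid per group.
points = [(1.0, 1.0), (1.0, 2.0), (8.0, 8.0), (9.0, 8.0)]
centroids = [(1.0, 1.5), (8.5, 8.0)]

good = within_cluster_ss(points, [0, 0, 1, 1], centroids)  # nearest centroid
bad = within_cluster_ss(points, [0, 1, 0, 1], centroids)   # groups mixed up
print(good, bad)  # prints: 1.0 184.0
```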
HOW DOES THE K-MEANS ALGORITHM
WORK?
THE WORKING OF THE K-MEANS ALGORITHM IS EXPLAINED IN
THE STEPS BELOW:
• STEP-1: SELECT THE NUMBER K TO DECIDE THE NUMBER
OF CLUSTERS.
• STEP-2: SELECT K RANDOM POINTS AS INITIAL CENTROIDS. (THEY
NEED NOT COME FROM THE INPUT DATASET.)
• STEP-3: ASSIGN EACH DATA POINT TO ITS CLOSEST
CENTROID, WHICH FORMS THE PREDEFINED K
CLUSTERS.
• STEP-4: CALCULATE THE VARIANCE AND PLACE A NEW
CENTROID FOR EACH CLUSTER (THE MEAN OF ITS POINTS).
• STEP-5: REPEAT THE THIRD STEP, THAT IS,
REASSIGN EACH DATA POINT TO THE NEW CLOSEST
CENTROID.
• STEP-6: IF ANY REASSIGNMENT OCCURS, THEN GO TO
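THE STEPS ABOVE CAN BE SKETCHED IN PYTHON AS FOLLOWS. THIS IS A MINIMAL ILLUSTRATION, NOT A REFERENCE IMPLEMENTATION. ASSUMPTIONS NOT STATED IN THE SLIDES: INITIAL CENTROIDS ARE DRAWN FROM THE INPUT POINTS, DISTANCE IS SQUARED EUCLIDEAN, AN EMPTY CLUSTER KEEPS ITS OLD CENTROID, AND, BECAUSE STEP-6 IS CUT OFF IN THE SOURCE, THE LOOP IS ASSUMED TO RETURN TO THE CENTROID UPDATE UNTIL NO POINT CHANGES CLUSTER. ALL FUNCTION AND PARAMETER NAMES ARE HYPOTHETICAL.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean_point(cluster):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(cluster)
    return tuple(sum(coords) / n for coords in zip(*cluster))


def k_means(points, k, seed=0, max_iter=100):
    """Return (labels, centroids) for a list of equal-length tuples."""
    rng = random.Random(seed)
    # Step 2 (assumption): pick K distinct input points as starting centroids.
    centroids = rng.sample(points, k)
    labels = None
    for _ in range(max_iter):
        # Steps 3 and 5: assign every point to its closest centroid.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        # Step 6 (assumed reading): stop once no point changes cluster.
        if new_labels == labels:
            break
        labels = new_labels
        # Step 4: move each centroid to the mean of its cluster.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = mean_point(members)
    return labels, centroids


# Hypothetical data: two well-separated groups of three points each.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5),
          (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
labels, centroids = k_means(points, k=2)
print(labels)
print(sorted(centroids))
```

DIFFERENT SEEDS CAN GIVE DIFFERENT STARTING CENTROIDS, SO ON HARDER DATA THE RESULT MAY DEPEND ON THE INITIALIZATION.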
ADVANTAGES OF K-MEANS CLUSTERING
ALGORITHM:
THE FOLLOWING ARE SOME ADVANTAGES OF THE
K-MEANS CLUSTERING ALGORITHM:
• IT IS VERY EASY TO UNDERSTAND AND IMPLEMENT.
• WITH A LARGE NUMBER OF VARIABLES, K-MEANS
IS USUALLY FASTER THAN HIERARCHICAL
CLUSTERING.
• WHEN CENTROIDS ARE RECOMPUTED, AN
INSTANCE CAN MOVE TO A DIFFERENT CLUSTER.
• K-MEANS TENDS TO FORM TIGHTER CLUSTERS
THAN HIERARCHICAL CLUSTERING.
APPLICATIONS OF K-MEANS CLUSTERING
ALGORITHM:
THE MAIN GOALS OF CLUSTER ANALYSIS ARE:
• TO GET A MEANINGFUL INTUITION FROM THE DATA
WE ARE WORKING WITH.
• CLUSTER-THEN-PREDICT, WHERE DIFFERENT MODELS
ARE BUILT FOR DIFFERENT SUBGROUPS.
K-MEANS CLUSTERING PERFORMS WELL ENOUGH TO
MEET THESE GOALS. IT CAN BE USED IN THE
FOLLOWING APPLICATIONS:
• MARKET SEGMENTATION
• DOCUMENT CLUSTERING
• IMAGE SEGMENTATION
• IMAGE COMPRESSION
• CUSTOMER SEGMENTATION
• ANALYZING THE TREND ON DYNAMIC DATA
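AS ONE ILLUSTRATION OF THE IMAGE COMPRESSION USE CASE, THE TOY SKETCH BELOW REDUCES A LIST OF GREY LEVELS TO K REPRESENTATIVE VALUES WITH ONE-DIMENSIONAL K-MEANS. EVERYTHING HERE IS AN ASSUMPTION MADE FOR THE EXAMPLE: THE FLAT LIST STANDING IN FOR AN IMAGE, THE FUNCTION NAME, THE EVENLY SPACED STARTING CENTROIDS, AND THE SAMPLE PIXEL VALUES. A REAL PIPELINE WOULD WORK ON FULL COLOR VECTORS AND STORE ONLY THE K VALUES PLUS ONE SMALL INDEX PER PIXEL.

```python
def quantize_grey_levels(pixels, k, max_iter=50):
    """Replace each grey level with the mean of its cluster (1-D K-means)."""
    levels = sorted(set(pixels))
    # Assumption: start from K grey levels spread evenly over the sorted values.
    centroids = [
        float(levels[i * (len(levels) - 1) // max(k - 1, 1)]) for i in range(k)
    ]
    labels = None
    for _ in range(max_iter):
        # Assign each pixel to the nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: (v - centroids[j]) ** 2) for v in pixels
        ]
        if new_labels == labels:  # no pixel changed cluster
            break
        labels = new_labels
        # Move each centroid to the mean grey level of its cluster.
        for j in range(k):
            members = [v for v, lab in zip(pixels, labels) if lab == j]
            if members:
                centroids[j] = sum(members) / len(members)
    return [round(centroids[lab]) for lab in labels], centroids


# Hypothetical 8-pixel "image" with dark, mid and bright regions.
pixels = [10, 12, 11, 200, 205, 198, 90, 96]
compressed, palette = quantize_grey_levels(pixels, k=3)
print(compressed)  # prints: [11, 11, 11, 201, 201, 201, 93, 93]
```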