Confidential. © Stream Intelligence Ltd. All rights reserved.
Introduction to Clustering
Agenda
1 Introduction: Business Case
2 Clustering
3 Hierarchical Clustering
4 K-means Clustering
Business Case
1
Business Case – Predicting Successful Music Production
[Figure: songs grouped into four clusters, music A, music B, music C, and music D]
• The target is to appear in Billboard’s weekly Top 40
• The cost per single can be up to 300K USD
• Music Intelligence Solution uses clustering to predict whether a song will be
accepted by the market
• This increases the success rate from 1 out of 10 to 8 out of 10
Clustering
2
Statistical Learning Categorization
Statistical Learning
• Unsupervised Learning: Clustering
• Supervised Learning: Predictive Model
Clustering
• The process of grouping a set of physical or abstract objects into clusters
(examples: customers, products, etc.)
• A cluster is a collection of data objects that are similar to one another within the same
cluster and dissimilar to the objects in other clusters
• Similarity is calculated from the distance between points
• A common distance measure is the Euclidean distance
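As a small illustration of the distance measure named above, the sketch below computes the Euclidean distance between two points by hand and with base R's dist(). The two customers and their features (age, visits) are made up for this example and are not part of the training data.

```r
# Two hypothetical customers described by two numeric features
a <- c(age = 30, visits = 4)
b <- c(age = 34, visits = 1)

# Euclidean distance by hand: square root of the sum of squared differences
manual <- sqrt(sum((a - b)^2))

# The same distance from base R's dist(), which defaults to method = "euclidean"
builtin <- as.numeric(dist(rbind(a, b)))

print(manual)   # 5
print(builtin)  # 5
```

The smaller the distance, the more similar the two objects are considered to be.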
Hierarchical Clustering
3
Hierarchical Clustering
• Start with each data point in its own cluster
• Repeatedly combine the two nearest clusters (for example, by the Euclidean distance between their centroids) until only one cluster remains
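The two steps above can be sketched with base R's dist() and hclust(). The toy points are invented for illustration, and average linkage is an assumption made here for simplicity; hclust() also offers other linkage methods, including "centroid".

```r
# Toy data (illustrative only): two tight groups of points on a plane
pts <- rbind(c(1, 1), c(1.2, 0.9), c(0.9, 1.1),
             c(5, 5), c(5.1, 4.8), c(4.9, 5.2))

# Pairwise Euclidean distances between all points
d <- dist(pts, method = "euclidean")

# Agglomerative clustering: every point starts alone, and the two
# nearest clusters are merged at each step until one cluster remains
hc <- hclust(d, method = "average")

# One row per merge: negative entries are single points, positive
# entries refer to clusters formed at an earlier step
print(hc$merge)

# Cut the tree into two groups
groups <- cutree(hc, k = 2)
print(groups)
```

With six points there are five merges, so hc$merge has five rows.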
Let's Practice
• The data for this exercise was downloaded from www.movielens.org
• Open “clustering_movie.R”
• The movies in the dataset are categorized as belonging to different genres:
a. Action
b. Comedy
c. Sci-Fi
d. etc.
Dendrogram
Heights represent the distance between the points or clusters being merged
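To make the meaning of the heights concrete, here is a minimal sketch on four invented points. Single linkage is an assumption chosen so that the merge heights are easy to verify by eye; the exercise script may use a different method.

```r
# Toy data (illustrative only): two pairs of points, 4 units apart
pts <- rbind(c(0, 0), c(0, 1), c(4, 0), c(4, 1))
hc <- hclust(dist(pts), method = "single")

# Merge heights, in the order the merges happened
print(hc$height)

# Draw the dendrogram; the y-axis shows the merge heights
plot(hc, main = "Dendrogram", xlab = "", sub = "")

# Cutting at a height between the low and high merges yields two clusters
groups <- cutree(hc, h = 2)
print(groups)
```

A large jump between consecutive heights suggests a natural place to cut the tree.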
Finding Meaningful Clusters
• How can you see which cluster has the most action movies?
Use this command:
tapply(movies$Action, clusterGroups, mean)
• Exercise: Can you find the characteristics of each cluster?
Hint:
- Add the cluster as one of the variables in the data
- Load the dplyr library
- Use the aggregate and summarise functions
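One possible way to approach the hints, sketched in base R only. The small movies data frame and the clusterGroups vector below are hypothetical stand-ins for the objects created in the exercise script, and the genre columns are assumptions.

```r
# Hypothetical stand-in for the exercise data: 1 = movie has the genre
movies <- data.frame(
  Action = c(1, 1, 0, 0, 1, 0),
  Comedy = c(0, 0, 1, 1, 0, 1),
  SciFi  = c(1, 0, 0, 0, 1, 0)
)
# Hypothetical cluster labels, e.g. as returned by cutree()
clusterGroups <- c(1, 1, 2, 2, 1, 2)

# Share of action movies in each cluster (same call as on the slide)
print(tapply(movies$Action, clusterGroups, mean))

# Add the cluster as a variable, then average every genre per cluster
movies$cluster <- clusterGroups
profile <- aggregate(. ~ cluster, data = movies, FUN = mean)
print(profile)
```

Each row of profile describes one cluster. With dplyr loaded, a similar table can be produced with group_by() followed by summarise().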
Common Scenario
Tips:
- Normalize the data; otherwise a variable on a much larger scale (here, Revenue) dominates the distance calculation
Movie   Action   Romance   Rating   Revenue (in USD)
A       1        1         5        200
B       0        1         4        150
C       0        0         3        50
D       1        1         4        120
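A minimal sketch of the tip, using the four movies from the table. It assumes that "normalize" means z-score standardization with base R's scale(); other rescaling schemes (for example, min-max) are equally possible.

```r
# The four movies from the table
movies <- data.frame(
  Action  = c(1, 0, 0, 1),
  Romance = c(1, 1, 0, 1),
  Rating  = c(5, 4, 3, 4),
  Revenue = c(200, 150, 50, 120),
  row.names = c("A", "B", "C", "D")
)

# Raw distances are driven almost entirely by Revenue
print(round(dist(movies), 2))

# scale() centers each column to mean 0 and rescales it to standard deviation 1
movies_scaled <- scale(movies)
print(round(dist(movies_scaled), 2))
```

After scaling, every variable contributes on a comparable scale to the distance between two movies.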
K-means Clustering
4
K-Means Clustering
Objective: High-Level Description
1. Group the data into k clusters by:
a. Determining the k centroids
b. Assigning each data point to the nearest centroid
2. The algorithm works by iterating between these two stages until the cluster assignments converge
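The two alternating stages described above can be sketched from scratch as follows. This is an illustrative implementation only, not the code used in the exercise; simple_kmeans and its parameters are hypothetical names, and the toy data is randomly generated. In practice, base R's kmeans() should be preferred.

```r
# Minimal k-means sketch (illustrative; use stats::kmeans() in practice)
simple_kmeans <- function(x, k, max_iter = 100) {
  x <- as.matrix(x)
  # Start from k randomly chosen data points as centroids
  centroids <- x[sample(nrow(x), k), , drop = FALSE]
  assignment <- rep(0L, nrow(x))
  for (iter in seq_len(max_iter)) {
    # Stage 1: assign each point to its nearest centroid
    d <- sapply(seq_len(k), function(j) {
      rowSums((x - matrix(centroids[j, ], nrow(x), ncol(x), byrow = TRUE))^2)
    })
    new_assignment <- max.col(-d, ties.method = "first")
    # Stop when no point changes cluster
    if (all(new_assignment == assignment)) break
    assignment <- new_assignment
    # Stage 2: move each centroid to the mean of its assigned points
    for (j in seq_len(k)) {
      members <- x[assignment == j, , drop = FALSE]
      if (nrow(members) > 0) centroids[j, ] <- colMeans(members)
    }
  }
  list(cluster = assignment, centers = centroids)
}

# Toy data (illustrative only): two well-separated groups of 20 points
set.seed(1)
x <- rbind(matrix(rnorm(40, mean = 0, sd = 0.3), ncol = 2),
           matrix(rnorm(40, mean = 5, sd = 0.3), ncol = 2))
fit <- simple_kmeans(x, k = 2)
print(table(fit$cluster))
```

The loop stops as soon as an assignment pass leaves every point in its current cluster, which is one common convergence criterion.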
K-Means Illustrations
Suppose k = 3.
1. Start with random positions of centroids (iteration 0).
2. Assign each data point to the closest centroid.
3. Move each centroid to the center of its assigned points (recalculating the centroids C).
4. Iterate until the cost reaches a minimum.
What can potentially go wrong?
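One well-known answer to this question is that the result depends on the random starting centroids, so a single run can stop at a poor local minimum. The sketch below, on invented data, compares one random start with 25 random starts using the nstart argument of base R's kmeans(); the specific seeds and group means are arbitrary choices for illustration.

```r
set.seed(42)
# Toy data (illustrative only): three well-separated groups
x <- rbind(matrix(rnorm(60, mean = 0), ncol = 2),
           matrix(rnorm(60, mean = 6), ncol = 2),
           matrix(rnorm(60, mean = 12), ncol = 2))

# One random start: the result depends on where the centroids begin
set.seed(7)
single <- kmeans(x, centers = 3, nstart = 1)

# 25 random starts: kmeans() keeps the run with the lowest total
# within-cluster sum of squares
set.seed(7)
multi <- kmeans(x, centers = 3, nstart = 25)

print(single$tot.withinss)
print(multi$tot.withinss)
```

Using several starts costs more computation but makes the final clustering less sensitive to an unlucky initialization.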
Optimum Number of Clusters
[Figure: TSS plotted against K]
TSS = total sum of squared errors
K = number of clusters
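A plot of this kind can be sketched as follows. The data is randomly generated for illustration, and the sketch assumes that TSS corresponds to tot.withinss (the total within-cluster sum of squares) returned by base R's kmeans().

```r
set.seed(123)
# Toy data (illustrative only): three groups, so the bend should be near k = 3
x <- rbind(matrix(rnorm(100, mean = 0), ncol = 2),
           matrix(rnorm(100, mean = 5), ncol = 2),
           matrix(rnorm(100, mean = 10), ncol = 2))

# Total within-cluster sum of squares for k = 1, ..., 8
ks <- 1:8
wss <- sapply(ks, function(k) kmeans(x, centers = k, nstart = 20)$tot.withinss)

# Look for the k after which the curve flattens out
plot(ks, wss, type = "b",
     xlab = "K (number of clusters)",
     ylab = "Total within-cluster sum of squares")
print(round(wss))
```

The error always tends to shrink as K grows, so the usual heuristic is to pick the K at the bend of the curve rather than the K with the smallest error.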
Let's Practice
• We will use the credit card profile data (cc-profile.csv)
• Open “segmenting_customer.R”
Exercise:
• What is the optimum number of clusters?
• Please describe the characteristics of each segment. Do you think they are meaningful?

Customer Segmentation using Clustering
