MIXED NUMERIC AND CATEGORICAL
ATTRIBUTE CLUSTERING ALGORITHM MODELING
DR. ASOKA KORALE, C.ENG. MIET & MIESL
ADVANTAGES TO NUMERIC AND CATEGORICAL ATTRIBUTE CLUSTERING
Slide | 2
Improved targeting in campaigns and insight into segments
Clustering is currently done on numeric variables only: Age, Net Stay, ARPU
Primary attributes that can be included with mixed attribute type clustering: Account Type, Gender, Geo Location, ……
The Fuzzy C-Means algorithm is currently used for clustering
Digital advertising segmentations are increasingly based on clustering
Include other categorical attributes, depending on the interest segment, to create "micro segments"
WIDENING POTENTIAL INSIGHTS THROUGH CATEGORICAL CLUSTERING
Slide | 3
Improved targeting in campaigns
All attributes can be clustered, leading to a very specific and wider array of segments
Geographic attribute clustering to incorporate Income/ARPU hotspots at micro level
CONCEPT UNDERLYING THE MIXED K PROTOTYPES ALGORITHM [1]
Slide | 4
Points "d" and "c" may switch sides depending on how similar their numeric and categorical parts are to the numeric and categorical parts of the centroid (prototype).
The influence of the numeric and categorical attributes of a data point can be controlled via a parameter "gamma".
Point "a" may switch if its categorical part is closer to the categorical part of the centroid (prototype) than its numeric part is to the numeric part of the centroid.
The numeric and categorical parts of a data point can be considered separately; two sets of centroids act as attractors, one for each attribute type, in each cluster.
[Figure: scatter plot of Numeric Attribute 1 against Numeric Attribute 2; shapes represent two values of a single categorical variable]
[1] Huang, CSIRO, Australia
MIXED K PROTOTYPES ALGORITHM [1]
Slide | 5
The distance measure to a prototype (center) has two parts: numeric and categorical
Numeric attributes: Euclidean distance
Categorical attributes: dissimilarity measure
Centroid of numeric attributes: a simple average of the points in that cluster
Includes "Yij", a fuzzy membership function, if we wish to go in that direction
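The two-part distance on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the implementation behind the deck: it assumes a squared Euclidean distance for the numeric part and a simple matching count (number of mismatched attribute values) for the categorical part, weighted by gamma. The function name mixed_distance and the sample values are hypothetical.

```python
import numpy as np

def mixed_distance(x_num, x_cat, proto_num, proto_cat, gamma=1.0):
    """Distance from one data point to one prototype (illustrative sketch)."""
    diff = np.asarray(x_num, dtype=float) - np.asarray(proto_num, dtype=float)
    # Numeric part: squared Euclidean distance to the numeric centroid (assumption).
    numeric = float(np.sum(diff ** 2))
    # Categorical part: simple matching dissimilarity, i.e. count of mismatches (assumption).
    categorical = sum(a != b for a, b in zip(x_cat, proto_cat))
    # gamma weights the categorical contribution against the numeric one.
    return numeric + gamma * categorical

# Hypothetical point and prototype: numeric parts differ, one of two categories differs.
point_num, point_cat = [1.0, 2.0], ["prepaid", "female"]
proto_num, proto_cat = [0.0, 0.0], ["prepaid", "male"]

d_low = mixed_distance(point_num, point_cat, proto_num, proto_cat, gamma=0.5)
d_high = mixed_distance(point_num, point_cat, proto_num, proto_cat, gamma=4.0)
print(d_low, d_high)
```

With one mismatched categorical value, raising gamma from 0.5 to 4.0 moves the distance from 5.5 to 9.0, which is how gamma shifts influence toward the categorical part.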
MIXED K PROTOTYPES ALGORITHM [1]
Slide | 6
Minimize the total cost "E", which is the sum of the distances to the numeric and categorical parts of the centroid (prototype)
Centroid of categorical attributes: determined by the highest-frequency attribute value in each cluster
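As a rough sketch of the prototype update and of one cluster's contribution to the cost "E": the numeric centroid is the mean of the members and the categorical centroid is the most frequent value per attribute. The helper names update_prototype and cluster_cost, the squared Euclidean numeric term, the mismatch count and the sample rows are assumptions made for illustration only.

```python
from collections import Counter
import numpy as np

def update_prototype(num_rows, cat_rows):
    """Prototype of one cluster: numeric mean plus per-attribute mode."""
    proto_num = np.mean(np.asarray(num_rows, dtype=float), axis=0)
    # zip(*cat_rows) walks the categorical attributes column by column.
    proto_cat = [Counter(col).most_common(1)[0][0] for col in zip(*cat_rows)]
    return proto_num, proto_cat

def cluster_cost(num_rows, cat_rows, proto_num, proto_cat, gamma=1.0):
    """Contribution of one cluster to the total cost E (sketch)."""
    num_cost = float(np.sum((np.asarray(num_rows, dtype=float) - proto_num) ** 2))
    cat_cost = sum(a != b for row in cat_rows for a, b in zip(row, proto_cat))
    return num_cost + gamma * cat_cost

# Hypothetical cluster members: (age, ARPU) and (account type, gender).
num_rows = [[27.0, 1200.0], [29.0, 1400.0], [28.0, 1300.0]]
cat_rows = [["prepaid", "female"], ["prepaid", "female"], ["postpaid", "female"]]

proto_num, proto_cat = update_prototype(num_rows, cat_rows)
cost = cluster_cost(num_rows, cat_rows, proto_num, proto_cat, gamma=2.0)
print(proto_num, proto_cat, cost)
```

Summing cluster_cost over all clusters gives the total cost under these assumptions. Note that the unscaled ARPU column dominates the numeric term here, which is one reason to normalize numeric attributes before clustering.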
Slide | 7
CONVERGENCE PERFORMANCE
[Figure: three convergence plots against Iteration Number (0 to 40): total number of switches at each iteration; total distance at each iteration (axis scale x 10^4); total categorical distance at each iteration. The two distance plots carry a legend with entries 1 to 8.]
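The quantities plotted on this slide (switches and total distance per iteration) can be recorded inside the clustering loop itself. The sketch below is an assumed, simplified k-prototypes loop, not the code used to produce the plots: it starts from randomly chosen data points, uses squared Euclidean plus gamma times the mismatch count, and stops when no point switches cluster. The function name k_prototypes and the synthetic data are hypothetical.

```python
from collections import Counter
import numpy as np

def k_prototypes(X_num, X_cat, k, gamma=1.0, max_iter=40, seed=0):
    """Sketch of a k-prototypes loop that records convergence diagnostics."""
    rng = np.random.default_rng(seed)
    X_num = np.asarray(X_num, dtype=float)
    X_cat = np.asarray(X_cat, dtype=str)
    n = len(X_num)
    # Assumption: prototypes start at k distinct, randomly chosen data points.
    idx = rng.choice(n, size=k, replace=False)
    P_num, P_cat = X_num[idx].copy(), X_cat[idx].copy()
    labels = np.full(n, -1)
    switches, costs = [], []
    for _ in range(max_iter):
        # Distance of every point to every prototype: numeric + gamma * mismatches.
        d_num = ((X_num[:, None, :] - P_num[None, :, :]) ** 2).sum(axis=2)
        d_cat = (X_cat[:, None, :] != P_cat[None, :, :]).sum(axis=2)
        dist = d_num + gamma * d_cat
        new_labels = dist.argmin(axis=1)
        switches.append(int((new_labels != labels).sum()))
        costs.append(float(dist[np.arange(n), new_labels].sum()))
        labels = new_labels
        if switches[-1] == 0:
            break
        for c in range(k):
            members = labels == c
            if members.any():  # keep the old prototype if a cluster empties
                P_num[c] = X_num[members].mean(axis=0)
                P_cat[c] = [Counter(col).most_common(1)[0][0]
                            for col in X_cat[members].T]
    return labels, switches, costs

# Synthetic, made-up data: two numeric blobs, each with its own category value.
data_rng = np.random.default_rng(1)
X_num = np.vstack([data_rng.normal(0, 1, (50, 2)), data_rng.normal(6, 1, (50, 2))])
X_cat = np.array([["prepaid"]] * 50 + [["postpaid"]] * 50)
labels, switches, costs = k_prototypes(X_num, X_cat, k=2, gamma=1.0)
print(switches, costs)
```

In this sketch the first entry of switches equals the number of points, because every point receives its first assignment in that pass, and the cost sequence should not increase from one iteration to the next.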
Slide | 8
CLUSTER & SEGMENT PROFILE
[Figure: four panels by Cluster/Segment ID (1 to 8): number of Cx in each cluster; Age; Net Stay; ARPU (axis scale x 10^4)]
Slide | 9
VALIDATION WITH DISTRIBUTION ANALYSIS
Cluster ID  Cx in Cluster  Avg. Age  Spread Age  Avg. Net-Stay  Spread Net-Stay  Avg. ARPU  Spread ARPU  Post Paid  Pre Paid  Female  Male
1           913            27        5           28             26               1231       1427         90         823       913     0
2           930            28        5           19             16               1407       1699         159        771       0       930
3           407            53        8           46             35               1095       1303         34         373       407     0
4           409            54        8           34             24               967        919          66         343       0       409
5           556            36        11          82             43               2601       2399         546        10        556     0
6           542            32        5           95             27               1031       927          0          542       67      475
7           1116           36        9           96             44               2917       2669         1116       0         0       1116
8           348            57        7           131            33               1205       853          147        201       33      315
[Figure: two histograms, "Histogram Cx Age, Male" and "Histogram Cx Age, Female"; x-axis Age (years), y-axis Frequency]
Because the Age histograms are somewhat bimodal, the clustering is able to identify their modes
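A quick arithmetic check on the cluster table is sketched below. The counts are copied from the table on this slide; the variable names and the idea of running this check are additions for illustration, not part of the original analysis.

```python
# Rows copied from the cluster table on this slide:
# (cluster id, customers in cluster, post paid, pre paid, female, male)
rows = [
    (1, 913, 90, 823, 913, 0),
    (2, 930, 159, 771, 0, 930),
    (3, 407, 34, 373, 407, 0),
    (4, 409, 66, 343, 0, 409),
    (5, 556, 546, 10, 556, 0),
    (6, 542, 0, 542, 67, 475),
    (7, 1116, 1116, 0, 0, 1116),
    (8, 348, 147, 201, 33, 315),
]

# Clusters whose account-type or gender counts do not add up to the cluster size.
inconsistent = [cid for cid, size, post, pre, female, male in rows
                if post + pre != size or female + male != size]

# Clusters that contain a single gender only.
single_gender = [cid for cid, size, post, pre, female, male in rows
                 if female == 0 or male == 0]

total_customers = sum(row[1] for row in rows)
print(inconsistent, single_gender, total_customers)
```

Every row is internally consistent, the table covers 5221 customers in total, and six of the eight clusters contain a single gender, which suggests that the categorical attributes contribute strongly to the segmentation.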
Slide | 10
VALIDATION WITH DISTRIBUTION ANALYSIS
Cluster Segment Profile
Cluster ID  Data points in Cluster  Avg. Age  Spread Age  Avg. Net-Stay  Spread Net-Stay  Avg. ARPU  Spread ARPU  Number Post Paid  Number Pre Paid  Number Female  Number Male
1           913                     27        5           28             26               1231       1427         90                823              913            0
2           930                     28        5           19             16               1407       1699         159               771              0              930
3           407                     53        8           46             35               1095       1303         34                373              407            0
4           409                     54        8           34             24               967        919          66                343              0              409
5           556                     36        11          82             43               2601       2399         546               10               556            0
6           542                     32        5           95             27               1031       927          0                 542              67             475
7           1116                    36        9           96             44               2917       2669         1116              0                0              1116
8           348                     57        7           131            33               1205       853          147               201              33             315
[Figure: "Histogram Cx Network Stay"; x-axis Net Stay (months, 0 to 240), y-axis Frequency]
No identifiable structure in Net Stay distribution
Slide | 11
CLUSTERING NUMERIC PART OF SEGMENTS IN 3D
[Figure: 3D scatter plot, "Segmental Analysis: Age, Net Stay and ARPU"; axes Age (normalized), Net-Stay (normalized), ARPU (normalized); legend: clusters 1 to 8]
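The axes of this plot are labelled "normalized", but the deck does not say which normalization was applied. One common choice, shown here purely as an assumption, is a column-wise z-score. The function name zscore is hypothetical; the sample rows reuse the average Age, Net-Stay and ARPU of clusters 1, 3, 7 and 8 from the cluster table only as convenient input.

```python
import numpy as np

def zscore(X):
    """Column-wise z-score: subtract the mean, divide by the standard deviation."""
    X = np.asarray(X, dtype=float)
    std = X.std(axis=0)
    std[std == 0] = 1.0  # leave constant columns unscaled to avoid dividing by zero
    return (X - X.mean(axis=0)) / std

# Columns: Age (years), Net Stay (months), ARPU.
raw = np.array([
    [27, 28, 1231],
    [53, 46, 1095],
    [36, 96, 2917],
    [57, 131, 1205],
])
normalized = zscore(raw)
print(normalized.round(2))
```

After this step each column has zero mean and unit standard deviation, so no single numeric attribute dominates the numeric part of the distance simply because of its units.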
Slide | 12
NOTABLE POINTS
• Allows us to cluster most attributes (within reason)
• Particularly if the categorical attributes do not have many distinct values
• Reasonable convergence performance, both in run time and in number of iterations
• Different dissimilarity measures and distance criteria will give differing results
• The influence of the categorical part via gamma may also need to change with the method used
• Algorithm is somewhat sensitive to initial conditions (initialization of centroids)
• Explore the likelihood of falling into a local minimum and getting trapped there, leading to a suboptimal final solution
• To do…..
• Each drop can result in a non-unique final result but will not impact the underlying trends and insights into each segment
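One common way to reduce the sensitivity to initial centroids noted above is to run the algorithm from several random starts and keep the run with the lowest total cost. The deck does not say this was done; the sketch below is an assumed illustration, with hypothetical function names (run_once, best_of), a simplified k-prototypes loop and synthetic data.

```python
from collections import Counter
import numpy as np

def run_once(X_num, X_cat, k, gamma, seed, max_iter=40):
    """One simplified k-prototypes run from a random start: (labels, final cost)."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(X_num), size=k, replace=False)
    P_num, P_cat = X_num[idx].copy(), X_cat[idx].copy()
    labels = None
    for _ in range(max_iter):
        # Squared Euclidean numeric part plus gamma times the mismatch count.
        dist = (((X_num[:, None, :] - P_num[None]) ** 2).sum(axis=2)
                + gamma * (X_cat[:, None, :] != P_cat[None]).sum(axis=2))
        new_labels = dist.argmin(axis=1)
        if labels is not None and np.array_equal(new_labels, labels):
            break  # no point switched cluster
        labels = new_labels
        for c in np.unique(labels):
            members = labels == c
            P_num[c] = X_num[members].mean(axis=0)
            P_cat[c] = [Counter(col).most_common(1)[0][0]
                        for col in X_cat[members].T]
    cost = float(dist[np.arange(len(X_num)), labels].sum())
    return labels, cost

def best_of(X_num, X_cat, k, gamma=1.0, n_starts=10):
    """Run several random initializations and keep the lowest-cost result."""
    runs = [run_once(X_num, X_cat, k, gamma, seed) for seed in range(n_starts)]
    return min(runs, key=lambda run: run[1]), [cost for _, cost in runs]

# Synthetic, made-up data: three numeric blobs with different category mixes.
data_rng = np.random.default_rng(0)
X_num = np.vstack([data_rng.normal(m, 1.0, (40, 2)) for m in (0.0, 5.0, 10.0)])
X_cat = np.array([["prepaid", "female"]] * 40
                 + [["prepaid", "male"]] * 40
                 + [["postpaid", "male"]] * 40)

(best_labels, best_cost), all_costs = best_of(X_num, X_cat, k=3)
print(best_cost, sorted(all_costs))
```

Comparing the sorted costs across starts gives a rough view of how often a run ends in a poorer local minimum; runs that end at the same cost may still differ in cluster numbering.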

More Related Content

What's hot

Data Augmentation and Disaggregation by Neal Fultz
Data Augmentation and Disaggregation by Neal FultzData Augmentation and Disaggregation by Neal Fultz
Data Augmentation and Disaggregation by Neal Fultz
Data Con LA
 
Self-Organising Maps for Customer Segmentation using R - Shane Lynn - Dublin R
Self-Organising Maps for Customer Segmentation using R - Shane Lynn - Dublin RSelf-Organising Maps for Customer Segmentation using R - Shane Lynn - Dublin R
Self-Organising Maps for Customer Segmentation using R - Shane Lynn - Dublin R
shanelynn
 
Ben logistics
Ben logisticsBen logistics
Ben logistics
Ben2628
 
Wind/ Solar Power Forecasting
Wind/ Solar Power Forecasting  Wind/ Solar Power Forecasting
Wind/ Solar Power Forecasting
Das A. K.
 
Geek Sync | Having Fun with Spatial Data
Geek Sync | Having Fun with Spatial DataGeek Sync | Having Fun with Spatial Data
Geek Sync | Having Fun with Spatial Data
IDERA Software
 
Self Organizing Maps
Self Organizing MapsSelf Organizing Maps
Self Organizing Maps
Daksh Raj Chopra
 
Intensity transformation & histogram processing
Intensity transformation & histogram processingIntensity transformation & histogram processing
Intensity transformation & histogram processing
Dëèp Çhõkshï
 
Notes on Spectral Clustering
Notes on Spectral ClusteringNotes on Spectral Clustering
Notes on Spectral Clustering
Davide Eynard
 
Graph Theory
Graph TheoryGraph Theory
Graph Theory
Prateek Pandey
 
THE EVIDENCE THEORY FOR COLOR SATELLITE IMAGE COMPRESSION
THE EVIDENCE THEORY FOR COLOR SATELLITE IMAGE COMPRESSIONTHE EVIDENCE THEORY FOR COLOR SATELLITE IMAGE COMPRESSION
THE EVIDENCE THEORY FOR COLOR SATELLITE IMAGE COMPRESSION
cscpconf
 
Regression Study: Boston Housing
Regression Study: Boston HousingRegression Study: Boston Housing
Regression Study: Boston Housing
Ravish Kalra
 
An Weighting Dissimilarity Function of CLARANS for Clustering Spatial Data Ed...
An Weighting Dissimilarity Function of CLARANS for Clustering Spatial Data Ed...An Weighting Dissimilarity Function of CLARANS for Clustering Spatial Data Ed...
An Weighting Dissimilarity Function of CLARANS for Clustering Spatial Data Ed...
Ichwanul Imkarokaro
 
1 sollins algorithm
1 sollins algorithm1 sollins algorithm
1 sollins algorithm
Muhammad Salman
 
Region Splitting and Merging Technique For Image segmentation.
Region Splitting and Merging Technique For Image segmentation.Region Splitting and Merging Technique For Image segmentation.
Region Splitting and Merging Technique For Image segmentation.
SomitSamanto1
 
Formont.ppt
Formont.pptFormont.ppt
Formont.pptgrssieee
 
Fuzzy c means manual work
Fuzzy c means manual workFuzzy c means manual work
Fuzzy c means manual work
Dr.E.N.Sathishkumar
 
region Basd in ML
region Basd in MLregion Basd in ML
region Basd in ML
KartheekRaja3
 
Fuzzy c means clustering protocol for wireless sensor networks
Fuzzy c means clustering protocol for wireless sensor networksFuzzy c means clustering protocol for wireless sensor networks
Fuzzy c means clustering protocol for wireless sensor networks
mourya chandra
 

What's hot (19)

Data Augmentation and Disaggregation by Neal Fultz
Data Augmentation and Disaggregation by Neal FultzData Augmentation and Disaggregation by Neal Fultz
Data Augmentation and Disaggregation by Neal Fultz
 
Self-Organising Maps for Customer Segmentation using R - Shane Lynn - Dublin R
Self-Organising Maps for Customer Segmentation using R - Shane Lynn - Dublin RSelf-Organising Maps for Customer Segmentation using R - Shane Lynn - Dublin R
Self-Organising Maps for Customer Segmentation using R - Shane Lynn - Dublin R
 
Ben logistics
Ben logisticsBen logistics
Ben logistics
 
Wind/ Solar Power Forecasting
Wind/ Solar Power Forecasting  Wind/ Solar Power Forecasting
Wind/ Solar Power Forecasting
 
Geek Sync | Having Fun with Spatial Data
Geek Sync | Having Fun with Spatial DataGeek Sync | Having Fun with Spatial Data
Geek Sync | Having Fun with Spatial Data
 
Self Organizing Maps
Self Organizing MapsSelf Organizing Maps
Self Organizing Maps
 
Intensity transformation & histogram processing
Intensity transformation & histogram processingIntensity transformation & histogram processing
Intensity transformation & histogram processing
 
Extrapolation
ExtrapolationExtrapolation
Extrapolation
 
Notes on Spectral Clustering
Notes on Spectral ClusteringNotes on Spectral Clustering
Notes on Spectral Clustering
 
Graph Theory
Graph TheoryGraph Theory
Graph Theory
 
THE EVIDENCE THEORY FOR COLOR SATELLITE IMAGE COMPRESSION
THE EVIDENCE THEORY FOR COLOR SATELLITE IMAGE COMPRESSIONTHE EVIDENCE THEORY FOR COLOR SATELLITE IMAGE COMPRESSION
THE EVIDENCE THEORY FOR COLOR SATELLITE IMAGE COMPRESSION
 
Regression Study: Boston Housing
Regression Study: Boston HousingRegression Study: Boston Housing
Regression Study: Boston Housing
 
An Weighting Dissimilarity Function of CLARANS for Clustering Spatial Data Ed...
An Weighting Dissimilarity Function of CLARANS for Clustering Spatial Data Ed...An Weighting Dissimilarity Function of CLARANS for Clustering Spatial Data Ed...
An Weighting Dissimilarity Function of CLARANS for Clustering Spatial Data Ed...
 
1 sollins algorithm
1 sollins algorithm1 sollins algorithm
1 sollins algorithm
 
Region Splitting and Merging Technique For Image segmentation.
Region Splitting and Merging Technique For Image segmentation.Region Splitting and Merging Technique For Image segmentation.
Region Splitting and Merging Technique For Image segmentation.
 
Formont.ppt
Formont.pptFormont.ppt
Formont.ppt
 
Fuzzy c means manual work
Fuzzy c means manual workFuzzy c means manual work
Fuzzy c means manual work
 
region Basd in ML
region Basd in MLregion Basd in ML
region Basd in ML
 
Fuzzy c means clustering protocol for wireless sensor networks
Fuzzy c means clustering protocol for wireless sensor networksFuzzy c means clustering protocol for wireless sensor networks
Fuzzy c means clustering protocol for wireless sensor networks
 

Viewers also liked

Estimating Gaussian Mixture Densities via an implemetation of the Expectaatio...
Estimating Gaussian Mixture Densities via an implemetation of the Expectaatio...Estimating Gaussian Mixture Densities via an implemetation of the Expectaatio...
Estimating Gaussian Mixture Densities via an implemetation of the Expectaatio...
Asoka Korale
 
An improvement in k mean clustering algorithm using better time and accuracy
An improvement in k mean clustering algorithm using better time and accuracyAn improvement in k mean clustering algorithm using better time and accuracy
An improvement in k mean clustering algorithm using better time and accuracy
ijpla
 
Birch
BirchBirch
Birch
ngocdiem87
 
Cure, Clustering Algorithm
Cure, Clustering AlgorithmCure, Clustering Algorithm
Cure, Clustering Algorithm
Lino Possamai
 
01 Introduction to Data Mining
01 Introduction to Data Mining01 Introduction to Data Mining
01 Introduction to Data Mining
Valerii Klymchuk
 
05 Clustering in Data Mining
05 Clustering in Data Mining05 Clustering in Data Mining
05 Clustering in Data Mining
Valerii Klymchuk
 
K mean-clustering algorithm
K mean-clustering algorithmK mean-clustering algorithm
K mean-clustering algorithm
parry prabhu
 
Genetic algorithm
Genetic algorithmGenetic algorithm
Genetic algorithm
garima931
 
Stock Market Prediction using Hidden Markov Models and Investor sentiment
Stock Market Prediction using Hidden Markov Models and Investor sentimentStock Market Prediction using Hidden Markov Models and Investor sentiment
Stock Market Prediction using Hidden Markov Models and Investor sentiment
Patrick Nicolas
 
Hierarchical clustering algo for wsn
Hierarchical clustering algo for wsnHierarchical clustering algo for wsn
Hierarchical clustering algo for wsn
Samruddhi Gaikwad
 
A Scalable Hierarchical Clustering Algorithm Using Spark: Spark Summit East t...
A Scalable Hierarchical Clustering Algorithm Using Spark: Spark Summit East t...A Scalable Hierarchical Clustering Algorithm Using Spark: Spark Summit East t...
A Scalable Hierarchical Clustering Algorithm Using Spark: Spark Summit East t...
Spark Summit
 

Viewers also liked (11)

Estimating Gaussian Mixture Densities via an implemetation of the Expectaatio...
Estimating Gaussian Mixture Densities via an implemetation of the Expectaatio...Estimating Gaussian Mixture Densities via an implemetation of the Expectaatio...
Estimating Gaussian Mixture Densities via an implemetation of the Expectaatio...
 
An improvement in k mean clustering algorithm using better time and accuracy
An improvement in k mean clustering algorithm using better time and accuracyAn improvement in k mean clustering algorithm using better time and accuracy
An improvement in k mean clustering algorithm using better time and accuracy
 
Birch
BirchBirch
Birch
 
Cure, Clustering Algorithm
Cure, Clustering AlgorithmCure, Clustering Algorithm
Cure, Clustering Algorithm
 
01 Introduction to Data Mining
01 Introduction to Data Mining01 Introduction to Data Mining
01 Introduction to Data Mining
 
05 Clustering in Data Mining
05 Clustering in Data Mining05 Clustering in Data Mining
05 Clustering in Data Mining
 
K mean-clustering algorithm
K mean-clustering algorithmK mean-clustering algorithm
K mean-clustering algorithm
 
Genetic algorithm
Genetic algorithmGenetic algorithm
Genetic algorithm
 
Stock Market Prediction using Hidden Markov Models and Investor sentiment
Stock Market Prediction using Hidden Markov Models and Investor sentimentStock Market Prediction using Hidden Markov Models and Investor sentiment
Stock Market Prediction using Hidden Markov Models and Investor sentiment
 
Hierarchical clustering algo for wsn
Hierarchical clustering algo for wsnHierarchical clustering algo for wsn
Hierarchical clustering algo for wsn
 
A Scalable Hierarchical Clustering Algorithm Using Spark: Spark Summit East t...
A Scalable Hierarchical Clustering Algorithm Using Spark: Spark Summit East t...A Scalable Hierarchical Clustering Algorithm Using Spark: Spark Summit East t...
A Scalable Hierarchical Clustering Algorithm Using Spark: Spark Summit East t...
 

Similar to Mixed Numeric and Categorical Attribute Clustering Algorithm

FREQUENCY DISTRIBUTION ( distribusi frekuensi) - STATISTICS
FREQUENCY DISTRIBUTION ( distribusi frekuensi) - STATISTICSFREQUENCY DISTRIBUTION ( distribusi frekuensi) - STATISTICS
FREQUENCY DISTRIBUTION ( distribusi frekuensi) - STATISTICS
Airlangga University , Indonesia
 
IRJET- Segregation of Machines According to the Noise Emitted by Differen...
IRJET-  	  Segregation of Machines According to the Noise Emitted by Differen...IRJET-  	  Segregation of Machines According to the Noise Emitted by Differen...
IRJET- Segregation of Machines According to the Noise Emitted by Differen...
IRJET Journal
 
MNIST 10-class Classifiers
MNIST 10-class ClassifiersMNIST 10-class Classifiers
MNIST 10-class Classifiers
Sheetal Gangakhedkar
 
Variable Transformation in P&C Loss Models Based on Monotonic Binning
Variable Transformation  in P&C Loss Models  Based on Monotonic BinningVariable Transformation  in P&C Loss Models  Based on Monotonic Binning
Variable Transformation in P&C Loss Models Based on Monotonic Binning
WenSui Liu
 
Time series data mining techniques
Time series data mining techniquesTime series data mining techniques
Time series data mining techniques
Shanmukha S. Potti
 
What is Hierarchical Clustering and How Can an Organization Use it to Analyze...
What is Hierarchical Clustering and How Can an Organization Use it to Analyze...What is Hierarchical Clustering and How Can an Organization Use it to Analyze...
What is Hierarchical Clustering and How Can an Organization Use it to Analyze...
Smarten Augmented Analytics
 
Six sigma pedagogy
Six sigma pedagogySix sigma pedagogy
Six sigma pedagogy
MallikarjunRaoPanaba
 
Six sigma
Six sigma Six sigma
Hailey_Database_Performance_Made_Easy_through_Graphics.pdf
Hailey_Database_Performance_Made_Easy_through_Graphics.pdfHailey_Database_Performance_Made_Easy_through_Graphics.pdf
Hailey_Database_Performance_Made_Easy_through_Graphics.pdf
cookie1969
 
Flight Delay Prediction Model (2)
Flight Delay Prediction Model (2)Flight Delay Prediction Model (2)
Flight Delay Prediction Model (2)Shubham Gupta
 
IRJET- Advanced Control Strategies for Mold Level Process
IRJET- Advanced Control Strategies for Mold Level ProcessIRJET- Advanced Control Strategies for Mold Level Process
IRJET- Advanced Control Strategies for Mold Level Process
IRJET Journal
 
What is the KMeans Clustering Algorithm and How Does an Enterprise Use it to ...
What is the KMeans Clustering Algorithm and How Does an Enterprise Use it to ...What is the KMeans Clustering Algorithm and How Does an Enterprise Use it to ...
What is the KMeans Clustering Algorithm and How Does an Enterprise Use it to ...
Smarten Augmented Analytics
 
Summarizing Data : Listing and Grouping pdf
Summarizing Data : Listing and Grouping pdfSummarizing Data : Listing and Grouping pdf
Summarizing Data : Listing and Grouping pdf
JustynOwen
 
Sqqs1013 ch2-a122
Sqqs1013 ch2-a122Sqqs1013 ch2-a122
Sqqs1013 ch2-a122
kim rae KI
 
IRJET- Interactive Image Segmentation with Seed Propagation
IRJET-  	  Interactive Image Segmentation with Seed PropagationIRJET-  	  Interactive Image Segmentation with Seed Propagation
IRJET- Interactive Image Segmentation with Seed Propagation
IRJET Journal
 
Answer key for pattern recognition and machine learning
Answer key for pattern recognition and machine learningAnswer key for pattern recognition and machine learning
Answer key for pattern recognition and machine learning
VijayAECE1
 
Clusters (4).pptx
Clusters (4).pptxClusters (4).pptx
Clusters (4).pptx
brahimNasibov
 
ACMS TV Ratings Midterm Angelini
ACMS TV Ratings Midterm AngeliniACMS TV Ratings Midterm Angelini
ACMS TV Ratings Midterm AngeliniBrandon Angelini
 
A Methodology for Classifying Visitors to an Amusement Park VAST Challenge 20...
A Methodology for Classifying Visitors to an Amusement Park VAST Challenge 20...A Methodology for Classifying Visitors to an Amusement Park VAST Challenge 20...
A Methodology for Classifying Visitors to an Amusement Park VAST Challenge 20...
Gustavo Dejean
 

Similar to Mixed Numeric and Categorical Attribute Clustering Algorithm (20)

FREQUENCY DISTRIBUTION ( distribusi frekuensi) - STATISTICS
FREQUENCY DISTRIBUTION ( distribusi frekuensi) - STATISTICSFREQUENCY DISTRIBUTION ( distribusi frekuensi) - STATISTICS
FREQUENCY DISTRIBUTION ( distribusi frekuensi) - STATISTICS
 
IRJET- Segregation of Machines According to the Noise Emitted by Differen...
IRJET-  	  Segregation of Machines According to the Noise Emitted by Differen...IRJET-  	  Segregation of Machines According to the Noise Emitted by Differen...
IRJET- Segregation of Machines According to the Noise Emitted by Differen...
 
MNIST 10-class Classifiers
MNIST 10-class ClassifiersMNIST 10-class Classifiers
MNIST 10-class Classifiers
 
Variable Transformation in P&C Loss Models Based on Monotonic Binning
Variable Transformation  in P&C Loss Models  Based on Monotonic BinningVariable Transformation  in P&C Loss Models  Based on Monotonic Binning
Variable Transformation in P&C Loss Models Based on Monotonic Binning
 
Time series data mining techniques
Time series data mining techniquesTime series data mining techniques
Time series data mining techniques
 
What is Hierarchical Clustering and How Can an Organization Use it to Analyze...
What is Hierarchical Clustering and How Can an Organization Use it to Analyze...What is Hierarchical Clustering and How Can an Organization Use it to Analyze...
What is Hierarchical Clustering and How Can an Organization Use it to Analyze...
 
Six sigma pedagogy
Six sigma pedagogySix sigma pedagogy
Six sigma pedagogy
 
Six sigma
Six sigma Six sigma
Six sigma
 
Cluto presentation
Cluto presentationCluto presentation
Cluto presentation
 
Hailey_Database_Performance_Made_Easy_through_Graphics.pdf
Hailey_Database_Performance_Made_Easy_through_Graphics.pdfHailey_Database_Performance_Made_Easy_through_Graphics.pdf
Hailey_Database_Performance_Made_Easy_through_Graphics.pdf
 
Flight Delay Prediction Model (2)
Flight Delay Prediction Model (2)Flight Delay Prediction Model (2)
Flight Delay Prediction Model (2)
 
IRJET- Advanced Control Strategies for Mold Level Process
IRJET- Advanced Control Strategies for Mold Level ProcessIRJET- Advanced Control Strategies for Mold Level Process
IRJET- Advanced Control Strategies for Mold Level Process
 
What is the KMeans Clustering Algorithm and How Does an Enterprise Use it to ...
What is the KMeans Clustering Algorithm and How Does an Enterprise Use it to ...What is the KMeans Clustering Algorithm and How Does an Enterprise Use it to ...
What is the KMeans Clustering Algorithm and How Does an Enterprise Use it to ...
 
Summarizing Data : Listing and Grouping pdf
Summarizing Data : Listing and Grouping pdfSummarizing Data : Listing and Grouping pdf
Summarizing Data : Listing and Grouping pdf
 
Sqqs1013 ch2-a122
Sqqs1013 ch2-a122Sqqs1013 ch2-a122
Sqqs1013 ch2-a122
 
IRJET- Interactive Image Segmentation with Seed Propagation
IRJET-  	  Interactive Image Segmentation with Seed PropagationIRJET-  	  Interactive Image Segmentation with Seed Propagation
IRJET- Interactive Image Segmentation with Seed Propagation
 
Answer key for pattern recognition and machine learning
Answer key for pattern recognition and machine learningAnswer key for pattern recognition and machine learning
Answer key for pattern recognition and machine learning
 
Clusters (4).pptx
Clusters (4).pptxClusters (4).pptx
Clusters (4).pptx
 
ACMS TV Ratings Midterm Angelini
ACMS TV Ratings Midterm AngeliniACMS TV Ratings Midterm Angelini
ACMS TV Ratings Midterm Angelini
 
A Methodology for Classifying Visitors to an Amusement Park VAST Challenge 20...
A Methodology for Classifying Visitors to an Amusement Park VAST Challenge 20...A Methodology for Classifying Visitors to an Amusement Park VAST Challenge 20...
A Methodology for Classifying Visitors to an Amusement Park VAST Challenge 20...
 

More from Asoka Korale

Improving predictability and performance by relating the number of events and...
Improving predictability and performance by relating the number of events and...Improving predictability and performance by relating the number of events and...
Improving predictability and performance by relating the number of events and...
Asoka Korale
 
Improving predictability and performance by relating the number of events and...
Improving predictability and performance by relating the number of events and...Improving predictability and performance by relating the number of events and...
Improving predictability and performance by relating the number of events and...
Asoka Korale
 
Novel price models in the capital market
Novel price models in the capital marketNovel price models in the capital market
Novel price models in the capital market
Asoka Korale
 
Modeling prices for capital market surveillance
Modeling prices for capital market surveillanceModeling prices for capital market surveillance
Modeling prices for capital market surveillance
Asoka Korale
 
Entity profling and collusion detection
Entity profling and collusion detectionEntity profling and collusion detection
Entity profling and collusion detection
Asoka Korale
 
Entity Profiling and Collusion Detection
Entity Profiling and Collusion DetectionEntity Profiling and Collusion Detection
Entity Profiling and Collusion Detection
Asoka Korale
 
Markov Decision Processes in Market Surveillance
Markov Decision Processes in Market SurveillanceMarkov Decision Processes in Market Surveillance
Markov Decision Processes in Market Surveillance
Asoka Korale
 
A framework for dynamic pricing electricity consumption patterns via time ser...
A framework for dynamic pricing electricity consumption patterns via time ser...A framework for dynamic pricing electricity consumption patterns via time ser...
A framework for dynamic pricing electricity consumption patterns via time ser...
Asoka Korale
 
A framework for dynamic pricing electricity consumption patterns via time ser...
A framework for dynamic pricing electricity consumption patterns via time ser...A framework for dynamic pricing electricity consumption patterns via time ser...
A framework for dynamic pricing electricity consumption patterns via time ser...
Asoka Korale
 
Customer Lifetime Value Modeling
Customer Lifetime Value ModelingCustomer Lifetime Value Modeling
Customer Lifetime Value Modeling
Asoka Korale
 
Forecasting models for Customer Lifetime Value
Forecasting models for Customer Lifetime ValueForecasting models for Customer Lifetime Value
Forecasting models for Customer Lifetime Value
Asoka Korale
 
Capacity and utilization enhancement
Capacity and utilization enhancementCapacity and utilization enhancement
Capacity and utilization enhancement
Asoka Korale
 
Cell load KPIs in support of event triggered Cellular Yield Maximization
Cell load KPIs in support of event triggered Cellular Yield MaximizationCell load KPIs in support of event triggered Cellular Yield Maximization
Cell load KPIs in support of event triggered Cellular Yield Maximization
Asoka Korale
 
Vehicular Traffic Monitoring Scenarios
Vehicular Traffic Monitoring ScenariosVehicular Traffic Monitoring Scenarios
Vehicular Traffic Monitoring Scenarios
Asoka Korale
 
Introduction to Bit Coin Model
Introduction to Bit Coin ModelIntroduction to Bit Coin Model
Introduction to Bit Coin Model
Asoka Korale
 
Mapping Mobile Average Revenue per User to Personal Income level via Househol...
Mapping Mobile Average Revenue per User to Personal Income level via Househol...Mapping Mobile Average Revenue per User to Personal Income level via Househol...
Mapping Mobile Average Revenue per User to Personal Income level via Househol...
Asoka Korale
 
Asoka_Korale_Event_based_CYM_IET_2013_submitted linkedin
Asoka_Korale_Event_based_CYM_IET_2013_submitted linkedinAsoka_Korale_Event_based_CYM_IET_2013_submitted linkedin
Asoka_Korale_Event_based_CYM_IET_2013_submitted linkedinAsoka Korale
 
event tiggered cellular yield enhancement linkedin
event tiggered cellular yield enhancement linkedinevent tiggered cellular yield enhancement linkedin
event tiggered cellular yield enhancement linkedinAsoka Korale
 
IET_Estimating_market_share_through_mobile_traffic_analysis linkedin
IET_Estimating_market_share_through_mobile_traffic_analysis linkedinIET_Estimating_market_share_through_mobile_traffic_analysis linkedin
IET_Estimating_market_share_through_mobile_traffic_analysis linkedinAsoka Korale
 
Estimating market share through mobile traffic analysis linkedin
Estimating market share through mobile traffic analysis linkedinEstimating market share through mobile traffic analysis linkedin
Estimating market share through mobile traffic analysis linkedinAsoka Korale
 

More from Asoka Korale (20)

Improving predictability and performance by relating the number of events and...
Improving predictability and performance by relating the number of events and...Improving predictability and performance by relating the number of events and...
Improving predictability and performance by relating the number of events and...
 
Improving predictability and performance by relating the number of events and...
Improving predictability and performance by relating the number of events and...Improving predictability and performance by relating the number of events and...
Improving predictability and performance by relating the number of events and...
 
Novel price models in the capital market
Novel price models in the capital marketNovel price models in the capital market
Novel price models in the capital market
 
Modeling prices for capital market surveillance
Modeling prices for capital market surveillanceModeling prices for capital market surveillance
Modeling prices for capital market surveillance
 
Entity profling and collusion detection
Entity profling and collusion detectionEntity profling and collusion detection
Entity profling and collusion detection
 
Entity Profiling and Collusion Detection
Entity Profiling and Collusion DetectionEntity Profiling and Collusion Detection
Entity Profiling and Collusion Detection
 
Markov Decision Processes in Market Surveillance
Mixed Numeric and Categorical Attribute Clustering Algorithm

  • 1. MIXED NUMERIC AND CATEGORICAL ATTRIBUTE CLUSTERING ALGORITHM MODELING DR. ASOKA KORALE, C.ENG. MIET & MIESL
  • 2. ADVANTAGES TO NUMERIC AND CATEGORICAL ATTRIBUTE CLUSTERING
      • Improved targeting in campaigns and insight into segments
      • Currently clustering on the numeric variables Age, Net Stay and ARPU
      • Primary attributes that can be included with mixed attribute type clustering: Account Type, Gender, Geo Location, ...
      • Currently the Fuzzy C-Means algorithm is used in clustering
      • Digital advertising segmentations are increasingly based on clustering
      • Include other categorical attributes, depending on the interest segment, to create "micro segments"
  • 3. WIDENING POTENTIAL INSIGHTS THROUGH CATEGORICAL CLUSTERING
      • Improved targeting in campaigns
      • All attributes can be clustered, leading to a wider array of very specific segments
      • Geographic attribute clustering to incorporate Income/ARPU hotspots at the micro level
  • 4. CONCEPT UNDERLYING THE MIXED K-PROTOTYPES ALGORITHM [1]
      • Points "d" and "c" may switch sides depending on how similar the numeric and categorical parts of the point are to the numeric and categorical parts of the centroid (prototype)
      • The influence or contribution of the numeric and categorical attributes of a data point can be controlled via a parameter "gamma"
      • Point "a" may switch if its categorical part is closer to the categorical centroid (prototype) than its numeric part is to the numeric part of the centroid
      • The numeric and categorical parts of a data point can be considered separately, and two sets of centroids act as attractors for each attribute type in each cluster
      • Figure: scatter plot with axes Numeric Attribute 1 and Numeric Attribute 2; shapes represent two values of a single categorical variable
      • [1] Huang, CSIRO, Australia
  • 5. MIXED K-PROTOTYPES ALGORITHM [1]
      • The distance measure to a prototype (center) has two parts: numeric and categorical
      • Numeric attributes: Euclidean distance
      • Categorical attributes: dissimilarity measure
      • Centroid of numeric attributes: a simple average of the points in that cluster
      • Includes "Yij", a fuzzy membership function, if we wish to go in that direction
  • 6. MIXED K-PROTOTYPES ALGORITHM [1]
      • Minimize the total cost "E", which is the sum of the distances to the numeric and categorical parts of the centroid (prototype)
      • Centroid of categorical attributes: determined by the highest-frequency attribute value in each cluster
  • 7. CONVERGENCE PERFORMANCE
      • Figures (each plotted against iteration number, 0 to 40): total number of switches at each iteration; total distance at each iteration (axis scaled by 10^4); total categorical distance at each iteration. Legend entries 1 to 8.
  • 8. CLUSTER & SEGMENT PROFILE
      • Figures (by Cluster/Segment ID, 1 to 8): number of Cx in each cluster; Age; Net Stay; ARPU (axis scaled by 10^4)
  • 9. VALIDATION WITH DISTRIBUTION ANALYSIS
      Cluster segment profile (Age in years, Net Stay in months):

      Cluster ID  Cx in Cluster  Avg. Age  Spread Age  Avg. Net-Stay  Spread Net-Stay  Avg. ARPU  Spread ARPU  Post Paid  Pre Paid  Female  Male
      1           913            27        5           28             26               1231       1427         90         823       913     0
      2           930            28        5           19             16               1407       1699         159        771       0       930
      3           407            53        8           46             35               1095       1303         34         373       407     0
      4           409            54        8           34             24               967        919          66         343       0       409
      5           556            36        11          82             43               2601       2399         546        10        556     0
      6           542            32        5           95             27               1031       927          0          542       67      475
      7           1116           36        9           96             44               2917       2669         1116       0         0       1116
      8           348            57        7           131            33               1205       853          147        201       33      315

      • Figures: histogram of Cx Age, Male (Age in years against frequency); histogram of Cx Age, Female (Age in years against frequency)
      • Because the Age histograms are somewhat bimodal, the clustering is able to identify their modes
  • 10. VALIDATION WITH DISTRIBUTION ANALYSIS
      • Cluster segment profile: same values as the table on slide 9, with the count columns labeled "Data points in Cluster", "Number Post Paid", "Number Pre Paid", "Number Female" and "Number Male"
      • Figure: histogram of Cx Network Stay (Net Stay in months, 0 to 240, against frequency)
      • No identifiable structure in the Net Stay distribution
  • 11. CLUSTERING NUMERIC PART OF SEGMENTS IN 3D
      • Figure: "Segmental Analysis: Age, Net Stay and ARPU", a 3D scatter plot with axes Age (normalized), Net-Stay (normalized) and ARPU (normalized); clusters 1 to 8
  • 12. NOTABLE POINTS
      • Allows us to cluster most attributes (within reason)
      • Particularly if the categorical attributes do not have many different component values
      • Reasonable convergence performance, both in terms of run time and number of iterations
      • Different dissimilarity measures and distance criteria will give differing results
      • The influence of the categorical part via gamma may also need to change with the method used
      • Algorithm somewhat sensitive to initial conditions (initialization of centroids)
      • Explore the likelihood of falling into a local minimum and getting trapped there, leading to a suboptimal final solution
      • To do ...
      • Each drop can result in a non-unique final result, but this will not impact the underlying trends and insights into each segment
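
The equations on slides 5 and 6 did not survive in the transcript. As a hedged reconstruction, the block below writes out the standard k-prototypes distance and cost in the form usually attributed to Huang [1]. The symbols are assumptions chosen for this note and are not taken from the slides: x_i is a data point with m_r numeric and m_c categorical attributes, Q_l is the prototype of cluster l, delta is the simple matching dissimilarity, and y_{il} is the membership of point i in cluster l (the quantity the slides call "Yij"). The numeric part is written as a squared Euclidean distance, which is the usual choice, although the slides only say "Euclidean".

```latex
% Distance from point x_i to prototype Q_l: numeric part + gamma * categorical part
d(x_i, Q_l) = \sum_{j=1}^{m_r} \left( x_{ij}^{r} - q_{lj}^{r} \right)^{2}
            + \gamma \sum_{j=1}^{m_c} \delta\!\left( x_{ij}^{c}, q_{lj}^{c} \right),
\qquad
\delta(a, b) =
\begin{cases}
  0 & \text{if } a = b \\
  1 & \text{if } a \neq b
\end{cases}

% Total cost minimized by the algorithm (k clusters, n points)
E = \sum_{l=1}^{k} \sum_{i=1}^{n} y_{il} \, d(x_i, Q_l)
```

With hard assignments, y_{il} is 1 for the cluster a point belongs to and 0 otherwise; a fuzzy variant would let it take values between 0 and 1.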
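
The steps described on slides 4 to 6 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the author's implementation: it uses hard (non-fuzzy) assignments, squared Euclidean distance for the numeric part, simple matching for the categorical part, and integer-coded categories. The function names (`mixed_distances`, `update_prototype`, `k_prototypes`) and the `init_idx` parameter are hypothetical and were introduced only for this example.

```python
import numpy as np


def mixed_distances(X_num, X_cat, protos_num, protos_cat, gamma):
    """Distance of every point to every prototype, shape (n_points, k).

    Numeric part: squared Euclidean distance (assumption).
    Categorical part: number of mismatched attributes, weighted by gamma.
    """
    num = ((X_num[:, None, :] - protos_num[None, :, :]) ** 2).sum(axis=2)
    cat = (X_cat[:, None, :] != protos_cat[None, :, :]).sum(axis=2)
    return num + gamma * cat


def update_prototype(X_num, X_cat):
    """Numeric centroid is the mean; categorical centroid is the most frequent value."""
    proto_num = X_num.mean(axis=0)
    proto_cat = np.empty(X_cat.shape[1], dtype=X_cat.dtype)
    for j in range(X_cat.shape[1]):
        values, counts = np.unique(X_cat[:, j], return_counts=True)
        proto_cat[j] = values[np.argmax(counts)]  # ties go to the smallest code
    return proto_num, proto_cat


def k_prototypes(X_num, X_cat, k, gamma, init_idx=None, max_iter=40, seed=0):
    """Hard k-prototypes: returns labels, numeric and categorical prototypes, cost E."""
    X_num = np.asarray(X_num, dtype=float)
    X_cat = np.asarray(X_cat)
    if init_idx is None:
        # Random initialization; results can depend on this choice.
        rng = np.random.default_rng(seed)
        init_idx = rng.choice(len(X_num), size=k, replace=False)
    protos_num = X_num[init_idx].copy()
    protos_cat = X_cat[init_idx].copy()
    labels = np.full(len(X_num), -1)
    for _ in range(max_iter):
        dist = mixed_distances(X_num, X_cat, protos_num, protos_cat, gamma)
        new_labels = dist.argmin(axis=1)
        switches = int((new_labels != labels).sum())  # points that changed cluster
        labels = new_labels
        if switches == 0:
            break
        for l in range(k):
            members = labels == l
            if members.any():  # leave an empty cluster's prototype unchanged
                protos_num[l], protos_cat[l] = update_prototype(
                    X_num[members], X_cat[members]
                )
    dist = mixed_distances(X_num, X_cat, protos_num, protos_cat, gamma)
    cost = float(dist[np.arange(len(labels)), labels].sum())
    return labels, protos_num, protos_cat, cost


if __name__ == "__main__":
    # Toy data: two numeric attributes and one integer-coded categorical attribute.
    X_num = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]]
    X_cat = [[0], [0], [0], [1], [1], [1]]
    labels, p_num, p_cat, cost = k_prototypes(X_num, X_cat, k=2, gamma=1.0)
    print(labels, p_cat.ravel(), cost)
```

The switch counter mirrors the "total number of switches at each iteration" tracked on slide 7. Running the sketch with different `seed` values is one simple way to probe the sensitivity to initial centroids noted on slide 12, and raising `gamma` makes a categorical mismatch cost more relative to numeric distance, which is the mechanism by which a point can change clusters as described on slide 4.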