  1. Cluster Analysis (Anindita)
  2. Cluster analysis
     • A class of techniques used to classify objects or cases into relatively homogeneous groups called clusters. Also known as classification analysis or numerical taxonomy.
     • Example: clustering of cases on variables such as quality consciousness (var1) and price sensitivity (var2).
     • It requires no prior information about the group membership of the sample.
  3. Uses of Cluster Analysis
     • Segmenting the market (benefits sought)
     • Understanding buyer behavior
     • Assessing new product opportunities (brands or markets)
     • Selecting test markets (grouping cities)
     • Reducing data to a smaller number of clusters
  4. Steps
     • Formulating the problem: select relevant variables measured on an interval scale.
     • Selecting a distance measure: how similar or different are the objects? For example, Euclidean distance.
     • Selecting a clustering procedure
     • Interpreting and profiling the clusters
     • Assessing the reliability of the clustering
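The distance measure named in the steps above can be sketched in a few lines of Python. The function name and the two respondents' ratings are hypothetical, chosen only to illustrate the calculation.

```python
import math


def euclidean_distance(a, b):
    """Straight-line distance between two cases measured on the same variables."""
    if len(a) != len(b):
        raise ValueError("both cases need the same number of variables")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical ratings on var1 (quality consciousness) and var2 (price sensitivity).
respondent_a = [6, 2]
respondent_b = [3, 6]

distance = euclidean_distance(respondent_a, respondent_b)
print(distance)  # sqrt(3**2 + 4**2) = 5.0
```

The smaller the distance, the more similar the two cases are on the clustering variables.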
  5. Types
     • Hierarchical
       a) Agglomerative
          1. Linkage (single, complete and average)
          2. Variance (Ward's)
       b) Divisive
     • Non-hierarchical (k-means)
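The three linkage rules differ only in how they turn the case-to-case distances between two clusters into a single number. A minimal sketch, assuming Euclidean distance and two hypothetical clusters of two cases each (the function names are illustrative, not from any library):

```python
import math


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cluster_distance(cluster_a, cluster_b, linkage="single"):
    """Distance between two clusters under a given linkage rule."""
    pairwise = [euclidean(p, q) for p in cluster_a for q in cluster_b]
    if linkage == "single":      # nearest pair of cases
        return min(pairwise)
    if linkage == "complete":    # farthest pair of cases
        return max(pairwise)
    if linkage == "average":     # mean over all pairs of cases
        return sum(pairwise) / len(pairwise)
    raise ValueError(f"unknown linkage: {linkage}")


# Hypothetical clusters, each holding two cases measured on two variables.
cluster_a = [(0, 0), (0, 3)]
cluster_b = [(4, 0), (4, 3)]

single = cluster_distance(cluster_a, cluster_b, "single")
complete = cluster_distance(cluster_a, cluster_b, "complete")
average = cluster_distance(cluster_a, cluster_b, "average")
print(single, complete, average)
```

Ward's method is not shown here; it merges the pair of clusters that gives the smallest increase in within-cluster variance rather than using a pairwise distance rule.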
  6. Steps in SPSS
     1. Select ANALYZE from the SPSS menu
     2. Click CLASSIFY and then HIERARCHICAL CLUSTER
     3. Move the VARIABLES into the VARIABLE box
     4. In CLUSTER check CASES. In the DISPLAY box check STATISTICS and PLOTS
     5. Click on STATISTICS. In the pop-up window check AGGLOMERATION SCHEDULE. In cluster membership
  7. Hierarchical clustering
  8. Agglomeration Schedule
     • “Stage”: the first stage leaves 19 clusters.
     • “Clusters combined”: respondents 14 and 16 are combined at this stage.
     • “Coefficients”: the Euclidean distance between the two respondents.
     • “Stage cluster first appears” indicates the stage at which a cluster is first formed. An entry of 1 at stage 6 means that respondent 14 was first grouped at stage 1.
     • “Next stage” is the stage at which another cluster is combined with this one. The number is 6, so at stage 6 clusters 10 and 14 are combined to form a single cluster.
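To make the columns of the schedule concrete, here is a rough sketch of how such a table could be produced. It assumes single linkage and Euclidean distance for brevity (the slides do not say which method produced the SPSS output), uses five hypothetical cases, and labels a merged cluster by its lowest case number, which is also an assumption.

```python
import math


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def agglomeration_schedule(cases):
    """Single-linkage agglomerative clustering; returns one row per stage."""
    # Every case starts as its own cluster, labeled by its 1-based case number.
    clusters = {i + 1: [i + 1] for i in range(len(cases))}
    schedule = []
    stage = 0
    while len(clusters) > 1:
        stage += 1
        best = None
        labels = sorted(clusters)
        # Find the closest pair of clusters at this stage.
        for i, a in enumerate(labels):
            for b in labels[i + 1:]:
                d = min(euclidean(cases[p - 1], cases[q - 1])
                        for p in clusters[a] for q in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        # The merged cluster keeps the lower label (a < b by construction).
        clusters[a] = clusters[a] + clusters.pop(b)
        schedule.append({"stage": stage, "combined": (a, b),
                         "coefficient": d, "clusters_left": len(clusters)})
    return schedule


# Five hypothetical cases measured on two variables.
cases = [(1, 1), (1, 2), (5, 5), (5, 6), (9, 1)]
schedule = agglomeration_schedule(cases)
for row in schedule:
    print(row)
```

With n cases there are always n - 1 stages, which is why 20 respondents give a schedule whose first stage leaves 19 clusters.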
  9. Icicle plot
     • Columns correspond to the objects being clustered, 1 through 20.
     • Rows correspond to the number of clusters.
     • The figure is read from bottom to top.
     • At first all cases are treated as individual clusters, so the last row corresponds to the 20 initial clusters.
     • In the first step the two closest objects are combined, leaving 19 clusters: respondents 14 and 16 are combined, which is shown by X's.
     • Row 18 corresponds to 18 clusters: respondents 6 and 7 are combined. At this point 16 clusters are individual respondents and two contain two respondents each.
     • Each subsequent step leads to the formation of a new cluster.
  10. Dendrogram
     • Read from left to right.
     • Vertical lines represent clusters that are joined together.
     • The position of the line represents the distance at which the clusters were joined.
     • In the early stages the distances are similar, so the structure is hard to read; as the distances increase in the later stages it becomes clear.
  11. Deciding the number of clusters
     • Practical, theoretical or conceptual considerations guide the choice of the number of clusters.
     • In hierarchical clustering, the distances at which clusters are combined can be used as a criterion. The value in the “Coefficients” column suddenly more than doubles between stages 17 (three clusters) and 18 (two clusters), which points to a three-cluster solution. The same jump can be seen in the last two stages of the dendrogram.
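The "sudden jump" rule above can be written down as a small helper. This is only a sketch: the function name is made up, the coefficients are hypothetical rather than the values from the SPSS output, and in practice the result should still be weighed against practical and conceptual considerations.

```python
def suggest_cluster_count(coefficients):
    """Suggest a cluster count from the largest relative jump in coefficients.

    coefficients[i] is the merge distance at stage i + 1, so a schedule with
    n - 1 stages describes n cases. If the largest jump is between stage s and
    stage s + 1, merging stops after stage s, which leaves n - s clusters.
    """
    n_cases = len(coefficients) + 1
    best_stage, best_ratio = None, 0.0
    for stage in range(1, len(coefficients)):
        before, after = coefficients[stage - 1], coefficients[stage]
        if before <= 0:
            continue  # a ratio is undefined when the earlier distance is zero
        ratio = after / before
        if ratio > best_ratio:
            best_stage, best_ratio = stage, ratio
    if best_stage is None:
        raise ValueError("need at least two stages with positive coefficients")
    return n_cases - best_stage, best_ratio


# Hypothetical coefficients for 7 cases (6 stages).
coefficients = [1.0, 1.5, 2.0, 2.5, 7.0, 9.0]
n_clusters, jump = suggest_cluster_count(coefficients)
print(n_clusters, jump)
```

Here the coefficient more than doubles between stages 4 and 5, so the sketch stops after stage 4 and suggests three clusters, the same reasoning the slide applies to stages 17 and 18.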
  12. Interpreting and profiling the clusters
     • Cluster 1 has high values on variables V1 (shopping is fun) and V3 (I combine shopping with eating out) and a low value on V5 (I don't care about shopping). It can be labeled “fun-loving and concerned shoppers”. It consists of cases 1, 3, 6, 7, 8, 12, 15 and 17.
     • Cluster 2 is just the opposite, with low values on V1 and V3 and a high value on V5, so it can be labeled “apathetic shoppers”. It consists of cases 2, 5, 9, 11, 13 and 20.
     • Cluster 3 has high values on V2 (shopping upsets my budget), V4 (I try to get the best buys) and V6 (comparing saves money), so it can be labeled “economical shoppers”. It consists of cases 4, 10, 14, 16, 18 and 19.
  13. Non-Hierarchical Clustering
  14. • The initial cluster centers are the values of three randomly selected cases. Each case is assigned to the nearest classification cluster center.
     • The results also display the cluster membership and the distance between each case and its classification center.
     • Cluster 1 of hierarchical clustering is the same as cluster 3 of non-hierarchical clustering.
     • Cluster 3 of hierarchical clustering is the same as cluster 1 of non-hierarchical clustering.
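The assign-and-update loop behind k-means can be sketched as follows. This is a minimal illustration, not the SPSS procedure: the cases are hypothetical, and the initial centers are passed in explicitly (rather than drawn at random) so that the run is reproducible.

```python
import math


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def assign(cases, centers):
    """Index of the nearest center for every case."""
    return [min(range(len(centers)), key=lambda k: euclidean(case, centers[k]))
            for case in cases]


def k_means(cases, initial_centers, max_iter=100):
    centers = [tuple(c) for c in initial_centers]
    for _ in range(max_iter):
        membership = assign(cases, centers)
        new_centers = []
        for k in range(len(centers)):
            members = [c for c, m in zip(cases, membership) if m == k]
            if members:
                # New center = mean of the member cases on every variable.
                new_centers.append(tuple(sum(col) / len(members)
                                         for col in zip(*members)))
            else:
                new_centers.append(centers[k])  # keep the center of an empty cluster
        if new_centers == centers:
            break  # centers stopped moving
        centers = new_centers
    membership = assign(cases, centers)
    distances = [euclidean(c, centers[m]) for c, m in zip(cases, membership)]
    return membership, centers, distances


# Six hypothetical cases on two variables; two of them serve as initial centers.
cases = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8)]
membership, centers, distances = k_means(cases, [cases[0], cases[3]])
print(membership)
print(centers)
```

The returned `membership` and `distances` play the role of the cluster membership and distance-to-center columns mentioned in the slide. Cluster numbers are arbitrary labels, which is why cluster 1 from one method can correspond to cluster 3 from another.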
  15. • The distances between the final cluster centers indicate that the pairs of clusters are well separated.
     • A univariate F test for each clustering variable is presented. It is only descriptive.
  16. Two-Step Clustering
  17. • AIC is at a minimum (97.594) for a three-cluster solution. A comparison of cluster centroids shows that cluster 1 (two-step) corresponds to cluster 2 (hierarchical) and cluster 2 (two-step) corresponds to cluster 3 (hierarchical).
     • The agreement between the results supports the validity of the clustering.
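Because cluster numbers are arbitrary, comparing two solutions means finding which label in one corresponds to which label in the other. The slides do this by comparing centroids; the sketch below uses a cross-tabulation of memberships instead, which is a swapped-in but common alternative. The membership lists are hypothetical and the function name is illustrative.

```python
from collections import Counter


def match_clusters(labels_a, labels_b):
    """Map each cluster in solution A to the cluster in B it overlaps most."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both solutions must cover the same cases")
    crosstab = Counter(zip(labels_a, labels_b))
    candidates = sorted(set(labels_b))
    mapping = {a: max(candidates, key=lambda b: crosstab[(a, b)])
               for a in sorted(set(labels_a))}
    # Share of cases that fall in the matched cell of the cross-tabulation.
    agreement = sum(crosstab[(a, b)] for a, b in mapping.items()) / len(labels_a)
    return mapping, agreement


# Hypothetical memberships for seven cases under two procedures.
two_step = [1, 1, 2, 2, 3, 3, 3]
hierarchical = [2, 2, 3, 3, 1, 1, 1]

mapping, agreement = match_clusters(two_step, hierarchical)
print(mapping, agreement)
```

An agreement close to 1 means the two procedures group the cases in the same way apart from the labels, which is the kind of consistency the slide treats as evidence of validity.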
