Gauging Heterogeneity in Online Consumer Behaviour Data: A Proximity Graph Approach
Natalie de Vries, Ahmed Shamsul Arefin, Pablo Moscato
The Priority Research Centre for Bioinformatics, Biomarker Discovery and Information-Based Medicine (CIBM)
School of Electrical Engineering and Computer Science
Faculty of Engineering and Built Environment
The University of Newcastle, Australia
Agenda
• Introduction and objectives
• Dataset characteristics
• Outline of the study
• Methodology
• Results
• Significance of the work and future research directions
• Questions
Introduction
• Increase in online behaviours towards brands
• Increasing importance of social media in marketing strategies
• High levels of heterogeneity amongst consumers
• Need for clustering consumers or objects into similar groups
[Figure: example demographic segments: middle-aged females, middle-aged males, retirees, teenagers, housewives]
Introduction: Importance of Clustering in Marketing
[Figure: example behavioural segments: “Brand lovers”, “Brand haters”, “Excited sharers”, “Online lurkers”, “Quiet supporters”]
• Gaining insights into consumer behaviour
• Market segmentation
• Targeted marketing strategies
• Personalised marketing messages
• Online technologies available to personalise brand messages at a very small or individual level
[Figure labels: “Old-fashioned way” | Modern “data-driven way”]
Objectives of this Study
• Create an understanding of the natural groupings in a consumer cohort based on their online consumer behaviours towards a particular brand
• Find a suitable distance measure for analysing a specific dataset in a specific context
• Explore the use of meta-features for finding a more accurate partitioning of respondents
• Uncover the best way to cluster consumers, e.g. using raw data or a form of meta-features, and using either intra- or inter-construct relationships
Methodology: Dataset collection and preparation
Construct | Source | Code | Number of Items
Usage Intensity | (Jahn and Kunz 2012) | UI | 3
Functional Value | (Jahn and Kunz 2012) | FUV | 4
Hedonic Value | (Jahn and Kunz 2012) | HED | 4
Social Interaction Value | (Jahn and Kunz 2012) | SOC | 4
Customer Engagement | (Jahn and Kunz 2012) | CE | 5
Customer Loyalty | (Jahn and Kunz 2012) | LO | 6
Brand Involvement | (Carlson and O'Cass 2012) | INV | 6
Co-Creation Value | (O'Cass and Ngo 2011) | CCV | 6
SNS-Specific Loyalty Behaviours | (O'Cass and Carlson 2012) | ON | 3
Self-Brand-Congruency | (Hohenstein, Sirgy et al. 2007) | SBC | 5
Survey Constructs
Category No. | Explanation | Percentage of sample
1 | Fashion Brands | 31.54%
2 | Community, Charities, Personality and Sports Fan Pages | 23.99%
3 | Other Services | 19.68%
4 | Other Consumer Goods | 8.09%
5 | Hospitality (Restaurants, Cafes, Bars) | 7.28%
6 | Consumer Electronics | 7.01%
7 | Automotive | 2.43%
Respondents’ chosen brands’ categories
Methodology: Outline of the study
Methodology: Difference Meta-features
The difference between the values of two measured features may be able to distinguish between two given categories, even when neither feature can do so alone (De Paula et al. 2011).
Difference meta-features have previously been applied successfully to Alzheimer’s Disease biomarker detection (De Paula et al. 2011; Arefin et al. 2012, both in PLoS ONE).
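The idea can be illustrated with a minimal sketch. The data below are invented for illustration (not taken from the study), and the helper name `difference_meta_features` is hypothetical. In this toy case f1 is identical in both classes and the f2 ranges overlap, yet the difference f1 - f2 separates the classes cleanly.

```python
from itertools import combinations

def difference_meta_features(row, names):
    """Map "a-b" to row[a] - row[b] for every unordered pair of features."""
    return {f"{a}-{b}": row[a] - row[b] for a, b in combinations(names, 2)}

# Toy data (hypothetical): neither f1 nor f2 separates the classes alone.
class_a = [{"f1": 2, "f2": 1}, {"f1": 5, "f2": 4}, {"f1": 8, "f2": 7}]
class_b = [{"f1": 2, "f2": 4}, {"f1": 5, "f2": 7}, {"f1": 8, "f2": 10}]

diff_a = [difference_meta_features(r, ["f1", "f2"])["f1-f2"] for r in class_a]
diff_b = [difference_meta_features(r, ["f1", "f2"])["f1-f2"] for r in class_b]

print(diff_a)  # [1, 1, 1]
print(diff_b)  # [-2, -2, -2]
```

With more than two features, the same helper returns one meta-feature per unordered pair, so the number of meta-features grows quadratically with the number of survey items.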
[Figure: study workflow boxes: Data collection and pre-processing; Meta-features: pair-wise differences; Meta-features: pair-wise products; Intra- and inter-construct relationships; Distance computation. Stage label: Data preparation]
[Figure: two line charts plotting f1, f2 and the difference meta-feature (Meta-f) across samples grouped as Class A and Class B]
Methodology: Product Meta-features
The product of the values of two measured features may be able to distinguish between two given categories, even when neither feature can do so alone.
This study is the first to trial the application of this idea.
On the left, the values of f1 (blue) and f2 (red) do not distinguish the classes well, but their product (meta-feature in green) does.
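A small sketch of the product idea, under the same caveats as any toy example: the numbers are invented (not the plotted data), and `product_meta_features` is a hypothetical helper name. Here f1 takes the same values in both classes and the f2 ranges overlap, while the product f1 * f2 is constant within each class and differs between them.

```python
from itertools import combinations

def product_meta_features(row, names):
    """Map "a*b" to row[a] * row[b] for every unordered pair of features."""
    return {f"{a}*{b}": row[a] * row[b] for a, b in combinations(names, 2)}

# Toy data (hypothetical): f1 matches across classes, f2 ranges overlap.
class_a = [{"f1": 1, "f2": 6}, {"f1": 2, "f2": 3},
           {"f1": 3, "f2": 2}, {"f1": 6, "f2": 1}]
class_b = [{"f1": 1, "f2": 12}, {"f1": 2, "f2": 6},
           {"f1": 3, "f2": 4}, {"f1": 6, "f2": 2}]

prod_a = [product_meta_features(r, ["f1", "f2"])["f1*f2"] for r in class_a]
prod_b = [product_meta_features(r, ["f1", "f2"])["f1*f2"] for r in class_b]

print(prod_a)  # [6, 6, 6, 6]
print(prod_b)  # [12, 12, 12, 12]
```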
[Figure: two line charts plotting f1, f2 and the product meta-feature (Meta-f) across samples grouped as Class A and Class B]
Methodology: Distance Computation and Dataset Variations
• Distance matrices computed for all 7 datasets
• Various distance/correlation metrics used on each of the dataset variations
Distance metrics: Pearson, Spearman, Robust, Euclidean, Cosine
× Various datasets: Original, Difference meta-features, Product meta-features
× Interactions: Intra-construct item relationships, Inter-construct item relationships
× Values of k for kNN cliques: k=3, k=4, k=5, k=6
= 7 datasets and 140 graphs
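The totals on this slide can be checked with a short sketch. It assumes one reading of the slide that is consistent with its numbers: the 7 datasets are the original data plus difference and product meta-features, each built from all, intra-construct, or inter-construct item pairs, and every dataset is combined with every metric and every k. The distance functions are generic textbook definitions, and turning a correlation r into a distance as 1 - r is an assumption of this sketch. "Robust" is not defined on the slides, so it is only counted, not implemented; Spearman (Pearson on ranks) is omitted for brevity.

```python
import math
from itertools import product

METRICS = ["Pearson", "Spearman", "Robust", "Euclidean", "Cosine"]
# Assumed breakdown: original + {difference, product} x {all, intra, inter}.
DATASETS = ["original"] + [
    f"{kind}-{scope}"
    for kind in ("difference", "product")
    for scope in ("all", "intra", "inter")
]
K_VALUES = [3, 4, 5, 6]

runs = list(product(DATASETS, METRICS, K_VALUES))
print(len(DATASETS), len(runs))  # 7 140

def euclidean(x, y):
    """Straight-line distance between two response vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def cosine_distance(x, y):
    """1 - cosine similarity; vectors must be non-zero."""
    dot = sum(a * b for a, b in zip(x, y))
    norms = math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y))
    return 1.0 - dot / norms

def pearson_distance(x, y):
    """1 - Pearson correlation; vectors must not be constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return 1.0 - cov / (sx * sy)
```

Each of the 140 runs would then pair one distance matrix with one value of k when building the proximity graph.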
Methodology: MST-kNN and kNN Cliques
[Figure: MST-kNN steps] Complete graph → Minimum Spanning Tree → Select and remove edges that are not k-Nearest Neighbors → Final forest (a forest is a set of trees) = clusters
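The steps above can be sketched in plain Python. This is a simplified, single-pass illustration and not the authors' implementation: it builds the minimum spanning tree with Prim's algorithm, keeps an MST edge only if at least one endpoint is among the other's k nearest neighbours (an assumed reading of "edges that are not k-Nearest Neighbors"; requiring both directions is a one-line change), and reports the connected components of the remaining forest. The kNN-clique merging step is left out, and all function names are hypothetical.

```python
def mst_edges(dist):
    """Prim's algorithm on a dense symmetric distance matrix."""
    n = len(dist)
    in_tree = {0}
    edges = []
    while len(in_tree) < n:
        # Cheapest edge leaving the current tree.
        i, j = min(
            ((a, b) for a in in_tree for b in range(n) if b not in in_tree),
            key=lambda e: dist[e[0]][e[1]],
        )
        edges.append((i, j))
        in_tree.add(j)
    return edges

def knn_sets(dist, k):
    """For each node, the set of its k nearest other nodes."""
    n = len(dist)
    return [
        set(sorted((j for j in range(n) if j != i),
                   key=lambda j: dist[i][j])[:k])
        for i in range(n)
    ]

def mst_knn_clusters(dist, k):
    """Drop MST edges that are not kNN edges; return the forest's components."""
    nn = knn_sets(dist, k)
    kept = [(i, j) for i, j in mst_edges(dist) if j in nn[i] or i in nn[j]]
    parent = list(range(len(dist)))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i, j in kept:
        parent[find(i)] = find(j)
    groups = {}
    for v in range(len(dist)):
        groups.setdefault(find(v), []).append(v)
    return sorted(groups.values())

# Toy example (hypothetical): two well-separated groups on a line.
points = [0, 1, 2, 10, 11, 12]
dist = [[abs(a - b) for b in points] for a in points]
print(mst_knn_clusters(dist, k=2))  # [[0, 1, 2], [3, 4, 5]]
```

In the toy example the only MST edge removed is the long one joining the two groups, because neither endpoint counts the other among its two nearest neighbours.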
Previous applications of the MST-kNN method
• U.S. Stock market time series data (Inostroza-Ponta, Berretta, & Moscato, 2011)
• Yeast gene expression data (Inostroza-Ponta, Mendes, Berretta, & Moscato, 2007)
• Alzheimer’s disease data - in the order of 1 million data elements (Arefin, Mathieson, Johnstone, Berretta, & Moscato, 2012)
• Prostate cancer data (Capp et al., 2009)
These examples show that the methodology proposed here has proven scalability to larger datasets
MST-kNN + kNN Cliques Results
Results: Clustering Highlights
Heterogeneous cluster? More homogeneous cluster?
And what about the statistical significance of the clustering result that these highlights came from?
Results: Clustering and Significance Values
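Data | Rows selected | Distance Metric | MST-kNN merged with the kNN cliques of size | p-value (Wilcoxon’s Test) | p-value (Kruskal-Wallis)
Original | All | Robust | 5NN | 0.021187 | 0.042364
Original | All | Spearman | 6NN | 0.025987 | 0.051962
Original | All | Robust | 6NN | 0.028565 | 0.057117
Original | All | Pearson | 3NN | 0.030232 | 0.060451
Original | All | Spearman | 3NN | 0.040661 | 0.081306
Original | All | Euclidean | 6NN | 0.041232 | 0.082448
Difference Metafeatures | ‘Intra’ constructs | Robust | 3NN | 0.016551 | 0.033095
Difference Metafeatures | ‘Intra’ constructs | Robust | 6NN | 0.017177 | 0.03434
Difference Metafeatures | ‘Intra’ constructs | Pearson | 3NN | 0.018628 | 0.0372481
Difference Metafeatures | ‘Intra’ constructs | Pearson | 6NN | 0.019066 | 0.038124
Difference Metafeatures | ‘Intra’ constructs | Pearson | 5NN | 0.019656 | 0.039303
Difference Metafeatures | All | Pearson | 3NN | 0.020594 | 0.041180
Product Metafeatures | ‘Inter’ Constructs | Spearman | 3NN | 0.016949 | 0.033891
Product Metafeatures | ‘Inter’ Constructs | Pearson | 4NN | 0.01757 | 0.035132
Product Metafeatures | All | Pearson | 4NN | 0.017721 | 0.035433
Product Metafeatures | ‘Inter’ Constructs | Pearson | 6NN | 0.01781 | 0.035611
Product Metafeatures | ‘Inter’ Constructs | Pearson | 3NN | 0.017816 | 0.035624
Product Metafeatures | ‘Inter’ Constructs | Robust | 4NN | 0.017998 | 0.035988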
Results: Analysis of clusters
Cluster | No. of respondents | Avg. Age | Age range | % Males / Females
1 | 103 | 20.5 | 17-32 | 39.8 / 60.2
2 | 92 | 21.3 | 18-36 | 39.1 / 60.9
3 | 31 | 23.4 | 19-49 | 51.6 / 48.4
4 | 71 | 21.0 | 18-44 | 40.8 / 59.2
5 | 4 | 22.3 | 20-24 | 75 / 25
6 | 18 | 21.1 | 18-26 | 33.3 / 66.7
7 | 10 | 22.5 | 18-29 | 20 / 80
8 | 5 | 21 | 20-24 | 80 / 20
9 | 20 | 23 | 19-44 | 45 / 55
10 | 12 | 22 | 18-45 | 41.7 / 58.3
11 | 5 | 26.4 | 20-46 | 0 / 100
Clusters’ demographic information
This figure presents the frequencies of the respondents’ chosen brand categories for two of the largest clusters.
The difference in degrees of heterogeneity between clusters can be seen in these figures.
Furthermore, these two clusters highlight the differences in brand preferences that exist among respondents within each cluster of similar consumers.
Heterogeneous spread of respondents’ chosen brand categories
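One way to put a number on how heterogeneous a cluster's brand categories are is Shannon entropy of the category frequencies. The slides do not say how heterogeneity was quantified, so this choice is an assumption of the sketch, and the two clusters below are invented for illustration (category numbers follow the brand-category table). Higher entropy means a more even spread across categories.

```python
import math
from collections import Counter

def category_entropy(categories):
    """Shannon entropy (in bits) of the category frequencies in one cluster."""
    counts = Counter(categories)
    total = sum(counts.values())
    return sum(-(c / total) * math.log2(c / total) for c in counts.values())

# Hypothetical clusters: each entry is one respondent's brand category number.
mixed_cluster = [1, 1, 2, 2, 3, 3, 5, 6]
focused_cluster = [1, 1, 1, 1, 1, 1, 1, 2]

print(category_entropy(mixed_cluster) > category_entropy(focused_cluster))  # True
```

A cluster where every respondent chose the same category has entropy 0, and a cluster spread evenly over n categories reaches the maximum of log2(n) bits.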
Contribution and Significance
• Methodological guide for the investigation of several distance measures, meta-features, and relationships of theoretical construct items to find ‘best’ clustering results
• Expanded on the MST-kNN clustering method for increased potential to find statistically significant clusters of categories of consumers and their chosen brands
• The clustering methodology used in this study highlights the high levels of heterogeneity found in consumers’ online behaviours towards brands
Future Research Directions
• Various domains and contexts to apply the novel process outlined in this study
• Combine a study using survey data as well as ‘live’ behaviour data from social networking sites (real-time interactions)
• Further exploration of meta-features in both survey data and ‘real’ online behaviour clustering studies; ‘differences’ meta-features in this study yielded better results
• This study guides the development of future feature selection models to identify groups of consumers according to higher-order characteristics.
Thank you
Questions?
We would like to thank Dr. Jamie Carlson and Mr. Benjamin Lucas for their advice and proofreading.
Dr. Jamie Carlson supervised Ms. de Vries’ thesis project and the initial collection and analysis of this data.
Thanks to Mario Inostroza-Ponta for the use of his MST-kNN images.
References (from paper)
• [1] I. P. Cvijikj and F. Michahelles, "Online engagement factors on Facebook brand pages," Social Network Analysis and Mining, vol. 3, pp. 843-861, 2013.
• [2] B. Jahn and W. Kunz, "How to transform consumers into fans of your brand," Journal of Service Management, vol. 23, pp. 344-361, 2012.
• [3] T. S. Chung and M. Wedel, "Adaptive personalization of mobile information services," in Handbook of Service Marketing Research, R. T. Rust and M.-H. Huang, Eds. Cheltenham: Edward Elgar Publishing Limited, 2014.
• [4] N. J. de Vries, J. Carlson, and P. Moscato, "A Data-Driven Approach to Reverse Engineering Customer Engagement Models: Towards Functional Constructs," PLoS ONE, vol. 9, p. e102768, 2014.
• [5] B. Jahn and W. Kunz, "How to Transform Consumers into Fans of your Brand," Journal of Service Management, vol. 23, pp. 344-361, 2012.
• [6] J. Carlson and A. O'Cass, "Optimizing the Online Channel in Professional Sport to Create Trusting and Loyal Consumers: The Role of the Professional Sports Team Brand and Service Quality," Journal of Sport Management, vol. 26, p. 463, 2012.
• [7] N. Hohenstein, M. J. Sirgy, A. Herrmann, and M. Heitmann, "Self-Congruity: Antecedents and Consequences," in 34th La Londe International Research Conference in Marketing Communications and Consumer Behaviour, Aix-en-Provence, France: University Paul Cezanne, 2007, pp. 118-130.
• [8] A. O'Cass and L. Ngo, "Examining the Firm’s Value Creation Process: A Managerial Perspective of the Firm’s Value Offering Strategy and Performance," British Journal of Management, vol. 22, pp. 646-671, 2011.
• [9] A. O'Cass and J. Carlson, "An Empirical Assessment of Consumers' Evaluations of Web Site Service Quality: Conceptualizing and Testing a Formative Model," Journal of Services Marketing, vol. 26, pp. 419-434, 2012.
• [10] L. D. Peters, "Theory Testing in Social Research," The Marketing Review, vol. 3, pp. 65-82, 2002.
• [11] M. R. de Paula, M. G. Ravetti, R. Berretta, and P. Moscato, "Differences in Abundances of Cell-Signalling Proteins in Blood Reveal Novel Biomarkers for Early Detection Of Clinical Alzheimer’s Disease," PLoS ONE, vol. 6, pp. 1-14, 2011.
• [12] A. S. Arefin, L. Mathieson, D. Johnstone, R. Berretta, and P. Moscato, "Unveiling Clusters of RNA Transcript Pairs Associated with Markers of Alzheimer's Disease Progression," PLoS ONE, vol. 7, Sep 21 2012.
• [13] M. Inostroza-Ponta, R. Berretta, A. Mendes, and P. Moscato, "An automatic graph layout procedure to visualize correlated data," in Artificial Intelligence in Theory and Practice. Springer, 2006, pp. 179-188.
• [14] A. S. Arefin, L. Mathieson, D. Johnstone, R. Berretta, and P. Moscato, "Unveiling clusters of RNA transcript pairs associated with markers of Alzheimer’s disease progression," PLoS ONE, vol. 7, p. e45535, 2012.
• [15] A. Capp, M. Inostroza-Ponta, D. Bill, P. Moscato, C. Lai, D. Christie, et al., "Is there more than one proctitis syndrome? A revisitation using data from the TROG 96.01 trial," Radiotherapy and Oncology, vol. 90, pp. 400-407, 2009.
• [16] M. Inostroza-Ponta, A. Mendes, R. Berretta, and P. Moscato, "An integrated QAP-based approach to visualize patterns of gene expression similarity," in Progress in Artificial Life. Springer, 2007, pp. 156-167.
• [17] M. Inostroza-Ponta, R. Berretta, and P. Moscato, "QAPgrid: A two level QAP-based approach for large-scale data analysis and visualization," PLoS ONE, vol. 6, p. e14468, 2011.
• [18] A. S. Arefin, M. Inostroza-Ponta, L. Mathieson, R. Berretta, and P. Moscato, "Clustering nodes in large-scale biological networks using external memory algorithms," in Algorithms and Architectures for Parallel Processing. Springer, 2011, pp. 375-386.
• [19] A. S. Arefin, C. Riveros, R. Berretta, and P. Moscato, "GPU-FS-kNN: A software tool for fast and scalable kNN computation using GPUs," PLoS ONE, vol. 7, p. e44000, 2012.
• [20] A. S. Arefin, C. Riveros, R. Berretta, and P. Moscato, "kNN-Borůvka-GPU: A Fast and Scalable MST Construction from kNN Graphs on GPU," in Computational Science and Its Applications–ICCSA 2012. Springer, 2012, pp. 71-86.
• [21] A. S. Arefin, C. Riveros, R. Berretta, and P. Moscato, "kNN-MST-Agglomerative: A fast and scalable graph-based data clustering approach on GPU," in Computer Science & Education (ICCSE), 2012 7th International Conference on, 2012, pp. 585-590.
• [22] E. J. Chesler and M. A. Langston, Combinatorial genetic regulatory network analysis tools for high throughput transcriptomic data: Springer, 2006.
• [23] M. Hollander, D. A. Wolfe, and E. Chicken, Nonparametric statistical methods, vol. 751: John Wiley & Sons, 2013.
• [24] C. E. Shannon, "The mathematical theory of communication," The Bell System Technical Journal, vol. 27, pp. 379-423 & 623-656, 1948.
• [25] A. W. Kruglanski, "The Human Subject in the Psychology Experiment: Fact and Artifact," in Advances in Experimental Social Psychology, vol. 8, L. Berkowitz, Ed. New York: Academic Press, 1975, pp. 101-147.
• [26] H. Krasnova, S. Spiekermann, K. Koroleva, and T. Hildebrand, "Online Social Networks: Why we Disclose," Journal of Information Technology, vol. 25, pp. 109-125, 2010.
• [27] S. C. Chu and Y. Kim, "Determinants of Consumer Engagement in electronic Word of Mouth (eWoM) in Social Networking Sites," International Journal of Advertising, vol. 30, pp. 47-75, 2011.
• [28] J. M. Pinho and A. M. Soares, "Examining the Technology Acceptance Model in the Adoption of Social Networks," Journal of Research in Interactive Marketing, vol. 5, pp. 116-129, 2011.
Additional references from presentation:
• Arefin AS, Mathieson L, Johnstone D, Berretta R, Moscato P (2012) Unveiling Clusters of RNA Transcript Pairs Associated with Markers of Alzheimer’s Disease Progression. PLoS ONE 7(9): e45535. doi: 10.1371/journal.pone.0045535
• Rocha de Paula M, Gómez Ravetti M, Berretta R, Moscato P (2011) Differences in Abundances of Cell-Signalling Proteins in Blood Reveal Novel Biomarkers for Early Detection Of Clinical Alzheimer's Disease. PLoS ONE 6(3): e17481. doi: 10.1371/journal.pone.0017481
References cont.

More Related Content

What's hot

Multidirectional Product Support System for Decision Making In Textile Indust...
Multidirectional Product Support System for Decision Making In Textile Indust...Multidirectional Product Support System for Decision Making In Textile Indust...
Multidirectional Product Support System for Decision Making In Textile Indust...
IOSR Journals
 
RANKING BASED ON COLLABORATIVE FEATURE-WEIGHTING APPLIED TO THE RECOMMENDATIO...
RANKING BASED ON COLLABORATIVE FEATURE-WEIGHTING APPLIED TO THE RECOMMENDATIO...RANKING BASED ON COLLABORATIVE FEATURE-WEIGHTING APPLIED TO THE RECOMMENDATIO...
RANKING BASED ON COLLABORATIVE FEATURE-WEIGHTING APPLIED TO THE RECOMMENDATIO...
ijaia
 
Demography basedhybridrecommendersystemformovierecommendation
Demography basedhybridrecommendersystemformovierecommendationDemography basedhybridrecommendersystemformovierecommendation
Demography basedhybridrecommendersystemformovierecommendation
UmmeSalmaM1
 
Using content features to enhance the
Using content features to enhance theUsing content features to enhance the
Using content features to enhance the
ijaia
 
Ontological and clustering approach for content based recommendation systems
Ontological and clustering approach for content based recommendation systemsOntological and clustering approach for content based recommendation systems
Ontological and clustering approach for content based recommendation systems
vikramadityajakkula
 
A location based movie recommender system
A location based movie recommender systemA location based movie recommender system
A location based movie recommender system
ijfcstjournal
 
Extending UTAUT to explain social media adoption by microbusinesses
Extending UTAUT to explain social media adoption by microbusinessesExtending UTAUT to explain social media adoption by microbusinesses
Extending UTAUT to explain social media adoption by microbusinesses
Debashish Mandal
 
A REVIEW PAPER ON BFO AND PSO BASED MOVIE RECOMMENDATION SYSTEM | J4RV4I1015
A REVIEW PAPER ON BFO AND PSO BASED MOVIE RECOMMENDATION SYSTEM | J4RV4I1015A REVIEW PAPER ON BFO AND PSO BASED MOVIE RECOMMENDATION SYSTEM | J4RV4I1015
A REVIEW PAPER ON BFO AND PSO BASED MOVIE RECOMMENDATION SYSTEM | J4RV4I1015
Journal For Research
 
Unified theory of acceptance and use of technology
Unified theory of acceptance and use of technologyUnified theory of acceptance and use of technology
Unified theory of acceptance and use of technology
Muhammad Farhan Javed
 
Paper Presentation: Data Mining User Preference in Interactive Multimedia
Paper Presentation: Data Mining User Preference in Interactive MultimediaPaper Presentation: Data Mining User Preference in Interactive Multimedia
Paper Presentation: Data Mining User Preference in Interactive MultimediaJeanette Howe
 
Data quality assessment in context= a cognitive perspective
Data quality assessment in context= a cognitive perspectiveData quality assessment in context= a cognitive perspective
Data quality assessment in context= a cognitive perspective
Francisco Vasconcellos
 
Study to investigate which analysis is the best equipped to understand how co...
Study to investigate which analysis is the best equipped to understand how co...Study to investigate which analysis is the best equipped to understand how co...
Study to investigate which analysis is the best equipped to understand how co...
Charm Rammandala
 
Evaluating the Impact of Gamification in High School Library Media Centers
Evaluating the Impact of Gamification in High School Library Media CentersEvaluating the Impact of Gamification in High School Library Media Centers
Evaluating the Impact of Gamification in High School Library Media Centers
Ariel Dagan
 
Ijsom19051398886200
Ijsom19051398886200Ijsom19051398886200
Ijsom19051398886200
IJSOM
 
FACTORS INFLUENCING THE ADOPTION OF E-GOVERNMENT SERVICES IN PAKISTAN
FACTORS INFLUENCING THE ADOPTION OF E-GOVERNMENT SERVICES IN PAKISTANFACTORS INFLUENCING THE ADOPTION OF E-GOVERNMENT SERVICES IN PAKISTAN
FACTORS INFLUENCING THE ADOPTION OF E-GOVERNMENT SERVICES IN PAKISTAN
Muhammad Ahmad
 
Paper Annotated: SinGAN-Seg: Synthetic Training Data Generation for Medical I...
Paper Annotated: SinGAN-Seg: Synthetic Training Data Generation for Medical I...Paper Annotated: SinGAN-Seg: Synthetic Training Data Generation for Medical I...
Paper Annotated: SinGAN-Seg: Synthetic Training Data Generation for Medical I...
Devansh16
 

What's hot (20)

Multidirectional Product Support System for Decision Making In Textile Indust...
Multidirectional Product Support System for Decision Making In Textile Indust...Multidirectional Product Support System for Decision Making In Textile Indust...
Multidirectional Product Support System for Decision Making In Textile Indust...
 
Paper 3 (iftikhar alam)
Paper 3 (iftikhar alam)Paper 3 (iftikhar alam)
Paper 3 (iftikhar alam)
 
T0 numtq0njc=
T0 numtq0njc=T0 numtq0njc=
T0 numtq0njc=
 
How online social ties and product-related risks influence purchase intention...
How online social ties and product-related risks influence purchase intention...How online social ties and product-related risks influence purchase intention...
How online social ties and product-related risks influence purchase intention...
 
RANKING BASED ON COLLABORATIVE FEATURE-WEIGHTING APPLIED TO THE RECOMMENDATIO...
RANKING BASED ON COLLABORATIVE FEATURE-WEIGHTING APPLIED TO THE RECOMMENDATIO...RANKING BASED ON COLLABORATIVE FEATURE-WEIGHTING APPLIED TO THE RECOMMENDATIO...
RANKING BASED ON COLLABORATIVE FEATURE-WEIGHTING APPLIED TO THE RECOMMENDATIO...
 
Mr1480.ch4
Mr1480.ch4Mr1480.ch4
Mr1480.ch4
 
Demography basedhybridrecommendersystemformovierecommendation
Demography basedhybridrecommendersystemformovierecommendationDemography basedhybridrecommendersystemformovierecommendation
Demography basedhybridrecommendersystemformovierecommendation
 
Using content features to enhance the
Using content features to enhance theUsing content features to enhance the
Using content features to enhance the
 
Ontological and clustering approach for content based recommendation systems
Ontological and clustering approach for content based recommendation systemsOntological and clustering approach for content based recommendation systems
Ontological and clustering approach for content based recommendation systems
 
A location based movie recommender system
A location based movie recommender systemA location based movie recommender system
A location based movie recommender system
 
Extending UTAUT to explain social media adoption by microbusinesses
Extending UTAUT to explain social media adoption by microbusinessesExtending UTAUT to explain social media adoption by microbusinesses
Extending UTAUT to explain social media adoption by microbusinesses
 
A REVIEW PAPER ON BFO AND PSO BASED MOVIE RECOMMENDATION SYSTEM | J4RV4I1015
A REVIEW PAPER ON BFO AND PSO BASED MOVIE RECOMMENDATION SYSTEM | J4RV4I1015A REVIEW PAPER ON BFO AND PSO BASED MOVIE RECOMMENDATION SYSTEM | J4RV4I1015
A REVIEW PAPER ON BFO AND PSO BASED MOVIE RECOMMENDATION SYSTEM | J4RV4I1015
 
Unified theory of acceptance and use of technology
Unified theory of acceptance and use of technologyUnified theory of acceptance and use of technology
Unified theory of acceptance and use of technology
 
Paper Presentation: Data Mining User Preference in Interactive Multimedia
Paper Presentation: Data Mining User Preference in Interactive MultimediaPaper Presentation: Data Mining User Preference in Interactive Multimedia
Paper Presentation: Data Mining User Preference in Interactive Multimedia
 
Data quality assessment in context= a cognitive perspective
Data quality assessment in context= a cognitive perspectiveData quality assessment in context= a cognitive perspective
Data quality assessment in context= a cognitive perspective
 
Study to investigate which analysis is the best equipped to understand how co...
Study to investigate which analysis is the best equipped to understand how co...Study to investigate which analysis is the best equipped to understand how co...
Study to investigate which analysis is the best equipped to understand how co...
 
Evaluating the Impact of Gamification in High School Library Media Centers
Evaluating the Impact of Gamification in High School Library Media CentersEvaluating the Impact of Gamification in High School Library Media Centers
Evaluating the Impact of Gamification in High School Library Media Centers
 
Ijsom19051398886200
Ijsom19051398886200Ijsom19051398886200
Ijsom19051398886200
 
FACTORS INFLUENCING THE ADOPTION OF E-GOVERNMENT SERVICES IN PAKISTAN
FACTORS INFLUENCING THE ADOPTION OF E-GOVERNMENT SERVICES IN PAKISTANFACTORS INFLUENCING THE ADOPTION OF E-GOVERNMENT SERVICES IN PAKISTAN
FACTORS INFLUENCING THE ADOPTION OF E-GOVERNMENT SERVICES IN PAKISTAN
 
Paper Annotated: SinGAN-Seg: Synthetic Training Data Generation for Medical I...
Paper Annotated: SinGAN-Seg: Synthetic Training Data Generation for Medical I...Paper Annotated: SinGAN-Seg: Synthetic Training Data Generation for Medical I...
Paper Annotated: SinGAN-Seg: Synthetic Training Data Generation for Medical I...
 

Similar to Presentation at Socialcom2014: Gauging Heterogeneity in Online Consumer Behaviour Data: A Proximity Graph Approach

BIG-DATAPPTFINAL.ppt
BIG-DATAPPTFINAL.pptBIG-DATAPPTFINAL.ppt
BIG-DATAPPTFINAL.ppt
rajsharma159890
 
Mouton Dfid Presentaion 14 September2009
Mouton Dfid Presentaion  14 September2009Mouton Dfid Presentaion  14 September2009
Mouton Dfid Presentaion 14 September2009CRUOnline
 
A Study of Neural Network Learning-Based Recommender System
A Study of Neural Network Learning-Based Recommender SystemA Study of Neural Network Learning-Based Recommender System
A Study of Neural Network Learning-Based Recommender System
theijes
 
FAIR and metadata standards - FAIRsharing and Neuroscience
FAIR and metadata standards - FAIRsharing and NeuroscienceFAIR and metadata standards - FAIRsharing and Neuroscience
FAIR and metadata standards - FAIRsharing and Neuroscience
Susanna-Assunta Sansone
 
"Standards landscape" NIF Big Data 2 Knowledge (BD2K) Initiative, Sep, 2013
"Standards landscape" NIF Big Data 2 Knowledge (BD2K) Initiative, Sep, 2013"Standards landscape" NIF Big Data 2 Knowledge (BD2K) Initiative, Sep, 2013
"Standards landscape" NIF Big Data 2 Knowledge (BD2K) Initiative, Sep, 2013
Susanna-Assunta Sansone
 
DM UNIT_5 ppt for btech final year students
DM UNIT_5 ppt for btech final year studentsDM UNIT_5 ppt for btech final year students
DM UNIT_5 ppt for btech final year students
sriharipatilin
 
Sabina Leonelli
Sabina LeonelliSabina Leonelli
Sabina Leonelli
Anita de Waard
 
Multivariate data analysis
Multivariate data analysisMultivariate data analysis
Multivariate data analysisSetia Pramana
 
Automating Data Science over a Human Genomics Knowledge Base
Automating Data Science over a Human Genomics Knowledge BaseAutomating Data Science over a Human Genomics Knowledge Base
Automating Data Science over a Human Genomics Knowledge Base
Vaticle
 
Correlation of artificial neural network classification and nfrs attribute fi...
Correlation of artificial neural network classification and nfrs attribute fi...Correlation of artificial neural network classification and nfrs attribute fi...
Correlation of artificial neural network classification and nfrs attribute fi...
eSAT Journals
 
USER STUDY FOR EXPLORATION OF USERS NEEDS
USER STUDY FOR EXPLORATION OF USERS NEEDS USER STUDY FOR EXPLORATION OF USERS NEEDS
USER STUDY FOR EXPLORATION OF USERS NEEDS
IAEME Publication
 
Social network analysis
Social network analysisSocial network analysis
Social network analysis
World Agroforestry (ICRAF)
 
Data Management and Broader Impacts: a holistic approach
Data Management and Broader Impacts: a holistic approachData Management and Broader Impacts: a holistic approach
Data Management and Broader Impacts: a holistic approach
Megan O'Donnell
 
Real Life Machine Learning Case on Mobile Advertisement
Real Life Machine Learning Case on Mobile AdvertisementReal Life Machine Learning Case on Mobile Advertisement
Real Life Machine Learning Case on Mobile Advertisement
şadi şeker
 
Service rating prediction by exploring social mobile users’ geographical loca...
Service rating prediction by exploring social mobile users’ geographical loca...Service rating prediction by exploring social mobile users’ geographical loca...
Service rating prediction by exploring social mobile users’ geographical loca...
CloudTechnologies
 
METHODS1Sampling and MethodologyStuden
METHODS1Sampling and MethodologyStudenMETHODS1Sampling and MethodologyStuden
METHODS1Sampling and MethodologyStuden
DioneWang844
 
What's Trending In Higher Ed Tracking
What's Trending In Higher Ed TrackingWhat's Trending In Higher Ed Tracking
What's Trending In Higher Ed TrackingMelissa Sullivan
 
Supervised Multi Attribute Gene Manipulation For Cancer
Supervised Multi Attribute Gene Manipulation For CancerSupervised Multi Attribute Gene Manipulation For Cancer
Supervised Multi Attribute Gene Manipulation For Cancer
paperpublications3
 

Similar to Presentation at Socialcom2014: Gauging Heterogeneity in Online Consumer Behaviour Data: A Proximity Graph Approach (20)

BIG-DATAPPTFINAL.ppt
BIG-DATAPPTFINAL.pptBIG-DATAPPTFINAL.ppt
BIG-DATAPPTFINAL.ppt
 
Mouton Dfid Presentaion 14 September2009
Mouton Dfid Presentaion  14 September2009Mouton Dfid Presentaion  14 September2009
Mouton Dfid Presentaion 14 September2009
 
A Study of Neural Network Learning-Based Recommender System
A Study of Neural Network Learning-Based Recommender SystemA Study of Neural Network Learning-Based Recommender System
A Study of Neural Network Learning-Based Recommender System
 
FAIR and metadata standards - FAIRsharing and Neuroscience
FAIR and metadata standards - FAIRsharing and NeuroscienceFAIR and metadata standards - FAIRsharing and Neuroscience
FAIR and metadata standards - FAIRsharing and Neuroscience
 
"Standards landscape" NIF Big Data 2 Knowledge (BD2K) Initiative, Sep, 2013
"Standards landscape" NIF Big Data 2 Knowledge (BD2K) Initiative, Sep, 2013"Standards landscape" NIF Big Data 2 Knowledge (BD2K) Initiative, Sep, 2013
"Standards landscape" NIF Big Data 2 Knowledge (BD2K) Initiative, Sep, 2013
 
DM UNIT_5 ppt for btech final year students
DM UNIT_5 ppt for btech final year studentsDM UNIT_5 ppt for btech final year students
DM UNIT_5 ppt for btech final year students
 
Sabina Leonelli
Sabina LeonelliSabina Leonelli
Sabina Leonelli
 
Multivariate data analysis
Multivariate data analysisMultivariate data analysis
Multivariate data analysis
 
Automating Data Science over a Human Genomics Knowledge Base
Automating Data Science over a Human Genomics Knowledge BaseAutomating Data Science over a Human Genomics Knowledge Base
Automating Data Science over a Human Genomics Knowledge Base
 
Correlation of artificial neural network classification and nfrs attribute fi...
Correlation of artificial neural network classification and nfrs attribute fi...Correlation of artificial neural network classification and nfrs attribute fi...
Correlation of artificial neural network classification and nfrs attribute fi...
 
USER STUDY FOR EXPLORATION OF USERS NEEDS
USER STUDY FOR EXPLORATION OF USERS NEEDS USER STUDY FOR EXPLORATION OF USERS NEEDS
USER STUDY FOR EXPLORATION OF USERS NEEDS
 
009 cho
009   cho009   cho
009 cho
 
Social network analysis
Social network analysisSocial network analysis
Social network analysis
 
Data Management and Broader Impacts: a holistic approach
Data Management and Broader Impacts: a holistic approachData Management and Broader Impacts: a holistic approach
Data Management and Broader Impacts: a holistic approach
 
Real Life Machine Learning Case on Mobile Advertisement
Real Life Machine Learning Case on Mobile AdvertisementReal Life Machine Learning Case on Mobile Advertisement
Real Life Machine Learning Case on Mobile Advertisement
 
Service rating prediction by exploring social mobile users’ geographical loca...
Service rating prediction by exploring social mobile users’ geographical loca...Service rating prediction by exploring social mobile users’ geographical loca...
Service rating prediction by exploring social mobile users’ geographical loca...
 
METHODS1Sampling and MethodologyStuden
METHODS1Sampling and MethodologyStudenMETHODS1Sampling and MethodologyStuden
METHODS1Sampling and MethodologyStuden
 
What's Trending In Higher Ed Tracking
What's Trending In Higher Ed TrackingWhat's Trending In Higher Ed Tracking
What's Trending In Higher Ed Tracking
 
TBerger_FinalReport
TBerger_FinalReportTBerger_FinalReport
TBerger_FinalReport
 
Supervised Multi Attribute Gene Manipulation For Cancer
Supervised Multi Attribute Gene Manipulation For CancerSupervised Multi Attribute Gene Manipulation For Cancer
Supervised Multi Attribute Gene Manipulation For Cancer
 

Presentation at Socialcom2014: Gauging Heterogeneity in Online Consumer Behaviour Data: A Proximity Graph Approach

  • 1. Gauging Heterogeneity in Online Consumer Behaviour Data: A Proximity Graph Approach Natalie de Vries, Ahmed Shamsul Arefin, Pablo Moscato The Priority Research Centre for Bioinformatics, Biomarker Discovery and Information-Based Medicine (CIBM) School of Electrical Engineering and Computer Science Faculty of Engineering and Built Environment The University of Newcastle, Australia
  • 2. Agenda • Introduction and objectives • Dataset characteristics • Outline of the study • Methodology • Results • Significance of the work and future research directions • Questions
  • 3. Introduction • Increase in online behaviours towards brands • Increasing importance of social media in marketing strategies • High levels of heterogeneity amongst consumers • Need for clustering consumers or objects into similar groups
  • 4. Middle- aged females Middle- aged males Retirees Teenagers Housewives Introduction: Importance of Clustering in Marketing “Brand lovers” “Brand haters” “Excited sharers” “Online lurkers” “Quiet supporters” • Gaining insights into consumer behaviour • Market segmentation • Targeted marketing strategies • Personalised marketing messages • Online technologies available to personalise brand messages at a very small or individual level “Old-fashioned Way”Modern “data-driven way”
  • 5. Objectives of this Study • Create an understanding of the natural groupings in a consumer cohort based on their online consumer behaviours towards a particular brand • Find a suitable distance measure for analysing a specific dataset in a specific context • Explore the use of meta-features for finding a more accurate partitioning of respondents • Uncover the best way to cluster consumers; e.g. using raw data or using a form of meta-features and using either; intra- or inter-construct relationships
  • 6. Methodology: Dataset collection and preparation
      Survey constructs:
        Construct | Source | Code | Number of Items
        Usage Intensity | (Jahn and Kunz 2012) | UI | 3
        Functional Value | | FUV | 4
        Hedonic Value | | HED | 4
        Social Interaction Value | | SOC | 4
        Customer Engagement | | CE | 5
        Customer Loyalty | | LO | 6
        Brand Involvement | (Carlson and O'Cass 2012) | INV | 6
        Co-Creation Value | (O'Cass and Ngo 2011) | CCV | 6
        SNS-Specific Loyalty Behaviours | (O'Cass and Carlson 2012) | ON | 3
        Self-Brand-Congruency | (Hohenstein, Sirgy et al. 2007) | SBC | 5
      Respondents’ chosen brands’ categories:
        Category No. | Explanation | Percentage of sample
        1 | Fashion Brands | 31.54%
        2 | Community, Charities, Personality and Sports Fan Pages | 23.99%
        3 | Other Services | 19.68%
        4 | Other Consumer Goods | 8.09%
        5 | Hospitality (Restaurants, Cafes, Bars) | 7.28%
        6 | Consumer Electronics | 7.01%
        7 | Automotive | 2.43%
  • 8. Methodology: Difference Meta-features The difference between the values of two measured features may be able to distinguish between two given categories, even when neither feature can do so alone (De Paula et al. 2011). Difference meta-features have previously been applied successfully in Alzheimer’s Disease biomarker detection (De Paula et al. 2011; Arefin et al. 2012), both in PLoS ONE. Workflow diagram labels: Data collection and pre-processing; Meta-features: Pair-wise differences; Meta-features: Pair-wise products; Intra- and inter-construct relationships; Distance Computation; Data preparation. [Figure: values of f1, f2 and the meta-feature (Meta-f) for Class A and Class B]
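The idea on this slide can be illustrated with a minimal sketch. The helper name `difference_meta_features` and the toy values are hypothetical and are not taken from the study's data; the sketch only shows how a pair-wise difference can separate two classes whose raw features overlap.

```python
from itertools import combinations

def difference_meta_features(row):
    """Return {(i, j): row[i] - row[j]} for every pair of feature indices i < j."""
    return {(i, j): row[i] - row[j] for i, j in combinations(range(len(row)), 2)}

# Hypothetical toy data, one (f1, f2) tuple per respondent.
# f1 is identical in both classes and the f2 ranges overlap (0..6 vs 3..9),
# so neither raw feature separates the classes on its own.
class_a = [(5, 3), (8, 6), (2, 0)]
class_b = [(5, 6), (8, 9), (2, 3)]

diff_a = [difference_meta_features(r)[(0, 1)] for r in class_a]
diff_b = [difference_meta_features(r)[(0, 1)] for r in class_b]

print(diff_a)  # f1 - f2 for Class A
print(diff_b)  # f1 - f2 for Class B
```

In this toy case every Class A respondent has a difference of 2 and every Class B respondent a difference of -1, so a single threshold on the meta-feature separates the classes.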
  • 9. Methodology: Product Meta-features The product of the values of two measured features may be able to distinguish between two given categories, even when neither feature can do so alone. This study is the first to trial the application of this idea. On the left, the values of f1 (blue) and f2 (red) do not distinguish the classes well, but their product (meta-feature, in green) does. Workflow diagram labels: Data collection and pre-processing; Meta-features: Pair-wise differences; Meta-features: Pair-wise products; Intra- and inter-construct relationships; Distance Computation; Data preparation. [Figure: values of f1, f2 and the meta-feature (Meta-f) for Class A and Class B]
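A comparable sketch for pair-wise products follows. As before, `product_meta_features` and the toy values are hypothetical illustrations, not the study's implementation or data.

```python
from itertools import combinations

def product_meta_features(row):
    """Return {(i, j): row[i] * row[j]} for every pair of feature indices i < j."""
    return {(i, j): row[i] * row[j] for i, j in combinations(range(len(row)), 2)}

# Hypothetical toy data, one (f1, f2) tuple per respondent.
# Both f1 and f2 take overlapping values in the two classes.
class_a = [(1, 4), (4, 1), (2, 2)]
class_b = [(2, 4), (4, 2), (3, 3)]

prod_a = [product_meta_features(r)[(0, 1)] for r in class_a]
prod_b = [product_meta_features(r)[(0, 1)] for r in class_b]

print(prod_a)  # f1 * f2 for Class A
print(prod_b)  # f1 * f2 for Class B
```

Here the products for Class A are all 4 while those for Class B are 8 or 9, so the product meta-feature separates the classes even though the raw features do not.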
  • 10. Methodology: Distance Computation and Dataset Variations • Distance matrices computed for all 7 datasets • Various distance/correlation metrics used on each of the dataset variations • Distance metrics: Pearson, Spearman, Robust, Euclidean, Cosine • Datasets: Original, Difference meta-features, Product meta-features • Interactions: intra-construct item relationships, inter-construct item relationships • Values of k for kNN cliques: k = 3, 4, 5, 6 • Distance metrics × datasets × interactions × values of k = 7 datasets and 140 graphs (7 × 5 × 4 = 140)
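The count of 140 graphs can be checked with a short enumeration. The breakdown of the 7 datasets used here (the original data plus "all", intra-construct and inter-construct variants of each meta-feature type) is an assumption inferred from the labels on the slides, not something the slide states explicitly.

```python
from itertools import product

metrics = ["Pearson", "Spearman", "Robust", "Euclidean", "Cosine"]
k_values = [3, 4, 5, 6]

# Assumed breakdown of the 7 datasets: 1 original + 2 meta-feature types x 3 row selections.
datasets = [("Original", "all")] + [
    (kind, selection)
    for kind in ("Difference meta-features", "Product meta-features")
    for selection in ("all", "intra-construct", "inter-construct")
]

# One graph per (dataset, distance metric, k) combination.
configurations = list(product(datasets, metrics, k_values))

print(len(datasets))        # number of dataset variations
print(len(configurations))  # number of graphs
```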
  • 11. Methodology: MST-kNN and kNN Cliques Steps: Complete graph → Minimum Spanning Tree → Select and remove edges that are not k-Nearest Neighbors → Final forest (a forest is a set of trees) = clusters. Previous applications of the MST-kNN method • U.S. Stock market time series data (Inostroza-Ponta, Berretta, & Moscato, 2011) • Yeast gene expression data (Inostroza-Ponta, Mendes, Berretta, & Moscato, 2007) • Alzheimer’s disease data, in the order of 1 million data elements (Arefin, Mathieson, Johnstone, Berretta, & Moscato, 2012) • Prostate cancer data (Capp et al., 2009) These examples show that the methodology proposed here has proven scalability for larger datasets
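The steps on this slide can be sketched in plain Python as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: it uses Euclidean distance, keeps an MST edge when either endpoint has the other among its k nearest neighbours, takes k as a fixed input, and does not include the kNN-clique merging step. The published MST-kNN variants may differ in how k is chosen and in whether the step is applied recursively. All function names are hypothetical.

```python
import math

def mst_edges(points):
    """Prim's algorithm on the complete Euclidean graph; returns MST edges as (i, j)."""
    # best[j] = (distance from j to the growing tree, tree vertex at that distance)
    best = {j: (math.dist(points[0], points[j]), 0) for j in range(1, len(points))}
    edges = []
    while best:
        j = min(best, key=lambda v: best[v][0])
        _, i = best.pop(j)
        edges.append((i, j))
        for v in best:
            d = math.dist(points[j], points[v])
            if d < best[v][0]:
                best[v] = (d, j)
    return edges

def knn_sets(points, k):
    """For each point, the set of indices of its k nearest neighbours."""
    n = len(points)
    return [
        set(sorted((j for j in range(n) if j != i),
                   key=lambda j: math.dist(points[i], points[j]))[:k])
        for i in range(n)
    ]

def mst_knn_clusters(points, k):
    """Drop MST edges that are not kNN edges; the remaining trees are the clusters."""
    neighbours = knn_sets(points, k)
    kept = [(i, j) for i, j in mst_edges(points)
            if j in neighbours[i] or i in neighbours[j]]
    parent = list(range(len(points)))  # union-find over the kept edges

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, j in kept:
        parent[find(i)] = find(j)
    groups = {}
    for v in range(len(points)):
        groups.setdefault(find(v), []).append(v)
    return sorted(groups.values())

# Hypothetical toy data: two well-separated groups on a line.
points = [(0.0,), (1.0,), (2.0,), (10.0,), (11.0,), (12.0,)]
clusters = mst_knn_clusters(points, k=2)
print(clusters)
```

In this toy example the only MST edge joining the two groups links points that are not among each other's two nearest neighbours, so it is removed and the forest splits into two clusters.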
  • 12. MST-kNN + kNN Cliques Results
  • 14. Results: Clustering Highlights Heterogeneous cluster? More homogeneous cluster? And what about the statistical difference of the clustering result that these highlights came from?
  • 15. Results: Clustering and Significance Values
      Columns: Distance Metric | MST-kNN merged with the kNN cliques of size | p-value (Wilcoxon’s Test) | p-value (Kruskal-Wallis)
      Data: Original; Rows selected: All
        Robust | 5NN | 0.021187 | 0.042364
        Spearman | 6NN | 0.025987 | 0.051962
        Robust | 6NN | 0.028565 | 0.057117
        Pearson | 3NN | 0.030232 | 0.060451
        Spearman | 3NN | 0.040661 | 0.081306
        Euclidean | 6NN | 0.041232 | 0.082448
      Data: Difference Metafeatures; Rows selected: ‘Intra’ constructs
        Robust | 3NN | 0.016551 | 0.033095
        Robust | 6NN | 0.017177 | 0.03434
        Pearson | 3NN | 0.018628 | 0.0372481
        Pearson | 6NN | 0.019066 | 0.038124
        Pearson | 5NN | 0.019656 | 0.039303
      Data: Difference Metafeatures; Rows selected: All
        Pearson | 3NN | 0.020594 | 0.041180
      Data: Product Metafeatures; Rows selected: ‘Inter’ Constructs
        Spearman | 3NN | 0.016949 | 0.033891
        Pearson | 4NN | 0.01757 | 0.035132
      Data: Product Metafeatures; Rows selected: All
        Pearson | 4NN | 0.017721 | 0.035433
      Data: Product Metafeatures; Rows selected: ‘Inter’ Constructs
        Pearson | 6NN | 0.01781 | 0.035611
        Pearson | 3NN | 0.017816 | 0.035624
      Data: Product Metafeatures; Rows selected: ‘Inter’ Constructs
        Robust | 4NN | 0.017998 | 0.035988
  • 16. Results: Analysis of clusters
      Clusters’ demographic information:
        Cluster | No. of respondents | Avg. Age | Age range | % Males / Females
        1 | 103 | 20.5 | 17-32 | 39.8 / 60.2
        2 | 92 | 21.3 | 18-36 | 39.1 / 60.9
        3 | 31 | 23.4 | 19-49 | 51.6 / 48.4
        4 | 71 | 21.0 | 18-44 | 40.8 / 59.2
        5 | 4 | 22.3 | 20-24 | 75 / 25
        6 | 18 | 21.1 | 18-26 | 33.3 / 66.7
        7 | 10 | 22.5 | 18-29 | 20 / 80
        8 | 5 | 21 | 20-24 | 80 / 20
        9 | 20 | 23 | 19-44 | 45 / 55
        10 | 12 | 22 | 18-45 | 41.7 / 58.3
        11 | 5 | 26.4 | 20-46 | 0 / 100
      [Figure: Heterogeneous spread of respondents’ chosen brand categories] This figure presents the frequencies of the respondents’ chosen brand categories for two of the largest clusters. The difference in degrees of heterogeneity between different clusters can be seen in these figures. Furthermore, these two clusters highlight the differences in brand preferences amongst respondents that do exist within each cluster of similar consumers.
  • 17. Contribution and Significance • Methodological guide for the investigation of several distance measures, meta-features and relationships of theoretical construct items to find ‘best’ clustering results • Expanded on the MST-kNN clustering method for increased potential to find statistically significant clusters of categories of consumers and their chosen brands • The clustering methodology used in this study highlights the high levels of heterogeneity found in consumers’ online behaviours towards brands
  • 18. Future Research Directions • Various domains and contexts to apply the novel process outlined in this study • Combine a study using survey data as well as ‘live’ behaviour data from social networking sites (real-time interactions) • Further exploration of meta-features in both survey data and ‘real’ online behaviour clustering studies; ‘differences’ meta-features in this study yielded better results • This study guides the development of future feature selection models to identify groups of consumers according to higher-order characteristics.
  • 19. Thank you Questions? We would like to thank Dr. Jamie Carlson and Mr. Benjamin Lucas for their advice and proofreading. Dr. Jamie Carlson supervised Ms. de Vries’ thesis project and the initial collection and analysis of this data. Thanks to Mario Inostroza-Ponta for the use of his MST-kNN images.
  • 20. References (from paper)
      • [1] I. P. Cvijikj and F. Michahelles, "Online engagement factors on Facebook brand pages," Social Network Analysis and Mining, vol. 3, pp. 843-861, 2013.
      • [2] B. Jahn and W. Kunz, "How to transform consumers into fans of your brand," Journal of Service Management, vol. 23, pp. 344-361, 2012.
      • [3] T. S. Chung and M. Wedel, "Adaptive personalization of mobile information services," in Handbook of Service Marketing Research, R. T. Rust and M.-H. Huang, Eds. Cheltenham: Edward Elgar Publishing Limited, 2014.
      • [4] N. J. de Vries, J. Carlson, and P. Moscato, "A Data-Driven Approach to Reverse Engineering Customer Engagement Models: Towards Functional Constructs," PLoS ONE, vol. 9, p. e102768, 2014.
      • [5] B. Jahn and W. Kunz, "How to Transform Consumers into Fans of your Brand," Journal of Service Management, vol. 23, pp. 344-361, 2012.
      • [6] J. Carlson and A. O'Cass, "Optimizing the Online Channel in Professional Sport to Create Trusting and Loyal Consumers: The Role of the Professional Sports Team Brand and Service Quality," Journal of Sport Management, vol. 26, p. 463, 2012.
      • [7] N. Hohenstein, M. J. Sirgy, A. Herrmann, and M. Heitmann, "Self-Congruity: Antecedents and Consequences," in 34th La Londe International Research Conference in Marketing Communications and Consumer Behaviour, Aix en Provence: France University Paul Cezanne, 2007, pp. 118-130.
      • [8] A. O'Cass and L. Ngo, "Examining the Firm’s Value Creation Process: A Managerial Perspective of the Firm’s Value Offering Strategy and Performance," British Journal of Management, vol. 22, pp. 646-671, 2011.
      • [9] A. O'Cass and J. Carlson, "An Empirical Assessment of Consumers' Evaluations of Web Site Service Quality: Conceptualizing and Testing a Formative Model," Journal of Services Marketing, vol. 26, pp. 419-434, 2012.
      • [10] L. D. Peters, "Theory Testing in Social Research," The Marketing Review, vol. 3, pp. 65-82, 2002.
      • [11] M. R. de Paula, M. G. Ravetti, R. Berretta, and P. Moscato, "Differences in Abundances of Cell-Signalling Proteins in Blood Reveal Novel Biomarkers for Early Detection Of Clinical Alzheimer’s Disease," PLoS ONE, vol. 6, pp. 1-14, 2011.
      • [12] A. S. Arefin, L. Mathieson, D. Johnstone, R. Berretta, and P. Moscato, "Unveiling Clusters of RNA Transcript Pairs Associated with Markers of Alzheimer's Disease Progression," PLoS ONE, vol. 7, Sep 21 2012.
      • [13] M. Inostroza-Ponta, R. Berretta, A. Mendes, and P. Moscato, "An automatic graph layout procedure to visualize correlated data," in Artificial Intelligence in Theory and Practice, Springer, 2006, pp. 179-188.
      • [14] A. S. Arefin, L. Mathieson, D. Johnstone, R. Berretta, and P. Moscato, "Unveiling clusters of RNA transcript pairs associated with markers of Alzheimer’s disease progression," PLoS ONE, vol. 7, p. e45535, 2012.
      • [15] A. Capp, M. Inostroza-Ponta, D. Bill, P. Moscato, C. Lai, D. Christie, et al., "Is there more than one proctitis syndrome? A revisitation using data from the TROG 96.01 trial," Radiotherapy and Oncology, vol. 90, pp. 400-407, 2009.
      • [16] M. Inostroza-Ponta, A. Mendes, R. Berretta, and P. Moscato, "An integrated QAP-based approach to visualize patterns of gene expression similarity," in Progress in Artificial Life, Springer, 2007, pp. 156-167.
      • [17] M. Inostroza-Ponta, R. Berretta, and P. Moscato, "QAPgrid: A two level QAP-based approach for large-scale data analysis and visualization," PLoS ONE, vol. 6, p. e14468, 2011.
      • [18] A. S. Arefin, M. Inostroza-Ponta, L. Mathieson, R. Berretta, and P. Moscato, "Clustering nodes in large-scale biological networks using external memory algorithms," in Algorithms and Architectures for Parallel Processing, Springer, 2011, pp. 375-386.
      • [19] A. S. Arefin, C. Riveros, R. Berretta, and P. Moscato, "GPU-FS-kNN: A software tool for fast and scalable kNN computation using GPUs," PLoS ONE, vol. 7, p. e44000, 2012.
  • 21. References cont.
      • [20] A. S. Arefin, C. Riveros, R. Berretta, and P. Moscato, "kNN-Borůvka-GPU: A Fast and Scalable MST Construction from kNN Graphs on GPU," in Computational Science and Its Applications–ICCSA 2012, Springer, 2012, pp. 71-86.
      • [21] A. S. Arefin, C. Riveros, R. Berretta, and P. Moscato, "kNN-MST-Agglomerative: A fast and scalable graph-based data clustering approach on GPU," in Computer Science & Education (ICCSE), 2012 7th International Conference on, 2012, pp. 585-590.
      • [22] E. J. Chesler and M. A. Langston, Combinatorial genetic regulatory network analysis tools for high throughput transcriptomic data: Springer, 2006.
      • [23] M. Hollander, D. A. Wolfe, and E. Chicken, Nonparametric statistical methods vol. 751: John Wiley & Sons, 2013.
      • [24] C. E. Shannon, "The mathematical theory of communication," The Bell System Technical Journal, vol. 27, pp. 379-423 & 623-656, 1948.
      • [25] A. W. Kruglanski, "The Human Subject in the Psychology Experiment: Fact and Artifact," in Advances in Experimental Social Psychology vol. 8, L. Berkowitz, Ed. New York: Academic Press, 1975, pp. 101-147.
      • [26] H. Krasnova, S. Spiekermann, K. Koroleva, and T. Hildebrand, "Online Social Networks: Why we Disclose," Journal of Information Technology, vol. 25, pp. 109-125, 2010.
      • [27] S. C. Chu and Y. Kim, "Determinants of Consumer Engagement in electronic Word of Mouth (eWoM) in Social Networking Sites," International Journal of Advertising, vol. 30, pp. 47-75, 2011.
      • [28] J. M. Pinho and A. M. Soares, "Examining the Technology Acceptance Model in the Adoption of Social Networks," Journal of Research in Interactive Marketing, vol. 5, pp. 116-129, 2011.
      Additional references from presentation:
      • Arefin AS, Mathieson L, Johnstone D, Berretta R, Moscato P (2012) Unveiling Clusters of RNA Transcript Pairs Associated with Markers of Alzheimer’s Disease Progression. PLoS ONE 7(9): e45535. doi: 10.1371/journal.pone.0045535
      • Rocha de Paula M, Gómez Ravetti M, Berretta R, Moscato P (2011) Differences in Abundances of Cell-Signalling Proteins in Blood Reveal Novel Biomarkers for Early Detection Of Clinical Alzheimer's Disease. PLoS ONE 6(3): e17481. doi: 10.1371/journal.pone.0017481