Moving Towards Real-Time Geodemographic Segmentation - Presentation Transcript
MOVING TOWARDS REAL-TIME
GEODEMOGRAPHIC
SEGMENTATION
Dr Alex D Singleton
Department of Geography and Centre for Advanced Spatial Analysis,
University College London
www.alex-singleton.com
London Terraces Council Flat
Blue Collar Central Districts
SEGMENTATIONS ARE CREATED BY
CLUSTER ANALYSIS
Area V1 V2 V3 V4 V5 V6 V7 V8 V9 V10 ...
Area1
Area2
Area3
Area4
Area5
Area6
Area7
Area8
...
NEED FOR REAL TIME GEODEMOGRAPHICS
• Current classifications are created using static data sources.
• Rate and scale of current population change are making large surveys
(census) increasingly redundant.
• Significant hidden value in transactional data
• Data is increasingly available in near real time
• Application specific (bespoke) classifications have demonstrated
utility.
REALTIME FEEDS OF DATA
• Involve integration of large and possibly disparate databases
• Common protocol
• XML: E.g. UK Neighbourhood Statistics API
• Formal
• E.g. Doctor registrations; HE Data; Census data
• Informal
• Is there any value in other non-traditional data sources?
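As a rough illustration of the "common protocol" point, the sketch below turns an XML feed into an area-by-variable table ready for clustering. The payload layout (`areas`, `area`, `value`, `code`, `name`) and the function `parse_feed` are made up for this example; they are not the schema of the UK Neighbourhood Statistics API or of any other real service.

```python
import xml.etree.ElementTree as ET

# Hypothetical payload: element and attribute names are assumptions,
# not the format of any real statistics API.
SAMPLE = """<areas>
  <area code="A1"><value name="v1">12.5</value><value name="v2">3</value></area>
  <area code="A2"><value name="v1">7.0</value><value name="v2">9</value></area>
</areas>"""

def parse_feed(xml_text):
    """Return {area code: {variable name: float value}} from the XML text."""
    root = ET.fromstring(xml_text)
    return {
        area.get("code"): {
            v.get("name"): float(v.text) for v in area.findall("value")
        }
        for area in root.findall("area")
    }

table = parse_feed(SAMPLE)
```

Feeds from different providers would each need their own small parser of this kind, all producing the same table shape.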
http://www.loopt.com/phones/iphone
SOCIAL GPS
http://smalltalkapp.com
http://senseable.mit.edu/
STORE CARDS
What information can be
extracted?
ONLINE SPECIFICATION INPUTS
• Usability
• Expert vs Non-Expert Users
• Non-Expert Users
• Pre-selection of variables and weightings for a specific application
• Expert Users
• Selection of any variable and any weighting
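One way the two input modes could feed the same clustering step is sketched below: a non-expert picks a preset, an expert supplies any variable-to-weight mapping. The function `select_and_weight`, the variable names and the weights are all illustrative assumptions, and the values are assumed to be standardised already.

```python
def select_and_weight(table, weights):
    """Keep only the variables named in `weights` and scale each by its weight.

    table:   {area: {variable: standardised value}}
    weights: {variable: weight}; its key order fixes the output column order.
    """
    names = list(weights)
    return {
        area: [row[n] * weights[n] for n in names]
        for area, row in table.items()
    }

# Hypothetical standardised inputs for two areas.
table = {
    "Area1": {"V1": 0.5, "V2": -1.0, "V3": 2.0},
    "Area2": {"V1": -0.5, "V2": 1.0, "V3": 0.0},
}

# A non-expert preset: V2 is dropped and V3 counts double.
preset = {"V1": 1.0, "V3": 2.0}
weighted = select_and_weight(table, preset)
```

Note that multiplying a variable by w scales its contribution to squared Euclidean distance by w squared, so the weights here act on the values rather than directly on the distance.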
CLUSTERING
K=4
• K-Means algorithm:
Unstable - initial start
conditions affect the
results
• Fit measured by within-cluster
sum of squares or R²
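A common response to this instability is to run k-means from many random starts and keep the run with the lowest within-cluster sum of squares. The sketch below is a minimal pure-Python version of that idea, not the implementation used for the slides; the function names, the number of starts and the toy points are assumptions.

```python
import random

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def centroid(points):
    return [sum(col) / len(points) for col in zip(*points)]

def kmeans(points, k, seed, max_iter=100):
    """One k-means run; the seed controls the random initial centres."""
    rng = random.Random(seed)
    centres = rng.sample(points, k)
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(k), key=lambda j: sq_dist(p, centres[j])) for p in points
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:  # an empty cluster keeps its previous centre
                centres[j] = centroid(members)
    wss = sum(sq_dist(p, centres[l]) for p, l in zip(points, labels))
    return labels, wss

def best_of(points, k, n_starts=20):
    """Repeat k-means from different starts; keep the lowest-WSS run."""
    grand = centroid(points)
    tss = sum(sq_dist(p, grand) for p in points)  # total sum of squares
    runs = [kmeans(points, k, seed) for seed in range(n_starts)]
    labels, wss = min(runs, key=lambda r: r[1])
    return labels, wss, 1 - wss / tss, runs

# Toy data: four loose groups of two points each (illustrative only).
points = [[0, 0], [0, 1], [10, 0], [10, 1], [0, 10], [0, 11], [10, 10], [10, 11]]
labels, wss, r2, runs = best_of(points, k=4)
```

Here R² is taken as 1 minus the within-cluster sum of squares divided by the total sum of squares, so values nearer 1 indicate tighter clusters.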
CLUSTERING
• Alternative algorithms?
• PAM (Partitioning Around Medoids) tries to minimize the sum of
distances of the objects to their cluster centers (medoids).
• CLARA draws multiple samples of the dataset, applies PAM to each
sample and returns the best result.
• GA (Genetic Algorithm) is inspired by models of biological evolution.
Produces results through a breeding procedure.
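To make the PAM idea concrete, the sketch below is a simplified swap-based version: start from an arbitrary set of medoids and keep swapping a medoid for a non-medoid while the total distance falls. It omits PAM's usual build phase and is not tuned for large datasets (which is where CLARA's sampling comes in); the function names and toy points are assumptions.

```python
from math import dist  # Euclidean distance, Python 3.8+

def total_cost(points, medoids):
    """Sum of each point's distance to its nearest medoid."""
    return sum(min(dist(p, points[m]) for m in medoids) for p in points)

def pam(points, k):
    """Simplified PAM: greedy medoid swaps from a naive starting set."""
    medoids = list(range(k))  # naive start: the first k points
    cost = total_cost(points, medoids)
    improved = True
    while improved:
        improved = False
        for i in range(k):
            for cand in range(len(points)):
                if cand in medoids:
                    continue
                trial = medoids[:i] + [cand] + medoids[i + 1:]
                trial_cost = total_cost(points, trial)
                if trial_cost < cost:  # accept only strict improvements
                    medoids, cost, improved = trial, trial_cost, True
    labels = [
        min(range(k), key=lambda j: dist(p, points[medoids[j]])) for p in points
    ]
    return medoids, labels, cost

# Toy data: two L-shaped groups of three points (illustrative only).
points = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
medoids, labels, cost = pam(points, k=2)
```

Unlike k-means centres, the medoids are always actual observations, which in this setting means real areas that can be inspected as cluster exemplars.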
CLUSTERING
or... refine k-means ~99% Similar
K=4
K-means result for 41 “OAC variables”
K-means result for 26 OAC Principal Components
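The slides do not say how the "~99% similar" figure was computed. One possible measure, shown here purely as an assumption, is the Rand index: the share of area pairs that both classifications treat the same way (together in both, or apart in both). It ignores how the clusters happen to be numbered.

```python
from itertools import combinations

def rand_index(a, b):
    """Fraction of item pairs on which two partitions agree."""
    pairs = list(combinations(range(len(a)), 2))
    agree = sum((a[i] == a[j]) == (b[i] == b[j]) for i, j in pairs)
    return agree / len(pairs)

# Illustrative labels only: same grouping, different cluster numbering.
from_variables = [0, 0, 1, 1]
from_components = [1, 1, 0, 0]
similarity = rand_index(from_variables, from_components)
```

The pairwise loop is quadratic in the number of areas, so a large classification would need a contingency-table formulation instead.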
VISUALISATION
STATE OF THE ART
• Slowly moving beyond:
• Idea of expert producers
• General purpose classifications
• There is only one correct representation
• Creating classifications which are
• Responsive to changes in local populations
• Fit for purpose (bespoke classifications)
• Open to scrutiny and verifiable by the public