Visualising large spatial databases and Building bespoke geodemographics

This presentation outlines my work at Local Futures and my PhD research. I have been working on a joint project between Local Futures and UCL, and the presentation starts with an introduction to that project. My PhD investigated the creation of real-time bespoke geodemographics, and this presentation covers the work I did during the PhD.

Published in: Design, Technology

  1. 1. Visualising large spatial databases and Building bespoke geodemographics<br />Muhammad Adnan<br />University College London<br />
  2. 2. About Me<br /><ul><li>2007 – 2009
  3. 3. Worldnames (http://worldnames.publicprofiler.org)
  4. 4. Onomap (http://www.onomap.org)
  5. 5. Nov. 2009 – Oct. 2011 (A KTP between UCL and Local Futures Group)
  6. 6. LFG is a research and strategy consultancy
  7. 7. Aim of the KTP was to devise a better visualisation of the data </li></li></ul><li>Data<br /><ul><li>A database of 1600 indicators from around 130 data sources
  8. 8. Data sources cover social, economic, and environmental change in the UK
  9. 9. The data is held at 8 spatial levels
  10. 10. Region, Sub region, District 2009, NUTS 3, District (pre 2009), Ward, LSOA, OA</li></li></ul><li>Visualisation of the data<br /><ul><li>A ‘total place maps’ solution using different technologies (Video)</li></ul>Base Layer Data<br />On the fly rendering of tiles<br />Programming in C# and ASP.NET<br />Data retrieval from database<br />
  11. 11. Building Bespoke Geodemographics<br />
  12. 12. Geodemographics<br /><ul><li>“Analysis of people by where they live” or “locality marketing” </li></ul>(Sleight, 1993:3)<br />Home<br />Address<br />Person<br />Area<br />
  13. 13. How a classification is created ?<br />Data – Census + Other<br />ONS Output Area Classification<br /><ul><li> Census data: 100%</li></ul>Experian: Mosaic<br /><ul><li> Census data: 54%
  14. 14. Non-Census data: 46%</li></ul>CACI: Acorn<br /><ul><li> Census data: 30%
  15. 15. Non-Census data: 70%</li></li></ul><li>How a classification is created ?<br />Segmentations are created by cluster analysis<br />Inputs…<br />
  16. 16. How a classification is created ?<br />Cluster Analysis<br />Variable 2<br />Cluster 1<br />Cluster 2<br />Variable 1<br />Cluster 3<br />K-means is used for clustering<br />
  17. 17. How a classification is created ?<br />Output of Cluster Analysis<br />
  18. 18. Research Issues<br /><ul><li>Optimisation of clustering algorithms
  19. 19. K-means
  20. 20. PAM (Partitioning Around Medoids)
  21. 21. Open Tools ?
  22. 22. OACoder
  23. 23. GeodemCreator
  24. 24. Bespoke local area classifications
  25. 25. UK’s open data initiative
  26. 26. ONS Neighbourhood Statistics API
  27. 27. UK’s police API
  28. 28. Barclays cycle hire API</li></li></ul><li>Optimisation of Clustering Algorithms (K-Means)<br />
  29. 29. K-means optimisation<br />
  30. 30. K-means(100 runs of k-means on OAC data set for k=4)<br />
  31. 31. K-means(100 runs of k-means on OAC data set for k=4)<br />Run k-means multiple times (10,000 times) (Singleton & Longley, 2009)<br />
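The idea of running k-means many times and keeping the best run can be sketched as follows. This is a minimal illustration, not the code used in the presentation: it assumes plain Lloyd-style k-means with random initial centroids and picks the run with the lowest within-cluster sum of squares. The function names (`kmeans_once`, `best_of_n_runs`) and the toy data are hypothetical.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_once(points, k, rng, max_iter=100):
    """One k-means run from a random start; returns (wcss, labels, centroids)."""
    centroids = rng.sample(points, k)  # k distinct observations as seeds
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points
        ]
        if new_labels == labels:  # no point changed cluster: converged
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    wcss = sum(sq_dist(p, centroids[lab]) for p, lab in zip(points, labels))
    return wcss, labels, centroids


def best_of_n_runs(points, k, n_runs=100, seed=0):
    """Repeat k-means n_runs times and keep the run with the smallest WCSS."""
    rng = random.Random(seed)
    return min((kmeans_once(points, k, rng) for _ in range(n_runs)), key=lambda r: r[0])


# Toy data (hypothetical): two well-separated groups of three points.
toy = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
best_wcss, best_labels, best_centroids = best_of_n_runs(toy, k=2, n_runs=20, seed=1)
```

Because each run starts from different seeds, the runs are independent of one another, which is what makes the repeated-run strategy expensive on large data sets and a natural target for optimisation.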
  32. 32. CUDA & GPUs (Graphics Processing Units)<br /><ul><li>Nvidia graphics cards have GPUs (Graphics Processing Units)
  33. 33. Can be used for parallel processing
  34. 34. Nvidia GeForce GT 420M (96 CUDA cores)
  35. 35. Latest Tesla graphics cards have 1000 GPU cores
  36. 36. CUDA (Compute Unified Device Architecture)
  37. 37. Parallel computing architecture
  38. 38. C and C++ can be used for programming
  39. 39. A parallel implementation of k-means (Adnan & Longley, 2011)</li></li></ul><li>K-means vs Parallel K-means<br />Could be useful for building geodemographics quickly in online environments<br />
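The reason k-means suits parallel hardware is that the assignment step treats every observation independently. The sketch below shows that decomposition on the CPU, assuming the points are simply split into chunks and each chunk is labelled by a separate worker. It is a Python stand-in for illustration only, not the CUDA implementation referred to above, and Python threads will not give a real speed-up here; the function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def nearest_centroid(point, centroids):
    """Index of the centroid closest to point (squared Euclidean distance)."""
    return min(
        range(len(centroids)),
        key=lambda c: sum((x - y) ** 2 for x, y in zip(point, centroids[c])),
    )


def assign_chunk(chunk, centroids):
    """Label every point in one chunk; needs no data from other chunks."""
    return [nearest_centroid(p, centroids) for p in chunk]


def parallel_assign(points, centroids, n_workers=4):
    """Split points into chunks and label the chunks concurrently."""
    size = max(1, -(-len(points) // n_workers))  # ceiling division
    chunks = [points[i:i + size] for i in range(0, len(points), size)]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # map() preserves chunk order, so labels line up with the input points.
        parts = list(pool.map(assign_chunk, chunks, [centroids] * len(chunks)))
    return [label for part in parts for label in part]
```

On a GPU the same idea is usually taken to its limit, with one thread per observation rather than one worker per chunk.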
  40. 40. Open Tools for Geodemographics<br />
  41. 41. Open Tools - OACoder<br />Developed with Alex Singleton<br />Assigns UK postcodes to their corresponding OAC groups<br />Download from<br />http://areaclassification.org.uk/<br />
  42. 42. Open Tools – ‘GeodemCreator’<br />Allows users to create their local area Geodemographic Classifications<br />Provides data available in the public domain (but users can use ancillary data sources)<br />
  43. 43. Open Tools – ‘GeodemCreator’<br />Allows users to create their local area Geodemographic Classifications<br />Provides data available in the public domain (but users can use ancillary data sources)<br />Will be available to download from http://publicprofiler.org<br />
  44. 44. Spatially Weighted Geodemographics<br />
  45. 45. Spatially Weighted Geodemographics<br /><ul><li>Geodemographic classifications do not account for spatial weights in the results
  46. 46. A spatially weighted Geodemographic classification introduces spatial weights in addition to the socio-economic characteristics
  47. 47. Tobler’s first law of geography
  48. 48. “Everything is related to everything else, but near things are more related than distant things”</li></li></ul><li>Spatially weighted Geodemographics<br />Step - 1: Construct a Neighbours Graph<br />
  49. 49. Spatially weighted Geodemographics<br />Step - 1: Construct a Neighbours Graph<br />
  50. 50. Spatially weighted Geodemographics<br />Step - 2: Apply Moran’s I to the data set<br /><ul><li>It is a measure of spatial autocorrelation
  51. 51. Values of Moran’s I typically range from -1 to 1
  52. 52. A negative value represents a negative spatial auto-correlation</li></li></ul><li>Spatially weighted Geodemographics<br />Step - 2: Apply Moran’s I to the data set<br />
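Steps 1 and 2 can be illustrated with a small calculation. The sketch below assumes the neighbours graph is stored as a dictionary mapping each area to the areas it touches, with binary weights (1 for neighbours, 0 otherwise), and computes global Moran's I from its standard definition. The function name and the toy chain of four areas are hypothetical and are not taken from the presentation.

```python
def morans_i(values, neighbours):
    """Global Moran's I with binary weights.

    values     -- list of observations, one per area
    neighbours -- dict: area index -> list of neighbouring area indices
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    # Sum of products of deviations over every neighbouring pair (w_ij = 1).
    cross = sum(dev[i] * dev[j] for i, nbrs in neighbours.items() for j in nbrs)
    total_weight = sum(len(nbrs) for nbrs in neighbours.values())
    variance_sum = sum(d * d for d in dev)
    return (n * cross) / (total_weight * variance_sum)


# Toy neighbours graph (hypothetical): four areas in a row, 0-1-2-3.
chain = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}

clustered = morans_i([1, 1, 5, 5], chain)    # similar values sit side by side
alternating = morans_i([1, 5, 1, 5], chain)  # neighbours always differ
```

For this toy chain the clustered pattern gives a positive value (about 0.33) and the alternating pattern gives -1, which matches the interpretation of positive and negative spatial autocorrelation.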
  53. 53. Spatially weighted Geodemographics<br />Step - 3: Apply K-means<br />Moran’s I Result<br />
  54. 54. Spatially weighted Geodemographics<br />Result<br />
  55. 55. Conclusion and future work<br /><ul><li>Open methods and tools for building geodemographics are important
  56. 56. Testing of the spatially weighted geodemographics technique
  57. 57. On lower spatial levels
  58. 58. I will be working on the new research grant of Paul Longley on “Uncertainty of Identity”
  59. 59. How behaviours of people in the real-world could be mapped with their behaviours in the virtual world ?
  60. 60. Could marketing strategies be devised for targeting online social networks and communities ?</li></li></ul><li>A quick illustration<br />http://worldnames.publicprofiler.org<br /><ul><li>We have a record of 100,000 ‘IP Address’ entries for the last 6 months</li></li></ul><li>A quick illustration<br />http://quova.com<br />An API to convert “IP addresses” to their corresponding latitude / longitude values<br />
  61. 61. A quick illustration<br />
  62. 62. A quick illustration<br />
  63. 63. References<br />Adnan, M., Longley, P.A., Singleton, A.D., Brunsdon, C. (2010). Towards Real-Time Geodemographics: Clustering Algorithm Performance for Large Multidimensional Spatial Databases. Transactions in GIS, 14(3), 283–297.<br />Hall, J.D., Hart, J.C. (2004). GPU acceleration of iterative clustering. In: ACM Workshop on General-Purpose Computing on Graphics Processors, p. C-6.<br />Harris, R., Sleight, P., Webber, R. (2005). Geodemographics, GIS and Neighbourhood Targeting. Wiley, London.<br />Reynolds, A.P., Richards, G., Rayward-Smith, V.J. (2004). The Application of K-Medoids and PAM to the Clustering of Rules. Lecture Notes in Computer Science, 3177, 173–178.<br />Singleton, A.D., Longley, P.A. (2009). Creating open source geodemographic classifications for Higher Education applications. Papers in Regional Science, 88(3), 643–666.<br />Takizawa, H., Kobayashi, H. (2006). Hierarchical parallel processing of large scale data clustering on a PC cluster with GPU co-processing. Journal of Supercomputing, 36(3), 219–234.<br />Vickers, D.W., Rees, P.H. (2007). Creating the National Statistics 2001 Output Area Classification. Journal of the Royal Statistical Society, Series A, 170(2), 379–403.<br />Any Questions ?<br />
