CensusGIV - Geographic Information Visualisation of Census Data

Published in: Technology, Education

  1. UCL DEPARTMENT OF GEOGRAPHY
     CensusGIV: Geographic Information Visualisation of Census Data
     CASA Seminar, 9 December 2009
     Pablo Mateos, Oliver O’Brien
     Department of Geography, University College London
     www.censusprofiler.org
  2. Contents
     • Context & Justification
     • CensusGIV Aims & Objectives
     • Design Considerations
     • System Architecture
     • Demo
  3. Context & Justification
  4. The Google Generation
     • Those born after 1993 have only known life with the Internet
       – A generation whose first port of call for knowledge is the Internet, through Google’s search engine, rather than books, libraries or traditional (off-line) information sources (CIBER, 2008)
  5. Moving Beyond “Traditional Web-GIS”
  6. Geographic Visualisation at UCL
     • www.londonprofiler.org
     • www.maptube.org
     • www.publicprofiler.org/WorldNames
     • www.nationaltrustnames.org
     • atlas.publicprofiler.org
     • Coming soon: www.censusprofiler.org
  7. London Profiler – KML Search & Feeds
  8. Geoweb 2.0 in Teaching
     UCL Geography undergraduate field course in London
  9. Geovisualisation (GVis)
     • Refers to the visual representation of spatial data.
     • GVis as a research tool is used for:
       – Hypothesis generation, knowledge discovery, analysis, presentation and evaluation (Buckley, 2000)
     • Increasing realisation of the potential for ‘geography’ to provide the primary basis for innovative visualisation and knowledge exploration (Dodge, McDerby and Turner, 2006)
     • Recognised potential of GVis
       – To make sense of increasingly large datasets
       – To produce alternative representations of space
  10. Current Census Thematic Maps by ONS
      • Neighbourhood Statistics (NeSS)
        – 11 steps to view a census thematic map!
      • Mapping in CASWEB
        – not present
  11. NeSS maps via SVG/Flex applications
  12. The need for Census mapping is clear!
  13. gCensus: A First Approach
      • Query-based KML maps of 2000 US Census variables
      • http://gecensus.stanford.edu
  14. CensusGIV Aims & Objectives
  15. CensusGIV: Objectives
      1. Develop a prototype to provide innovative geographical visualisation of the Census small area statistics datasets.
      2. Provide an extensive technical evaluation of the different technological alternatives.
      3. Propose scaling up to a full service in 2011.
      4. Promote the use of innovative geographic visualisation of population datasets using mapping mashups.
  16. CensusGIV: Plan
      • ESRC Census Development grant: £80,000
      • Timeframe: 15 months (2009/10)
      • Develop a Geovisualisation prototype of the UK 2011 Census using “Geoweb 2.0” technologies
      • Mapping mashups based on data feeds from an ONS “Census hypercube” or NeSS data stream
  17. People
      • UCL Geography
        – Pablo Mateos (P.I.)
        – Paul Longley (co-P.I.)
        – Oliver O’Brien
      • UCL CASA
        – Mike Batty (co-P.I.)
        – Richard Milton (consultant)
      • User Panel
        – Jointly with EDINA DIaD project
  18. CensusGIV: Requirements and Issues
      • User not faced with queries or complex questions!
        – Start with a map (e.g. population density)
        – Automatic scale-determined geographical units
        – Base map backdrop
      • Available to the general public & “mashable”
      • Issues:
        – Intellectual Property Rights
          • Geographic boundaries & Census datasets
        – Data size: over 3,000 Census variables × 300k geographic units
        – Managing a large number of concurrent users
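To give a feel for the "data size" issue on this slide, the figures can be multiplied out. This is only a back-of-envelope sketch: the 4 bytes per value is an assumption, not something stated in the talk, and it ignores indexes, geometry and metadata.

```python
# Figures from the slide: "over 3,000" variables and roughly 300k geographic units.
N_VARIABLES = 3_000
N_GEOG_UNITS = 300_000
BYTES_PER_VALUE = 4  # assumption: one 32-bit number per cell, no overhead


def raw_table_size(variables, units, bytes_per_value):
    """Return (number of cells, size in GB) for a dense variables x units table."""
    cells = variables * units
    return cells, cells * bytes_per_value / 1e9


cells, gigabytes = raw_table_size(N_VARIABLES, N_GEOG_UNITS, BYTES_PER_VALUE)
print(f"{cells:,} values, about {gigabytes:.1f} GB before indexes")
```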
  19. Evaluation Criteria for Final Solution
      • Scalability
      • Response time
      • Maximum number of concurrent users
      • Data storage and retrieval
      • Flexibility of geovisualisation options
      • Ease of use and simplicity
      • Intellectual Property Rights (IPR) issues
      • Cost of development and implementation
  20. Geovisualisation Prototypes
      • Different technologies have been explored:
        – WMS/WFS
        – Adobe Flash (Flex) vector maps
        – SVG vector maps
        – KML vector maps with Google Maps API
        – Raster maps with OpenLayers
  21. CensusGIV: Timeline
      • October 2008 – February 2009
        – Evaluation phase (completed)
      • October 2009 – June 2010
        – Developing prototype
      • Trade-offs to be made between:
        – response time, storage space, concurrent users, IPR protection, ease of navigability, flexible visualisation, back-end/front-end solutions, cost
      • First version of prototype to be tested this month
      • ONS / Census Programme to decide full implementation for 2011 Census
  22. Design Considerations
  23. Fundamental Design Decisions
      • Server-based rasters
        – Faster on the client side
        – Fast enough on the server side
        – Not delivering restricted data to the client
      • Open Source software
        – Leverage the powerful OpenLayers mapping API
        – More powerful than Google Maps API
        – An active development community
        – Full access to the source: can do “cool stuff”
      • “Slippy” map
        – Intuitive
        – Encourages exploration
  24. Maps of Population Data
      • Cartograms
        – Fairer representation
        – Multiple variables can be shown together
      • Choropleth Maps
        – Easier to relate to
      • Surface mapping
        – Interpolation
  25. Accessing the Census Data
      • Neighbourhood Statistics
        – Hunter vs Gatherer
        – NeSS Data API (SOAP)
        – CSV Downloads
          • Still tedious; for each UV:
            – Download files for each GOR
            – Stitch them together (has been automated)
            – Create corresponding tables in the database, add data
            – Add ranking scores
            – Add metadata
        – NeSS Data API (REST) coming February 2010
      • CASWEB
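The "stitch them together" and "add ranking scores" steps above might look roughly like the sketch below. Everything specific here is assumed for illustration: the column names, the sample rows and the ranking direction (1 = highest value) are not taken from the talk or from the real NeSS downloads.

```python
import csv
import io


def stitch_regions(region_files):
    """Concatenate per-region CSV files that share a header into one list of rows."""
    rows = []
    for handle in region_files:
        rows.extend(csv.DictReader(handle))
    return rows


def add_ranks(rows, column):
    """Add a 1-based rank (1 = highest value) for `column` to each row in place."""
    ordered = sorted(rows, key=lambda r: float(r[column]), reverse=True)
    for rank, row in enumerate(ordered, start=1):
        row["rank"] = rank
    return rows


# Two tiny in-memory stand-ins for per-GOR downloads (made-up codes and values).
region_a = io.StringIO("area_code,value\nA1,10\nA2,30\n")
region_b = io.StringIO("area_code,value\nB1,20\n")

table = add_ranks(stitch_regions([region_a, region_b]), "value")
for row in table:
    print(row["area_code"], row["value"], row["rank"])
```

The stitched rows keep their download order, so loading them into a database table afterwards is a straightforward bulk insert.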
  26. Structure of The Web-App
      • OpenLayers “slippy” map
        – Fully opaque grey base layer
          • Could be switched for aerial imagery from Google/Microsoft
        – Opaque choropleth overlay
          • Variable translucency if aerial imagery underneath
        – Context overlay
          • Points, lines and names
          • Sea area in lighter grey
          • Otherwise transparent
        – POIs
          • e.g. schools, hospitals
          • SVG vectors rather than tiles
          • “Clickable”
  27. Screenshot
  28. Why a Custom Context Layer?
      • Having full control is a definite advantage
        – Underlay
          • Google colours/features can clash with choropleths
          • Lose the context if choropleth is fully opaque
        – Overlay
          • Google labels can obscure information
      • Google’s cartography recently changed (for the better)
      • But no control over future changes
  29. Cartography of the Context Layer
      • Difficult to get right
        – Urban vs Rural
      • Strictly Black & White
      • Few point features
        – Hospitals, airports, place names
      • Fewer areal features
        – Lakes, sea
      • Mainly a network of roads/rivers/railways
      • Less is more
  30. Creating the Context Layer
      • PostGIS database
        – Using the OpenStreetMap dataset for the UK
        – Relatively slow to create the images from the data
          • ~50 database queries for each image tile
          • Higher zoom levels have tiles with smaller extent, but we include more detail at these levels, which cancels out the speed increase
        – Render on demand “unimportant” tiles at zoom levels 16-18
        – Pre-render everything else
      • Painter’s Algorithm
  31. Painter’s Algorithm
      • Two hierarchies of layering
        – Feature-based layering
          • Land, water, road/railway casings & cores, place names
        – Intra-feature level z-ordering
          • Complex road junctions
          • Railway/road crossing
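The two hierarchies on this slide can be read as a two-part sort key: feature class first, then z-order within the class, with everything painted back to front. The sketch below is one minimal reading of that idea, not the project's renderer; the `Feature` tuple, the layer names and the sample features are all hypothetical.

```python
from collections import namedtuple

# Feature-class order from the slide, listed back to front.
LAYER_ORDER = ["land", "water", "casing", "core", "place_name"]

Feature = namedtuple("Feature", "name layer z_order")


def paint_order(features):
    """Sort back to front: by feature layer first, then by z-order within a layer."""
    return sorted(features, key=lambda f: (LAYER_ORDER.index(f.layer), f.z_order))


features = [
    Feature("High Street name", "place_name", 0),
    Feature("bridge core", "core", 1),
    Feature("river", "water", 0),
    Feature("road core", "core", 0),
    Feature("bridge casing", "casing", 1),
    Feature("road casing", "casing", 0),
    Feature("land", "land", 0),
]

for feature in paint_order(features):
    print(feature.name)  # a real renderer would draw onto the tile image here
```

Later items are drawn over earlier ones, so labels end up on top and the higher z-order wins where two features of the same class cross.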
  32. Pre-rendering of Context Layer
      • Rendered on “gibin”, a quad-core computer running Linux
      • Utilising the Python “Threading” module
        – 4 tiles created at once
      • The image “tiles” are PNGs with an alpha layer
      • Bounding box: 10.7 W to 1.8 E, 49.8 N to 60.9 N (all of the UK)

      Zoom Level  Scale      No of Tiles  Size     Detail                              Time
      6-9         < 1:1M     790          5 MB     Cities, motorways                   < 1 min
      10          1:600,000  2,146        15 MB    + towns, trunk roads, lakes         1 min
      11          1:300,000  8,208        40 MB    + main roads, rivers, airfields     2 min
      12          1:150,000  32,318       156 MB   + minor roads, railways, villages   7 min
      13          1:72,000   128,250      500 MB   + main road, water & area names     24 min
      14          1:36,000   510,962      1.4 GB   + paths                             1 h 28 min
      15          1:18,000   2,041,572    4.4 GB   + minor road names                  5 h 34 min
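As a rough illustration of where tile counts like these come from, the sketch below enumerates the tiles covering the slide's bounding box and hands them to four workers. It assumes the standard Web Mercator "slippy map" tile numbering, which the talk does not spell out, and it uses `concurrent.futures.ThreadPoolExecutor` as a stand-in for the hand-rolled use of the threading module mentioned above. `render_tile` is a placeholder. The count it gives for zoom 10 is close to, but not the same as, the 2,146 in the table, presumably because the exact edges used in the talk differ slightly.

```python
import math
from concurrent.futures import ThreadPoolExecutor

# Bounding box from the slide (west longitudes are negative).
WEST, EAST, SOUTH, NORTH = -10.7, 1.8, 49.8, 60.9


def lonlat_to_tile(lon, lat, zoom):
    """Standard slippy-map (Web Mercator) tile index containing a lon/lat point."""
    n = 2 ** zoom
    x = int((lon + 180.0) / 360.0 * n)
    y = int((1.0 - math.asinh(math.tan(math.radians(lat))) / math.pi) / 2.0 * n)
    return x, y


def tiles_in_bbox(zoom):
    """List every (zoom, x, y) tile touching the bounding box at one zoom level."""
    x_min, y_min = lonlat_to_tile(WEST, NORTH, zoom)  # top-left corner
    x_max, y_max = lonlat_to_tile(EAST, SOUTH, zoom)  # bottom-right corner
    return [(zoom, x, y)
            for x in range(x_min, x_max + 1)
            for y in range(y_min, y_max + 1)]


def render_tile(tile):
    """Placeholder for the real renderer; returns the path a tile would be saved to."""
    zoom, x, y = tile
    return f"{zoom}/{x}/{y}.png"


tiles = tiles_in_bbox(10)
with ThreadPoolExecutor(max_workers=4) as pool:  # four tiles in flight at once
    paths = list(pool.map(render_tile, tiles))
print(len(paths), "tiles at zoom 10")
```

Each zoom level doubles the resolution on both axes, which is why the counts in the table grow by roughly a factor of four per level.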
  33. The Context Layer (Levels 6-11)
  34. The Context Layer (Levels 12-17)
      [Figure: two of the panels are labelled “On Demand”]
  35. Creating the Choropleth Layers
      • PostGIS database of census data
      • Would never want to pre-render all the choropleths at all zoom levels
        – 1000+ metrics × 10 groupings × 30 colour schemes × 2 colour orders × 13 zoom levels × 1000s of tiles per zoom
        – Makes sense to cache the most popular zoom levels, metrics and colours
        – Most people will never “explore” the map at a greater zoom level: usage decreases exponentially with the number of clicks in a web app.
      • Specially crafted URL
        – Boundary Table, Data Table, Metric
        – Bounding Box, Zoom
        – Colour Scheme, No of Groups
        – Range Type, Range Attributes
          • Min/Max
          • Average/Deviation
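One way to picture the "specially crafted URL" is as a query string carrying the fields listed above, with the range type and its two attributes deciding where the class breaks fall. The sketch below is a guess at the shape only: every parameter name and value is hypothetical, and the two break rules (equal intervals for Min/Max, one deviation per class around the average for Average/Deviation) are assumptions about what those range types mean.

```python
from urllib.parse import parse_qs, urlencode


def build_tile_query(**params):
    """Encode tile parameters as a query string (parameter names are hypothetical)."""
    return urlencode(sorted(params.items()))


def class_breaks(range_type, first, second, groups):
    """Return the groups - 1 boundaries between classes.

    "minmax": first/second are the minimum and maximum; equal-width classes.
    "avgdev": first/second are the average and deviation; classes one
    deviation wide, centred on the average.
    """
    if range_type == "minmax":
        step = (second - first) / groups
        return [first + step * i for i in range(1, groups)]
    if range_type == "avgdev":
        offset = (groups - 2) / 2
        return [first + second * (i - offset) for i in range(groups - 1)]
    raise ValueError(f"unknown range type: {range_type}")


query = build_tile_query(
    boundary="oa", table="uv_example", metric="density",
    bbox="-0.2,51.4,0.0,51.6", zoom=12,
    scheme="Blues", groups=5,
    range_type="minmax", range_a=0, range_b=10,
)
request = {key: value[0] for key, value in parse_qs(query).items()}
breaks = class_breaks(request["range_type"], float(request["range_a"]),
                      float(request["range_b"]), int(request["groups"]))
print(query)
print(breaks)
```

Because every rendering choice lives in the URL, the same string can double as a cache key for the popular combinations.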
  36. The Modifiable Areal Unit Problem

      Boundary Type  Zoom Levels  Average No of Vertices (Simplified)  Number
      MSOA           6, 7, 8      145                                  7,196 (Eng & Wal)
      LSOA           9, 10, 11    62                                   40,884 (not N.I.)
      OA             12 - 18      26                                   223,131
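Read as zoom levels, the table above amounts to a simple lookup from map zoom to boundary type, which is what "automatic scale-determined geographical units" would need. A minimal sketch, with a hypothetical function name:

```python
# Zoom ranges per boundary type, as given in the table.
BOUNDARY_ZOOMS = [
    ("MSOA", range(6, 9)),    # zoom levels 6-8
    ("LSOA", range(9, 12)),   # zoom levels 9-11
    ("OA", range(12, 19)),    # zoom levels 12-18
]


def boundary_for_zoom(zoom):
    """Return the boundary type to draw at a given zoom level."""
    for name, zooms in BOUNDARY_ZOOMS:
        if zoom in zooms:
            return name
    raise ValueError(f"zoom level {zoom} is outside the supported range 6-18")


for zoom in (6, 10, 15):
    print(zoom, boundary_for_zoom(zoom))
```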
  37. The Modifiable Areal Unit Problem: MSOA
  38. The Modifiable Areal Unit Problem: LSOA
  39. The Modifiable Areal Unit Problem: OA
  40. The Modifiable Areal Unit Problem
  41. Colour Theory
      • “practical guidance to colour mixing and the visual impacts of specific colour combinations”
      • Formal considerations
        – Colour harmony (complementary colours: pink vs blue)
        – Colour context (bright colours beside subdued colours)
        – Colour blindness
      • Very subjective
  42. Colour Considerations
      • Colour should relate to data type:
        – Sequential
        – Diverging
        – Qualitative
      • The “most of the UK is countryside” problem
        – Try not to use bright colours for the countryside.
      • [Figure: colour association labels: Hot, Bad, High, Cold, Natural, Good, Neutral, Girls, Boys]
  43. Colour Harmony
      • Colour Harmony
        – Complementary Colours
        – Analogous Colours
      • Colour Variation
        – Hue
        – Saturation
        – Lightness
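The terms on this slide map directly onto the HSL colour model, so they can be tried out with Python's standard `colorsys` module. The sketch below is illustrative only: the helper names, the lightness range (0.9 down to 0.3) and the saturation default are arbitrary choices, not values from the talk.

```python
import colorsys


def to_hex(hue, lightness, saturation):
    """Convert HSL components (each in the range 0-1) to an #rrggbb string."""
    r, g, b = colorsys.hls_to_rgb(hue, lightness, saturation)
    return "#{:02x}{:02x}{:02x}".format(round(r * 255), round(g * 255), round(b * 255))


def lightness_ramp(hue, steps, saturation=0.6):
    """Fixed hue and saturation, lightness stepping from light (0.9) to dark (0.3)."""
    lightnesses = [0.9 - i * (0.6 / (steps - 1)) for i in range(steps)]
    return [to_hex(hue, lightness, saturation) for lightness in lightnesses]


def complementary(hue):
    """Hue on the opposite side of the colour wheel."""
    return (hue + 0.5) % 1.0


blue = 0.6
print(lightness_ramp(blue, 5))
print(to_hex(complementary(blue), 0.5, 0.6))
```

Varying only lightness gives a ramp suited to ordered data, while jumping to the complementary hue gives the strongest contrast between two extremes.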
  44. ColorBrewer
      • Cynthia Brewer’s colorbrewer2.com
        – Provides a set of “good” colour schemes which can be incorporated easily into Python scripts, ArcMap, etc.
        – Generally vary by hue and/or lightness
      • Sequential
        – Lightness should be varied; use analogous colours if varying hue
        – Plenty of “good” maps that don’t follow this rule
      • Diverging
        – Mid-point should be a light colour
        – Extremes should have darker colours with complementary hues
      • Qualitative
        – Hues should vary
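Incorporating a ColorBrewer scheme into a Python script can be as simple as a list of hex codes plus a class lookup. The sketch below uses the five-class sequential "Blues" scheme; the `colour_for` helper, the example breaks and the `reverse` flag (standing in for the "colour order" option mentioned earlier in the talk) are illustrative assumptions.

```python
from bisect import bisect_right

# Five-class sequential "Blues" scheme from ColorBrewer, light to dark.
BLUES_5 = ["#eff3ff", "#bdd7e7", "#6baed6", "#3182bd", "#08519c"]


def colour_for(value, breaks, scheme, reverse=False):
    """Pick the colour of the class that `value` falls into.

    `breaks` holds the len(scheme) - 1 class boundaries in ascending order;
    a value equal to a boundary goes into the class above it.
    """
    colours = scheme[::-1] if reverse else scheme
    return colours[bisect_right(breaks, value)]


breaks = [2, 4, 6, 8]
for value in (1, 5, 9):
    print(value, colour_for(value, breaks, BLUES_5))
```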
  45. Aerial Imagery
      layerAerial = new OpenLayers.Layer.Google("Aerial Imagery", {numZoomLevels: 16, type: G_SATELLITE_MAP, sphericalMercator: true});
      • Very easy!
      • OpenLayers
        – Google Maps imagery layer
        – Microsoft Virtual Earth layer
      • Only useful when zoomed in
      • Need to be mindful that colour imagery interferes with choropleth colours
      • No longer self-contained
  46. Points of Interest (POIs)
      • PostgreSQL (or MySQL) database
      • Can be a completely separate server
      • Client’s OpenLayers does the work
      • Aim is to provide even more context
      • School names & performance indicators
  47. User Interface
      Less is more
      Choice is good
      • How do you get people to explore the maps?
        – Maptube “visual directory”
        – Hierarchical drop-down lists
        – Tag cloud of keywords, maybe with a hierarchy
  48. User Interface – Tag Cloud
      • Useful for exploring if you don’t know what you want
      • More structured alternative needed for specific research
  49. System Architecture
  50. A Note on Python
      • If you don’t use it already, you will!
        – ArcGIS 9.4
          • “Python is now integrated directly into ArcMap [9.4]. I say it every year, but if you are an ArcGIS Desktop user, you need to take a close look at python as your scripting language.” - James Fee
      • The best thing about Python is:
        – Tidy scripts!
  51. Servers
      [Diagram: a server room hosting dev, tba, tiler1, tiler2, tiler3, blog, pois, tiles1, tiles2, tiles3 and www, all serving web browsers]
  52. System Architecture – Website
      [Diagram: Web browser and Apache (www)]
  53. System Architecture – Context
      [Diagram: Web browser, Apache (www), Apache (tiles2); decision “Tile exists?”: Yes returns the tile, No returns a 404]
      • No Python involved
        – Less strain on the server
      • Web browser may have to request image twice
        – Slow for the client
  54. System Architecture – Context
      [Diagram: Web browser, Apache (www), Apache (tiler2) with mod_python; Python scripts renderer.py and gen_tile.py; XML; Cache]
  55. System Architecture – Choropleth
      [Diagram: Web browser, Apache (www), Apache (tiler3) with mod_python; Python scripts renderer.py and gen_tile.py; Colorbrewer; Cache; decision “Tile exists?” (low zoom): Yes / No]
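The "Tile exists?" decision in these diagrams is a cache-or-render pattern. The sketch below shows the general idea in plain Python, not the project's mod_python code: the `render` stand-in, the directory layout and the cut-off `CACHE_MAX_ZOOM` (reading "low zoom" as "only low-zoom tiles are cached") are all assumptions.

```python
import os
import tempfile

CACHE_MAX_ZOOM = 11  # assumption: only "low zoom" tiles are cached; value is illustrative


def render(zoom, x, y):
    """Stand-in for the real renderer; returns fake image bytes."""
    return f"tile {zoom}/{x}/{y}".encode()


def get_tile(cache_dir, zoom, x, y, renderer=render):
    """Serve a tile from the cache if it exists, otherwise render it.

    Returns (image bytes, "cache" or "rendered"). Low-zoom tiles are written
    to the cache after rendering; higher zooms are rendered every time.
    """
    path = os.path.join(cache_dir, str(zoom), str(x), f"{y}.png")
    if os.path.exists(path):
        with open(path, "rb") as handle:
            return handle.read(), "cache"
    data = renderer(zoom, x, y)
    if zoom <= CACHE_MAX_ZOOM:
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, "wb") as handle:
            handle.write(data)
    return data, "rendered"


with tempfile.TemporaryDirectory() as cache:
    for zoom, x, y in [(8, 126, 84), (8, 126, 84), (15, 16000, 10000)]:
        data, source = get_tile(cache, zoom, x, y)
        print(zoom, x, y, source)
```

Doing the check inside the handler means the browser makes a single request per tile, at the cost of running Python on every request.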
  56. Scalability
      • OpenLayers allows multiple servers to be specified for retrieving image tiles
      • Different servers for different tasks
      • Random server chosen per tile
      • So should scale?
      • Process is still processor intensive if generating the tiles at the same time
      • Stress testing needed
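Spreading tile requests across several servers can be sketched as below. Note the swap: the slide describes a random choice per tile, whereas this sketch hashes the tile address so that the same tile always maps to the same host, which keeps browser and server caches effective. The host names are placeholders, not the project's real servers.

```python
import zlib

# Placeholder host names; the real tile servers are not given in full in the talk.
TILE_HOSTS = ["tiles1.example.org", "tiles2.example.org", "tiles3.example.org"]


def host_for_tile(zoom, x, y, hosts=TILE_HOSTS):
    """Pick a host for a tile by hashing its address (stable, roughly even spread)."""
    key = f"{zoom}/{x}/{y}".encode()
    return hosts[zlib.crc32(key) % len(hosts)]


def tile_url(zoom, x, y):
    """Build the URL a client would request for one tile."""
    return f"http://{host_for_tile(zoom, x, y)}/{zoom}/{x}/{y}.png"


for x in range(3):
    print(tile_url(10, 500 + x, 320))
```

This only spreads the request load; as the slide notes, rendering is still processor intensive when many uncached tiles are requested at once.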
  57. Prototype: Current State & Next Steps
      Current state:
        ✓ On-demand tile
        ✓ Fast (enough)
        ✓ Will scale (hopefully!)
        ✓ OSM not quite “complete” but getting there
        ✓ Context layer finished
        ✓ Some data added
      Next steps:
        × Legend generation
        × Automated data updates
        × Tag cloud
        × Improve cartography
        × Internet Explorer 6 (www.savethedevelopers.org)
        × Other census & ONS data
        × Interactive data combination
        × Scotland & Northern Ireland
        × Points of Interest
      Running until June 2010
  58. Live Demo
      • http://www.censusprofiler.org/prototype/
  59. Google Earth
  60. Q&A
      www.censusprofiler.org
      www.oliverobrien.co.uk

      Google Street View and Google Earth POI data is Copyright Google. Google Maps mapping data is Copyright Tele Atlas. Google aerial imagery is Copyright Digital Globe, Infoterra Ltd, Bluesky, GeoEye, Getmapping plc, The Geoinformation Group. OpenStreetMap data is CC-BY-SA OpenStreetMap and contributors. Logos depicted are generally Copyright of their respective organisations. Some image tiles include boundary information supplied by EDINA’s UKBORDERS service. The Census data is supplied by the Office for National Statistics. The Word Cloud was produced with Wordle. The colour wheel diagrams are from worqx.com. The Painter’s Algorithm picture and the HSL colour diagram are from Wikipedia. The cartogram was produced by James Cheshire. The corresponding choropleth was produced by the BBC.

      The following references were used in the first part of this presentation:
      • CIBER (2008) Information behaviour of the researcher of the future. A report commissioned by The British Library and JISC, 11 January 2008. http://www.bl.uk/news/pdf/googlegen.pdf
      • Goodchild (2007) Citizens as Sensors: The World of Volunteered Geography. Workshop on Volunteered Geographic Information, Santa Barbara, CA, December 13-14, 2007. http://www.ncgia.ucsb.edu/projects/vgi/docs/position/Goodchild_VGI2007.pdf
      • O’Reilly, T (2005) What Is Web 2.0: Design Patterns and Business Models for the Next Generation of Software. http://www.oreillynet.com/pub/a/oreilly/tim/news/2005/09/30/whatisWeb20.html
      • Turner, A (2007) Introduction to Neogeography. O’Reilly Media Short Cuts.
