CensusGIV - Geographic Information Visualisation of Census Data

Presentation Transcript

  • UCL DEPARTMENT OF GEOGRAPHY CensusGIV: Geographic Information Visualisation of Census Data. CASA Seminar, 9 December 2009. Pablo Mateos, Oliver O’Brien, Department of Geography, University College London. www.censusprofiler.org
  • Contents • Context & Justification • CensusGIV Aims & Objectives • Design Considerations • System Architecture • Demo
  • Context & Justification
  • The Google Generation • Those born after 1993 have only known life with the Internet – A generation whose first port of call for knowledge is the Internet through Google’s search engine, as opposed to books, libraries or traditional (off-line) information sources (CIBER, 2008)
  • Moving Beyond “Traditional Web-GIS”
  • Geographic Visualisation at UCL • www.londonprofiler.org • www.maptube.org • www.publicprofiler.org/WorldNames • www.nationaltrustnames.org • atlas.publicprofiler.org • Coming soon: www.censusprofiler.org
  • London Profiler – KML Search & Feeds
  • Geoweb 2.0 in Teaching: UCL Geography undergraduate field course in London
  • Geovisualisation (GVis) • Refers to the visual representation of spatial data. • GVis as a research tool is used for: – Hypothesis generation, knowledge discovery, analysis, presentation and evaluation (Buckley, 2000) • Increasing realisation of the potential for ‘geography’ to provide the primary basis for innovative visualisation and knowledge exploration (Dodge, McDerby and Turner, 2006) • Recognised potential of GVis – To make sense of increasingly large datasets – To produce alternative representations of space
  • Current Census Thematic Maps by ONS • Neighbourhood Statistics (NeSS) – 11 steps to view a census thematic map! • Mapping in CASWEB – not present
  • NeSS maps via SVG/Flex applications
  • The need for Census mapping is clear!
  • gCensus: A First Approach • Query-based KML maps of 2000 US Census variables • http://gecensus.stanford.edu
  • CensusGIV Aims & Objectives
  • CensusGIV: Objectives 1. Develop a prototype to provide innovative geographical visualisation of the Census small area statistics datasets. 2. Provide an extensive technical evaluation of the different technological alternatives. 3. Propose how to scale up to a full service in 2011. 4. Promote the use of innovative geographic visualisation of population datasets using mapping mashups.
  • CensusGIV: Plan • ESRC Census Development grant £80,000 • Timeframe: 15 months (2009/10) • Develop a Geovisualisation prototype of the UK 2011 Census using “Geoweb 2.0” technologies • Mapping mashups based on data feeds from an ONS “Census hypercube” or NeSS data stream
  • People • UCL Geography – Pablo Mateos (P.I.) – Paul Longley (co-P.I.) – Oliver O’Brien • UCL CASA – Mike Batty (co-P.I.) – Richard Milton (consultant) • User Panel – Jointly with EDINA DIaD project
  • CensusGIV: Requirements and Issues • User not faced with queries or complex questions! – Start with a map (e.g. population density) – Automatic scale-determined geographical units – Base map backdrop • Available to the general public & “mashable” • Issues: – Intellectual Property Rights • Geographic boundaries & Census datasets – Data size: Over 3,000 Census variables × 300k geographic units – Managing a large number of concurrent users
  • Evaluation Criteria for Final Solution • Scalability • Response time • Maximum number of concurrent users • Data storage and retrieval • Flexibility of geovisualisation options • Ease of use and simplicity • Intellectual Property Rights (IPR) issues • Cost of development and implementation
  • Geovisualisation Prototypes • Different technologies have been explored: – WMS/WFS – Adobe Flash (Flex) vector maps – SVG vector maps – KML vector maps with Google Maps API – Raster maps with OpenLayers
  • CensusGIV: Timeline • October 2008 – February 2009 – Evaluation phase (completed) • October 2009 – June 2010 – Developing prototype • Trade-offs to be made between: – response time, storage space, concurrent users, IPR protection, ease of navigability, flexible visualisation, back-end/front-end solutions, cost • First version of prototype to be tested this month • ONS / Census Programme to decide full implementation for 2011 Census
  • Design Considerations
  • Fundamental Design Decisions • Server-based rasters – Faster on the client side – Fast enough on the server side – Not delivering restricted data to the client • Open Source software – Leverage the powerful OpenLayers mapping API – More powerful than Google Maps API – An active development community – Full access to the source – can do “cool stuff” • “Slippy” map – Intuitive – Encourages exploration
  • Maps of Population Data • Cartograms – Fairer representation – Multiple variables can be shown together • Choropleth Maps – Easier to relate to • Surface mapping – Interpolation
  • Accessing the Census Data • Neighbourhood Statistics – Hunter vs Gatherer – NeSS Data API (SOAP) – CSV Downloads • Still tedious – for each UV: – Download files for each GOR – Stitch them together (has been automated) – Create corresponding tables in the database, add data – Add ranking scores – Add metadata – NeSS Data API (REST) coming February 2010 • CASWEB
  • Structure of The Web-App • OpenLayers “slippy” map – Fully opaque grey base layer • Could be switched for aerial imagery from Google/Microsoft – Opaque choropleth overlay • Variable translucency if aerial imagery underneath – Context overlay • Points, lines and names • Sea area in lighter grey • Otherwise transparent – POIs • e.g. schools, hospitals • SVG vectors rather than tiles • “Clickable”
  • Screenshot
  • Why a Custom Context Layer? • Having full control is a definite advantage – Underlay • Google colours/features can clash with choropleths • Lose the context if choropleth is fully opaque – Overlay • Google labels can obscure information • Google’s cartography recently changed (for the better) • But no control over future changes
  • Cartography of the Context Layer • Difficult to get right – Urban vs Rural • Strictly Black & White • Few point features – Hospitals, airports, place names • Fewer areal features – Lakes, sea • Mainly a network of roads/rivers/railways • Less is more
  • Creating the Context Layer • PostGIS database – Using the OpenStreetMap dataset for the UK – Relatively slow to create the images from the data • ~50 database queries for each image tile • Higher zoom levels have tiles with smaller extent, but we include more detail at these levels, which cancels out the speed increase – Render on demand “unimportant” tiles at zoom levels 16-18 – Pre-render everything else • Painter’s Algorithm
  • Painter’s Algorithm • Two hierarchies of layering – Feature-based layering • Land, water, road/railway casings & cores, place names – Intra-feature level z-ordering • Complex road junctions • Railway/road crossing
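The two layering hierarchies on the Painter's Algorithm slide can be sketched as a two-key sort: features are drawn back to front by feature layer, and within a layer by a z-order value. This is only an illustrative sketch; the `Feature` type, the `LAYER_ORDER` list and the sample feature names are assumptions, not the project's actual renderer.

```python
from typing import NamedTuple

# Hypothetical back-to-front layer order, loosely following the slide.
LAYER_ORDER = ["land", "water", "road_casing", "road_core", "place_name"]


class Feature(NamedTuple):
    name: str
    layer: str  # feature-based layer
    z: int = 0  # intra-layer z-order, e.g. a bridge above a road


def paint_order(features):
    """Return features sorted back to front: by layer first, then by z."""
    return sorted(features, key=lambda f: (LAYER_ORDER.index(f.layer), f.z))


features = [
    Feature("High Street label", "place_name"),
    Feature("railway bridge core", "road_core", z=1),
    Feature("A-road core", "road_core", z=0),
    Feature("river", "water"),
    Feature("A-road casing", "road_casing"),
    Feature("land mass", "land"),
]

# Later items are painted over earlier ones.
drawn = [f.name for f in paint_order(features)]
```

Drawing in this order means labels always end up on top, and a feature with a higher z value (the bridge) overpaints a lower one in the same layer.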
  • Pre-rendering of Context Layer • Rendered on “gibin”, a quad-core computer running Linux • Utilising the Python “threading” module – 4 tiles created at once • The image “tiles” are PNGs with an alpha layer • Bounding box: 10.7 W to 1.8 E, 49.8 N to 60.9 N (all of the UK)
    Zoom level | Scale     | No. of tiles | Size   | Detail                              | Time
    6-9        | < 1:1M    | 790          | 5 MB   | Cities, motorways                   | < 1 min
    10         | 1:600,000 | 2,146        | 15 MB  | + towns, trunk roads, lakes         | 1 min
    11         | 1:300,000 | 8,208        | 40 MB  | + main roads, rivers, airfields     | 2 min
    12         | 1:150,000 | 32,318       | 156 MB | + minor roads, railways, villages   | 7 min
    13         | 1:72,000  | 128,250      | 500 MB | + main road, water & area names     | 24 min
    14         | 1:36,000  | 510,962      | 1.4 GB | + paths                             | 1 h 28 min
    15         | 1:18,000  | 2,041,572    | 4.4 GB | + minor road names                  | 5 h 34 min
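A minimal sketch of rendering four tiles at once with Python's `threading` module, as the slide describes. The `render_tile` function here is a stand-in that only returns a path; the real renderer, its arguments and the tile naming scheme are assumptions.

```python
import queue
import threading


def render_tile(zoom, x, y):
    """Placeholder for the real renderer; returns a made-up tile path."""
    return f"{zoom}/{x}/{y}.png"


def prerender(tiles, workers=4):
    """Render every (zoom, x, y) in `tiles` using `workers` threads."""
    todo = queue.Queue()
    for tile in tiles:
        todo.put(tile)

    done = []
    lock = threading.Lock()

    def worker():
        # Each thread pulls tiles until the queue is empty.
        while True:
            try:
                zoom, x, y = todo.get_nowait()
            except queue.Empty:
                return
            path = render_tile(zoom, x, y)
            with lock:
                done.append(path)

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return done


# Hypothetical 3 x 3 block of tiles at zoom level 6.
tiles = [(6, x, y) for x in range(3) for y in range(3)]
rendered = prerender(tiles)
```

In CPython, threads mainly help when the time per tile is spent waiting on the database or inside native libraries, which may well be the case with around 50 queries per tile.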
  • The Context Layer (Levels 6-11)
  • The Context Layer (Levels 12-17; two panels labelled “On Demand”)
  • Creating the Choropleth Layers • PostGIS database of census data • Would never want to pre-render all the choropleths at all zoom levels – 1000+ metrics × 10 groupings × 30 colour schemes × 2 colour orders × 13 zoom levels × thousands of tiles per zoom – Makes sense to cache most popular zoom levels, metrics, colours – Most people will never “explore” the map at a greater zoom level: usage decreases exponentially with the number of clicks in a web app. • Specially crafted URL – Boundary Table, Data Table, Metric – Bounding Box, Zoom – Colour Scheme, No. of Groups – Range Type, Range Attributes • Min/Max • Average/Deviation
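To make the "specially crafted URL" idea concrete, here is a hedged sketch that packs the parameters listed on the slide into a query string and works out the size of the pre-rendering problem. The host name, parameter names and values are all hypothetical; the slides do not give the actual URL format.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def tile_url(base, **params):
    """Build a tile URL; parameters are sorted so equal requests give equal URLs."""
    return base + "?" + urlencode(sorted(params.items()))


url = tile_url(
    "http://tiler.example.org/choropleth",  # hypothetical endpoint
    boundary="oa",            # boundary table
    data="uv_table",          # data table
    metric="m001",            # metric
    bbox="-0.2,51.4,0.0,51.6",
    zoom=12,
    scheme="Blues",           # colour scheme
    groups=5,                 # number of groups
    range_type="minmax",
    range_min=0,
    range_max=100,
)
query = parse_qs(urlsplit(url).query)

# Style combinations from the slide, before multiplying by tiles per zoom level:
# metrics x groupings x colour schemes x colour orders x zoom levels.
combinations = 1000 * 10 * 30 * 2 * 13
```

Sorting the parameters makes the URL usable as a cache key. The product comes to 7.8 million combinations, each needing thousands of tiles, which is why only the most popular ones are worth caching.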
  • The Modifiable Areal Unit Problem
    Boundary type | Level     | Average no. of vertices (simplified) | Number
    MSOA          | 6, 7, 8   | 145                                  | 7,196 (Eng & Wal)
    LSOA          | 9, 10, 11 | 62                                   | 40,884 (not N.I.)
    OA            | 12 - 18   | 26                                   | 223,131
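The table suggests how "automatic scale-determined geographical units" might be chosen. A minimal sketch, assuming the "Level" column refers to map zoom levels; the function name and error handling are illustrative only.

```python
# Zoom-level ranges per boundary type, taken from the table above.
BOUNDARY_BY_ZOOM = [
    (range(6, 9), "MSOA"),    # levels 6, 7, 8
    (range(9, 12), "LSOA"),   # levels 9, 10, 11
    (range(12, 19), "OA"),    # levels 12 to 18
]


def boundary_for_zoom(zoom):
    """Return the boundary type to draw at a given zoom level."""
    for levels, boundary in BOUNDARY_BY_ZOOM:
        if zoom in levels:
            return boundary
    raise ValueError(f"no boundary type defined for zoom level {zoom}")
```

For example, `boundary_for_zoom(10)` gives `"LSOA"`, so a user zooming in is moved to finer units without ever choosing a geography explicitly.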
  • The Modifiable Areal Unit Problem: MSOA
  • The Modifiable Areal Unit Problem: LSOA
  • The Modifiable Areal Unit Problem: OA
  • The Modifiable Areal Unit Problem
  • Colour Theory • “practical guidance to colour mixing and the visual impacts of specific colour combinations” • Formal considerations – Colour harmony (complementary colours – pink vs blue) – Colour context (bright colours beside subdued colours) – Colour blindness • Very subjective
  • Colour Considerations • Colour should relate to data type: – Sequential – Diverging – Qualitative • The “most of the UK is countryside” problem – Try not to use bright colours for the countryside. • Colour associations (figure labels): Hot, Bad, High, Cold, Natural, Good, Neutral, Girls, Boys
  • Colour Harmony • Colour Harmony – Complementary Colours – Analogous Colours • Colour Variation – Hue – Saturation – Lightness
  • ColorBrewer • Cynthia Brewer’s colorbrewer2.com – Provides a set of “good” colour schemes which can be incorporated easily into Python scripts, ArcMap, etc. – Generally vary by hue and/or lightness • Sequential – Lightness should be varied, use analogous colours if varying hue – Plenty of “good” maps that don’t follow this rule • Diverging – Mid-point should be a light colour – Extremes should have darker colours with complementary hues • Qualitative – Hues should vary
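As a rough illustration of using a sequential scheme in a Python script, the sketch below assigns a value to one of five equal-width groups between a minimum and maximum and looks up its colour. The hex values are intended to be the 5-class ColorBrewer "Blues" scheme but should be treated as illustrative; the function name and the equal-interval classification are assumptions, not the project's method.

```python
# Light-to-dark sequential palette (intended as ColorBrewer 5-class "Blues").
BLUES_5 = ["#eff3ff", "#bdd7e7", "#6baed6", "#3182bd", "#08519c"]


def colour_for(value, lo, hi, palette=BLUES_5, reverse=False):
    """Map `value` in [lo, hi] to a palette colour using equal-width groups."""
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    colours = palette[::-1] if reverse else palette
    fraction = (value - lo) / (hi - lo)
    # Clamp so that out-of-range values fall into the first or last group.
    index = min(max(int(fraction * len(colours)), 0), len(colours) - 1)
    return colours[index]
```

The `reverse` flag corresponds to the two colour orders mentioned earlier: the same scheme can run light-to-dark or dark-to-light depending on whether high values are the ones to emphasise.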
  • Aerial Imagery • layerAerial = new OpenLayers.Layer.Google("Aerial Imagery", {numZoomLevels: 16, type: G_SATELLITE_MAP, sphericalMercator: true}); • Very easy! • OpenLayers – Google Maps imagery layer – Microsoft Virtual Earth layer • Only useful when zoomed in • Need to be mindful that colour imagery interferes with choropleth colours • No longer self-contained
  • Points of Interest (POIs) • PostgreSQL (or MySQL) database • Can be a completely separate server • Client’s OpenLayers does the work • Aim is to provide even more context • School names & performance indicators
  • User Interface • Less is more • Choice is good • How do you get people to explore the maps? – Maptube “visual directory” – Hierarchical drop-down lists – Tag cloud of keywords, maybe with a hierarchy
  • User Interface – Tag Cloud • Useful for exploring if you don’t know what you want • More structured alternative needed for specific research
  • System Architecture
  • A Note on Python • If you don’t use it already, you will! – ArcGIS 9.4 • “Python is now integrated directly into ArcMap [9.4]. I say it every year, but if you are an ArcGIS Desktop user, you need to take a close look at python as your scripting language.” - James Fee • The best thing about Python is: – Tidy scripts!
  • Servers (diagram labels: Server Room: dev, tba, tiler1, tiler2, tiler3, blog, pois, tiles1, tiles2, tiles3, www; Web browsers)
  • System Architecture – Website (diagram labels: Web browser, Apache (www))
  • System Architecture – Context (diagram labels: Web browser, Apache (www), Apache (tiles2); decision “Tile exists?” with Yes: Tile, No: 404) • No Python involved – Less strain on the server • Web browser may have to request image twice – Slow for the client
  • System Architecture – Context (diagram labels: Web browser, Apache (www), Apache (tiler2) with mod_python, Python, renderer.py, gen_tile.py, XML, Cache)
  • System Architecture – Choropleth (diagram labels: Web browser, Apache (www), Apache (tiler3) with mod_python, Python, Colorbrewer, renderer.py, gen_tile.py, Cache; decision “Tile exists?” with Yes / No; Tile (low zoom))
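The "Tile exists?" decision in the architecture diagrams can be sketched as a cache-or-render function. This is a guess at the flow rather than the actual `gen_tile.py`: the cache layout, the placeholder renderer and the reading of "(low zoom)" as "only cache tiles up to some zoom threshold" are all assumptions.

```python
import os
import tempfile


def render_tile(zoom, x, y):
    """Placeholder renderer; returns dummy bytes instead of a PNG."""
    return f"tile {zoom}/{x}/{y}".encode()


def get_tile(cache_dir, zoom, x, y, max_cached_zoom=15):
    """Return (tile_bytes, served_from_cache)."""
    path = os.path.join(cache_dir, str(zoom), str(x), f"{y}.png")
    if os.path.exists(path):
        # Tile exists: serve it straight from the cache.
        with open(path, "rb") as handle:
            return handle.read(), True
    # Tile missing: render it now.
    data = render_tile(zoom, x, y)
    if zoom <= max_cached_zoom:
        # Assumed policy: only lower zoom levels are written to the cache.
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, "wb") as handle:
            handle.write(data)
    return data, False


cache_dir = tempfile.mkdtemp()
first = get_tile(cache_dir, 10, 500, 300)   # rendered, then cached
second = get_tile(cache_dir, 10, 500, 300)  # served from cache
high = get_tile(cache_dir, 17, 1, 1)        # rendered, not cached
high_again = get_tile(cache_dir, 17, 1, 1)  # rendered again
```

Keeping the check and the render in one request avoids the double request of the plain-Apache 404 approach, at the cost of running Python for every tile.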
  • Scalability • OpenLayers allows multiple servers to be specified for retrieving image tiles • Different servers for different tasks • Random server chosen per tile • So should scale? • Process is still processor-intensive if generating the tiles at the same time • Stress testing needed
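The per-tile server choice happens inside OpenLayers in the browser; the Python sketch below only mirrors the idea of spreading tile requests across several hosts. The host names and URL layout are hypothetical.

```python
import random

# Hypothetical tile hosts, named after the servers in the diagram.
TILE_SERVERS = [
    "http://tiles1.example.org",
    "http://tiles2.example.org",
    "http://tiles3.example.org",
]


def tile_request_url(zoom, x, y, servers=TILE_SERVERS, rng=random):
    """Pick a server at random for this tile and build its URL."""
    server = rng.choice(servers)
    return f"{server}/{zoom}/{x}/{y}.png"


# Spread a small grid of tile requests across the servers.
rng = random.Random(0)
urls = [tile_request_url(12, x, y, rng=rng) for x in range(4) for y in range(4)]
```

A deterministic choice, for example based on the tile coordinates, would be a possible variation if the same tile should always come from the same host so that browser caching stays effective.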
  • Prototype: Current State & Next Steps • Done: ✓ On-demand tile generation ✓ Fast (enough) ✓ Will scale (hopefully!) ✓ OSM not quite “complete” but getting there ✓ Context layer finished ✓ Some data added • To do: × Legend × Automated data updates × Tag cloud × Improve cartography × Internet Explorer 6 (www.savethedevelopers.org) × Other census & ONS data × Interactive data combination × Scotland & Northern Ireland × Points of Interest • Running until June 2010
  • Live Demo • http://www.censusprofiler.org/prototype/
  • Google Earth
  • Q&A • www.censusprofiler.org • www.oliverobrien.co.uk
    Credits: Google Street View and Google Earth POI data is Copyright Google. Google Maps mapping data is Copyright Tele Atlas. Google aerial imagery is Copyright Digital Globe, Infoterra Ltd, Bluesky, GeoEye, Getmapping plc, The Geoinformation Group. OpenStreetMap data is CC-BY-SA OpenStreetMap and contributors. Logos depicted are generally Copyright of their respective organisations. Some image tiles include boundary information supplied by EDINA’s UKBORDERS service. The Census data is supplied by the Office for National Statistics. The Word Cloud was produced with Wordle. The colour wheel diagrams are from worqx.com. The Painter’s Algorithm picture and the HSL colour diagram are from Wikipedia. The cartogram was produced by James Cheshire. The corresponding choropleth was produced by the BBC.
    References (used in the first part of this presentation):
    – CIBER (2008) Information behaviour of the researcher of the future. A report commissioned by The British Library and JISC, 11 January 2008. http://www.bl.uk/news/pdf/googlegen.pdf
    – Goodchild (2007) Citizens as Sensors: The World of Volunteered Geography. Workshop on Volunteered Geographic Information, Santa Barbara, CA, December 13-14, 2007. http://www.ncgia.ucsb.edu/projects/vgi/docs/position/Goodchild_VGI2007.pdf
    – O’Reilly, T. (2005) What Is Web 2.0: Design Patterns and Business Models for the Next Generation of Software. http://www.oreillynet.com/pub/a/oreilly/tim/news/2005/09/30/whatisWeb20.html
    – Turner, A. (2007) Introduction to Neogeography. O’Reilly Media Short Cuts.