Implementing Netezza Spatial
Upcoming SlideShare
Loading in...5
×
 

Implementing Netezza Spatial

on

  • 2,106 views

 

Statistics

Views

Total Views
2,106
Views on SlideShare
2,106
Embed Views
0

Actions

Likes
0
Downloads
53
Comments
0

0 Embeds 0

No embeds

Accessibility

Categories

Upload Details

Uploaded via as Adobe PDF

Usage Rights

© All Rights Reserved

Report content

Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

Cancel
  • Full Name Full Name Comment goes here.
    Are you sure you want to
    Your message goes here
    Processing…
Post Comment
Edit your comment

Implementing Netezza Spatial Implementing Netezza Spatial Presentation Transcript

  • in  conjunc(on  with  Data Management & Warehousing http://www.datamgmt.com
  • What  is  the  Spa(al  Module?  •  It’s  the  ability  to  analyse  informa(on  in  a   geographic  context:   –  Where  is  the  nearest  petrol  sta(on?   –  Which  road  am  I  on?   –  How  many  ATMs  are  in  this  area?  •  It’s  not  maps  and  images   –  These  come  later  with  tools  that  help  present  the   informa(on  Wednesday,  July  28,  2010   ©  2010  Data  Management  &  Warehousing   2  
  • The  three  types  of  data  &  many  ques(ons  •  Points   •  How  close  are  two   –  OS  Grid   points?   –  La(tude  &  Longitude     •  Does  a  point  touch  a  •  Lines   line?   –  Pairs  of  points   •  Is  a  point  inside  or   –  e.g.  Road  Segments   outside  a  polygon?  •  Polygons   •  Does  a  line  cross  a   –  A  series  of  points  that   polygon?   define  a  boundary   •  How  many  points  are  in   –  e.g.  Postcode  Boundaries   a  polygon?  Wednesday,  July  28,  2010   ©  2010  Data  Management  &  Warehousing   3   View slide
  • Using  Spa(al  Data  Is  Complex  •  Different  distances   between  points  at   different  longitudes  and   la(tudes  •  Measurement  over  a   curved  irregular  surface  •  Mul(ple  input  and  output   formats  •  Mul(ple  co-­‐ordinate   systems  see: A  Guide  to  Coordinate   Systems  in  Great  Britain    Wednesday,  July  28,  2010   ©  2010  Data  Management  &  Warehousing   4   View slide
  • Sources  of  Informa(on  –  GPS  •  In  Car  Device   –  Sends  frequent  data  sets  to   processing  centre   –  Point  Data   •  Speed,  Direc(on,     Loca(on  and  G-­‐force   –  Aggregate  Data   •  Speed  and  Direc(on  •  Other  Devices   –  Sat  Nav  Systems   –  Smart  Phone  Apps     e.g.  ‘GPS  Tracker’   –  Cameras  Wednesday,  July  28,  2010   ©  2010  Data  Management  &  Warehousing   5  
  • Sources  of  Informa(on  –  Ordnance  Survey  •  Integrated  Road  Network:   A  series  of  3  million   ‘linestrings’  and  17  million   points  that  describe  every   road  in  the  UK  •  Linestrings  have  between  2   and  655  points,  most  have   less  than  10  •  23  points  for  this  picture      Wednesday,  July  28,  2010   ©  2010  Data  Management  &  Warehousing   6  
  • Sources  of  Informa(on  –  Post  Office/GAdm  •  Postal  Address  File:   A  series  of  c.1.75M  UK   postcodes   –  Postcode  Boundaries     –  Over  28M  complete   addresses  •  Global  Admin  Boundaries   –  Na(onal  and  regional   boundaries  for  c.245   countries   –  hgp://www.gadm.org    Wednesday,  July  28,  2010   ©  2010  Data  Management  &  Warehousing   7  
  • Data  Layers  –  Enriching  what  you  have  •  Data  Layers  are  sets  of  informa(on  (ed  to  a   geographic  point   –  Road  Speed  for  a  given  road  segment   –  ATM  Loca(on   –  House  Price  for  a  postcode  •  Where  data  has  loca(on  informa(on  it  is   known  as  ‘Geo-­‐tagged’  Wednesday,  July  28,  2010   ©  2010  Data  Management  &  Warehousing   8  
  • Data  Layer  Sources  (1)  •  Ordnance  Survey   –  Road  Types,  Limits,  Closures,  etc.  •  Government   –  UK  Government  now  providing  masses  of     geo-­‐tagged  info  (hgp://data.gov.uk)  •  Met  Office  /  HM  Nau(cal  Almanac  Office     –  Weather,  Daylight  to  Postcode  Level  Wednesday,  July  28,  2010   ©  2010  Data  Management  &  Warehousing   9  
  • Data  Layer  Sources  (2)  •  Wikipedia   –  Geo-­‐tag  Access  API  –  what’s  nearby?  •  Google  Maps   –  Road  level  photographic  images  •  Commercial  Sources   –  Fast  Food  Outlets,  Supermarkets,  Petrol  Sta(ons,  ATMs,   etc.  •  Massive  growth  in  both  commercial  and  public  domain   geo-­‐tagged  data  Wednesday,  July  28,  2010   ©  2010  Data  Management  &  Warehousing   10  
  • Issues  with  Geo-­‐tagged  data  •  Geo-­‐tagging  uses  different  formats   –  Longitude  &  La(tude,  OS  Grid  Reference,  etc  •  Geo-­‐tagging  at  different  levels   –  Data  for  a  postcode  or  a  an  en(re  county  which  makes   it  difficult  to  compare  •  Geo-­‐tagging  coverage  is  patchy  and/or  historic   –  Rate  of  change  of  fine  detail  data  is  very  high     –  e.g.  OS  issues  monthly  updates  to  the  UK  mapping  •  Mul(ple  standards  and  formats   –  XML  &  CSV,  different  file  formats,  etc.    Wednesday,  July  28,  2010   ©  2010  Data  Management  &  Warehousing   11  
  • Our  Model  For  Delivering  Spa(al  Data   Source   1.  Load  Mul(ple  File  Formats   Netezza   2.  Standardise  Geo-­‐Tagging   3.  Extract  &  Load  CSVs    (Proximity,  Contains,  Excludes)   (Tableau,  Google  Maps,  etc.)   Query  &  Presenta(on  Tools   Source   4.  Perform  Spa(al  Analysis   (Sets  of  data  with  spa(al   Spa(al  Presenta(on   5.  Create  User  Access  Area   Spa(al  Analysis   agributes)   Source   Source   (Small)   1   Postgres   3   Database   Source   2   Source   4   5  Wednesday,  July  28,  2010   ©  2010  Data  Management  &  Warehousing   12  
  • Netezza  Spa(al  Value  Add  •  Netezza  Spa(al  is  fast   •  Netezza  Spa(al  is  easy   –  Analysis   –  Distance  and  proximity   •  Look  up  a  typical  18  point   calcula(ons  are  simple   trip  in  the  3M  linestrings  to   –  ‘Touches’,  ‘Overlaps’  &   find  the  roads  that  the   ‘Contains’  queries  allow   vehicle  was  on  in  less  than   1  second   instant  value  add     •  Overnight  batch  process  of   300,000  points  to  matching   •  Netezza  Spa(al  integrates   road  names  in  under  30   minutes   –  Works  well  with  Tableau   –  Presenta(on   –  Easy  to  generate  KML  for   •  Tools  rely  on  fast  query   use  with  Google  Earth  and   access  to  render  any   Google  Maps   queried  map  with  sub-­‐ second  response  (mes  Wednesday,  July  28,  2010   ©  2010  Data  Management  &  Warehousing   13  
  • Netezza  Spa(al  Limita(ons  •  Fails  the  Slar(barpast  Test:   –  Polygons  for  very  detailed  maps   are  too  big  to  be  loaded  as   Netezza  limits  the  maximum   block  size  to  64000  characters   Norway   –  Named  aqer  the  Hitch-­‐Hikers   Guide  to  the  Galaxy  coastline   designer  responsible  for  the   twiddly  bits  around  the   Norwegian  rords  •  Work-­‐around:   –  Use  regional  boundaries  (e.g.   UK  Coun(es,  US  States,  etc.)   and  then  aggregate  into   na(onal  boundaries   –  If  a  point  is  in  Berkshire  then  by   Slar(barpast   defini(on  it  is  also  in  England  Wednesday,  July  28,  2010   ©  2010  Data  Management  &  Warehousing   Page  14  
  • Current  Uses  …  •  M/A/B  road  driving  profiles  •  Time  of  day  driving  profiles  •  Speed  Limits  vs.  Driven  Speed  •  Matching  GPS  posi(ons  to  road  names  •  Out  of  bounds  driving  •  Customer  Demographic  Profiles    …  but  this  is  only  the  start  in  a  very  short  (me  Wednesday,  July  28,  2010   ©  2010  Data  Management  &  Warehousing   15  
  • in  conjunc(on  with  Data Management & Warehousing http://www.datamgmt.com