Implementing Netezza Spatial
Upcoming SlideShare
Loading in...5
×
 

Like this? Share it with your network

Share

Implementing Netezza Spatial

on

  • 2,157 views

 

Statistics

Views

Total Views
2,157
Views on SlideShare
2,157
Embed Views
0

Actions

Likes
0
Downloads
53
Comments
0

0 Embeds 0

No embeds

Accessibility

Categories

Upload Details

Uploaded via as Adobe PDF

Usage Rights

© All Rights Reserved

Report content

Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

Cancel
  • Full Name Full Name Comment goes here.
    Are you sure you want to
    Your message goes here
    Processing…
Post Comment
Edit your comment

Implementing Netezza Spatial Presentation Transcript

  • 1. in  conjunc(on  with  Data Management & Warehousing http://www.datamgmt.com
  • 2. What  is  the  Spa(al  Module?  •  It’s  the  ability  to  analyse  informa(on  in  a   geographic  context:   –  Where  is  the  nearest  petrol  sta(on?   –  Which  road  am  I  on?   –  How  many  ATMs  are  in  this  area?  •  It’s  not  maps  and  images   –  These  come  later  with  tools  that  help  present  the   informa(on  Wednesday,  July  28,  2010   ©  2010  Data  Management  &  Warehousing   2  
  • 3. The  three  types  of  data  &  many  ques(ons  •  Points   •  How  close  are  two   –  OS  Grid   points?   –  La(tude  &  Longitude     •  Does  a  point  touch  a  •  Lines   line?   –  Pairs  of  points   •  Is  a  point  inside  or   –  e.g.  Road  Segments   outside  a  polygon?  •  Polygons   •  Does  a  line  cross  a   –  A  series  of  points  that   polygon?   define  a  boundary   •  How  many  points  are  in   –  e.g.  Postcode  Boundaries   a  polygon?  Wednesday,  July  28,  2010   ©  2010  Data  Management  &  Warehousing   3  
  • 4. Using  Spa(al  Data  Is  Complex  •  Different  distances   between  points  at   different  longitudes  and   la(tudes  •  Measurement  over  a   curved  irregular  surface  •  Mul(ple  input  and  output   formats  •  Mul(ple  co-­‐ordinate   systems  see: A  Guide  to  Coordinate   Systems  in  Great  Britain    Wednesday,  July  28,  2010   ©  2010  Data  Management  &  Warehousing   4  
  • 5. Sources  of  Informa(on  –  GPS  •  In  Car  Device   –  Sends  frequent  data  sets  to   processing  centre   –  Point  Data   •  Speed,  Direc(on,     Loca(on  and  G-­‐force   –  Aggregate  Data   •  Speed  and  Direc(on  •  Other  Devices   –  Sat  Nav  Systems   –  Smart  Phone  Apps     e.g.  ‘GPS  Tracker’   –  Cameras  Wednesday,  July  28,  2010   ©  2010  Data  Management  &  Warehousing   5  
  • 6. Sources  of  Informa(on  –  Ordnance  Survey  •  Integrated  Road  Network:   A  series  of  3  million   ‘linestrings’  and  17  million   points  that  describe  every   road  in  the  UK  •  Linestrings  have  between  2   and  655  points,  most  have   less  than  10  •  23  points  for  this  picture      Wednesday,  July  28,  2010   ©  2010  Data  Management  &  Warehousing   6  
  • 7. Sources  of  Informa(on  –  Post  Office/GAdm  •  Postal  Address  File:   A  series  of  c.1.75M  UK   postcodes   –  Postcode  Boundaries     –  Over  28M  complete   addresses  •  Global  Admin  Boundaries   –  Na(onal  and  regional   boundaries  for  c.245   countries   –  hgp://www.gadm.org    Wednesday,  July  28,  2010   ©  2010  Data  Management  &  Warehousing   7  
  • 8. Data  Layers  –  Enriching  what  you  have  •  Data  Layers  are  sets  of  informa(on  (ed  to  a   geographic  point   –  Road  Speed  for  a  given  road  segment   –  ATM  Loca(on   –  House  Price  for  a  postcode  •  Where  data  has  loca(on  informa(on  it  is   known  as  ‘Geo-­‐tagged’  Wednesday,  July  28,  2010   ©  2010  Data  Management  &  Warehousing   8  
  • 9. Data  Layer  Sources  (1)  •  Ordnance  Survey   –  Road  Types,  Limits,  Closures,  etc.  •  Government   –  UK  Government  now  providing  masses  of     geo-­‐tagged  info  (hgp://data.gov.uk)  •  Met  Office  /  HM  Nau(cal  Almanac  Office     –  Weather,  Daylight  to  Postcode  Level  Wednesday,  July  28,  2010   ©  2010  Data  Management  &  Warehousing   9  
  • 10. Data  Layer  Sources  (2)  •  Wikipedia   –  Geo-­‐tag  Access  API  –  what’s  nearby?  •  Google  Maps   –  Road  level  photographic  images  •  Commercial  Sources   –  Fast  Food  Outlets,  Supermarkets,  Petrol  Sta(ons,  ATMs,   etc.  •  Massive  growth  in  both  commercial  and  public  domain   geo-­‐tagged  data  Wednesday,  July  28,  2010   ©  2010  Data  Management  &  Warehousing   10  
  • 11. Issues  with  Geo-­‐tagged  data  •  Geo-­‐tagging  uses  different  formats   –  Longitude  &  La(tude,  OS  Grid  Reference,  etc  •  Geo-­‐tagging  at  different  levels   –  Data  for  a  postcode  or  a  an  en(re  county  which  makes   it  difficult  to  compare  •  Geo-­‐tagging  coverage  is  patchy  and/or  historic   –  Rate  of  change  of  fine  detail  data  is  very  high     –  e.g.  OS  issues  monthly  updates  to  the  UK  mapping  •  Mul(ple  standards  and  formats   –  XML  &  CSV,  different  file  formats,  etc.    Wednesday,  July  28,  2010   ©  2010  Data  Management  &  Warehousing   11  
  • 12. Our  Model  For  Delivering  Spa(al  Data   Source   1.  Load  Mul(ple  File  Formats   Netezza   2.  Standardise  Geo-­‐Tagging   3.  Extract  &  Load  CSVs    (Proximity,  Contains,  Excludes)   (Tableau,  Google  Maps,  etc.)   Query  &  Presenta(on  Tools   Source   4.  Perform  Spa(al  Analysis   (Sets  of  data  with  spa(al   Spa(al  Presenta(on   5.  Create  User  Access  Area   Spa(al  Analysis   agributes)   Source   Source   (Small)   1   Postgres   3   Database   Source   2   Source   4   5  Wednesday,  July  28,  2010   ©  2010  Data  Management  &  Warehousing   12  
  • 13. Netezza  Spa(al  Value  Add  •  Netezza  Spa(al  is  fast   •  Netezza  Spa(al  is  easy   –  Analysis   –  Distance  and  proximity   •  Look  up  a  typical  18  point   calcula(ons  are  simple   trip  in  the  3M  linestrings  to   –  ‘Touches’,  ‘Overlaps’  &   find  the  roads  that  the   ‘Contains’  queries  allow   vehicle  was  on  in  less  than   1  second   instant  value  add     •  Overnight  batch  process  of   300,000  points  to  matching   •  Netezza  Spa(al  integrates   road  names  in  under  30   minutes   –  Works  well  with  Tableau   –  Presenta(on   –  Easy  to  generate  KML  for   •  Tools  rely  on  fast  query   use  with  Google  Earth  and   access  to  render  any   Google  Maps   queried  map  with  sub-­‐ second  response  (mes  Wednesday,  July  28,  2010   ©  2010  Data  Management  &  Warehousing   13  
  • 14. Netezza  Spa(al  Limita(ons  •  Fails  the  Slar(barpast  Test:   –  Polygons  for  very  detailed  maps   are  too  big  to  be  loaded  as   Netezza  limits  the  maximum   block  size  to  64000  characters   Norway   –  Named  aqer  the  Hitch-­‐Hikers   Guide  to  the  Galaxy  coastline   designer  responsible  for  the   twiddly  bits  around  the   Norwegian  rords  •  Work-­‐around:   –  Use  regional  boundaries  (e.g.   UK  Coun(es,  US  States,  etc.)   and  then  aggregate  into   na(onal  boundaries   –  If  a  point  is  in  Berkshire  then  by   Slar(barpast   defini(on  it  is  also  in  England  Wednesday,  July  28,  2010   ©  2010  Data  Management  &  Warehousing   Page  14  
  • 15. Current  Uses  …  •  M/A/B  road  driving  profiles  •  Time  of  day  driving  profiles  •  Speed  Limits  vs.  Driven  Speed  •  Matching  GPS  posi(ons  to  road  names  •  Out  of  bounds  driving  •  Customer  Demographic  Profiles    …  but  this  is  only  the  start  in  a  very  short  (me  Wednesday,  July  28,  2010   ©  2010  Data  Management  &  Warehousing   15  
  • 16. in  conjunc(on  with  Data Management & Warehousing http://www.datamgmt.com