in	  conjunc(on	  with	  Data Management & Warehousing   http://www.datamgmt.com
What	  is	  the	  Spa(al	  Module?	  •  It’s	  the	  ability	  to	  analyse	  informa(on	  in	  a	     geographic	  contex...
The	  three	  types	  of	  data	  &	  many	  ques(ons	  •  Points	                                                        ...
Using	  Spa(al	  Data	  Is	  Complex	  •  Different	  distances	     between	  points	  at	     different	  longitudes	  and...
Sources	  of	  Informa(on	  –	  GPS	  •  In	  Car	  Device	            –  Sends	  frequent	  data	  sets	  to	            ...
Sources	  of	  Informa(on	  –	  Ordnance	  Survey	  •  Integrated	  Road	  Network:	     A	  series	  of	  3	  million	   ...
Sources	  of	  Informa(on	  –	  Post	  Office/GAdm	  •  Postal	  Address	  File:	     A	  series	  of	  c.1.75M	  UK	     po...
Data	  Layers	  –	  Enriching	  what	  you	  have	  •  Data	  Layers	  are	  sets	  of	  informa(on	  (ed	  to	  a	     ge...
Data	  Layer	  Sources	  (1)	  •  Ordnance	  Survey	            –  Road	  Types,	  Limits,	  Closures,	  etc.	  •  Governm...
Data	  Layer	  Sources	  (2)	  •  Wikipedia	            –  Geo-­‐tag	  Access	  API	  –	  what’s	  nearby?	  •  Google	  M...
Issues	  with	  Geo-­‐tagged	  data	  •  Geo-­‐tagging	  uses	  different	  formats	            –  Longitude	  &	  La(tude,...
Our	  Model	  For	  Delivering	  Spa(al	  Data	               Source	                   1.       Load	  Mul(ple	  File	  F...
Netezza	  Spa(al	  Value	  Add	  •  Netezza	  Spa(al	  is	  fast	                                            •  Netezza	  ...
Netezza	  Spa(al	  Limita(ons	  •  Fails	  the	  Slar(barpast	  Test:	            –  Polygons	  for	  very	  detailed	  ma...
Current	  Uses	  …	  •        M/A/B	  road	  driving	  profiles	  •        Time	  of	  day	  driving	  profiles	  •        S...
in	  conjunc(on	  with	  Data Management & Warehousing   http://www.datamgmt.com
Upcoming SlideShare
Loading in...5
×

Implementing Netezza Spatial

1,847

Published on

Published in: Technology, Business
0 Comments
1 Like
Statistics
Notes
  • Be the first to comment

No Downloads
Views
Total Views
1,847
On Slideshare
0
From Embeds
0
Number of Embeds
0
Actions
Shares
0
Downloads
64
Comments
0
Likes
1
Embeds 0
No embeds

No notes for slide

Implementing Netezza Spatial

  1. 1. in  conjunc(on  with  Data Management & Warehousing http://www.datamgmt.com
  2. 2. What  is  the  Spa(al  Module?  •  It’s  the  ability  to  analyse  informa(on  in  a   geographic  context:   –  Where  is  the  nearest  petrol  sta(on?   –  Which  road  am  I  on?   –  How  many  ATMs  are  in  this  area?  •  It’s  not  maps  and  images   –  These  come  later  with  tools  that  help  present  the   informa(on  Wednesday,  July  28,  2010   ©  2010  Data  Management  &  Warehousing   2  
  3. 3. The  three  types  of  data  &  many  ques(ons  •  Points   •  How  close  are  two   –  OS  Grid   points?   –  La(tude  &  Longitude     •  Does  a  point  touch  a  •  Lines   line?   –  Pairs  of  points   •  Is  a  point  inside  or   –  e.g.  Road  Segments   outside  a  polygon?  •  Polygons   •  Does  a  line  cross  a   –  A  series  of  points  that   polygon?   define  a  boundary   •  How  many  points  are  in   –  e.g.  Postcode  Boundaries   a  polygon?  Wednesday,  July  28,  2010   ©  2010  Data  Management  &  Warehousing   3  
  4. 4. Using  Spa(al  Data  Is  Complex  •  Different  distances   between  points  at   different  longitudes  and   la(tudes  •  Measurement  over  a   curved  irregular  surface  •  Mul(ple  input  and  output   formats  •  Mul(ple  co-­‐ordinate   systems  see: A  Guide  to  Coordinate   Systems  in  Great  Britain    Wednesday,  July  28,  2010   ©  2010  Data  Management  &  Warehousing   4  
  5. 5. Sources  of  Informa(on  –  GPS  •  In  Car  Device   –  Sends  frequent  data  sets  to   processing  centre   –  Point  Data   •  Speed,  Direc(on,     Loca(on  and  G-­‐force   –  Aggregate  Data   •  Speed  and  Direc(on  •  Other  Devices   –  Sat  Nav  Systems   –  Smart  Phone  Apps     e.g.  ‘GPS  Tracker’   –  Cameras  Wednesday,  July  28,  2010   ©  2010  Data  Management  &  Warehousing   5  
  6. 6. Sources  of  Informa(on  –  Ordnance  Survey  •  Integrated  Road  Network:   A  series  of  3  million   ‘linestrings’  and  17  million   points  that  describe  every   road  in  the  UK  •  Linestrings  have  between  2   and  655  points,  most  have   less  than  10  •  23  points  for  this  picture      Wednesday,  July  28,  2010   ©  2010  Data  Management  &  Warehousing   6  
  7. 7. Sources  of  Informa(on  –  Post  Office/GAdm  •  Postal  Address  File:   A  series  of  c.1.75M  UK   postcodes   –  Postcode  Boundaries     –  Over  28M  complete   addresses  •  Global  Admin  Boundaries   –  Na(onal  and  regional   boundaries  for  c.245   countries   –  hgp://www.gadm.org    Wednesday,  July  28,  2010   ©  2010  Data  Management  &  Warehousing   7  
  8. 8. Data  Layers  –  Enriching  what  you  have  •  Data  Layers  are  sets  of  informa(on  (ed  to  a   geographic  point   –  Road  Speed  for  a  given  road  segment   –  ATM  Loca(on   –  House  Price  for  a  postcode  •  Where  data  has  loca(on  informa(on  it  is   known  as  ‘Geo-­‐tagged’  Wednesday,  July  28,  2010   ©  2010  Data  Management  &  Warehousing   8  
  9. 9. Data  Layer  Sources  (1)  •  Ordnance  Survey   –  Road  Types,  Limits,  Closures,  etc.  •  Government   –  UK  Government  now  providing  masses  of     geo-­‐tagged  info  (hgp://data.gov.uk)  •  Met  Office  /  HM  Nau(cal  Almanac  Office     –  Weather,  Daylight  to  Postcode  Level  Wednesday,  July  28,  2010   ©  2010  Data  Management  &  Warehousing   9  
  10. 10. Data  Layer  Sources  (2)  •  Wikipedia   –  Geo-­‐tag  Access  API  –  what’s  nearby?  •  Google  Maps   –  Road  level  photographic  images  •  Commercial  Sources   –  Fast  Food  Outlets,  Supermarkets,  Petrol  Sta(ons,  ATMs,   etc.  •  Massive  growth  in  both  commercial  and  public  domain   geo-­‐tagged  data  Wednesday,  July  28,  2010   ©  2010  Data  Management  &  Warehousing   10  
  11. 11. Issues  with  Geo-­‐tagged  data  •  Geo-­‐tagging  uses  different  formats   –  Longitude  &  La(tude,  OS  Grid  Reference,  etc  •  Geo-­‐tagging  at  different  levels   –  Data  for  a  postcode  or  a  an  en(re  county  which  makes   it  difficult  to  compare  •  Geo-­‐tagging  coverage  is  patchy  and/or  historic   –  Rate  of  change  of  fine  detail  data  is  very  high     –  e.g.  OS  issues  monthly  updates  to  the  UK  mapping  •  Mul(ple  standards  and  formats   –  XML  &  CSV,  different  file  formats,  etc.    Wednesday,  July  28,  2010   ©  2010  Data  Management  &  Warehousing   11  
  12. 12. Our  Model  For  Delivering  Spa(al  Data   Source   1.  Load  Mul(ple  File  Formats   Netezza   2.  Standardise  Geo-­‐Tagging   3.  Extract  &  Load  CSVs    (Proximity,  Contains,  Excludes)   (Tableau,  Google  Maps,  etc.)   Query  &  Presenta(on  Tools   Source   4.  Perform  Spa(al  Analysis   (Sets  of  data  with  spa(al   Spa(al  Presenta(on   5.  Create  User  Access  Area   Spa(al  Analysis   agributes)   Source   Source   (Small)   1   Postgres   3   Database   Source   2   Source   4   5  Wednesday,  July  28,  2010   ©  2010  Data  Management  &  Warehousing   12  
  13. 13. Netezza  Spa(al  Value  Add  •  Netezza  Spa(al  is  fast   •  Netezza  Spa(al  is  easy   –  Analysis   –  Distance  and  proximity   •  Look  up  a  typical  18  point   calcula(ons  are  simple   trip  in  the  3M  linestrings  to   –  ‘Touches’,  ‘Overlaps’  &   find  the  roads  that  the   ‘Contains’  queries  allow   vehicle  was  on  in  less  than   1  second   instant  value  add     •  Overnight  batch  process  of   300,000  points  to  matching   •  Netezza  Spa(al  integrates   road  names  in  under  30   minutes   –  Works  well  with  Tableau   –  Presenta(on   –  Easy  to  generate  KML  for   •  Tools  rely  on  fast  query   use  with  Google  Earth  and   access  to  render  any   Google  Maps   queried  map  with  sub-­‐ second  response  (mes  Wednesday,  July  28,  2010   ©  2010  Data  Management  &  Warehousing   13  
  14. 14. Netezza  Spa(al  Limita(ons  •  Fails  the  Slar(barpast  Test:   –  Polygons  for  very  detailed  maps   are  too  big  to  be  loaded  as   Netezza  limits  the  maximum   block  size  to  64000  characters   Norway   –  Named  aqer  the  Hitch-­‐Hikers   Guide  to  the  Galaxy  coastline   designer  responsible  for  the   twiddly  bits  around  the   Norwegian  rords  •  Work-­‐around:   –  Use  regional  boundaries  (e.g.   UK  Coun(es,  US  States,  etc.)   and  then  aggregate  into   na(onal  boundaries   –  If  a  point  is  in  Berkshire  then  by   Slar(barpast   defini(on  it  is  also  in  England  Wednesday,  July  28,  2010   ©  2010  Data  Management  &  Warehousing   Page  14  
  15. 15. Current  Uses  …  •  M/A/B  road  driving  profiles  •  Time  of  day  driving  profiles  •  Speed  Limits  vs.  Driven  Speed  •  Matching  GPS  posi(ons  to  road  names  •  Out  of  bounds  driving  •  Customer  Demographic  Profiles    …  but  this  is  only  the  start  in  a  very  short  (me  Wednesday,  July  28,  2010   ©  2010  Data  Management  &  Warehousing   15  
  16. 16. in  conjunc(on  with  Data Management & Warehousing http://www.datamgmt.com
  1. A particular slide catching your eye?

    Clipping is a handy way to collect important slides you want to go back to later.

×