SQL Server 2008 Spatial Analysis


Presented at December 2010 IndyPASS


  1. SQL Server 2008 Spatial Analysis
     Dan Crawford
     Integrated Network Strategies
     dcrawford@insindy.com
     http://www.insindy.com
  2. What is spatial data?
     Geometric
     Represents data on a 2D plane, similar to graph paper in high school. Units are user-defined and could be inches, miles, pixels, etc.
  3. What is spatial data?
     Geographic
     Represents data points using angles of latitude and longitude. Latitude measures degrees north/south of the Equator, and longitude measures degrees east/west of the Prime Meridian.
  4. System Requirements
     SQL Server 2008 Express or higher (R2 recommended in order to use maps in SSRS)
     Dev tools:
       Visual Studio 2005, 2008, or 2010
       SQL Server Management Studio 2008
     Now supported on SQL Azure
  5. Uses of spatial data
     Used by central cancer registries for statistical analysis with other geography-specific data sources, such as census data
     Integrated route mapping with MapPoint, Google Maps, etc.
     Geographical business intelligence analytics
  6. Geometry data type
     The geometry data type stores points, lines, polygons, and collections of geometric objects
     Values are represented using WKT (Well-Known Text), WKB (Well-Known Binary), or GML (Geography Markup Language)
     WKT seems to be the most common
  7. WKT Markup
     POINT(x y)
     LINESTRING(x1 y1,x2 y2)
     POLYGON((x1 y1,x2 y2,x3 y3,x4 y4,x1 y1))
     GEOMETRYCOLLECTION(Geo1, Geo2, …)
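The WKT forms on this slide can be produced from plain coordinate pairs with a few small helpers. The following is only a sketch: the function names (`wkt_point`, `wkt_linestring`, `wkt_polygon`) are hypothetical and not part of any SQL Server tooling, and the polygon helper assumes a single outer ring with no holes.

```python
from typing import List, Sequence, Tuple

Coord = Tuple[float, float]  # (x, y) pair


def _pair(c: Coord) -> str:
    # WKT separates the two ordinates of one coordinate with a space.
    return f"{c[0]} {c[1]}"


def wkt_point(x: float, y: float) -> str:
    return f"POINT({_pair((x, y))})"


def wkt_linestring(coords: Sequence[Coord]) -> str:
    if len(coords) < 2:
        raise ValueError("a LINESTRING needs at least two coordinates")
    return "LINESTRING(" + ", ".join(_pair(c) for c in coords) + ")"


def wkt_polygon(ring: Sequence[Coord]) -> str:
    # A polygon ring must be closed: the last coordinate repeats the first.
    closed: List[Coord] = list(ring)
    if closed and closed[0] != closed[-1]:
        closed.append(closed[0])
    if len(closed) < 4:
        raise ValueError("a POLYGON ring needs at least three distinct corners")
    return "POLYGON((" + ", ".join(_pair(c) for c in closed) + "))"
```

The resulting strings are the kind of text that can be handed to a WKT parser, for example as the text argument of `STGeomFromText` in SQL Server.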
  8. Spatial Expressions
  9. More Spatial Expressions
  10. Geocoding
      The geography data type does not directly understand mailing address data
      Mailing addresses must be converted to latitude/longitude coordinates
      Geocoding = conversion of geographic data, like an address or zip code, to geographic coordinates
      Options: MapPoint/Bing Map Services, Google Maps API, many others
  11. Rendering Options
      SQL Server Management Studio 2008 (very basic, for query testing)
      VirtualEarth
      Google Maps or similar
      Third-party mapping component (e.g. Dundas)
      SSRS/Report Builder in R2
  12. Spatial Indexing
      Images from Microsoft TechNet
  13. Spatial Indexing
      CREATE SPATIAL INDEX SPATIAL_Hospitals ON dbo.Hospitals(LocationGeography)
      USING GEOGRAPHY_GRID
      WITH (GRIDS = (LEVEL_1 = MEDIUM,
                     LEVEL_2 = MEDIUM,
                     LEVEL_3 = MEDIUM,
                     LEVEL_4 = MEDIUM),
            CELLS_PER_OBJECT = 16, STATISTICS_NORECOMPUTE = OFF,
            ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)
  14. Spatial Indexing - Utilization
      -- @P is the geography point to search around; @eps is presumably the search radius in miles (1609.344 meters per mile)
      SELECT *
      FROM Hospitals WITH (INDEX(SPATIAL_Hospitals))
      WHERE LocationGeography.STIntersects(@P.STBuffer(@eps * 1609.344)) = 1
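The buffer-and-intersect query on this slide amounts to "return every row within a given radius of a point". A brute-force sketch of the same test outside the database is shown below. It assumes a spherical Earth and the haversine formula, so distances will differ slightly from SQL Server's ellipsoidal calculations. The names `haversine_m` and `within_miles` are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

LatLon = Tuple[float, float]  # (latitude, longitude) in degrees

EARTH_RADIUS_M = 6_371_008.8  # mean Earth radius; spherical approximation
METERS_PER_MILE = 1609.344    # same conversion factor the slide's query uses


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in meters between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    # min() guards against a tiny floating-point overshoot above 1.0.
    return 2 * EARTH_RADIUS_M * math.asin(min(1.0, math.sqrt(a)))


def within_miles(center: LatLon, points: Sequence[LatLon], miles: float) -> List[LatLon]:
    """Return the points whose distance from center is at most the given miles."""
    limit_m = miles * METERS_PER_MILE
    return [p for p in points
            if haversine_m(center[0], center[1], p[0], p[1]) <= limit_m]
```

This scan compares every point against the center, which is exactly the work a spatial index is meant to avoid on large tables.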
  15. Goal of Geographic Analysis
      “I want SQL Server to tell me when there are clusters of geographic data points and where they are located.”
      - Dan Crawford, 2010
  16. It’s easy to see points on a map with SQL Server
  17. Why use cluster analysis?
      Analysis of injury severity and hospital resource use in a regional health care system
      Customer purchasing patterns
      Choosing a business or advertising location
      Crime analysis
      Easy visualization for dashboards
  18. What is a geographic cluster?
      For our purposes, a cluster is a group of a significant number of data points that are geographically close to each other.
      There are two variables:
        • The number of data points required for a group to be considered a cluster
        • The distance that defines being “geographically close”
  19. What we want…
  20. Or better yet…
  21. DBSCAN
      DBSCAN(D, eps, MinPts)
         C = 0
         for each unvisited point P in dataset D
            mark P as visited
            N = getNeighbors(P, eps)
            if sizeof(N) < MinPts
               mark P as NOISE
            else
               C = next cluster
               expandCluster(P, N, C, eps, MinPts)
      Source: http://en.wikipedia.org/wiki/DBSCAN
  22. DBSCAN (cont’d)
      expandCluster(P, N, C, eps, MinPts)
         add P to cluster C
         for each point P' in N
            if P' is not visited
               mark P' as visited
               N' = getNeighbors(P', eps)
               if sizeof(N') >= MinPts
                  N = N joined with N'
            if P' is not yet member of any cluster
               add P' to cluster C
      Source: http://en.wikipedia.org/wiki/DBSCAN
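The DBSCAN and expandCluster pseudocode on the two slides above can be sketched in runnable form as follows. This is an illustrative in-memory version, not the presenter's SQL Server implementation. It assumes planar points with Euclidean distance by default, a brute-force neighbor search, and a neighbor count that includes the point itself. The names `dbscan`, `euclidean`, and `NOISE` are chosen here for illustration.

```python
import math
from typing import Callable, List, Optional, Sequence, Tuple

Point = Tuple[float, float]
NOISE = -1  # label for points that belong to no cluster


def euclidean(a: Point, b: Point) -> float:
    return math.hypot(a[0] - b[0], a[1] - b[1])


def dbscan(points: Sequence[Point], eps: float, min_pts: int,
           dist: Callable[[Point, Point], float] = euclidean) -> List[int]:
    """Return one label per point: a cluster id (1, 2, ...) or NOISE."""
    n = len(points)
    labels: List[Optional[int]] = [None] * n  # None = not yet in any cluster
    visited = [False] * n

    def neighbors(i: int) -> List[int]:
        # Brute-force getNeighbors: every point within eps, including i itself.
        return [j for j in range(n) if dist(points[i], points[j]) <= eps]

    cluster = 0
    for i in range(n):
        if visited[i]:
            continue
        visited[i] = True
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may still be claimed later as a border point
            continue
        cluster += 1           # "C = next cluster"
        labels[i] = cluster
        # expandCluster: grow the seed list while walking through it.
        queued = set(seeds)
        k = 0
        while k < len(seeds):
            j = seeds[k]
            k += 1
            if not visited[j]:
                visited[j] = True
                more = neighbors(j)
                if len(more) >= min_pts:  # j is a core point: join its neighbors
                    for m in more:
                        if m not in queued:
                            queued.add(m)
                            seeds.append(m)
            if labels[j] is None or labels[j] == NOISE:
                labels[j] = cluster
    return [NOISE if lab is None else lab for lab in labels]
```

For example, `dbscan([(0, 0), (0, 1), (1, 0), (1, 1), (50, 50)], eps=1.5, min_pts=3)` puts the first four points in cluster 1 and marks the last as noise. For latitude/longitude data, a great-circle distance function can be passed as `dist`, with `eps` expressed in the same unit that function returns.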
  23.
  24. To make life easier
      Report Builder 3.0
      SQL Server Spatial Tools – http://sqlspatialtools.codeplex.com