
SQL Server 2008 Spatial Analysis



Presented at December 2010 IndyPASS



  1. SQL Server 2008 Spatial Analysis
     Dan Crawford
     Integrated Network Strategies
  2. What is spatial data? Geometric
     Represents data in a 2D plane, similar to graph paper in high school. Units are user-defined and could be inches, miles, pixels, etc.
  3. What is spatial data? Geographic
     Represents data points using angles of latitude and longitude. Latitude measures degrees north or south of the Equator, and longitude measures degrees east or west of the Prime Meridian.
  4. System Requirements
     SQL Server 2008 Express or higher (R2 recommended in order to use maps in SSRS)
     Dev tools:
       • Visual Studio 2005, 2008, or 2010
       • SQL Server Management Studio 2008
     Now supported on SQL Azure
  5. Uses of spatial data
     Used by central cancer registries for statistical analysis with other geography-specific data sources, such as census data
     Integrated route mapping with MapPoint, Google Maps, etc.
     Geographical business intelligence analytics
  6. Geometry data type
     The geometry data type stores points, lines, polygons, and collections of geometric objects
     Values can be represented using WKT (Well-Known Text), WKB (Well-Known Binary), or GML (Geography Markup Language)
     WKT seems to be the most common
  7. WKT Markup
     POINT(x y)
     LINESTRING(x1 y1, x2 y2)
     POLYGON((x1 y1, x2 y2, x3 y3, x4 y4, x1 y1))
     GEOMETRYCOLLECTION(Geo1, Geo2, …)
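
As a minimal sketch of how the WKT forms above turn into SQL Server values, the following T-SQL parses a point, a line, and a polygon with `STGeomFromText`. The coordinates are made up for illustration, SRID 0 is used for the planar geometry type, and SRID 4326 (WGS 84) is assumed for the geography type. Note that WKT for geography lists longitude first, while `geography::Point` takes latitude first.

```sql
-- Planar (geometry) examples; all coordinates are illustrative only.
DECLARE @pt   geometry = geometry::STGeomFromText('POINT(1 2)', 0);
DECLARE @line geometry = geometry::STGeomFromText('LINESTRING(0 0, 3 4)', 0);
DECLARE @poly geometry = geometry::STGeomFromText('POLYGON((0 0, 4 0, 4 4, 0 4, 0 0))', 0);

SELECT
    @line.STLength()      AS LineLength,           -- 5 (a 3-4-5 triangle), in user-defined units
    @poly.STArea()        AS PolygonArea,          -- 16, in squared user-defined units
    @poly.STContains(@pt) AS PolygonContainsPoint; -- 1, because (1, 2) lies inside the square

-- Round-earth (geography) examples; SRID 4326 is an assumption.
-- WKT order is "longitude latitude"; geography::Point order is (latitude, longitude, SRID).
DECLARE @g1 geography = geography::STGeomFromText('POINT(-86.1581 39.7684)', 4326);
DECLARE @g2 geography = geography::Point(39.7684, -86.1581, 4326);

SELECT @g1.Lat AS Latitude, @g1.Long AS Longitude,
       @g1.STDistance(@g2) AS DistanceMeters;      -- both values describe the same location
```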
  8. Spatial Expressions
  9. More Spatial Expressions
  10. Geocoding
      The geography data type does not directly understand mailing address data
      Mailing addresses must be converted to latitude/longitude coordinates
      Geocoding = conversion of geographic data such as an address or ZIP code to geographic coordinates
      Options: MapPoint/Bing Map Services, Google Maps API, many others
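
Once an external geocoding service has returned coordinates, they still have to be turned into geography values. The sketch below shows one way that step might look. The table `dbo.GeocodedAddresses` and its columns are hypothetical names introduced here, and SRID 4326 is an assumption; the geocoding call itself happens outside SQL Server and is not shown.

```sql
-- Hypothetical staging table: an external geocoder fills Latitude and Longitude.
CREATE TABLE dbo.GeocodedAddresses (
    AddressId         int IDENTITY(1, 1) PRIMARY KEY,
    MailingAddress    nvarchar(200) NOT NULL,
    Latitude          float NULL,      -- degrees, supplied by the geocoder
    Longitude         float NULL,      -- degrees, supplied by the geocoder
    LocationGeography geography NULL   -- derived from the two columns above
);

-- Build the geography point for every row that has been geocoded.
-- geography::Point takes (latitude, longitude, SRID); 4326 is assumed here.
UPDATE dbo.GeocodedAddresses
SET LocationGeography = geography::Point(Latitude, Longitude, 4326)
WHERE Latitude IS NOT NULL
  AND Longitude IS NOT NULL;
```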
  11. Rendering Options
      SQL Server Management Studio 2008 (very basic, for query testing)
      VirtualEarth
      Google Maps or similar
      3rd party mapping component (e.g. Dundas)
      SSRS/Report Builder in R2
  12. Spatial Indexing
      Images from Microsoft TechNet
  13. Spatial Indexing
      CREATE SPATIAL INDEX SPATIAL_Hospitals ON dbo.Hospitals(LocationGeography)
      USING GEOGRAPHY_GRID
      WITH (GRIDS = (LEVEL_1 = MEDIUM,
                     LEVEL_2 = MEDIUM,
                     LEVEL_3 = MEDIUM,
                     LEVEL_4 = MEDIUM),
            CELLS_PER_OBJECT = 16, STATISTICS_NORECOMPUTE = OFF,
            ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)
  14. Spatial Indexing - Utilization
      SELECT *
      FROM Hospitals WITH (INDEX(SPATIAL_Hospitals))
      WHERE LocationGeography.STIntersects(@P.STBuffer(@eps * 1609.344)) = 1
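
The query on this slide uses `@P` and `@eps` without declaring them. A self-contained version might look like the sketch below. It assumes `@P` is the search center and that `@eps` is a radius in miles (the factor 1609.344 converts miles to meters, the distance unit for SRID 4326); the center point and radius are illustrative values, not taken from the talk.

```sql
-- Assumed meaning of the slide's variables; both values are illustrative.
DECLARE @eps float     = 5;                                           -- search radius in miles
DECLARE @P   geography = geography::Point(39.7684, -86.1581, 4326);   -- search center

SELECT h.*,
       h.LocationGeography.STDistance(@P) / 1609.344 AS DistanceMiles
FROM dbo.Hospitals AS h WITH (INDEX(SPATIAL_Hospitals))               -- force the spatial index
WHERE h.LocationGeography.STIntersects(@P.STBuffer(@eps * 1609.344)) = 1
ORDER BY DistanceMiles;
```

An equivalent filter is `h.LocationGeography.STDistance(@P) <= @eps * 1609.344`, which avoids building the buffer polygon; which form performs better is worth checking against the actual query plan.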
  15. Goal of Geographic Analysis
      “I want SQL Server to tell me when there are clusters of geographic data points and where they are located.”
      - Dan Crawford, 2010
  16. It’s easy to see points on a map with SQL Server
  17. Why use cluster analysis?
      Analysis of injury severity and hospital resource use in a regional health care system
      Customer purchasing patterns
      Choosing a business or advertising location
      Crime analysis
      Easy visualization for dashboard
  18. What is a geographic cluster?
      For our purposes, a cluster is a group of a significant number of data points that are geographically close to each other.
      There are two variables:
        • The number of data points required for a group to be considered a cluster
        • The distance that defines being “geographically close”
  19. What we want…
  20. Or better yet…
  21. DBSCAN
      DBSCAN(D, eps, MinPts)
        C = 0
        for each unvisited point P in dataset D
          mark P as visited
          N = getNeighbors(P, eps)
          if sizeof(N) < MinPts
            mark P as NOISE
          else
            C = next cluster
            expandCluster(P, N, C, eps, MinPts)
      Source:
  22. DBSCAN (cont’d)
      expandCluster(P, N, C, eps, MinPts)
        add P to cluster C
        for each point P' in N
          if P' is not visited
            mark P' as visited
            N' = getNeighbors(P', eps)
            if sizeof(N') >= MinPts
              N = N joined with N'
          if P' is not yet member of any cluster
            add P' to cluster C
      Source:
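
The DBSCAN pseudocode can be translated to T-SQL fairly directly. The following is a minimal, row-by-row sketch and not necessarily the implementation used in the talk: the table variables, the sample points, and the choice to leave `ClusterId` as NULL for noise are all assumptions made here, and `getNeighbors` is expressed as an `STDistance` self-join with `@eps` in meters.

```sql
DECLARE @eps    float = 1609.344;  -- neighborhood radius in meters (1 mile, illustrative)
DECLARE @MinPts int   = 3;         -- minimum neighborhood size, counting the point itself

DECLARE @Points TABLE (
    Id        int PRIMARY KEY,
    Loc       geography NOT NULL,
    Visited   bit NOT NULL DEFAULT 0,
    ClusterId int NULL               -- still NULL after the run means NOISE
);
-- Made-up sample data: three points close together and one far away.
INSERT INTO @Points (Id, Loc) VALUES
    (1, geography::Point(39.7684, -86.1581, 4326)),
    (2, geography::Point(39.7690, -86.1575, 4326)),
    (3, geography::Point(39.7679, -86.1590, 4326)),
    (4, geography::Point(40.4167, -86.8753, 4326));

DECLARE @N     TABLE (Id int PRIMARY KEY);                              -- result of getNeighbors
DECLARE @Seeds TABLE (Id int PRIMARY KEY, Done bit NOT NULL DEFAULT 0); -- growing set N in expandCluster
DECLARE @C int = 0, @P int, @Q int;

WHILE EXISTS (SELECT 1 FROM @Points WHERE Visited = 0)
BEGIN
    SELECT TOP (1) @P = Id FROM @Points WHERE Visited = 0 ORDER BY Id;
    UPDATE @Points SET Visited = 1 WHERE Id = @P;

    -- getNeighbors(P, eps): all points within @eps meters of P, P included
    DELETE FROM @N;
    INSERT INTO @N (Id)
    SELECT q.Id
    FROM @Points AS p
    JOIN @Points AS q ON p.Loc.STDistance(q.Loc) <= @eps
    WHERE p.Id = @P;

    -- Fewer than @MinPts neighbors: P stays NOISE (ClusterId NULL) unless a cluster claims it later
    IF (SELECT COUNT(*) FROM @N) >= @MinPts
    BEGIN
        SET @C = @C + 1;                                   -- C = next cluster
        UPDATE @Points SET ClusterId = @C WHERE Id = @P;   -- add P to cluster C
        DELETE FROM @Seeds;
        INSERT INTO @Seeds (Id) SELECT Id FROM @N;

        -- expandCluster: process each seed once, adding new seeds as they are found
        WHILE EXISTS (SELECT 1 FROM @Seeds WHERE Done = 0)
        BEGIN
            SELECT TOP (1) @Q = Id FROM @Seeds WHERE Done = 0 ORDER BY Id;
            UPDATE @Seeds SET Done = 1 WHERE Id = @Q;

            IF EXISTS (SELECT 1 FROM @Points WHERE Id = @Q AND Visited = 0)
            BEGIN
                UPDATE @Points SET Visited = 1 WHERE Id = @Q;
                DELETE FROM @N;
                INSERT INTO @N (Id)
                SELECT q.Id
                FROM @Points AS p
                JOIN @Points AS q ON p.Loc.STDistance(q.Loc) <= @eps
                WHERE p.Id = @Q;

                -- N = N joined with N' (only when P' is itself a core point)
                IF (SELECT COUNT(*) FROM @N) >= @MinPts
                    INSERT INTO @Seeds (Id)
                    SELECT n.Id FROM @N AS n
                    WHERE NOT EXISTS (SELECT 1 FROM @Seeds AS s WHERE s.Id = n.Id);
            END;

            -- add P' to cluster C if it is not yet a member of any cluster
            UPDATE @Points SET ClusterId = @C WHERE Id = @Q AND ClusterId IS NULL;
        END;
    END;
END;

SELECT Id, ClusterId FROM @Points ORDER BY Id;
```

With the sample points above, the three nearby points should end up sharing one `ClusterId` and the distant point should remain NULL. The loop-based approach favors readability over speed; for a large table, a spatial index on the point column would matter for the neighbor lookups.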
  23.
  24. To make life easier
      Report Builder 3.0
      SQL Server Spatial Tools –