Spatial functions in MySQL 5.6, MariaDB 5.5, PostGIS 2.0 and others

Published in: Technology

  1. Spatial functions in MySQL 5.6, MariaDB 5.5, PostGIS 2.0 and others. Percona Live MySQL Conference and Expo 2013. Henrik Ingo, Senior Performance Architect, Nokia. Company Confidential. ©2010 Nokia. Nokia Internal Use Only. (CC) 2013 Nokia. Please share and modify this presentation, licensed under the Creative Commons Attribution license.
  2. WHAT is GIS? GIS is a lot of things. The Open Geospatial Consortium defines lots of standards:
     • http://www.opengeospatial.org/standards/sfs
     The one we are talking about is: OpenGIS Implementation Specification for Geographic information - Simple feature access - Part 2: SQL option. (CC BY) 2013 Nokia
  3. Is the world flat, or a sphere? Flat: GEOMETRY types. Sphere: GEOGRAPHY types.
  4. It's neither! But what about mountains and skyscrapers?
  5. Projections? [Figure: points A, B and C] Distance(A, B) = 0.0001 deg = 11 m. Distance(B, C) = 0.0001 deg = 8.5 m (in Manhattan). [Figure: lines through A, B and C] All the lines above are straight.
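The two distances on the projections slide follow from the latitude: a step in latitude covers about the same ground everywhere, while a step in longitude shrinks with the cosine of the latitude. A minimal sketch, assuming a spherical Earth with a mean radius of 6,371 km and a latitude of 40.72 for Manhattan; the function name is hypothetical and not from the talk.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius; spherical approximation (assumption)

def degrees_to_meters(step_deg, lat_deg):
    """Return (north_south_m, east_west_m) covered by step_deg at latitude lat_deg."""
    meters_per_degree = math.pi * EARTH_RADIUS_M / 180.0
    north_south = step_deg * meters_per_degree
    # A degree of longitude shrinks with the cosine of the latitude.
    east_west = step_deg * meters_per_degree * math.cos(math.radians(lat_deg))
    return north_south, east_west

ns, ew = degrees_to_meters(0.0001, 40.72)
print(f"{ns:.1f} m north-south, {ew:.1f} m east-west")
```

Under these assumptions the result is roughly 11 m north-south and 8.4 m east-west, in line with the rounded figures on the slide.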
  6. Example SQL
     Geometry literals (WKT):
       POINT(0 0)
       LINESTRING(0 0,1 1,1 2)
       POLYGON((0 0,4 0,4 4,0 4,0 0),(1 1, 2 1, 2 2, 1 2,1 1))
       ...
     Inserting a geometry:
       INSERT INTO geotable ( the_geom, the_name )
       VALUES ( ST_GeomFromText('POINT(-126.4 45.32)', 312), 'A Place');
     Reading geometries back as text:
       db=# SELECT road_id, ST_AsText(road_geom) AS geom, road_name FROM roads;
        road_id | geom                                    | road_name
       ---------+-----------------------------------------+------------
              1 | LINESTRING(191232 243118,191108 243242) | Jeff Rd
              2 | LINESTRING(189141 244158,189265 244817) | Geordie Rd
              3 | LINESTRING(192783 228138,192612 229814) | Paul St
              4 | LINESTRING(189412 252431,189631 259122) | Graeme Ave
              5 | LINESTRING(190131 224148,190871 228134) | Phil Tce
              6 | LINESTRING(198231 263418,198213 268322) | Dave Cres
              7 | LINESTRING(218421 284121,224123 241231) | Chris Way
       (7 rows)
     Query by distance:
       SELECT the_geom
       FROM geom_table
       WHERE ST_Distance(the_geom, ST_GeomFromText('POINT(100000 200000)')) < 100
         AND type = 'road'
     See also: http://blog.mariadb.org/screencast-mariadb-gis-demo/
  7. Products that implement GIS
                                   PostgreSQL  MySQL & MariaDB  MongoDB  Solr  SQLite
     Standard feature / extension  PostGIS     +                +        +     Spatialite
     Type: Point                   +           +                +        +     +
     Type: Geometry (x,y)          +           +                *        -     +
     Type: Geography (lat, lon)    +           -                *        -     -
     Type: 3D (ish)                +           -                -        -     -
     SRID projections              +           -                *        -     +
     Query by radius               +           ~                +        +     ~
     Precise decimal math          -           MariaDB          -        -     -
     Query by bounding box         +           +                *        -     +
     Notes:
     • PostgreSQL: Most functions don't support Geography
     • MySQL & MariaDB: MyISAM only
     • MongoDB: WGS84 only
     • Solr: Limited function set
     • SQLite: Indexes have to be explicitly JOINed
     * Since MongoDB 2.4. This evaluation was done on v 2.0.
     ~ No, but you can query with bounding box (uses index) AND sort that result set by radius.
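The "~" footnote describes a two-step workaround: filter by bounding box first, then sort what is left by distance. A minimal in-memory sketch of that idea, assuming planar (x, y) coordinates and plain Euclidean distance; in a real database the box filter is what the spatial index serves, and the function names here are hypothetical.

```python
import math

def within_bbox(point, center, half_size):
    """True if point (x, y) lies inside the square box around center."""
    return (abs(point[0] - center[0]) <= half_size
            and abs(point[1] - center[1]) <= half_size)

def nearest_by_bbox(points, center, half_size):
    """Filter by bounding box first, then sort the survivors by distance."""
    candidates = [p for p in points if within_bbox(p, center, half_size)]
    return sorted(candidates, key=lambda p: math.dist(p, center))

points = [(0.5, 0.5), (3.0, 3.0), (-0.2, 0.1), (0.9, -0.9)]
result = nearest_by_bbox(points, (0.0, 0.0), 1.0)
print(result)
```

Note that the corner point (0.9, -0.9) passes the box filter although it is farther than 1.0 from the center, so a strict radius query would need one more distance check on the sorted result.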
  8. Spatial use cases
     • Geocoding (text search): Canal Street, New York, USA -> -74.001417, 40.719811
     • Reverse Geocoding (GIS): -74.001417, 40.719811 -> Canal Street, New York, USA
     • Points-of-Interest
     We are here
  9. Creating my data set
     • Scan HERE.com with script: 40.48, -75.23 to 42.42, -73.38 (New York City + 4 neighbor states + Atlantic Ocean)
     • 0.0001 deg steps = 11 m vertically, 8.5 m horizontally
     • 358M points, 9.6M unique locations
     • 7 days
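The 358M figure can be checked from the scan box and the step size. A small sketch of that arithmetic; the helper name is hypothetical, and the step count is rounded to avoid floating-point noise.

```python
def grid_steps(start_deg, end_deg, step_deg):
    """Number of steps of step_deg between two coordinates (rounded to avoid float noise)."""
    return round(abs(end_deg - start_deg) / step_deg)

lat_steps = grid_steps(40.48, 42.42, 0.0001)
lon_steps = grid_steps(-75.23, -73.38, 0.0001)
total_points = lat_steps * lon_steps
print(lat_steps, lon_steps, total_points)
```

This gives 19,400 by 18,500 steps, about 358.9 million grid points, which matches the 358M on the slide.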
  10. SELECT * FROM Location
      JOIN Point ON Location.id=Point.LocationId
      WHERE Location.id=1;
      id: 1
      Label: E Sawmill Rd, Haycock Twp, PA 18951, United States
      Country: USA
      State: PA
      County: Bucks
      PostalCode: 18951
      City: Haycock
      District:
      Street: E Sawmill Rd
      HouseNumber:
      LocationType: street
      Location to Point is a 1:n relation.
  11. Creating areas out of points
      • GIS functions used: ST_Envelope(), ST_Union()
      • Limitations in Geography type
      • 12 days, bottlenecked by CPU
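To give a feel for what an envelope is, here is a minimal sketch that computes the axis-aligned bounding box of a set of points, which is roughly the shape ST_Envelope() returns for a geometry. It is an illustration only, not the pipeline used in the talk; the function name and the sample coordinates are made up.

```python
def envelope(points):
    """Axis-aligned bounding box of (x, y) points as (min_x, min_y, max_x, max_y)."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (min(xs), min(ys), max(xs), max(ys))

# Hypothetical sample: three scan points that resolved to the same location.
same_location = [(-74.0016, 40.7197), (-74.0014, 40.7198), (-74.0015, 40.7199)]
box = envelope(same_location)
print(box)
```

A box like this over-covers any location that is not itself rectangular, which is one reason polygon data built this way may later be smoothed or refined.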
  12. My dataset!
  13. Accuracy compared to source = 93% (...5m margin of error)
  14. Migrating from SQL to NoSQL
      # cur is assumed to return rows addressable by column name (e.g. a psycopg2 DictCursor)
      sql = """SELECT id,
               ST_X(st_geomfromtext(st_astext("p"))) "x",
               ST_Y(st_geomfromtext(st_astext("p"))) "y"
               FROM "Point"
               WHERE "Point"."LocationId" = %s"""
      cur.execute(sql, [id])
      points = cur.fetchall()
      for p in points:
          # MongoDB expects (lon, lat) order: x is longitude, y is latitude
          db.point.insert({"_id": p["id"],
                           "LocationId": id,
                           "p": [p["x"], p["y"]]})
  15. MongoDB requires points to be ordered as (lon, lat). Python dictionaries are serialized in alphabetical order. You are HERE
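The gotcha on this slide comes down to key order: sorted alphabetically, "lat" comes before "lon". A minimal sketch of the safer pattern, storing the point as an explicit two-element list; the helper name and the document shape are illustrative assumptions, not code from the talk.

```python
def to_mongo_point(lon, lat):
    """Return the point as an explicit [lon, lat] list, so order never depends on key sorting."""
    return [lon, lat]

# Sorting keys alphabetically puts "lat" before "lon", which is the wrong order here.
by_key = {"lon": -74.001417, "lat": 40.719811}
alphabetical = [by_key[k] for k in sorted(by_key)]

doc = {"LocationId": 1, "p": to_mongo_point(-74.001417, 40.719811)}
print(alphabetical, doc["p"])
```

A swapped pair often still looks like a valid coordinate (for New York both values fit either range), so a check on a known location is usually the quickest way to catch it.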
  16. Reverse geocoding HowTo
      SQL with polygons:
        SELECT *
        FROM "GeomArea"
        JOIN "Location" ON "GeomArea"."id" = "Location"."id"
        WHERE ST_Within(ST_GeomFromEWKT('SRID=4326;POINT(<lon> <lat>)'), "p")
      SQL with points:
        SELECT *
        FROM Point
        JOIN Location ON Point.LocationId = Location.id
        WHERE ST_Within(p, ST_GeomFromText('POLYGON((<lon>+1 <lat>+1, <lon>+1 <lat>-1,
                                                     <lon>-1 <lat>-1, <lon>-1 <lat>+1,
                                                     <lon>+1 <lat>+1))'))
        ORDER BY ST_Distance(ST_GeomFromText('POINT(<lon> <lat>)'), p)
      MongoDB with points:
        point = db.point.find( { "p": { "$near" : [ lon, lat ] } } ).limit(1)
        id = point[0]["LocationId"]
        location = db.location.find_one( {"_id": id} )
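The "SQL with points" query needs the <lon>+1 and <lat>-1 placeholders evaluated before the polygon text reaches the database. A minimal sketch of building that WKT string, assuming a square of plus or minus one degree by default; the function name and the fixed six-decimal formatting are choices made for this example.

```python
def bbox_polygon_wkt(lon, lat, delta=1.0):
    """WKT polygon for a square of +/- delta degrees around (lon, lat), ring closed."""
    corners = [
        (lon + delta, lat + delta),
        (lon + delta, lat - delta),
        (lon - delta, lat - delta),
        (lon - delta, lat + delta),
        (lon + delta, lat + delta),  # repeat the first corner to close the ring
    ]
    ring = ", ".join(f"{x:.6f} {y:.6f}" for x, y in corners)
    return f"POLYGON(({ring}))"

wkt = bbox_polygon_wkt(-74.0, 40.5)
print(wkt)
```

The resulting string is best passed to ST_GeomFromText() as a bound query parameter instead of being concatenated into the SQL text.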
  17. Versions
      CentOS 6, 8 CPUs, 32 GB RAM, all tests with data set in RAM
      • PostGIS 9.1
      • MySQL 5.6.9 RC
      • MariaDB 5.5.29
      • MongoDB 2.0.7
  18. Data size (note that my data set is not packed for optimal size)
                                 My data (GB)   World (GB)
      PostGIS polygons           34             165 240
      PostGIS points             70             340 200
      MySQL & MariaDB polygons   3.9            18 954
      MySQL & MariaDB points     18             87 480
      MongoDB                    71             345 060
      • Size for World is extrapolated with a multiplier of 4860
      • This is based on 30% of the Earth's surface being land
      • Polygons could be smoothed to reduce data set size by a factor of 20-100
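The World column is the measured size times the stated multiplier. A short sketch that reproduces it from the numbers on the slide; only the variable names are invented here.

```python
WORLD_MULTIPLIER = 4860  # multiplier stated on the slide

# Measured data set sizes in GB, as listed on the slide.
measured_gb = {
    "PostGIS polygons": 34,
    "PostGIS points": 70,
    "MySQL & MariaDB polygons": 3.9,
    "MySQL & MariaDB points": 18,
    "MongoDB": 71,
}

world_gb = {name: round(size * WORLD_MULTIPLIER) for name, size in measured_gb.items()}

for name, size in world_gb.items():
    print(f"{name}: {size:,} GB")
```

The rounded results match the slide row by row, from 18,954 GB for MySQL & MariaDB polygons up to 345,060 GB for MongoDB.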
  19. Benchmark Results (data set in memory, 8 CPUs)
      PostGIS polygons
        Clients   TPS     Avg RT (msec)   50% RT   98% RT
        1         138     7               6        18
        4         547     7               6        18
        8         1072    7               6        19
      PostGIS points (the source lists only two RT values per row)
        Clients   TPS     RT (msec)
        1         419     2, 2
        4         1613    2, 3
        8         3136    3, 3
      PostGIS points disk bound: 100 TPS. Didn't scale with threads.
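The PostGIS polygon rows can be sanity-checked with Little's law: with a fixed number of clients that each wait for a reply before sending the next query, throughput is roughly clients divided by average response time. A minimal sketch, assuming no think time between requests (the benchmark setup is not described in that detail); the function name is hypothetical.

```python
def expected_tps(clients, avg_rt_msec):
    """Little's law for a closed loop: throughput = concurrency / response time."""
    return clients / (avg_rt_msec / 1000.0)

# (clients, measured TPS, avg RT in msec) for PostGIS polygons, from the slide.
postgis_polygons = [(1, 138, 7), (4, 547, 7), (8, 1072, 7)]

for clients, measured, avg_rt in postgis_polygons:
    predicted = expected_tps(clients, avg_rt)
    print(f"{clients} clients: measured {measured} TPS, predicted {predicted:.0f} TPS")
```

The measured values land within about 10% of the prediction, which is as close as response times rounded to whole milliseconds allow.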
  20. Benchmark Results (data set in memory, 8 CPUs)
      MySQL polygons (the source lists only two RT values per row)
        Clients   TPS      RT (msec)
        1         2866     0, 0
        4         10k      0, 1
        8         16.5k    0, 1
      MySQL points (the source lists only two RT values per row)
        Clients   TPS      RT (msec)
        1         1800     1, 1
        4         2110     2, 3
        8         1402     6, 7
      • Using InnoDB for the Location table (non-GIS address data) was slightly faster for polygons.
      • Is MySQL faster because it doesn't support projections? -> Try PostGIS with SRID=0.
      • Points approach is stuck in "Creating sort index". (Should increase join buffers and tmp table.)
  21. Benchmark Results (data set in memory, 8 CPUs)
      MariaDB polygons (the source lists only two RT values per row)
        Clients   TPS     RT (msec)
        1         2340    0, 1
        4         9146    0, 1
        8         15k     1, 1
      MariaDB points (the source lists only two RT values per row)
        Clients   TPS     RT (msec)
        1         1650    1, 1
        4         2270    2, 2
        8         1647    5, 6
      • MariaDB GIS functions are independent of MySQL, but data format and indexes are the same.
      • Performance within +/- 10% of MySQL.
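  22. Benchmark Results (data set in memory, 8 CPUs)
      MongoDB points
        Clients   TPS     Avg RT (msec)   50% RT   98% RT
        1         411     2               2        2
        4         454     9               3        20
        8         525     14              7        25
      PostGIS points (the source lists only two RT values per row)
        Clients   TPS     RT (msec)
        1         419     2, 2
        4         1613    2, 3
        8         3136    3, 3
      MySQL & MariaDB points (the source lists only two RT values per row)
        Clients   TPS     RT (msec)
        1         1650    1, 1
        4         2270    2, 2
        8         1647    5, 6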
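One way to compare how the three point-based setups scale is to divide the throughput at 8 clients by eight times the single-client throughput. A minimal sketch using the TPS figures from the slide; "scaling efficiency" is a label chosen for this example, not a metric from the talk.

```python
def scaling_efficiency(tps_one_client, tps_n_clients, n_clients):
    """Fraction of ideal linear speedup achieved at n_clients (1.0 = perfectly linear)."""
    return tps_n_clients / (n_clients * tps_one_client)

# TPS at 1 and 8 clients, points workload, from the slide.
points_tps = {
    "MongoDB": (411, 525),
    "PostGIS": (419, 3136),
    "MySQL & MariaDB": (1650, 1647),
}

efficiency = {
    name: scaling_efficiency(one, eight, 8)
    for name, (one, eight) in points_tps.items()
}
for name, value in efficiency.items():
    print(f"{name}: {value:.0%} of linear")
```

PostGIS reaches about 94% of linear scaling, while MongoDB and MySQL & MariaDB stay at roughly 16% and 12%, even though the latter starts from a much higher single-client throughput.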
  23. PostGIS Summary
      • Nice linear scalability, stable response times
      • Most advanced, but "bolted on" user experience
      • Wasteful in CPU and data size
      • Decent on disk bound workload
      • Polygon based performance a small disappointment
      • Wishlist:
        • No more features needed.
        • Ease of use and performance please.
      • Future: Real 3D
  24. MongoDB Summary
      • Simple: Radius from point (Foursquare)
      • Combinations possible: type=restaurant within 1 km
      • Single thread performance OK, but didn't scale
        • Could be an issue with the benchmark framework
      • Main gotcha: don't use a Python dictionary for (lon, lat)
      • 2.4 brings lots of enhancements, not covered here.
  25. MySQL & MariaDB Summary
      • 5x better than anything else
        • For Within()
        • Contention on sorting by Distance()
      • Delivered on the vision of polygon based model
      • Different implementations, same performance
        • MySQL slightly faster, but within +/- 10%
        • MariaDB has precise math operations
      • Wishlist:
        • Projections (SRID)
        • InnoDB support
        • Distance() using RTree index
  26. Thank you! For more information: http://www.openlife.cc/blog, henrik.ingo@nokia.com
