Geoindexing with MongoDB

Presentation from WebClusters 2012 conference

  1. 1. Geoindexing with MongoDB. Leszek Krupiński, WebClusters 2012
  2. 2. About me
  3. 3. On-line since 1997
  4. 4. Funny times
  5. 5. 1 hr of internet for 1 USD
  6. 6. First social site: geocities
  7. 7. My first web page
  8. 8. What do I do now
  9. 9. Day-time job: managing a team of developers for the Polish Air Force
  10. 10. Side: consulting, optimizing, designing
  11. 11. Buzzwords incoming!
  12. 12. The Internet 2008
  13. 13. Web 2.0
  14. 14. http://en.wikipedia.org/wiki/File:Web_2.0_Map.svg CC-BY-SA-2.5
  15. 15. Be social in your bedroom
  16. 16. alone.
  17. 17. The Internet 2012
  18. 18. Web 3.0
  19. 19. Why geospatial?
  20. 20. Needs shifted
  21. 21. Why? Because they could.
  22. 22. How to implement?
  23. 23. Database. Duh.
  24. 24. Keep, but also query
  25. 25. Is there a person at 53.438522,14.52198? Nope. Is there a person at 53.438522,14.52199? Nope. Is there a person at 53.438522,14.52199? Yeah, here’s Johnny!
  26. 26. Not too useful.
  27. 27. Give me nearby homies. Within the range of 1 km there are:
      • Al Gore (53.438625,14.52103)
      • Bill Clinton (53.432531,14.55127)
      • Johnny Bravo (53.438286,14.52363)
  28. 28. Now that’s better.
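
To make the "nearby" idea concrete, here is a minimal sketch of what such a query has to compute when no geo index is available: measure the great-circle distance to every stored point, keep those within the radius, and sort nearest first. The names `haversineKm` and `nearby` are hypothetical helpers, the query point is an assumption borrowed from the previous slide, and the 5 km radius is arbitrary. A geo index exists precisely so the database does not have to scan every point like this.

```javascript
// Earth radius in km (the same value the talk uses later for unit conversion).
const EARTH_RADIUS_KM = 6378;

// Great-circle distance in km between two [lat, lon] points (haversine formula).
function haversineKm([lat1, lon1], [lat2, lon2]) {
  const toRad = (deg) => (deg * Math.PI) / 180;
  const dLat = toRad(lat2 - lat1);
  const dLon = toRad(lon2 - lon1);
  const a =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(toRad(lat1)) * Math.cos(toRad(lat2)) * Math.sin(dLon / 2) ** 2;
  return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(a));
}

// Brute-force search: measure every point, filter by radius, sort nearest first.
function nearby(people, center, radiusKm) {
  return people
    .map((p) => ({ name: p.name, km: haversineKm(center, p.loc) }))
    .filter((p) => p.km <= radiusKm)
    .sort((a, b) => a.km - b.km);
}

// Points from the slide; the query point and radius are assumptions.
const people = [
  { name: "Al Gore", loc: [53.438625, 14.52103] },
  { name: "Bill Clinton", loc: [53.432531, 14.55127] },
  { name: "Johnny Bravo", loc: [53.438286, 14.52363] },
];
const center = [53.438522, 14.52199];
const result = nearby(people, center, 5);
console.log(result.map((p) => `${p.name}: ${p.km.toFixed(2)} km`));
```
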
  29. 29. Geoindexing. Nothing new.
  30. 30. Oracle, PostgreSQL, Lucene/Solr, even MySQL (via extensions)
  31. 31. Oracle: SELECT c.holding_company, c.location FROM competitor c, bank b WHERE b.site_id = 1604 AND SDO_WITHIN_DISTANCE(c.location, b.location, 'distance=2 unit=mile') = 'TRUE'
  32. 32. SQL is so last year
  33. 33. Let’s use something cool
  34. 34. MongoDB. Because all the cool kids use NoSQL now
  35. 35. Why MongoDB?
  36. 36. Choose your NoSQL wisely.
  37. 37. NoSQL in MongoDB
      • Document-based
      • Queries (JS-like syntax)
      • JSON-like storage
  38. 38. Why MongoDB?
      Features: ad hoc queries, indexing, replication, load balancing, file storage, aggregation, server-side JavaScript, capped collections
      Use cases: archiving, event logging, document and CMS, gaming, high volume sites, mobile, operational datastore, agile development, real-time stats
      Source: http://en.wikipedia.org/wiki/Mongodb
  39. 39. Back to geo.
  40. 40. { loc: [ 52.0, 21.0 ], name: "Warsaw", type: "City" }
  41. 41. db.nodes.ensureIndex({ loc: "2d" })
  42. 42. That’s it.
  43. 43. Query
      • Exact: db.places.find( { loc : [50,50] } )
      • Near: db.places.find( { loc : { $near : [50,50] } } )
      • Limit: db.places.find( { loc : { $near : [50,50] } } ).limit(20)
      • Distance: db.places.find( { loc : { $near : [50,50] , $maxDistance : 5 } } ).limit(20)
  44. 44. Compound index
      • db.places.ensureIndex( { location : "2d" , category : 1 } );
      • db.places.find( { location : { $near : [50,50] }, category : "coffee" } );
  45. 45. Bound queries
      • box = [ [40.73083, -73.99756], [40.741404, -73.988135] ]
      • db.places.find( { "loc" : { "$within" : { "$box" : box } } } )
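
The bounding-box query above boils down to a simple containment test, sketched here in plain JavaScript. `inBox` is a hypothetical helper and the two sample places are invented for illustration. Treating points on the edge as inside is an assumption of this sketch, not a statement about how the server handles boundaries.

```javascript
// True when point [a, b] lies inside the rectangle spanned by two opposite corners.
// Edge points count as inside (an assumption of this sketch).
function inBox([a, b], [[a1, b1], [a2, b2]]) {
  const between = (v, p, q) => v >= Math.min(p, q) && v <= Math.max(p, q);
  return between(a, a1, a2) && between(b, b1, b2);
}

// Box from the slide; the sample places are hypothetical.
const box = [[40.73083, -73.99756], [40.741404, -73.988135]];
const samplePlaces = [
  { name: "inside", loc: [40.735, -73.99] },
  { name: "outside", loc: [40.75, -73.99] },
];

const hits = samplePlaces.filter((p) => inBox(p.loc, box));
console.log(hits.map((p) => p.name));
```
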
  46. 46. Problems
  47. 47. Units
  48. 48. Coordinates in arc units (degrees), distance in kilometers
  49. 49. In query
  50. 50. earthRadius = 6378 // km
      multi = earthRadius * PI / 180.0 // km per degree of arc
      range = 3000 // km
      … maxDistance : range / multi … // km to degrees
  51. 51. In results
  52. 52. pointDistance = distances[0].dis * multi // degrees to km
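
A small sketch of the unit conversion for a flat `2d` index, assuming `multi` means kilometers per degree of arc (about 111.3 km with the radius used in the talk). Under that assumption a range in kilometers is divided by `multi` to get degrees for `$maxDistance`, and a returned distance in degrees is multiplied by `multi` to get kilometers. `kmToDegrees` and `degreesToKm` are hypothetical helper names.

```javascript
const earthRadius = 6378; // km, value from the talk
const multi = (earthRadius * Math.PI) / 180.0; // km per degree of arc

// km -> degrees, for $maxDistance on a flat 2d index
function kmToDegrees(km) {
  return km / multi;
}

// degrees -> km, for distances reported by a flat 2d query
function degreesToKm(deg) {
  return deg * multi;
}

const range = 3000; // km
// Query document only; pass it to db.places.find(...) in the mongo shell.
const query = { loc: { $near: [50, 50], $maxDistance: kmToDegrees(range) } };
console.log(query.loc.$maxDistance.toFixed(2), "degrees");
```

This is only a flat approximation: a degree of longitude covers fewer kilometers the further the point is from the equator.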
  53. 53. Earth is not flat.
  54. 54. Problem: can’t use linear distance
  55. 55. Earth isn’t flat, either.
  56. 56. Solution? Use approximation.
  57. 57. MongoDB has it built-in
      distances = db.runCommand( { geoNear : "points", near : [0, 0], spherical : true, maxDistance : range / earthRadius /* to radians */ } ).results
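
The spherical variant works in radians rather than degrees, so the conversion is a plain division by the Earth radius. The sketch below only builds the command document shown on the slide and converts distances back to kilometers. `buildGeoNear` and `radiansToKm` are hypothetical helper names, and nothing here talks to a server.

```javascript
const earthRadiusKm = 6378; // km, value from the talk

// Build the geoNear command document for a range given in kilometers.
function buildGeoNear(collection, near, rangeKm) {
  return {
    geoNear: collection,
    near: near,
    spherical: true,
    maxDistance: rangeKm / earthRadiusKm, // km -> radians
  };
}

// With spherical: true and plain coordinate pairs, "dis" comes back in radians.
function radiansToKm(rad) {
  return rad * earthRadiusKm;
}

const cmd = buildGeoNear("points", [0, 0], 3000);
console.log(JSON.stringify(cmd));
// In the mongo shell this would be run as: db.runCommand(cmd).results
```

Two caveats. Spherical queries expect coordinate pairs in longitude, latitude order. The `geoNear` command also exists only on older server versions; newer releases provide the `$geoNear` aggregation stage instead.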
  58. 58. Focus: runCommand
      distances = db.runCommand({ geoNear : "points" …
  59. 59. Sort by distance: only with runCommand
  60. 60. Automatically sorted
      • db.runCommand( { geoNear : "places" , near : [50,50], num : 10 } );
      • { "ns" : "test.places", "results" : [ { "dis" : 69.29646421910687, "obj" : … }, { "dis" : 69.29646421910687, "obj" : … }, … ], … }
  61. 61. Demo
  62. 62. OpenStreetMap database of Poland imported into MongoDB
  63. 63. 14,411,552 nodes
  64. 64. 3GB of raw XML data
  65. 65. PHP in virtual machine
  66. 66. Imported about 100,000 nodes every 10 s.
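
As a back-of-the-envelope check, the figures on the slides imply a total import time of roughly 24 minutes. This assumes the rate stayed constant for the whole import, which the slides do not state.

```javascript
const totalNodes = 14411552; // node count from the slides
const nodesPerBatch = 100000; // about 100,000 nodes...
const secondsPerBatch = 10; // ...every 10 seconds

// Assumes a constant import rate from start to finish.
const importSeconds = (totalNodes / nodesPerBatch) * secondsPerBatch;
const importMinutes = importSeconds / 60;
console.log(importMinutes.toFixed(1), "minutes");
```
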
  67. 67. Pretty cool, eh?
  68. 68. Kudos to Derick Rethans. Part of this talk was inspired by his talk.
  69. 69. Questions?
  70. 70. Thanks! Rate me at https://joind.in/talk/view/6475
  71. 71. Geoindexing with MongoDB: supplement. Leszek Krupiński, WebClusters 2012
  72. 72. Why MongoDB?
  73. 73. Evaluate.
  74. 74. PostGIS is cool too. (but it’s SQL, meh)
  75. 75. Why MongoDB?
      Features: ad hoc queries, indexing, replication, load balancing, file storage, aggregation, server-side JavaScript, capped collections
      Use cases: archiving, event logging, document and CMS, gaming, high volume sites, mobile, operational datastore, agile development, real-time stats
      Source: http://en.wikipedia.org/wiki/Mongodb
  76. 76. If you need other features of MongoDB, use it
  77. 77. If you don’t, evaluate.
  78. 78. Evaluate.
  79. 79. Demo (hopefully)
  80. 80. Questions?
  81. 81. Please leave feedback! Rate me at https://joind.in/6475