Giving MongoDB a Way to Play with the GIS Community

The Geographic Information System (GIS) industry is booming, especially with the continued reliance on online maps and the rise of location-aware mobile devices. GIS technology can be a key player in the mobile internet, big data, and the internet of things, and is an essential tool for the next generation of the global IT industry.

Yet the GIS community is not prepared. With all the data available, GIS experts lack off-the-shelf solutions to manage the growing volume of spatial data. Relational spatial databases (RSDBs) led this field for decades, but they have failed to innovate to handle massive volumes of data arriving at high velocity.

Fortunately, MongoDB is a useful tool for this challenge, but it needs a connector to the GIS technology ecosystem. To bridge the gap, we built a pipeline that complies with the architecture of the Geospatial Data Abstraction Library (GDAL), so that MongoDB can work with most popular GIS tools, such as OpenLayers, Mapserver, GeoServer, QGIS, ArcGIS and others, with ease. In this talk, I'll go through this pipeline tool and showcase some examples of how you can use it in your next application.


  1. Giving MongoDB a Way to Play with the GIS Community
     Making GIScience directly supported by NoSQL technology, and so prepared for the big data era.
     Hanson Shuai Zhang (shuai@illinois.edu), Jun. 25, 2014
     Jiangsu Key Laboratory of Geographical Information Technology, Nanjing University.
     Cyber-Infrastructure and Geospatial Information Laboratory (CIGI), Department of Geography, School of Earth, Society and Environment, National Center for Supercomputing Applications (NCSA), University of Illinois at Urbana-Champaign, Urbana, Illinois, USA.
  2. Part 1: Spatial Pyramid – View the world at multiple spatiotemporal scales
     - Real-world example: Spatial Pyramid
     - Challenges with PostGIS
     - Handling it with a MongoDB cluster
  3. Spatial Pyramid | Introduction
     Diagram: a place hierarchy from global to local scale.
     Global
       North America
         Canada
         U.S.A.
           Illinois
             Champaign: UIUC Campus, Downtown
             Chicago
           New York
       Asia
         South Asia
         East Asia
           China
             Shanghai
             Beijing: Olympic Park, Xidan Street
           Japan
  4. Spatial Pyramid | PostGIS in the Open Stack
     - Internet clients: OpenLayers, Leaflet, ArcJs
     - LAN clients: uDig, QGIS, GRASS, ArcGIS
     - Servers: Mapserver, GeoServer, ArcServer
     - Database: PostGIS
  5. Spatial Pyramid | Generator Architecture
     Diagram labels: Data Server; Spatial Pyramid Generator; PostGIS; HPC Cluster; Pyramid Model; Python OGR, MPI; PostgreSQL; "What is ArcSDE 8?"
     2.3 hours!!
  6. Spatial Pyramid | MongoDB Approach
     Diagram labels: SpatialPyramid requests; Load Balance; two MongoS routers; two shards, each drawn as P S S; Config servers (C C C); GDAL/OGR.
     15 minutes!!
  7. Spatial Pyramid | GDAL Library
     - Open source
       – GDAL is released under an X/MIT-style open source license
       – Supported by the Open Source Geospatial Foundation
     - A library for geospatial data formats
       – Abstract data model conforming to OGC standards
       – 133 raster data formats, 79 vector data formats
     - Widely used by the GIS community
       – 88 software packages listed on gdal.org as using GDAL
     - Basic library for HPGC
       – We use GDAL as the basic tool for building high-performance computing algorithms
  8. Spatial Pyramid | GDAL Architecture
  9. Part 2: GDAL Driver for MongoDB – Giving MongoDB a way to play with the GIS community
     - View MongoDB as a spatial database
     - Design a GDAL driver for MongoDB
     - Cooperate with other GIS tools
  10. GDAL | spatial database structure
      Spatial relational table:
      FID   | Geometry            | Name    | States   | Time Zone
      10001 | POINT(40.77, 73.98) | NYC     | New York | UTC-05:00
      10002 | POINT(41.90, 87.65) | Chicago | Illinois | UTC-06:00
      Callouts (numbered 1 to 3 on the slide):
      - Feature: a spatial object
      - Geometries: Point, Line, Polygon
      - Attributes: non-spatial data
  11. GDAL | spatial database structure
      Tables – Layers
      Rows – Features
      Where is the RDBMS?
      Image source: https://lib.stanford.edu/gis
  12. GDAL | Simple Feature Access
  13. GDAL | Terminology
      RDBMS     | GeoDatabase | MongoDB
      Database  | Datasource  | Database
      Table     | Layer       | Collection
      Row(s)    | Feature(s)  | JSON Document
      Field(s)  | Field(s)    | Key:Value
      Index     | R tree      | Index
      Join      | Join        | Embedding & Linking
      Partition | -           | Shard
  14. GDAL | WKT for Spatial data
      - WKT (well-known text) was originally defined by the Open Geospatial Consortium (OGC) and is described in their Simple Feature Access and Coordinate Transformation Service specifications.
      Type       | Examples
      Point      | POINT (30 10)
      LineString | LINESTRING (30 10, 10 30, 40 40)
      Polygon    | POLYGON ((30 10, 10 20, 20 40, 40 40, 30 10))
      Polygon    | POLYGON ((35 10, 10 20, 15 40, 45 45, 35 10), (20 30, 35 35, 30 20, 20 30))
      - In total, there are 18 distinct geometric objects that can be represented.
      Source: http://en.wikipedia.org/wiki/Well-known_text
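
To make the WKT examples on this slide concrete, here is a minimal sketch of reading them with the Python standard library only. It is not the driver's parser: `parse_wkt` is a hypothetical helper that handles just the three types shown above (no Z/M values, no EMPTY geometries, no multi-part types).

```python
import re


def parse_wkt(wkt):
    """Parse POINT, LINESTRING, or POLYGON WKT into (type, coordinates)."""
    match = re.fullmatch(r"\s*([A-Za-z]+)\s*\((.*)\)\s*", wkt, flags=re.DOTALL)
    if match is None:
        raise ValueError(f"not a recognised WKT string: {wkt!r}")
    geom_type, body = match.group(1).upper(), match.group(2)

    def parse_positions(text):
        # "30 10, 10 30" -> [(30.0, 10.0), (10.0, 30.0)]
        return [tuple(float(v) for v in pos.split()) for pos in text.split(",")]

    if geom_type == "POINT":
        return geom_type, parse_positions(body)[0]
    if geom_type == "LINESTRING":
        return geom_type, parse_positions(body)
    if geom_type == "POLYGON":
        # Each parenthesised group is one ring: the exterior first, then holes.
        rings = re.findall(r"\(([^()]*)\)", body)
        return geom_type, [parse_positions(ring) for ring in rings]
    raise ValueError(f"unsupported geometry type: {geom_type}")


polygon_type, polygon_rings = parse_wkt(
    "POLYGON ((35 10, 10 20, 15 40, 45 45, 35 10), (20 30, 35 35, 30 20, 20 30))"
)
print(polygon_type, len(polygon_rings), "rings")
```

In practice a GDAL-based tool would rely on OGR's own WKT support; the sketch only shows how little structure the text form carries.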
  15. GDAL | WKT for Spatial data
      Each table row becomes one WKT document:
      {
        GEM: POINT(41.90, 87.65),
        FID: 10002,
        Name: Chicago,
        States: Illinois,
        Time Zone: UTC-06:00
      }
      Source table:
      FID   | Geometry            | Name    | States   | Time Zone
      10001 | POINT(40.77, 73.98) | NYC     | New York | UTC-05:00
      10002 | POINT(41.90, 87.65) | Chicago | Illinois | UTC-06:00
      Also shown: the geospatial metadata collection.
  16. GDAL | WKT for Spatial data
      Mapping between MongoDB and GDAL terms:
      - Database = Datasource (examples: U.S.A, Canada, inside the MongoDB cluster)
      - Collection = Layer (examples: States, Cities, Roads, G_sys_Metadata)
      - WKT document = Feature (examples: NYC, Chicago, ……)
      G_sys_Metadata collection:
      c_name | coord_d | src  | type    | Extent
      Cities | 2       | 4326 | Point   | [p1,p2]
      States | 2       | 4326 | Polygon | [p1,p2]
      No spatial index.
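
As a rough illustration of the G_sys_Metadata rows above, the sketch below builds one metadata document per layer. The field names are taken from the slide; everything else is an assumption made for illustration, in particular the `build_layer_metadata` helper and the reading of `[p1,p2]` as the lower-left and upper-right corners of the layer extent.

```python
def build_layer_metadata(c_name, geom_type, positions, src=4326):
    """Build a metadata document for one layer (collection).

    positions: every coordinate tuple that occurs in the layer.
    Assumption: Extent = [p1, p2] holds the min corner and the max corner.
    """
    xs = [p[0] for p in positions]
    ys = [p[1] for p in positions]
    return {
        "c_name": c_name,               # collection (layer) name
        "coord_d": len(positions[0]),   # coordinate dimension
        "src": src,                     # spatial reference id, 4326 on the slide
        "type": geom_type,              # geometry type of the layer
        "Extent": [[min(xs), min(ys)], [max(xs), max(ys)]],
    }


# Coordinates copied from the slide's Cities table.
cities_meta = build_layer_metadata("Cities", "Point", [(40.77, 73.98), (41.90, 87.65)])
print(cities_meta)
```

A driver could read such a document once when opening a layer instead of scanning every feature for its type and extent.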
  17. GDAL | GeoJSON for spatial data
      Source table:
      FID   | Geometry            | Name    | States   | Time Zone
      10001 | POINT(40.77, 73.98) | NYC     | New York | UTC-05:00
      10002 | POINT(41.90, 87.65) | Chicago | Illinois | UTC-06:00
      Each table row becomes one GeoJSON document:
      {
        type: "Feature",
        properties: {
          FID: 10002,
          Name: Chicago,
          States: Illinois,
          Time Zone: UTC-06:00
        },
        geometry: {
          type: "Point",
          coordinates: [41.90, 87.63]
        }
      }
      Also shown: the geospatial metadata collection.
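
The row-to-document mapping on this slide can be sketched as follows. `row_to_feature` is a hypothetical helper, not part of the driver, and it assumes point geometries only. The coordinate values are copied from the slide unchanged; note that the GeoJSON specification orders a position as longitude, then latitude.

```python
import json


def row_to_feature(row, geometry_field="Geometry"):
    """Turn a flat table row into a GeoJSON-style Feature document."""
    x, y = row[geometry_field]
    # Every non-geometry column becomes a property of the feature.
    properties = {k: v for k, v in row.items() if k != geometry_field}
    return {
        "type": "Feature",
        "properties": properties,
        "geometry": {"type": "Point", "coordinates": [x, y]},
    }


chicago_row = {
    "FID": 10002,
    "Geometry": (41.90, 87.63),  # values as printed on the slide
    "Name": "Chicago",
    "States": "Illinois",
    "Time Zone": "UTC-06:00",
}
chicago_feature = row_to_feature(chicago_row)
print(json.dumps(chicago_feature, indent=2))
```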
  18. GDAL | GeoJSON for spatial data
      Mapping between MongoDB and GDAL terms:
      - Database = Datasource (examples: U.S.A, Canada, inside the MongoDB cluster)
      - Collection = Layer (examples: States, Cities, Roads, G_sys_Metadata)
      - GeoJSON document = Feature (examples: NYC, Chicago, ……)
      G_sys_Metadata collection:
      c_name | coord_d | src  | type    | Extent
      Cities | 2       | 4326 | Point   | [p1,p2]
      States | 2       | 4326 | Polygon | [p1,p2]
  19. GDAL | FeatureCollection
      Mapping between MongoDB and GDAL terms:
      - Database = Datasource
      - Collection = Dataset
      - FeatureCollection document = Layer
      Diagram labels: MongoDB Cluster; World; Canada, U.S.A, Oceans; Rivers, Cities, States, Rivers, ……
      {
        "type": "FeatureCollection",
        "crs": {…},
        "bbox": [….],
        "features": [
          {
            "type": "Feature",
            "geometry": { "type": "Point", "coordinates": […] },
            "properties": {"prop0": "value0"}
          },
          …
        ]
      }
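
A minimal sketch of assembling such a FeatureCollection document from individual features, assuming two-dimensional Point features only. `make_feature_collection` is a hypothetical helper; the `bbox` layout `[min x, min y, max x, max y]` follows the GeoJSON specification, and the `crs` member shown on the slide is left out.

```python
def make_feature_collection(features):
    """Wrap Point features into one FeatureCollection document with a bbox."""
    xs, ys = [], []
    for feature in features:
        x, y = feature["geometry"]["coordinates"][:2]
        xs.append(x)
        ys.append(y)
    return {
        "type": "FeatureCollection",
        "bbox": [min(xs), min(ys), max(xs), max(ys)],
        "features": list(features),
    }


def point_feature(x, y, **properties):
    """Small convenience constructor used only by this example."""
    return {
        "type": "Feature",
        "geometry": {"type": "Point", "coordinates": [x, y]},
        "properties": properties,
    }


cities_layer = make_feature_collection([
    point_feature(40.77, 73.98, Name="NYC"),      # values as printed on the slides
    point_feature(41.90, 87.65, Name="Chicago"),
])
print(cities_layer["bbox"])
```

Storing a whole layer in one document is what makes the format easy to share, but MongoDB caps a single BSON document at 16 MB, so large layers would need to be split.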
  20. GDAL | Terminology
      RDBMS     | MongoDB             | GeoDatabase | WKT                 | GeoJSON             | FTCL*
      Database  | Database            | Datasource  | Datasource          | Datasource          | Datasource
      Table     | Collection          | Layer       | Layer               | Layer               | Dataset
      Row(s)    | JSON Document       | Feature     | Feature             | Feature             | Layer
      Index     | Index               | R tree      | -                   | Grid Index          | Grid Index
      Join      | Embedding & Linking | Join        | Embedding & Linking | Embedding & Linking | Embedding & Linking
      Partition | Shard               | -           | Shard               | Shard               | Shard
      * FTCL: FeatureCollection, for the GeoJSON format
  21. GDAL | Which is better?
      Features         | WKT              | GeoJSON        | Feature Collection
      Structure        | Flexible & Tight | Semi-          | Semi- & un-
      Spatial Index    | NO               | Grid Index     | Grid Index
      Efficiency       | SLOW             | FAST           | MEDIUM
      Self-explanatory | NO               | YES with semi- | YES
      Easy-sharing     | MEDIUM           | MEDIUM         | CONVENIENT
      Geometry types   | ALL SFA, 18*     | LIMITED, 6**   | LIMITED, 6**
      * http://en.wikipedia.org/wiki/Well-known_text
      ** http://geojson.org/geojson-spec.html
  22. GDAL | Load all sorts of spatial data
      - ogr2ogr
        – Converts simple features data between file formats
        – Supports spatial or attribute selections and reducing the set of attributes
        – Sets the output coordinate system, or even reprojects
        – Extract, Transform, and Load (ETL) tool for MongoDB geospatial data
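
The ogr2ogr options listed on this slide correspond to real command-line flags: `-f` (output format), `-where` (attribute selection), `-select` (reduce the set of attributes) and `-t_srs` (reproject to a target coordinate system). The sketch below only assembles such a command in Python and does not run it. The helper `build_ogr2ogr_command` is hypothetical, and so are the driver name `"MongoDB"` and the destination connection string in the example, since the slides do not state what the driver registers.

```python
import shlex


def build_ogr2ogr_command(dst, src, driver, layer=None,
                          where=None, select=None, t_srs=None):
    """Assemble an ogr2ogr argument list: ogr2ogr [options] dst src [layer]."""
    cmd = ["ogr2ogr", "-f", driver]
    if where:
        cmd += ["-where", where]             # attribute selection
    if select:
        cmd += ["-select", ",".join(select)]  # keep only these fields
    if t_srs:
        cmd += ["-t_srs", t_srs]             # reproject on output
    cmd += [dst, src]
    if layer:
        cmd.append(layer)
    return cmd


command = build_ogr2ogr_command(
    dst="MongoDB:mongodb://localhost/usa",  # hypothetical connection string
    src="cities.shp",                       # hypothetical input file
    driver="MongoDB",                       # hypothetical driver name
    where="States = 'Illinois'",
    select=["Name", "States"],
    t_srs="EPSG:4326",
)
print(shlex.join(command))
```

The resulting list can be handed to `subprocess.run(command, check=True)` once the actual driver name and connection syntax are known.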
  23. Work with various GIS software
  24. MongoDB Works with QGIS
  25. Part 3: A step forward: MongoGIS – Pave the way for the GIS community to play with MongoDB
      - Evolution of spatial database technology
      - Comparison of spatial database solutions
      - Roadmap
  26. 1st Generation | Hybrid Solution
      - Late 1980s and early 1990s
      - RDBMS for attribute data
      - File systems for geometry data
      - A unique feature ID (FID) links the two
      - The ESRI Shapefile is one of the most famous examples
      - Problems with data integrity, multiuser access and editing
      Diagram labels: GIS Application; Standard SQL to Attributes; Geoprocessing to Geometries (files); FID.
  27. 27. IT  20th Century mid 90s  Attributes & Geometries in database  But geometry as binary large object  SDE as a middleware by GIS venders  Geometries are not understandable.  Poor integration, no spatial structure query language 2nd Generation | Spatial Database Engine SDE Attributes Geometries GeometriesGeometries blobsSQL GIS Application
  28. 3rd Generation | Object-based Spatial Database
      - Late 1990s
      - Spatial is a native data type
      - Attributes and geometries all in the database
      - Rich GIS functions built in
      - Supported by major database vendors
      - Spatial data queried using E-SQL
      - Database functionality fully supported
      Diagram labels: GIS; eBusiness; E-SQL; Geometries and Attributes.
  29. Big Data Spreading
      - 2008.9, Nature
      - 2009.1, Google: "Detecting influenza epidemics using search engine query data"
      - 2009.5, UN Global Pulse project: "Big Data for Development: Opportunities & Challenges", a Global Pulse white paper
      - 2009.12, Microsoft: "The Fourth Paradigm: Data-Intensive Scientific Discovery"
      - 2011.2, Science: "Dealing with Data" highlights both the challenges posed by the data deluge and the opportunities that can be realized if we can better organize and access the data
      - 2012.3, The White House: Big Data Initiative, more than $200 million for big data research projects
  30. MongoDB as a High Performance Database
      The first six solution columns are ways to run PostGIS as a cluster; the last is a MongoDB cluster.
      Feature               | Shared Disk Failover | File System Replication | Transaction Log Shipping | Trigger-Based Master-Standby Replication | Statement-Based Replication Middleware | Asynchronous Multi-Master Replication | MongoDB Cluster
      Implementation        | NAS         | DRBD        | Streaming     | Slony-I    | pgpool-II         | Bucardo    | Sharding
      Communication         | Shared Disk | Disk Blocks | WAL           | Table Rows | SQL               | Table Rows | oplog
      No Special Hardware   | ×           | √           | √             | √          | √                 | √          | √
      Data Synchronous      | Sync        | Sync        | Sync, Async   | Async      | Sync              | Async      | Sync
      Replication Method    | ×           | M-S         | M-S           | M-S        | M-M, M-S          | M-M, M-S   | M-M
      No Master Overhead    | √           | ×           | √             | ×          | √                 | √          | √
      Failover No Data Loss | √           | √           | With Sync On  | ×          | √                 | ×          | √
      Failover for HA       | Fast        | Fast        | Fast with Hot | Manual     | Hard to Re-attach | ×          | Fast
      Writes Scalability    | ×           | ×           | ×             | ×          | With M-M          | √          | Good
      Reads Scalability     | ×           | ×           | With Hot      | √          | √                 | √          | Good
      Parallel Query        | ×           | ×           | ×             | ×          | With M-M          | √          | √
      Complexity For Admin  | Low         | Low         | Low           | High       | Very High         | High       | Low
      Load Balancing        | ×           | ×           | ×             | ×          | √                 | ×          | √
  31. MongoDB as a spatial database
      Solutions             | OGC SFA | SQL/MM | GeoJSON | ArcSDE | PostGIS      | Oracle Spatial | MongoDB
      Spatial Data Types    | 17      | 18     | 6       | +++    | ++           | ++             | GeoJSON
      Spatial Reference     | --      | --     | --      | +++    | +++          | +++            | WGS84
      Spatial Index         | --      | --     | --      | R tree | GiST, R tree | R tree         | GeoHash
      Geometry I/O          | √       | √      | --      | +++    | +++          | ++             | ×
      Geometry Accessors    | √       | √      | --      | +++    | ++           | ++             | ×
      Geometry Editors      | --      | --     | --      | +++    | ++           | +              | ×
      Topological Info      | --      | √      | --      | +++    | ++           | +++            | ×
      Spatial Measurements  | √       | √      | --      | +++    | ++           | ++             | ×
      Geo-processing        | √       | √      | --      | +++    | ++           | ++             | ×
      Spatial Relationships | √       | √      | --      | +++    | ++           | ++             | 4
      GIS Tech Ecosystems   | --      | --     | --      | +++    | +++          | +              | ×
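
For context on the MongoDB column, the sketch below builds filter documents for two of MongoDB's geospatial query operators, `$geoWithin` and `$near`, using plain Python dictionaries. The helper names and the `geometry` field name are assumptions made for illustration, and which operators the slide counts among its four spatial relationships is not stated. Coordinates in these queries are longitude first, then latitude, in WGS84.

```python
def within_polygon_filter(ring, field="geometry"):
    """Filter for documents whose geometry lies inside a polygon.

    ring: list of (lon, lat) positions; the first and last must be equal.
    """
    if len(ring) < 4 or tuple(ring[0]) != tuple(ring[-1]):
        raise ValueError("a polygon ring needs at least 4 positions and must be closed")
    polygon = {"type": "Polygon", "coordinates": [[list(p) for p in ring]]}
    return {field: {"$geoWithin": {"$geometry": polygon}}}


def near_point_filter(lon, lat, max_distance_m, field="geometry"):
    """Filter for documents near a point, nearest first ($near needs a geospatial index)."""
    point = {"type": "Point", "coordinates": [lon, lat]}
    return {field: {"$near": {"$geometry": point, "$maxDistance": max_distance_m}}}


# Illustrative values only: a small box and a 5 km radius around one point.
box = [(-88.0, 41.5), (-87.0, 41.5), (-87.0, 42.5), (-88.0, 42.5), (-88.0, 41.5)]
print(within_polygon_filter(box))
print(near_point_filter(-87.65, 41.90, 5000))
```

With a driver such as pymongo, these dictionaries would be passed to a collection's `find()` after creating a `2dsphere` index on the geometry field.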
  32. MongoGIS on GitHub
      - GDAL driver for MongoDB
        – The way that MongoDB plays with the GIS community
        – Working with the GDAL community to have it included in the next release
        – Open source: https://github.com/mongogis/mongodb-gdal-driver
      - MongoGIS
        – The next-generation infrastructure for the GIS community
        – MongoGIS group on GitHub: https://github.com/mongogis
        – We may build it together!
  33. Thank you for your time!
      Sponsored by the China Scholarship Council for a one-year program at UIUC, Illinois, USA.
      Supported by the Scientific Research Foundation of the Graduate School of Nanjing University.
      Many thanks to Craig Wilson and Greg Steinbruner for their valuable advice.