Reading and writing spatial data for the non-spatial programmer

Chad Cooper
Center for Advanced Spatial Technologies
University of Arkansas, Fayetteville
For help, visit: http://cast.uark.edu | chad@cast.uark.edu



The Problem

Location has become ubiquitous in today’s society and is integral to everything from web applications to smartphone apps to automotive navigation systems. Spatial data, often derived from Geographic Information Systems (GIS), drives these applications at their core. More and more, non-spatial developers and programmers with little or no knowledge of spatial data formats are being tasked with working with and consuming spatial data in their applications. Spatial data exists in a wide variety of formats, which often adds to the confusion and complexity. We need to be able to make sense of the formats and read/write them with Python.

The Solutions

Fortunately, Python is tightly integrated, accepted, and used within the GIS community, and has been for some time. Python packages, and other libraries that are accessible through Python, exist to both read and write many common (and some not so common) spatial data formats. With the help of these packages and libraries, Python developers can manipulate, read, and write many spatial data formats.

Spatial data formats (a [very] small sampling)

File-based

Shapefile
    Vector storage of points, lines, polygons. Open specification. Around since the early 1990s. Very common, widely used, and widely available. Made up of at least 3 files: .shp, .shx, .dbf

Raster
    Columns and rows of data. Think elevation data or land use where each cell stores a single value. Can be ASCII or binary (TIFF, JPEG, etc.).

KML (.kml, .kmz)
    Keyhole Markup Language. XML notation. Can be used in Google Earth and Google Maps. Can be simple with just geographic data or complex with styling. Vector or raster.

        <kml xmlns:gx="http://www.google.com/kml/ext/2.2" xmlns:atom="http://www.w3.org/2005/Atom" xmlns="http://www.opengis.net/kml/2.2">
          <Placemark>
            <name>PyCon 2012</name>
            <Point>
              <coordinates>-121.9751,37.4044,0</coordinates>
            </Point>
          </Placemark>
        </kml>

LiDAR (.las, .laz)
    Light Detection and Ranging. Distance to an object is measured by illuminating the target with light, often from a laser. Point clouds, elevation data.

Geodatabases

PostGIS
    Spatially enables PostgreSQL. FOSS. Stores vector and raster data. Native Python support.

SpatiaLite
    Open source library that extends SQLite to support spatial capabilities. Stores vector and raster data.

File Geodatabase
    Esri proprietary file-based storage system. C++ API recently released (could be wrapped with SWIG to use with Python). Stores vector and raster data.

(Some) Libraries for working with spatial data

GDAL/OGR
    Translator library (C++). GDAL: raster data. OGR: vector data. Python bindings available. Open source.

        from osgeo import ogr
        driver = ogr.GetDriverByName("ESRI Shapefile")
        ds = driver.Open("world.shp")
        layer = ds.GetLayer()
        feat_count = layer.GetFeatureCount()
        extent = layer.GetExtent()

pyshp
    Pure Python library for reading and writing Esri shapefiles. Compatible with Python 2.4 to 3.x. Open source.

        >>> import shapefile
        >>> sf = "Farms"
        >>> sfr = shapefile.Reader(sf)
        >>> sfr.fields
        [['ID', 'C', 254, 0],
        ['Lat', 'F', 19, 11],
        ['Lon', 'F', 19, 11],
        ['Farm_Name', 'C', 254, 0]]

shapely
    Python package for programming with 2D geospatial geometries. Perform PostGIS type geometry operations outside of an RDBMS. Open source.

        >>> from shapely.geometry import LineString, Point
        >>> coords = [(0, 0), (1, 1)]
        >>> LineString(coords).contains(Point(0.5, 0.5))
        True
        >>> Point(0.5, 0.5).within(LineString(coords))
        True

pyproj
    Performs cartographic transformations and geodetic computations. Convert from lat/lon to x/y or between projected coordinate systems. Perform Great Circle computations. Wraps PROJ.4 library.

        >>> import pyproj
        >>> lat1, lon1 = (36.076040, -94.137640)
        >>> lat2, lon2 = (37.404473, -121.975150)
        >>> geod = pyproj.Geod(ellps="WGS84")
        >>> # inv() takes lon/lat order and returns two azimuths and a distance in meters
        >>> angle1, angle2, distance = geod.inv(lon1, lat1, lon2, lat2)
        >>> print("It's %0.0f miles to Chad's house from PyCon 2012." % (distance * 0.000621))
        It's 1541 miles to Chad's house from PyCon 2012.

libLAS
    C/C++ library for reading and writing the LAS LiDAR format. A building block for developers looking to implement their own LiDAR data processing. Open source. Python API.

        from liblas import file
        f = file.File('file.las', mode='r')
        for p in f:
            print('X,Y,Z: %s %s %s' % (p.x, p.y, p.z))

PySqLite
    Python binding for the SQLite database engine.

        from pysqlite2 import dbapi2 as sqlite
        conn = sqlite.connect('test.db')
        # Allow SQLite extensions to be loaded, then load SpatiaLite
        conn.enable_load_extension(True)
        conn.execute('SELECT load_extension("libspatialite.dll")')
        cursor = conn.cursor()

pyKML
    Python package for creating, parsing, manipulating, and validating KML. Open source.

        import os
        from lxml import etree
        from pykml.factory import KML_ElementMaker as KML
        doc = KML.kml(
            KML.Placemark(
                KML.name('PyCon 2012'),
                KML.Point(
                    KML.coordinates('-121.9751,37.4044,0'),
                ),
            ),
        )
        # Write the KML next to this script, swapping the .py extension for .kml
        outfile = open(os.path.splitext(__file__)[0] + '.kml', 'wb')
        outfile.write(etree.tostring(doc, pretty_print=True))
        outfile.close()

geodjango
    Add-on for Django that turns it into a geographic Web framework. Cross platform.

mapnik
    C++ open source toolkit for developing mapping applications. Python bindings. For desktop and web development. Uses shapefile, PostGIS, GDAL/OGR datasources.

        import mapnik
        m = mapnik.Map(500, 500)
        ...
        s = mapnik.Style()
        ds = mapnik.Shapefile(file="world.shp")
        ...
        mapnik.render_to_file(m, "world.png", "png")

arcpy
    Site package for performing geographic data analysis, data conversion, data management, and map automation with Python. Not open source. Integrated with Esri ArcGIS suite.

        import arcpy
        import os
        gdb = r'C:\data\example.gdb'  # hypothetical path; the poster does not define gdb
        arcpy.env.workspace = gdb     # ListFeatureClasses() lists the current workspace
        fc_list = arcpy.ListFeatureClasses()
        for in_fc in fc_list:
            desc = arcpy.Describe(in_fc)
            if desc.shapeType == 'Point':
                # hypothetical output name; the poster does not define out_fc
                out_fc = os.path.join(gdb, in_fc + '_buffer')
                arcpy.Buffer_analysis(in_fc, out_fc, 3000)
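
Because KML is plain XML, the Placemark sample in the formats listing can also be read without any spatial library. The following is a minimal sketch, assuming only Python 3's standard library module xml.etree.ElementTree. The helper name read_placemarks and the KML_SAMPLE string are hypothetical and exist only for illustration. KML stores each coordinate as lon,lat[,alt].

```python
import xml.etree.ElementTree as ET

# Sample document mirroring the Placemark shown in the formats listing.
KML_SAMPLE = """<kml xmlns="http://www.opengis.net/kml/2.2">
  <Placemark>
    <name>PyCon 2012</name>
    <Point>
      <coordinates>-121.9751,37.4044,0</coordinates>
    </Point>
  </Placemark>
</kml>"""

# KML 2.2 elements live in this XML namespace.
KML_NS = "http://www.opengis.net/kml/2.2"
NS = {"kml": KML_NS}


def read_placemarks(kml_text):
    """Return (name, lon, lat, alt) tuples for every point Placemark.

    Hypothetical helper for illustration only: it ignores lines,
    polygons, and styling.
    """
    root = ET.fromstring(kml_text)
    results = []
    for placemark in root.iter("{%s}Placemark" % KML_NS):
        name = placemark.findtext("kml:name", default="", namespaces=NS)
        coords = placemark.findtext("kml:Point/kml:coordinates", namespaces=NS)
        if coords is None:
            continue  # not a point Placemark
        parts = [float(value) for value in coords.strip().split(",")]
        lon, lat = parts[0], parts[1]
        alt = parts[2] if len(parts) > 2 else 0.0  # altitude is optional
        results.append((name, lon, lat, alt))
    return results


placemarks = read_placemarks(KML_SAMPLE)
print(placemarks)
```

For anything beyond simple point Placemarks (styling, validation, building documents), a dedicated package such as pyKML is likely the better fit.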

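The pyproj example measures distance on the WGS84 ellipsoid. As a rough cross-check that needs no third-party package, the same great-circle idea can be sketched with the haversine formula on a sphere. This is an approximation and not what pyproj does internally. The mean Earth radius of 6,371,008.8 m and the function name haversine_m are assumptions made for illustration, and a spherical result can differ from the ellipsoidal one by a few tenths of a percent.

```python
import math

EARTH_RADIUS_M = 6371008.8     # assumed mean Earth radius in meters (spherical model)
METERS_TO_MILES = 0.000621371  # statute miles per meter


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two lat/lon points in degrees.

    Hypothetical helper: treats the Earth as a sphere, so it only
    approximates an ellipsoidal (WGS84) geodesic distance.
    """
    phi1 = math.radians(lat1)
    phi2 = math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


# Same two points as the pyproj example on the poster.
lat1, lon1 = (36.076040, -94.137640)
lat2, lon2 = (37.404473, -121.975150)
miles = haversine_m(lat1, lon1, lat2, lon2) * METERS_TO_MILES
print("Roughly %0.0f miles on a spherical Earth." % miles)
```

Note the argument order: this sketch takes latitude first, whereas pyproj's Geod.inv takes longitude first.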