Geospatial web services using little-known GDAL
features and modern Perl middleware
Ari Jolma
FOSS4G 2016
Bonn, August 25, 2016
About me
● Academic, Engineer, Scientist
● Previously at Aalto University
– Water resources, geoinformatics, environmental informatics
● Now at Finnish Environment Institute
– Marine Spatial Planning
● Just another Perl hacker
● GDAL user and developer, maintainer of the Perl interface (Geo::GDAL)
● Github and CPAN: ajolma
● ari.jolma at gmail.com
Overview of the talk
● Geospatial web services
– Programs (on servers) accessed by other programs (on clients)
● Actually a stack of programs and code on the server
– Data discovery, data visualization, data access
– Standardized by The Open Geospatial Consortium
● Little-known GDAL features
– Redirection of /vsistdout/
– Utilities as functions (not so obscure but new)
● Modern Perl middleware
– Software based on PSGI
GDAL virtual file system (VSI)
● Many GDAL drivers read from and write to files
● GDAL has virtualized (added a layer of code) the file I/O
● Benefits of that are
– things can be done to the data before it is read/written
– sources and sinks that are not really files can be made to look like files to
the drivers
● Read/write to
– ZIP file
– HTTP server
– Memory buffer
– See Even Rouault's blog post from May 18, 2012
Redirection of /vsistdout/
● /vsistdout/ is GDAL's virtualized stdout (it uses fwrite to write to
stdout)
● GDAL provides an API function to change fwrite and stdout
– void VSIStdoutSetRedirection(VSIWriteFunction fct, FILE* stream)
● I built a mechanism into the GDAL Perl interface that exploits
this, so that
– One can use a specific Perl object as the data source (sink) name when
creating a dataset object
– The only requirement is that the Perl object in question belongs to a
class that has write and close methods
– A related bug (VSIFCloseL does not flush the file, #6149) was fixed for
1.11.4, 2.0.1, and 2.1
Sidenote
● Even: ”/vsistdout/ redirection was developed for
MapServer, for FCGI and streamable OGR
output”
– https://github.com/mapserver/mapserver/commit/2c5aad92a1e441fea8b08e38c995f8c3f831473a
PSGI
● Perl Web Server Gateway Interface Specification
– Tatsuhiko Miyagawa, 2009
● Inspired by Python's Web Server Gateway Interface (WSGI) specification, Ruby's Rack specification, and
the JavaScript Gateway Interface (JSGI) specification
● Covers plain old CGI and mod_perl but allows better solutions (frameworks)
– CGI.pm has been removed from the core Perl 5 distribution (from 5.22)
– mod_perl is an Apache-specific solution, and servers like Nginx are gaining popularity
● Defines Middleware as a program (server), which takes an application and
provides an environment for it to run
– Both the application and the server are written in Perl
– The application is a code reference (a subroutine)
– The server prepares the input for the application and post-processes its output
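As a minimal sketch of "the application is a code reference": the subroutine below takes the environment hash and returns the three-element response that PSGI expects. It uses only core Perl; the path and the hand-made environment are illustrative assumptions, not taken from the services described in this talk.

```perl
use strict;
use warnings;

# A PSGI application is a code reference that receives the environment
# hash and returns [ status, [ headers ], [ body chunks ] ].
my $app = sub {
    my ($env) = @_;
    my $path = $env->{PATH_INFO} // '/';
    return [
        200,
        [ 'Content-Type' => 'text/plain' ],
        [ "You asked for $path\n" ],
    ];
};

# Call the application directly with a hand-made environment; a real
# server such as Starman would build the environment from the HTTP request.
my $response = $app->({ REQUEST_METHOD => 'GET', PATH_INFO => '/Eurajoki' });
print "$response->[0] $response->[2][0]";
```

Because the application is just a subroutine, it can be exercised like this without any web server, which is also what makes it easy to wrap in middleware.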
Typical(? my) setup
● Nginx web server
– Provides various root level locations for websites on a server
– A location may be Perl services
● Example: http://biwatech.com:800/Eurajoki
● Configured with proxy_pass
● Perl services are provided with Starman (high-performance
preforking PSGI web server)
– Starman workers read a configuration file
● Which is Perl code and mainly links URLs to methods of application
objects
● For geospatial web services, in my case, these are objects of
Geo::OGC::Service::* classes
Plack
● Perl Superglue for Web frameworks and Web Servers (PSGI toolkit)
● Plack is a set of tools for using the PSGI stack. It contains middleware
components, a reference server and utilities for Web application
frameworks. Plack is like Ruby's Rack or Python's Paste for WSGI.
● Plack::Builder
– For the Starman configuration file
● Plack::Component
– Superclass of application classes (Geo::OGC::Service)
● Plack::Request
– The request coming from the client
OGC Services (from a hacker's point of view)
● Many ways to access: GET, POST, REST
● Need to be configured, often quite extensively
● Similar requests and responses
● CORS
– Separate user interface and data provision
● Use of XML for message exchange
● Geo::OGC::Service class
[Diagram: Nginx, Starman, service.psgi, Geo::OGC::Service object, Geo::OGC::Service::X object, Plack::Builder and Plack::Component, connected by "uses" and "is-a" relations; the entry point is labelled "Call from Internet"]
Message exchange
● Geo::OGC::Service class overrides the call method of Plack::Component class
– And returns a reference to an anonymous subroutine
● Which calls the respond method of Geo::OGC::Service with two arguments: responder and environment
● Plack gives environment to the call method
● Plack gives responder to the anonymous subroutine
● environment contains the message from the client
● The service (us) uses the responder to send a message to the client
– Either in one call (status, MIME type, content) or in a delayed fashion, in which case it gets an object with which
● It (we) may use a write method
● Plack will call a close method
● So, to send content to the client we need an object that has these two methods, and we
can have it because of the /vsistdout/ redirection in GDAL
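The delayed flow described above can be sketched in core Perl as follows. MockWriter is a hypothetical stand-in for the object a PSGI server would hand out; it only collects the chunks, so the exchange can be followed without Plack or GDAL installed. In this sketch the application closes the writer itself, as the PSGI specification allows.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the writer object a PSGI server provides.
# It has the two required methods: write and close.
package MockWriter;
sub new   { my ($class) = @_; return bless { chunks => [], closed => 0 }, $class }
sub write { my ($self, $chunk) = @_; push @{ $self->{chunks} }, $chunk; return }
sub close { my ($self) = @_; $self->{closed} = 1; return }

package main;

# Delayed (streaming) application: instead of a response array it
# returns a subroutine, which the server later calls with a responder.
my $app = sub {
    my ($env) = @_;
    return sub {
        my ($responder) = @_;
        # Passing only status and headers makes the responder return a writer.
        my $writer = $responder->([ 200, [ 'Content-Type' => 'text/plain' ] ]);
        $writer->write("first chunk\n");
        $writer->write("second chunk\n");
        $writer->close;
    };
};

# Play the role of the server: build a responder and run the application.
my $writer    = MockWriter->new;
my $responder = sub { my ($response) = @_; return $writer };
$app->({ REQUEST_METHOD => 'GET' })->($responder);

print join('', @{ $writer->{chunks} });
```

Any object with these two methods fits the same slot, which is why an object of this kind can also serve as the sink for a redirected /vsistdout/.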
OK, nice, but what have I concretely done that
others can enjoy, fix, or extend?
● The mechanism in the GDAL Perl interface, and a request to fix the bug in
GDAL VSI
● Geo::OGC::Service class
– + 4 classes for common things (CORS response and constructing XML
messages)
● Geo::OGC::Service::WFS class
– GetCapabilities, GetFeature, Transaction
– Mostly for the PostgreSQL driver
● Geo::OGC::Service::WMTS class
– WMTS and TMS Tile service
– Tiles can be files or made on the fly
Why?
● Because it could be done and I had the time
● To demonstrate a perhaps better way to build these
services using existing tools and a modular approach
● To learn PSGI and Plack
– Most of the code was originally written for plain old CGI
● To test some extensions
– WFS: ”pseudo credentials”
– On-the-fly processing of raster tiles
WFS ”pseudo credentials”
● The requirement:
– ”The users of a web mapping page should be able to submit
geographical objects without first creating a user account.
However, they should be able to edit the objects later.”
– I.e., the ”account” should be created on-the-fly
● The solution:
– Specific attributes (username, password) are not included in
the published attribute list but they can be used in the filter
and are required for a transaction
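One way to picture the solution is the core-Perl sketch below. All names here (published_attributes, may_edit, the sample feature) are hypothetical and the logic is a simplification made for illustration; it is not the code of the WFS module.

```perl
use strict;
use warnings;

# Hypothetical names throughout; a sketch of the idea only.
my %HIDDEN = map { $_ => 1 } qw(username password);

# Attributes advertised to clients: everything except the hidden ones.
sub published_attributes {
    my (@columns) = @_;
    return grep { !$HIDDEN{$_} } @columns;
}

# A transaction on an existing feature is allowed only if the request
# supplies both hidden attributes and they match the stored values.
sub may_edit {
    my ($feature, $supplied) = @_;
    for my $key (keys %HIDDEN) {
        return 0 unless defined $supplied->{$key}
                     && defined $feature->{$key}
                     && $supplied->{$key} eq $feature->{$key};
    }
    return 1;
}

my %feature = (id => 7, name => 'Dock', username => 'anna', password => 's3cret');
my @public  = published_attributes(sort keys %feature);
my $ok      = may_edit(\%feature, { username => 'anna', password => 's3cret' });
my $denied  = may_edit(\%feature, { username => 'anna', password => 'wrong' });

print "published: @public\n";
print "matching credentials: $ok, wrong password: $denied\n";
```

The "account" exists only as two attribute values stored with the feature when it is first submitted, so no separate user table is needed.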
On-the-fly production and
processing of raster tiles
● Geo::OGC::Service::WMTS has a make_tile method, which gets the request and the configuration
data of the requested resource
● The resource is opened as a GDAL dataset
● The extent of the tile is computed
● If the resource is configured with processing
– The tile extent is expanded by two pixels
– Translate is called on the dataset to clip the tile from the source into memory (/vsimem/)
– Processing is applied to the memory tile dataset (DEMProcessing)
– The dataset is replaced with the result
– The tile extent is shrunk by two pixels
– Note the use of GDAL utilities as functions in Perl
● Translate is called on the dataset with a writer object obtained from the responder
● Examples
– http://www.pyhajarvensuojelu.net/valuma-aluetyo/ (click on Karttatasojen valinta → Lisätasot → Maasto)
– http://arijolma.org/SmartSea/ (experimental, will change, suboptimal WMTS + no processing based on dataset)
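The extent arithmetic in the steps above can be sketched in core Perl. The slides do not say which tile grid is used, so this assumes a Web Mercator grid with 256 x 256 pixel tiles and row 0 at the top; the function names are hypothetical. The two-pixel margin is presumably there so that neighbourhood operations have source data at the tile edges.

```perl
use strict;
use warnings;

# Assumptions (not from the slides): Web Mercator grid, 256 x 256 pixel
# tiles, tile row 0 at the top.
my $HALF_WORLD = 20037508.342789244;   # half the Web Mercator world width, metres
my $TILE_SIZE  = 256;

# Return (minx, miny, maxx, maxy, pixel_size) of tile ($zoom, $col, $row).
sub tile_extent {
    my ($zoom, $col, $row) = @_;
    my $tile_width = 2 * $HALF_WORLD / 2**$zoom;
    my $minx = -$HALF_WORLD + $col * $tile_width;
    my $maxy =  $HALF_WORLD - $row * $tile_width;
    return ($minx, $maxy - $tile_width, $minx + $tile_width, $maxy,
            $tile_width / $TILE_SIZE);
}

# Grow (positive $pixels) or shrink (negative $pixels) an extent on all sides.
sub expand_extent {
    my ($pixels, $minx, $miny, $maxx, $maxy, $pixel_size) = @_;
    my $d = $pixels * $pixel_size;
    return ($minx - $d, $miny - $d, $maxx + $d, $maxy + $d, $pixel_size);
}

my @tile     = tile_extent(1, 0, 0);         # top-left tile at zoom level 1
my @expanded = expand_extent(2, @tile);      # margin added before processing
my @restored = expand_extent(-2, @expanded); # margin removed afterwards

printf "tile:     %.2f %.2f %.2f %.2f\n", @tile[0 .. 3];
printf "expanded: %.2f %.2f %.2f %.2f\n", @expanded[0 .. 3];
```

The expanded extent is what would be clipped into memory and processed; shrinking by the same amount afterwards gives back the extent of the tile that is actually sent to the client.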
Latest (experimental)
● Hook for a generic processor
– Keyword (parameter) 'processor' for the constructor of
Geo::OGC::Service object, which should be a
processor object
– Geo::OGC::Service object gives the processor object
to the service object, when it creates it
– The service object calls it when preparing the response
● For example, Geo::OGC::Service::WMTS calls it with a GDAL Dataset
object and a Tile object
– Simple example
Availability
● Github:
– https://github.com/ajolma?tab=repositories
– https://travis-ci.org/ajolma
● CPAN
– https://metacpan.org/author/AJOLMA
● Artistic License 2.0
– Standard for Perl modules
Danke schön!
ari.jolma at gmail.com
