PyWPS: a tutorial for beginners and developers

PyWPS presentation for beginners and developers. Basic explanation of how to set up a process and of the basic configuration. Examples of how to reprogram PyWPS and how to use the mod_python and Tomcat implementations. OpenLayers client. WSDL and SOAP.



Usage Rights

© All Rights Reserved

Presentation Transcript

  • PyWPS a tutorial for beginners and developers Jorge de Jesus (Plymouth Marine Laboratory) Luca Casagrande (Università degli Studi di Perugia) Jachym Čepicky (Help Service – Remote Sensing Company)
  • Before starting: please put your OSGeo live CD or USB stick into your laptop and start your machine.
  • Program. Introduction part: WPS standard and PyWPS. Newbie part: installation, setup and first process. Developers part: PyWPS in detail, mod_python, Jython, GRASS. Gallery: examples of applications using PyWPS.
  • Introduction Section
  • Install the tutorial Open Firefox and download the script from the pyWPS main page: Make it executable (password is user) : sudo chmod +x Start the script: sudo ./ You can edit files using nano, but remember to always use sudo (password is user) .
  • Definitions. Web Processing Service (WPS) is an OGC standard protocol for making GIS calculations available over the internet. The Open Geospatial Consortium (OGC) is a non-profit, international, voluntary consensus standards organization that is leading the development of standards for geospatial and location-based services.
  • WPS standard: "... provides rules for standardizing inputs and outputs (requests and responses) for geospatial processing services, such as polygon overlay. The standard also defines how a client can request the execution of a process, and how the output from the process is handled."
  • Invoking WPS server
    • HTTP GET method with key-value-pairs encoded request (KVP)
    • HTTP POST method, using XML format of the request
    • SOAP
    For this tutorial we will use just the HTTP GET method
  • Key-value-pair request: http://localhost/cgi-bin/ Here http://localhost/cgi-bin/ is the server address. The ? sign indicates that the request parameters start. service=WPS&request=GetCapabilities is the KVP-encoded request. We send two request parameters to the server: service, which we set to WPS, and request, which is set to GetCapabilities.
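The pieces of a KVP request described above can be put together with the Python standard library. This is a minimal sketch written for Python 3; the endpoint `http://localhost/cgi-bin/pywps.cgi` and the helper name `build_kvp_url` are assumptions for illustration, not part of PyWPS.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical endpoint; replace it with the address of your own WPS script.
SERVER = "http://localhost/cgi-bin/pywps.cgi"


def build_kvp_url(server, **params):
    """Join the server address and the KVP-encoded parameters with '?'."""
    return server + "?" + urlencode(params)


url = build_kvp_url(SERVER, service="WPS", request="GetCapabilities")
print(url)

# Splitting the URL again recovers the two request parameters.
query = parse_qs(urlsplit(url).query)
```

The same helper can build DescribeProcess or Execute requests by passing further keyword arguments such as `version` and `identifier`.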
  • XML request. In this case, the request is encoded in XML form and sent to the server directly via HTTP POST (the WPS server will read the file from standard input directly).
    <?xml version="1.0" encoding="UTF-8"?>
    <ows:GetCapabilities xmlns:ows="...">
      <ows:AcceptVersions>
        <ows:Version>1.0.0</ows:Version>
      </ows:AcceptVersions>
    </ows:GetCapabilities>
  • WPS operations
    • GetCapabilities: returns service-level metadata
    • DescribeProcess: returns a description of a process, including its inputs and outputs
    • Execute: returns the output(s) of a process
    • Despite the specification that requests should be case insensitive, it is recommended to use the upper camel case standard in all sorts of WPS operation requests
  • GetCapabilities. Mandatory parameters: service, request. The server returns a basic Capabilities document. Among other things it contains:
    • ServiceIdentification
    • Service provider (Name, Organization, Address, ...)
    • ProcessOfferings - List of available processes
  • DescribeProcess. Mandatory parameters: version, and the name of the process or the "all" keyword. The ProcessDescriptions document contains a detailed description of the selected processes. Each process is identified by:
    • Title
    • Identifier
    • DataInputs
    • DataOutputs
    Example: request=DescribeProcess&identifier=all
  • Execute. Mandatory parameters: no mandatory parameters. Example: request=Execute&identifier=literalprocess&datainputs=[int=1;float=3.2;zeroset=0;string=spam]
  • KVP inputs encoding
    • A semicolon (;) shall be used to separate one input from the next
    • An equal sign (=) shall be used to separate an input name from its value and attributes, and an attribute name from its value
    • An at symbol (@) shall be used to separate an input value from its attributes and one attribute from another.
    • All field values and attribute values shall be encoded using the standard Internet practice for encoding URLs
    • PyWPS also supports the use of [ ] to group the datainputs as follows: datainputs=[int=1;float=3.2]
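The encoding rules above (`;` between inputs, `=` between a name and its value, `@` before each attribute, URL-escaping of values) can be sketched in a few lines of Python 3. The function names are hypothetical and this is not PyWPS's own parser; the `mimeType` attribute in the usage lines is only an example attribute name.

```python
from urllib.parse import quote, unquote


def encode_datainputs(inputs):
    """Encode {name: value} or {name: (value, {attr: attr_value})} as KVP."""
    parts = []
    for name, item in inputs.items():
        value, attrs = item if isinstance(item, tuple) else (item, {})
        part = f"{name}={quote(str(value), safe='')}"
        for attr, attr_value in attrs.items():
            part += f"@{attr}={quote(str(attr_value), safe='')}"
        parts.append(part)
    return "[" + ";".join(parts) + "]"


def decode_datainputs(text):
    """Parse a datainputs string back into {name: (value, attributes)}."""
    text = text.strip()
    if text.startswith("[") and text.endswith("]"):
        text = text[1:-1]
    result = {}
    for part in filter(None, text.split(";")):
        head, *attr_parts = part.split("@")
        name, _, value = head.partition("=")
        attrs = {}
        for attr_part in attr_parts:
            key, _, attr_value = attr_part.partition("=")
            attrs[key] = unquote(attr_value)
        result[name] = (unquote(value), attrs)
    return result


encoded = encode_datainputs({"int": 1, "float": 3.2, "string": "spam & eggs"})
decoded = decode_datainputs("[int=1;float=3.2;data=abc@mimeType=text/xml]")
```

Escaping each value before joining is what keeps a `;`, `=` or `@` inside a value from being mistaken for a separator.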
  • Description of Data Inputs and Outputs Three types of inputs and outputs are defined in the OGC standard. LiteralData, ComplexData and BoundingBox data.
  • LiteralData. LiteralData can be any character string, float, date, etc., normally described as a primitive datatype in the W3C XML standard. The WPS standard also allows the use of UOM (units of measure), default values and AllowedValues.
  • ComplexData. The ComplexData type is used for passing complex data (vector, raster or other) to the server, or for obtaining it as the result of the process. There are two ways in which this complex data is handled:
    • Either you send it directly as part of the request to the server, or you obtain it as part of the XML response from the server.
    • Or you send or obtain just a reference to the data: a URL to the file or service from which the data can be downloaded.
  • Bounding box data. It is used to describe a bounding box area. The input description must state:
    • the default coordinate reference system (CRS) used
    • other CRS supported
  • PyWPS is:
    • A framework for the implementation of WPS 1.0.0
    • Written entirely in Python 2.6
    • Organized into packages and classes for easy maintenance
    • Open to new developments and functionalities that can be integrated in WPS 1.0.0
    • Run from bash or as a CGI process
  • PyWPS is not:
    • An analytical tool or engine. It does not perform any type of geospatial calculation
    • An XML parser or generator. It does not validate GML against given schemas (yet), and it does not build GML from Python objects
    • A framework for building general web applications, like CherryPy
  • PyWPS history 2006
    • First PyWPS 1.0.0 release, a project was born
    • First presentation held at FOSS4G 2006 Lausanne: "GRASS goes web: pyWPS"
    • PyWPS 2.0.0 is released, supporting WPS 0.4.0
  • PyWPS history 2008. PyWPS 3.0.0 is released:
    • Support for WPS 1.0.0
    • New simple configuration files
    • Support for multiple WPS servers with one pyWPS Installation
    • Support for internationalization
    • Simple code structure
    • New examples of processes
  • PyWPS history 2009. PyWPS 3.1.0 is released with new features:
    • Up-to-date examples
    • New generic WPS JavaScript library
    • Multiple fixes in both source code and templates
    • New-style complex input and output objects
    • Tons of bugs fixed
  • PyWPS history 2010. PyWPS is recommended as THE WPS tool in the GIGAS project (GEOSS, INSPIRE and GMES an Action in Support). As explained in the "GIGAS Technology Watch Report WPS": "PyWPS Web Processing Service: is a Python program which implements the OGC WPS 1.0.0 standard (with a few omissions). PyWPS was chosen as it is up to date with the WPS standard and has a low footprint, making it easy to install on most Linux systems."
  • Newbies Section
  • Installation; configuration file; PyWPS instance / wrapper script; first process
  • Installation requirements: Python (currently 2.5, 2.6); python-xml package; htmltmpl engine (older versions)
  • For cool stuff you may need: GRASS-GIS
  • Install files can be found in: SVN access to the latest code: Latest package: svn checkout
  • Clean install. There are DEB and RPM packages, but they are badly maintained :(
    > tar -xvzf /tmp/pywps-VERSION.tar.gz
    > cd pywps-VERSION
    > python install
  • <Installation from script, FOSS4G>
  • Testing the installation by running the (/usr/bin) script. If everything is OK:
    > /usr/bin/
    PyWPS NoApplicableCode: Locator: None; Value: No query string found.
    Content-type: text/xml
    <?xml version="1.0" encoding="utf-8"?>
    <ExceptionReport version="1.0.0" xmlns="..." ...>
      <Exception exceptionCode="NoApplicableCode">
        <ExceptionText>No query string found.</ExceptionText>
      </Exception>
    </ExceptionReport>
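A client can recognise this kind of error reply by parsing the XML part of the output. The sketch below is a Python 3, standard-library illustration only. The namespace URI in the sample is an assumption (the OWS 1.1 namespace), so the parser deliberately compares local tag names and does not depend on it; the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Sample response modelled on the output above. The namespace URI is an
# assumption (the OWS 1.1 namespace); the parser below does not depend on it.
SAMPLE = """<?xml version="1.0" encoding="utf-8"?>
<ExceptionReport version="1.0.0" xmlns="http://www.opengis.net/ows/1.1">
  <Exception exceptionCode="NoApplicableCode">
    <ExceptionText>No query string found.</ExceptionText>
  </Exception>
</ExceptionReport>"""


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def parse_exception_report(xml_text):
    """Return a list of (exceptionCode, text) pairs, or [] if not a report."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    if local_name(root.tag) != "ExceptionReport":
        return []
    found = []
    for element in root.iter():
        if local_name(element.tag) == "Exception":
            texts = [child.text.strip() for child in element
                     if local_name(child.tag) == "ExceptionText" and child.text]
            found.append((element.get("exceptionCode"), " ".join(texts)))
    return found


errors = parse_exception_report(SAMPLE)
```

When the script is run from the shell, the status line and the `Content-type` header precede the XML and would need to be stripped before parsing.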
  • The configuration file for PyWPS can be located in several places. There are global and local PyWPS configuration files. Local files overwrite the global ones. Global: /etc/pywps.cfg and /usr/local/pywps-VERSION/etc/pywps.cfg. Local: any path defined in the PYWPS_CFG environment variable.
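The "local overwrites global" rule can be illustrated with Python 3's `configparser`, which reads a list of files in order and lets later files override earlier ones. This is a sketch of the idea, not PyWPS's own loader; `load_config` and the temporary file names are hypothetical.

```python
import configparser
import os
import tempfile


def load_config(global_paths, env=os.environ):
    """Read global files first, then the PYWPS_CFG file so it wins."""
    parser = configparser.ConfigParser()
    parser.optionxform = str  # keep option names case-sensitive
    paths = list(global_paths)
    if env.get("PYWPS_CFG"):
        paths.append(env["PYWPS_CFG"])
    parser.read(paths)  # missing files are skipped; later files override
    return parser


# Demonstration with two throw-away files standing in for the real ones.
with tempfile.TemporaryDirectory() as folder:
    global_cfg = os.path.join(folder, "global.cfg")
    local_cfg = os.path.join(folder, "local.cfg")
    with open(global_cfg, "w") as handle:
        handle.write("[wps]\ntitle = Global title\nlang = eng\n")
    with open(local_cfg, "w") as handle:
        handle.write("[wps]\ntitle = Local title\n")
    config = load_config([global_cfg], env={"PYWPS_CFG": local_cfg})
    title = config.get("wps", "title")
    lang = config.get("wps", "lang")
```

Options that the local file does not mention (here `lang`) keep their global value.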
  • pywps.cfg is a Key = Value text file. Four sections are present in the file: [wps], [provider], [server], [grass]. Remember: the file is case-sensitive.
  • [wps]: basic meta information necessary to populate the WPS document: title, version, abstract, fees, keywords, lang. Other variables aren't shown in this example.
  • [server]: server configuration options, path locations, URL translation and service limits: maxoperations, maxinputparamlength, maxfilesize, outputUrl, outputPath, processesPath. Other variables aren't shown in this example. Most important: outputUrl, outputPath, processesPath.
  • outputUrl: URL that will be used to point to the WPS outputs, e.g. http://localhost/wpsoutput. outputPath: folder where PyWPS will drop the outputs (server accessible), e.g. /var/www/html/wpsoutput or /usr/local/apache/htdocs/wps/wpsoutput.
  • processesPath: folder path with the stored processes, e.g. /usr/local/pywps/processes or /home/user/processes. It's important that these 3 parameters are properly configured.
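Since these three options matter most, a quick sanity check before starting the service can save some debugging. The following Python 3 sketch assumes the option names listed for the [server] section; the `maxoperations` value and the helper name `missing_server_options` are made up for illustration.

```python
import configparser

# Illustrative values taken from the examples above; adjust to your server.
SAMPLE_CFG = """
[server]
maxoperations = 3
outputUrl = http://localhost/wpsoutput
outputPath = /var/www/html/wpsoutput
processesPath = /usr/local/pywps/processes
"""

REQUIRED = ("outputUrl", "outputPath", "processesPath")


def missing_server_options(cfg_text):
    """Return the required [server] options that are absent or empty."""
    parser = configparser.ConfigParser()
    parser.optionxform = str  # the file is case-sensitive
    parser.read_string(cfg_text)
    if not parser.has_section("server"):
        return list(REQUIRED)
    return [name for name in REQUIRED
            if not parser.get("server", name, fallback="").strip()]


ok = missing_server_options(SAMPLE_CFG)
broken = missing_server_options("[server]\noutputurl = http://localhost/x\n")
```

The second call shows why case matters: `outputurl` is not accepted in place of `outputUrl` when option names are compared case-sensitively.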
  • PyWPS can be installed once on a server, but it may be configured to run several WPS services (instances). Each WPS instance has a process folder and a pywps.cfg file.
  • 1) Set up a process folder. 2) Copy the configuration file template and edit it to the desired configuration. 3) Populate the process directory.
    > mkdir -p /usr/local/wps/processes
    > cp pywps-VERSION/pywps/default.cfg /usr/local/wps/pywps.cfg
    > nano /usr/local/wps/pywps.cfg
    > cp pywps-VERSION/examples/ /usr/local/wps/processes/
  • 4) Every process in the process folder needs to be "registered" in a file called
    > cd /usr/local/wps/processes/
    > echo "__all__=['ultimatequestionprocess']" >
    __all__ is a Python list with the process names. We've done 50% of an instance :)
  • A WPS instance is just a script that alters some parameters before calling
    wps.cgi file:
    #!/bin/sh
    # Author: Jachym Cepicky
    # Purpose: CGI script for wrapping PyWPS script
    # Licence: GNU/GPL
    # Usage: Put this script to your web server cgi-bin directory, e.g.
    # /usr/lib/cgi-bin/ and make it executable (chmod 755 pywps.cgi)
    # NOTE: tested on linux/apache
    export PYWPS_CFG=/usr/local/wps/pywps.cfg
    export PYWPS_PROCESSES=/usr/local/wps/processes/
    /usr/local/pywps-VERSION/ $1
  • We need to configure PYWPS_CFG and PYWPS_PROCESSES to specify the instance. We can copy the wrapper script to Apache's cgi-bin folder, assuming that Apache is configured to support script execution:
    > cp wps.cgi /usr/lib/cgi-bin
    http://localhost/cgi-bin/pywps.cgi?request=DescribeProcess&service=WPS&version=1.0.0&identifier=ultimatequestionprocess
  • <?xml version="1.0" encoding="utf-8"?>
    <wps:ProcessDescriptions xmlns:wps=.... service="WPS" version="1.0.0" xml:lang="eng">
      <ProcessDescription wps:processVersion="2.0" storeSupported="true" statusSupported="true">
        <ows:Identifier>ultimatequestionprocess</ows:Identifier>
        <ows:Title>Answer to Life, the Universe and Everything</ows:Title>
        <ows:Abstract>....</ows:Abstract>
        <ProcessOutputs>
          <Output>
            <ows:Identifier>answer</ows:Identifier>
            <ows:Title>The numerical answer to Life, Universe and Everything</ows:Title>
            <LiteralOutput>
              <ows:DataType>integer</ows:DataType>
            </LiteralOutput>
          </Output>
        </ProcessOutputs>
      </ProcessDescription>
    </wps:ProcessDescriptions>
  • PyWPS's assembly-factory approach (diagram): a request for the user's process arrives via POST, GET or SOAP as GetCapabilities, DescribeProcess or Execute. Path 1: load process, check properties, WPS output. Path 2: load process, getInput, run, setOutput, WPS output.
  • A process is an extended class of WPSProcess, with a method execute() that will run the code (diagram: WPSProcess class extended by Process1, Process2, ..., ProcessN).
  • All processes have the following skeleton:
    from pywps.Process.Process import WPSProcess
    class Process(WPSProcess):
        def __init__(self):
            # init process
            WPSProcess.__init__(self,
                <process information like: identifier, title, status>)
            <inclusion of inputs and outputs to process class>
        def execute(self):
            <code>
    The WPSProcess class provides extra functionality such as:
    - Command line utility: self.cmd()
    - Status setting: self.status.set(message, percentage)
  • Process attributes. The only mandatory attribute is identifier:
    class Process(WPSProcess):
        def __init__(self):
            # init process
            WPSProcess.__init__(self,
                identifier="firstprocess",  # same as the file name
                title="foo",
                abstract="bacon and eggs",
                version="0.1",
                storeSupported="true",
                statusSupported="true",
                <more WPS attributes if necessary>)
  • 3 types of Input/Output are defined in WPS: LiteralData, ComplexData and BBOX. Each Input/Output is added through a method of the WPSProcess class and is created when the class is initialized.
  • LiteralData: self.addLiteralInput(...), self.addLiteralOutput(...). ComplexData: self.addComplexInput(...), self.addComplexOutput(...). BBOX: self.addBBoxInput(...), self.addBBoxOutput(...).
  • Each self.add*() call defines/creates an input. It accepts the WPS parameters; only identifier and title are mandatory:
    self.Input1 = self.addLiteralInput(
        identifier="input1",
        title="Input1 number",
        abstract="foo",
        minOccurs=1,
        type=types.IntType,  # requires "import types"
        default="100")
  • A more "complex" example. Only identifier and title are mandatory:
    self.dataIn = self.addComplexInput(
        identifier="data",
        title="Input vector data",
        abstract="foo",
        formats=[{'mimeType': 'text/xml'}])
  • What about outputs? Identical syntax and procedure :)
    self.dataOut = self.addComplexOutput(
        identifier="output",
        title="Output vector data",
        formats=[{'mimeType': 'text/xml'}])
    self.Output1 = self.addLiteralOutput(
        identifier="output1",
        title="foo")
  • The add*Input and add*Output calls are made at the beginning of the class (__init__ method):
    from pywps.Process.Process import WPSProcess
    class Process(WPSProcess):
        def __init__(self):
            # init process
            WPSProcess.__init__(self,
                <process information like: identifier, title, status>)
            self.Input1 = self.addLiteralInput(identifier="input1", title="Input 1")
            self.dataOut = self.addComplexOutput(identifier="outputs", title="Outputs")
            <more inputs/outputs as needed>
        def execute(self):
            <code>
  • OK, we have WPS inputs and outputs; how can I get them? Using the Java get and set "philosophy" :) Each Input has a getValue() method. Each Output has a setValue() method. This is done inside the execute() method.
  • Please check the wiki!
    from pywps.Process.Process import WPSProcess
    class Process(WPSProcess):
        def __init__(self):
            # init process
            WPSProcess.__init__(self,
                <process information like: identifier, title, status>)
            self.Input1 = self.addLiteralInput(identifier="input1", title="Input 1")
            self.dataOut = self.addComplexOutput(identifier="outputs", title="Outputs")
            <more inputs/outputs as needed>
        def execute(self):
            input1 = self.Input1.getValue()
            XMLdata = "<xml>foo</xml>"
            self.dataOut.setValue(XMLdata)  # input or file object
  • First process, a "returner" process. Process class initialization, identifier, WPS status and storeExecuteResponse definition:
    from pywps.Process import WPSProcess
    class Process(WPSProcess):
        def __init__(self):
            ##
            # Process initialization
            WPSProcess.__init__(self,
                identifier="returner",
                title="Return process",
                abstract="""This is a demonstration process of PyWPS; it returns the same file it gets on input as the output.""",
                version="1.0",
                storeSupported="true",
                statusSupported="true")
  • Mind the indentation! It's Python!
            ##
            # Adding process inputs
            self.dataIn = self.addComplexInput(identifier="data",
                title="Input vector data",
                formats=[{'mimeType': 'text/xml'}])
            self.textIn = self.addLiteralInput(identifier="text",
                title="Some width")
            ##
            # Adding process outputs
            self.dataOut = self.addComplexOutput(identifier="output",
                title="Output vector data",
                formats=[{'mimeType': 'text/xml'}])
            self.textOut = self.addLiteralOutput(identifier="text",
                title="Output literal data")
  • Hopefully this will work:
        def execute(self):
            # just copy the input values to output values
            self.dataOut.setValue(self.dataIn.getValue())
            self.textOut.setValue(self.textIn.getValue())
            return
    http://localhost/python/ service=WPS&identifier=returner&version=1.0.0&datainputs=[text=abc;data=<xml>foo</xml>]
  • Developers Section
  • Hacking (code for dummies); Logging (and wood); GRASS (after the wood); mod_python (Pythons and Horses); Tomcat server (Pythons and Cats); MapServer support (more OGC stuff); OpenLayers (let there be layers...); SOAP/WSDL (beautiful soap... so they say...); PyCallGraph (the whole enchilada!)
  • PyWPS has the following pseudo-code structure:
    Start
    1. Determine request_method (GET or POST)
    2. If no input: raise Exception and exit
    3. try: initiate the PyWPS class according to request_method; parse the request; do the request; get the response and make the proper reply
    4. except: reply with an error response
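The control flow above can be mirrored in a small self-contained Python 3 sketch. Everything here is a stand-in: `handle_get`, `handle_post`, `error_report` and `serve` are hypothetical names, and the "responses" are plain strings, not real WPS documents or PyWPS classes.

```python
def handle_get(query):
    # Hypothetical stand-in for "initiate PyWPS, parse and do the request".
    return "<response method='GET'>" + query + "</response>"


def handle_post(body):
    # Hypothetical stand-in for the POST (XML) branch.
    return "<response method='POST'>" + body + "</response>"


HANDLERS = {"GET": handle_get, "POST": handle_post}


def error_report(message):
    """Wrap an error message the way step 4 replies with an error response."""
    return "<ExceptionReport>%s</ExceptionReport>" % message


def serve(environ, body=""):
    """Follow steps 1-4: pick the method, check input, run, report errors."""
    method = environ.get("REQUEST_METHOD", "GET").upper()   # step 1
    payload = environ.get("QUERY_STRING", "") if method == "GET" else body
    if not payload:                                          # step 2
        return error_report("No query string found.")
    try:                                                     # step 3
        return HANDLERS[method](payload)
    except Exception as error:                               # step 4
        return error_report(str(error))


reply = serve({"REQUEST_METHOD": "GET", "QUERY_STRING": "service=WPS"})
empty = serve({"REQUEST_METHOD": "GET"})
posted = serve({"REQUEST_METHOD": "POST"}, body="<x/>")
```

Keeping one handler per request method in a dictionary makes it easy to see where a "hack" such as refusing POST requests would go.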
  • Some "hacking" experiments: Exit if the request is POST (around line 95): Encrypt a WPS response:
  • In PyWPS you can use the logging module anywhere in the code. The pywps.cfg file contains the path to the log file. The log file will then contain a line like this:
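As a reminder of how Python's standard `logging` module writes to a file, here is a minimal Python 3 sketch. The log path, the logger name and the message format are assumptions for the demonstration; in PyWPS the log file location comes from pywps.cfg, as stated above.

```python
import logging
import os
import tempfile

# Hypothetical log location; in PyWPS the path comes from pywps.cfg.
LOG_FILE = os.path.join(tempfile.gettempdir(), "pywps_tutorial_demo.log")

logger = logging.getLogger("pywps.tutorial.demo")
logger.setLevel(logging.DEBUG)
handler = logging.FileHandler(LOG_FILE, mode="w")
handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
logger.addHandler(handler)

# Inside a process this call would typically sit in the execute() method.
logger.debug("Input value received: %s", 42)
handler.flush()

with open(LOG_FILE) as log:
    last_line = log.readlines()[-1]
```

Each record ends up as one line holding the timestamp, the level and the message.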
  • Eclipse IDE is the default debugging platform, using the PyDev tools. Code to be debugged should contain a path to the PyDev tools: After this path is appended, it is possible to import the pydevd module. Now the code is enabled for debugging with pydevd.
  • The next step is to activate the debug server that will listen to the script. Now every time the Python interpreter finds: it will stop and send the variables to the debug server. From Eclipse it will be possible to continue or stop the script.
  • PyWPS doesn't come with out-of-the-box tools. PyWPS is Python, so connect, connect, connect! To GRASS GIS: you may work with a predefined grassLocation or a temporary one.
    WPSProcess.__init__(self, identifier="foo", ... grassLocation=True)
    grassLocation=True gives a temporary GRASS location with an XY coordinate system.
  • For a predefined location: a gisdbase base path specified in the configuration file (pywps.cfg), OR an absolute path in grassLocation.
  • Commands passed as an array with command + arguments:
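The "command plus arguments as a list" convention is the same one used by Python's standard `subprocess` module. The Python 3 sketch below uses `subprocess` as a stand-in and is not the implementation of PyWPS's `self.cmd()`; `run_command` is a hypothetical helper, and the Python interpreter is called instead of a GRASS module so the example runs anywhere.

```python
import subprocess
import sys


def run_command(args):
    """Run a command given as [program, arg1, arg2, ...]; return its stdout."""
    completed = subprocess.run(args, capture_output=True, text=True, check=True)
    return completed.stdout


# A GRASS call would look like ["g.region", "-p"]; the Python interpreter is
# used here only so the sketch runs on machines without GRASS installed.
output = run_command([sys.executable, "-c", "print('hello from a subprocess')"])
```

Passing a list avoids shell quoting problems, because each element reaches the program as exactly one argument.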
  • A total Pythonic way to connect to GRASS!!!!!!
  • "Mod_python is an Apache module that embeds the Python interpreter within the server". In: Version 3.2 provides a script designed to be integrated into mod_python. So what is the advantage? SPEED! 50x faster request processing, integration with Apache's API, and the ability to handle request phases, filters and connections.
  • Default httpd.conf for PyWPS: Inform Apache that wps is the default handler of any request Pass env variables PYWPS_PROCESSES and PYWPS_CFG
  • Mod_python can be used to restrict access to the WPS:
  • mod_python can apply filters to the HTTP request/response. The filter needs to be registered with Apache and mod_python. The filter is applied to any WPS output, encrypting the response.
  • WPS client/server 'secured' interaction. The server provides "getCapabilities" and "describeProcess" to anyone. The "execute" is permitted only to authorized users. Note: the base authentication credentials are used to allow the geo web service to receive delegation (downloading a proxy certificate) from another web service. The authentication/authorization of the "execute" is managed through X.509 certificates, which are handled through the GridSite module for Apache ( ). Who is using mod_python and PyWPS? GENESI-DR (Ground European Network for Earth Science Interoperations - Digital Repositories), INFRA-2007-1.2.1: Scientific Digital Repositories.
  • No voodoo, just computer science! Python code -> Jython compiler -> Java bytecode -> Tomcat instance
  • Instead of we have a new script called
  • - All PyWPS code needs to be copied to the Tomcat folder running the instance: WEB-INF Configuration file used by TomCat The twin brother or
  • Now we just need the Jython Library :) And we have just the last piece of the puzzle missing... What about PYWPS_PROCESSES ?!
  • - After start/stop of tomcat we should be able to make a request:
  • Still in the SVN tree, highly experimental! Who's using it, or will be using it? It's intended to be the default WPS service for Conceptual Schema Transformer.
  • Yes, PyWPS even supports MapServer :) Still experimental in the SVN. ComplexData reference links are output as an OGC service.
  • According to the data type the link will point to a WMS, WFS or WCS service. So how is it set ?!
  • In the SVN tree we have a WPS client specific to OpenLayers. Just add the file to the HTML's script tags. Now an OpenLayers.WPS class should be available.
  • We have 2 major classes, WPS and process. The WPS API will make all the requests, parse the result and, when finished, run a callback function. WPS: +describeProcess(), +getCapabilities(), +execute(), +onDescribedProcess: callback, +onGotCapabilities: callback, +onExecuted: callback
  • A simple example: Please check wiki for an extensive explanation !!!!
  • SOAP == Simple Object Access Protocol WSDL == Web Services Description Language OGC defines that WPS 1.0.0 should support these standards PyWPS has “some” support for SOAP PyWPS generates a simple WSDL file
  • SOAP is a messaging framework, meaning, a structured way to pass, explain and process a message.
  • Example: - Currently PyWPS will accept SOAP XML requests - BUT it will not process any header content or “special Execute tags”
  • WSDL is an XML document that describes a web service. Considering a WPS process, a WSDL would be something like: WSDL doc == GetCapabilities + DescribeProcess + Execute + OGC WPS standard definition (schema)
  • WSDL file is served as follows: No specific process WSDL file request or support :(
  • SVN branch pywps-3.2-SOAP for WSDL and SOAP development. The next PyWPS release will have better SOAP/WSDL support. WPS 2.0.0 is to have better SOAP/WSDL support. Million-dollar question: why do we need SOAP/WSDL?
  • Orchestration and interaction with other web services. Ability to use BPEL (Business Process Execution Language) to orchestrate services.
  • -PyCallGraph is used to generate a graphic representation of code being run - Useful to check bottlenecks and code problems
  • Major time consumption in the initProcess() method: more processes == slower output. A detailed analysis using PyCallGraph has shown NO MAJOR BOTTLENECKS. Execute/DescribeProcess spend most of their time in process handling. If PyWPS is slow, blame the process code. Minimum overhead when calling Update and Exception reports.
  • Gallery / Examples
  • NETMAR aims to develop a pilot European Marine Information System (EUMIS) for searching, downloading and integrating satellite, in situ and model data from ocean and coastal areas. Provide a user-configurable system offering flexible service discovery, access and chaining facilities using OGC, OPeNDAP and W3C standards. Use a semantic framework coupled with ontologies for identifying and accessing distributed data, such as near-real time, forecast and historical data.
      NETMAR – Open Service Network for Marine Environmental Data
    Lots of WPS processes!!!
  • version=1.0.0&identifier=histogramprocess& datainputs=[imageInput=]& responsedocument=histogramOutput=@asreference=true
  • 60 MB -> 700 kB: identifier=reducer&datainputs=[reductionFactor=0.1;imageInput=]&responsedocument=imageOutput=@asreference=true&status=true&storeExecuteResponse=true
  • Acknowledgments: Simone Gentilini (JRC). GENESI-DR Project funded by FP7 program under (INFRA-2007-1.2.1) Scientific Digital Repositories & Plymouth Marine Laboratory – Remote Sensing Group & Netmar project. Project partially funded by FP7 program under (ICT-2009.6.4) Information & Communication Technologies. HS-RS Help Service – Remote Sensing