This workshop, as presented by Snowflake and SeaZone at the INSPIRE Conference 2013, aims to demonstrate how straightforward it is to deploy INSPIRE data services alongside non-INSPIRE data services for the marine and coastal environment using readily available pan-European data sets.
For more details and a one-to-one demo of the software, please contact info@snowflakesoftware.com
11. www.snowflakesoftware.com www.seazone.com
Planned configuration
[Diagram. Labels: Hydrographical Offices; HydroSpatial; Workflow (Validate, Load); Data Maintenance Infrastructure; Workflow (Transform, Validate, Publish); Data Publication Services (WFS); API Management]
Database Cluster
• Configuration:
– Postgres database cluster (EnterpriseDB) on Amazon Web Services
• Benefits of cluster:
– Can start with two database instances and increase the number of instances as demand increases
– Data automatically replicated between instances
– Can establish database instances in different geographic regions (e.g. Europe, North America, Middle East) to ensure QoS
Configuring the Data Services
• Publishing from a single source to multiple schemas (e.g. INSPIRE, S-100)
• Using off-the-shelf software
• Rapid configuration and deployment of new data services
Deploying Data Services
• Once the schema transformation is configured, the project is deployed within the WFS
• But first the WFS settings need to be configured:
– GetCapabilities
– Encoding format (compressed/uncompressed)
– Servlet pattern
• Finally, generate the WFS WAR file and deploy it to an application server
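Once the WAR file is deployed, a quick check is to request the capabilities document. The following is a minimal sketch of how the key-value-pair request URLs could be built. The endpoint URL and the feature type name are placeholders, not values from the workshop; the real path depends on the application server and the configured servlet pattern.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: host, context path and servlet pattern are placeholders.
BASE_URL = "https://example.com/hydro/wfs"


def wfs_url(request, **params):
    """Build a key-value-pair WFS 2.0 request URL for the given operation."""
    query = {"service": "WFS", "version": "2.0.0", "request": request}
    query.update(params)
    return BASE_URL + "?" + urlencode(query)


# Capabilities document: describes the service and its feature types.
capabilities_url = wfs_url("GetCapabilities")

# Small GetFeature request; "hs:Wreck" is a hypothetical feature type name.
feature_url = wfs_url("GetFeature", typeNames="hs:Wreck", count=10)

print(capabilities_url)
print(feature_url)
```

Opening the first URL in a browser or HTTP client should return an XML capabilities document if the deployment succeeded.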
Deploying Data Services
[Diagram. Labels: Desktop; Server; HydroSpatial; Translation configuration; Data Request; Schema translation; SQL Query; Database Records; Database table information]
API Management Services
• Security:
– Firewalls, ports
– Usernames/passwords
• Analytics and Reporting
• Billing and Payments
• Bad requests:
– Malformed/malicious
– Requesting the world, or data outside the allowable area of interest
Data as a service is not new. However, cloud infrastructure offers a flexible, scalable, secure and cost-effective mechanism for data providers and publishers to set up data services, which they can develop entirely themselves, develop in partnership, or outsource to a third party. SeaZone Solutions Ltd. have opted for the second option and have partnered with Snowflake Software to provide marine and coastal data services for INSPIRE and non-INSPIRE users.
The Cloud Data Service can be broken down into five core components:
• Data Maintenance Infrastructure: this is where data comes in. GO Publisher Workflow is used to publish the data against a common standard. The data is validated and any invalid data is rejected. In the prototype we convert an Esri File Geodatabase directly into the PostGIS publication database, but this will be replaced by a fully automated system.
• Database Cluster: this is where the data is stored. In HydroView Now we use a PostGIS database that is deployed on the Amazon cloud via EnterpriseDB.
• Data Services (aka API): these are GO Publisher Web Feature Services (OGC WFS 2.0 and 1.1) deployed on Amazon Elastic Beanstalk.
• API Management Services: these consist of two components: i) the 3Scale API and ii) the Snowflake Software WFS Proxy.
• Administration Services: test, monitoring and management.
Cloud infrastructure offers a range of benefits for data-as-a-service offerings:
• Flexibility
• Scalability: cost, extensibility, load balancing
• Security
Currently, hydrographical offices from around the world provide data to SeaZone, who then merge, quality-check and clean up the data. The resulting HydroSpatial database holds the complete dataset. Periodically, SeaZone provides Snowflake with an export from this database as an Esri File Geodatabase. Snowflake currently loads this File Geodatabase directly into the Postgres cloud database using the open-source ogr2ogr tool. This is a manual, multi-step task that takes considerable time, and there are no real data quality checks in place.
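To make the manual load step concrete, here is a minimal sketch of how such an ogr2ogr call might be assembled. The file name, host, database name and user are all hypothetical placeholders, and the actual options used in the prototype are not stated in these notes. The sketch only builds and prints the command; it does not run it.

```python
import shlex


def build_ogr2ogr_command(gdb_path, host, dbname, user):
    """Assemble an ogr2ogr call that copies a File Geodatabase into Postgres.

    ogr2ogr takes the destination data source first, then the source.
    """
    pg_target = f"PG:host={host} dbname={dbname} user={user}"
    return [
        "ogr2ogr",
        "-f", "PostgreSQL",  # output driver
        "-progress",         # show progress while loading
        pg_target,           # destination: Postgres/PostGIS connection string
        gdb_path,            # source: exported File Geodatabase
    ]


# All connection details below are placeholders for illustration only.
cmd = build_ogr2ogr_command(
    "HydroSpatial_export.gdb", "db.example.com", "hydro", "loader"
)
print(shlex.join(cmd))
```

The command list could be passed to `subprocess.run(cmd, check=True)` on a machine with GDAL installed. Note that nothing in this step validates the data, which is the gap the planned configuration is meant to close.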
In the planned configuration we automate this process by deploying Snowflake's GO Publisher Workflow and GO Loader Workflow products. Workflow automates the process of taking the HydroSpatial database from SeaZone, creating and validating GML data, and then loading and validating the data in the cloud database. The key benefits are that it removes the current manual process, speeds up loading and provides a higher level of quality assurance.
Benefits: scalable. The cluster currently has two databases, but this can easily be increased to meet demand.
Demonstration with GO Publisher:
• Setting up project:
– General interface (single screen, color codes in transformation, drop-down menus, no coding)
– Simple translation
– Combine columns
– Constants (e.g. for INSPIRE namespaces)
– SQL additions (e.g. DECODE)
– Coordinate reference system transformations (WGS84 and ETRS89)
– Preview + validation
• Publishing WFS:
– Adding WFS schema
– Add additional mandatory constants
– Create WAR file
– Deploy WAR in Tomcat
– Test WFS
Query Translation - GO Publisher WFS
Being able to query data through a translation process is significantly more complex. We can illustrate this by looking at the software architecture of GO Publisher WFS.
Configuration of the translation takes place in exactly the same way as for GO Publisher Desktop. Once the translation is configured, the user adds additional configuration to the GO Publisher project file to control the WFS behaviour. GO Publisher Desktop is then used to create a Web Archive (WAR) file, which contains the project file and the WFS software. This WAR file contains everything needed by an application server to deploy the WFS. The WAR file is uploaded to an application server, which unpacks it and deploys the WFS.
When a client submits a query to the WFS, GO Publisher translates the WFS request into a SQL query. Because the WFS request is specified in terms of the GML application schema, GO Publisher must use the data translation in reverse to translate the query. For example, if we set up a translation that maps the column “NAM” to the XML element “gml:name”, then a WFS request querying against “gml:name” must be turned into a SQL query against the column “NAM”. The data returned by the SQL query is then translated into GML (using the translation in its forward direction, i.e. “NAM” becomes “gml:name”) and returned to the client.
Multiple translations can be set up and deployed for a single database, allowing the data held in the database to be accessed by different communities of users using different GML application schemas.
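The two directions of the translation can be sketched in a few lines. This is an illustration of the idea only, not GO Publisher's implementation: the “NAM” to “gml:name” mapping comes from the example above, while the second mapping entry, the table name and the sample values are hypothetical.

```python
# Forward translation: database column -> XML element.
# "NAM" -> "gml:name" is the example from the notes; "DEPTH_M" -> "hs:depth"
# is a hypothetical second entry added for illustration.
FORWARD = {"NAM": "gml:name", "DEPTH_M": "hs:depth"}

# Reverse translation: XML element -> database column, derived from FORWARD.
REVERSE = {element: column for column, element in FORWARD.items()}


def request_to_sql(table, element, value):
    """Translate an equality filter on an XML element into parameterised SQL."""
    column = REVERSE[element]  # raises KeyError for elements not in the mapping
    columns = ", ".join(FORWARD)
    return f"SELECT {columns} FROM {table} WHERE {column} = %s", (value,)


def row_to_elements(row):
    """Apply the forward translation to one database row."""
    return {FORWARD[column]: value for column, value in row.items()}


# A request filtering on gml:name becomes a query against the NAM column.
sql, params = request_to_sql("wrecks", "gml:name", "Example Wreck")
print(sql, params)

# The returned row is translated forward before being sent to the client.
print(row_to_elements({"NAM": "Example Wreck", "DEPTH_M": 12.5}))
```

Deploying a second translation for the same database would amount to a second mapping over the same columns, which is how one data source can serve several application schemas.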
So now we have deployed the web services – what can go wrong?
We are using 3Scale for managing the API, together with a WFS proxy.
• WFS Proxy: managed by Snowflake. The proxy protects the WFS services from bad requests and constrains customers to only the capabilities that they are signed up to.
• 3Scale: provides the functionality for setting policy, access control and security. It issues and manages the API keys that authenticate users to the service, and also includes functionality for analytics and reporting (very useful for measuring INSPIRE QoS requirements). 3Scale offers a free version for 4.5 million API calls per month, all the way to an enterprise edition that handles up to 150 million API calls for USD 2,500 per month. Open-source, off-the-shelf solution.
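One kind of check a proxy could apply to bad requests is a bounding-box guard. The sketch below is an assumption about how such a check might look, not the Snowflake WFS Proxy's actual logic. The allowed area and all coordinates are hypothetical.

```python
from typing import NamedTuple, Optional, Tuple


class BBox(NamedTuple):
    min_x: float
    min_y: float
    max_x: float
    max_y: float


def within(inner: BBox, outer: BBox) -> bool:
    """Return True if inner lies entirely inside outer."""
    return (outer.min_x <= inner.min_x and outer.min_y <= inner.min_y
            and inner.max_x <= outer.max_x and inner.max_y <= outer.max_y)


def check_request(requested: Optional[BBox], allowed: BBox) -> Tuple[bool, str]:
    """Decide whether a request's bounding box is acceptable for a customer."""
    if requested is None:
        # No spatial filter would mean "request the world".
        return False, "missing bounding box"
    if requested.min_x >= requested.max_x or requested.min_y >= requested.max_y:
        return False, "malformed bounding box"
    if not within(requested, allowed):
        return False, "outside allowed area of interest"
    return True, "ok"


# Hypothetical area of interest for one customer (longitude/latitude degrees).
allowed_area = BBox(-12.0, 48.0, 4.0, 62.0)

print(check_request(BBox(-5.0, 50.0, -4.0, 51.0), allowed_area))
print(check_request(BBox(-180.0, -90.0, 180.0, 90.0), allowed_area))
print(check_request(None, allowed_area))
```

Rejecting such requests before they reach the WFS keeps expensive or unauthorised queries away from the database cluster.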