It adds a spatial layer to Talend Open Studio (TOS) through components for geospatial data access and processing.
Developed in Java: Eclipse environment, uDig elements, GeoTools library, Java Topology Suite, Sextante </li></ul>
The place of an ETL in a data infrastructure [Diagram labels: Dashboards, Portal]
The interface elements: the map window. This window lets you visualize geographic data. It is useful for checking the results of a processing step. This window is part of the uDig software.
The tool: the business modeler. The business modeler lets you model the job processes. It allows a wide audience to take part in designing the data flow and to follow the progress of development, without requiring any computer skills. Modelling in this window has no impact on the job execution.
The interface elements: the repository metadata tab. The repository contains, among other things, the metadata section, which stores the data access parameters. The image shows the different types of data sources. Note that geographic data is not configured in the metadata section (this is covered later in the demo).
The interface elements: the graphical workspace. The main window is where you create your jobs. You pick your components and drop them here. There are different types of relations between components, which are not detailed in this keynote.
The interface elements: the components palette. The palette contains the different components; it is a kind of toolbox. Spatial Data Integrator adds the geo part to it. The palette is extensible thanks to contributions from developers. As it is open source, you can develop your own components.
The interface elements: the configuration tab. The bottom window is where you configure the behaviour of each component. It also lets you set the execution parameters of your job.
Configuring the data access and creating the schemas. The first step consists of configuring the access to your data source.
Connecting the components inside the workspace. You place the components in the workspace and connect them.
Configuring the tMap component. Here, the city name links the two tables. Two output flows are generated: one for the inner join results and one for the outer join results.
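To make the two output flows concrete, here is a minimal plain-Java sketch of a join keyed on the city name. It is not the code Talend generates for tMap. The class name, the row layout, and the assumption that the second flow holds the main rows that found no match in the lookup table are all hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for a tMap join keyed on the city name. */
final class CityJoinSketch {

    /** Holds the two output flows: matched rows and unmatched rows. */
    static final class JoinResult {
        final List<String[]> matched = new ArrayList<>();
        final List<String[]> unmatched = new ArrayList<>();
    }

    /**
     * Joins main rows {cityName, value} to lookup rows {cityName, attribute}.
     * Matched rows become {cityName, value, attribute}; main rows without a
     * lookup match go to the unmatched flow unchanged.
     */
    static JoinResult join(List<String[]> mainRows, List<String[]> lookupRows) {
        // Index the lookup table by city name (assumed unique per city).
        Map<String, String> lookup = new HashMap<>();
        for (String[] row : lookupRows) {
            lookup.put(row[0], row[1]);
        }
        JoinResult result = new JoinResult();
        for (String[] row : mainRows) {
            String attribute = lookup.get(row[0]);
            if (attribute != null) {
                result.matched.add(new String[] {row[0], row[1], attribute});
            } else {
                result.unmatched.add(row);
            }
        }
        return result;
    }
}
```

The join here is an exact, case-sensitive comparison, so a misspelled city name ends up in the unmatched flow.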
The job execution. The job can now be executed. There are two execution modes: statistics mode displays the number of rows for each flow; traces mode displays the content of each flow. Both modes run in streaming.
Going further: detecting similarities between rows. Here, we use a fuzzy logic component named tFuzzyMatch. It detects the similarities between rows coming from two different flows. It can be useful for seeing which rows of a reference (lookup) table correspond most closely to the outer join results.
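One common way to score the similarity between two names is the Levenshtein edit distance. The sketch below shows the idea in plain Java. It is not the tFuzzyMatch implementation, and the assumption that the demo relies on this particular measure is ours. The class and method names are hypothetical.

```java
import java.util.List;

/** Hypothetical illustration of fuzzy matching by edit distance. */
final class FuzzyMatchSketch {

    /** Levenshtein distance: minimum number of single-character edits. */
    static int levenshtein(String a, String b) {
        int[] previous = new int[b.length() + 1];
        int[] current = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            previous[j] = j; // turning "" into b[0..j) takes j insertions
        }
        for (int i = 1; i <= a.length(); i++) {
            current[0] = i; // turning a[0..i) into "" takes i deletions
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                int substitution = previous[j - 1] + cost;
                int deletion = previous[j] + 1;
                int insertion = current[j - 1] + 1;
                current[j] = Math.min(substitution, Math.min(deletion, insertion));
            }
            int[] swap = previous;
            previous = current;
            current = swap;
        }
        return previous[b.length()];
    }

    /** Returns the reference name closest to the candidate, or null if the list is empty. */
    static String closest(String candidate, List<String> referenceNames) {
        String best = null;
        int bestDistance = Integer.MAX_VALUE;
        for (String name : referenceNames) {
            int distance = levenshtein(candidate.toLowerCase(), name.toLowerCase());
            if (distance < bestDistance) {
                bestDistance = distance;
                best = name;
            }
        }
        return best;
    }
}
```

In practice a maximum accepted distance is usually added, so that rows with no plausible counterpart are not forced onto the nearest reference name.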
Scheduling the aggregation of data. A web geographic portal requires periodically joining data from different sources. Here, the source is an Access database fed by users. We'll associate its entries with the city objects. [Diagram labels: WMS, Access, SHP, BDCARTO, Map Server, Sybase, XML, ..., Client part, SCP, SHP]
Scheduling the aggregation of data: - SDI task scheduler - crontab for Linux environments - Windows task scheduler
Merging layers. Imagine a data infrastructure where geographic layers are spread across as many files as there are cities, so there is one file per city. The job aims to merge all these files into a single table. [Diagram labels: SHP5, SHP4, SHP3, SHP2, SHP1, SHP]
Chaining the Quality Control of Digitalized Documents. After digitizing a large volume of data, we must run a complete check on it. The geometry of the objects and their attributes must be checked. This task is very time-consuming with standard mapping software. Steps: checking the table structure; checking the content; checking the geometric compliance; comparison to the reference data.
Chaining the Quality Control of Digitalized Documents. With a single click, SDI runs this series of checks. Reports list the errors related to the geometric compliance of the objects or to their attribute values. Steps: checking the table structure; checking the content; checking the geometric compliance; comparison to the reference data.
Chaining the Quality Control of Digitalized Documents
Chaining the Quality Control of Digitalized Documents Job comparing the Urban Planning Project Map to the Cadastral Reference Data.
Chaining the Quality Control of Digitalized Documents. tMap joining component.
Function used | Result type
row4.the_geom.symDifference(row2.the_geom) | geometry
GeometryOperation.GETAREA(row4.the_geom.difference(row2.the_geom)) | float
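The two expressions above return a geometry (the parts covered by only one of the two objects) and a number (the area of the first object that the second does not cover). The sketch below reproduces both ideas with the JDK's `java.awt.geom.Area` so that it runs without extra libraries. This is a deliberate swap: it is not the JTS or SDI API used in the job, and the class and method names are hypothetical.

```java
import java.awt.geom.Area;
import java.awt.geom.PathIterator;

/** Hypothetical JDK-only illustration of the two comparison functions. */
final class GeometryCompareSketch {

    /** Symmetric difference: the parts covered by exactly one of the two shapes. */
    static Area symDifference(Area a, Area b) {
        Area result = new Area(a); // copy so the inputs stay untouched
        result.exclusiveOr(b);
        return result;
    }

    /** Difference: the part of a that b does not cover. */
    static Area difference(Area a, Area b) {
        Area result = new Area(a);
        result.subtract(b);
        return result;
    }

    /** Sums the shoelace area of every ring; only valid for shapes without holes. */
    static double area(Area shape) {
        double total = 0.0;
        double ring = 0.0;
        double startX = 0.0, startY = 0.0, lastX = 0.0, lastY = 0.0;
        double[] c = new double[6];
        // The flatness argument turns any curve into short straight segments.
        for (PathIterator it = shape.getPathIterator(null, 0.01); !it.isDone(); it.next()) {
            int type = it.currentSegment(c);
            if (type == PathIterator.SEG_MOVETO) {
                startX = lastX = c[0];
                startY = lastY = c[1];
                ring = 0.0;
            } else if (type == PathIterator.SEG_LINETO) {
                ring += lastX * c[1] - c[0] * lastY;
                lastX = c[0];
                lastY = c[1];
            } else if (type == PathIterator.SEG_CLOSE) {
                ring += lastX * startY - startX * lastY; // edge back to the ring start
                total += Math.abs(ring) / 2.0;
                ring = 0.0;
            }
        }
        return total;
    }
}
```

An empty symmetric difference means the two objects coincide, while a non-zero difference area measures how far a digitized object extends beyond its reference.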
Migrating data into a PostgreSQL/PostGIS database. At a regional scale, we want to pool data and integrate it into a PostgreSQL/PostGIS database management system. [Diagram labels: Folder tree, Relational Database System]
Migrating data into a PostgreSQL/PostGIS database
Other applications <ul><li>Mass geometric processing: splitting or slicing objects using the objects of a different layer
Dividing an image into multiple images, each cut along a city contour and named after that city
Using Talend with GDAL-OGR: conversion to other formats
Conclusion Links <ul><li>Learn how to use Talend </li><ul><li>General documentation, and documentation dedicated to the components, covering multiple use cases </li></ul><li>Learn how to use Spatial Data Integrator </li><ul><li>A wiki </li></ul><li>Meet the community of users </li><ul><li>The Spatial Data Integrator forum hosted by Talend </li></ul></ul>