1. ArcticWeb
Erin Lynch
KADME
Knowledge and Data Management Expertise
2. ArcticWeb: Geoportal to public information
• ArcticWeb aims to simplify access to
public data by making terms searchable
from multiple sources at once.
• Multi-field data exchange: integrate
multiple public data sources, published
without common standards and in multiple
languages, and present them on a map and in a table.
Our Whereoil technology extracts and makes sense of information published by energy websites, in shared
repositories, internal file systems and project databases.
3. PSI in Norway
• Resources are not exposed
• Data is published inconsistently, in
various formats, across various data
owners
• Access to all levels of public data is
political, not technical.
4. Approach
• Crawlers (scrapers) continuously harvest
predefined information
• Data scraped from Excel, CSV, HTML,
RSS, PDF and other web services
• Harvested data is stored in a metastore and a geostore
Diagram labels: ArcticWeb Search Engine, Geodata
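The harvesting step above can be sketched roughly as follows. This is a minimal illustration only, not Whereoil's implementation: the sample CSV and RSS inputs are invented, the function names (harvest_csv, harvest_rss, store) are hypothetical, and the metastore and geostore are stood in for by plain dictionaries.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Invented sample inputs standing in for harvested public sources.
CSV_SOURCE = """name,lat,lon,status
Example Well A,71.2,22.5,drilling
Example Well B,72.0,24.1,completed
"""

RSS_SOURCE = """<rss version="2.0"><channel>
<item><title>Example licence notice</title><link>http://example.org/notice/1</link></item>
</channel></rss>"""


def harvest_csv(text, source):
    """Turn each CSV row into a record dict tagged with its source."""
    records = []
    for row in csv.DictReader(io.StringIO(text)):
        records.append({
            "source": source,
            "title": row["name"],
            "attributes": {"status": row["status"]},
            "location": (float(row["lat"]), float(row["lon"])),
        })
    return records


def harvest_rss(text, source):
    """Turn each RSS <item> into a record dict; feed items carry no location here."""
    records = []
    root = ET.fromstring(text)
    for item in root.iter("item"):
        records.append({
            "source": source,
            "title": item.findtext("title"),
            "attributes": {"link": item.findtext("link")},
            "location": None,
        })
    return records


def store(records, metastore, geostore):
    """Every record goes to the metastore; located ones also go to the geostore."""
    for record in records:
        record_id = len(metastore)
        metastore[record_id] = record
        if record["location"] is not None:
            geostore[record_id] = record["location"]


# Hypothetical in-memory stand-ins for the two stores.
metastore, geostore = {}, {}
store(harvest_csv(CSV_SOURCE, "csv-example"), metastore, geostore)
store(harvest_rss(RSS_SOURCE, "rss-example"), metastore, geostore)
```

Keeping the geostore keyed by the same record id as the metastore is one simple way to let a map view and a table view refer to the same item; a real system would presumably use a spatial index rather than a dictionary.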
5. ArcticWeb Search Engine
RSS Feeds
Web Services
PDF
Databases
Web
Non-Standard Protocols
HTML Pages
6. Interoperability
• Data aggregation: There is no
interoperability across the key data
owners.
• Interoperability is created by caching data
in a common framework
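One way to picture caching data in a common framework is a per-owner field mapping applied at harvest time, so the data owners themselves do not have to change anything. The sketch below assumes two hypothetical owners (owner_a, owner_b) with invented field names, one set in English and one in Norwegian; the common schema (name, lat, lon) is also an assumption for illustration.

```python
# Hypothetical per-owner mappings: source field name -> common field name.
FIELD_MAPS = {
    "owner_a": {"wellName": "name", "latitude": "lat", "longitude": "lon"},
    "owner_b": {"navn": "name", "breddegrad": "lat", "lengdegrad": "lon"},
}


def to_common(owner, raw):
    """Rename an owner's fields to the common schema and coerce coordinates."""
    mapping = FIELD_MAPS[owner]
    record = {"owner": owner}
    for field, value in raw.items():
        if field in mapping:  # unmapped fields are dropped in this sketch
            record[mapping[field]] = value
    record["lat"] = float(record["lat"])
    record["lon"] = float(record["lon"])
    return record


# Invented sample records, cached side by side in the common schema.
cache = [
    to_common("owner_a", {"wellName": "Example A", "latitude": "71.2", "longitude": "22.5"}),
    to_common("owner_b", {"navn": "Eksempel B", "breddegrad": "72.0", "lengdegrad": "24.1"}),
]
```

Once both records share the same keys, they can be searched, tabulated, and mapped together, even though the original publishers never agreed on a format.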
7. Public Data via Search or by Geographical Location
Data is correlated and overlapping. Selected areas are highlighted on the left and
viewable on the map.
This is an example of search-based integration of a wide range of data sources that follow different standards, correlated via search terms and
geographical locations. We used Whereoil to remove the need for data providers to adopt standard data formats and publish standard web
services. The same approach is applicable to other industries and datasets.
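Correlating records by search term and by geographical location could look roughly like the sketch below. It is an illustration under stated assumptions, not the ArcticWeb search engine: the records are invented, the function names are hypothetical, term matching is a plain case-insensitive substring test, and the location filter is a simple latitude/longitude bounding box.

```python
# Invented cached records; the last one has no known location.
RECORDS = [
    {"title": "Example well A", "text": "exploration well, drilling", "lat": 71.2, "lon": 22.5},
    {"title": "Example licence B", "text": "production licence awarded", "lat": 65.0, "lon": 7.0},
    {"title": "Example report C", "text": "well completion report", "lat": None, "lon": None},
]


def search_term(records, term):
    """Case-insensitive substring match over title and text."""
    needle = term.lower()
    return [r for r in records if needle in (r["title"] + " " + r["text"]).lower()]


def search_bbox(records, south, west, north, east):
    """Keep records whose coordinates fall inside the bounding box."""
    return [
        r for r in records
        if r["lat"] is not None
        and south <= r["lat"] <= north
        and west <= r["lon"] <= east
    ]


def search(records, term=None, bbox=None):
    """Apply whichever filters are given; both together narrow the result."""
    hits = records
    if term is not None:
        hits = search_term(hits, term)
    if bbox is not None:
        hits = search_bbox(hits, *bbox)
    return hits


by_term = search(RECORDS, term="well")
by_area = search(RECORDS, bbox=(70.0, 20.0, 75.0, 30.0))
both = search(RECORDS, term="well", bbox=(70.0, 20.0, 75.0, 30.0))
```

A term search can return records with no location at all (such as the report above), while a map selection only returns located records, which is one reason to offer both entry points.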
8. Findings
• Populate the metadata!
• Standards
• Give data owners a clear mandate to make
data more easily available