ArcticWeb
Erin Lynch
KADME: Knowledge and Data Management Expertise
ArcticWeb: Geoportal to public information
• ArcticWeb aims to simplify access to public data by making terms searchable from multiple sources at once.
• Multi-field data exchange: integrate multiple public data sources, published without standards and in multiple languages, on a map and in a table.
Our Whereoil technology extracts and makes sense of information published by energy websites, in shared repositories, internal file systems and project databases.
PSI in Norway
• Resources are not exposed
• Data found in various formats, inconsistently, among various data owners
• Access to all levels of public data is political, not technical
Approach
• Crawlers (scrapers) continuously harvest predefined information
• Data scraped from Excel, CSV, HTML, RSS, PDF and other web services
• Stored data in metastore and geostore
Diagram labels: ArcticWeb, Search Engine, Geodata
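The harvesting step above can be sketched roughly as follows. This is a minimal illustration, not the Whereoil implementation: the sample payloads, the function names (`harvest_csv`, `harvest_rss`, `store`), and the use of a plain dict and list as "metastore" and "geostore" are all assumptions made for the example.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical sample payloads standing in for documents a crawler fetched.
CSV_TEXT = "name,lat,lon\nWell A,70.5,20.1\nWell B,71.2,22.8\n"
RSS_TEXT = """<rss version="2.0"><channel>
<item><title>New licence awarded</title><link>http://example.org/1</link></item>
</channel></rss>"""


def harvest_csv(text, source):
    """Yield one record per CSV row, keeping the published coordinates."""
    for row in csv.DictReader(io.StringIO(text)):
        yield {
            "source": source,
            "title": row["name"],
            "lat": float(row["lat"]),
            "lon": float(row["lon"]),
        }


def harvest_rss(text, source):
    """Yield one record per RSS <item>; these sample items carry no location."""
    root = ET.fromstring(text)
    for item in root.iter("item"):
        yield {
            "source": source,
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
        }


def store(records, metastore, geostore):
    """Put every record in the metastore; index located ones in the geostore."""
    for record in records:
        record_id = len(metastore)
        metastore[record_id] = record
        if "lat" in record and "lon" in record:
            geostore.append((record_id, record["lat"], record["lon"]))


# Assumed in-memory stand-ins for the metastore and geostore.
metastore, geostore = {}, []
store(harvest_csv(CSV_TEXT, "csv-demo"), metastore, geostore)
store(harvest_rss(RSS_TEXT, "rss-demo"), metastore, geostore)
print(len(metastore), "records,", len(geostore), "with coordinates")
```

Keeping one small harvester per format means a new source type (Excel, PDF, HTML) would only need another generator that yields records in the same shape.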
Diagram: ArcticWeb Search Engine. Source labels: RSS Feeds, Web Services, PDF, Databases, Web, Non-Standard Protocols, HTML Pages
Interoperability
• Data aggregation: There is no interoperability across the key data owners.
• Interoperability created by caching data in a common framework
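One way to picture "caching data in a common framework" is a per-owner field map that translates each owner's records into a shared schema before they are cached. The sketch below is illustrative only: the owner names, field names, and the `to_common` helper are hypothetical and do not come from ArcticWeb.

```python
# Hypothetical per-owner field maps: each owner publishes the same kind of
# information under different (here English and Norwegian) field names.
FIELD_MAPS = {
    "owner_a": {"name": "title", "latitude": "lat", "longitude": "lon"},
    "owner_b": {"navn": "title", "breddegrad": "lat", "lengdegrad": "lon"},
}


def to_common(owner, raw):
    """Translate one raw record into the shared schema used by the cache."""
    mapping = FIELD_MAPS[owner]
    record = {"owner": owner}
    for raw_key, common_key in mapping.items():
        if raw_key in raw:
            record[common_key] = raw[raw_key]
    # Coordinates are stored as floats regardless of how they were published.
    for key in ("lat", "lon"):
        if key in record:
            record[key] = float(record[key])
    return record


# The cache holds records from both owners in one comparable shape.
cache = [
    to_common("owner_a", {"name": "Field X", "latitude": "70.1", "longitude": "21.0"}),
    to_common("owner_b", {"navn": "Felt Y", "breddegrad": 71.3, "lengdegrad": 24.5}),
]
```

With this design the data owners change nothing on their side; the mapping lives entirely with the aggregator.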
Public Data via Search or by Geographical Location
Data is correlated and overlapping. Selected areas are highlighted on the left and viewable on the map.
This is an example of search-based integration of a wide range of data sources in different standards, correlated via search terms and geographical locations. We used Whereoil to remove the need for data providers to accept standard data formats and publish standard web services. This system is applicable to other industries and datasets.
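Correlating records by search term and by geographical location could look like the sketch below. It assumes records already sit in a common shape with a title and optional coordinates; the sample records, the `(min_lat, min_lon, max_lat, max_lon)` bounding-box convention, and the `search` function are assumptions for illustration, not the actual ArcticWeb query interface.

```python
# Hypothetical cached records; the last one has no known location.
RECORDS = [
    {"title": "Seismic survey Barents Sea", "lat": 72.0, "lon": 25.0},
    {"title": "Well report Norwegian Sea", "lat": 65.5, "lon": 6.0},
    {"title": "Licence news", "lat": None, "lon": None},
]


def in_bbox(record, bbox):
    """bbox is (min_lat, min_lon, max_lat, max_lon); unlocated records fail."""
    if record["lat"] is None or record["lon"] is None:
        return False
    min_lat, min_lon, max_lat, max_lon = bbox
    return (min_lat <= record["lat"] <= max_lat
            and min_lon <= record["lon"] <= max_lon)


def search(records, term=None, bbox=None):
    """Return records matching the term (case-insensitive) and/or the bbox."""
    hits = []
    for record in records:
        if term is not None and term.lower() not in record["title"].lower():
            continue
        if bbox is not None and not in_bbox(record, bbox):
            continue
        hits.append(record)
    return hits


by_term = search(RECORDS, term="sea")
by_area = search(RECORDS, bbox=(70.0, 20.0, 75.0, 30.0))
both = search(RECORDS, term="well", bbox=(70.0, 20.0, 75.0, 30.0))
```

Either filter can be used alone or combined, which mirrors selecting an area on the map, typing a term, or doing both.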
Findings
• Populate the metadata!
• Standards
• Give data owners a clear mandate to make data more easily available