SlideShare a Scribd company logo
1 of 22
Download to read offline
Implementing an Open Source
Spatiotemporal Search Platform for
Spatial Data Infrastructures
OGRS 2016, Perugia, Italy - 10/13/2016
Paolo Corti [1]
, Benjamin Lewis [1]
, Athanasios Tom Kralidis [2]
,
Ntabathia Jude Mwenda [1]
[1] Harvard Center for Geographic Analysis
[2] Open Source Geospatial Foundation
HHypermap (Harvard Hypermap)
● With Funding from the National Endowment for the
Humanities, the Harvard Centre for Geographic
Analysis (CGA) developed HHypermap, a map
services registry and search platform
● HHypermap was developed in the process of
re-engineering the search component of CGA’s public
domain SDI (WorldMap http://worldmap.harvard.edu),
based on GeoNode
● It is built on an open source software stack
A brief history
WorldMap was developed on GeoNode 1.2 and released in 2012 as a public space for
scholars and the public to upload and share spatial data.
Within a year WorldMap had 12,000 datasets and 8000 users.
Finding data became difficult and demand grew for being able to bring in and save
map service layers from other servers.
The CGA proposed to NEH to build a system for building and maintaining a
comprehensive registry of WMS and Esri REST Map services that would plug into
Worldmap.
Being able to search for data by time and space as well as by keyword was a priority
given the user base.
Note on uptake
Since the release of HHypermap on GitHub in April, a U.S. federal agency has adopted
it and Boundless is using it within its flagship platform Boundless Exchange.
The geo-visualization capabilities we developed (2D faceting in Lucene) are highly
scalable and being used to build a “Billion Object Platform” to enable interactive
exploration of a billion spatio-temporal objects.
Glossary
● Spatial Data Infrastructure (SDI)
● Catalogue Service for the Web (CSW)
● Search Engine
Spatial Data Infrastructure (SDI)
A Spatial Data Infrastructure
(SDI) is a framework of
geospatial data, metadata, users
and tools intended to provide
an efficient and flexible way to
use spatial information
Catalogue Service for the Web (CSW)
● One of the key software components of an SDI is the catalogue service
which is needed to discover, query, and manage the metadata
● Catalogue services in an SDI are typically based on the Open Geospatial
Consortium (OGC) Catalogue Service for the Web (CSW) standard
which defines common interfaces for accessing the metadata information
● Notable implementations: pycsw, GeoNetwork
Search Engine
● A search engine is a software system capable of supporting fast and
reliable search
● It provides features such as full text search, natural language processing,
weighted results, fuzzy tolerance results, faceting, hit highlighting
● Highly scalable and replicable architecture
● Notable implementations: Solr and ElasticSearch, both based on Apache
Lucene
Goals of HHypermap
● Provide a framework for building and maintaining a comprehensive registry of
web map services
● Support modern search capabilities such as spatial and temporal faceting and
instant previews via an open API
● Behind the scenes HHypermap scalably harvests OGC and Esri service metadata
from distributed servers, organizes that information, and pushes it to a search
engine
● Monitor services and layers for reliability and use to improve results ranking
● End users can search the SDI metadata using standard interfaces provided by the
internal CSW catalogue, and benefit from advanced search capabilities provided
by a more full featured, RESTful API
Catalogue Service for the Web
● The OGC Catalogue Service for the Web (CSW) standard specify the interfaces
and bindings, as well as a framework for defining the application profiles required
to publish and access digital catalogues of metadata for geospatial data and
services
● Based on the Dublin Core metadata information model, CSW supports broad
interoperability around discovering geospatial data and services spatially,
non-spatially, temporally, and via keywords or freetext
● CSW supports application profiles which allow for information communities to
constrain and/or extend the CSW specification to satisfy specific discovery
requirements and to realize tighter coupling and integration of geospatial data
and services
Limitations of CSW
CSW provides numerous benefits to SDI’s, but there are numerous opportunities to
enhance the functionality of CSW and the server implementations of CSW by adding
in standard search engine features. Some examples:
● Faceted search
● JSON representation (vs XML in CSW)
● Simplified query interface (CSW, being based on XML, can quickly become
complex)
● Text stemming (ability to detect words derived from a common root)
● Highly scalable and replicable architecture
The Need for Search Engines in Spatial Data
Infrastructure
● Numerous types of web application such as CMS, Wikis, data delivery frameworks,
all benefit from improved data discovery
● In the last few years, these applications have delegated the task of search
optimization to specific frameworks known as search engines
● Rather than implementing a custom search logic, these platforms now often add a
search engine in the stack to improve search
● Apache Solr and Elasticsearch, two popular open source search engine web
platforms, and both based on Apache Lucene, can now be part of a typical CMS
stack to support complex search criteria, faceting, result highlighting, query
spellcheck, relevance tuning and more
● As for CMS, SDI search can dramatically benefit if paired with these platforms
How a search engine works
Two distinct phases:
● Indexing: all of the documents (metadata, in the SDI context) that must be
searched are scanned, and a list of search terms (an index) is built. For each search
term, the index keeps track only of the identifiers of the documents that contain
the search term
● Searching: only the index is looked at, and a list of the documents containing the
given search term is quickly returned to the client. This indexed approach makes a
search engine extremely fast in outputting results
Benefits from search engine frameworks
● Very fast, thanks to the indexing mechanism
● Handling the ambiguities of natural languages, with stop words (words filtered out
during the processing of text), stemming (ability to detect words derived from a
common root), synonyms detection, and controlled vocabularies such as thesauri and
taxonomies
● Phrase searches and proximity searches (search for a phrase containing two different
words separated by a specified number of words)
● Weighted results
● Handling regular expressions, wildcard search, and fuzzy search to provide results for a
given term and its common variations
● Support for boolean queries (AND, OR, NOT)
● Hit highlighting
● Highly scalable and replicable
Faceted search
● Faceting is the arrangement of search results in
categories based on indexed terms
● This capability makes it possible for example, to
provide an immediate indication of the number of
times that common keywords are contained in
different metadata documents
● A typical use case for SDI is with metadata
categories, keywords and regions
● Faceting without a search engine is generally
computationally expensive in relational normalized
structures (lots of query in a RDBMS)
Temporal faceting
● Search engines can also support
temporal and spatial faceting, two
features that are extremely useful for
browsing large collections of
geospatial metadata.
● Temporal faceting can display the
number of metadata documents in a
SDI by date range as a kind of
histogram
Spatial faceting
● Spatial faceting can provide a
spatial surface representing
the distribution of layers or
features across an area of
interest
● In this example a heatmap is
generated by spatial faceting
to show the distribution of
layers in the WorldMap SDI
for a given geographic region
HHypermap: an SDI search engine based on Free and
Open Source Software
● HHypermap is an application that manages OGC web services (such as WMS,
WMTS), and Esri REST endpoints
● map service crawling, and harvesting, and uptime statistics gathering for remote
services and layers
● The aim of HHypermap is to provide a more effective search experience to
WorldMap users and also for users outside WorldMap
● WorldMap is an open source mapping platform, based on GeoNode, developed
by the CGA to lower the barrier for scholars who wish to explore, visualize, edit
and publish geospatial information
HHypermap Architecture
Built on Open Source
software:
Celery, RabbitMQ,
Django, Lucene (Solr,
Elasticsearch), MapProxy,
Memcached, OWSLib,
PostgreSQL, PostGIS,
pycsw
Future Work
● While the CSW 3.0.0 standard provides improvements to address mass
market search/discovery, the benefits of search engine implementations
combined with broad interoperability of the CSW standard present
opportunities to integrate the CSW standard with search engine
methodologies
● The authors hope that such an approach will become formalized as a
CSW Application Profile or Best Practice in order to achieve maximum
benefit and adoption in SDI activities
● This will allow CSW implementations to make better use of search engine
methodologies for improving the user search experience in SDI
workflows
Conclusion
HHypermap aims to provide a FOSS solution using modern
approaches to realize a highly scalable, flexible and robust geospatial
registry and catalogue/search platform while achieving broad
interoperability via open standards
References
● Harvard University CGA: http://gis.harvard.edu/
● WorldMap: http://worldmap.harvard.edu/
● Harvard Hypermap public registry: http://hh.worldmap.harvard.edu/
● HHypermap code repository: https://github.com/cga-harvard/HHypermap

More Related Content

What's hot

Geonode introduction
Geonode introductionGeonode introduction
Geonode introductionTek Kshetri
 
Giving MongoDB a Way to Play with the GIS Community
Giving MongoDB a Way to Play with the GIS CommunityGiving MongoDB a Way to Play with the GIS Community
Giving MongoDB a Way to Play with the GIS CommunityMongoDB
 
Integrating Geospatial Data to your Applications
Integrating Geospatial Data to your ApplicationsIntegrating Geospatial Data to your Applications
Integrating Geospatial Data to your ApplicationsIan Panganiban
 
Field Data Collecting, Processing and Sharing: Using web Service Technologies
Field Data Collecting, Processing and Sharing: Using web Service TechnologiesField Data Collecting, Processing and Sharing: Using web Service Technologies
Field Data Collecting, Processing and Sharing: Using web Service TechnologiesNiroshan Sanjaya
 
Introduction of Open Source GIS
Introduction of Open Source GISIntroduction of Open Source GIS
Introduction of Open Source GISSANGHEE SHIN
 
Cartaro Workshop at the Geosharing Conferenc in Bern
Cartaro Workshop at the Geosharing Conferenc in BernCartaro Workshop at the Geosharing Conferenc in Bern
Cartaro Workshop at the Geosharing Conferenc in BernUli Müller
 
Cartaro - Geospatial CMS (en)
Cartaro - Geospatial CMS (en)Cartaro - Geospatial CMS (en)
Cartaro - Geospatial CMS (en)Uli Müller
 
Open your data with CartoDB
Open your data with CartoDBOpen your data with CartoDB
Open your data with CartoDBJorge Sanz
 
Introduction to Open Source GIS
Introduction to Open Source GISIntroduction to Open Source GIS
Introduction to Open Source GISSANGHEE SHIN
 
Mapping, GIS and geolocating data in Java @ JAX London
Mapping, GIS and geolocating data in Java @ JAX LondonMapping, GIS and geolocating data in Java @ JAX London
Mapping, GIS and geolocating data in Java @ JAX LondonJoachim Van der Auwera
 
CKANへの空間情報機能拡張実装の試み
CKANへの空間情報機能拡張実装の試みCKANへの空間情報機能拡張実装の試み
CKANへの空間情報機能拡張実装の試みYoichi Kayama
 
MongoDB + GeoServer
MongoDB + GeoServerMongoDB + GeoServer
MongoDB + GeoServerMongoDB
 
Geonode Presentation (ppt)
Geonode Presentation (ppt)Geonode Presentation (ppt)
Geonode Presentation (ppt)Iwl Pcu
 
Using python to analyze spatial data
Using python to analyze spatial dataUsing python to analyze spatial data
Using python to analyze spatial dataKudos S.A.S
 
Open Source Databases And Gis
Open Source Databases And GisOpen Source Databases And Gis
Open Source Databases And GisKudos S.A.S
 
FOSS4G 2017 - Geonotebook: an extension to the jupyter notebook for explora...
FOSS4G 2017 - Geonotebook:   an extension to the jupyter notebook for explora...FOSS4G 2017 - Geonotebook:   an extension to the jupyter notebook for explora...
FOSS4G 2017 - Geonotebook: an extension to the jupyter notebook for explora...Christopher Kotfila
 
JupyterCon 2017 - Geonotebook: an extension to the jupyter notebook for explo...
JupyterCon 2017 - Geonotebook: an extension to the jupyter notebook for explo...JupyterCon 2017 - Geonotebook: an extension to the jupyter notebook for explo...
JupyterCon 2017 - Geonotebook: an extension to the jupyter notebook for explo...Christopher Kotfila
 
CartoDB Inside Out
CartoDB Inside OutCartoDB Inside Out
CartoDB Inside OutJorge Sanz
 
Big linked geospatial data tools in ExtremeEarth-phiweek19
Big linked geospatial data tools in ExtremeEarth-phiweek19Big linked geospatial data tools in ExtremeEarth-phiweek19
Big linked geospatial data tools in ExtremeEarth-phiweek19ExtremeEarth
 

What's hot (20)

Geonode introduction
Geonode introductionGeonode introduction
Geonode introduction
 
Giving MongoDB a Way to Play with the GIS Community
Giving MongoDB a Way to Play with the GIS CommunityGiving MongoDB a Way to Play with the GIS Community
Giving MongoDB a Way to Play with the GIS Community
 
Integrating Geospatial Data to your Applications
Integrating Geospatial Data to your ApplicationsIntegrating Geospatial Data to your Applications
Integrating Geospatial Data to your Applications
 
Field Data Collecting, Processing and Sharing: Using web Service Technologies
Field Data Collecting, Processing and Sharing: Using web Service TechnologiesField Data Collecting, Processing and Sharing: Using web Service Technologies
Field Data Collecting, Processing and Sharing: Using web Service Technologies
 
Geonode 2.0
Geonode 2.0Geonode 2.0
Geonode 2.0
 
Introduction of Open Source GIS
Introduction of Open Source GISIntroduction of Open Source GIS
Introduction of Open Source GIS
 
Cartaro Workshop at the Geosharing Conferenc in Bern
Cartaro Workshop at the Geosharing Conferenc in BernCartaro Workshop at the Geosharing Conferenc in Bern
Cartaro Workshop at the Geosharing Conferenc in Bern
 
Cartaro - Geospatial CMS (en)
Cartaro - Geospatial CMS (en)Cartaro - Geospatial CMS (en)
Cartaro - Geospatial CMS (en)
 
Open your data with CartoDB
Open your data with CartoDBOpen your data with CartoDB
Open your data with CartoDB
 
Introduction to Open Source GIS
Introduction to Open Source GISIntroduction to Open Source GIS
Introduction to Open Source GIS
 
Mapping, GIS and geolocating data in Java @ JAX London
Mapping, GIS and geolocating data in Java @ JAX LondonMapping, GIS and geolocating data in Java @ JAX London
Mapping, GIS and geolocating data in Java @ JAX London
 
CKANへの空間情報機能拡張実装の試み
CKANへの空間情報機能拡張実装の試みCKANへの空間情報機能拡張実装の試み
CKANへの空間情報機能拡張実装の試み
 
MongoDB + GeoServer
MongoDB + GeoServerMongoDB + GeoServer
MongoDB + GeoServer
 
Geonode Presentation (ppt)
Geonode Presentation (ppt)Geonode Presentation (ppt)
Geonode Presentation (ppt)
 
Using python to analyze spatial data
Using python to analyze spatial dataUsing python to analyze spatial data
Using python to analyze spatial data
 
Open Source Databases And Gis
Open Source Databases And GisOpen Source Databases And Gis
Open Source Databases And Gis
 
FOSS4G 2017 - Geonotebook: an extension to the jupyter notebook for explora...
FOSS4G 2017 - Geonotebook:   an extension to the jupyter notebook for explora...FOSS4G 2017 - Geonotebook:   an extension to the jupyter notebook for explora...
FOSS4G 2017 - Geonotebook: an extension to the jupyter notebook for explora...
 
JupyterCon 2017 - Geonotebook: an extension to the jupyter notebook for explo...
JupyterCon 2017 - Geonotebook: an extension to the jupyter notebook for explo...JupyterCon 2017 - Geonotebook: an extension to the jupyter notebook for explo...
JupyterCon 2017 - Geonotebook: an extension to the jupyter notebook for explo...
 
CartoDB Inside Out
CartoDB Inside OutCartoDB Inside Out
CartoDB Inside Out
 
Big linked geospatial data tools in ExtremeEarth-phiweek19
Big linked geospatial data tools in ExtremeEarth-phiweek19Big linked geospatial data tools in ExtremeEarth-phiweek19
Big linked geospatial data tools in ExtremeEarth-phiweek19
 

Similar to Implementing an Open Source Spatiotemporal Search Platform for Spatial Data Infrastructures

GIS Standards and Interoperability
GIS Standards and InteroperabilityGIS Standards and Interoperability
GIS Standards and InteroperabilityNasr Khashoggi
 
Towards Semantic APIs for Research Data Services (Invited Talk)
Towards Semantic APIs for Research Data Services (Invited Talk)Towards Semantic APIs for Research Data Services (Invited Talk)
Towards Semantic APIs for Research Data Services (Invited Talk)Anna Fensel
 
Geospatial Ontologies and GeoSPARQL Services
Geospatial Ontologies and GeoSPARQL ServicesGeospatial Ontologies and GeoSPARQL Services
Geospatial Ontologies and GeoSPARQL ServicesStephane Fellah
 
Geonetwork for Spatial Data
Geonetwork for Spatial DataGeonetwork for Spatial Data
Geonetwork for Spatial DataNizam GIS
 
Open source Geospatial Business Intelligence in action with GeoMondrian and S...
Open source Geospatial Business Intelligence in action with GeoMondrian and S...Open source Geospatial Business Intelligence in action with GeoMondrian and S...
Open source Geospatial Business Intelligence in action with GeoMondrian and S...Thierry Badard
 
Esri Geoportal Server
Esri Geoportal ServerEsri Geoportal Server
Esri Geoportal ServerEsri
 
Inspire Helsinki 2019 - Keynote Bart De Lathouwer
Inspire Helsinki 2019 - Keynote Bart De LathouwerInspire Helsinki 2019 - Keynote Bart De Lathouwer
Inspire Helsinki 2019 - Keynote Bart De LathouwerHannaHorppila
 
Inspire Helsinki 2019 - Keynote Bart De Lathouwer
Inspire Helsinki 2019 - Keynote Bart De LathouwerInspire Helsinki 2019 - Keynote Bart De Lathouwer
Inspire Helsinki 2019 - Keynote Bart De LathouwerInspireHelsinki2019
 
Inspire Helsinki 2019 Keynote by Bart De Lathouwer
Inspire Helsinki 2019 Keynote by Bart De LathouwerInspire Helsinki 2019 Keynote by Bart De Lathouwer
Inspire Helsinki 2019 Keynote by Bart De LathouwerInspireHelsinki2019
 
Validation of services, data and metadata
Validation of services, data and metadataValidation of services, data and metadata
Validation of services, data and metadataLuis Bermudez
 
Next Generation Open Data Platforms | AWS Public Sector Summit 2016
Next Generation Open Data Platforms | AWS Public Sector Summit 2016Next Generation Open Data Platforms | AWS Public Sector Summit 2016
Next Generation Open Data Platforms | AWS Public Sector Summit 2016Amazon Web Services
 
Open Source GIS
Open Source GISOpen Source GIS
Open Source GISJoe Larson
 
Integrating PostGIS in Web Applications
Integrating PostGIS in Web ApplicationsIntegrating PostGIS in Web Applications
Integrating PostGIS in Web ApplicationsCommand Prompt., Inc
 
Esri Geoportal Server
Esri Geoportal ServerEsri Geoportal Server
Esri Geoportal ServerEsri
 
Urm concept for sharing information inside of communities
Urm concept for sharing information inside of communitiesUrm concept for sharing information inside of communities
Urm concept for sharing information inside of communitiesKarel Charvat
 
Seven50 Sparc Overview
Seven50 Sparc OverviewSeven50 Sparc Overview
Seven50 Sparc OverviewRoar Media
 
Syntactic and semantic based approaches for Geoinformation Management - Dr. S...
Syntactic and semantic based approaches for Geoinformation Management - Dr. S...Syntactic and semantic based approaches for Geoinformation Management - Dr. S...
Syntactic and semantic based approaches for Geoinformation Management - Dr. S...NeGD Capacity Building
 

Similar to Implementing an Open Source Spatiotemporal Search Platform for Spatial Data Infrastructures (20)

GIS Standards and Interoperability
GIS Standards and InteroperabilityGIS Standards and Interoperability
GIS Standards and Interoperability
 
Towards Semantic APIs for Research Data Services (Invited Talk)
Towards Semantic APIs for Research Data Services (Invited Talk)Towards Semantic APIs for Research Data Services (Invited Talk)
Towards Semantic APIs for Research Data Services (Invited Talk)
 
Geospatial Ontologies and GeoSPARQL Services
Geospatial Ontologies and GeoSPARQL ServicesGeospatial Ontologies and GeoSPARQL Services
Geospatial Ontologies and GeoSPARQL Services
 
Geonetwork for Spatial Data
Geonetwork for Spatial DataGeonetwork for Spatial Data
Geonetwork for Spatial Data
 
Open source Geospatial Business Intelligence in action with GeoMondrian and S...
Open source Geospatial Business Intelligence in action with GeoMondrian and S...Open source Geospatial Business Intelligence in action with GeoMondrian and S...
Open source Geospatial Business Intelligence in action with GeoMondrian and S...
 
Esri Geoportal Server
Esri Geoportal ServerEsri Geoportal Server
Esri Geoportal Server
 
Inspire Helsinki 2019 - Keynote Bart De Lathouwer
Inspire Helsinki 2019 - Keynote Bart De LathouwerInspire Helsinki 2019 - Keynote Bart De Lathouwer
Inspire Helsinki 2019 - Keynote Bart De Lathouwer
 
Inspire Helsinki 2019 - Keynote Bart De Lathouwer
Inspire Helsinki 2019 - Keynote Bart De LathouwerInspire Helsinki 2019 - Keynote Bart De Lathouwer
Inspire Helsinki 2019 - Keynote Bart De Lathouwer
 
Inspire Helsinki 2019 Keynote by Bart De Lathouwer
Inspire Helsinki 2019 Keynote by Bart De LathouwerInspire Helsinki 2019 Keynote by Bart De Lathouwer
Inspire Helsinki 2019 Keynote by Bart De Lathouwer
 
Validation of services, data and metadata
Validation of services, data and metadataValidation of services, data and metadata
Validation of services, data and metadata
 
Cop gise
Cop giseCop gise
Cop gise
 
Geohosting
GeohostingGeohosting
Geohosting
 
Next Generation Open Data Platforms | AWS Public Sector Summit 2016
Next Generation Open Data Platforms | AWS Public Sector Summit 2016Next Generation Open Data Platforms | AWS Public Sector Summit 2016
Next Generation Open Data Platforms | AWS Public Sector Summit 2016
 
Open Source GIS
Open Source GISOpen Source GIS
Open Source GIS
 
Integrating PostGIS in Web Applications
Integrating PostGIS in Web ApplicationsIntegrating PostGIS in Web Applications
Integrating PostGIS in Web Applications
 
Esri Geoportal Server
Esri Geoportal ServerEsri Geoportal Server
Esri Geoportal Server
 
Upgrading maps with Linked Data
Upgrading maps with Linked DataUpgrading maps with Linked Data
Upgrading maps with Linked Data
 
Urm concept for sharing information inside of communities
Urm concept for sharing information inside of communitiesUrm concept for sharing information inside of communities
Urm concept for sharing information inside of communities
 
Seven50 Sparc Overview
Seven50 Sparc OverviewSeven50 Sparc Overview
Seven50 Sparc Overview
 
Syntactic and semantic based approaches for Geoinformation Management - Dr. S...
Syntactic and semantic based approaches for Geoinformation Management - Dr. S...Syntactic and semantic based approaches for Geoinformation Management - Dr. S...
Syntactic and semantic based approaches for Geoinformation Management - Dr. S...
 

More from Paolo Corti

State of GeoNode 2019
State of GeoNode 2019State of GeoNode 2019
State of GeoNode 2019Paolo Corti
 
Harvard Hypermap: An Open Source Framework for Making the World’s Geospatial ...
Harvard Hypermap: An Open Source Framework for Making the World’s Geospatial ...Harvard Hypermap: An Open Source Framework for Making the World’s Geospatial ...
Harvard Hypermap: An Open Source Framework for Making the World’s Geospatial ...Paolo Corti
 
Making Temporal Search Central in a Spatial Data Infrastructure
Making Temporal Search Central in a Spatial Data InfrastructureMaking Temporal Search Central in a Spatial Data Infrastructure
Making Temporal Search Central in a Spatial Data InfrastructurePaolo Corti
 
Maintaining spatial data infrastructures (SDIs) using distributed task queues
Maintaining spatial data infrastructures (SDIs) using distributed task queuesMaintaining spatial data infrastructures (SDIs) using distributed task queues
Maintaining spatial data infrastructures (SDIs) using distributed task queuesPaolo Corti
 
Status of WorldMap, 2016
Status of WorldMap, 2016Status of WorldMap, 2016
Status of WorldMap, 2016Paolo Corti
 
GeoNode per il Supporto alle Emergenze Umanitarie
GeoNode per il Supporto alle Emergenze UmanitarieGeoNode per il Supporto alle Emergenze Umanitarie
GeoNode per il Supporto alle Emergenze UmanitariePaolo Corti
 
GeoNode intro and demo
GeoNode intro and demoGeoNode intro and demo
GeoNode intro and demoPaolo Corti
 
GeoNode for Humanitarian Crisis and Risk Reduction
GeoNode for Humanitarian Crisis and Risk ReductionGeoNode for Humanitarian Crisis and Risk Reduction
GeoNode for Humanitarian Crisis and Risk ReductionPaolo Corti
 
L'utilizzo di software fee and open source nello European Forest Fire Informa...
L'utilizzo di software fee and open source nello European Forest Fire Informa...L'utilizzo di software fee and open source nello European Forest Fire Informa...
L'utilizzo di software fee and open source nello European Forest Fire Informa...Paolo Corti
 
Fire news management in the context of the European Forest Fire Information S...
Fire news management in the context of the European Forest Fire Information S...Fire news management in the context of the European Forest Fire Information S...
Fire news management in the context of the European Forest Fire Information S...Paolo Corti
 
Developing Geospatial software with Python, Part 1
Developing Geospatial software with Python, Part 1Developing Geospatial software with Python, Part 1
Developing Geospatial software with Python, Part 1Paolo Corti
 

More from Paolo Corti (11)

State of GeoNode 2019
State of GeoNode 2019State of GeoNode 2019
State of GeoNode 2019
 
Harvard Hypermap: An Open Source Framework for Making the World’s Geospatial ...
Harvard Hypermap: An Open Source Framework for Making the World’s Geospatial ...Harvard Hypermap: An Open Source Framework for Making the World’s Geospatial ...
Harvard Hypermap: An Open Source Framework for Making the World’s Geospatial ...
 
Making Temporal Search Central in a Spatial Data Infrastructure
Making Temporal Search Central in a Spatial Data InfrastructureMaking Temporal Search Central in a Spatial Data Infrastructure
Making Temporal Search Central in a Spatial Data Infrastructure
 
Maintaining spatial data infrastructures (SDIs) using distributed task queues
Maintaining spatial data infrastructures (SDIs) using distributed task queuesMaintaining spatial data infrastructures (SDIs) using distributed task queues
Maintaining spatial data infrastructures (SDIs) using distributed task queues
 
Status of WorldMap, 2016
Status of WorldMap, 2016Status of WorldMap, 2016
Status of WorldMap, 2016
 
GeoNode per il Supporto alle Emergenze Umanitarie
GeoNode per il Supporto alle Emergenze UmanitarieGeoNode per il Supporto alle Emergenze Umanitarie
GeoNode per il Supporto alle Emergenze Umanitarie
 
GeoNode intro and demo
GeoNode intro and demoGeoNode intro and demo
GeoNode intro and demo
 
GeoNode for Humanitarian Crisis and Risk Reduction
GeoNode for Humanitarian Crisis and Risk ReductionGeoNode for Humanitarian Crisis and Risk Reduction
GeoNode for Humanitarian Crisis and Risk Reduction
 
L'utilizzo di software fee and open source nello European Forest Fire Informa...
L'utilizzo di software fee and open source nello European Forest Fire Informa...L'utilizzo di software fee and open source nello European Forest Fire Informa...
L'utilizzo di software fee and open source nello European Forest Fire Informa...
 
Fire news management in the context of the European Forest Fire Information S...
Fire news management in the context of the European Forest Fire Information S...Fire news management in the context of the European Forest Fire Information S...
Fire news management in the context of the European Forest Fire Information S...
 
Developing Geospatial software with Python, Part 1
Developing Geospatial software with Python, Part 1Developing Geospatial software with Python, Part 1
Developing Geospatial software with Python, Part 1
 

Recently uploaded

(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...gurkirankumar98700
 
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...harshavardhanraghave
 
Test Automation Strategy for Frontend and Backend
Test Automation Strategy for Frontend and BackendTest Automation Strategy for Frontend and Backend
Test Automation Strategy for Frontend and BackendArshad QA
 
Optimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVOptimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVshikhaohhpro
 
How To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.jsHow To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.jsAndolasoft Inc
 
Software Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsSoftware Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsArshad QA
 
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...OnePlan Solutions
 
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Steffen Staab
 
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...panagenda
 
Right Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsRight Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsJhone kinadey
 
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...kellynguyen01
 
5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdfWave PLM
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comFatema Valibhai
 
Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...OnePlan Solutions
 
Active Directory Penetration Testing, cionsystems.com.pdf
Active Directory Penetration Testing, cionsystems.com.pdfActive Directory Penetration Testing, cionsystems.com.pdf
Active Directory Penetration Testing, cionsystems.com.pdfCionsystems
 
TECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providerTECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providermohitmore19
 
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online ☂️
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online  ☂️CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online  ☂️
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online ☂️anilsa9823
 
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfThe Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfkalichargn70th171
 
Professional Resume Template for Software Developers
Professional Resume Template for Software DevelopersProfessional Resume Template for Software Developers
Professional Resume Template for Software DevelopersVinodh Ram
 

Recently uploaded (20)

(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
 
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
 
Test Automation Strategy for Frontend and Backend
Test Automation Strategy for Frontend and BackendTest Automation Strategy for Frontend and Backend
Test Automation Strategy for Frontend and Backend
 
Optimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVOptimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTV
 
How To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.jsHow To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.js
 
Software Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsSoftware Quality Assurance Interview Questions
Software Quality Assurance Interview Questions
 
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
 
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
 
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
 
Right Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsRight Money Management App For Your Financial Goals
Right Money Management App For Your Financial Goals
 
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
 
5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.com
 
Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...
 
Active Directory Penetration Testing, cionsystems.com.pdf
Active Directory Penetration Testing, cionsystems.com.pdfActive Directory Penetration Testing, cionsystems.com.pdf
Active Directory Penetration Testing, cionsystems.com.pdf
 
TECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providerTECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service provider
 
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online ☂️
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online  ☂️CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online  ☂️
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online ☂️
 
Call Girls In Mukherjee Nagar 📱 9999965857 🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SE...
Call Girls In Mukherjee Nagar 📱  9999965857  🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SE...Call Girls In Mukherjee Nagar 📱  9999965857  🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SE...
Call Girls In Mukherjee Nagar 📱 9999965857 🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SE...
 
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfThe Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
 
Professional Resume Template for Software Developers
Professional Resume Template for Software DevelopersProfessional Resume Template for Software Developers
Professional Resume Template for Software Developers
 

Implementing an Open Source Spatiotemporal Search Platform for Spatial Data Infrastructures

  • 1. Implementing an Open Source Spatiotemporal Search Platform for Spatial Data Infrastructures OGRS 2016, Perugia, Italy - 10/13/2016 Paolo Corti [1] , Benjamin Lewis [1] , Athanasios Tom Kralidis [2] , Ntabathia Jude Mwenda [1] [1] Harvard Center for Geographic Analysis [2] Open Source Geospatial Foundation
  • 2. HHypermap (Harvard Hypermap) ● With Funding from the National Endowment for the Humanities, the Harvard Centre for Geographic Analysis (CGA) developed HHypermap, a map services registry and search platform ● HHypermap was developed in the process of re-engineering the search component of CGA’s public domain SDI (WorldMap http://worldmap.harvard.edu), based on GeoNode ● It is built on an open source software stack
  • 3. A brief history WorldMap was developed on GeoNode 1.2 and released in 2012 as a public space for scholars and the public to upload and share spatial data. Within a year WorldMap had 12,000 datasets and 8000 users. Finding data became difficult and demand grew for being able to bring in and save map service layers from other servers. The CGA proposed to NEH to build a system for building and maintaining a comprehensive registry of WMS and Esri REST Map services that would plug into Worldmap. Being able to search for data by time and space as well as by keyword was a priority given the user base.
  • 4. Note on uptake Since the release of HHypermap on GitHub in April, a U.S. federal agency has adopted it and Boundless is using it within its flagship platform Boundless Exchange. The geo-visualization capabilities we developed (2D faceting in Lucene) are highly scalable and being used to build a “Billion Object Platform” to enable interactive exploration of a billion spatio-temporal objects.
  • 5. Glossary ● Spatial Data Infrastructure (SDI) ● Catalogue Service for the Web (CSW) ● Search Engine
  • 6. Spatial Data Infrastructure (SDI) A Spatial Data Infrastructure (SDI) is a framework of geospatial data, metadata, users and tools intended to provide an efficient and flexible way to use spatial information
  • 7. Catalogue Service for the Web (CSW) ● One of the key software components of an SDI is the catalogue service which is needed to discover, query, and manage the metadata ● Catalogue services in an SDI are typically based on the Open Geospatial Consortium (OGC) Catalogue Service for the Web (CSW) standard which defines common interfaces for accessing the metadata information ● Notable implementations: pycsw, GeoNetwork
  • 8. Search Engine ● A search engine is a software system capable of supporting fast and reliable search ● It provides features such as full text search, natural language processing, weighted results, fuzzy tolerance results, faceting, hit highlighting ● Highly scalable and replicable architecture ● Notable implementations: Solr and ElasticSearch, both based on Apache Lucene
  • 9. Goals of HHypermap ● Provide a framework for building and maintaining a comprehensive registry of web map services ● Support modern search capabilities such as spatial and temporal faceting and instant previews via an open API ● Behind the scenes HHypermap scalably harvests OGC and Esri service metadata from distributed servers, organizes that information, and pushes it to a search engine ● Monitor services and layers for reliability and use to improve results ranking ● End users can search the SDI metadata using standard interfaces provided by the internal CSW catalogue, and benefit from advanced search capabilities provided by a more full featured, RESTful API
  • 10. Catalogue Service for the Web ● The OGC Catalogue Service for the Web (CSW) standard specify the interfaces and bindings, as well as a framework for defining the application profiles required to publish and access digital catalogues of metadata for geospatial data and services ● Based on the Dublin Core metadata information model, CSW supports broad interoperability around discovering geospatial data and services spatially, non-spatially, temporally, and via keywords or freetext ● CSW supports application profiles which allow for information communities to constrain and/or extend the CSW specification to satisfy specific discovery requirements and to realize tighter coupling and integration of geospatial data and services
  • 11. Limitations of CSW CSW provides numerous benefits to SDI’s, but there are numerous opportunities to enhance the functionality of CSW and the server implementations of CSW by adding in standard search engine features. Some examples: ● Faceted search ● JSON representation (vs XML in CSW) ● Simplified query interface (CSW, being based on XML, can quickly become complex) ● Text stemming (ability to detect words derived from a common root) ● Highly scalable and replicable architecture
  • 12. The Need for Search Engines in Spatial Data Infrastructure ● Numerous types of web application such as CMS, Wikis, data delivery frameworks, all benefit from improved data discovery ● In the last few years, these applications have delegated the task of search optimization to specific frameworks known as search engines ● Rather than implementing a custom search logic, these platforms now often add a search engine in the stack to improve search ● Apache Solr and Elasticsearch, two popular open source search engine web platforms, and both based on Apache Lucene, can now be part of a typical CMS stack to support complex search criteria, faceting, result highlighting, query spellcheck, relevance tuning and more ● As for CMS, SDI search can dramatically benefit if paired with these platforms
  • 13. How a search engine works Two distinct phases: ● Indexing: all of the documents (metadata, in the SDI context) that must be searched are scanned, and a list of search terms (an index) is built. For each search term, the index keeps track only of the identifiers of the documents that contain the search term ● Searching: only the index is looked at, and a list of the documents containing the given search term is quickly returned to the client. This indexed approach makes a search engine extremely fast in outputting results
  • 14. Benefits from search engine frameworks ● Very fast, thanks to the indexing mechanism ● Handling the ambiguities of natural languages, with stop words (words filtered out during the processing of text), stemming (ability to detect words derived from a common root), synonyms detection, and controlled vocabularies such as thesauri and taxonomies ● Phrase searches and proximity searches (search for a phrase containing two different words separated by a specified number of words) ● Weighted results ● Handling regular expressions, wildcard search, and fuzzy search to provide results for a given term and its common variations ● Support for boolean queries (AND, OR, NOT) ● Hit highlighting ● Highly scalable and replicable
  • 15. Faceted search ● Faceting is the arrangement of search results in categories based on indexed terms ● This capability makes it possible for example, to provide an immediate indication of the number of times that common keywords are contained in different metadata documents ● A typical use case for SDI is with metadata categories, keywords and regions ● Faceting without a search engine is generally computationally expensive in relational normalized structures (lots of query in a RDBMS)
  • 16. Temporal faceting ● Search engines can also support temporal and spatial faceting, two features that are extremely useful for browsing large collections of geospatial metadata. ● Temporal faceting can display the number of metadata documents in a SDI by date range as a kind of histogram
  • 17. Spatial faceting ● Spatial faceting can provide a spatial surface representing the distribution of layers or features across an area of interest ● In this example a heatmap is generated by spatial faceting to show the distribution of layers in the WorldMap SDI for a given geographic region
  • 18. HHypermap: an SDI search engine based on Free and Open Source Software ● HHypermap is an application that manages OGC web services (such as WMS, WMTS), and Esri REST endpoints ● map service crawling, and harvesting, and uptime statistics gathering for remote services and layers ● The aim of HHypermap is to provide a more effective search experience to WorldMap users and also for users outside WorldMap ● WorldMap is an open source mapping platform, based on GeoNode, developed by the CGA to lower the barrier for scholars who wish to explore, visualize, edit and publish geospatial information
  • 19. HHypermap Architecture Built on Open Source software: Celery, RabbitMQ, Django, Lucene (Solr, Elasticsearch), MapProxy, Memcached, OWSLib, PostgreSQL, PostGIS, pycsw
  • 20. Future Work ● While the CSW 3.0.0 standard provides improvements to address mass market search/discovery, the benefits of search engine implementations combined with broad interoperability of the CSW standard present opportunities to integrate the CSW standard with search engine methodologies ● The authors hope that such an approach will become formalized as a CSW Application Profile or Best Practice in order to achieve maximum benefit and adoption in SDI activities ● This will allow CSW implementations to make better use of search engine methodologies for improving the user search experience in SDI workflows
  • 21. Conclusion HHypermap aims to provide a FOSS solution using modern approaches to realize a highly scalable, flexible and robust geospatial registry and catalogue/search platform while achieving broad interoperability via open standards
  • 22. References ● Harvard University CGA: http://gis.harvard.edu/ ● WorldMap: http://worldmap.harvard.edu/ ● Harvard Hypermap public registry: http://hh.worldmap.harvard.edu/ ● HHypermap code repository: https://github.com/cga-harvard/HHypermap