Open Spatial Data: Sources and Tools (Presentation Transcript)
Open spatial data: sources and tools
Stuart Macdonald, EDINA & Data Library, University of Edinburgh
stuart.firstname.lastname@example.org
School of Informatics Data Hack for ILW – 18 Feb. 2012
EDINA & Data Library
EDINA and University Data Library (EDL) together are a division within Information Services of the University of Edinburgh.
EDINA is a JISC-funded National Data Centre providing national online data resources for education and research.
The Data Library assists Edinburgh University users in the discovery, access, use and management of research datasets.
Digimap started as a project under the eLib (Electronic Libraries) Programme in 1996, offering Ordnance Survey maps to 6 trial universities. The full service was launched in 2000.
Scoping exercise finding: 80% of maps are used by non-geographers.
The UK’s National Geospatial Data Framework (NGDF) estimates that approximately 80%* of information collected in the UK today is geo-referenced.
* Reid, J., 2002. geoXwalk – A Gazetteer Server and Service for UK Academia. IASSIST Quarterly, Vol. 26, Issue 3 - http://iassistdata.org/publications/iq/iq26/iqvol263reid.pdf
Slide courtesy of James Reid (EDINA, 2010) - http://prezi.com/n8ui3umrjxfh/survive-or-thrive/
“Open data is the idea that certain data should be freely available to everyone to use and republish as they wish, without restrictions from copyright, patents or other mechanisms of control. The goals of the open data movement are similar to those of other "Open" movements such as open source, open hardware, open content, and open access.” – Wikipedia (13/2/13)
“‘Open knowledge’ is any content, information or data that people are free to use, re-use and redistribute – without any legal, technological or social restriction.” – Open Knowledge Foundation (15/2/13)
This file is made available under the Creative Commons CC0 1.0 Universal Public Domain Dedication
Open Knowledge Foundation (OKF)
OKF is an internationally renowned non-profit organisation (founded 2004) dedicated to promoting open data and open content, including government data, publicly funded research and public domain cultural content.
OKF builds tools, projects and communities with a network of international partners, each focused on different aspects of open knowledge but united by common concerns and goals.
The Comprehensive Knowledge Archive Network (CKAN) project is a web-based system for the storage and distribution of data, such as spreadsheets and the contents of databases.
The system is used both as a public platform on thedatahub.org and in various government / regional data catalogues, such as the UK’s data.gov.uk and the European Commission Open Data Portal.
CKAN source code is available from GitHub: https://github.com
URL: http://ckan.org/
CKAN provides a rich RESTful JSON API for querying and accessing dataset information. The API provides:
• Full querying / searching
• Dataset listings by publisher, or by theme, etc.
• Recent activity and additions (also available via RSS/Atom feed)
• Statistics on dataset usage
• RDF version of the catalogue (via an RDF extension)
• CSV & JSON dumps of the entire catalogue
The API is fully documented at http://docs.ckan.org/.
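As a rough illustration of the querying listed above, the sketch below builds a dataset search URL and summarises a response. It assumes the CKAN Action API (version 3) `package_search` action and its JSON envelope with `success` and `result` keys; the portal address and the sample response are made up for illustration, and older CKAN installations may expose a different API version, so check the documentation for the site being queried.

```python
import json
from urllib.parse import urlencode


def build_search_url(base_url, query, rows=5):
    """Build a dataset search URL for a CKAN site.

    Assumes the Action API (v3) 'package_search' action is available.
    """
    params = urlencode({"q": query, "rows": rows})
    return f"{base_url.rstrip('/')}/api/3/action/package_search?{params}"


def summarise_results(response_text):
    """Return (total match count, list of dataset titles) from a search response."""
    payload = json.loads(response_text)
    if not payload.get("success"):
        raise ValueError("CKAN reported an unsuccessful request")
    result = payload["result"]
    # Fall back to the dataset's URL name when no title is present.
    titles = [d.get("title", d.get("name", "")) for d in result["results"]]
    return result["count"], titles


# Hypothetical portal address and hand-written sample response, for illustration only.
EXAMPLE_URL = build_search_url("http://demo.ckan.example", "flood risk")
SAMPLE_RESPONSE = json.dumps({
    "success": True,
    "result": {
        "count": 2,
        "results": [
            {"name": "flood-zones", "title": "Flood Zones"},
            {"name": "river-gauges", "title": "River Gauges"},
        ],
    },
})

if __name__ == "__main__":
    print(EXAMPLE_URL)
    print(summarise_results(SAMPLE_RESPONSE))
```

In a hack-day setting the URL would be fetched with any HTTP client and the body passed to `summarise_results`; the parsing is kept separate so it can be tried without network access.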
CKAN has advanced geospatial features:
Data Preview: Where structured data with location information is loaded into CKAN’s DataStore, CKAN can plot the data on an interactive map.
Data Search: CKAN can understand a location associated with a dataset, and use this to offer geospatial search capabilities via the API, e.g. by specifying a bounding box.
Data Discovery: To facilitate interoperability, CKAN includes tools to import geo-coded metadata in a number of formats and make it queryable (‘discoverable’) according to the INSPIRE standard.
For further geospatial capabilities see: http://docs.ckan.org/en/latest/geospatial.html
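The bounding-box search mentioned under Data Search might look something like the sketch below. It assumes a CKAN site running the ckanext-spatial extension, which accepts an `ext_bbox` parameter (min longitude, min latitude, max longitude, max latitude) on `package_search`; the portal address and the rough box around Edinburgh are illustrative only.

```python
from urllib.parse import urlencode


def bbox_search_url(base_url, min_lon, min_lat, max_lon, max_lat, query=None):
    """Build a spatial dataset search URL restricted to a bounding box.

    Assumes the site supports the 'ext_bbox' search parameter
    (provided by the ckanext-spatial extension).
    """
    if not (min_lon < max_lon and min_lat < max_lat):
        raise ValueError("bounding box must be given as min_lon, min_lat, max_lon, max_lat")
    params = {"ext_bbox": ",".join(str(v) for v in (min_lon, min_lat, max_lon, max_lat))}
    if query:
        params["q"] = query  # optional free-text filter on top of the box
    return f"{base_url.rstrip('/')}/api/3/action/package_search?{urlencode(params)}"


# Hypothetical portal; the coordinates are an approximate box around Edinburgh.
EDINBURGH_URL = bbox_search_url("http://demo.ckan.example", -3.45, 55.85, -3.05, 56.0)

if __name__ == "__main__":
    print(EDINBURGH_URL)
```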
Data Licensing
When sharing data, it is important to consider how you want your data to be reused. Applying an explicit licence removes any ambiguity over what users can and cannot do with your data. Lawyers can craft licences to meet specific criteria, but there are a number of open licences developed for use on the web that anyone can apply.
Licences designed for one type of subject matter aren’t always best suited to licensing another type of subject matter because of differences in how copyright law applies.
Creative Commons (CC) licences were designed for generic digital content and may not be best suited to licensing specific types of subject matter which have different intellectual property rights.
Indeed, Creative Commons themselves have recommended against using their licences (other than CC Zero - CC0, or "no rights reserved") for data and databases.
Open Data Commons
Open Data Commons (ODC) have prepared a set of licences suitable for data that are conformant with the principles set forth in the Open Knowledge Definition. Each licence is accompanied by a statement which can be placed with your data on a webpage that points to your data.
Research Data Repositories
http://databib.org/ - a searchable catalog / registry / directory of research data repositories
http://www.re3data.org/ - a global registry of research data repositories from different academic disciplines
http://datashare.is.ed.ac.uk/ - an online digital repository of multi-disciplinary research datasets produced at the University of Edinburgh, hosted by the Data Library on a DSpace platform
Map or spatial mash-up ‘resource discovery tool’ - there are 2659 spatial mashups that utilise a whole range of Web services, and 459 mapping APIs (Feb. 2013) - http://www.programmableweb.com/tag/mapping
Open Mapping Utilities:
GeoCommons - http://geocommons.com/
OpenStreetMap - http://www.openstreetmap.org/
Google MapMaker - http://www.google.com/mapmaker
Platial - http://www.platial.com/
UCL Centre for Advanced Spatial Analysis - http://www.casa.ucl.ac.uk/
MapTube - http://www.maptube.org/ - a free resource for viewing, sharing, mixing and mashing maps created with the GMapCreator software, released by CASA.
EDINA & Open Spatial Data - APIs/Developer tools:
Unlock is a set of web services intended to help researchers and developers unlock the ‘spatial’ potential in digital resources:
Unlock Places – An API that helps developers find the locations and shapes of places and re-use them in their applications.
• Unique UK location database compiled from OS Gazetteers
• An open access, worldwide coverage location database based on open data from geonames.org
Unlock Geocodes – Convert UK postcodes or grid references to co-ordinates
Unlock Text – Extract place names from text or metadata to find their location using a geo-parser
http://unlock.edina.ac.uk/home/
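To give a sense of what converting a grid reference to co-ordinates involves, here is a small local sketch that turns an Ordnance Survey National Grid reference into an easting and northing in metres. It is not the Unlock Geocodes service and does not call it; the function name and the sample reference are chosen purely for illustration.

```python
def grid_ref_to_easting_northing(grid_ref):
    """Convert an OS National Grid reference such as 'NT 257 735'
    to (easting, northing) in metres.

    The result is the south-west corner of the square the reference names.
    """
    ref = grid_ref.replace(" ", "").upper()
    letters, digits = ref[:2], ref[2:]
    if len(letters) != 2 or not (letters.isascii() and letters.isalpha()) or "I" in letters:
        raise ValueError(f"invalid grid letters in {grid_ref!r}")
    if (digits and not digits.isdigit()) or len(digits) % 2 or len(digits) > 10:
        raise ValueError(f"invalid grid digits in {grid_ref!r}")

    def letter_index(letter):
        # The grid alphabet omits the letter I, so shift later letters down by one.
        i = ord(letter) - ord("A")
        return i - 1 if i > 7 else i

    l1, l2 = letter_index(letters[0]), letter_index(letters[1])
    # Position of the 100 km square, counted from the grid's false origin.
    e100km = ((l1 - 2) % 5) * 5 + (l2 % 5)
    n100km = (19 - (l1 // 5) * 5) - (l2 // 5)
    if not (0 <= e100km <= 6 and 0 <= n100km <= 12):
        raise ValueError(f"{letters} is outside the National Grid")

    # Split the digits in half and pad each half to metre precision.
    half = len(digits) // 2
    easting = e100km * 100_000 + int(digits[:half].ljust(5, "0"))
    northing = n100km * 100_000 + int(digits[half:].ljust(5, "0"))
    return easting, northing


if __name__ == "__main__":
    # A six-figure reference in central Edinburgh (100 m precision).
    print(grid_ref_to_easting_northing("NT 257 735"))
```

The output is still in British National Grid metres; turning it into latitude and longitude needs a datum transformation, which is the kind of step a web service can hide from the developer.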
ShareGeo Open – A repository of free and reusable data sets deposited by researchers and research institutions
• Find – search for user-contributed datasets
• Re-use – download datasets for research, teaching and learning
• Share – contribute your own datasets for others to use
• Open – there are Open and Digimap licensed versions for datasets with different origins
http://www.sharegeo.ac.uk/
The Digimap OpenStream service provides access to a Web Map Service (WMS) offering Ordnance Survey OpenData products, including:
GB Overview
Miniscale
1:250000 Colour Raster
VectorMap District Raster
OS Streetview
Use the Digimap OpenStream API to do things like:
• Create mashups, combining OS OpenData with maps and data from other sources
• Add OS OpenData to Google Earth
• Embed maps in your website
• Provide OS mapping in your own applications
Free for academic use. Registration required.
See URL: http://openstream.edina.ac.uk/registration/
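A WMS of this kind is normally queried with a GetMap request. The sketch below assembles one using the standard OGC WMS 1.1.1 parameter names and the British National Grid (EPSG:27700). The endpoint address, the layer name and the `token` key are hypothetical placeholders, not OpenStream's actual values; take the real ones from the registration pages.

```python
from urllib.parse import urlencode


def getmap_url(endpoint, layer, bbox, width, height,
               srs="EPSG:27700", image_format="image/png", extra_params=None):
    """Build a WMS 1.1.1 GetMap URL.

    bbox is (min_x, min_y, max_x, max_y) in the units of `srs`.
    extra_params carries any service-specific values, such as an access key.
    """
    min_x, min_y, max_x, max_y = bbox
    if not (min_x < max_x and min_y < max_y):
        raise ValueError("bbox must be (min_x, min_y, max_x, max_y)")
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.1.1",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "STYLES": "",  # empty value requests the layer's default style
        "SRS": srs,
        "BBOX": ",".join(str(v) for v in bbox),
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": image_format,
    }
    params.update(extra_params or {})
    return f"{endpoint}?{urlencode(params)}"


# Hypothetical endpoint, layer name and key; a 10 km square around Edinburgh.
MAP_URL = getmap_url(
    "http://wms.example.org/wms",
    "example-layer",
    (320000, 670000, 330000, 680000),
    width=512,
    height=512,
    extra_params={"token": "YOUR-KEY"},
)

if __name__ == "__main__":
    print(MAP_URL)
```

The resulting URL returns an image, so it can be dropped into an `<img>` tag or handed to a mapping library that speaks WMS.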
Third last slide…
Gogeo – an online resource discovery tool for geospatial data created by UK researchers: http://www.gogeo.ac.uk/metadata/search/
Users can also create, publish and export metadata records using the Geodoc tool.
AddressingHistory has an API onto historic Post Office Directory data from Edinburgh, Glasgow and Aberdeen - see URL: http://addressinghistory.edina.ac.uk/api/.
The code for the POD Parser, used to convert Post Office Directory OCR into structured data for AddressingHistory, is also available on GitHub here: https://github.com/gmh04/podparser/.
Final comments
• There’s a generally accepted assertion that 80% of all information has a spatial reference (implicit or otherwise) – exploit it!
• If you’re creating open data products then license them!
• EDINA is a great place to start looking for spatial data services and tools, including APIs!
FIN! - THANK YOU
Credits:
Image by aroid - http://www.flickr.com/photos/selago/34843234/ - CC BY 2.0
Image by konqui - http://www.flickr.com/photos/konqui/2301314089/ - CC BY-NC 2.0
Image by mosilager - http://www.flickr.com/photos/mosilager/2260598271/ - CC BY-NC-SA 2.0
Image by racoles - http://www.flickr.com/photos/racoles/5719938981/ - CC BY-NC 2.0
Image by James Bowe - http://www.flickr.com/photos/jamesrbowe/3351247547/ - CC BY 2.0
Image by yelnoc - http://www.flickr.com/photos/yelnoc/361303918/ - CC BY-NC-SA 2.0
Image by epSos.de - http://www.flickr.com/photos/epsos/3384297473/ - CC BY 2.0
Image by bek30 - http://www.flickr.com/photos/bek30/6107854810/ - CC BY-NC 2.0
Image by karen horton - http://www.flickr.com/photos/karenhorton/3261277303/ - CC BY-NC 2.0
Image by lofaesofa - http://www.flickr.com/photos/lofaesofa/227019975/ - CC BY 2.0
Image by Psycho Delia - http://www.flickr.com/photos/24557420@N05/5588473657/ - CC BY-NC 2.0
Image by wdj(0) - http://www.flickr.com/photos/davidjoyner/534893725/ - CC BY-SA 2.0
Image by Symic - http://www.flickr.com/photos/symic/2870349309/ - CC BY-SA 2.0
Image by ~milj - http://www.flickr.com/photos/21989292@N07/4938052014/ - CC BY-NC-SA 2.0
Image by giniger - http://www.flickr.com/photos/7304492@N06/417304290/ - CC BY-NC-SA 2.0
Image by Libraryman - http://www.flickr.com/photos/libraryman/78337046/ - CC BY-NC-ND 2.0
Image by Dru! - http://www.flickr.com/photos/druclimb/470572647/ - CC BY-NC 2.0
Image by Muffet - http://www.flickr.com/photos/calliope/7102418379/ - CC BY 2.0