Experiences as a producer, consumer and observer of open data

Peter Mooney is an Environmental Protection Agency (EPA) funded Research Fellow at the Department of Computer Science, NUI Maynooth. He has been working with the EPA on making environmental data publicly accessible for the last ten years.

This presentation was part of the 1st Seminar of the ERC-funded Programmable City Project, based at NIRSA, NUI Maynooth, Republic of Ireland.



    • Experiences as a producer, consumer, and observer of open data Dr. Peter Mooney Senior Researcher, Department of Computer Science, NUI Maynooth Data Manager and IT Specialist, STRIVE Research, EPA Ireland peterm7.com/mobile/me
    • Where am I? Top Down: EPA delivering environmental data Middle: EPA-funded research projects Bottom Up: Citizen-science, Volunteered Geographic Information Meeting Place: Where Top Down, Middle, and Bottom Up meet...
    • 5 Star Open Data Where organisations should, at least, be http://5stardata.info/ Where many organisations are Where the future of Open Data on the Web will be
    • Mark Johnstone’s Open Data Cake http://www.impactandlearning.org/2013/08/open-data-and-increasing-impact-of.html
    • Making and baking the cake - you can introduce flair but you must stick to the recipe and the recipe that works for others who are making the cake or giving you ingredients Working with different ingredients (licenses, copyright, databases, systems etc) http://www.flickr.com/photos/jillclardy/8374937036/sizes/m/
    • The icing and the presentation SAFER (http://erc.epa.ie/safer) Development of web-services to meet new European Directive Air Quality Data Exchange Regulations http://www.flickr.com/photos/nmoya/9962621026/sizes/c/
    • Eating the cake Fingal Apps Competition Using Open Data in preference to all others Open Data and open standards in any teaching I do. Conducting research into and using open data http://www.flickr.com/photos/revjim/3143266465/sizes/z/
    • Thinking up cake recipes 10 years data management experience with EPA Working with data in almost all the key thematic areas How to best deliver Open Data and Services to the public and other stakeholders Ensuring everyone's requirements are specified correctly http://www.flickr.com/photos/joannatwinn/2083379033/sizes/m/
    • Cake Taster: Is the cake safe to eat? Several years research work into the quality and usability of Volunteered Geographic Information (VGI) and Open Data 10 Journal papers on the topic and subject area http://www.steamykitchen.com/2583-wedding-cake-judging-adventures.html
    • Sharing the cake …. SAFER and EPA Research Data Many mainstream EPA reporting obligations (Air Quality, Drinking Water, Greenhouse Gas Inventories) http://www.dallasartsrevue.com/art-crit/Here_Lately/PerfArt/Performance_Art.html
    • Open Data EPA Ireland http://www.flickr.com/photos/aenimation/8997912525/sizes/m/
    • EPA’s Research Data Archive 2005
    • Providing ‘open-data’ PDF reports: knowledge distribution - but not actionable http://www.theguardian.com/global-development-professionals-network/2013/oct/21/development-open-data-action “like funding James Cameron to make Avatar, and then releasing it in a black and white flip book. We are missing all the good stuff”.
    • EPA Research Programme has been operating an open access approach over the last 5 years All EPA funded research projects must provide “significant outputs” (datasets, info resources, etc) for public access via SAFER Crucially, this couples the final reports/papers with the actual data/information used to generate the findings/recommendations etc
    • EPA’s Research Data Archive 2013
    • SAFER’s open-access approach has been very successful (figure labels: EPA Research Data Archive 2006, EPA Research Data Archive 2013)
    • EPA: Drinking Water Monitoring Results and Water Supply Details for Ireland http://erc.epa.ie/safer/resourcelisting.jsp?oID=10206&username=EPA%20Drinking%20Water Joined SAFER in 2011 12 Resources on SAFER Over 380 CSV and Excel files Years 2000 - 2011 (2012 to appear soon) Over 1,400 downloads Success Story! Distribution of the Drinking Water files via CD is now supplemented with 24/7/365 access to the archive of data online on SAFER
    • EPA: Complete Archive of Air Quality Monitoring Data for Ireland http://erc.epa.ie/safer/resourcelisting.jsp?oID=10136&username=EPA%20Air%20Quality Joined SAFER in 2010 16 Resources on SAFER Over 800 CSV and Excel files Years ~1998 - 2012 (2013 to appear Q1 2014) Over 4,500 downloads Success Story! Very significant person-hours saved as public/consultants/researchers etc can access the entire archive openly 24/7/365
    • EPA: Geochemistry Data for the Historic Mine Sites Project - Inventory and Risk Classification http://erc.epa.ie/safer/resourcelisting.jsp?oID=10170&username=EPA%20Historical%20Mines Joined SAFER in 2011 4 Resources on SAFER Analysis Data (4 files) and Reports available Over 190 downloads Success Story! Collaboration with Geological Survey of Ireland - SAFER chosen as the outlet for the end-products from this project
    • SAFER providing Web display of information and equivalent machine readable Web Services For consumption as Open Data Information by other web-services and mobile apps For Web-page display
    • Air Quality Index For Health - multiple representations
    • VGI Open Data http://www.flickr.com/photos/treaclepondphotos/10799092965/sizes/m/
    • Volunteered Geographic Information (Goodchild, 2007) The ‘spatial’ case of User-Generated Content (UGC) Passive and Active data streams People driven - bottom up infrastructure - linked to ubiquitous Internet + smart-technologies and the evolution of sociotechnology online Immense historical growth - potentially unbounded future growth - yielding Spatial Big Data
    • VGI/UGC with embedded spatial attributes (Tweetmap) Todd Mostak (MIT), the Harvard Center for Geographic Analysis (CGA) http://worldmap.harvard.edu/tweetmap/
    • OpenStreetMap - one of the best known VGI projects Global Node Density Map (2013) http://tyrasd.github.io/osm-node-density/
    • OSM continues to grow at a considerable rate http://osmstats.altogetherlost.com/index.php October 8th 2013 1,391,737 (registered contributors) ~ 2,700 contributors active per day 2.3 billion nodes 200 million polygons 2.1 million relations
    • http://tools.geofabrik.de/mc/
    • New Forest, Hampshire, England
    • http://resultmaps.neis-one.org/osm-typhoon-haiyan-2013/#8/11.238/125.005 Nov 10th - Nov 13th 2013 Number of OSM Contributors: 608 Number of Map Changes: 1,314,912
    • Mapping agencies and municipalities can be contributors to OSM and VGI http://engagingcities.com/article/new-york-city-and-openstreetmap-collaborating-through-open-data#.Uk2ZBu2dkzk.twitter OSM - the conduit for allowing governments to connect with open data communities and for citizens to interact with their government Citizens - generating, maintaining and updating VGI - become a “collaborative intelligence” or “update intelligence”
    • EuroSDR + AGILE “Crowdsourcing in National Mapping” 5 Funded Projects (Phase 2 begins Q1 2014) National Mapping Agency Driven Development of test-bed/incubator ideas on how crowdsourcing of geospatial data could be used by National Mapping Agencies: Examples: Gamification for Update of Spanish National Gazetteer, Conflation of official spatial data in Germany with OSM, place name extraction from Flickr for IGN France Gazetteer, ‘off-the-beaten-track’ tourist sites in Lithuania (National SDI)
    • COST ACTION TD1202 “Mapping and the Citizen Sensor” Accurate and timely maps are a fundamental resource but their production in a changing world is a major scientific and practical grand challenge. This EU funded COST Action seeks to rise to the challenges and enhance the role of citizen sensors in mapping activity. COST Action TD1202 is an EU funded interdisciplinary networking activity that involves ~27 countries
    • OSM Network Evolution - following well known network infrastructures http://www.theatlanticcities.com/commute/2013/03/mapping-growth-openstreetmap/4982/ Corcoran, Mooney, Bertolotto "Analysing the growth of OpenStreetMap networks" Spatial Statistics Vol 3, 21-32. 2013
    • Collaborative networks develop amongst contributors building OSM Berlin: All interactions (co-edits) - all contributors to OSM in Berlin Berlin: All interactions (co-edits) - highest ranked contributors (by Eigenvector Centrality measures) Mooney and Corcoran (2013) “Interaction and co-editing patterns between contributors to OpenStreetMap”, to appear in Transactions in GIS
    • Using Spatial Simulation Models (Cellular Automata Markov Chains) to predict future OSM contribution activities Contribution of Nodes over time... Karlsruhe Stuttgart 2014 Jokar, Mooney, Helbeich (2013) in review IJGIS “Spatio-temporal analysis of patterns of contribution in OSM - a case study” Development of a Contribution Index (CI) to indicate if cells are being edited and maintained
    • VGI in Africa: Who is performing the majority of the work in VGI (OSM for example)? - Study of Edit Histories South Africa: Cape Town (15 exclusively local from 41) Kenya: Nairobi (17 from 30) DR Congo: Kinshasa (0 from 10) Rep. Sudan: Khartoum (2 from 14) Burkina Faso: Ouagadougou (0 from 14) Cameroon: Yaoundé (1 from 7) Tanzania: Dar es Salaam (12 from 25) High-frequency contributors studied “Exclusively Local” - Contributors whose mapping activities are contained exclusively in a given city or region.
    • Conclusions and Summary http://www.flickr.com/photos/troudd/10797161793/sizes/m/
    • Delivering data and services: we have always had telephones to worry about http://en.wikipedia.org/wiki/File:US_Robotics_56K_Modem_Front.JPG http://cdn3.pcadvisor.co.uk/cmsdata/features/3405369/latest_samsung_galaxy_smartphone.jpg
    • Deliver [open data] products which suit the requirements and skills of the majority of your customers or stakeholders Note: No pizzas or fajitas were made or consumed during the making of this slide
    • OpenStreetMap provides access to its database using three methods - different skill levels for different end applications - tools and apps are built by the community Data access API Access to the raw data in various formats Map Rendering and Map Services It’s the OSM community which has built GUI apps for OSM data upload and editing, mobile apps, rendering engines, etc.
    • What ROI are EPA getting for this investment in an open data approach? Datasets: AQ PM10 Data; Drinking Water 2010. 553 downloads (Nov 1st 2013): 553 x 5 mins per request = 2765 minutes; 2765 / 60 = 46 hours; 8 hour working day; TOTAL 5.75 days. 2138 downloads (Nov 1st 2013): 2138 x 6 mins per request = 12828 minutes; 12828 / 60 = 213 hours; 8 hour working day; TOTAL 26.5 days. DW: http://erc.epa.ie/safer/iso19115/displayISO19115.jsp?isoID=268 PM10: http://erc.epa.ie/safer/iso19115/displayISO19115.jsp?isoID=69
    • Which is more important/difficult? The icing or the cake? ● Open Data isn’t the problem (icing) ● The biggest obstacles are: Information management, governance policies, data structures, ROI, resource management, sharing culture, etc (cake)
    • For organisations Open Data is a little like daily exercise Open Data must become part of organisation/office routine Organisational support (technical and people) A clear set of targets Clear and unambiguous set of reasons for undertaking open data http://www.flickr.com/photos/tulanesally/5860177655/sizes/n/
    • OPEN Data OPEN Access OPEN Information http://www.flickr.com/photos/sigurtor/10805521514/sizes/z/
    • Would Open Data work in a GitHub style of environment?
    • Would Open Data work best in server-based cataloging applications? You will find support for a range of standards. Metadata standards (ISO19115/ISO19119/ISO19110 following ISO19139, FGDC and Dublin Core), Catalog interfaces (OGC-CSW 2.0.2 ISO profile client and server, OAI-PMH client and server, GeoRSS server, GEO OpenSearch server, WebDAV harvesting, GeoNetwork to GeoNetwork harvesting support) and Map Services interfaces (OGC-WMS, WFS, WCS, KML and others) through
    • Open Data as Linked Data?
    • INSPIRE Air Quality Data: Example
    • What about BIG datasets? Web Services (Download, View, Convert, ETL) Application Programming Interfaces (API) - JSON, KML, etc Pre-cooked, dynamically updated, download ‘packages’? IT Infrastructure Considerations http://www.flickr.com/photos/chipkrieger/10781652035/sizes/z/
    • The answer is probably a combination of all of these approaches http://www.flickr.com/photos/jleighb/142336560/ http://www.flickr.com/photos/16850197@N04/6796313167/sizes/n/ http://www.flickr.com/photos/strawberrylanecakecompany/5847996107/ http://www.flickr.com/photos/happy_homebaking/459555139/sizes/m/ http://en.wikipedia.org/wiki/File:Whitweddingcake.jpg
    • Take home messages Baking the Open Data cake can be difficult - but your consumers and stakeholders will tell you how it tastes. If you haven’t started - get baking! For organisations - Open Data does have many costs. But there is substantial ROI when Open Data + Services are rolled out and adopted. VGI and Citizen Science/Participation: Not to be underestimated. Moving very quickly and gathering momentum. These communities are capable of baking their own open data cakes! Open Data - The Future is NOW! INSPIRE, Big Data, Linked Data, Urban Sensing, Web 3.0, etc…
    • The End: Thanks for listening! Email: peter.mooney@nuim.ie and p.mooney@epa.ie Web: www.peterm7.com/mobile/me http://www.flickr.com/photos/phils-pixels/10728144746/sizes/z/
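
The person-hours estimate on the ROI slide can be reproduced with a few lines of Python. This is only a sketch of the slide's arithmetic: the function name staff_days_saved is hypothetical, and truncating to whole hours is an assumption made here so that the intermediate figures match the slide (46 and 213 hours). Note that 213 / 8 works out to 26.625, slightly above the 26.5 days shown on the slide.

```python
def staff_days_saved(downloads, minutes_per_request, hours_per_day=8):
    """Estimate staff time saved when requests are served by open downloads.

    Follows the slide's arithmetic: downloads x minutes per request,
    converted to hours and then to working days.
    """
    minutes = downloads * minutes_per_request
    # Assumption: the slide appears to drop the fractional hour.
    hours = minutes // 60
    days = hours / hours_per_day
    return minutes, hours, days


if __name__ == "__main__":
    # Inputs taken from the ROI slide (download counts as of Nov 1st 2013).
    for downloads, mins in [(553, 5), (2138, 6)]:
        minutes, hours, days = staff_days_saved(downloads, mins)
        print(downloads, "downloads:", minutes, "minutes,", hours, "hours,", days, "days")
```

The minutes-per-request values (5 and 6) are the slide's own estimates of how long a manual data request takes to handle; changing them, or the 8 hour working day, changes the result proportionally.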