
Tutorial on grid-powered data aggregation and accessing datasets


Hands-on with data infrastructures that can power your agricultural data products


  1. Open Data in Agriculture: Hands-on with data infrastructures that can power your agricultural data products. 12/12/2013, Athens, Greece. Supported by EU projects.
  2. Tutorial on grid-powered data aggregation and accessing datasets. Nikos Manolis, Agro-Know Technologies
  3. There is a lot of data
  4. Need for data aggregation and harmonization
  5. Objectives. This presentation aims to provide information on:
     • How to use a service for aggregating datasets
     • How to get already processed datasets
     • How to search processed datasets with a search API
       – Educational: GLN API (21008 metadata records)
       – Bibliographic: ABN API (451602 metadata records)
  6. The agDataHarvester service
     • Implements the OAI-PMH protocol to harvest metadata records from open data providers
       – REST-based API
       – Harvested dataset available through HTTP
  7. agDataHarvester parameters
         {
           "document_type": "harvesting_target",
           "harvesting_target": {
             "name": "Repository name",
             "description": "Short Repository Description",
             "url": "OAI-PMH target URL",
             "type": "metadata format prefix",
             "frequency": hours
           }
         }
  8. param.json
         {
           "document_type": "harvesting_target",
           "harvesting_target": {
             "name": "Indian Academy of Science",
             "description": "Indian Academy of Science",
             "url": "http://repository.ias.ac.in/cgi/oai2",
             "type": "mets",
             "frequency": 24
           }
         }
     Register the harvesting target:
         curl -X POST -d@param.json http://'demo001':aginfra@agro.ipb.ac.rs/agcouchdb
         curl -X POST -d@INDUS.json http://'demo001':aginfra@agro.ipb.ac.rs/agcouchdb
     Response:
         {
           "ok": true,
           "id": "5c56a3fa18fa21d2a85fd63cc9eb78ac",
           "rev": "1-19ef1210376df8f1695a32b53ecb963a"
         }
  9. Get details on the dataset
     http://agro.ipb.ac.rs/agcouchdb/_design/datasets/_list/search/list?dataset.process_parameter_id=5c56a3fa18fa21d2a85fd63cc9eb78ac
  10. Get details on the dataset
      http://agro.ipb.ac.rs/agcouchdb/_design/datasets/_view/list_by_process?key=agdataharvester
          {
            "id": "6796259b52d79e4797e210c06e6a0aee",
            "key": "6796259b52d79e4797e210c06e6a0aee",
            "value": {
              "_id": "6796259b52d79e4797e210c06e6a0aee",
              "_rev": "1-d55d7bc90d26db64dae328c9328e4e4a",
              "document_type": "harvesting_target",
              "harvesting_target": {
                "name": "WorldBank",
                "description": "The World Bank - Open Knowledge Repository",
                "url": "https://openknowledge.worldbank.org/oai/request",
                "type": "mets",
                "frequency": 24
              },
              "document_publisher": {
                "address": "83.212.96.169",
                "author": "demo001",
                "utc_datetime": "Wed Dec 11 11:58:45 2013",
                "utc_timestamp": 1386763125
              }
            }
          }
  11. The agWorkflow service
      • I want all datasets with educational resources processed by the agINFRA-powered aggregation workflow!
        http://agro.ipb.ac.rs/agcouchdb/_design/datasets/_list/search/list?dataset.process=agworkflow&dataset.type=oai_lom&dataset.accuracy=true
      • I want all datasets with bibliographic resources processed by the agINFRA-powered aggregation workflow!
        http://agro.ipb.ac.rs/agcouchdb/_design/datasets/_list/search/list?dataset.process=agworkflow&dataset.type=oai_agris&dataset.accuracy=true
  12. Is there a way to search the available datasets?
  13. Search API
      • REST-based queries over harmonized information (the result of metadata processing)
      • Two data models supported
        – akif: describing educational resources for agriculture, http://domain/search-api/v1/akif/?q=*
        – agrif: describing bibliographic resources for agriculture (mainly from FAO's data), http://domain/search-api/v1/agrif/?q=*
  14. Search options
      • Simple search: http://domain/search-api/v1/akif/?q=tomato
      • Searching within specific fields: http://BASE_URL/search-api/v1/akif/?languageBlocks.en.description=tomato
      • Temporal: http://BASE_URL/search-api/v1/akif/?creationDate=2013-04-16
      • Fetching specific items: http://BASE_URL/search-api/v1/akif/COLLECTION/20296
  15. Managing results
      • Sorting results, e.g. ?q=*&sort_by=creationDate&sort_order=desc
      • Facets, e.g. ?facets=set&facet_size=3
      • Pagination, e.g. ?q=sea&page_size=25&page=3
  16. Nikos Manolis, Agro-Know Technologies, manolisn@agroknow.gr
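
The registration step on slides 7 and 8 (write a parameter document, then POST it with curl) can also be sketched in Python. This is only an illustration under assumptions: the helper names `build_harvesting_target` and `build_post_request` are hypothetical, the `Content-Type` header is an assumption (the curl call on the slide sets none), and the demo credentials from the slide are left out. The request is prepared but not sent.

```python
import json
import urllib.request

# Endpoint shown on slide 8, without the demo credentials.
COUCHDB_URL = "http://agro.ipb.ac.rs/agcouchdb"


def build_harvesting_target(name, description, url, metadata_prefix, frequency_hours):
    """Return a parameter document in the shape shown on slides 7 and 8."""
    return {
        "document_type": "harvesting_target",
        "harvesting_target": {
            "name": name,
            "description": description,
            "url": url,                    # OAI-PMH target URL
            "type": metadata_prefix,       # metadata format prefix, e.g. "mets"
            "frequency": frequency_hours,  # harvesting interval in hours
        },
    }


def build_post_request(document, endpoint=COUCHDB_URL):
    """Prepare (but do not send) the POST that the curl command performs."""
    body = json.dumps(document).encode("utf-8")
    # Assumption: the service accepts a JSON content type.
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


target = build_harvesting_target(
    "Indian Academy of Science",
    "Indian Academy of Science",
    "http://repository.ias.ac.in/cgi/oai2",
    "mets",
    24,
)
request = build_post_request(target)
print(request.get_method(), request.full_url)
print(json.dumps(target, indent=2))
```

Passing `request` to `urllib.request.urlopen` would send it; judging by slide 8, the reply should carry the `id` that slide 9 then uses as `dataset.process_parameter_id`.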

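The dataset lookups on slides 9 to 11 all follow the same URL pattern, so they can be generated rather than typed by hand. The sketch below assumes nothing beyond the URLs shown on those slides; the helper names `search_list_url`, `list_by_process_url` and `summarize_row` are hypothetical, and `summarize_row` assumes a row shaped like the example on slide 10.

```python
from urllib.parse import urlencode

# Design document path taken from the URLs on slides 9 to 11.
BASE = "http://agro.ipb.ac.rs/agcouchdb/_design/datasets"


def search_list_url(**filters):
    """Build a _list/search/list URL; each filter becomes a dataset.<name> parameter."""
    params = {f"dataset.{key}": value for key, value in filters.items()}
    return f"{BASE}/_list/search/list?{urlencode(params)}"


def list_by_process_url(process):
    """Build the _view/list_by_process URL shown on slide 10."""
    return f"{BASE}/_view/list_by_process?{urlencode({'key': process})}"


def summarize_row(row):
    """One-line summary of a row shaped like the slide 10 example."""
    target = row["value"]["harvesting_target"]
    publisher = row["value"]["document_publisher"]
    return (
        f"{target['name']} ({target['type']}, every {target['frequency']} h), "
        f"registered by {publisher['author']}"
    )


# Slide 9: details of the dataset created by a given registration.
details_url = search_list_url(process_parameter_id="5c56a3fa18fa21d2a85fd63cc9eb78ac")

# Slide 11: educational (oai_lom) and bibliographic (oai_agris) datasets.
educational_url = search_list_url(process="agworkflow", type="oai_lom", accuracy="true")
bibliographic_url = search_list_url(process="agworkflow", type="oai_agris", accuracy="true")

# Trimmed copy of the slide 10 example row.
example_row = {
    "value": {
        "harvesting_target": {"name": "WorldBank", "type": "mets", "frequency": 24},
        "document_publisher": {"author": "demo001"},
    }
}

print(details_url)
print(educational_url)
print(bibliographic_url)
print(list_by_process_url("agdataharvester"))
print(summarize_row(example_row))
```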
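
The Search API options on slides 13 to 15 (simple search, field search, temporal filter, sorting, facets, pagination) are all query-string parameters on one endpoint. A minimal sketch of a URL builder follows. The helper names `search_url` and `item_url` are hypothetical, and `http://domain` is the placeholder host from the slides rather than a real endpoint, so the sketch only builds URLs and does not call the service.

```python
from urllib.parse import urlencode

# Placeholder host used on the slides; replace with the real base URL.
BASE_URL = "http://domain"
MODELS = ("akif", "agrif")  # educational and bibliographic data models


def search_url(model, params, base_url=BASE_URL):
    """Build a search URL for one data model from a dict of query parameters."""
    if model not in MODELS:
        raise ValueError(f"unknown data model: {model!r}")
    # Keep '*' literal so the match-all query reads as on the slides.
    return f"{base_url}/search-api/v1/{model}/?{urlencode(params, safe='*')}"


def item_url(model, collection, item_id, base_url=BASE_URL):
    """Build the URL for fetching one specific item from a collection."""
    if model not in MODELS:
        raise ValueError(f"unknown data model: {model!r}")
    return f"{base_url}/search-api/v1/{model}/{collection}/{item_id}"


# Slide 14: search options.
print(search_url("akif", {"q": "tomato"}))
print(search_url("akif", {"languageBlocks.en.description": "tomato"}))
print(search_url("akif", {"creationDate": "2013-04-16"}))
print(item_url("akif", "COLLECTION", 20296))

# Slide 15: sorting, facets and pagination.
print(search_url("akif", {"q": "*", "sort_by": "creationDate", "sort_order": "desc"}))
print(search_url("akif", {"facets": "set", "facet_size": 3}))
print(search_url("akif", {"q": "sea", "page_size": 25, "page": 3}))
```

Taking the parameters as a dict rather than keyword arguments keeps dotted field names such as `languageBlocks.en.description` usable.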