agINFRA
A data infrastructure to
support agricultural scientific
communities
Andreas Drakos, University of Alcala
EGI-APARSEN workshop, Amsterdam, 4-6 March 2014
Our project

in agINFRA we will:

share agricultural research…
…over a data e-infrastructure

Agricultural research data
• Primary data:
– Structured, e.g. datasets as tables
– Digitized: images, videos, etc.

• Secondary data (elaborations, e.g. a dendrogram)
• Provenance information, incl. authors, their
organizations and projects
• Methods and procedures followed
• Reports, including papers
• Secondary documents, e.g. training resources
• Metadata about the above
• Social data, tags, ratings, etc.
agINFRA values: scientific data must be

• A. Open: must be open and interlinked
– NOT subject to barriers, based on standard formats and avoiding building
data silos due to lack of interrelatedness and ad-hoc APIs.

• B. Meaningful: must be meaningful through explicit semantics
– Reusing the semantics already provided in mature terminologies and
ontologies that are exposed and interlinked through the Web.

• C. Reliable: must be reliable, traceable and accessible
– Any kind of research objects can be stored in the data infrastructure, and
there are NO barriers to expressing relations between these objects to
capture the context of research activities.

• D. Actionable: must be actionable via services that empower research
– Data is not useful without flexible and adaptable services that allow
researchers to act on the data in the ways they need.
There is a lot of data

Sharing content over agINFRA (diagram, built up over three slides)
• Content provider with unorganised collection (e.g. listed at Web site or in DVD-ROM)
– chooses sharing-compliant tool
– registers as data source
• Content provider with CMS that does not support sharing (e.g. proprietary DB)
– (meta)data export in proprietary format & mapping to known
– ingestion in sharing-compliant tool
– registers as data source
• Content provider with CMS that supports sharing (e.g. OAI-PMH, RSS, ...)
– registers as data source
• Each data source shares (meta)data, e.g. through OAI-PMH
• (Meta)data aggregator
– indexed & available through CIARD RING
– served through agINFRA
• Steps in the diagram are labelled "hosted over agINFRA" or "computed over agINFRA"
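
As a rough illustration of what "shares (meta)data, e.g. through OAI-PMH" means in practice, the sketch below builds an OAI-PMH `ListRecords` request URL of the kind an aggregator would send to a registered data source. The `verb`, `metadataPrefix` and `set` parameters come from the OAI-PMH protocol; the repository address and the function name `list_records_url` are assumptions made for this example and do not appear in the slides.

```python
from typing import Optional
from urllib.parse import urlencode


def list_records_url(base_url: str,
                     metadata_prefix: str = "oai_dc",
                     set_spec: Optional[str] = None) -> str:
    """Build an OAI-PMH ListRecords request URL for a data source.

    base_url is the OAI-PMH endpoint of the repository (hypothetical here).
    metadata_prefix selects the metadata format; "oai_dc" is the
    Dublin Core format that every OAI-PMH repository must support.
    set_spec optionally restricts harvesting to one set of the repository.
    """
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"


# Hypothetical endpoint, used only to show the shape of the request.
print(list_records_url("https://repository.example.org/oai"))
```

An aggregator would fetch this URL and parse the XML response; that part is left out because it depends on the harvester in use.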
Actors over the infrastructure
• Actors: Developers, Information systems providers, Taxonomists, Data providers, Researchers, Policy makers
• Infrastructure components (diagram):
– Registry of Datasets and APIs collections
– Registry of vocabularies and tools
– data sources
– Cloud / SaaS tools
– APIs
– LOD Vocabularies
– agINFRA RDF vocabularies
– agINFRA LOD KOSs
– Public REST APIs
– Grid jobs
– Grid workflows
– Productivity Tools
– Information services
An existing data community

• CIARD: a global community movement to make
agricultural research information and
knowledge publicly accessible to all
– http://www.ciard.net

agINFRA 2nd Review Meeting, 13th of December 2013

A core registry service

• CIARD RING (Routemap to Information Nodes
and Gateways)
– global registry to give access to any kind of
information sources pertaining to agricultural
research for development
– principal tool created through CIARD to allow
information providers to register their services in
various categories and facilitate discovery of
sources of agriculture-related information across
the world
New agINFRA RING

RING data registry usage scenario 1

• data aggregators registering their data providers to CIARD RING
– asking directly to be registered there (AGRIS)
– federating own smaller registries (GLN)
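
The second option, federating a smaller registry into a global one, can be sketched as a merge that skips data sources the global registry already knows. This is only an illustration of the idea: the `DataSource` record, its fields and the `federate` function are hypothetical and are not the CIARD RING data model or API.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class DataSource:
    """Hypothetical registry entry for one data provider."""
    name: str
    endpoint: str   # address where the (meta)data is exposed
    protocol: str   # e.g. "OAI-PMH" or "RSS"


def federate(global_registry: List[DataSource],
             local_registry: List[DataSource]) -> List[DataSource]:
    """Return the global registry extended with unknown local entries.

    Entries are considered the same data source when their endpoints match,
    which is an assumption made for this sketch.
    """
    known = {source.endpoint for source in global_registry}
    merged = list(global_registry)
    for source in local_registry:
        if source.endpoint not in known:
            merged.append(source)
            known.add(source.endpoint)
    return merged


ring = [DataSource("Repo A", "https://a.example.org/oai", "OAI-PMH")]
smaller = [
    DataSource("Repo A (local copy)", "https://a.example.org/oai", "OAI-PMH"),
    DataSource("Repo B", "https://b.example.org/feed", "RSS"),
]
print([s.name for s in federate(ring, smaller)])
```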

RING data registry usage scenario 2

• new data providers using agINFRA cloud tools
can be automatically registered to CIARD RING
– cloud-hosted AgriDrupal or AgriOceanDSpace
instances for document repositories
– cloud-hosted agLR instances for learning
repositories

• agINFRA Cloud hosting services
– In collaboration with other cloud communities
(e.g. OKEANOS/GRNET)
– In collaboration with CHAIN-REDS project etc.
Data provider scenario 1
• Data provider in need of hosting & storage of small-scale CMS
• Use a cloud hosted CMS: sets up own CMS instance
(diagram: the infrastructure components listed under "Actors over the infrastructure")
Data provider scenario 2
• Data provider in need of large-scale hosting & replication CMS
• Requests space/accounts in large-scale CMS
(diagram: the infrastructure components listed under "Actors over the infrastructure")
A semantic backbone for agINFRA

• to help all data providers declare, publish &
link their metadata properties and value
spaces
– Publishing their KOSs using VocBench and their
metadata vocabularies using Neologism
– Linking them to existing vocabularies, e.g. AGROVOC
for KOSs, Dublin Core for metadata

• guidelines & tools to support data providers in
adopting such a LOD framework
– e.g. LODE-BD recommendations

• to provide an entry point to existing relevant
vocabularies
Exposing to the e-infrastructure scenario
• Data provider hosting CMS at own or external/commercial infrastructure
• Interested in exposing (meta)data to the e-infrastructure
(diagram: the infrastructure components listed under "Actors over the infrastructure")
agINFRA LOD layer usage scenario 1
• A data owner wants to share their data as Linked
Data
• The data owner uses non-LOD vocabularies and
KOSs and wants to publish them as LOD and link
them to existing vocabularies
• agINFRA offers tools for publishing vocabularies
and KOSs

Once the vocabularies are published, all metadata
and all concepts have URIs and can be referenced by
any other system
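
A minimal sketch of what "published as LOD" can look like for a single concept: the concept gets a URI, a SKOS label, and a link to an equivalent concept in an existing vocabulary, serialized as N-Triples. The SKOS and RDF property URIs are standard; the base URI, the concept, the target URI and the helper `concept_triples` are placeholders assumed for this example, not output of the agINFRA tools.

```python
from typing import List, Optional

SKOS = "http://www.w3.org/2004/02/skos/core#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def concept_triples(base_uri: str, local_id: str, label: str, lang: str,
                    exact_match: Optional[str] = None) -> List[str]:
    """Return N-Triples lines describing one concept of a KOS.

    The concept URI is base_uri + local_id. If exact_match is given,
    the concept is linked to that URI with skos:exactMatch.
    """
    subject = f"<{base_uri}{local_id}>"
    # Escape backslashes and quotes so the label is a valid literal.
    safe_label = label.replace("\\", "\\\\").replace('"', '\\"')
    triples = [
        f"{subject} <{RDF_TYPE}> <{SKOS}Concept> .",
        f'{subject} <{SKOS}prefLabel> "{safe_label}"@{lang} .',
    ]
    if exact_match:
        triples.append(f"{subject} <{SKOS}exactMatch> <{exact_match}> .")
    return triples


# Placeholder URIs: a local vocabulary and a concept in an existing one.
for line in concept_triples("http://vocab.example.org/crops/", "maize",
                            "maize", "en",
                            "http://thesaurus.example.org/c_1"):
    print(line)
```

Once such triples are served at the concept URI, any other system can reference the concept by that URI.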
agINFRA LOD layer usage scenario 2
• Once KOSs are published, all metadata and all
concepts have URIs and can be referenced by any
other system
• Data aggregators like AGRIS and GLN can create
mash-ups between their core data and other
agricultural data types (e.g. germplasm, soil maps,
statistics, ...) by using the LOD semantic backbone as
a crosswalk between metadata formalizations and
concepts in different vocabularies
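
The crosswalk idea can be sketched as a mapping from each provider's own field names to shared Dublin Core terms, so that records from different sources become comparable. The provider names, their field names and the sample records are invented for illustration; only the Dublin Core element names (`dc:title`, `dc:creator`, `dc:subject`) are real.

```python
from typing import Dict

# Hypothetical crosswalks: local field name -> Dublin Core term.
CROSSWALKS: Dict[str, Dict[str, str]] = {
    "provider_a": {"titulo": "dc:title", "autor": "dc:creator",
                   "tema": "dc:subject"},
    "provider_b": {"name": "dc:title", "author": "dc:creator",
                   "keyword": "dc:subject"},
}


def to_common(provider: str, record: Dict[str, str]) -> Dict[str, str]:
    """Rename the fields of a record to Dublin Core terms.

    Fields without an entry in the provider's crosswalk are dropped.
    """
    mapping = CROSSWALKS[provider]
    return {mapping[field]: value
            for field, value in record.items() if field in mapping}


a = to_common("provider_a", {"titulo": "Soil maps of Spain", "autor": "A. Perez"})
b = to_common("provider_b", {"name": "Maize germplasm", "keyword": "maize",
                             "internal_code": "X1"})
print(a)
print(b)
```

In a LOD setting the mapping would be expressed as links between published vocabulary URIs rather than a hard-coded dictionary; the dictionary keeps the sketch self-contained.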

agINFRA LOD layer usage scenario 2
Example: LOD-based mash-ups in AGRIS (diagram)
• AGRIS RDF store: AGRIS bibliographic metadata (journal, topic, geographic metadata, thematic metadata, scientific names)
• Linked sources: AGRIS Journals, DBpedia, FAO Country Profiles, FAO Fisheries, WorldBank indicators by country
• Information retrieved: info on journal, info on topic, info on country, info on species, specific indicators on country
Workflow architecture (diagram)
• Components: Ariadne harvester; Filtering component; Identification and de-duplication component (get unique ID); Transformation component; Link checking component; PostProcessing/Enrichment component
• Stores: file system (DC, IEEE LOM, MODS XML), shown twice; file system (XMLs); MySQL; metadata stored in JSON
• Records set aside: duplicates; records with broken links
• Annotation: to be ported on the Grid
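
To make the roles of the filtering, de-duplication, link checking and transformation steps concrete, here is a small sketch that runs them over in-memory records. It is an illustration under stated assumptions, not the agINFRA implementation: the record fields (`title`, `url`), the order of the steps, the way the unique ID is derived, and all function names are hypothetical, and the link check only tests that a URL is well formed instead of contacting it.

```python
import hashlib
import json
from typing import Dict, List, Tuple
from urllib.parse import urlparse

Record = Dict[str, str]


def is_valid(record: Record) -> bool:
    """Filtering: keep only records that have a title and a URL."""
    return bool(record.get("title")) and bool(record.get("url"))


def unique_id(record: Record) -> str:
    """Identification: derive an ID from the normalized title and URL."""
    key = f"{record['title'].strip().lower()}|{record['url'].strip().lower()}"
    return hashlib.sha1(key.encode("utf-8")).hexdigest()


def has_wellformed_link(record: Record) -> bool:
    """Link checking stand-in: accept only http(s) URLs with a host."""
    parts = urlparse(record["url"])
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def run_pipeline(records: List[Record]) -> Tuple[List[str], List[Record], List[Record]]:
    """Return (JSON documents, duplicates, records with broken links)."""
    seen = set()
    kept, duplicates, broken = [], [], []
    for record in records:
        if not is_valid(record):
            continue  # dropped by the filtering step
        rid = unique_id(record)
        if rid in seen:
            duplicates.append(record)
            continue
        seen.add(rid)
        if not has_wellformed_link(record):
            broken.append(record)
            continue
        # Transformation: store the metadata as a JSON document.
        kept.append(json.dumps({"id": rid, **record}, sort_keys=True))
    return kept, duplicates, broken


sample = [
    {"title": "Soil maps", "url": "https://repo.example.org/1"},
    {"title": "SOIL MAPS ", "url": "https://repo.example.org/1"},  # duplicate
    {"title": "", "url": "https://repo.example.org/2"},            # filtered out
    {"title": "Germplasm", "url": "not a url"},                    # broken link
]
kept, duplicates, broken = run_pipeline(sample)
print(len(kept), len(duplicates), len(broken))
```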
Thank you!
Questions

Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
 
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with SlackLet's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
 
Digital Marketing Trends in 2024 | Guide for Staying Ahead
Digital Marketing Trends in 2024 | Guide for Staying AheadDigital Marketing Trends in 2024 | Guide for Staying Ahead
Digital Marketing Trends in 2024 | Guide for Staying Ahead
 
Your One-Stop Shop for Python Success: Top 10 US Python Development Providers
Your One-Stop Shop for Python Success: Top 10 US Python Development ProvidersYour One-Stop Shop for Python Success: Top 10 US Python Development Providers
Your One-Stop Shop for Python Success: Top 10 US Python Development Providers
 
Trusted Execution Environment for Decentralized Process Mining
Trusted Execution Environment for Decentralized Process MiningTrusted Execution Environment for Decentralized Process Mining
Trusted Execution Environment for Decentralized Process Mining
 
Ocean lotus Threat actors project by John Sitima 2024 (1).pptx
Ocean lotus Threat actors project by John Sitima 2024 (1).pptxOcean lotus Threat actors project by John Sitima 2024 (1).pptx
Ocean lotus Threat actors project by John Sitima 2024 (1).pptx
 
A Comprehensive Guide to DeFi Development Services in 2024
A Comprehensive Guide to DeFi Development Services in 2024A Comprehensive Guide to DeFi Development Services in 2024
A Comprehensive Guide to DeFi Development Services in 2024
 
Monitoring and Managing Anomaly Detection on OpenShift.pdf
Monitoring and Managing Anomaly Detection on OpenShift.pdfMonitoring and Managing Anomaly Detection on OpenShift.pdf
Monitoring and Managing Anomaly Detection on OpenShift.pdf
 
Skybuffer AI: Advanced Conversational and Generative AI Solution on SAP Busin...
Skybuffer AI: Advanced Conversational and Generative AI Solution on SAP Busin...Skybuffer AI: Advanced Conversational and Generative AI Solution on SAP Busin...
Skybuffer AI: Advanced Conversational and Generative AI Solution on SAP Busin...
 
Nordic Marketo Engage User Group_June 13_ 2024.pptx
Nordic Marketo Engage User Group_June 13_ 2024.pptxNordic Marketo Engage User Group_June 13_ 2024.pptx
Nordic Marketo Engage User Group_June 13_ 2024.pptx
 
System Design Case Study: Building a Scalable E-Commerce Platform - Hiike
System Design Case Study: Building a Scalable E-Commerce Platform - HiikeSystem Design Case Study: Building a Scalable E-Commerce Platform - Hiike
System Design Case Study: Building a Scalable E-Commerce Platform - Hiike
 
Serial Arm Control in Real Time Presentation
Serial Arm Control in Real Time PresentationSerial Arm Control in Real Time Presentation
Serial Arm Control in Real Time Presentation
 
Generating privacy-protected synthetic data using Secludy and Milvus
Generating privacy-protected synthetic data using Secludy and MilvusGenerating privacy-protected synthetic data using Secludy and Milvus
Generating privacy-protected synthetic data using Secludy and Milvus
 

agINFRA EGI-APARSEN workshop, Amsterdam, 4-6 March 2014

  • 1. agINFRA: A data infrastructure to support agricultural scientific communities. Andreas Drakos, University of Alcala. EGI-APARSEN workshop, Amsterdam, 4-6 March 2014
  • 2. Our project in agINFRA we will: share agricultural research… …over a data e-infrastructure
  • 3. Agricultural research data • Primary data: – Structured, e.g. datasets as tables – Digitized: images, videos, etc. • Secondary data (elaborations, e.g. a dendrogram) • Provenance information, incl. authors, their organizations and projects • Methods and procedures followed • Reports, including papers • Secondary documents, e.g. training resources • Metadata about the above • Social data, tags, ratings, etc.
  • 4. agINFRA values: scientific data must be A | Open | Must be open and interlinked: NOT subject to barriers, based on standard formats and avoiding building data silos due to lack of interrelatedness and ad-hoc APIs. B | Meaningful | Must be meaningful through explicit semantics: reusing the semantics already provided in mature terminologies and ontologies that are exposed and interlinked through the Web. C | Reliable | Must be reliable, traceable and accessible: any kind of research object can be stored in the data infrastructure, and there are NO barriers to expressing relations between these objects to capture the context of research activities. D | Actionable | Must be actionable via services that empower research: data is not useful without flexible and adaptable services that allow researchers to act on the data in the ways they need.
  • 5. There is a lot of data
  • 6. CONTENT PROVIDER WITH UNORGANISED COLLECTION (e.g. listed at Web site or in DVD-ROM) chooses sharing compliant tool register as data source hosted over agINFRA (meta)data export in proprietary format & ingestion in sharing mapping to known compliant tool CONTENT PROVIDER WITH CMS THAT DOES NOT SUPPORT SHARING (e.g. proprietary DB) register as data source hosted over agINFRA computed over agINFRA register as data source hosted over agINFRA CONTENT PROVIDER WITH CMS THAT SUPPORTS SHARING (e.g. OAI-PMH, RSS,...)
  • 7. shares (meta)data e.g. through OAI-PMH computed over agINFRA hosted over agINFRA shares (meta)data e.g. through OAI-PMH computed over agINFRA computed over agINFRA (META)DATA AGGREGATOR indexed & available through CIARD RING served through agINFRA shares (meta)data e.g. through OAI-PMH computed over agINFRA
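
As a rough illustration of the "shares (meta)data e.g. through OAI-PMH" step, the sketch below builds a ListRecords request and pulls Dublin Core titles out of a response. The endpoint URL and the sample response are invented for illustration and do not describe any actual agINFRA service; only the verb, the `metadataPrefix` parameter and the XML namespaces come from the OAI-PMH and Dublin Core specifications.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

OAI_NS = "http://www.openarchives.org/OAI/2.0/"
DC_NS = "http://purl.org/dc/elements/1.1/"


def list_records_url(base_url, metadata_prefix="oai_dc"):
    """Build an OAI-PMH ListRecords request URL for a given endpoint."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return f"{base_url}?{query}"


def extract_titles(xml_text):
    """Return (identifier, title) pairs found in a ListRecords response."""
    root = ET.fromstring(xml_text)
    pairs = []
    for record in root.iter(f"{{{OAI_NS}}}record"):
        identifier = record.findtext(f"{{{OAI_NS}}}header/{{{OAI_NS}}}identifier")
        title = record.findtext(f".//{{{DC_NS}}}title")
        pairs.append((identifier, title))
    return pairs


# Hypothetical, hand-written response used instead of a live network call.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Soil survey dataset</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

if __name__ == "__main__":
    print(list_records_url("http://example.org/oai"))
    print(extract_titles(SAMPLE_RESPONSE))
```

An aggregator would presumably fetch the URL, follow resumption tokens for large collections, and feed the parsed records into its own index; those parts are left out here.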
  • 8. computed over agINFRA computed over agINFRA … hosted over agINFRA computed over agINFRA
  • 9. Actors over the infrastructure Registry of Datasets and APIs collections Registry of vocabularies and tools data sources Cloud / SaaS tools APIs LOD Vocabularies agINFRA RDF vocabularies Public REST APIs Grid jobs Grid workflows Productivity Tools Information services agINFRA LOD KOSs
  • 10. Actors over the infrastructure Developers Information systems providers Registry of Datasets and APIs collections Registry of vocabularies and tools data sources Cloud / SaaS tools Public REST APIs Grid jobs Grid workflows Productivity Tools Taxonomists APIs LOD Vocabularies Data providers agINFRA RDF vocabularies agINFRA LOD KOSs Researchers Information services Policy makers
  • 11. An existing data community • a global community movement to make agricultural research information and knowledge publicly accessible to all – http://www.ciard.net agINFRA 2nd Review Meeting, 13th of December 2013
  • 12. A core registry service • CIARD RING (Routemap to Information Nodes and Gateways) – global registry to give access to any kind of information sources pertaining to agricultural research for development – principal tool created through CIARD to allow information providers to register their services in various categories and facilitate discovery of sources of agriculture-related information across the world
  • 13. New agINFRA RING
  • 14. New agINFRA RING
  • 15. RING data registry usage scenario 1 • data aggregators registering their data providers to CIARD RING – asking directly to be registered there (AGRIS) – federating own smaller registries (GLN)
  • 16. RING data registry usage scenario 2 • new data providers using agINFRA cloud tools can be automatically registered to CIARD RING – cloud-hosted AgriDrupal or AgriOceanDSpace instances for document repositories – cloud-hosted agLR instances for learning repositories • agINFRA Cloud hosting services – In collaboration with other cloud communities (e.g. OKEANOS/GRNET) – In collaboration with CHAIN-REDS project etc.
  • 17. Data provider scenario 1 Data provider in need of hosting & storage of small-scale CMS Use a cloud-hosted CMS Cloud / SaaS tools Registry of Datasets and APIs collections Registry of vocabularies and tools data sources APIs LOD Vocabularies Public REST APIs Grid jobs Grid workflows Productivity Tools agINFRA RDF vocabularies agINFRA LOD KOSs sets up own CMS instance Information services
  • 18. Data provider scenario 2 Data provider in need of large-scale hosting & replication CMS Requests space/accounts in large-scale CMS Cloud / SaaS tools Registry of Datasets and APIs collections Registry of vocabularies and tools data sources APIs LOD Vocabularies agINFRA RDF vocabularies Public REST APIs Grid jobs Grid workflows Productivity Tools Information services agINFRA LOD KOSs
  • 19. A semantic backbone for agINFRA • to help all data providers declare, publish & link their metadata properties and value spaces – Publishing their KOSs using the VocBench and their metadata vocabularies using Neologism – Linking them to existing vocabularies, e.g. AGROVOC for KOSs, Dublin Core for metadata • guidelines & tools to support data providers in adopting such a LOD framework – e.g. LODE-BD recommendations • to provide an entry point to existing relevant vocabularies
  • 20. Exposing to the e-infrastructure scenario Data provider hosting CMS at own or external/commercial infrastructure Interested to expose (meta)data to e-infrastructure Cloud / SaaS tools Registry of Datasets and APIs collections Registry of vocabularies and tools data sources APIs LOD Vocabularies agINFRA RDF vocabularies Public REST APIs Grid jobs Grid workflows Productivity Tools Information services agINFRA LOD KOSs
  • 21. agINFRA LOD layer usage scenario 1 • A data owner wants to share their data as Linked Data • The data owner uses non-LOD vocabularies and KOSs and wants to publish them as LOD and link them to existing vocabularies • agINFRA offers tools for publishing vocabularies and KOSs • Once the vocabularies are published, all metadata and all concepts have URIs and can be referenced by any other system
  • 22. agINFRA LOD layer usage scenario 2 • Once KOSs are published, all metadata and all concepts have URIs and can be referenced by any other system • Data aggregators like AGRIS and GLN can create mash-ups between their core data and other agricultural data types (e.g. germplasm, soil maps, statistics, ...) by using the LOD semantic backbone as a crosswalk between metadata formalizations and concepts in different vocabularies
  • 23. agINFRA LOD layer usage scenario 2 Example: LOD-based mash-ups in AGRIS AGRIS bibliographic metadata Journal AGRIS Journals RDF store Topic Geographic metadata Thematic metadata DBpedia Scientific names FAO Country Profiles FAO Fisheries WorldBank indicators by country Info on journal Info on topic Info on country Info on species Specific indicators on country
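
The mash-up idea in scenario 2 can be sketched very roughly as a join on shared URIs: once a bibliographic record and the external sources refer to the same concept by the same URI, enriching the record is a lookup. Everything below (the URIs, the field names, the source names and the `mash_up` helper) is a made-up placeholder and not actual AGRIS, DBpedia or FAO data or code.

```python
# All URIs and values are invented placeholders for illustration only.
bibliographic_record = {
    "title": "Example article on rice cultivation",
    "topic": "http://example.org/concept/rice",
    "country": "http://example.org/country/IT",
}

# Hypothetical external sources, each keyed by concept URI.
external_sources = {
    "topic_info": {
        "http://example.org/concept/rice": "Description of rice from a topic source",
    },
    "country_profile": {
        "http://example.org/country/IT": "Country profile text for Italy",
    },
}


def mash_up(record, sources):
    """Attach external information to a record by matching shared URIs."""
    enriched = dict(record)
    # Treat any field value that looks like a URI as a linkable concept.
    record_uris = {v for v in record.values() if v.startswith("http")}
    for source_name, entries in sources.items():
        for uri in record_uris:
            if uri in entries:
                enriched[source_name] = entries[uri]
    return enriched


if __name__ == "__main__":
    for key, value in mash_up(bibliographic_record, external_sources).items():
        print(f"{key}: {value}")
```

In a real LOD setting the lookup would more likely be a SPARQL query or a dereferenced URI than an in-memory dictionary; the dictionary only stands in for the crosswalk.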
  • 24. Workflow architecture File system (DC, IEEE LOM, MODS XML) Stores Ariadne harvester File system (DC, IEEE LOM, MODS XML) Stores Filtering component To be ported on the Grid MySQL Records with Broken Links File system (XMLs) Get unique ID Identification and de-duplication component Transformation component Stores Duplicates Store metadata in JSON Link checking component Post-processing/Enrichment component
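
To make the components named on the workflow slide more concrete, here is a minimal sketch of filtering, link checking, de-duplication and JSON transformation chained together. The ordering of the stages, the record fields (`title`, `url`), the hash-based ID and the injected `is_reachable` callable are all assumptions made for illustration; the slide does not specify them, and this is not the agINFRA implementation.

```python
import hashlib
import json


def filter_records(records, required=("title", "url")):
    """Filtering component: keep records that have all required fields."""
    return [r for r in records if all(r.get(field) for field in required)]


def split_by_link(records, is_reachable):
    """Link checking component: separate records with broken links."""
    ok, broken = [], []
    for record in records:
        (ok if is_reachable(record["url"]) else broken).append(record)
    return ok, broken


def unique_id(record):
    """Derive a stable ID from the normalised title and URL (assumed scheme)."""
    key = f"{record['title'].strip().lower()}|{record['url']}"
    return hashlib.sha1(key.encode("utf-8")).hexdigest()


def deduplicate(records):
    """Identification and de-duplication component."""
    seen, unique, duplicates = set(), [], []
    for record in records:
        rid = unique_id(record)
        if rid in seen:
            duplicates.append(record)
        else:
            seen.add(rid)
            unique.append({**record, "id": rid})
    return unique, duplicates


def run_pipeline(records, is_reachable):
    """Run all stages and return (JSON text, broken records, duplicates)."""
    filtered = filter_records(records)
    reachable, broken = split_by_link(filtered, is_reachable)
    unique, duplicates = deduplicate(reachable)
    return json.dumps(unique, sort_keys=True), broken, duplicates


# Hypothetical harvested records and a stand-in for a real HTTP check.
SAMPLE_RECORDS = [
    {"title": "Maize yields", "url": "http://example.org/a"},
    {"title": "maize yields ", "url": "http://example.org/a"},
    {"title": "Dead link", "url": "http://example.org/dead"},
    {"title": "", "url": "http://example.org/b"},
]


def fake_reachable(url):
    return not url.endswith("/dead")


if __name__ == "__main__":
    output, broken, duplicates = run_pipeline(SAMPLE_RECORDS, fake_reachable)
    print(output)
    print(len(broken), "broken,", len(duplicates), "duplicate")
```

Passing the link check in as a function keeps the sketch runnable offline; a real deployment would replace `fake_reachable` with an HTTP request and would write broken and duplicate records to their own stores, as the slide's labels suggest.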