agINFRA – open data infrastructures for research in agriculture

Antun Balaz
Institute of Physics Belgrade
agINFRA in a few sentences
① Based on a linked open data architecture
  harmonizing semantics and ontologies.
② Aggregating data of existing systems and taking
  advantage of advanced Grid services and
  infrastructure.
③ Devised for scalability and maximum interoperability
  by adapting existing widely used components.
④ Fostering diverse communities of heterogeneous
  providers and users.
⑤ Providing researcher-centric services.
Why agINFRA?
Why share data?
• Sharing research data is an intricate and difficult problem.
• Little data sharing appears to take place, with exceptions
  in some domains.
• Sharing takes different forms, from private data exchange to
  posting on-line, and including journal supplementary
  materials.
• There are few standards for giving shared data the required
  computational semantics to build automated tools.
• …however, reusing data is at the core of the scientific
  method
• …and a major concern for scientists and policy makers.
What kinds of data?
• Primary data:
   – Structured data, e.g. datasets as tables
   – Digitized data: images, videos, etc.
• Secondary data:
   – Elaborations of the primary data, e.g. a dendrogram
• Provenance information, including authors, their organizations and
  projects
• Methods and procedures followed
• Reports, including papers
• Secondary documents, e.g. training resources
• Metadata about the above
• Social data, tags, ratings, etc.
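
The kinds of data listed above can be related to one another as linked statements. What follows is a minimal sketch, not agINFRA's actual data model: it stores subject-predicate-object triples in plain Python and follows links from a report back to the data it rests on. All `ex:` identifiers are hypothetical, and the predicate names are borrowed from the Dublin Core and W3C PROV vocabularies purely for illustration.

```python
from collections import defaultdict

# Hypothetical research objects; none of these identifiers come from agINFRA.
TRIPLES = [
    ("ex:dataset1", "rdf:type", "ex:PrimaryDataset"),
    ("ex:dataset1", "dcterms:creator", "ex:researcherA"),       # provenance
    ("ex:dendrogram1", "rdf:type", "ex:SecondaryData"),
    ("ex:dendrogram1", "prov:wasDerivedFrom", "ex:dataset1"),   # elaboration of primary data
    ("ex:paper1", "rdf:type", "ex:Report"),
    ("ex:paper1", "dcterms:references", "ex:dendrogram1"),
]


def build_index(triples):
    """Index objects by (subject, predicate) for quick lookup."""
    index = defaultdict(list)
    for subject, predicate, obj in triples:
        index[(subject, predicate)].append(obj)
    return index


def trace_sources(index, start):
    """Follow derivation and reference links from `start` back to its sources."""
    seen = []
    stack = [start]
    while stack:
        node = stack.pop()
        for predicate in ("prov:wasDerivedFrom", "dcterms:references"):
            for target in index.get((node, predicate), []):
                if target not in seen:
                    seen.append(target)
                    stack.append(target)
    return seen


index = build_index(TRIPLES)
print(trace_sources(index, "ex:paper1"))
```

Starting from the report, the traversal reaches the dendrogram and then the primary dataset, which is the kind of relation between research objects that a shared infrastructure would need to keep explicit.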
Why a data infrastructure?

• Enables relating data and combining and
  contrasting them in novel ways
• Enables scalable processing of research data
• Provides services that are easy to adopt and deploy
• Supports a data-centric, integrated view of research
• Gives coherent support to a variety of research
  objects
Conceptual architecture
A proposal for an agINFRA manifesto
agINFRA values
    We truly believe that scientific data:

A | Open | Must be open and interlinked
    Not subject to barriers, based on standard formats, and avoiding the data
    silos that result from a lack of interrelatedness and from ad-hoc APIs.

B | Meaningful | Must be meaningful through explicit semantics
    Reusing the semantics already provided in mature terminologies and
    ontologies that are exposed and interlinked through the Web.

C | Reliable | Must be reliable, traceable and accessible
    Any kind of research object can be stored in the data infrastructure, and
    there are no barriers to expressing relations between these objects to
    capture the context of research activities.

D | Actionable | Must be actionable through services that empower research
    Data is not useful without flexible and adaptable services that allow
    researchers to act on the data in the ways they need.
agINFRA principles
     We trust that the following principles support the values


Infrastructure
• Be sustainable in the long term
• Allow for heterogeneous and rich kinds of data featuring semantics
• Expose everything as linked open data

People
• Know and adapt to the needs of researchers
• Provide out-of-the-box, easy to adopt components
• Foster collaboration and sharing of data, via search but also casual discovery

Services
• Use existing components supported by strong communities
• Create open services that can be easily composed
• Adapt services to research workflows
For more information, visit us at:

    http://www.aginfra.eu/

agINFRA CEFood Presentation
