How e-infrastructure can contribute to Linked Germplasm Data

Published in: Technology, Education

  1. 1. How e-infrastructure can contribute to Linked Germplasm data
       Giannis Stoitsis, Agro-Know, stoitsis@agroknow.gr
       e-conference on Germplasm Data Interoperability
  2. 2. Contents
       • Why we need e-infrastructure
       • What e-infrastructure can provide
       • The agINFRA approach
       • agINFRA powered services for Germplasm data
       • What is next
  3. 3. WHY WE NEED E-INFRASTRUCTURE
  4. 4. agricultural data
       • publications, theses, reports, other grey literature
       • educational material and content, courseware
       • primary data, such as measurements & observations
         – structured, e.g. datasets as tables
         – digitized, e.g. images, videos
       • secondary data, such as processed elaborations
         – e.g. dendrograms, pie charts, models
       • provenance information, incl. authors, their organizations and projects
       • experimental protocols & methods
       • social data, tags, ratings, etc.
       • …
  5. 5. educators' view
       • stats
       • gene banks
       • GIS data
       • blogs
       • journals
       • open archives
       • raw data
       • technologies
       • learning objects
       • …
  6. 6. researchers' view
       • stats
       • gene banks
       • GIS data
       • blogs
       • journals
       • open archives
       • raw data
       • technologies
       • learning objects
       • …
  7. 7. practitioners' view
       • stats
       • gene banks
       • GIS data
       • blogs
       • journals
       • open archives
       • raw data
       • technologies
       • learning objects
       • …
  8. 8.
       • stats
       • gene banks
       • GIS data
       • blogs
       • journals
       • open archives
       • raw data
       • technologies
       • learning objects
       • …
  9. 9. we still have data silos
       • Many metadata standards (e.g. DC, IEEE LOM, Dw, local schemas)
       • Diversity of web interfaces (e.g. REST, OAI-PMH, SOAP, SPI, SQI)
       • Different exchange formats (e.g. XML, RDF, JSON)
       • Fragmented use of taxonomies
       [Figure: overview of approaches for Linked Data (LD) in educational data/resource sharing: on-the-fly/automated integration of heterogeneous APIs and data (http://www.meducator.net) and dataset (transformation and) cataloging (http://linkedup-project.eu); annotated "We are still here … ?" and "… and not here …"]
  10. 10. we need ontologies published online and aligned
        • stats
        • gene banks
        • blogs
        • journals
        • open archives
        • raw data
        • learning objects
  11. 11. we need tools to share data
  12. 12. we need tools to semantically annotate data
  13. 13. and for all this we need
  14. 14. data infrastructure for agriculture
        • aim is: promoting data sharing and consumption related to any research activity aimed at improving productivity and quality of crops
        ICT for computing, connectivity, storage, instrumentation
  15. 15. what researchers need in agINFRA
        … only a browser and internet connection
  16. 16. typical problem: computing
  17. 17. typical problem: hosting
  18. 18. what can be hosted and executed on agINFRA
        • Data storage & management tools
          – APIs for content dissemination in large networks
        • Processing & visualisation tools
        • Metadata aggregation infrastructure
        • Search engines and apps for institutions or communities
        • Environments for running experiments, e.g. comparing different content recommendation algorithms
  19. 19. http://aginfra.eu/en/our-solution/api
  20. 20. HOW AGINFRA CAN SOLVE DATA INTEROPERABILITY PROBLEMS
  21. 21. WORKFLOW FOR METADATA AGGREGATION
  22. 22. metadata aggregations
        • concern viewing merged collections of metadata records from different sources
        • useful when access is needed to specific supersets or subsets of networked collections
          – records actually stored at the aggregator
          – or queries distributed over virtually aggregated collections
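The first variant on the slide above (records actually stored at the aggregator) can be sketched roughly as follows. This is a minimal illustration, not agINFRA code: the `Record` class, the `aggregate` and `select` helpers, the source names, and the accession identifiers are all hypothetical. The distributed-query variant is not shown.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    identifier: str              # hypothetical shared identifier across sources
    source: str                  # name of the collection the record came from
    fields: dict = field(default_factory=dict)


def aggregate(*collections):
    """Merge several collections into one store keyed by identifier.

    The first source to provide a record wins; later sources only fill in
    fields that are still missing.
    """
    merged = {}
    for collection in collections:
        for rec in collection:
            if rec.identifier not in merged:
                merged[rec.identifier] = Record(rec.identifier, rec.source, dict(rec.fields))
            else:
                for key, value in rec.fields.items():
                    merged[rec.identifier].fields.setdefault(key, value)
    return merged


def select(store, predicate):
    """Return the subset of aggregated records that satisfies the predicate."""
    return [rec for rec in store.values() if predicate(rec)]


# Hypothetical sample collections, for illustration only.
genebank_a = [
    Record("ACC-001", "genebank_a", {"genus": "Triticum"}),
    Record("ACC-002", "genebank_a", {"genus": "Hordeum"}),
]
genebank_b = [
    Record("ACC-001", "genebank_b", {"genus": "Triticum", "country": "GR"}),
    Record("ACC-003", "genebank_b", {"genus": "Zea"}),
]

store = aggregate(genebank_a, genebank_b)                                  # superset
wheat = select(store, lambda rec: rec.fields.get("genus") == "Triticum")   # subset
```

Keying on a shared identifier is the simplest possible merge rule; real aggregators usually need a deduplication policy that copes with sources that do not share identifiers.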
  23. 23. typically look like this
        [Figure: Ternier et al., 2010]
  24. 24. metadata aggregation tools
        More than a harvester:
        • Validation Service
        • Repository Software
        • Registry Service
        • Harvester
        Powered by [logo not captured]
  25. 25. a metadata aggregation workflow that can be ported on agINFRA
        • Harvesting
        • Validating
        • Transforming
        • OAI target - XMLs
        • Storing and indexing
        • Triplification
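The harvesting, validating, transforming, and storing/indexing stages named on the slide above might be chained roughly like this. It is a sketch under stated assumptions, not the agINFRA implementation: the stage order is inferred from the slide, the function names are hypothetical, the validation rule (identifier plus title) is an arbitrary example, and `SAMPLE_RESPONSE` is a hand-written, trimmed stand-in for an OAI-PMH `ListRecords` response so that nothing is fetched over the network. The triplification stage is left out.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Standard OAI-PMH 2.0 and Dublin Core element namespaces.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hypothetical, trimmed ListRecords response (a real one carries more elements).
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Wheat accession passport data</dc:title>
          <dc:subject>germplasm</dc:subject>
        </oai_dc:dc>
      </metadata>
    </record>
    <record>
      <header><identifier>oai:example.org:2</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:subject>record without a title</dc:subject>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""


def list_records_url(base_url, metadata_prefix="oai_dc"):
    """Build the request URL a harvester would send to an OAI target."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return f"{base_url}?{query}"


def harvest(xml_text):
    """Harvesting: pull raw records out of a ListRecords response."""
    root = ET.fromstring(xml_text)
    for rec in root.iterfind(".//oai:record", NS):
        yield {
            "identifier": rec.findtext("oai:header/oai:identifier", default="", namespaces=NS),
            "title": [(e.text or "").strip() for e in rec.iterfind(".//dc:title", NS)],
            "subject": [(e.text or "").strip() for e in rec.iterfind(".//dc:subject", NS)],
        }


def is_valid(record):
    """Validating: example rule requiring an identifier and a non-empty title."""
    return bool(record["identifier"]) and any(record["title"])


def transform(record):
    """Transforming: flatten the harvested record into a simple internal shape."""
    title = next(t for t in record["title"] if t)
    return {"id": record["identifier"], "title": title, "subjects": record["subject"]}


def run_pipeline(xml_text):
    """Storing and indexing: keep valid records and index them by title word."""
    store, index = {}, {}
    for raw in harvest(xml_text):
        if not is_valid(raw):
            continue
        doc = transform(raw)
        store[doc["id"]] = doc
        for token in doc["title"].lower().split():
            index.setdefault(token, set()).add(doc["id"])
    return store, index


store, index = run_pipeline(SAMPLE_RESPONSE)
```

Here `index["wheat"]` holds the identifier of the one record that passed validation; the second sample record is dropped because it has no title.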
  26. 26. TOOLS FOR PUBLISHING AND LINKING VOCABULARIES
  27. 27. AGRICULTURAL DATA DISCOVERY SERVICE/PORTAL OVER THE CLOUD
  28. 28. agricultural data discovery modules for open source CMS
        http://www.youtube.com/watch?v=OYlxWlyag04&feature=youtu.be
  29. 29. LINKING GERMPLASM DATABASES AND EXPOSING DESCRIPTIONS AS LINKED DATA
  30. 30. agINFRA contribution in germplasm data interoperability
        • Define recommendations for describing germplasm data
        • Define mappings between different metadata formats
        • Provide APIs for transformation
          – triplification of germplasm descriptions
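As a rough idea of what "triplification of germplasm descriptions" could mean in practice, the sketch below turns one flat description into N-Triples lines. Everything specific in it is an assumption made for illustration: the `http://example.org/...` namespaces, the field names, and the `triplify` function are hypothetical and do not reflect the agINFRA API or any agreed germplasm vocabulary.

```python
from urllib.parse import quote

# Hypothetical namespaces; a real service would use published vocabularies.
BASE = "http://example.org/accession/"
VOCAB = "http://example.org/germplasm#"


def escape_literal(value):
    """Escape backslashes, double quotes, and newlines for an N-Triples literal."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def triplify(description, id_field="accession_number"):
    """Turn a flat description (dict) into a list of N-Triples lines.

    The identifier becomes the subject IRI; every other non-empty field
    becomes one triple with a plain string literal as its object.
    """
    subject = f"<{BASE}{quote(str(description[id_field]), safe='')}>"
    triples = []
    for key, value in sorted(description.items()):
        if key == id_field or value in (None, ""):
            continue
        triples.append(f'{subject} <{VOCAB}{key}> "{escape_literal(str(value))}" .')
    return triples


# Hypothetical germplasm description, for illustration only.
description = {
    "accession_number": "ACC 001",
    "genus": "Triticum",
    "species": "aestivum",
    "country_of_origin": "",
}
triples = triplify(description)
```

Emitting plain literals keeps the sketch short; linking to other datasets would additionally require object IRIs (for example for taxa or countries) instead of strings.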
  31. 31. mapping between different metadata formats powered by agINFRA
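A mapping between metadata formats is often expressed as a field-level crosswalk. The sketch below assumes two made-up formats (`genebank_local` and `common`) and hypothetical field names; it only illustrates the general idea and is not how agINFRA defines its mappings.

```python
# Hypothetical crosswalk: (source format, target format) -> field renaming table.
CROSSWALKS = {
    ("genebank_local", "common"): {
        "acc_no": "accession_number",
        "taxon_genus": "genus",
        "taxon_species": "species",
    },
}


def map_record(record, source, target, crosswalks=CROSSWALKS):
    """Rename fields according to the crosswalk for (source, target).

    Returns the mapped record plus a sorted list of source fields that have
    no mapping, so gaps in the crosswalk stay visible instead of being
    silently dropped.
    """
    mapping = crosswalks[(source, target)]
    mapped = {mapping[key]: value for key, value in record.items() if key in mapping}
    unmapped = sorted(key for key in record if key not in mapping)
    return mapped, unmapped


# Hypothetical local record, for illustration only.
local_record = {
    "acc_no": "ACC-001",
    "taxon_genus": "Triticum",
    "curator_note": "regenerated",
}
mapped, unmapped = map_record(local_record, "genebank_local", "common")
```

Reporting unmapped fields separately is a deliberate choice: it shows which parts of a local schema the shared format does not yet cover.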
  32. 32. publishing germplasm data as linked data in agINFRA services
  33. 33. next steps in the context of agINFRA
        • Develop the recommendations for publishing germplasm data
        • Deploy transformers and make them available in agINFRA
        • Deploy API for triplification
  34. 34. thank you!
        stoitsis@agroknow.gr
        www.agroknow.gr
        www.aginfra.eu
