
iMarine catalogue of services


iMarine solutions and benefits for communities.

Published in: Technology

iMarine catalogue of services

  1. iMarine Catalogue of Services
     Pasquale Pagano (CNR), iMarine Technical Director, pasquale.pagano@is?
     iMarine data platform for collaborations, 7th March 2014, 09:00 – 17:30
     Food and Agriculture Organization of the United Nations (FAO) Headquarters
  2. The Catalogue of Services
     iMarine is exploiting a Hybrid Data Infrastructure combining over 500 software components into a coherent and centrally managed system of hardware, software, and data resources.
  3. Born from the user needs
     • I need to host my applications in a secure and scalable environment
     • I need to maintain my database
     • I need to back up my data
     • I need to deliver my data to a set of known people
     • I need to analyse my big datasets
  4. Born from the user needs
     • I need to manage and analyze biological and ecological data
     • I need to manage the full data life-cycle from import to validation, curation, harmonization and publication
     • I need to offer to my team a powerful tool to manage code-lists
     • I need to store and analyze geospatial explicit information
     • I want to offer a flexible sharing, storage, reporting, search and retrieval tool
  5. Born from the user needs
     • I need to access authoritative biological and ecological data
     • I wish to simplify the access to my geospatial data
     • I need to mash-up statistical and biodiversity data
     • I need to reduce the costs of data maintenance of my dept.
     • I need to validate my datasets and provide a standard access to them
  6. User Needs Analysis
     • Needs
       – Not isolated
       – Not disconnected
       – Not trivial
     • Solutions
       – Actual but with an eye to the future
       – Designed for individuals but looking at the community
  7. Capacities: Storage as Service
     Services: Virtual Workspace; Relational Databases; Large and Active data storage; Spatial Database
     Features listed on the slide:
     • Scalability and high availability
     • Across sites
     • ISO 19115/19139 Metadata
     • Catalogue
     • Open source RDBMS
     • Up to 1 TB data
     • Secure
     • Fault-tolerant
     • Replication
  8. Capacities: Computing as Service
     Services: Hadoop; Statistical Manager; R clusters
     Features listed on the slide:
     • MapReduce
     • Analysis/clustering/modeling
     • Windows and Linux
     1000 CPUs Currently Available
  9. Applications
     • Management and interpretation of biological and ecological data in the environment
     • Complete full life-cycle data framework, from observational data to aggregated data repositories enriched with validation and analytical tools
     • Storage and interpretation of geospatial explicit information, including WPS processing
     • Flexible sharing, storage, reporting, search and retrieval, aggregation and projection facilities
     A BUNDLE is a set of services and technologies grouped according to a family of related tasks for achieving a common objective.
  10. Applications
     • Occurrence and Taxonomic Data Discovery
     • Occurrence Data Processing
     • Species Distribution Modeling
     • Species Distribution Maps Discovery
     • Taxonomic Data Comparison
     • Taxonomic Data Matching
     • Code List Discovery
     • Code List Management
     • Statistical Engine
     • Tabular Data Discovery
     • Tabular Data Enrichment
     • Tabular Data Management
     • Tabular Data Processing
     • Geospatial Data Discovery
     • Geospatial Data Processing
     • Enhanced Documents Management
     • Fact-sheets Management
     • Information Object Discovery
     • Messaging
     • Shared Workspace
     • Social Networking Facilities
     A BUNDLE is a set of services and technologies grouped according to a family of related tasks for achieving a common objective.
  11. Features Clustering with StatsCube
     • Presence Points (FishBase + OBIS)
     • Density Based Clustering: DBSCAN (with outliers)
     • Other methods are also available: K-Means, X-Means
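The density-based clustering named on slide 11 can be illustrated with a minimal DBSCAN sketch. This is not the StatsCube implementation, which the slides do not show: the function name `dbscan`, the parameters `eps` and `min_pts`, and the use of plain Euclidean distance on coordinate pairs are all assumptions made for illustration.

```python
import math

NOISE = -1  # label given to outliers (points in no dense region)


def dbscan(points, eps, min_pts):
    """Return one label per point: a cluster id (0, 1, ...) or NOISE.

    Hypothetical helper for illustration; distances are plain Euclidean,
    which is only a rough stand-in for distances between map coordinates.
    """
    labels = [None] * len(points)

    def neighbours(i):
        # Indices of all points within eps of point i (including i itself).
        return [j for j, q in enumerate(points)
                if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue  # already assigned to a cluster or marked as noise
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be re-labelled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nearby = neighbours(j)
            if len(nearby) >= min_pts:
                queue.extend(nearby)  # j is a core point: keep expanding
    return labels


# Made-up presence points: two dense groups and one isolated outlier.
presence_points = [(0, 0), (0, 1), (1, 0),
                   (10, 10), (10, 11), (11, 10),
                   (50, 50)]
labels = dbscan(presence_points, eps=1.5, min_pts=3)
```

With these made-up points the two dense groups receive cluster ids 0 and 1 and the isolated point is labelled as noise, which is the "with outliers" behaviour the slide refers to.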
  12. Data Analysis with StatsCube
     • Import CodeLists
     • Validate Datasets
     • Analyse and Project
  13. Ecological Modeling with BiolCube
  14. Maps Comparison with GeosCube
     FAO Eleutheronema tetradactylum VS AquaMaps Eleutheronema tetradactylum
     MEAN=0.81
     VARIANCE=0.02
     NUMBER_OF_ERRORS=6691
     NUMBER_OF_COMPARISONS=259200
     ACCURACY=97.42
     MAXIMUM_ERROR=1.0
     MAXIMUM_ERROR_POINT=3005:363:1
     COHENS_KAPPA=0.218
     COHENS_KAPPA_CLASSIFICATION_LANDIS_KOCH=Fair
     COHENS_KAPPA_CLASSIFICATION_FLEISS=Marginal
     TREND=EXPANSION
     RESOLUTION=0.5
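The figures reported on slide 14 can be cross-checked with a short sketch. The slides do not say how GeosCube computes them, so the formulas below are assumptions: accuracy taken as the share of comparisons without error, the comparison count taken as the number of cells in a global grid at the stated resolution in degrees, and Cohen's kappa in its textbook form for two label sequences. All function names are hypothetical.

```python
def accuracy_percent(errors, comparisons):
    """Share of comparisons without error, as a percentage (assumed formula)."""
    return 100.0 * (1.0 - errors / comparisons)


def global_grid_cells(resolution_deg):
    """Number of cells in a global longitude/latitude grid at this resolution."""
    return round(360 / resolution_deg) * round(180 / resolution_deg)


def cohens_kappa(a, b):
    """Textbook Cohen's kappa for two equally long sequences of labels."""
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    categories = set(a) | set(b)
    # Agreement expected by chance, from each rater's label frequencies.
    expected = sum((a.count(c) / n) * (b.count(c) / n) for c in categories)
    if expected == 1.0:
        return 1.0  # both sequences use a single identical label
    return (observed - expected) / (1.0 - expected)


def landis_koch(kappa):
    """Landis and Koch (1977) verbal scale for kappa."""
    if kappa < 0.0:
        return "Poor"
    for upper, name in [(0.20, "Slight"), (0.40, "Fair"), (0.60, "Moderate"),
                        (0.80, "Substantial")]:
        if kappa <= upper:
            return name
    return "Almost perfect"


# Values copied from the slide.
slide_accuracy = round(accuracy_percent(6691, 259200), 2)
slide_cells = global_grid_cells(0.5)
slide_class = landis_koch(0.218)

# Tiny made-up presence/absence maps, flattened to lists of 0/1 cells.
toy_kappa = cohens_kappa([1, 1, 0, 0], [1, 0, 0, 0])
```

Under these assumptions the slide's own numbers are mutually consistent: 6691 errors out of 259200 comparisons gives 97.42, a 0.5 degree global grid has 720 × 360 = 259200 cells, and a kappa of 0.218 falls in the "Fair" band of the Landis and Koch scale. The "Marginal" label attributed to Fleiss is not reproduced here because the slide does not give the thresholds it uses.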
  15. Data
     Sources: OBIS, WoRMS, WoRDS, GBIF, CoL, ITIS, IRMNG, NCBI, MyOcean, WOA, EuroStat, Data.FAO, …
     iMarine: Registries, Validation, Enriching, Processing, Sharing
  16. Data
     • Biological and Ecological Data: DarwinCore / ISO19139
       – >35 M Observations (OBIS)
       – ≈ 120 K Observed Species (OBIS)
       – ≈ 500 K Taxa (WoRMS)
       – >600 K Scientific Names (ITIS)
       – >12 K Species Maps (AquaMaps)
       – ≈ 600 Species Extent (FAO)
       – … FishBase, SeaLifeBase
       – … CoL, GBIF
     • Statistical Data: SDMX *
       – FAO CodeLists
       – IRD CodeLists
       – FAO datasets
       – Eurostat
       – …
     • GeoSpatial Data: ISO19139 (OGC W*S), > 350 variables
       – 10 years Chemical and Physical variables in 2D space: Ice concentration and velocity, Chlorophyll, Oxygen, Nitrate, Phosphate, Phytoplankton as carbon, Salinity, Temperature, …
       – On-demand Chemical and Physical variables in 3D space: Apparent Oxygen Utilization, Dissolved Oxygen, Salinity, Temperature, …
     • Documents: OAI-PMH, OpenSearch
       – FAO Factsheets
       – Aquatic Commons
       – Bioline International
       – Biodiversity Heritage
       – OceanDocs
       – Nature, Pensoft Journals
       – …
     • Ontologies and Data Warehouses: RDF, OWL
       – FAO FLOD
       – Marine Top Level Ontology
       – IRD Ecoscope
       – FactForge, Yago2
       – …
  17. Is this enough?
     • An ecosystem of participatory data e-Infrastructures
     • Regulated by policies
     • Enabled by standards
     • Promoting not only access but mash-up of heterogeneous data
     User centric
  18. Virtual Research Environment
     iMarine is user-centric and workflow-oriented thanks to the gCube VRE technology.
     A Virtual Research Environment (VRE) is
     • a distributed and dynamically created environment
     • where a subset of data, services, computational, and storage resources
     • regulated by tailored policies
     • are assigned to a subset of users via interfaces
     • for a limited timeframe
     • at little or no cost for the providers of the participatory data e-infrastructures
     L. Candela, D. Castelli, P. Pagano (2013) Virtual Research Environments: An Overview and a Research Agenda. Data Science Journal, Vol. 12
  19. iMarine Technology
     • iMarine is powered by gCube
     https://
  20. iMarine Technology
     • iMarine is powered by gCube
     https://
  21. iMarine Technology
     • iMarine is powered by gCube
     https://
  22. iMarine e-infrastructure
     iMarine is exploiting
     • Geographically Distributed Computing Infrastructure
     • Across administrative boundaries
     • Across private and commercial providers
     • Service Allocations, Deployment, Monitoring, and Operation
     • Uniform resource and data access
     • Operation
     • Built on SLAs
     • Support monitoring, auditing, reporting, and notification
     • Trust
     • Privacy, governance, and attribution
     • Security, trusted network
  23. Landscape
     • D4Science e-Infrastructure
     • gCube Framework
     • gCube Apps
     Discussion
     www.i-   i-
  24. Google Analytics iMarine portal