iMarine exploitation opportunities


iMarine e-Infrastructure and Applications Catalogue

Published in: Technology

iMarine exploitation opportunities

  1. 1. Exploitation opportunities Pasquale Pagano (CNR) iMarine Technical Director
  2. 2. Outline The Infrastructure • Heterogeneous resources as a service • Data Bonanza • Virtual Research Environment • Software platform iMarine Catalogue • StatsCube • GeosCube • BiolCube • ConnectCube I-MARINE EXTENDED BOARD 2
  3. 3. Distinguishing capabilities of the iMarine e-infrastructure and its enabling software THE INFRASTRUCTURE
  4. 4. Concepts • The initiative (the visionary leadership) • The e-infrastructure (the operational platform) • The system (the enabling software system)
  5. 5. e-Infrastructure • Geographically Distributed Computing Infrastructure • Service Allocations, Deployment, Monitoring, and Operation • Across administrative boundaries • Across private and commercial providers • Uniform resource and data access
  6. 6. Infrastructure: key characteristics • Efficient and tailored storage technologies • Computational environments dealing with the volume of the data • Elastic management of the resources, monitoring, alerting, recovery • Collaborative environment to support scientific communities • Rich portfolio of applications to perform access, validation, enriching, processing, sharing, and mash-up of data
  7. 7. Infrastructure: Storage as Service • Secure • Fault-tolerant • Replication • Open source RDBMS • Up to 1 TB data Virtual Workspace Relational Databases 45 TB Currently Used Spatial Database Large and Active data storage • ISO 19115/19139 Metadata • Catalogue • Scalability and high availability • Across sites
  8. 8. Infrastructure: Computing as Service (330 Cores Currently Allocated): Hadoop • MapReduce; Statistical Manager • Analysis/clustering/modeling; R clusters • Windows and Linux
  9. 9. Infrastructure: Management as Service. Operation • Machine-readable SLAs • Machine-readable monitoring, auditing, billing, reporting, and notification • Machine-readable resource/performance capabilities description. Trust • Privacy, governance, and attribution • Security, trusted network
  10. 10. Infrastructure: Collaborative Environment. The Social Portal offers users a familiar view of what is happening on their VREs. A single place to • Get status and updates from applications and other users they are interested in; • Get notifications about messages, job completion, newly generated products, etc.
  11. 11. Infrastructure: Collaborative Environment. The Social Portal offers users a familiar view of what is happening on their VREs. A single place to • Manage all the portal extensions. [Figure: portal screenshot; recoverable labels: Workspace, Messages, Notifications Page, Search in your Workspace, Home, Social]
  12. 12. Infrastructure: Collaborative Environment. The Social Portal offers users a familiar view of what is happening on their VREs. A single place to • Manage data, store and preserve them • Share data
  13. 13. Google Analytics iMarine portal
  14. 14. Data Bonanza [Diagram; recoverable labels: OBIS, WoRMS, …, Data.FAO, WoRDS, Eurostat, GBIF, WOA, MyOcean, CoL, ITIS, NCBI, IRMNG; Validation, Sharing, Enriching, Processing; Private Cloud, Commercial Cloud; iMarine, iMarine Registries]
  15. 15. Data Bonanza • Statistical (SDMX *): FAO CodeLists, IRD CodeLists, FAO Global Aquaculture Production, FAO Global Capture Production, FAO Global Production, Eurostat, … • Biodiversity (DarwinCore / ISO19139): >35 M Observations (OBIS), ≈ 120 K Observed Species (OBIS), ≈ 500 K Taxa (WoRMS), >600 K Scientific Names (ITIS), >12 K Species Distribution Maps (AquaMaps), ≈ 600 Species Extent (FAO), … FishBase, SeaLifeBase, … CoL, GBIF • Geospatial (ISO19139, OGC W*S): > 300 variables – 10 years: Chemical and Physical variables in 2D space (Ice concentration and velocity, Chlorophyll, Oxygen, Nitrate, Phosphate, Phytoplankton as carbon, Salinity, Temperature, …) – On-demand: Chemical and Physical variables in 3D space (Apparent Oxygen Utilization, Dissolved Oxygen, Salinity, Temperature, …)
  16. 16. Not Only Access • Access – Retrieval of geospatial data as space/time-varying phenomena – Direct fine-grained access to feature and feature property level. • Validation – User-defined quality and dissemination level • Enriching – Generation of metadata, exploitation of reference data, linking to environmental datasets • Processing – Analysis and mining exploiting e.g. R, Weka and RapidMiner statistical frameworks • Sharing – User-driven process to decide how other agents (human / machine) can access information
  17. 17. Features Clustering with StatsCube • Presence Points (FishBase + OBIS) • Density-Based Clustering: DBSCAN (with outliers) • Other methods are also available: K-Means, X-Means, …
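
The density-based clustering named on this slide can be illustrated with a minimal, pure-Python DBSCAN sketch. It is not the StatsCube implementation: the function name `dbscan`, the sample presence points, and the `eps` / `min_pts` values are hypothetical, and plain Euclidean distance on (longitude, latitude) pairs is a simplifying assumption that ignores the Earth's curvature.

```python
from math import dist  # Euclidean distance, Python 3.8+

NOISE = -1  # label used for outliers


def dbscan(points, eps, min_pts):
    """Return one cluster label per point; NOISE (-1) marks outliers."""
    labels = [None] * len(points)  # None means "not visited yet"

    def neighbours(i):
        # Indices of all points within eps of point i (including i itself).
        return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        cluster += 1  # point i is a core point: start a new cluster
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: reachable but not core
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbours(j)
            if len(reachable) >= min_pts:  # j is also core: keep expanding
                queue.extend(reachable)
    return labels


# Hypothetical presence points as (longitude, latitude) pairs.
points = [
    (10.0, 40.0), (10.1, 40.1), (10.2, 40.0),   # first dense group
    (25.0, 35.0), (25.1, 35.1), (25.0, 35.2),   # second dense group
    (-30.0, 0.0),                               # isolated observation
]
labels = dbscan(points, eps=0.5, min_pts=3)
print(labels)
```

Unlike K-Means, this approach does not need the number of clusters in advance, and isolated observations are reported as outliers instead of being forced into a cluster.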
  18. 18. Ecological Modeling with BiolCube
  20. 20. Not Only Access, Validation, Enriching, Processing, Sharing • It is always possible to save the discovered data in various standard formats • It is always possible to collaborate with coworkers through a dedicated workspace. • Mash-up data across diversity – Accessing statistical datasets in SDMX, geo-referencing them, describing them in ISO19139, and making them available via OGC W*S standard protocols – Accessing species observation datasets in DwC, analysing their distribution trend via R, and projecting them in geographical space – Accessing species taxonomies in DwCA and publishing them as reference data in SDMX
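
As a rough illustration of the second mash-up (species observations in DwC projected in geographical space), the sketch below turns a small Darwin Core style CSV into GeoJSON point features. It is an assumption-laden example, not the iMarine pipeline: the slides mention R and OGC W*S services, whereas GeoJSON is swapped in here for brevity. The column names are standard Darwin Core terms; the sample records and the function name `dwc_to_geojson` are hypothetical.

```python
import csv
import io
import json

# Hypothetical sample records; the column names are Darwin Core terms.
DWC_CSV = """scientificName,decimalLatitude,decimalLongitude,eventDate
Thunnus albacares,12.5,-45.25,2010-06-01
Thunnus albacares,,,2011-03-15
Xiphias gladius,38.0,15.5,2012-09-20
"""


def dwc_to_geojson(text):
    """Convert Darwin Core CSV text into a GeoJSON FeatureCollection."""
    features = []
    for row in csv.DictReader(io.StringIO(text)):
        try:
            lat = float(row["decimalLatitude"])
            lon = float(row["decimalLongitude"])
        except ValueError:
            continue  # skip records without usable coordinates
        features.append({
            "type": "Feature",
            # GeoJSON orders coordinates as [longitude, latitude].
            "geometry": {"type": "Point", "coordinates": [lon, lat]},
            "properties": {
                "scientificName": row["scientificName"],
                "eventDate": row["eventDate"],
            },
        })
    return {"type": "FeatureCollection", "features": features}


collection = dwc_to_geojson(DWC_CSV)
print(json.dumps(collection, indent=2))
```

The record without coordinates is dropped rather than guessed, which mirrors the validation step that the slides list before enriching and processing.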
  21. 21. Data Bonanza: a common vision. Integrate and harmonize cross-disciplinary data and information across information systems and workflows to support evidence-based decision making. iMarine is implementing this vision through the adoption of Standards, the identification of common Methods and the implementation of Tools which enable integration and harmonization.
  22. 22. Is this enough? • An ecosystem of participatory data e-Infrastructures • Regulated by policies • Enabled by standards • Promoting not only access but mash-up of heterogeneous data. User-centric
  23. 23. User-Centric View. User-centric view of an ecosystem of participatory data e-Infrastructures to • Cope with the overwhelming amount of data and capacities • Promote re-use of data • Encourage sharing of resulting products. User-centric and workflow-oriented
  24. 24. Virtual Research Environment. iMarine is user-centric and workflow-oriented thanks to the gCube VRE technology. A Virtual Research Environment (VRE) is • a distributed and dynamically created environment • where a subset of data, services, computational, and storage resources • regulated by tailored policies • is assigned to a subset of users via interfaces • for a limited timeframe • at little or no cost for the providers of the participatory data e-infrastructures. L. Candela, D. Castelli, P. Pagano (2013) Virtual Research Environments: An Overview and a Research Agenda. Data Science Journal, Vol. 12
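
The VRE definition above can be read as a small data model: a subset of resources, a subset of users, per-resource policies, and a limited timeframe. The sketch below encodes that reading. It is only an illustration, not the gCube API: the class `VRE`, its fields, the `allows` method, and all sample names are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class VRE:
    """Hypothetical model of a Virtual Research Environment."""
    name: str
    resources: set          # subset of data, services, compute, storage
    users: set              # subset of users the VRE is assigned to
    starts: datetime        # beginning of the limited timeframe
    ends: datetime          # end of the limited timeframe
    # Tailored policies: resource name -> set of permitted actions.
    policies: dict = field(default_factory=dict)

    def allows(self, user, resource, action, when):
        """True if `user` may perform `action` on `resource` at time `when`."""
        return (
            user in self.users
            and resource in self.resources
            and self.starts <= when < self.ends
            and action in self.policies.get(resource, set())
        )


vre = VRE(
    name="example-vre",
    resources={"occurrence-db", "r-cluster"},
    users={"alice", "bob"},
    starts=datetime(2013, 1, 1),
    ends=datetime(2013, 7, 1),
    policies={"occurrence-db": {"read"}, "r-cluster": {"read", "execute"}},
)

print(vre.allows("alice", "r-cluster", "execute", datetime(2013, 3, 1)))
print(vre.allows("alice", "occurrence-db", "write", datetime(2013, 3, 1)))
print(vre.allows("alice", "r-cluster", "execute", datetime(2014, 1, 1)))
```

Keeping the timeframe and the policies inside the access check means a VRE expires and restricts access without any change to the underlying resources, which matches the "little or no cost for the providers" point in the definition.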
  25. 25. Flexible Software Platform. Software platform to abstract over differences in location, protocols, and models, scaling no less than the interfaced resources, by keeping failures partial and temporary, reacting to and recovering from a large number of potential issues. Feature-rich: Storage, Discovery, Indexing, Search, Execution, … It turns resources and technologies into a utility by offering single registration, monitoring, and access facilities
  26. 26. Software Platform
  27. 27. iMarine Exploitation models Service Data hosting Infrastructure Unlimited users, Infrastructure support, helpdesk, back-up, security Validation (records) Workspace Hardware Default Processing (<1MB) Social Tool Community Management Storage 1TB Cloud Resources Validation (Datasets) Custom Data Resources Custom Processing (> 1MB) Spatial Data integration User Management Large and Active Storage Unlimited VREs Hour/Day Month Year
  28. 28. Concept map of the products I-MARINE OFFER
  29. 29. Application Bundles • Management and interpretation of biological and ecological data in the environment • Complete full life-cycle data framework, from observational data to aggregated data repositories enriched with validation and analytical tools • Storage and interpretation of geospatial explicit information, including WPS processing • Flexible sharing, storage, reporting, search and retrieval, aggregation and projection facilities. A BUNDLE is a set of services and technologies grouped according to a family of related tasks for achieving a common objective
  30. 30. Discussion time. Thank you for your attention