
OSFair2017 Workshop | EGI applications database



Marios Chatziangelou presents the EGI applications database | OSFair2017 Workshop

Workshop overview:
This collaborative workshop comes in the context of coordinating EOSC related activities across large European infrastructures at European and national level. The workshop will offer an opportunity for cross-pollination on issues ranging from open scholarship to technical service provision, training, community engagement and support. OpenAIRE NOADs, EGI NGIs, GEANT NRENs and other national e-Infrastructure representatives will discuss gaps, synergies, coordination and service integration opportunities.


Published in: Science


  1. EGI Applications Database
     Marios Chatziangelou, et al., Institute of Accelerating Systems and Applications (IASA), Athens, Greece
     EGI-Engage is co-funded by the Horizon 2020 Framework Programme of the European Union under grant number 654142
  2. Capabilities
     A community-driven, central EGI service that stores and provides:
     • software solutions (in the form of native software and/or virtual appliances) originating from almost every scientific area/discipline
     • references to scientific datasets (pilot, under development)
     • the programmers and scientists responsible for them
     • the publications derived from the registered items (SW, VA & datasets)
  3. Software Marketplace
     Registry for software items (applications, tools, workflow frameworks and instances, science gateways, MW products)
     Offers release management capabilities:
     • unlimited series of releases
     • lightweight, collaborative release management process
     Acts as a repository for binary artifacts:
     • unlimited number of repositories per registered software item
     • generic tarballs, RPM & DEB (32bit/64bit) binaries
     • multiple flavor / operating system combinations
     • simplified, web-based process for uploading the binary artifacts
     • YUM & APT repositories for automatic distribution
     • artifacts populated through the UMD Community Repository
  4. Reference Datasets
     • In the context of the Life Sciences Data Replication VT, AppDB is being extended into a dataset registry
     • Initial focus is on Life Sciences reference datasets
     • Integration with the Elixir Tools and Data Services Registry is in the works
     • Key characteristics:
         • Primary datasets represent original datasets, as posted by the provider
         • Derived datasets are based on a primary dataset, but only part of the information is kept or only part of the data entries are selected
         • Indicative metadata: name, description, disciplines, homepage link, licensing, and a version list
         • Each dataset version may host one or more locations where data can be accessed
         • Locations may be tagged as master or replica
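The dataset model outlined on the Reference Datasets slide (primary vs. derived datasets, a version list, and locations tagged as master or replica) can be sketched as plain data classes. This is only an illustration under assumptions: the class and field names (`Dataset`, `DatasetVersion`, `Location`, `derived_from`, `role`) are hypothetical and are not taken from the AppDB schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Location:
    """A place where one dataset version can be accessed (hypothetical model)."""
    url: str
    role: str = "replica"  # the slide mentions two tags: "master" or "replica"

    def __post_init__(self) -> None:
        if self.role not in ("master", "replica"):
            raise ValueError(f"unknown location role: {self.role!r}")


@dataclass
class DatasetVersion:
    """One entry of a dataset's version list; it may host several locations."""
    label: str
    locations: List[Location] = field(default_factory=list)

    def masters(self) -> List[Location]:
        return [loc for loc in self.locations if loc.role == "master"]


@dataclass
class Dataset:
    """Indicative metadata from the slide; field names are assumptions."""
    name: str
    description: str = ""
    disciplines: List[str] = field(default_factory=list)
    homepage: str = ""
    licensing: str = ""
    versions: List[DatasetVersion] = field(default_factory=list)
    derived_from: Optional["Dataset"] = None  # None means a primary dataset

    @property
    def is_primary(self) -> bool:
        return self.derived_from is None


# Usage: a primary dataset and a derived subset that points back to it.
primary = Dataset(
    name="example-reference-set",
    disciplines=["Life Sciences"],
    versions=[DatasetVersion("v1", [
        Location("https://data.example.org/v1", role="master"),
        Location("https://mirror.example.org/v1"),
    ])],
)
subset = Dataset(name="example-subset", derived_from=primary)
```

Keeping the primary/derived distinction as a single optional back-reference is one possible design choice; a real registry may record it differently.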
  5. Cloud Marketplace
     • Registry for virtual appliances (VA): a logical container of versioned image file & metadata bundles
     • Registry for software appliances (SA): a logical container of VAs & contextualization script bundles
     • VA distribution medium: distributes endorsed VAs to the resource providers/sites
     • Resource providers catalogue: list of the VAs available at each site/resource provider
     • Virtual Organizations (VO) catalogue: list of the VAs available to each VO member
  6. The AppDB VMops dashboard
     The objective (EGI-Engage DoW): “The EGI Applications Database (AppDB) will evolve from its current role as catalogue of applications and virtual machines images (VMI) to include a graphical user interface allowing authorized users to perform VM management operations”
     Highlighted features for the end-user:
     • Create a new topology with one (or more) VMs
     • Attach additional storage to the VM instances
     • Deploy/Un-deploy a topology
     • Start/Stop a topology (= all the VM instances of a topology)
     • Start/Stop a single VM instance
     • Configure VM (cloud-init & ansible)
     • Execute a bash script at deployment time
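The topology operations listed for the VMops dashboard (create, deploy/un-deploy, start/stop a whole topology or a single VM) can be pictured as a small state model. The sketch below is not the dashboard's implementation: the `Topology` and `VM` classes are hypothetical, and the rule that deploying also starts every VM is an assumption made to keep the example short.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class VM:
    name: str
    running: bool = False


@dataclass
class Topology:
    """Hypothetical in-memory model of a topology with one or more VMs."""
    name: str
    vms: Dict[str, VM] = field(default_factory=dict)
    deployed: bool = False

    def add_vm(self, name: str) -> None:
        if self.deployed:
            raise RuntimeError("cannot add a VM to a deployed topology")
        self.vms[name] = VM(name)

    def deploy(self) -> None:
        if not self.vms:
            raise ValueError("a topology needs at least one VM")
        self.deployed = True
        self.start()  # assumption: deployment starts all VM instances

    def undeploy(self) -> None:
        self.stop()
        self.deployed = False

    def start(self, vm_name: Optional[str] = None) -> None:
        self._set_running(True, vm_name)

    def stop(self, vm_name: Optional[str] = None) -> None:
        self._set_running(False, vm_name)

    def _set_running(self, running: bool, vm_name: Optional[str]) -> None:
        # With no VM name the operation applies to every VM of the topology.
        if not self.deployed:
            raise RuntimeError("topology is not deployed")
        targets = [self.vms[vm_name]] if vm_name else list(self.vms.values())
        for vm in targets:
            vm.running = running


# Usage: two VMs, deploy, then stop only one of them.
topo = Topology("demo")
topo.add_vm("web")
topo.add_vm("db")
topo.deploy()
topo.stop("db")
```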
  7. Person profiles
     • Personal details
     • Access group rights
     • Contact details and communication mechanisms
     • Publications
     • Affiliated organizations
     • Linked projects
     Personal activity:
     • list of registered software
     • list of registered Virtual & Software Appliances
     • list of registered Datasets
     • …
  8. General features (1/2)
     Dissemination of information:
     • custom RSS/Atom news feeds
     • news e-mail subscription lists
     • user-focused communication (messaging, requests, etc.)
     • special dissemination tool for sending ad-hoc messages to scientists
     • 'follow' button for receiving all the activity related to a registered item
     • dissemination features customizable through user preferences
     • sharing content with social networks
     Information retrieval:
     • advanced searching mechanism (rated search results)
     • 'faceted search' mechanism for refinements
     Quality of information:
     • content tagging, rating, commenting per registered item
     • contact expertise information
     • problem and comment abuse report
     • centrally managed quality control
     Taxonomy:
     • technical classification
     • scientific classification
     • tagging
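The 'faceted search' refinement mentioned under information retrieval generally works by counting how many results fall under each value of a facet and then narrowing the result set to the values the user selects. A minimal sketch of that general technique follows; the item fields (`discipline`, `category`) and function names are made up for illustration and do not describe how AppDB implements its search.

```python
from collections import Counter
from typing import Dict, Iterable, List

Item = Dict[str, str]


def facet_counts(items: Iterable[Item], facet: str) -> Counter:
    """Count how many items carry each value of the given facet."""
    return Counter(item[facet] for item in items if facet in item)


def refine(items: Iterable[Item], **selected: str) -> List[Item]:
    """Keep only the items that match every selected facet value."""
    return [
        item for item in items
        if all(item.get(key) == value for key, value in selected.items())
    ]


# Usage with a few made-up registry entries.
results = [
    {"name": "tool-a", "discipline": "Life Sciences", "category": "software"},
    {"name": "tool-b", "discipline": "Physics", "category": "software"},
    {"name": "image-c", "discipline": "Life Sciences", "category": "virtual appliance"},
]
by_discipline = facet_counts(results, "discipline")
narrowed = refine(results, discipline="Life Sciences", category="software")
```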
  9. General features (2/2)
     AuthN/AuthZ and security:
     • advanced AuthN/AuthZ mechanisms (simpleSAML)
     • integrated with the EGI Checkin service
     • support for multiple accounts for accessing the user’s personal profile
     • internally managed AuthZ, based on allowed actions, roles and permissions
     Relations are possible between all the entities listed below:
     • software
     • virtual appliances
     • datasets
     • persons
     • virtual organizations
     • sites / resource providers
     • organizations
     • projects
     Integration with AppDB:
     • RESTful API that supports operations following a CRUD convention
     • flexible, stateless API authentication mechanism using Personal Access Tokens (no need for X509)
     • or even by adapting the AppDB Gadget (easy: copy & paste, one line of code, no technical skills required, you may get it here)
     AppDB is already integrated with many EGI services:
     • EGI GOCDB: list of sites, their metadata & downtimes
     • Top-BDII: fetching the sites' dynamic information
     • EGI Checkin: AuthN and high-level AuthZ attributes
     • Perun and EGI Operations Portal: VO-related details, incl. membership & roles
     • Argo: retrieving the status of the Cloud-enabled sites
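To make the idea of a CRUD-style REST API with stateless Personal Access Token authentication more concrete, here is a sketch that only builds the HTTP requests and does not send them. Everything specific in it is an assumption: the base URL is a placeholder, the `Authorization: Bearer` header is one common way to carry a token (the slides do not say how AppDB expects it), and the CRUD-to-HTTP-verb mapping is the usual REST convention rather than a documented AppDB rule.

```python
import json
import urllib.request
from typing import Any, Optional

# Placeholder endpoint; the real AppDB API root is not given in the slides.
BASE_URL = "https://appdb.example.org/rest"

# Conventional CRUD-to-HTTP mapping (an assumption, not AppDB documentation).
CRUD_TO_HTTP = {"create": "POST", "read": "GET", "update": "PUT", "delete": "DELETE"}


def build_request(operation: str, resource: str, token: str,
                  payload: Optional[Any] = None) -> urllib.request.Request:
    """Build (but do not send) a token-authenticated request.

    The token travels with every request, so no server-side session and no
    X509 client certificate is needed.
    """
    method = CRUD_TO_HTTP[operation]
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    headers = {"Authorization": f"Bearer {token}", "Accept": "application/json"}
    if data is not None:
        headers["Content-Type"] = "application/json"
    url = f"{BASE_URL}/{resource.lstrip('/')}"
    return urllib.request.Request(url, data=data, headers=headers, method=method)


# Usage: read one (hypothetical) software entry, then update it.
read_req = build_request("read", "software/42", token="my-access-token")
update_req = build_request("update", "software/42", token="my-access-token",
                           payload={"description": "updated text"})
```

Passing either request to `urllib.request.urlopen` would perform the call; that step is left out because the endpoint above is fictitious.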
  10. Indicative Statistics (Cloud Marketplace)
      • 21 Service Providers, incl. CESGA, CESNET-MetaCloud, HG-09-Okeanos-Cloud, FZJ, etc.
      • 36 Virtual Organizations, incl. (LTOS), cms, biomed, etc.
  11. Need for creating relations
      Entities/digital objects available through the service (either hosted or harvested):
      • Software
      • Datasets
      • Topologies
      • Virtual & Software Appliances
      • Virtual Machines
      • Researchers
      • Resource Providers (from the EGI GOCDB)
      • Virtual Organizations (from the EGI Ops Portal & Perun)
      • Publications (derived from the registered items)
      Globally defined entities/digital objects to create relations with (OpenAIRE):
      • Projects
      • Organizations
      • Publications
      • Contact profiles
      • Research Data
      • … etc.
  12. Integration with OpenAIRE (1/3)
      1. Developed a dedicated (not publicly accessible) service for:
         • periodically consuming the required data over the OpenAIRE OAI-PMH interface
         • controlling the process (big data volume + complexity)
         • mapping the OpenAIRE data to the AppDB data
      2. Made the necessary enhancements to our databases for storing the fetched data/records as well as the produced relations
      3. Extend our user interface so that:
         • the end-user can select/pick from a list of projects/organizations, thus avoiding manual data entry
         • the system can make ‘suggestions’ to the end-user based on the pre-existing relations (contact-projects & contact-organizations), as extracted from the OpenAIRE data
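Consuming data over an OAI-PMH interface, as the harvesting service on this slide does, typically means issuing `ListRecords` requests and following the `resumptionToken` the server returns until it comes back empty. The sketch below shows that paging loop using only the standard OAI-PMH protocol elements. It is not the AppDB harvester: the `fetch` callable stands in for the HTTP call, the function name is hypothetical, and `oai_dc` is used only because it is the metadata format every OAI-PMH repository must support (the format AppDB actually requests from OpenAIRE is not stated).

```python
import xml.etree.ElementTree as ET
from typing import Callable, Dict, Iterator

OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def harvest_identifiers(fetch: Callable[[Dict[str, str]], str],
                        metadata_prefix: str = "oai_dc") -> Iterator[str]:
    """Yield record identifiers page by page.

    `fetch` takes the OAI-PMH query parameters and returns the response XML
    as a string; in a real harvester it would perform the HTTP request.
    """
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    while True:
        root = ET.fromstring(fetch(params))
        for header in root.iterfind(".//oai:record/oai:header", OAI_NS):
            identifier = header.findtext("oai:identifier", namespaces=OAI_NS)
            if identifier:
                yield identifier
        token = root.findtext(".//oai:resumptionToken", namespaces=OAI_NS)
        if not token or not token.strip():
            break  # a missing or empty token marks the last page
        # A resumptionToken is an exclusive argument: only the verb goes with it.
        params = {"verb": "ListRecords", "resumptionToken": token.strip()}


# Usage with two canned pages instead of a live endpoint.
def _page(identifier: str, token: str) -> str:
    return (
        '<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/"><ListRecords>'
        f"<record><header><identifier>{identifier}</identifier></header></record>"
        f"<resumptionToken>{token}</resumptionToken>"
        "</ListRecords></OAI-PMH>"
    )


requests_seen = []


def fake_fetch(params: Dict[str, str]) -> str:
    requests_seen.append(dict(params))
    if "resumptionToken" in params:
        return _page("oai:example:2", "")
    return _page("oai:example:1", "page-2")


identifiers = list(harvest_identifiers(fake_fetch))
```

Because the generator yields as it goes, a long harvest can store records incrementally instead of holding the full result in memory.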
  13. Integration with OpenAIRE (2/3)
      • OpenAIRE Harvester UI: Projects ~33k, Organizations ~28k, Persons ~18k, Publications 0
      • Profile-Project relation
      • Creating a Profile-Org relation
      • VA-Project/Org relation
  14. Integration with OpenAIRE (3/3)
      Summary: AppDB acts as a consumer of the OpenAIRE repository, fetching data on Projects, Organizations and Contact persons.
      Next steps:
      • Consume, store and utilize data related to publications. Considerations: very large data volume and complexity to overcome.
      • Stabilize the process and periodicity of data harvesting. Considerations: again, data volume (a single fetch takes more than a day).
      • Act as a repository (producer), populating enriched datasets back to OpenAIRE. Considerations: the necessary mechanisms need to be developed.
  15. Thank you for your attention. Questions?