The document introduces the core technical infrastructure specifications of the LoCloud project. The various technical aspects of the infrastructure are presented in detail, focusing on their advantages and limitations. The challenges of implementing this infrastructure are also described throughout this deliverable.
QEBU: AN ADVANCED GRAPHICAL EDITOR FOR THE EBUCORE METADATA SET | Paolo PASIN...FIAT/IFTA
Creation and management of metadata documents can be quite a difficult task to accomplish manually. To address this issue in the context of the EBUCore v1.3 metadata set, we propose a GUI-based metadata editor, QEbu, developed during the Multimedia Archival Techniques course, held at the Polytechnic University of Turin in collaboration with RAI.
The aim is to provide a user-friendly graphical editor to create and manage XML documents, relieving users of the burden of managing the structure of the data and letting them focus on the actual content.
QEbu has been developed in C++ using the cross-platform, open-source Qt 4.8 library; this framework was chosen for its natural fit for developing interface-centered applications.
The past decade has seen increasingly ambitious and successful methods for outsourcing computing. Approaches such as utility computing, on-demand computing, grid computing, software as a service, and cloud computing all seek to free computer applications from the limiting confines of a single computer. Software that thus runs "outside the box" can be more powerful (think Google, TeraGrid), dynamic (think Animoto, caBIG), and collaborative (think Facebook, myExperiment). It can also be cheaper, due to economies of scale in hardware and software. The combination of new functionality and new economics inspires new applications, reduces barriers to entry for application providers, and in general disrupts the computing ecosystem. I discuss the new applications that outside-the-box computing enables, in both business and science, and the hardware and software architectures that make these new applications possible.
A cluster is a type of parallel or distributed computer system consisting of a collection of interconnected stand-alone computers that work together as a single integrated computing resource.
Cloud computing denotes a class of network-based computing that takes place over the Internet. It comprises a collection of integrated and networked hardware, software and Internet infrastructures, which are used to provide various services to users. Distributed computing comprises multiple software components, running on multiple computers, that work together as a single system. Cloud computing can be seen as a form of computing that originated from distributed computing and virtualization.
Open source grid middleware packages – Globus Toolkit (GT4) Architecture, Configuration – Usage of Globus – Main components and Programming model – Introduction to the Hadoop Framework – MapReduce: input splitting, map and reduce functions, specifying input and output parameters, configuring and running a job – Design of the Hadoop file system: HDFS concepts, command line and Java interface, dataflow of file read & file write.
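The MapReduce part of this outline can be illustrated with a minimal, framework-free sketch. Real Hadoop runs map and reduce tasks in parallel over HDFS splits; this shows only the programming model, using word count as the classic example:

```python
from itertools import groupby
from operator import itemgetter

# Word count in the MapReduce programming model: input splitting, a map
# function emitting (key, value) pairs, a shuffle/sort grouping by key,
# and a reduce function aggregating each group. Hadoop would distribute
# these steps over HDFS blocks; here everything runs locally.

def map_fn(line):
    for word in line.split():
        yield word, 1

def reduce_fn(word, counts):
    return word, sum(counts)

def mapreduce(lines):
    # "Input splitting": one record (line) per map call.
    pairs = [kv for line in lines for kv in map_fn(line)]
    pairs.sort(key=itemgetter(0))              # shuffle & sort phase
    return dict(reduce_fn(k, (v for _, v in g))
                for k, g in groupby(pairs, key=itemgetter(0)))

print(mapreduce(["to be or not to be"]))  # {'be': 2, 'not': 1, 'or': 1, 'to': 2}
```

In Hadoop the same two functions would be supplied through the Java API (or Hadoop Streaming), with the framework handling splitting, shuffling and job configuration.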
Introduction to distributed systems
Architecture for Distributed Systems, Goals of Distributed Systems, Hardware and Software concepts, Distributed Computing Model, Advantages & Disadvantages of Distributed Systems, Issues in Designing Distributed Systems.
Cloud computing is Internet-based computing, whereby shared resources, software, and information are provided to computers and other devices on demand, like the electricity grid.
Grid Computing - Collection of computer resources from multiple locations
Grid computing is the collection of computer resources from multiple locations to reach a common goal. The grid can be thought of as a distributed system with non-interactive workloads that involve a large number of files.
This is one of 7 reports provided in work package 3: Micro services for small and medium institutions.
Authors:
Aitor Soroa, Arantxa Otegi, Eneko Agirre, Rodrigo Agerri
Universidad del País Vasco (University of the Basque Country, EHU)
LoCloud - D3.7: Report on services developed for local cultural institutions
The LoCloud project is dedicated to providing services for cultural heritage institutions. These services support the easy ingestion of cultural heritage content into Europeana. In addition, the services help content providers capture metadata and enrich the data in order to increase its potential semantic linkage within the World Wide Web. One of the major aims of LoCloud is to reduce the technical, semantic and skills barriers to providing content; therefore, cloud technology is used and cloud-based services have been implemented to facilitate the aggregation of digital content from small and medium cultural institutions.
This deliverable provides a summary status report of the microservices that have been developed for the LoCloud project. Microservices are small, lightweight software services that each perform a set of narrowly defined functions. The various services can be arranged in independently deployable groups and can communicate with each other via defined interfaces. The distributed architecture makes it easier to deploy new versions of services, and to scale the development. In the LoCloud project the development effort for the various services was organized around multiple development teams realizing five main groups of microservices. Among these services are geolocation enrichment tools, metadata enrichment and vocabulary services, historic place names and Wikimedia crowdsourcing services. These microservices can be used and combined as needed by the content providing institutions. All microservices have been implemented on virtual machines in a cloud testlab (using the OpenNebula cloud computing platform) and all services are also integrated in the LoCloud aggregation platform MORe.
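The microservice pattern described above (a small service exposing one narrowly defined function over a defined interface) can be sketched in a few lines of standard-library Python. The `geolocation` service name and `/enrich` endpoint below are hypothetical illustrations, not LoCloud's actual API:

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import urlopen

class EnrichHandler(BaseHTTPRequestHandler):
    """One narrowly defined function behind a defined HTTP/JSON interface."""

    def do_GET(self):
        if self.path.startswith("/enrich"):
            body = json.dumps({"service": "geolocation", "status": "ok"}).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_response(404)
            self.end_headers()

    def log_message(self, *args):  # silence per-request logging
        pass

def serve_once():
    # Deploy the service independently (here: ephemeral port, background thread).
    server = HTTPServer(("127.0.0.1", 0), EnrichHandler)
    port = server.server_address[1]
    threading.Thread(target=server.serve_forever, daemon=True).start()
    # Another service (or an aggregator) calls the defined interface:
    with urlopen(f"http://127.0.0.1:{port}/enrich") as resp:
        reply = json.load(resp)
    server.shutdown()
    return reply

print(serve_once())  # {'service': 'geolocation', 'status': 'ok'}
```

Because each service only exposes such an interface, new versions can be deployed behind the same endpoint without touching the other services, which is the scaling property the deliverable describes.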
Cloud based Projects at Belfast eScience Centre
A presentation by Terry Harmer of the Belfast eScience Centre at the Repositories and the Cloud meeting organised by Eduserv and JISC in London on Feb 23 2010.
This is one of 7 reports provided in work package 3: Micro services for small and medium institutions.
Authors:
Odo Benda, Gerda Koch and Walter Koch
AIT Forschungsgesellschaft mbH
All software developed through LoCloud has extensive technical API documentation, as well as end-user documentation for applications with an end-user interface, including the ingestion and mapping tool MINT, the metadata repository MORe, the lightweight digital library service LoCloud Collections and the Geocoding Application. Documentation is continuously being made available on the LoCloud support portal, where it is edited in a Wiki.
The LoCloud help-desk is a contact point where LoCloud partners can submit questions and receive answers related to the services and applications that are being modified or created in the project. The help-desk is accessible through the support portal at the URL http://support.locloud.eu or can be entered directly through the URL http://support.locloud.eu/osticket
Practically all technical partners in LoCloud staff the help-desk on a shared basis. PSNC, Athena RC/DCU and AVINET will be responsible for first-line support, i.e. quick response to questions related to the usage of services and applications. The developers or owners of the various software packages will be responsible for responding to questions of a technical nature, such as bug reports, modification requests and similar.
LoCloud - D2.5: Lightweight Digital Library Prototype (LoCloud Collections Se...
This report presents the prototype of the LoCloud Collections system. The main aim of LoCloud Collections (initially named Lightweight Digital Library) is to provide small cultural institutions with the possibility to host their digitized collections (metadata as well as content) easily in the cloud, and to make that data widely available on the Internet, in particular to Europeana.
The developed service prototype is available online at https://locloudhosting.net/ and can be used by anyone to create a new digital library in just a few minutes.
This report accompanies and describes the contents of the D2.4 metadata preparation toolkit that is available at: http://mint-projects.image.ntua.gr/locloud/. This report does not describe MINT's functionality.
The main objective of this report is to guide content providers in learning how to prepare their metadata for efficient publication to Europeana. In the metadata preparation process, providers check whether metadata datasets:
- are well formed (i.e. correctly encoded)
- are in a supported format (e.g. XML, CSV)
- can be harvested using the OAI-PMH protocol
- conform to a known metadata schema (e.g. CARARE, LIDO, EDM)
- contain all mandatory elements
- contain valid links to the digitized cultural heritage objects
In LoCloud we provide a set of tools and services that help providers check and improve the quality, and thus the readiness, of their metadata for publication on the Europeana portal.
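As a plain illustration of the well-formedness and mandatory-element checks in the list above, a few lines of standard-library Python suffice. The element names here are hypothetical placeholders, not an actual Europeana schema; real LoCloud/Europeana validation is schema-specific (e.g. EDM):

```python
import xml.etree.ElementTree as ET

# Check that a metadata record is well-formed XML and contains a set of
# mandatory elements. MANDATORY is illustrative only, not a real schema.
MANDATORY = {"title", "identifier", "link"}

def check_record(xml_text):
    try:
        root = ET.fromstring(xml_text)          # well-formedness check
    except ET.ParseError as err:
        return False, f"not well formed: {err}"
    present = {child.tag for child in root}
    missing = MANDATORY - present
    if missing:
        return False, f"missing mandatory elements: {sorted(missing)}"
    return True, "ok"

record = ("<record><title>Map of Amersfoort</title>"
          "<identifier>obj-001</identifier>"
          "<link>http://example.org/obj-001</link></record>")
print(check_record(record))  # (True, 'ok')
print(check_record("<record><title>untitled</title></record>")[0])  # False
```

Tools such as MINT perform the schema-conformance and link-validation steps against the full target schema; this sketch only shows the shape of the first checks.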
In LoCloud, providers use MINT to ingest metadata into MORe. In this deliverable we focus on the set of tools and services that exist within MINT and can be used by providers to prepare their metadata. These tools are:
- Metadata upload
- Metadata statistics
- Dataset reports
- Mapping, annotation and group edit
- Metadata transformation and validation
Additional information about the metadata preparation process, including screencast videos, can be found at http://support.locloud.eu/.
My presentation for the Annual Award of the Mathematical Institute of the Serbian Academy of Sciences and Arts, in the field of computing, for PhD students.
Cloud computing is facing some serious latency issues due to huge volumes of data that need to be transferred from the place where data is generated to the cloud. For some types of applications, this is not acceptable.
One possible solution to this problem is to bring cloud services closer to the edge of the network, where data originates. This idea is called edge computing, and it is promoted as a bridge that links users and clouds, dramatically reducing network latency; as such, it lays the foundation for future interconnected applications.
Edge computing is a relatively new area of research and still faces many challenges, such as geo-organization, a clear separation of concerns, remote configuration, a well-defined native application model, and limited node capacity. Because of these issues, edge computing is difficult to offer as a service for future real-time user-centric applications.
This thesis presents the dynamic organization of geo-distributed edge nodes into micro data-centers, forming micro-clouds that cover any arbitrary area and expand capacity, availability, and reliability. We take the organization of the cloud as an influence, adapting it to a different environment, with a clear separation of concerns and a native application model that can leverage the newly formed system.
We argue that the presented model can be integrated into existing solutions or used as a base for the development of future systems.
Furthermore, we give a clear separation of concerns for the proposed model. With this separation of concerns, an edge-native application model, and a unified node organization in place, we move towards the idea of edge computing as a service, like any other utility in cloud computing.
The platform architecture developed by the CPaaS.io project - both the overall system architecture as well as the two implementation architectures, one based on FIWARE and the other on u2 - as presented at the first year review meeting in Tokyo on October 5, 2017.
Disclaimer:
This document has been produced in the context of the CPaaS.io project which is jointly funded by the European Commission (grant agreement n° 723076) and NICT from Japan (management number 18302). All information provided in this document is provided "as is" and no guarantee or warranty is given that the information is fit for any particular purpose. The user thereof uses the information at its sole risk and liability. For the avoidance of all doubts, the European Commission and NICT have no liability in respect of this document, which is merely representing the view of the project consortium. This document is subject to change without notice.
Digital Cultural Heritage and the new EU Framework Programme
2nd LoCloud Awareness Event at the Ministry of Education and Culture, Cyprus 5 March 2014. Presentation delivered by Marinos Ioannides, Cyprus University of Technology
Presentation given by Ole Myhre Hansen
National Archives of Norway, LoCloud coordinator
LoCloud Conference
Sharing local cultural heritage online with LoCloud services
Amersfoort, Netherlands
5 February 2016
LoCloud geolocation enrichment tools: On the Map
Presentation given by (Stein) Runar Bergheim
Asplan Viak Internet AS, Norway
LoCloud Conference
Sharing local cultural heritage online with LoCloud services
Amersfoort, Netherlands
5 February 2016
Presentation given by Walter Koch and Gerda Koch AIT- Angewandte Informationstechnik Forschungs-GmbH, Graz, Austria
LoCloud Conference
Sharing local cultural heritage online with LoCloud services
Amersfoort, Netherlands
5 February 2016
Presentation given by Vassilis Tzouvaras
National Technical University of Athens, Greece
LoCloud Conference
Sharing local cultural heritage online with LoCloud services
Amersfoort, Netherlands
5 February 2016
Presentation given by Dr. Dimitris Gavrilis
Digital Curation Unit - IMIS, Athena Research Center
LoCloud Conference
Sharing local cultural heritage online with LoCloud services
Amersfoort, Netherlands
5 February 2016
Bastille, pop band or historical icon? How linked open data helps teachers, students and researchers find the right one. Richard Leeming, BBC/RES, UK
LoCloud Conference
Sharing local cultural heritage online with LoCloud services
Amersfoort, Netherlands
5 February 2016
Beyond the space: the LoCloud Historical Place Names microservice
Presentation given by Rimvydas Laužikas, Justinas Jaronis and Ingrida Vosyliūtė
Vilnius University Faculty of Communication, Lithuania
LoCloud Conference
Sharing local cultural heritage online with LoCloud services
Amersfoort, Netherlands
5 February 2016
LoCloud Collections, or how to make your local heritage available on-line
Presentation given by Marcin Werla
Poznań Supercomputing and Networking Center, Poland
LoCloud Conference
Sharing local cultural heritage online with LoCloud services
Amersfoort, Netherlands
5 February 2016
Presentation on new EC programmes related to the cultural heritage given by Marcel Watelet, European Commission
LoCloud Conference
Sharing local cultural heritage online with LoCloud services
Amersfoort, Netherlands
5 February 2016
Small, smaller and smallest: working with small archaeological content provid...
Presentation given by Holly Wright
Archaeology Data Service University of York, UK
LoCloud Conference
Sharing local cultural heritage online with LoCloud services
Amersfoort, Netherlands
5 February 2016
Spanish collections in LoCloud: a round-trip talk between European institutions
Presentation given by María Carrillo
Ministerio de Educación, Cultura y Deporte (MECD), Spain
LoCloud Conference
Sharing local cultural heritage online with LoCloud services
Amersfoort, Netherlands
5 February 2016
From local to global: Romanian cultural values in Europeana through LoCloud
Presentation given by Sorina Stanca
Cluj County Library, Romania
LoCloud Conference
Sharing local cultural heritage online with LoCloud services
Amersfoort, Netherlands
5 February 2016
Dynamics and partnerships with local associations involved in LoCloud: a case...
Presentation given by Agnès Vatican, Director of the Gironde Archives and
Nathalie Gascoin, LoCloud project manager, in collaboration with Julien Dutertre and James Lemaire
LoCloud Conference
Sharing local cultural heritage online with LoCloud services
Amersfoort, Netherlands
5 February 2016
Increasing Visibility of Cultural Heritage Objects: A Case of Turkish Conten...
Presentation given by Bülent Yılmaz, Özgür Külcü, Tolga Çakmak
Hacettepe University Department of Information Management, Turkey
LoCloud Conference
Sharing local cultural heritage online with LoCloud services
Amersfoort, Netherlands
5 February 2016
A house museum in the cloud: the experience of Fondazione Ranieri di Sorbello...
Presentation given by Giulia Coletti
Fondazione Ranieri di Sorbello
Responsible for digital project
LoCloud Conference
Sharing local cultural heritage online with LoCloud services
Amersfoort, Netherlands
5 February 2016
LoCloud: Enabling local digital heritage in Ireland
Presentation given by Anthony Corns and Louise Kennedy
The Discovery Programme, Ireland
LoCloud Conference
Sharing local cultural heritage online with LoCloud services
Amersfoort, Netherlands
5 February 2016
Presentation given by Jasmina Ninkov, Predrag Djukic
Belgrade City Library
LoCloud Conference
Sharing local cultural heritage online with LoCloud services
Amersfoort, Netherlands
5 February 2016
LoCloud - D6.5 Sustainability and Exploitation Plan
This report considers the sustainability of LoCloud’s outcomes and provides an exploitation plan to inform future activities.
Authors:
Silvia Alfreider (NRA)
Joachim Fugleberg (NRA)
Ole Myhre Hansen (NRA)
Kate Fernie (2Culture Associates)
Contributors: All partners
UiPath Test Automation using UiPath Test Suite series, part 6
Welcome to UiPath Test Automation using UiPath Test Suite series, part 6. In this session, we will cover test automation with generative AI and OpenAI.
The UiPath Test Automation with generative AI and OpenAI webinar offers an in-depth exploration of leveraging cutting-edge technologies for test automation within the UiPath platform. Attendees will delve into the integration of generative AI-based test automation with OpenAI's advanced natural language processing capabilities.
Throughout the session, participants will discover how this synergy empowers testers to automate repetitive tasks, enhance testing accuracy, and expedite the software testing life cycle. Topics covered include the seamless integration process, practical use cases, and the benefits of harnessing AI-driven automation for UiPath testing initiatives. By attending this webinar, testers and automation professionals can gain valuable insights into harnessing the power of AI to optimize their test automation workflows within the UiPath ecosystem, ultimately driving efficiency and quality in software development processes.
What will you get from this session?
1. Insights into integrating generative AI.
2. Understanding how this integration enhances test automation within the UiPath platform
3. Practical demonstrations
4. Exploration of real-world use cases illustrating the benefits of AI-driven test automation for UiPath
Topics covered:
What is generative AI
Test Automation with generative AI and Open AI.
UiPath integration with generative AI
Speaker:
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
DevOps and Testing slides at DASA Connect
Slides by me and Rik Marselis at the DASA Connect conference on 30.5.2024. We discuss what testing is, then what agile testing is, and finally what testing in DevOps is. We also held a lovely workshop with the participants, exploring different ways to think about quality and testing in different parts of the DevOps infinity loop.
Enhancing adoption of Open Source Libraries. A case study on Albumentations.AI
Presented by Vladimir Iglovikov:
- https://www.linkedin.com/in/iglovikov/
- https://x.com/viglovikov
- https://www.instagram.com/ternaus/
This presentation delves into the journey of Albumentations.ai, a highly successful open-source library for data augmentation.
Created out of a necessity for superior performance in Kaggle competitions, Albumentations has grown to become a widely used tool among data scientists and machine learning practitioners.
This case study covers various aspects, including:
People: The contributors and community that have supported Albumentations.
Metrics: The success indicators such as downloads, daily active users, GitHub stars, and financial contributions.
Challenges: The hurdles in monetizing open-source projects and measuring user engagement.
Development Practices: Best practices for creating, maintaining, and scaling open-source libraries, including code hygiene, CI/CD, and fast iteration.
Community Building: Strategies for making adoption easy, iterating quickly, and fostering a vibrant, engaged community.
Marketing: Both online and offline marketing tactics, focusing on real, impactful interactions and collaborations.
Mental Health: Maintaining balance and not feeling pressured by user demands.
Key insights include the importance of automation, making the adoption process seamless, and leveraging offline interactions for marketing. The presentation also emphasizes the need for continuous small improvements and building a friendly, inclusive community that contributes to the project's growth.
Vladimir Iglovikov brings his extensive experience as a Kaggle Grandmaster, ex-Staff ML Engineer at Lyft, sharing valuable lessons and practical advice for anyone looking to enhance the adoption of their open-source projects.
Explore more about Albumentations and join the community at:
GitHub: https://github.com/albumentations-team/albumentations
Website: https://albumentations.ai/
LinkedIn: https://www.linkedin.com/company/100504475
Twitter: https://x.com/albumentations
Epistemic Interaction - tuning interfaces to provide information for AI supportAlan Dix
Paper presented at SYNERGY workshop at AVI 2024, Genoa, Italy. 3rd June 2024
https://alandix.com/academic/papers/synergy2024-epistemic/
As machine learning integrates deeper into human-computer interactions, the concept of epistemic interaction emerges, aiming to refine these interactions to enhance system adaptability. This approach encourages minor, intentional adjustments in user behaviour to enrich the data available for system learning. This paper introduces epistemic interaction within the context of human-system communication, illustrating how deliberate interaction design can improve system understanding and adaptation. Through concrete examples, we demonstrate the potential of epistemic interaction to significantly advance human-computer interaction by leveraging intuitive human communication strategies to inform system design and functionality, offering a novel pathway for enriching user-system engagements.
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!
As the digital landscape continually evolves, operating systems play a critical role in shaping user experiences and productivity. The launch of Nitrux Linux 3.5.0 marks a significant milestone, offering a robust alternative to traditional systems such as Windows 11. This article delves into the essence of Nitrux Linux 3.5.0, exploring its unique features, advantages, and how it stands as a compelling choice for both casual users and tech enthusiasts.
UiPath Test Automation using UiPath Test Suite series, part 5
Welcome to UiPath Test Automation using UiPath Test Suite series, part 5. In this session, we will cover CI/CD with DevOps.
Topics covered:
CI/CD within UiPath
End-to-end overview of the CI/CD pipeline with Azure DevOps
Speaker:
Lyndsey Byblow, Test Suite Sales Engineer @ UiPath, Inc.
Securing your Kubernetes cluster: a step-by-step guide to success!
Today, after several years of existence, an extremely active community and an ultra-dynamic ecosystem, Kubernetes has established itself as the de facto standard in container orchestration. Thanks to a wide range of managed services, it has never been so easy to set up a ready-to-use Kubernetes cluster.
However, this ease of use means that the subject of security in Kubernetes is often left for later, or even neglected. This exposes companies to significant risks.
In this talk, I'll show you step-by-step how to secure your Kubernetes cluster for greater peace of mind and reliability.
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
This paper presents Reef, a system for generating publicly verifiable succinct non-interactive zero-knowledge proofs that a committed document matches or does not match a regular expression. We describe applications such as proving the strength of passwords, the provenance of email despite redactions, the validity of oblivious DNS queries, and the existence of mutations in DNA. Reef supports the Perl Compatible Regular Expression syntax, including wildcards, alternation, ranges, capture groups, Kleene star, negations, and lookarounds. Reef introduces a new type of automata, Skipping Alternating Finite Automata (SAFA), that skips irrelevant parts of a document when producing proofs without undermining soundness, and instantiates SAFA with a lookup argument. Our experimental evaluation confirms that Reef can generate proofs for documents with 32M characters; the proofs are small and cheap to verify (under a second).
Paper: https://eprint.iacr.org/2023/1886
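For intuition, the statement a Reef proof attests is simply whether a (secret) document matches a regular expression. Shown here in the clear, with no zero-knowledge machinery, for the password-strength application mentioned in the abstract; the policy regex is an illustrative assumption, using the lookahead syntax Reef also supports:

```python
import re

# The *predicate* a Reef proof attests: does a secret document match a
# regex? Reef proves this without revealing the document; this snippet
# has no zero-knowledge and only illustrates the statement being proven.
# Policy: at least 8 chars, with a lowercase letter, an uppercase letter,
# and a digit (expressed with lookaheads).
STRONG = re.compile(r"^(?=.*[a-z])(?=.*[A-Z])(?=.*\d).{8,}$")

def matches_policy(secret_password):
    return bool(STRONG.search(secret_password))

print(matches_policy("Tr0ub4dor"))  # True
print(matches_policy("weakpass"))   # False
```

Reef's contribution is proving the outcome of such a match succinctly and in zero knowledge, with its SAFA automata skipping the parts of the document the regex never needs to inspect.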
Dr. Sean Tan, Head of Data Science, Changi Airport Group
Discover how Changi Airport Group (CAG) leverages graph technologies and generative AI to revolutionize their search capabilities. This session delves into the unique search needs of CAG’s diverse passengers and customers, showcasing how graph data structures enhance the accuracy and relevance of AI-generated search results, mitigating the risk of “hallucinations” and improving the overall customer journey.
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...DanBrown980551
Do you want to learn how to model and simulate an electrical network from scratch in under an hour?
Then welcome to this PowSyBl workshop, hosted by Rte, the French Transmission System Operator (TSO)!
During the webinar, you will discover the PowSyBl ecosystem as well as handle and study an electrical network through an interactive Python notebook.
PowSyBl is an open source project hosted by LF Energy, which offers a comprehensive set of features for electrical grid modelling and simulation. Among other advanced features, PowSyBl provides:
- A fully editable and extendable library for grid component modelling;
- Visualization tools to display your network;
- Grid simulation tools, such as power flows, security analyses (with or without remedial actions) and sensitivity analyses;
The framework is mostly written in Java, with a Python binding so that Python developers can access PowSyBl functionalities as well.
What you will learn during the webinar:
- For beginners: discover PowSyBl's functionalities through a quick general presentation and the notebook, without needing any expert coding skills;
- For advanced developers: master the skills to efficiently apply PowSyBl functionalities to your real-world scenarios.
Sudheer Mechineni, Head of Application Frameworks, Standard Chartered Bank
Discover how Standard Chartered Bank harnessed the power of Neo4j to transform complex data access challenges into a dynamic, scalable graph database solution. This keynote will cover their journey from initial adoption to deploying a fully automated, enterprise-grade causal cluster, highlighting key strategies for modelling organisational changes and ensuring robust disaster recovery. Learn how these innovations have not only enhanced Standard Chartered Bank’s data infrastructure but also positioned them as pioneers in the banking sector’s adoption of graph technology.
In the rapidly evolving landscape of technologies, XML continues to play a vital role in structuring, storing, and transporting data across diverse systems. The recent advancements in artificial intelligence (AI) present new methodologies for enhancing XML development workflows, introducing efficiency, automation, and intelligent capabilities. This presentation will outline the scope and perspective of utilizing AI in XML development. The potential benefits and the possible pitfalls will be highlighted, providing a balanced view of the subject.
We will explore the capabilities of AI in understanding XML markup languages and autonomously creating structured XML content. Additionally, we will examine the capacity of AI to enrich plain text with appropriate XML markup. Practical examples and methodological guidelines will be provided to elucidate how AI can be effectively prompted to interpret and generate accurate XML markup.
Further emphasis will be placed on the role of AI in developing XSLT, or schemas such as XSD and Schematron. We will address the techniques and strategies adopted to create prompts for generating code, explaining code, or refactoring the code, and the results achieved.
The discussion will extend to how AI can be used to transform XML content. In particular, the focus will be on the use of AI XPath extension functions in XSLT, Schematron, Schematron Quick Fixes, or for XML content refactoring.
The presentation aims to deliver a comprehensive overview of AI usage in XML development, providing attendees with the necessary knowledge to make informed decisions. Whether you’re at the early stages of adopting AI or considering integrating it in advanced XML development, this presentation will cover all levels of expertise.
By highlighting the potential advantages and challenges of integrating AI with XML development tools and languages, the presentation seeks to inspire thoughtful conversation around the future of XML development. We’ll not only delve into the technical aspects of AI-powered XML development but also discuss practical implications and possible future directions.
Building RAG with self-deployed Milvus vector database and Snowpark Container...Zilliz
This talk will give hands-on advice on building RAG applications with an open-source Milvus database deployed as a docker container. We will also introduce the integration of Milvus with Snowpark Container Services.
Climate Impact of Software Testing at Nordic Testing DaysKari Kakkonen
My slides at Nordic Testing Days 6.6.2024
Climate impact / sustainability of software testing discussed on the talk. ICT and testing must carry their part of global responsibility to help with the climat warming. We can minimize the carbon footprint but we can also have a carbon handprint, a positive impact on the climate. Quality characteristics can be added with sustainability, and then measured continuously. Test environments can be used less, and in smaller scale and on demand. Test techniques can be used in optimizing or minimizing number of tests. Test automation can be used to speed up testing.
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...James Anderson
Effective Application Security in Software Delivery lifecycle using Deployment Firewall and DBOM
The modern software delivery process (or the CI/CD process) includes many tools, distributed teams, open-source code, and cloud platforms. Constant focus on speed to release software to market, along with the traditional slow and manual security checks has caused gaps in continuous security as an important piece in the software supply chain. Today organizations feel more susceptible to external and internal cyber threats due to the vast attack surface in their applications supply chain and the lack of end-to-end governance and risk management.
The software team must secure its software delivery process to avoid vulnerability and security breaches. This needs to be achieved with existing tool chains and without extensive rework of the delivery processes. This talk will present strategies and techniques for providing visibility into the true risk of the existing vulnerabilities, preventing the introduction of security issues in the software, resolving vulnerabilities in production environments quickly, and capturing the deployment bill of materials (DBOM).
Speakers:
Bob Boule
Robert Boule is a technology enthusiast with PASSION for technology and making things work along with a knack for helping others understand how things work. He comes with around 20 years of solution engineering experience in application security, software continuous delivery, and SaaS platforms. He is known for his dynamic presentations in CI/CD and application security integrated in software delivery lifecycle.
Gopinath Rebala
Gopinath Rebala is the CTO of OpsMx, where he has overall responsibility for the machine learning and data processing architectures for Secure Software Delivery. Gopi also has a strong connection with our customers, leading design and architecture for strategic implementations. Gopi is a frequent speaker and well-known leader in continuous delivery and integrating security into software delivery.
In his public lecture, Christian Timmerer provides insights into the fascinating history of video streaming, starting from its humble beginnings before YouTube to the groundbreaking technologies that now dominate platforms like Netflix and ORF ON. Timmerer also presents provocative contributions of his own that have significantly influenced the industry. He concludes by looking at future challenges and invites the audience to join in a discussion.
LoCloud - D2.1: Core Infrastructure Specifications (including Business Process Models)
Local content in a Europeana cloud
D2.1: Core infrastructure
Authors:
Costis Dallas (DCU)
Dimitris Gavrilis (DCU)
Stavros Angelis (DCU)
Dimitra Nefeli Makri (DCU)
Eleni Afiontzi (DCU)
LoCloud is funded by the European Commission’s
ICT Policy Support Programme
Version: Final
Revision History
Revision Date Author Organisation Description
1.0 31/01/2014 Dimitris Gavrilis DCU Initial draft
1.1 11/02/2014 Marcin Werla PSNC Corrections throughout the document
1.2 12/02/2014 Dimitris Gavrilis DCU More details on infrastructure specs
1.3 13/02/2014 Walter Koch AIT Added business process models
Statement of originality:
This deliverable contains original unpublished work except where clearly indicated
otherwise. Acknowledgement of previously published material and of the work of others
has been made through appropriate citation, quotation or both.
Contents
Introduction ...................................................................................................................................................2
1. Infrastructure overview.....................................................................................................................4
2. Technical specifications .....................................................................................................................6
2.1 Infrastructure services....................................................................................................................................6
2.2 Data specifications ............................................................................................................................................7
2.3 Service execution...............................................................................................................................................7
2.4 Core Services description ...............................................................................................................................7
2.4.1 Authentication/authorization...............................................................................................................................7
2.4.2 Storage.............................................................................................................................................................................8
2.4.3 Ingest................................................................................................................................................................................8
2.4.4 Mapping..........................................................................................................................................................................8
2.4.5 Validation......................................................................................................................................9
2.4.6 Retrieval..........................................................................................................................................................................9
2.4.7 Export ..............................................................................................................................................................................9
2.4.8 Logging............................................................................................................................................................................9
2.4.9 Messaging.......................................................................................................................................................................9
2.4.10 Unique identifier.................................................................................................................. 10
2.4.11 Preservation............................................................................................................................ 10
2.4.12 Enrichment .............................................................................................................................................................. 10
3. Workflows............................................................................................................................................ 11
3.1 Using a native repository.............................................................................................................................11
3.2 Using a Lightweight Digital Library..........................................................................................................11
3.3 Through Europeana Local ...........................................................................................................................12
4. Business Process Models................................................................................................................. 13
4.1 Definition - Business Process Model (BPM)..........................................................................................13
4.2 The LoCloud Business Process Model .....................................................................................................13
4.3 Further use of Business Process Modelling in LoCloud ....................................................................15
References.................................................................................................................................................... 16
5. Annex I................................................................................................................................................... 17
5.1 Cloud models and types................................................................................................................................17
5.1.1 Cloud service models.............................................................................................................................................. 17
5.1.2 Cloud Types................................................................................................................................................................ 19
5.2 Cloud technologies .........................................................................................................................................21
5.2.1 Compute services..................................................................................................................................................... 21
5.2.2 Storage services and software............................................................................................................................ 21
5.2.3 Other services............................................................................................................................................................ 23
5.2.4 Frameworks and software for building cloud infrastructures ............................................................. 24
Introduction
The present document introduces the core technical infrastructure specifications of the LoCloud
project. The various technical aspects of the infrastructure are presented in detail, focusing on their
advantages and limitations. The challenges of implementing this infrastructure are also described
throughout this deliverable.
The primary mission of the LoCloud project is to aggregate, enrich and deliver content from a large number
of providers to Europeana. The main differences from other Europeana aggregation projects are:
- the number of providers can scale up
- multiple intermediate schemas are supported (instead of one)
- content is enriched through micro-services
- binary content can be ingested into the infrastructure through the Lightweight Digital Library (instead of just metadata)
The main problems that have to be tackled are:
- handling the increasing complexity
- keeping the cost low
The main architectural diagram of the LoCloud infrastructure is shown in the figure above. The main
components of the system are:
- The Lightweight Digital Library, which enables end users with no technical capabilities to start cataloguing and delivering content.
- The MINT tool, which is responsible for mapping from native schemas to one of the available intermediate schemas.
- The enrichment micro-services, which are responsible for enriching metadata.
- The export services, which are responsible for exporting metadata to Europeana and possibly other providers (e.g. Wikis).
All the above services are orchestrated by a cloud core services layer, which requires an index
database to maintain important information about the operations described. Metadata are stored
in a cluster of storage nodes.
The overall flow of these tasks is shown in the figure above and falls into four main steps:
a) content is first ingested into the infrastructure, b) metadata records are enriched through
one or more of the enrichment micro-services, c) metadata are mapped from one of the supported
intermediate schemas to the target schema (EDM), and finally d) metadata are exported to Europeana
or other interested parties (e.g. Wiki owners, who can get an enriched version of their content back).
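The four-step flow above can be sketched as a minimal pipeline. This is an illustrative sketch only: all function and field names here are assumptions, not the project's actual API.

```python
# Hypothetical sketch of the four-step LoCloud flow:
# ingest -> enrich -> map to EDM -> export.

def ingest(record):
    record = dict(record)
    record["state"] = "ingested"          # step a: content enters the infrastructure
    return record

def enrich(record):
    # step b: optional enrichment through one or more micro-services
    record["enrichments"] = record.get("enrichments", []) + ["place-names"]
    return record

def map_to_edm(record):
    record["schema"] = "EDM"              # step c: intermediate schema -> EDM
    return record

def export(record):
    record["state"] = "exported"          # step d: delivery to Europeana or others
    return record

def run_pipeline(record):
    for step in (ingest, enrich, map_to_edm, export):
        record = step(record)
    return record

result = run_pipeline({"id": "obj-1", "schema": "LIDO"})
```

Each step corresponds to one of the services described in section 2.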
1. Infrastructure overview
Viewing the core technical infrastructure at the top level, one can see three types of interactions:
- With individual users, through the Lightweight Digital Library (or similar out-of-the-box repositories). These users ingest content into the infrastructure.
- With other systems and services (e.g. content providers' systems). These systems ingest content into the infrastructure.
- With Europeana or Wikis, which get content out of the infrastructure: Europeana to integrate new content into the European Digital Library, Wikis to get enriched content back.
These three interactions cover the obvious cases that will be used throughout the project; other
uses and interactions may surface through the services that will be provided.
The three interaction points interface with the infrastructure through an API. This core infrastructure
API is responsible for maintaining data integrity (e.g. organizing data in packages, per provider/user,
etc.), ensuring integrity among versions, and maintaining information related to the tasks assigned to
each object (e.g. enrichment, mapping, etc.).
In order for the infrastructure to become operational, a number of components are required. These
components will have to handle: binary objects, metadata, indexes, state information, assets,
services, etc. Among these components, there are two that are critical in terms of scaling up:
a) the binary object store and
b) the metadata object store
These two components obviously have different requirements, but both need to scale and to tackle
issues such as backups and versioning. Another major requirement is the scalability of worker
nodes. "Worker node" is a generic term for micro-services that operate on data, usually metadata,
typically performing tasks such as indexing, enriching, searching and relating. The main
characteristics of these worker nodes are:
- they operate on different types of metadata (intermediate schemas)
- they operate on either single objects or collections of objects
- they usually operate asynchronously (because of the above)
- they have to scale (e.g. use multiple worker nodes to finish each task)
These two main requirements (storage and worker nodes) need access to a certain set of
information, such as:
- an object registry
- a service registry
- a workflow registry
- a state registry
- a schemas registry
The object registry is required so that a service can access a specific object or a collection of
objects (e.g. those belonging to a provider, collection, etc.). The service registry needs to
distinguish among the different services available, which intermediate schemas each can operate
on, where they reside, etc. The workflow registry needs to hold all possible and available paths
that can be applied to a digital object (e.g. first de-duplicate and then thematically enrich). The
schema registry needs access to the specifics of each intermediate schema.1
1 D1.2 Definition of Metadata Schemas, http://www.locloud.eu/Media/Files/Deliverables/D1.2-Definition-of-Metadata-Schemas
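The five registries can be pictured as simple lookup structures. The sketch below is a toy illustration: all registry contents, keys and helper names are invented for the example.

```python
# Toy sketch of the five registries that storage and worker nodes consult.
# All entries and field names are hypothetical.

object_registry = {"obj-1": {"provider": "prov-A", "collection": "coll-1"}}
service_registry = {"enrich-geo": {"schemas": ["CARARE", "LIDO"],
                                   "endpoint": "http://example.org/geo"}}
workflow_registry = {"default": ["de-duplicate", "thematic-enrich"]}
state_registry = {"obj-1": "ingested"}
schema_registry = {"LIDO": {"version": "1.0"}}

def objects_for_provider(provider):
    """Object registry query: all objects belonging to one provider."""
    return [oid for oid, info in object_registry.items()
            if info["provider"] == provider]

def services_for_schema(schema):
    """Service registry query: services able to operate on a schema."""
    return [name for name, info in service_registry.items()
            if schema in info["schemas"]]
```

A worker node would combine these lookups, e.g. find the objects of a provider, then the services that can handle their schema.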
2. Technical specifications
2.1 Infrastructure services
The proposed technical infrastructure comprises services that are responsible for accomplishing
the various tasks required to achieve the goals of the project. These services are described below.
- Storage service: provides a mechanism for the persistent storage of digital objects.
- Ingest service: responsible for getting metadata into the LoCloud infrastructure.
- Validation service: checks all intermediate schemas for validation errors.
- Retrieval service: responsible for retrieving records or batches of records.
- Indexing service: responsible for indexing parts of the ingested records (in any of the intermediate schemas) so that they can be accessed by the infrastructure.
- Mapping service: responsible for mapping from the intermediate schemas to the output schema (EDM).
- Export service: responsible for exporting data sets through OAI-PMH.
- Messaging service: responsible for managing inter-service message exchange.
- Logging service: responsible for keeping a log for all services.
- Statistics service: responsible for keeping track of various statistics on both the contents of packages and their state information (e.g. how many are ingested/published).
- Enrichment service: responsible for handling enrichment; it communicates with the various enrichment micro-services.
- Preservation service: responsible for maintaining versions and a PREMIS log of all record actions.
- Authentication/authorization service: responsible for authenticating end users and services and providing access to the rest of the services and the infrastructure.
2.2 Data specifications
The services should be implemented in a unified environment and exchange data in specific
formats. These formats, described in a previous deliverable2, consist of the set of
intermediate schemas:
- CARARE
- EAD
- ESE / Dublin Core
- LIDO
Each service that needs to perform a task associated with the context of a record needs to be
context aware: it must know how to access and modify the content of each record independently of
its schema. The infrastructure is, and should remain, schema agnostic.
Each digital object should be available through a single URI.
2.3 Service execution
The core services layer should provide a specification for executing these services, facilitating
the use of web services and linked data. The proposed approach should make use of REST services
and of formats such as XML and JSON for exchanging information and triggering execution. The
service execution environment should allow each service to reside on a different physical machine
or network, thus providing a fully distributed processing environment.
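A REST-style execution request in JSON, as proposed above, could look like the following sketch. The endpoint shape and all field names are assumptions for illustration, not a defined LoCloud API.

```python
# Hedged sketch of a JSON service-execution message for the REST-based
# execution environment. Field names are hypothetical.
import json

def build_execution_request(service, object_id, schema):
    """Serialise a service-execution request as JSON."""
    return json.dumps({
        "service": service,    # which core service to execute
        "object": object_id,   # the unique identifier of the target object
        "schema": schema,      # the intermediate schema of the record
    })

request = build_execution_request("enrichment", "obj-1", "LIDO")
parsed = json.loads(request)
```

The same payload could equally be serialised as XML; JSON is used here only for brevity.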
2.4 Core Services description
The core services of the LoCloud platform include all the services that are tied into the core of the
infrastructure and are required to complete the basic workflows, namely to ingest, transform and
deliver content. These services include the authentication/authorization service, the logging and
messaging services, the unique identifier service, the validation and preservation services, and
finally the ingest, mapping and export services.
2.4.1 Authentication/authorization
The authentication and authorization service has two main responsibilities:
- to authenticate end users (content providers), first against the core infrastructure and then against the individual services;
- to provide an authentication mechanism for all the web services that will comprise the infrastructure. This is necessary due to the distributed nature of these services.
The technologies that can be taken into account are:
2 D1.2 Definition of Metadata Schemas: http://www.locloud.eu/Media/Files/Deliverables/D1.2-Definition-of-Metadata-Schemas
- the OASIS eXtensible Access Control Markup Language (XACML), for defining a standard method of describing authentication and authorization requirements;
- the LDAP protocol, for maintaining user account information through a standard that allows interoperability with heterogeneous technologies;
- the Spring Security framework, which provides a unified mechanism for authenticating using multiple technologies and protocols.
Some of the requirements of this service include authentication against multiple sources,
authentication of distributed services, and the combination and execution of multiple authentication
mechanisms for each asset.
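The service's two duties, authenticating a user against the core and then authorising access to an individual service, can be sketched as two checks. A real deployment would use XACML, LDAP or Spring Security as listed above; everything in this sketch, including the user and grant data, is a stand-in.

```python
# Toy two-level check: authenticate against the core, then authorise
# access to a specific infrastructure service. All data is illustrative.

users = {"provider-1": "secret"}                    # core account store
grants = {"provider-1": {"ingest", "retrieval"}}    # per-service grants

def authenticate(user, password):
    """First level: is this a known user of the core infrastructure?"""
    return users.get(user) == password

def authorise(user, service):
    """Second level: may this user call this individual service?"""
    return service in grants.get(user, set())
```

In a distributed setting the second check would typically travel with the request as a signed token rather than a shared dictionary.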
2.4.2 Storage
The storage service is responsible for providing a persistent and scalable storage infrastructure. The
storage layer is content agnostic and should provide only CRUD operations. The various technologies
that could be examined here are described in Annex I. Important aspects of the storage layer are its
ability to scale and to handle a large number of concurrent threads (possibly by introducing multiple
peer nodes that can be queried). The storage layer should also have a simple structure. For example:
- It should be able to separate the digital object from its constituent parts.
- It should be able to maintain some basic administrative and technical metadata (e.g. mimetype, size, version, creation timestamp, owner information).
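The CRUD contract and the basic admin/tech metadata listed above can be sketched in memory as follows. The class and field names are illustrative; the real storage layer would be one of the cloud technologies in Annex I.

```python
# Minimal in-memory sketch of the content-agnostic CRUD storage layer,
# keeping the basic metadata the text lists. All names are illustrative.
import time

class StorageService:
    def __init__(self):
        self._store = {}

    def create(self, object_id, payload, mimetype, owner):
        self._store[object_id] = {
            "payload": payload,
            "mimetype": mimetype,
            "size": len(payload),
            "version": 1,
            "created": time.time(),   # creation timestamp
            "owner": owner,
        }

    def read(self, object_id):
        return self._store[object_id]

    def update(self, object_id, payload):
        rec = self._store[object_id]
        rec["payload"] = payload
        rec["size"] = len(payload)
        rec["version"] += 1           # each write bumps the version

    def delete(self, object_id):
        del self._store[object_id]

store = StorageService()
store.create("obj-1", b"<lido/>", "application/xml", "prov-A")
store.update("obj-1", b"<lido>enriched</lido>")
```

Note that versioning here is a counter only; the preservation service (section 2.4.11) is what keeps the full history.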
2.4.3 Ingest
The ingest service is responsible for getting data and metadata into the infrastructure. Before a digital
record becomes part of the LoCloud infrastructure, the following steps are required:
- Its provider and collection/package information must be identified.
- Its individual parts must be identified.
- All its constituent parts must be validated.
- It must be assigned a unique identifier.
The ingest service should be able to handle a large number of requests, possibly concurrently.
During the ingest phase, in cases where metadata are not in any of the intermediate schemas, the
MINT tool will be used to allow content providers to manually map their arbitrary schemas into one
of the intermediate ones (see Annex II).
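The four pre-ingest steps above can be sketched as a single function. The record structure and function name are assumptions made for the example.

```python
# Sketch of the four pre-ingest steps: identify provider/collection,
# identify parts, validate parts, assign an identifier. Hypothetical names.
import uuid

def ingest_record(raw):
    # Step 1: provider and collection/package information must be present
    if "provider" not in raw or "collection" not in raw:
        raise ValueError("missing provider or collection information")
    # Step 2: identify the record's individual parts
    parts = raw.get("parts", [])
    # Step 3: every constituent part must pass validation
    for part in parts:
        if not part:
            raise ValueError("empty part fails validation")
    # Step 4: assign a unique identifier (see section 2.4.10)
    raw["id"] = str(uuid.uuid4())
    return raw

rec = ingest_record({"provider": "prov-A", "collection": "coll-1",
                     "parts": ["<lido/>"]})
```

Real validation would call the validation service rather than the inline check shown here.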
2.4.4 Mapping
The mapping service is responsible for metadata transformations. This service provides a basic yet
important functionality, as the content maintained by the infrastructure will be encoded in a variety
of intermediate schemas (currently four) and will need to be exported to EDM. Experience from past
projects shows that the transformation to EDM is not always trivial, as special requirements may
result in different transformations with some logic built into them.
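A toy version of such a mapping, with a hook for the extra per-project logic the text mentions, might look like this. The field names are invented; a real mapping would be an XSLT or similar transformation over the full schema.

```python
# Toy sketch of a field mapping from an intermediate record to EDM,
# with a hook for project-specific logic. Field names are invented.

LIDO_TO_EDM = {"titleSet": "dc:title", "recordID": "dc:identifier"}

def map_record(record, extra_logic=None):
    """Rename mapped fields; optionally apply extra transformation logic."""
    edm = {LIDO_TO_EDM[k]: v for k, v in record.items() if k in LIDO_TO_EDM}
    if extra_logic:
        edm = extra_logic(edm)
    return edm

edm = map_record({"titleSet": "Amphora", "recordID": "obj-1"})
```

The `extra_logic` hook stands in for the "logic built into" transformations that past projects required.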
2.4.5 Validation
The validation service will be responsible for validating content prior to ingestion. Validation
includes a series of integrity checks and will ensure that all content received is well-formed in one
of the intermediate schemas.
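The first of these integrity checks, well-formedness against a known intermediate schema, can be sketched with the standard library. The root-element names below are simplified stand-ins for the real schema checks, which would validate against the full XSDs.

```python
# Sketch of a first-pass validation check: the record must be well-formed
# XML whose root element matches one of the intermediate schemas.
# Root names are simplified stand-ins for full schema validation.
import xml.etree.ElementTree as ET

INTERMEDIATE_ROOTS = {"carare", "ead", "dc", "lido"}

def validate(xml_text):
    try:
        root = ET.fromstring(xml_text)   # rejects malformed XML
    except ET.ParseError:
        return False
    return root.tag.lower() in INTERMEDIATE_ROOTS
```

Full validation would additionally run each record through the XSD of its declared schema.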
2.4.6 Retrieval
The retrieval service will be responsible for retrieving content from the LoCloud infrastructure. It
will perform queries on the content maintained by the infrastructure and will access it through the
storage service. The retrieval service's requirements include hiding the storage service details
(e.g. objects are addressed only by their unique identifiers and last versions) and providing an
abstraction layer over the distributed cloud storage. The retrieval service will provide metadata in
any of the intermediate schemas. The main queries that the retrieval service will perform are:
- retrieve a single object based on its identifier
- retrieve the set of objects belonging to a provider
- retrieve the set of objects belonging to a collection
More complex queries will employ the indexing service.
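The three basic queries can be sketched over a small object registry, with storage details hidden behind the functions. The data and field names are illustrative only.

```python
# Sketch of the retrieval service's three basic queries. The dictionary
# stands in for the object registry; all contents are illustrative.

objects = {
    "obj-1": {"provider": "prov-A", "collection": "coll-1"},
    "obj-2": {"provider": "prov-A", "collection": "coll-2"},
}

def by_identifier(oid):
    """Retrieve a single object by its unique identifier."""
    return objects.get(oid)

def by_provider(provider):
    """Retrieve the set of objects belonging to a provider."""
    return sorted(o for o, m in objects.items() if m["provider"] == provider)

def by_collection(collection):
    """Retrieve the set of objects belonging to a collection."""
    return sorted(o for o, m in objects.items() if m["collection"] == collection)
```

Anything beyond these three lookups would go through the indexing service, as the text notes.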
2.4.7 Export
The export service is responsible for exposing parts of the content of the infrastructure in a
variety of ways. The primary method of export is through an OAI-PMH 2.0 provider. Other means
include XML file exports and REST services (e.g. back to a Wikimedia installation). The main
requirements of the export service are the ability to handle large amounts of content and to
expose content using a variable scope. The scopes that will be used within LoCloud are: a) per
provider, and b) per collection.
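An OAI-PMH 2.0 harvest with a set-based scope, per provider or per collection, amounts to a request like the one sketched below. The base URL and the set-spec naming convention are hypothetical.

```python
# Sketch of an OAI-PMH 2.0 ListRecords request with a set-based scope.
# The base URL and set-spec convention are hypothetical.
from urllib.parse import urlencode

def oai_list_records(base_url, metadata_prefix, set_spec=None):
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec   # e.g. scope by provider or collection
    return base_url + "?" + urlencode(params)

url = oai_list_records("http://example.org/oai", "edm", "provider:prov-A")
```

The harvester (Europeana) would page through the response using OAI-PMH resumption tokens.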
2.4.8 Logging
The logging service will be responsible for maintaining a log of all user and service actions. It
will have to support the distributed nature of the infrastructure services while providing a
centralized point that maintains the logs. Regarding structure requirements, most of the logging
frameworks that will be explored (e.g. Log4j) provide a timestamp, source information (service, IP),
a level and a description. There are also tools and services that allow humans to inspect these logs.
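The log-record structure described, timestamp, source, level and description, can be illustrated with Python's logging module standing in for Log4j. The logger name and handler setup are illustrative.

```python
# Sketch of the log-record structure: timestamp, source, level, message.
# Python's logging stands in for Log4j; the setup is illustrative.
import io
import logging

buffer = io.StringIO()
handler = logging.StreamHandler(buffer)
handler.setFormatter(logging.Formatter(
    "%(asctime)s %(name)s %(levelname)s %(message)s"))
logger = logging.getLogger("locloud.ingest")   # source service as logger name
logger.addHandler(handler)
logger.setLevel(logging.INFO)

logger.info("package coll-1 ingested")
log_line = buffer.getvalue()
```

In the distributed setting each service would ship such records to the centralized log point instead of a local buffer.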
2.4.9 Messaging
The messaging service will be responsible for providing services with messaging functionality.
Messages will be used for inter-service communication (e.g. to trigger the execution of a task or to
indicate its completion). The messaging service's basic requirements are robustness, the capability
to handle multiple queues, and the ability to integrate easily with the services of the
infrastructure. Technologies that will be taken into account are JMS and ActiveMQ.
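The multiple-queue, publish/consume model can be sketched in-process as follows. A real deployment would use JMS or ActiveMQ as mentioned; the class and queue names here are stand-ins.

```python
# In-process sketch of named message queues used for inter-service
# communication. JMS/ActiveMQ would play this role in practice.
import queue

class MessagingService:
    def __init__(self):
        self._queues = {}

    def publish(self, queue_name, message):
        """Put a message on a named queue, creating the queue on first use."""
        self._queues.setdefault(queue_name, queue.Queue()).put(message)

    def consume(self, queue_name):
        """Take the next message off a named queue (non-blocking)."""
        return self._queues[queue_name].get_nowait()

bus = MessagingService()
bus.publish("enrichment.tasks", {"object": "obj-1", "task": "geo"})
msg = bus.consume("enrichment.tasks")
```

One queue per task type (e.g. enrichment triggers, completion notices) keeps the services decoupled.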
2.4.10 Unique identifier
The unique identifier service is responsible for providing unique identifiers to the assets created
within the cloud infrastructure. When a new digital object is created, it is assigned a unique
identifier that stays tied to it for as long as it lives. This identifier provides a single identity
for the object regardless of where it physically resides and regardless of the internal identifier
assigned by the storage service.
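The key property, one stable identity per object regardless of its internal storage identifier, can be sketched with UUIDs. The mapping and function name are assumptions for the example.

```python
# Sketch of the unique-identifier service: a UUID minted once per object
# and reused for its whole lifetime, independent of storage-internal IDs.
import uuid

_assigned = {}

def mint_identifier(internal_storage_id):
    """Assign a stable identifier on first sight, reuse it afterwards."""
    if internal_storage_id not in _assigned:
        _assigned[internal_storage_id] = str(uuid.uuid4())
    return _assigned[internal_storage_id]

first = mint_identifier("node-3/row-42")
second = mint_identifier("node-3/row-42")
```

Because the mapping is keyed on the storage-internal ID, moving the object between nodes only requires updating the mapping, never the public identifier.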
2.4.11 Preservation
The preservation service is responsible for maintaining digital preservation related information for
each digital object and datastream. An audit trail log is maintained at the digital object level
describing the complete lifecycle of the digital object (and its components – datastreams). In the
past, standards such as PREMIS have been used successfully to standardize the description of all
write events (these are actions that modify the content of a digital object). The preservation service
should also provide the audit log for each file.
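An audit trail that records every write event per digital object, in the spirit of PREMIS, can be sketched as an append-only list of events. The field names below are illustrative stand-ins, not actual PREMIS element names.

```python
# Sketch of a PREMIS-style audit trail at the digital-object level:
# every write event is appended with a timestamp. Field names are
# illustrative, not actual PREMIS elements.
import datetime

def record_event(trail, object_id, event_type, agent):
    trail.append({
        "object": object_id,
        "event": event_type,    # an action that modified the object
        "agent": agent,         # the service that performed it
        "when": datetime.datetime.now(datetime.timezone.utc).isoformat(),
    })

trail = []
record_event(trail, "obj-1", "ingestion", "ingest-service")
record_event(trail, "obj-1", "metadata modification", "enrichment-service")
```

Reading the trail in order reconstructs the complete lifecycle of the object and its datastreams.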
2.4.12 Enrichment
The enrichment service will be responsible for providing the infrastructure for metadata
enrichment. It will invoke the various micro-services responsible for metadata enrichment. These
services include:
- Historic place names
- Geo-related information
- Subjects
The enrichment service will be responsible for identifying the parts of the metadata that can be
enriched, transmitting this information to the respective micro-service, and creating the enriched
version based on its response. The enrichment service should be able to "understand" the relevant
parts of any of the intermediate schemas (e.g. how spatial or thematic information is encoded in
each of them).
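The identify-dispatch-merge cycle described above can be sketched as follows. The micro-service name follows the "historic place names" item in the list, but its behaviour and the record fields are invented for the example.

```python
# Sketch of the enrichment service's cycle: pick the enrichable field,
# call the micro-service, merge the response back. All names invented.

def historic_place_names(value):
    """Stand-in micro-service: normalise a place name."""
    return {"place": value, "normalised": value.title()}

MICRO_SERVICES = {"historic-places": historic_place_names}

def enrich(record, service_name, field):
    response = MICRO_SERVICES[service_name](record[field])
    enriched = dict(record)                       # original record untouched
    enriched.setdefault("enrichments", []).append(response)
    return enriched

out = enrich({"id": "obj-1", "spatial": "old town"},
             "historic-places", "spatial")
```

The schema-awareness requirement means the `field` argument would in practice be derived per intermediate schema (e.g. where LIDO vs. CARARE encodes spatial information).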
3. Workflows
In order to better understand how the different components inter-operate with each other, a
number of typical workflow executions are presented.
The main services involved are those of MORE, plus MINT, which can be viewed as a
standalone service within the infrastructure.
The tables in the following sections show some typical examples:
3.1 Using a native repository
Workflow Description
Ingest through existing native repository in
intermediate schema.
1. Metadata are ingested using the ingest
service through OAI-PMH or by XML file
upload.
2. Metadata are validated using the
validation service.
3. Metadata are transformed into EDM
using the mapping service.
4. Metadata are enriched [optional] using
the enrichment service.
5. Metadata are exported to Europeana
using the export API.
Ingest through existing native repository in
native schema.
1. Metadata are ingested using the ingest
service through OAI-PMH or by XML file
upload.
2. Metadata are mapped into one of the
intermediate schemas using MINT.
3. Metadata are validated using the
validation service.
4. Metadata are transformed into EDM
using the mapping service.
5. Metadata are enriched [optional] using
the enrichment service.
6. Metadata are exported to Europeana
using the export API.
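The five steps above can be sketched as a simple function pipeline, with each stage standing in for the corresponding LoCloud service; this is a toy model of the control flow, not the actual service APIs.

```python
# Each function stands in for one LoCloud service in the workflow.
def ingest(xml):   return {"raw": xml}                      # ingest service
def validate(rec): rec["valid"] = True; return rec          # validation service
def to_edm(rec):   rec["schema"] = "EDM"; return rec        # mapping service
def enrich(rec):   rec["enriched"] = True; return rec       # enrichment (optional)
def export(rec):   return {**rec, "target": "Europeana"}    # export API

def run_workflow(xml, with_enrichment=True):
    rec = to_edm(validate(ingest(xml)))
    if with_enrichment:                    # step 4 is optional in the table
        rec = enrich(rec)
    return export(rec)

result = run_workflow("<record/>")
```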
3.2 Using a Lightweight Digital Library
Workflow Description
Ingest through LDL repository in intermediate
schema.
1. Metadata are ingested using the ingest
service through OAI-PMH or by XML file
upload.
2. Metadata are validated using the
validation service.
3. Metadata are transformed into EDM
using the mapping service.
4. Metadata are enriched [optional] using
the enrichment service.
5. Metadata are exported to Europeana
using the export API.
3.3 Through Europeana Local
Workflow Description
Ingest through existing Europeana Local
repository in intermediate schema.
1. Metadata are ingested using the ingest
service through OAI-PMH.
2. Metadata are validated using the
validation service.
3. Metadata are transformed into EDM
using the mapping service.
4. Metadata are enriched [optional] using
the enrichment service.
5. Metadata are exported to Europeana
using the export API.
Ingest through existing Europeana Local
repository in native schema.
1. Metadata are ingested using the ingest
service through OAI-PMH.
2. Metadata are mapped into one of the
intermediate schemas using MINT.
3. Metadata are validated using the
validation service.
4. Metadata are transformed into EDM
using the mapping service.
5. Metadata are enriched [optional] using
the enrichment service.
6. Metadata are exported to Europeana
using the export API.
4. Business Process Models
4.1 Definition - Business Process Model (BPM)
“A sequential representation of all functions associated with a specific business activity.”3
In the LoCloud context, this is a BPM diagram which depicts how local cultural heritage
metadata and content are handled by the “LoCloud Environment”; it shows data entry, data
transformation, aggregation, and export to the “Europeana data aggregation and storage
services”.
The Business Process Model includes both IT processes (“non-human actors”) and people
processes (“human actors”).4
Business Process Modelling is cross-functional, combining the work and documentation of
more than one partner of the project.
The LoCloud Business Process Modelling uses an open source version of the modelling
tool “Signavio”5, applying the BPMN (Business Process Model and Notation) method
developed by OMG6. A quick guide to BPMN is available at:
http://www.bpmnquickguide.com/viewit.html .
For the LoCloud project the developed model is accessible at:
http://test119.ait.co.at:8081/locloud/p/explorer (Folder WP2).
There is a CIDOC Working Group looking into the possibility of applying the BPMN-method
to the SPECTRUM Procedures.7
4.2 The LoCloud Business Process Model
The LoCloud Business Process Model, which describes the “LoCloud Environment”, is divided into
five pools containing the activities for:
Content Provider (human actor),
Aggregator (human actor),
Europeana (human actor),
Applications (Core Processes, non-human actor),
Micro Services (Support Processes, non-human actor).
Please note that the diagram (next page) defines three generic applications (data entry/enrichment
tool, data transformation/enrichment tool, and data provider). These applications relate to the
following LoCloud components (Chapter 2): “Lightweight Digital Library”, “Ingest”, and “Export”.
This diagram is a “reference model” for the LoCloud system, which is being developed within a
three-year time span.
3 http://www.businessdictionary.com/definition/Business-Process-Model.html 2014-02-13
4 http://www.businessballs.com/business-process-modelling.htm 2014-02-13
5 http://www.signavio.com/ 2014-02-13
6 http://www.bpmn.org/ 2014-02-13
7 http://network.icom.museum/cidoc/working-groups/mpi-museum-process-implementation/ 2014-02-13
This model does not contain all the details that will appear in the final system. For example, the
“Support Process Pool” shows only the activity related to the “Vocabulary Micro Service”. This pool
also contains a “Data Access Micro Service”, which might be useful when placing local cultural
heritage or sensitive data into a private cloud while all other processes run in a public
cloud. This feature might not be implemented in the LoCloud project.
The figure below shows the situation on a higher abstraction level.
4.3 Further use of Business Process Modelling in LoCloud
The BPMN method used for describing the overall “LoCloud Environment” will also be applied to
the test environment set up for testing the “Vocabulary Micro Service” (WP3). The “Vocabulary
Micro Service Test System” will be generated using the “Xataface GUI-Builder”8, and the
connection to the Vocabulary Micro Service will be implemented by a “Widget”, which also
provides access to a collaborative platform in case communication with thesaurus experts is
needed when adding a candidate term to a thesaurus. Such a process is described in more detail
in the current ISO standard 25964-1, Chapter 13, “Managing thesaurus construction and
maintenance”9. Implementing this test system using a native “BPMN Execution Engine”, as
offered e.g. by Bonitasoft10, is not foreseen but will be investigated.
8 http://www.xataface.com/ 2014-02-13
9 http://www.iso.org/iso/catalogue_detail.htm?csnumber=53657 2014-02-13
10 http://www.bonitasoft.com/ 2014-02-13
References
Akka 2.0: http://doc.akka.io/docs/akka/2.1.1/Akka.pdf
Alexander Zahariev (2009), Google App Engine, Helsinki University of Technology
Amazon Web Services: http://aws.amazon.com/ (last access 31/01/2014)
Balla Wade Diack, Samba Ndiaye, Yahya Slimani, CAP Theorem between Claims and
Misunderstandings: What is to be Sacrificed?, International Journal of Advanced Science and
Technology, Vol. 56, 2013
Broberg J., Buyya, R., and Goscinski A., Cloud Computing: Principles and Paradigms, Wiley Press,
USA, 2011
CAP Theorem: http://www.slideshare.net/larsga/nosql-databases-the-cap-theorem-and-the-theory-of-relativity (last access 31/01/2014)
Cassandra: http://cassandra.apache.org/
DuraCloud: http://www.duracloud.org/ (last access 31/01/2014)
Eeva Savolainen (2012), Cloud Service Model, University of Helsinki, Department of Computer
Science
ElasticSearch: http://www.elasticsearch.org/ (last access 31/01/2014)
Eucalyptus: http://www.eucalyptus.com (last access 31/01/2014)
GoGrid: http://www.gogrid.com/ (last access 31/01/2014)
Google App Engine: https://developers.google.com/appengine/?csw=1 (last access 31/01/2014)
HBase: http://hbase.apache.org/
Microsoft Azure Platform: http://www.windowsazure.com/en-us/ (last access 31/01/2014)
MongoDB: http://www.mongodb.org/
Open Nebula http://opennebula.org (last access 31/01/2014)
Rajkumar Buyya, Chee Shin Yeo, Srikumar Venugopal (2008), Market-Oriented Cloud Computing:
Vision, Hype, and Reality for Delivering IT Services as Computing Utilities, High Performance
Computing and Communications, HPCC '08, 10th IEEE International Conference
Redis: http://redis.io/
Sanjay Ghemawat, Howard Gobioff, Shun-Tak Leung (2003), The Google File System, SOSP’03,
Bolton Landing, New York, USA.
Types of cloud computing: Private, public and hybrid clouds:
http://blog.appcore.com/blog/bid/167543/Types-of-Cloud-Computing-Private-Public-and-Hybrid-Clouds (last access 31/01/2014)
Yunhee Kang, Kyung-Woo Kang (2013), An Empirical Study of Hadoop Application running on
Private Cloud Environment, Advanced Science and Technology Letters, Vol.35, pp.70-73
Qi Zhang, Lu Cheng, Raouf Boutaba (2010), Cloud computing: state-of-the-art and research
challenges, Journal of Internet Services and Applications, Vol.1, pp.7-18
VMWare vSphere (vCloud): http://www.vmware.com/ (last access 31/01/2014)
Ganeti: https://www.usenix.org/legacy/event/lisa07/tech/trotter_talk.pdf
Lance Albertson, Comparing Open Source Private Cloud (IaaS) Platforms
(http://www.slideshare.net/OReillyOSCON/comparing-open-source-private-cloud-platforms)
5. Annex I
5.1 Cloud models and types
Advances in processing and storage technologies, combined with the success of the Internet,
have led to more powerful, reliable and available computing resources. Cloud computing
leverages virtualization technology to achieve the goal of providing computing resources as a
utility. This section first presents the cloud service models, explaining the characteristics of
each model and the differences among them. The second section, on cloud types, describes the
possible ways in which LoCloud could be operated. It should be noted that the described models
and types do not provide a complete list of all platforms and services; rather, they form an
indicative list which should be considered and can be combined in the LoCloud infrastructure.
5.1.1 Cloud service models
The architecture of a cloud computing environment can be divided into four layers: the hardware
layer, the infrastructure layer, the platform layer and the application layer. Each layer of the
architecture can be implemented as a service to the layer above. Cloud computing services are
divided into four classes, according to the abstraction level of the capability provided and the
service model of the providers, namely: (1) Infrastructure as a Service, (2) Platform as a Service,
(3) Software as a Service and (4) Community as a Service.
Figure 1 defines the layered structure of the cloud stack from physical infrastructure to
applications. These service model levels can also be viewed as a layered architecture where services
of a higher layer can be composed from services of the underlying layer.
Figure 1: Cloud service model [Broberg et al. , 2011]
Infrastructure as a Service (IaaS)
Infrastructure as a Service (IaaS) refers to on-demand provisioning of infrastructural resources as a
shared service. IaaS is the hardware and software that powers the servers, storage, networks and
operating systems.
In the IaaS model, cloud consumers directly use infrastructure components (storage, firewalls,
networks and other computing resources) provided by the cloud provider. Virtualization is widely
used to provision physical resources in an ad-hoc manner to meet the current resource demand
of cloud consumers. Examples of IaaS providers include the Amazon EC2 service and the GoGrid
infrastructure.
Platform as a Service (PaaS)
Platform as a Service (PaaS) refers to providing platform-layer resources, including operating
system support and software development services. PaaS offers an environment where developers
can create and deploy applications, providing a service that supports the complete software
development lifecycle: planning, design, building, deployment, testing and maintenance.
PaaS clouds provide higher-level abstractions for cloud applications, which simplifies the
application development process and removes the need to manage the underlying software and
hardware infrastructure. Examples of PaaS providers include the Google App Engine and the
Microsoft Azure Platform.
Software as a Service (SaaS)
Software as a Service (SaaS) refers to providing on-demand applications over the internet. SaaS is
a software delivery method that provides access to software and its functions remotely as a Web-
based service.
In the SaaS model, a software provider licenses a software application to be purchased and used
on demand. Applications can be accessed over the network from various clients (web browser,
mobile phone, etc.) by application users. The most common pricing model is pay-per-use, in which
a customer pays a fixed price per unit used. Examples of SaaS providers include the Salesforce
platform.
Community as a Service (CaaS)
Community as a Service (CaaS) is sometimes described as the next layer of cloud computing. From
an information perspective, open source technologies are assembled to organise communities on
a web-based platform; from an economic development perspective, CaaS organises wisdom and
effort, with regional and global participation driving export revenue.
In the CaaS model, several organizations from a specific group share the infrastructure. The goal
of a community cloud is to let participating organizations realize the benefits of multi-tenancy
and a pay-as-you-go billing structure with an added level of privacy, security and policy
compliance.
5.1.2 Cloud Types
There are different types of clouds to subscribe to, depending on user needs. The cloud type
plays a significant role in how the LoCloud infrastructure will be operated, as it affects both the
organization and the implementation of the cloud services. Figure 2 depicts the four cloud types.
Figure 2: Cloud types [http://blog.appcore.com/blog/bid/167543/Types-of-Cloud-Computing-
Private-Public-and-Hybrid-Clouds]
Private Cloud
A private cloud is established for a specific group or organization and limits access to just that
group. The private cloud is usually a pool of resources inside a company; however, it may be
managed either by the company or by a third party.
In the case of the Europeana network and under the LoCloud project, an organization/consortium
could be formed that would adopt the principles of the private cloud infrastructure and would own
the hardware hosted in a data center.
Public Cloud
A public cloud can be accessed by any subscriber with an internet connection and access to the
cloud space. The cloud infrastructure is made available to any organization, while both service
providers and companies benefit from economies of scale. With public cloud services, users do not
need to purchase hardware, software or supporting infrastructure, as these are owned and
managed by the providers.
In the case of the Europeana network and under the LoCloud project, the services would be
developed and deployed on the public infrastructure of existing providers. In order to run the
cloud infrastructure, partners do not need any existing storage or computational resources, as
these are provided by the public cloud.
Hybrid Cloud
A hybrid cloud is essentially a combination of at least two clouds, where the clouds included are a
mixture of public, private, or community. A hybrid cloud uses a private cloud foundation combined
with the strategic use of public cloud services. Most companies with private clouds will evolve to
manage workloads across data centers, private clouds and public clouds—thereby creating hybrid
clouds.
In the case of the Europeana network and under the LoCloud project, a hybrid cloud would
combine the advantages of both the public and the private cloud, making it clear which services
run on the public component and which on the private one, and when.
Community Cloud
A community cloud is shared among two or more organizations that have similar cloud
requirements. It is a multi-tenant cloud service model that is shared among several organizations
and that is governed, managed and secured jointly by all the participating organizations or by a
third-party managed service provider.
In the case of the Europeana network and under the LoCloud project, the resources would be
utilized by the partners participating in the project so as to build the shared cloud infrastructure.
In this way, partners could benefit both from their own services and resources and from those of
other partners, some of which provide unique offerings.
5.2 Cloud technologies
In this section the state-of-the-art implementations of cloud computing are presented. First, a
variety of services are described, distinguished into compute services, storage services, and
software and other services. Then, technologies and frameworks for building cloud environments
are presented.
5.2.1 Compute services
Amazon Elastic Compute Cloud (EC2) provides a virtual computing environment that enables a user
to rent virtual servers and deploy on them their own services. The user can either create a new
Amazon Machine Image (AMI) containing the applications, libraries, data and associated
configuration settings, or select from a library of globally available AMIs. The user needs to upload
the created or selected AMIs to Amazon Simple Storage Service (S3) and monitor instances of the
uploaded AMIs. Amazon EC2 charges the user for the time an instance is alive, while Amazon
S3 charges for any data transfer (both upload and download). Amazon EC2 is one of the most
prominent services in the IaaS field; other important players in this field are listed, for
example, at: http://www.clouds360.com/iaas.php.
Google App Engine is a platform which allows users to run and host their web applications on
Google’s infrastructure; applications are easy to build, maintain and scale as traffic and data
storage needs grow. Google App Engine does not require any server to be maintained and no
administrators are needed. The basic idea is that the user simply uploads the application, which
is then ready to serve its customers. Through Google App Engine, the user has the option to limit
access to the application to the members of his organization or to share it with the rest of the
world. The starting package is free of charge and carries no additional obligation. Google App
Engine applications are implemented in the Python programming language. Google App Engine is
able to distribute an application’s web requests across various servers, starting and stopping
servers to meet traffic demand. However, applications uploaded to the engine are not portable to
other platforms. Other PaaS providers are listed at:
http://www.clouds360.com/paas.php.
Windows Azure is a cloud computing platform and infrastructure, created by Microsoft, for
building, deploying and managing applications and services through a global network of Microsoft-
managed data centres. It provides both PaaS and IaaS services and supports many
different programming languages, tools and frameworks, including both Microsoft-specific and
third-party software and systems. Windows Azure presents a .NET-based hosting platform
integrated into a virtual machine abstraction; however, it achieves flexibility through its wide
support for languages other than the .NET framework, such as Java and PHP.
5.2.2 Storage services and software
5.2.2.1 Storage services
Amazon provides three notable public cloud storage services designed for different purposes:
EBS, S3 and Glacier.
Amazon Elastic Block Store (EBS) provides persistent block storage for use with Amazon EC2
instances. Each Amazon EBS volume is automatically replicated to protect you from component
failure, offering high availability and durability, while it offers the consistent and low-latency
performance needed to run your workloads.
Amazon Simple Storage Service (S3) provides a simple web-service interface that can be used to
store and retrieve any amount of data, at any time, from anywhere on the web. The service aims to
maximize benefits of scale and to pass them on to developers. It has a slightly higher latency than EBS
and is intended for storing data of any size.
Amazon Glacier is an extremely low-cost storage service that provides secure and durable storage
for data archiving and backup. In order to keep costs low, Amazon Glacier is optimized for data that
is infrequently accessed and for which retrieval times of several hours are suitable. It is claimed to
be up to 90% cheaper than S3.
DuraCloud is an interesting service in the domain of digital libraries, designed for storage and
preservation and aiming to maximize availability and durability. This means that data are not only
distributed across multiple geographical locations managed by one vendor but also across the
infrastructures of different vendors. DuraCloud uses Amazon S3, Amazon Glacier and Rackspace
as its underlying storage services.
The list of other important cloud storage providers is given here:
http://www.clouds360.com/storage.php.
5.2.2.2 Frameworks for distributed file systems
Apart from the storage services, there are interesting projects providing software for building
distributed file systems on the market.
Google File System (GFS) – is a distributed file system developed to meet the needs of Google.
GFS shares the same goals as previous distributed file systems, such as performance, scalability,
reliability and availability. A Google File System deployment consists of a great number of
storage machines, accessed by many client machines. Files are organized hierarchically in
directories and identified by path names. A GFS cluster consists of a single master and multiple
chunkservers. The single master process maintains the metadata, while the chunkservers store
the data in units called chunks. GFS provides fault tolerance through fast recovery and chunk
replication.
Hadoop Distributed File System (HDFS) – is part of the Apache Hadoop software framework that
has been developed for large scale data analytics across clusters of computers. Hadoop Distributed
File System stores large files across multiple machines. It achieves reliability by replicating the data
across multiple servers. Similarly to Google File System, data is stored on multiple geo-diverse
nodes. The file system is built from a cluster of data nodes, each of which serves blocks of data over
the network using a block protocol specific to HDFS. Data is also provided over HTTP, allowing
access to all content from a web browser or other types of clients. Data nodes can talk to each
other to rebalance data distribution, to move copies around, and to keep the replication of data
high. Hadoop provides a distributed file system and a framework for the analysis and
transformation of very large data sets using the MapReduce paradigm.
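The chunking and replication scheme common to GFS and HDFS can be illustrated with a toy placement function; the chunk size, node names, and round-robin policy below are illustrative assumptions (real systems use large chunks, e.g. 64 MB in GFS, and consider racks and load).

```python
CHUNK_SIZE = 4     # bytes; tiny for illustration only
REPLICATION = 3    # copies of each chunk, as in HDFS's default

def split_into_chunks(data: bytes):
    """Split a file into fixed-size chunks, as chunkservers/datanodes store them."""
    return [data[i:i + CHUNK_SIZE] for i in range(0, len(data), CHUNK_SIZE)]

def place_replicas(num_chunks: int, nodes: list):
    """Master-side metadata: map each chunk index to the nodes holding copies."""
    return {c: [nodes[(c + r) % len(nodes)] for r in range(REPLICATION)]
            for c in range(num_chunks)}

chunks = split_into_chunks(b"hello distributed fs")
placement = place_replicas(len(chunks), ["node-a", "node-b", "node-c", "node-d"])
```

Note that the master holds only this placement metadata; clients then read and write chunk data directly from the storage nodes, which is what lets both systems scale.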
OpenStack (Swift/Cinder) – is a collection of open source components for delivering public and
private clouds. OpenStack uses Python as its development language and is an open source project
for building private cloud computing environments. It is composed of components such as Nova,
Swift, Cinder, KeyStone, etc. For instance, Nova supports the virtualization of computing resources
and manages virtual machine instances. The Swift project, OpenStack’s object storage component,
provides a storage service similar to Amazon’s S3. As an IaaS (Infrastructure as a Service) tool,
OpenStack provisions computing resources dynamically, offering virtualized computing, storage
and network resources as a service.
5.2.3 Other services
Apart from computing and storage services, there are many other cloud services including
searching in the cloud and distributed databases that are useful and necessary in order to deploy a
cloud infrastructure.
5.2.3.1 Searching in the cloud
In terms of search in the cloud, one interesting offering is hosted ElasticSearch. ElasticSearch is
a flexible and powerful open source, distributed, real-time search and analytics engine, giving you
the ability to move easily beyond simple full-text search. Hosted ElasticSearch is offered by:
Bonsai (https://addons.heroku.com/bonsai),
QBox (https://qbox.io/),
Found (http://www.found.no/),
as well as the related Amazon CloudSearch (http://aws.amazon.com/cloudsearch/pricing/) service.
Considering the above options, it can be concluded that search as a service is rather expensive,
even though providers often guarantee that the indexes are stored in main memory (RAM), which
in the case of search is the key to optimum performance. However, the biggest issue with the
current search-as-a-service offerings is that many of the available solutions do not provide
satisfactory index storage. Amazon CloudSearch seems to be a possibly cheaper option suitable
also for large indexes. CloudSearch costs are calculated based on data out (retrieved items),
while data in is free. This makes CloudSearch potentially attractive for large indexes with a
relatively low number of retrieved items, a profile that fits cultural heritage institutions.
5.2.3.2 Distributed Databases
One of the basic principles of distributed databases is described by the CAP theorem. The
CAP theorem (CAP standing for Consistency, Availability and Partition tolerance) states that it is
impossible for a distributed system to simultaneously provide all three of the following guarantees:
Consistency, which means that all nodes always give the same answer
Availability, which means that nodes always answer queries and accept updates
Partition tolerance, which means that the system continues working even if one or more
nodes go quiet.
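The trade-off can be made concrete with a toy two-replica simulation: during a partition, a replica must either answer with possibly stale data (sacrificing consistency) or refuse to answer (sacrificing availability). The class and mode names below are illustrative.

```python
class Replica:
    def __init__(self):
        self.data = {}
        self.partitioned = False

def write(primary, secondary, key, value):
    primary.data[key] = value
    if not secondary.partitioned:      # replication only works when connected
        secondary.data[key] = value

def read(replica, key, mode):
    if replica.partitioned and mode == "CP":
        raise RuntimeError("unavailable: cannot guarantee consistency")
    return replica.data.get(key)       # "AP": answers, but may be stale

a, b = Replica(), Replica()
write(a, b, "k", 1)
b.partitioned = True                   # network partition occurs
write(a, b, "k", 2)                    # the update cannot reach replica b

assert read(b, "k", "AP") == 1         # available, but stale (no consistency)
try:
    read(b, "k", "CP")                 # consistent systems refuse to answer
except RuntimeError:
    pass                               # ...so availability is lost instead
```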
There is a great variety of distributed databases, each one of which has different features. For
example, some distributed databases, like key-value stores, offer the highest levels of scalability in
terms of data size but are less suitable for expressing complex data while others, like graph
databases, can express more complex data relationships, but are less scalable.
[http://en.wikipedia.org/wiki/NoSQL#Classification_based_on_feature]. Some examples of
distributed databases are the following:
Redis - is an open source, BSD-licensed, advanced key-value store. The user can run atomic
operations, like appending to a string, incrementing the value in a hash, pushing to a list,
computing set intersection, union and difference, or getting the member with the highest ranking
in a sorted set. Redis works with an in-memory dataset and supports trivial-to-set-up master-slave
replication, with very fast non-blocking first synchronization, auto-reconnection on net split and
so forth. Moreover, there is a useful hosted implementation, Redis Cloud, which is a fully-managed
cloud service for hosting and running a Redis dataset in a highly available and scalable manner,
with predictable and stable top performance.
Cassandra - is an open source distributed database management system designed to handle large
amounts of data across many commodity servers, providing high availability with no single point of
failure. One of the most significant features of Cassandra is its linear scalability and proven
fault tolerance on commodity hardware. Cassandra also supports replication across multiple data
centres, providing lower latency for users.
Apache HBase - is an open source, non-relational, distributed database system. It is the Hadoop
database and is used to provide real-time read and write access to the user’s data. Some of the
features of HBase are linear and modular scalability, consistent reads and writes, and support for
exporting metrics via the Hadoop metrics subsystem.
MongoDB – is an open-source document database system. Classified as a NoSQL database, MongoDB
abandons the traditional relational table structure in favor of JSON-like documents with dynamic
schemas, making the integration of data in certain types of applications easier and faster.
Interesting features of MongoDB are its support for full indexing, replication and high
availability, flexible aggregation, and data processing through map/reduce.
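The dynamic-schema document model can be illustrated in pure Python; this mimics the idea only and is not the MongoDB driver API, and the document fields are invented for illustration.

```python
import json

# A "collection" whose documents need not share a schema,
# unlike rows in a relational table.
collection = []

def insert(doc: dict) -> None:
    # Round-trip through JSON to keep only JSON-like data, as a document store would.
    collection.append(json.loads(json.dumps(doc)))

insert({"title": "Old map", "year": 1754})
insert({"title": "Photograph", "tags": ["church", "1920s"],
        "geo": {"lat": 45.07, "lon": 7.69}})   # different shape, same collection

def find(predicate):
    """Query by arbitrary predicate; no fixed columns are required."""
    return [d for d in collection if predicate(d)]

hits = find(lambda d: "tags" in d and "church" in d["tags"])
```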
5.2.4 Frameworks and software for building cloud infrastructures
There is a plenitude of tools supporting the development and the deployment of cloud
infrastructures. In the field of frameworks for building cloud environments, the following projects
should certainly be noted:
OpenStack is designed to create freely available code, standards, and common ground for the
benefit of both cloud providers and cloud customers. The goal of OpenStack is to allow
organizations to create and offer cloud computing capabilities using open source software running
on standard hardware. The project comprises compute, storage and image service components.
OpenStack Compute is open source software designed to provision and manage large networks of
virtual machines, creating a redundant and scalable cloud computing platform. OpenStack Storage
is software for creating redundant, scalable object storage using clusters of commodity servers to
store terabytes or even petabytes of data. All of the code for OpenStack is freely available under
the Apache 2.0 license. OpenStack also aims at virtualization portability, allowing users to move
between virtualization technologies, including those hosted in the cloud, and to migrate
seamlessly.
VMWare vSphere (vCloud) Suite is an integrated solution for building and operating a private cloud
based on VMware vSphere that leverages the software-defined data center architecture. vCloud
can integrate virtualized IT services with analytics-based, highly automated operations
management. Moreover, VMware vSphere deploys infrastructure services at a fast pace, with full
command of critical business and IT policies.
Ganeti - is an open source virtualization management platform developed at Google. Ganeti is a
cluster virtual server management software tool built on top of existing virtualization
technologies, and it is used internally at Google for parts of its infrastructure. It is based
primarily on the Xen virtualization platform and makes it simple to manage many nodes and even
more instances at a time. The basic goals of Ganeti are to increase availability, to reduce
hardware costs by using cheap commodity hardware, to increase flexibility, to provide
transparency, and to tolerate hardware faults. It supports different host systems and is
independent of specific hardware. Ganeti’s components are the master daemon, which controls
overall cluster coordination; the node daemon, which controls node functions; the conf daemon,
which provides a fast way to query the configuration; an API daemon, which provides a remote
API; and Htools, the tools responsible for auto-allocation and rebalancing.
Other well-known tools and frameworks in this field include:
Akka, a JVM-based toolkit and runtime that implements the actor model of concurrency;
Eucalyptus, open source software that enables the creation of on-premise private
clouds, with no requirement to retool the organization's existing IT infrastructure or to
introduce specialized hardware;
OpenNebula, an open-source cloud computing toolkit allowing integration with different
storage and network infrastructure configurations and hypervisor technologies.
6. Annex II
6.1 MINT
MINT is a web-based platform designed and developed to facilitate aggregation initiatives for
cultural heritage content and metadata in Europe. It is employed from the first steps of such
workflows, corresponding to the ingestion, mapping and aggregation of metadata records, and
proceeds to implement a variety of remediation approaches for the resulting repository. The
platform offers a user and organization management system that allows the deployment and
operation of different aggregation schemes (thematic or cross-domain; international, national
or regional) and the corresponding access rights. Registered organizations can upload their
metadata records (via HTTP, FTP or OAI-PMH) in XML or CSV serialization in order to manage,
aggregate and publish their collections.
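As a sketch of the OAI-PMH ingestion route, the snippet below builds a `ListRecords` request and extracts record identifiers from a response. The endpoint and sample record identifiers are made up; the verb, parameter names and XML namespace come from the OAI-PMH 2.0 protocol.

```python
# Sketch: harvesting records over OAI-PMH, as an aggregator like MINT might.
# Hypothetical endpoint; verb and namespace per the OAI-PMH 2.0 protocol.
import urllib.parse
import xml.etree.ElementTree as ET

OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"

def list_records_url(base_url: str, metadata_prefix: str = "oai_dc") -> str:
    """Build a ListRecords request against a provider's OAI-PMH endpoint."""
    query = urllib.parse.urlencode(
        {"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return f"{base_url}?{query}"

def record_identifiers(response_xml: str):
    """Pull the record identifiers out of a ListRecords response."""
    root = ET.fromstring(response_xml)
    return [el.text for el in root.iter(f"{OAI_NS}identifier")]

# A trimmed example response, for illustration only:
sample = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record><header><identifier>oai:example.org:rec1</identifier></header></record>
    <record><header><identifier>oai:example.org:rec2</identifier></header></record>
  </ListRecords>
</OAI-PMH>"""
print(record_identifiers(sample))  # ['oai:example.org:rec1', 'oai:example.org:rec2']
```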
A reference metadata model serves as the aggregation schema to which the ingested (standard
or proprietary) schemata are aligned. Users can define their metadata crosswalks with the help
of a visual mappings editor for the XSL language. Mapping is performed with simple drag-and-
drop or input operations, which are then translated to the corresponding code. The mappings
editor visualizes both the input and target XSD, in an intuitive interface that provides access
and navigation of the structure and data of the input schema, and the structure, documentation
and restrictions of the target one. It supports string manipulation functions for input elements
in order to perform 1-n and m-1 (with the option between concatenation and element
repetition) mappings between the two models. Additionally, structural element mappings are
allowed, as well as constant or controlled value (target schema enumerations) assignment,
conditional mappings (with a complex condition editor) and value mappings between input
and target value lists. Mappings can be applied to ingested records, edited, downloaded and
shared as templates between users of the platform.
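The editor ultimately emits XSLT, but the mapping kinds it supports can be sketched directly in code. All field and schema names below are invented for illustration; the point is the shape of a 1-1 mapping with a constant value, an m-1 mapping by concatenation, and a 1-n mapping by element repetition.

```python
# Sketch of a crosswalk from a made-up flat provider record to a made-up
# Dublin-Core-like target, covering the mapping kinds named in the text.
def apply_crosswalk(record: dict) -> dict:
    """Map a hypothetical provider record to a hypothetical target schema."""
    return {
        # 1-1 mapping of a single input element
        "dc:title": record["title"],
        # constant value assigned in the editor, no input element involved
        "dc:type": "Text",
        # m-1 mapping: two input elements concatenated into one target element
        "dc:creator": f'{record["surname"]}, {record["forename"]}',
        # 1-n mapping: one input element repeated into several target elements
        "dc:subject": [s.strip() for s in record["keywords"].split(";")],
    }

src = {"title": "Old map of Turin", "surname": "Rossi",
       "forename": "Maria", "keywords": "maps; Turin"}
print(apply_crosswalk(src)["dc:creator"])  # prints Rossi, Maria
```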
Preview interfaces present to users the steps of the aggregation, including the current input XML
record, the XSLT of their mappings, the transformed record in the target schema, subsequent
transformations from the target schema to other models of interest (e.g. Europeana's metadata
schema), and available HTML renderings of each XML record. Users can transform their selected
collections using complete and validated mappings in order to publish them in the available target
schemas for the required aggregation and remediation steps.
The metadata ingestion workflow that takes place in the LoCloud framework, as illustrated in
Figure, consists of three main steps. The first is the Import of the provider's metadata using
common data delivery protocols, such as OAI-PMH, HTTP and FTP. Next is the Schema Mapping
procedure, during which the imported metadata can be mapped to LIDO, Carare 2.0.1 or EDM,
which serve as the intermediate metadata standards between the provider's metadata and the
Europeana Data Model. A graphical user interface assists content providers in mapping their
metadata to the selected schema, using an underlying machine-understandable mapping
language. Furthermore, it provides useful statistics about the provider's metadata and also
supports the sharing and reuse of metadata crosswalks and the establishment of template
transformations. The third step is the Transformation procedure, during which the provider's
metadata is transformed to the selected schema using the mapping made in the previous step.
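The three steps above can be sketched as a simple pipeline of functions. Everything here is a stub under invented names: the record structure, the `"lido"` schema label and the field handling stand in for the real import, mapping and transformation machinery.

```python
# Sketch of the three-step LoCloud ingestion workflow as a function pipeline.
# Record structures and schema names are illustrative stand-ins only.
def import_metadata(source_records):
    """Step 1: Import, e.g. records already fetched over OAI-PMH/HTTP/FTP."""
    return list(source_records)

def map_to_schema(record, target_schema):
    """Step 2: Schema Mapping, producing a crosswalk description (stub)."""
    return {"target": target_schema, "fields": {"title": record["title"]}}

def transform(record, mapping):
    """Step 3: Transformation, applying the mapping to the record."""
    return {mapping["target"] + ":title": mapping["fields"]["title"]}

records = import_metadata([{"title": "Roman amphora"}])
mapping = map_to_schema(records[0], "lido")
print(transform(records[0], mapping))  # {'lido:title': 'Roman amphora'}
```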