A Service-Oriented Architecture for Collaborative Workflow Development and Experimentation in the Digital Humanities
2012 Leipzig eHumanities Seminar, 10 October 2012, Leipzig, Germany.
1. A Service-Oriented Architecture
for Collaborative Workflow
Development and
Experimentation
eHumanities Seminar 2012
University of Leipzig
10-10-2012
Clemens Neudecker, KB @cneudecker
Zeki Mustafa Dogan, SUB-DL
Sven Schlarb, ÖNB @SvenSchlarb
Juan Garcés, GCDH @juan_garces
2. Idea
• Provide web-based versions of tools
(web services)
• Package web services, data and
documentation into ready-to-run
“components” (encapsulation)
• Chain the components to create workflows
via drag-and-drop operation
• Share and use workflows to re-run
experiments and to demonstrate results
3. Background
• High diversity in research topics, but
also in the tools and frameworks used
• Technical resources should be easy to
use, well documented, accessible from
anywhere
• Avoid reinventing the wheel
4. Requirements
• Interoperability = connect different resources
• Flexibility = easy to deploy and adapt
• Modularity = allow different combinations of tools
• Usability = simple to use for non-technical users
• Re-usability = easy to share with others
• Scalability = suitable for large-scale processing
• Sustainability = resources simple to preserve
• Transparency = tools evaluated separately
• Distributed development and deployment
5. Interoperability Framework (IIF)
• Modules:
- Java Wrapper for command line tools
- Web Services (incl. format converters)
- Taverna Workflow Engine
- Client interfaces
- Repository connectors
7. IIF Command Line Wrapper
• Java project, built with Maven 2
• Creates a web service project from
a given tool description (XML)
• Web service exposes SOAP and REST
endpoints as well as a Java API
• Requirements: command line call,
no direct user interaction
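
The role of the tool description can be pictured with a small sketch. The XML layout below (the `tool`, `command` and `param` elements and the `${...}` placeholders) is a hypothetical stand-in and not the actual IIF description schema; the Python function `build_command` is likewise only an illustration of how a description plus parameter values could yield a non-interactive command line call.

```python
import shlex
import xml.etree.ElementTree as ET
from string import Template

# Hypothetical tool description; the real IIF schema may look different.
TOOL_XML = """
<tool name="convert-image">
  <command>convert ${input} -resize ${size} ${output}</command>
  <param name="input"/>
  <param name="size" default="50%"/>
  <param name="output"/>
</tool>
"""

def build_command(tool_xml, values):
    """Turn a tool description and parameter values into an argument list."""
    root = ET.fromstring(tool_xml)
    args = {}
    for param in root.findall("param"):
        name = param.get("name")
        if name in values:
            args[name] = values[name]
        elif param.get("default") is not None:
            args[name] = param.get("default")
        else:
            raise ValueError(f"missing value for parameter '{name}'")
    # Substitute token by token so a value containing spaces stays one argument.
    template = root.findtext("command")
    return [Template(token).substitute(args) for token in shlex.split(template)]

argv = build_command(TOOL_XML, {"input": "in.tif", "output": "out.png"})
print(argv)
```

The resulting list could be handed to `subprocess.run` without a shell, which matches the requirement that the tool runs from a single command line call without user interaction.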
12. IIF Web Services
• Web services are described by a WSDL
• Input/output data structures
• Data is referenced by URL
• Annotations
• Default values
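
A minimal sketch of what "data referenced by URL" and "default values" could mean for a client: the input is passed as a link rather than as file content, and parameters the caller leaves out fall back to declared defaults. The parameter names (`inputUrl`, `outputFormat`, `threshold`) and their defaults are assumptions for illustration, not taken from an actual IIF service description.

```python
from urllib.parse import urlparse

# Hypothetical defaults, as a service description might declare them.
DEFAULTS = {"outputFormat": "png", "threshold": 128}

def build_request(input_url, **overrides):
    """Assemble request parameters: data by reference, defaults for the rest."""
    parsed = urlparse(input_url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"not a resolvable URL: {input_url!r}")
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")
    # Later entries win, so explicit overrides replace the defaults.
    return {"inputUrl": input_url, **DEFAULTS, **overrides}

request = build_request("http://example.org/page.tif", threshold=100)
print(request)
```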
15. IIF Workflows
• What is a workflow? (Yahoo Pipes, etc.)
• Different kinds of workflows: for a single
command, application, chain of processes
• Main benefit: Encapsulation, Reuse
• Workflows as “components”: include link
to WS endpoint, sample input data and
documentation = ready-to-use resource
• Web 2.0 workflow registry: myExperiment
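
The "component" idea, a workflow bundled with its endpoint link, sample input and documentation, might be represented roughly as follows. The class and field names and the example.org URLs are hypothetical and only illustrate the packaging; they do not reflect an actual IIF or myExperiment data model.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Component:
    name: str
    endpoint: str        # link to the web service endpoint
    sample_input: str    # URL of sample input data
    documentation: str   # short description or link to documentation

    def is_ready_to_use(self):
        """A component counts as ready only if all three parts are present."""
        return all([self.endpoint, self.sample_input, self.documentation])

ocr = Component(
    name="ocr",
    endpoint="http://example.org/services/ocr?wsdl",
    sample_input="http://example.org/samples/page.tif",
    documentation="Runs OCR on a single page image.",
)
print(ocr.is_ready_to_use())
```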
17. Why workflows?
• “In-silico experimentation”
• Good structuring of experiment setup:
– Challenge/Research question
– Dataset definition
– Processing with algorithms
– Evaluation/Provenance
– Presentation of results
• All this can be modelled into a workflow
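
As a toy illustration of modelling these stages as a workflow, the sketch below chains named steps and records which steps ran, as a very simple form of provenance. The stage functions and the sample data are invented for the example; a real workflow would call web services or tools instead.

```python
def run_workflow(stages, data):
    """Run each named stage on the output of the previous one."""
    provenance = []
    for name, step in stages:
        data = step(data)
        provenance.append(name)  # record the order in which stages ran
    return data, provenance

# Hypothetical stages mirroring the experiment structure above.
stages = [
    ("dataset", lambda texts: [t for t in texts if t.strip()]),
    ("processing", lambda texts: [len(t.split()) for t in texts]),
    ("evaluation", lambda counts: sum(counts) / len(counts)),
    ("presentation", lambda mean: f"mean tokens per text: {mean:.1f}"),
]

result, provenance = run_workflow(stages, ["a b c", "", "d e"])
print(result)      # mean tokens per text: 2.5
print(provenance)  # ['dataset', 'processing', 'evaluation', 'presentation']
```

Because every stage has the same shape (one input, one output), stages can be swapped or re-run individually, which is the property that makes an experiment repeatable.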
18. Integration into Taverna
• Web Services (SOAP and REST)
• Command line tools (SH and SSH)
• Beanshells (can import Java libraries)
• R (statistics)
• Excel, CSV
• Additional service types can be added
through dedicated plug-ins
19. Taverna flavours
• Workbench – local GUI client for Linux,
Windows and OS X
• Command line tool – run workflows from
the command line
• Server – Webapp with REST API and
Java/Ruby client libs
• Web-Wf-Designer – JavaScript version for
designing workflows in a browser
23. Client interfaces
• Web service client: create a simple HTML
form from a given web service description
• Taverna client: create a simple HTML form
from a given Taverna workflow description
• Integration into production and
presentation environments via iframes
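
Generating a form from a description can be sketched in a few lines. Here the "description" is reduced to a list of (name, default value) pairs, which is an assumption made for brevity; the actual clients read a WSDL or a Taverna workflow file. The function name `render_form` and the example URL are hypothetical.

```python
from html import escape

def render_form(action_url, inputs):
    """Build a simple HTML form with one text field per declared input."""
    rows = []
    for name, default in inputs:
        rows.append(
            f'<label>{escape(name)} '
            f'<input name="{escape(name)}" value="{escape(default)}"></label>'
        )
    return (
        f'<form action="{escape(action_url)}" method="post">'
        + "".join(rows)
        + '<input type="submit"></form>'
    )

form = render_form(
    "http://example.org/services/ocr",
    [("inputUrl", ""), ("language", "de")],
)
print(form)
```

A page that serves this form could then be embedded in another site with an `iframe` element.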