BSB Demo Day - Schlarb - Workflow-Design

Transcript

  • 1. Decision-Making in Digitisation through Experimental Workflow Development. Sven Schlarb, Austrian National Library. IMPACT Demo Day, Munich, 11 October 2011. IMPACT is supported by the European Community under the FP7 ICT Work Programme. The project is coordinated by the National Library of the Netherlands.
  • 2. OCR: Challenges … I. Image preprocessing and OCR
  • 3. OCR: Challenges … II. Linguistic post-processing (mixed languages, historical variants, etc.). Example: historical variants of the Dutch word ‘wereld’ (world): werelt weerelt wereld weerelds wereldt werelden weereld werrelts waerelds weerlyt wereldts vveerelts waereld weerelden waerelden weerlt werlt werelds sweerels zwerlys swarels swerelts werelts swerrels weirelts tsweerelds werret vverelt werlts werrelt worreld werlden wareld weirelt weireld waerelt werreld werld vvereld weerelts werlde tswerels werreldts weereldt wereldje waereldje weurlt wald weëled
  • 4. … and a variety of solutions. 22 different ‘tools’ from different developers and from different work packages. Different technical environments: OCR (C++, C#); image processing & lexica (C, C++, DLL); command-line programs (Windows/Linux); Java, Ruby, PHP, Perl, etc. IMPACT Interoperability Framework (IIF)
  • 5. Technical challenges. Scalability: volume of input data (single pages / thousands of books or newspapers); size of input data (e.g. very high-resolution images). Stability: parallelisation (cloned nodes → same behaviour?); failover (alternative nodes in case of errors); correct functioning of the individual components. Transparency: understandable error messages during batch processing at the different architecture levels (tool, service, and workflow level).
  • 6. Experimental workflow development. Sample data available online → reproducibility. Workflows directly executable → comparability. Workflow development as a joint, cross-institutional activity → annotation, rating. “At-a-glance” view of the workflow. Discoverability of components, workflows, and workflow fragments. Central results data store.
  • 7. Interoperability Framework. Interoperability vs. integration. Web-based vs. local application/platform. Java 6; Apache Tomcat; Apache Axis2; Apache Synapse (optional); Taverna Workflow Engine.
  • 8. Tool Wrapper. Requirement: the tool is available as a command-line program. Minimal integration effort for tool developers. Tool wrapper code is available in the GitHub repository of the Open Planets Foundation (OPF): https://github.com/openplanets/scape/tree/master/xa-toolwrapper
  • 9. Service-Oriented Architecture. Java as the programming language. Standard Apache components. Synapse as Enterprise Service Bus (load balancing & failover). HTTPS encryption & Basic Auth. Minimal effort for component deployment.
  • 10. Linking individual components into a “workflow”
  • 11. Workflow development. OCR workflow = data processing pipeline. Components = processing steps (nodes).
  • 12. Workflow components. “Basic” workflow = minimal component for an IMPACT tool. Well documented, sample data available, executable.
  • 13. Workflow management. Component registry: myExperiment. Local client: Taverna Workbench. Web client: project website.
  • 14. Workflow registry. Publish components and workflows. Rate, tag, comment, ... References to the components and workflows of other users that were used.
  • 15. Component catalogue? Many errors occur because the requirements on input and output data are not sufficiently specified (formalised!). [Diagram labels: GetImageFromURL; tool input and output; URL string; RGB image; binary image, bitonal image; “bitonal image but incompatible”.] Questions raised on the slide: How to find the corresponding tool (RGB image → bitonal image)? How to proceed in case of a gap?
  • 16. Local client: Taverna Workbench. Background: bioinformatics. Developed by myGrid, Manchester. Available as open source for Windows/Linux/OS X: http://www.taverna.org.uk/
  • 17. Workflow development in Taverna. Workflows can easily be built from available components and workflows (drag and drop). Note: complexity is limited → group related processing steps into a single component.
  • 18. Web client: Taverna Server / Workflow Parser. SOAP/REST API. Remote workflow execution by submitting the XML instance.
  • 19. Use case: workflows for evaluation. Tool A vs. Tool B (Tool A (v1) vs. Tool A (v2)). Workflow X (Tool A + B) vs. Workflow Y (Tool A + C). Determine the optimal workflow with respect to the source material.
  • 20. Central results data store. Interface for storing result data (WebDAV) and for report generation (Apache POI), implemented as a workflow module.
  • 21. Workflows in ongoing projects. Workflows in digitisation: IMPACT. Workflows in linguistic analysis: CLARIN. Workflows in long-term preservation: SCAPE. And many more ...
  • 22. Compatibility of workflow frameworks. Example: UIMA ↔ Taverna. Named-entity extraction → linguistic analysis → Semantic Web. Digitisation, OCR → long-term preservation.
  • 23. Thank you! Questions?
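
Slide 15 raises the point that errors arise when the input and output requirements of components are not formalised. The following is a minimal sketch of what such a formalisation could look like, not the IMPACT implementation. Only the name GetImageFromURL comes from the slide. The `Component` class, the type labels, and the `Binarise` and `OcrEngine` components are hypothetical and used purely for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Component:
    """A workflow component with formally declared input and output types."""
    name: str
    input_type: str
    output_type: str


def compatible(upstream: Component, downstream: Component) -> bool:
    """Two components can be linked if the output type matches the input type."""
    return upstream.output_type == downstream.input_type


def find_bridge(catalog: Iterable[Component],
                upstream: Component,
                downstream: Component) -> Optional[Component]:
    """Search the catalog for a single component that closes a type gap."""
    for candidate in catalog:
        if compatible(upstream, candidate) and compatible(candidate, downstream):
            return candidate
    return None


# Hypothetical catalog entries; only "GetImageFromURL" appears on the slide.
get_image = Component("GetImageFromURL", "url-string", "rgb-image")
binarise = Component("Binarise", "rgb-image", "bitonal-image")
ocr = Component("OcrEngine", "bitonal-image", "text")
catalog = [get_image, binarise, ocr]

# The RGB output cannot feed a tool that expects a bitonal image directly.
direct_ok = compatible(get_image, ocr)
bridge = find_bridge(catalog, get_image, ocr)

print("direct link possible:", direct_ok)
print("bridging component:", bridge.name if bridge else "none (gap remains)")
```

With declared types, an incompatible link is detected before a batch run starts, and the question "how to find the corresponding tool?" becomes a catalog lookup. If no bridging component exists, `find_bridge` returns `None`, which corresponds to the gap case on the slide.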
