IMPACT Final Conference - Clemens Neudecker
 

    Presentation Transcript

    • The IMPACT Interoperability Framework: Workflows for OCR and beyond. Clemens Neudecker, KB National Library of the Netherlands. 2nd IMPACT Conference, British Library, London, 24/25 October 2011
    • Background
      • > 20 individual software components for specific challenges
      • Prototyping new algorithms, improving commercial solutions
      • Different frameworks (C, C++, Java, etc.), platforms (Win/Linux)
      • Extensible with 3rd-party applications
      • → IMPACT Interoperability Framework (IIF)
    • Architecture
      • Java
      • Web Services
      • Apache
      • Taverna
      • Open Source available on https://github.com/impactcentre
      • Free Hackathon 14/15 November, University of Manchester
      • http://impact-mygrid-taverna-hackathon.wikispaces.com/
    • Integration
      • Only requirement: command line executable
      • Generic command line wrapper produces web service
      • Web service exposed as workflow module with documentation
      • Quick & easy integration: developers can focus on their application and have to worry less about integration = higher quality software
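The wrapping idea on this slide can be illustrated with a minimal Python sketch. It is not the IIF's Java wrapper: `wrap_command`, the `{text}` placeholder convention, and the stand-in tool are assumptions made for illustration. The sketch turns a command line template into a request handler, which is the piece an HTTP or SOAP layer would then expose as a web service.

```python
import subprocess
import sys


def fill(arg, params):
    """Substitute whole-argument placeholders such as "{text}" from the request."""
    if arg.startswith("{") and arg.endswith("}"):
        return str(params[arg[1:-1]])
    return arg


def wrap_command(argv_template):
    """Return a handler that runs the command line tool for one request."""
    def handle(params):
        argv = [fill(arg, params) for arg in argv_template]
        # No shell is involved, so parameter values are passed as plain arguments.
        done = subprocess.run(argv, capture_output=True, text=True)
        return {
            "exit_code": done.returncode,
            "stdout": done.stdout,
            "stderr": done.stderr,
        }
    return handle


# Stand-in "tool" (hypothetical): the Python interpreter upper-casing its argument.
shout = wrap_command([sys.executable, "-c",
                      "import sys; print(sys.argv[1].upper())", "{text}"])
result = shout({"text": "ocr"})
```

Because the handler only needs an argument template, any executable that reads its inputs from the command line could be plugged in the same way.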
    • Workflows
      • OCR workflow = data pipeline
      • Building blocks = processing modules (nodes)
      • Integration = interaction between nodes (mashups)
      •  Collaboration with
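A workflow as a data pipeline of processing nodes can be sketched in a few lines of Python. The three nodes below are hypothetical stand-ins operating on plain strings, not IMPACT components; the point is only that each node consumes the previous node's output.

```python
from functools import reduce


def pipeline(*nodes):
    """Compose processing nodes left to right into one callable workflow."""
    def run(data):
        return reduce(lambda value, node: node(value), nodes, data)
    return run


# Hypothetical stand-in nodes.
def normalise_whitespace(text):
    return " ".join(text.split())


def fix_long_s(text):
    # Replace the historical long s with a modern s.
    return text.replace("\u017f", "s")


def tokenise(text):
    return text.split(" ")


workflow = pipeline(normalise_whitespace, fix_long_s, tokenise)
tokens = workflow("  The  Congre\u017fs   met ")
```

Swapping, adding, or removing a node changes the workflow without touching the other nodes, which is what makes the building-block view attractive.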
    • Evaluation features
      • Text comparison of result with ground truth, using the Levenshtein distance
      • Word evaluation (with reading order)
      • Layout-based comparison of result with ground truth, using the PAGE (Page Analysis and Ground-truth Elements) framework
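For readers unfamiliar with the text comparison method named above, here is a minimal Python sketch of the Levenshtein distance and an error rate derived from it. The function names and sample strings are illustrative assumptions, not the evaluation tool used in IMPACT. The same function works on lists of words, which gives a simple word-level comparison for text in a fixed reading order.

```python
def levenshtein(a, b):
    """Edit distance between two sequences (strings or lists of words)."""
    previous = list(range(len(b) + 1))
    for i, item_a in enumerate(a, start=1):
        current = [i]
        for j, item_b in enumerate(b, start=1):
            cost = 0 if item_a == item_b else 1
            current.append(min(previous[j] + 1,          # delete item_a
                               current[j - 1] + 1,       # insert item_b
                               previous[j - 1] + cost))  # substitute or match
        previous = current
    return previous[-1]


def error_rate(result, ground_truth):
    """Edits needed per ground-truth item; 0.0 means a perfect match."""
    if not ground_truth:
        return 0.0 if not result else 1.0
    return levenshtein(result, ground_truth) / len(ground_truth)


ocr_text = "Tbe quick brovvn fox"
truth = "The quick brown fox"
char_errors = levenshtein(ocr_text, truth)
word_error_rate = error_rate(ocr_text.split(), truth.split())
```

In this sample, three character edits separate the OCR result from the ground truth, and two of the four words are wrong.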
    • Community
      • Web 2.0-style workflow registry
      • Ready-to-use and documented resources
      • Community of experts
      • Sharing of experiments and know-how
    • Local client: Taverna Workbench
      • Background: BioSciences
      • Developed and maintained by myGrid, UK
      • Open source
      • GUI for design and execution of web services & workflows
    • Remote client: Portal
      • SOAP/REST API
      • Remote execution of web services & workflows
    • Results Repository
      • Custom service for IMPACT: automatic storage of workflow outputs and provenance via WebDAV
      • Fully interoperable, since HTTP-based
      • Configurable storage of result sets
      • Create reports using POI
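Since WebDAV is plain HTTP with a few extra methods, storing a result amounts to a `MKCOL` request to create a collection and a `PUT` request to upload a file. The Python sketch below only builds those requests; the repository URL, the run identifier, and the one-collection-per-run layout are hypothetical and not taken from the IMPACT service.

```python
import urllib.request

# Hypothetical repository root; not an address from the presentation.
REPOSITORY = "http://repository.example.org/webdav"


def make_collection_request(run_id):
    """WebDAV MKCOL request that creates one collection (folder) per workflow run."""
    return urllib.request.Request(f"{REPOSITORY}/{run_id}/", method="MKCOL")


def make_upload_request(run_id, name, payload, content_type="text/plain"):
    """HTTP PUT request that stores one output or provenance file in that collection."""
    return urllib.request.Request(
        f"{REPOSITORY}/{run_id}/{name}",
        data=payload,
        method="PUT",
        headers={"Content-Type": content_type},
    )


mkcol = make_collection_request("run-0001")
upload = make_upload_request("run-0001", "page1.txt", "recognised text".encode("utf-8"))
# Sending would be urllib.request.urlopen(mkcol) followed by urlopen(upload);
# that step is left out here because the server is hypothetical.
```

Any HTTP client in any language can issue the same two requests, which is the interoperability argument made on the slide.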
    • Scalability
      • Central ESB proxy manages multiple service copies
      • Process parallelization, Load distribution, Fail over, Security
      • Served >2M requests
      • Throughput improvements of 94% with every additional instance
      • Tested on Dutch Supercomputing Cloud (“Enlighten Your Research”)
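Load distribution and failover across several copies of one service can be sketched as a round-robin dispatcher. This is a simplified Python illustration under assumed names, not the ESB configuration used in IMPACT: each request goes to the next copy in turn, and a copy that raises a connection error is skipped for that request.

```python
import itertools


class RoundRobinProxy:
    """Hand each request to the next service copy, skipping copies that fail."""

    def __init__(self, instances):
        self.instances = list(instances)
        self._next = itertools.cycle(range(len(self.instances)))

    def call(self, request):
        errors = []
        # Try every copy at most once per request before giving up.
        for _ in range(len(self.instances)):
            instance = self.instances[next(self._next)]
            try:
                return instance(request)
            except ConnectionError as exc:
                errors.append(exc)
        raise ConnectionError(f"all {len(self.instances)} instances failed: {errors}")


# Hypothetical service copies: two healthy, one permanently down.
def make_instance(name):
    return lambda request: f"{name} handled {request}"


def broken_instance(request):
    raise ConnectionError("instance unreachable")


proxy = RoundRobinProxy([make_instance("ocr-1"), broken_instance, make_instance("ocr-2")])
answers = [proxy.call(f"page-{n}") for n in range(1, 5)]
```

Callers see a single endpoint; requests that land on the broken copy are transparently retried on the next healthy one.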
    • Outlook
      • Online service for testing/evaluation
      • Specification & Guidelines
      • Extending the scope: workflows for linguistic analysis (CLARIN), workflows for preservation (SCAPE)
      • Even better scalability: Map/Reduce
      • Supported by a community of developers & practitioners
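As a reminder of what the Map/Reduce pattern mentioned in the outlook looks like, here is a small single-machine Python sketch. The word-count task and page texts are illustrative assumptions, not a planned IMPACT workflow. The map step handles each page independently, and the reduce step merges the partial results.

```python
from collections import Counter
from functools import reduce


def map_page(page_text):
    """Map step: count word occurrences on a single page."""
    return Counter(page_text.lower().split())


def merge_counts(left, right):
    """Reduce step: merge two partial counts into one."""
    return left + right


pages = ["the old book", "The new book", "old and new"]
partial_counts = map(map_page, pages)  # pages are independent, so this step can run in parallel
totals = reduce(merge_counts, partial_counts, Counter())
```

Because no page depends on another, the map step is the part a cluster framework would distribute across machines.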
    •  
      • “Anyway, the thing about progress is that it always seems greater than it really is.”
          • Ludwig Wittgenstein, Philosophical Investigations (quoting Johann Nestroy)
      xkcd.com/688