SlideShare a Scribd company logo
2nd Succeed Hackathon 
10-11 April 2014, University of Alicante 
Succeed is supported by the European Union under FP7-ICT and coordinated by Universidad de Alicante.
2nd Hackathon, 10-11 April, Alicante 
Background 
 Libraries, archives, museums are digitising large quantities of (mainly historic) 
documents like books, newspapers, journals. 
 To make these digital documents searchable, images must first be converted to 
electronic text with help of OCR (optical character recognition) 
 Off the shelf OCR software is not suitable for processing historical documents – 
problems e.g. with old fonts, historic language variation, quality of paper originals 
 2009-2012: EU project IMPACT (IMProving ACcess to Text) 
(www.impact-project.eu) 
 2011: Official launch of the IMPACT Centre of Competence 
(www.digitisation.eu) 
 2013-2014: EU project SUCCEED 
(www.succeed-project.eu) 
Succeed is supported by the European Union under FP7-ICT and coordinated by Universidad de Alicante.
2nd Hackathon, 10-11 April, Alicante 
Tools, tools, tools 
“I have this great tool, it does XYZ… 
…if you call it with the right parameters… 
…and run it in this environment… 
…with these minor tweaks…” 
Succeed maintains an extensive list of tools for digitisation: 
http://succeed-project.eu/publications/available-tools/index-succeed 
Succeed is supported by the European Union under FP7-ICT and coordinated by Universidad de Alicante.
2nd Hackathon, 10-11 April, Alicante 
Interoperability 
VS 
VS 
Succeed is supported by the European Union under FP7-ICT and coordinated by Universidad de Alicante.
Succeed 2nd hackathon
2nd Hackathon, 10-11 April, Alicante 
Bintray/Maven integration 
All software binaries can be directly downloaded from 
https://bintray.com/impactocr/maven 
You can also integrate them into your Maven project by adding this to your pom.xml: 
<repositories> 
<repository> 
<id>impactocr</id> 
<url>http://dl.bintray.com/impactocr/maven/</url> 
</repository> 
</repositories> 
<dependency> 
<groupId>eu.impact_project.iif.ws</groupId> 
<artifactId>generic-soap-client</artifactId> 
<version>0.7.0</version> 
</dependency> 
Succeed is supported by the European Union under FP7-ICT and coordinated by Universidad de Alicante.
2nd Hackathon, 10-11 April, Alicante 
More and more... 
OCR evaluation tool 
(https://github.com/impactcentre/ocrevalUAtion) 
= compare ground truth with OCR result 
PAGE generator 
(https://github.com/psnc-dl/page-generator) 
= generate training data for Tesseract OCR from PAGE xml 
Franken+ 
(https://github.com/idhmc-tamu/FrankenPlus) 
= create new training sets for Tesseract OCR 
Format converter 
(https://github.com/subugoe/format-converter) 
Convert between different text formats, e.g. ALTO, TEI, FRXML 
Succeed is supported by the European Union under FP7-ICT and coordinated by Universidad de Alicante.
2nd Hackathon, 10-11 April, Alicante 
Last but not least... 
Be creative and have fun coding and 
experimenting! 
Succeed is supported by the European Union under FP7-ICT and coordinated by Universidad de Alicante.

More Related Content

Similar to Succeed 2nd hackathon

Accessible project newsletter 5
Accessible project newsletter 5Accessible project newsletter 5
Accessible project newsletter 5
AEGIS-ACCESSIBLE Projects
 
AEGIS SP4 story - building an accessible mobile application
AEGIS SP4 story - building an accessible mobile applicationAEGIS SP4 story - building an accessible mobile application
AEGIS SP4 story - building an accessible mobile application
AEGIS-ACCESSIBLE Projects
 
Workflow Development for OCR (and beyond)
Workflow Development for OCR (and beyond)Workflow Development for OCR (and beyond)
Workflow Development for OCR (and beyond)
cneudecker
 
LoCloud Annual Publishable Summary 2014-15
LoCloud Annual Publishable Summary 2014-15LoCloud Annual Publishable Summary 2014-15
LoCloud Annual Publishable Summary 2014-15
locloud
 
ECLAP White paper, social network for Cultural Heritage on Peforming arts
ECLAP White paper, social network for Cultural Heritage on Peforming artsECLAP White paper, social network for Cultural Heritage on Peforming arts
ECLAP White paper, social network for Cultural Heritage on Peforming arts
Paolo Nesi
 
AEGIS SP3 story - building an accessible web application
AEGIS SP3 story - building an accessible web applicationAEGIS SP3 story - building an accessible web application
AEGIS SP3 story - building an accessible web application
AEGIS-ACCESSIBLE Projects
 
IMPACT Demo Dag at KB
IMPACT Demo Dag at KBIMPACT Demo Dag at KB
IMPACT Demo Dag at KB
cneudecker
 
2. Interoperability framework and Taverna. Enrique Molla, Succeed Project.
2. Interoperability framework and Taverna. Enrique Molla, Succeed Project. 2. Interoperability framework and Taverna. Enrique Molla, Succeed Project.
2. Interoperability framework and Taverna. Enrique Molla, Succeed Project.
IMPACT Centre of Competence
 
Summer school bz_fp7research_20100708
Summer school bz_fp7research_20100708Summer school bz_fp7research_20100708
Summer school bz_fp7research_20100708
Sandro D'Elia
 
VERITAS newsletter n° 4
VERITAS newsletter n° 4VERITAS newsletter n° 4
VERITAS newsletter n° 4
AEGIS-ACCESSIBLE Projects
 
MICO — Towards Contextual Media Analysis
MICO — Towards Contextual Media AnalysisMICO — Towards Contextual Media Analysis
MICO — Towards Contextual Media Analysis
Thomas Kurz
 
New Goals of PARES: Spanish Archives Web Portal
New Goals of PARES: Spanish Archives Web PortalNew Goals of PARES: Spanish Archives Web Portal
New Goals of PARES: Spanish Archives Web Portal
ICARUS - International Centre for Archival Research
 
An Experimental Workflow Development Platform for Historical Document Digitis...
An Experimental Workflow Development Platform for Historical Document Digitis...An Experimental Workflow Development Platform for Historical Document Digitis...
An Experimental Workflow Development Platform for Historical Document Digitis...
cneudecker
 
CV - Resume
CV - ResumeCV - Resume
CV - Resume
Francesco
 
IMPACT HPC Cloud Day
IMPACT HPC Cloud DayIMPACT HPC Cloud Day
IMPACT HPC Cloud Day
cneudecker
 
The Archives Forum - The National Archives - 02 March 2011
The Archives Forum - The National Archives - 02 March 2011The Archives Forum - The National Archives - 02 March 2011
The Archives Forum - The National Archives - 02 March 2011
David F. Flanders
 
Franco Niccolucci: Example of an EOSCpilot Science Demonstrator - TextCrowd
Franco Niccolucci: Example of an EOSCpilot Science Demonstrator - TextCrowdFranco Niccolucci: Example of an EOSCpilot Science Demonstrator - TextCrowd
Franco Niccolucci: Example of an EOSCpilot Science Demonstrator - TextCrowd
EOSC-hub project
 
CreativiTIC - Presentation Business Brochure!
CreativiTIC - Presentation Business Brochure!CreativiTIC - Presentation Business Brochure!
CreativiTIC - Presentation Business Brochure!
Jorge R. López Benito
 
I1 anna lobovikov katz-elaich-eng
I1 anna lobovikov katz-elaich-engI1 anna lobovikov katz-elaich-eng
I1 anna lobovikov katz-elaich-eng
evaminerva
 
Eclipse DemoCamp Budapest 2016 November: Best of EclipseCon Europe 2016
Eclipse DemoCamp Budapest 2016 November: Best of EclipseCon Europe 2016Eclipse DemoCamp Budapest 2016 November: Best of EclipseCon Europe 2016
Eclipse DemoCamp Budapest 2016 November: Best of EclipseCon Europe 2016
Istvan Rath
 

Similar to Succeed 2nd hackathon (20)

Accessible project newsletter 5
Accessible project newsletter 5Accessible project newsletter 5
Accessible project newsletter 5
 
AEGIS SP4 story - building an accessible mobile application
AEGIS SP4 story - building an accessible mobile applicationAEGIS SP4 story - building an accessible mobile application
AEGIS SP4 story - building an accessible mobile application
 
Workflow Development for OCR (and beyond)
Workflow Development for OCR (and beyond)Workflow Development for OCR (and beyond)
Workflow Development for OCR (and beyond)
 
LoCloud Annual Publishable Summary 2014-15
LoCloud Annual Publishable Summary 2014-15LoCloud Annual Publishable Summary 2014-15
LoCloud Annual Publishable Summary 2014-15
 
ECLAP White paper, social network for Cultural Heritage on Peforming arts
ECLAP White paper, social network for Cultural Heritage on Peforming artsECLAP White paper, social network for Cultural Heritage on Peforming arts
ECLAP White paper, social network for Cultural Heritage on Peforming arts
 
AEGIS SP3 story - building an accessible web application
AEGIS SP3 story - building an accessible web applicationAEGIS SP3 story - building an accessible web application
AEGIS SP3 story - building an accessible web application
 
IMPACT Demo Dag at KB
IMPACT Demo Dag at KBIMPACT Demo Dag at KB
IMPACT Demo Dag at KB
 
2. Interoperability framework and Taverna. Enrique Molla, Succeed Project.
2. Interoperability framework and Taverna. Enrique Molla, Succeed Project. 2. Interoperability framework and Taverna. Enrique Molla, Succeed Project.
2. Interoperability framework and Taverna. Enrique Molla, Succeed Project.
 
Summer school bz_fp7research_20100708
Summer school bz_fp7research_20100708Summer school bz_fp7research_20100708
Summer school bz_fp7research_20100708
 
VERITAS newsletter n° 4
VERITAS newsletter n° 4VERITAS newsletter n° 4
VERITAS newsletter n° 4
 
MICO — Towards Contextual Media Analysis
MICO — Towards Contextual Media AnalysisMICO — Towards Contextual Media Analysis
MICO — Towards Contextual Media Analysis
 
New Goals of PARES: Spanish Archives Web Portal
New Goals of PARES: Spanish Archives Web PortalNew Goals of PARES: Spanish Archives Web Portal
New Goals of PARES: Spanish Archives Web Portal
 
An Experimental Workflow Development Platform for Historical Document Digitis...
An Experimental Workflow Development Platform for Historical Document Digitis...An Experimental Workflow Development Platform for Historical Document Digitis...
An Experimental Workflow Development Platform for Historical Document Digitis...
 
CV - Resume
CV - ResumeCV - Resume
CV - Resume
 
IMPACT HPC Cloud Day
IMPACT HPC Cloud DayIMPACT HPC Cloud Day
IMPACT HPC Cloud Day
 
The Archives Forum - The National Archives - 02 March 2011
The Archives Forum - The National Archives - 02 March 2011The Archives Forum - The National Archives - 02 March 2011
The Archives Forum - The National Archives - 02 March 2011
 
Franco Niccolucci: Example of an EOSCpilot Science Demonstrator - TextCrowd
Franco Niccolucci: Example of an EOSCpilot Science Demonstrator - TextCrowdFranco Niccolucci: Example of an EOSCpilot Science Demonstrator - TextCrowd
Franco Niccolucci: Example of an EOSCpilot Science Demonstrator - TextCrowd
 
CreativiTIC - Presentation Business Brochure!
CreativiTIC - Presentation Business Brochure!CreativiTIC - Presentation Business Brochure!
CreativiTIC - Presentation Business Brochure!
 
I1 anna lobovikov katz-elaich-eng
I1 anna lobovikov katz-elaich-engI1 anna lobovikov katz-elaich-eng
I1 anna lobovikov katz-elaich-eng
 
Eclipse DemoCamp Budapest 2016 November: Best of EclipseCon Europe 2016
Eclipse DemoCamp Budapest 2016 November: Best of EclipseCon Europe 2016Eclipse DemoCamp Budapest 2016 November: Best of EclipseCon Europe 2016
Eclipse DemoCamp Budapest 2016 November: Best of EclipseCon Europe 2016
 

More from cneudecker

EuropeanaTech x AI: Qurator.ai @ Berlin State Library
EuropeanaTech x AI: Qurator.ai @ Berlin State LibraryEuropeanaTech x AI: Qurator.ai @ Berlin State Library
EuropeanaTech x AI: Qurator.ai @ Berlin State Library
cneudecker
 
ALTO, PAGE & Co. Formate für Volltexte
ALTO, PAGE & Co. Formate für VolltexteALTO, PAGE & Co. Formate für Volltexte
ALTO, PAGE & Co. Formate für Volltexte
cneudecker
 
OCR und Strukturerkennung für Zeitungen
OCR und Strukturerkennung für ZeitungenOCR und Strukturerkennung für Zeitungen
OCR und Strukturerkennung für Zeitungen
cneudecker
 
Digitisation and Digital Humanities - what is the role of Libraries?
Digitisation and Digital Humanities - what is the role of Libraries?Digitisation and Digital Humanities - what is the role of Libraries?
Digitisation and Digital Humanities - what is the role of Libraries?
cneudecker
 
Multimodal Perspectives for Digitised Historical Newspapers
Multimodal Perspectives for Digitised Historical NewspapersMultimodal Perspectives for Digitised Historical Newspapers
Multimodal Perspectives for Digitised Historical Newspapers
cneudecker
 
OCR und Strukturerkennung: Herausforderungen und Ansätze für die Zeitungsdigi...
OCR und Strukturerkennung: Herausforderungen und Ansätze für die Zeitungsdigi...OCR und Strukturerkennung: Herausforderungen und Ansätze für die Zeitungsdigi...
OCR und Strukturerkennung: Herausforderungen und Ansätze für die Zeitungsdigi...
cneudecker
 
AI for digitized cultural heritage
AI for digitized cultural heritageAI for digitized cultural heritage
AI for digitized cultural heritage
cneudecker
 
Kuratieren mit künstlicher Intelligenz
Kuratieren mit künstlicher IntelligenzKuratieren mit künstlicher Intelligenz
Kuratieren mit künstlicher Intelligenz
cneudecker
 
Überblick zum DFG-Projekt OCR-D
Überblick zum DFG-Projekt OCR-DÜberblick zum DFG-Projekt OCR-D
Überblick zum DFG-Projekt OCR-D
cneudecker
 
The many uses of digitized newspapers
The many uses of digitized newspapersThe many uses of digitized newspapers
The many uses of digitized newspapers
cneudecker
 
Digitalisate kuratieren mit KI - von unstrukturierten Daten zu strukturierten...
Digitalisate kuratieren mit KI - von unstrukturierten Daten zu strukturierten...Digitalisate kuratieren mit KI - von unstrukturierten Daten zu strukturierten...
Digitalisate kuratieren mit KI - von unstrukturierten Daten zu strukturierten...
cneudecker
 
Von der Zeitungsdigitalisierung zu historischen Netzwerken - Methoden und Her...
Von der Zeitungsdigitalisierung zu historischen Netzwerken - Methoden und Her...Von der Zeitungsdigitalisierung zu historischen Netzwerken - Methoden und Her...
Von der Zeitungsdigitalisierung zu historischen Netzwerken - Methoden und Her...
cneudecker
 
OCR-D: An end-to-end open source OCR framework for historical printed documents
OCR-D: An end-to-end open source OCR framework for historical printed documentsOCR-D: An end-to-end open source OCR framework for historical printed documents
OCR-D: An end-to-end open source OCR framework for historical printed documents
cneudecker
 
Text and Data Mining
Text and Data MiningText and Data Mining
Text and Data Mining
cneudecker
 
Formate für Volltexte
Formate für VolltexteFormate für Volltexte
Formate für Volltexte
cneudecker
 
Extrablatt: The Latest News on Newspaper Digitisation in Europe
Extrablatt: The Latest News on Newspaper Digitisation in EuropeExtrablatt: The Latest News on Newspaper Digitisation in Europe
Extrablatt: The Latest News on Newspaper Digitisation in Europe
cneudecker
 
Reise durch Europeana Collections in 11 Minuten
Reise durch Europeana Collections in 11 MinutenReise durch Europeana Collections in 11 Minuten
Reise durch Europeana Collections in 11 Minuten
cneudecker
 
Europeana Newspapers in a Nutshell
Europeana Newspapers in a NutshellEuropeana Newspapers in a Nutshell
Europeana Newspapers in a Nutshell
cneudecker
 
lab.sbb.berlin
lab.sbb.berlinlab.sbb.berlin
lab.sbb.berlin
cneudecker
 
Named Entity Recognition for Europeana Newspapers
Named Entity Recognition for Europeana NewspapersNamed Entity Recognition for Europeana Newspapers
Named Entity Recognition for Europeana Newspapers
cneudecker
 

More from cneudecker (20)

EuropeanaTech x AI: Qurator.ai @ Berlin State Library
EuropeanaTech x AI: Qurator.ai @ Berlin State LibraryEuropeanaTech x AI: Qurator.ai @ Berlin State Library
EuropeanaTech x AI: Qurator.ai @ Berlin State Library
 
ALTO, PAGE & Co. Formate für Volltexte
ALTO, PAGE & Co. Formate für VolltexteALTO, PAGE & Co. Formate für Volltexte
ALTO, PAGE & Co. Formate für Volltexte
 
OCR und Strukturerkennung für Zeitungen
OCR und Strukturerkennung für ZeitungenOCR und Strukturerkennung für Zeitungen
OCR und Strukturerkennung für Zeitungen
 
Digitisation and Digital Humanities - what is the role of Libraries?
Digitisation and Digital Humanities - what is the role of Libraries?Digitisation and Digital Humanities - what is the role of Libraries?
Digitisation and Digital Humanities - what is the role of Libraries?
 
Multimodal Perspectives for Digitised Historical Newspapers
Multimodal Perspectives for Digitised Historical NewspapersMultimodal Perspectives for Digitised Historical Newspapers
Multimodal Perspectives for Digitised Historical Newspapers
 
OCR und Strukturerkennung: Herausforderungen und Ansätze für die Zeitungsdigi...
OCR und Strukturerkennung: Herausforderungen und Ansätze für die Zeitungsdigi...OCR und Strukturerkennung: Herausforderungen und Ansätze für die Zeitungsdigi...
OCR und Strukturerkennung: Herausforderungen und Ansätze für die Zeitungsdigi...
 
AI for digitized cultural heritage
AI for digitized cultural heritageAI for digitized cultural heritage
AI for digitized cultural heritage
 
Kuratieren mit künstlicher Intelligenz
Kuratieren mit künstlicher IntelligenzKuratieren mit künstlicher Intelligenz
Kuratieren mit künstlicher Intelligenz
 
Überblick zum DFG-Projekt OCR-D
Überblick zum DFG-Projekt OCR-DÜberblick zum DFG-Projekt OCR-D
Überblick zum DFG-Projekt OCR-D
 
The many uses of digitized newspapers
The many uses of digitized newspapersThe many uses of digitized newspapers
The many uses of digitized newspapers
 
Digitalisate kuratieren mit KI - von unstrukturierten Daten zu strukturierten...
Digitalisate kuratieren mit KI - von unstrukturierten Daten zu strukturierten...Digitalisate kuratieren mit KI - von unstrukturierten Daten zu strukturierten...
Digitalisate kuratieren mit KI - von unstrukturierten Daten zu strukturierten...
 
Von der Zeitungsdigitalisierung zu historischen Netzwerken - Methoden und Her...
Von der Zeitungsdigitalisierung zu historischen Netzwerken - Methoden und Her...Von der Zeitungsdigitalisierung zu historischen Netzwerken - Methoden und Her...
Von der Zeitungsdigitalisierung zu historischen Netzwerken - Methoden und Her...
 
OCR-D: An end-to-end open source OCR framework for historical printed documents
OCR-D: An end-to-end open source OCR framework for historical printed documentsOCR-D: An end-to-end open source OCR framework for historical printed documents
OCR-D: An end-to-end open source OCR framework for historical printed documents
 
Text and Data Mining
Text and Data MiningText and Data Mining
Text and Data Mining
 
Formate für Volltexte
Formate für VolltexteFormate für Volltexte
Formate für Volltexte
 
Extrablatt: The Latest News on Newspaper Digitisation in Europe
Extrablatt: The Latest News on Newspaper Digitisation in EuropeExtrablatt: The Latest News on Newspaper Digitisation in Europe
Extrablatt: The Latest News on Newspaper Digitisation in Europe
 
Reise durch Europeana Collections in 11 Minuten
Reise durch Europeana Collections in 11 MinutenReise durch Europeana Collections in 11 Minuten
Reise durch Europeana Collections in 11 Minuten
 
Europeana Newspapers in a Nutshell
Europeana Newspapers in a NutshellEuropeana Newspapers in a Nutshell
Europeana Newspapers in a Nutshell
 
lab.sbb.berlin
lab.sbb.berlinlab.sbb.berlin
lab.sbb.berlin
 
Named Entity Recognition for Europeana Newspapers
Named Entity Recognition for Europeana NewspapersNamed Entity Recognition for Europeana Newspapers
Named Entity Recognition for Europeana Newspapers
 

Recently uploaded

Best Practices for Effectively Running dbt in Airflow.pdf
Best Practices for Effectively Running dbt in Airflow.pdfBest Practices for Effectively Running dbt in Airflow.pdf
Best Practices for Effectively Running dbt in Airflow.pdf
Tatiana Al-Chueyr
 
leewayhertz.com-AI agents for healthcare Applications benefits and implementa...
leewayhertz.com-AI agents for healthcare Applications benefits and implementa...leewayhertz.com-AI agents for healthcare Applications benefits and implementa...
leewayhertz.com-AI agents for healthcare Applications benefits and implementa...
alexjohnson7307
 
“Deploying Large Language Models on a Raspberry Pi,” a Presentation from Usef...
“Deploying Large Language Models on a Raspberry Pi,” a Presentation from Usef...“Deploying Large Language Models on a Raspberry Pi,” a Presentation from Usef...
“Deploying Large Language Models on a Raspberry Pi,” a Presentation from Usef...
Edge AI and Vision Alliance
 
WhatsApp Spy Online Trackers and Monitoring Apps
WhatsApp Spy Online Trackers and Monitoring AppsWhatsApp Spy Online Trackers and Monitoring Apps
WhatsApp Spy Online Trackers and Monitoring Apps
HackersList
 
Uncharted Together- Navigating AI's New Frontiers in Libraries
Uncharted Together- Navigating AI's New Frontiers in LibrariesUncharted Together- Navigating AI's New Frontiers in Libraries
Uncharted Together- Navigating AI's New Frontiers in Libraries
Brian Pichman
 
Active Inference is a veryyyyyyyyyyyyyyyyyyyyyyyy
Active Inference is a veryyyyyyyyyyyyyyyyyyyyyyyyActive Inference is a veryyyyyyyyyyyyyyyyyyyyyyyy
Active Inference is a veryyyyyyyyyyyyyyyyyyyyyyyy
RaminGhanbari2
 
EuroPython 2024 - Streamlining Testing in a Large Python Codebase
EuroPython 2024 - Streamlining Testing in a Large Python CodebaseEuroPython 2024 - Streamlining Testing in a Large Python Codebase
EuroPython 2024 - Streamlining Testing in a Large Python Codebase
Jimmy Lai
 
Litestack talk at Brighton 2024 (Unleashing the power of SQLite for Ruby apps)
Litestack talk at Brighton 2024 (Unleashing the power of SQLite for Ruby apps)Litestack talk at Brighton 2024 (Unleashing the power of SQLite for Ruby apps)
Litestack talk at Brighton 2024 (Unleashing the power of SQLite for Ruby apps)
Muhammad Ali
 
Types of Weaving loom machine & it's technology
Types of Weaving loom machine & it's technologyTypes of Weaving loom machine & it's technology
Types of Weaving loom machine & it's technology
ldtexsolbl
 
How to Build a Profitable IoT Product.pptx
How to Build a Profitable IoT Product.pptxHow to Build a Profitable IoT Product.pptx
How to Build a Profitable IoT Product.pptx
Adam Dunkels
 
High Profile Girls call Service Pune 000XX00000 Provide Best And Top Girl Ser...
High Profile Girls call Service Pune 000XX00000 Provide Best And Top Girl Ser...High Profile Girls call Service Pune 000XX00000 Provide Best And Top Girl Ser...
High Profile Girls call Service Pune 000XX00000 Provide Best And Top Girl Ser...
bhumivarma35300
 
Vertex AI Agent Builder - GDG Alicante - Julio 2024
Vertex AI Agent Builder - GDG Alicante - Julio 2024Vertex AI Agent Builder - GDG Alicante - Julio 2024
Vertex AI Agent Builder - GDG Alicante - Julio 2024
Nicolás Lopéz
 
Data Integration Basics: Merging & Joining Data
Data Integration Basics: Merging & Joining DataData Integration Basics: Merging & Joining Data
Data Integration Basics: Merging & Joining Data
Safe Software
 
Dublin_mulesoft_meetup_Mulesoft_Salesforce_Integration (1).pptx
Dublin_mulesoft_meetup_Mulesoft_Salesforce_Integration (1).pptxDublin_mulesoft_meetup_Mulesoft_Salesforce_Integration (1).pptx
Dublin_mulesoft_meetup_Mulesoft_Salesforce_Integration (1).pptx
Kunal Gupta
 
(CISOPlatform Summit & SACON 2024) Keynote _ Power Digital Identities With AI...
(CISOPlatform Summit & SACON 2024) Keynote _ Power Digital Identities With AI...(CISOPlatform Summit & SACON 2024) Keynote _ Power Digital Identities With AI...
(CISOPlatform Summit & SACON 2024) Keynote _ Power Digital Identities With AI...
Priyanka Aash
 
Integrating Kafka with MuleSoft 4 and usecase
Integrating Kafka with MuleSoft 4 and usecaseIntegrating Kafka with MuleSoft 4 and usecase
Integrating Kafka with MuleSoft 4 and usecase
shyamraj55
 
High Profile Girls Call ServiCe Hyderabad 0000000000 Tanisha Best High Class ...
High Profile Girls Call ServiCe Hyderabad 0000000000 Tanisha Best High Class ...High Profile Girls Call ServiCe Hyderabad 0000000000 Tanisha Best High Class ...
High Profile Girls Call ServiCe Hyderabad 0000000000 Tanisha Best High Class ...
aslasdfmkhan4750
 
Opencast Summit 2024 — Opencast @ University of Münster
Opencast Summit 2024 — Opencast @ University of MünsterOpencast Summit 2024 — Opencast @ University of Münster
Opencast Summit 2024 — Opencast @ University of Münster
Matthias Neugebauer
 
How Social Media Hackers Help You to See Your Wife's Message.pdf
How Social Media Hackers Help You to See Your Wife's Message.pdfHow Social Media Hackers Help You to See Your Wife's Message.pdf
How Social Media Hackers Help You to See Your Wife's Message.pdf
HackersList
 
Girls Call Churchgate 9910780858 Provide Best And Top Girl Service And No1 in...
Girls Call Churchgate 9910780858 Provide Best And Top Girl Service And No1 in...Girls Call Churchgate 9910780858 Provide Best And Top Girl Service And No1 in...
Girls Call Churchgate 9910780858 Provide Best And Top Girl Service And No1 in...
maigasapphire
 

Recently uploaded (20)

Best Practices for Effectively Running dbt in Airflow.pdf
Best Practices for Effectively Running dbt in Airflow.pdfBest Practices for Effectively Running dbt in Airflow.pdf
Best Practices for Effectively Running dbt in Airflow.pdf
 
leewayhertz.com-AI agents for healthcare Applications benefits and implementa...
leewayhertz.com-AI agents for healthcare Applications benefits and implementa...leewayhertz.com-AI agents for healthcare Applications benefits and implementa...
leewayhertz.com-AI agents for healthcare Applications benefits and implementa...
 
“Deploying Large Language Models on a Raspberry Pi,” a Presentation from Usef...
“Deploying Large Language Models on a Raspberry Pi,” a Presentation from Usef...“Deploying Large Language Models on a Raspberry Pi,” a Presentation from Usef...
“Deploying Large Language Models on a Raspberry Pi,” a Presentation from Usef...
 
WhatsApp Spy Online Trackers and Monitoring Apps
WhatsApp Spy Online Trackers and Monitoring AppsWhatsApp Spy Online Trackers and Monitoring Apps
WhatsApp Spy Online Trackers and Monitoring Apps
 
Uncharted Together- Navigating AI's New Frontiers in Libraries
Uncharted Together- Navigating AI's New Frontiers in LibrariesUncharted Together- Navigating AI's New Frontiers in Libraries
Uncharted Together- Navigating AI's New Frontiers in Libraries
 
Active Inference is a veryyyyyyyyyyyyyyyyyyyyyyyy
Active Inference is a veryyyyyyyyyyyyyyyyyyyyyyyyActive Inference is a veryyyyyyyyyyyyyyyyyyyyyyyy
Active Inference is a veryyyyyyyyyyyyyyyyyyyyyyyy
 
EuroPython 2024 - Streamlining Testing in a Large Python Codebase
EuroPython 2024 - Streamlining Testing in a Large Python CodebaseEuroPython 2024 - Streamlining Testing in a Large Python Codebase
EuroPython 2024 - Streamlining Testing in a Large Python Codebase
 
Litestack talk at Brighton 2024 (Unleashing the power of SQLite for Ruby apps)
Litestack talk at Brighton 2024 (Unleashing the power of SQLite for Ruby apps)Litestack talk at Brighton 2024 (Unleashing the power of SQLite for Ruby apps)
Litestack talk at Brighton 2024 (Unleashing the power of SQLite for Ruby apps)
 
Types of Weaving loom machine & it's technology
Types of Weaving loom machine & it's technologyTypes of Weaving loom machine & it's technology
Types of Weaving loom machine & it's technology
 
How to Build a Profitable IoT Product.pptx
How to Build a Profitable IoT Product.pptxHow to Build a Profitable IoT Product.pptx
How to Build a Profitable IoT Product.pptx
 
High Profile Girls call Service Pune 000XX00000 Provide Best And Top Girl Ser...
High Profile Girls call Service Pune 000XX00000 Provide Best And Top Girl Ser...High Profile Girls call Service Pune 000XX00000 Provide Best And Top Girl Ser...
High Profile Girls call Service Pune 000XX00000 Provide Best And Top Girl Ser...
 
Vertex AI Agent Builder - GDG Alicante - Julio 2024
Vertex AI Agent Builder - GDG Alicante - Julio 2024Vertex AI Agent Builder - GDG Alicante - Julio 2024
Vertex AI Agent Builder - GDG Alicante - Julio 2024
 
Data Integration Basics: Merging & Joining Data
Data Integration Basics: Merging & Joining DataData Integration Basics: Merging & Joining Data
Data Integration Basics: Merging & Joining Data
 
Dublin_mulesoft_meetup_Mulesoft_Salesforce_Integration (1).pptx
Dublin_mulesoft_meetup_Mulesoft_Salesforce_Integration (1).pptxDublin_mulesoft_meetup_Mulesoft_Salesforce_Integration (1).pptx
Dublin_mulesoft_meetup_Mulesoft_Salesforce_Integration (1).pptx
 
(CISOPlatform Summit & SACON 2024) Keynote _ Power Digital Identities With AI...
(CISOPlatform Summit & SACON 2024) Keynote _ Power Digital Identities With AI...(CISOPlatform Summit & SACON 2024) Keynote _ Power Digital Identities With AI...
(CISOPlatform Summit & SACON 2024) Keynote _ Power Digital Identities With AI...
 
Integrating Kafka with MuleSoft 4 and usecase
Integrating Kafka with MuleSoft 4 and usecaseIntegrating Kafka with MuleSoft 4 and usecase
Integrating Kafka with MuleSoft 4 and usecase
 
High Profile Girls Call ServiCe Hyderabad 0000000000 Tanisha Best High Class ...
High Profile Girls Call ServiCe Hyderabad 0000000000 Tanisha Best High Class ...High Profile Girls Call ServiCe Hyderabad 0000000000 Tanisha Best High Class ...
High Profile Girls Call ServiCe Hyderabad 0000000000 Tanisha Best High Class ...
 
Opencast Summit 2024 — Opencast @ University of Münster
Opencast Summit 2024 — Opencast @ University of MünsterOpencast Summit 2024 — Opencast @ University of Münster
Opencast Summit 2024 — Opencast @ University of Münster
 
How Social Media Hackers Help You to See Your Wife's Message.pdf
How Social Media Hackers Help You to See Your Wife's Message.pdfHow Social Media Hackers Help You to See Your Wife's Message.pdf
How Social Media Hackers Help You to See Your Wife's Message.pdf
 
Girls Call Churchgate 9910780858 Provide Best And Top Girl Service And No1 in...
Girls Call Churchgate 9910780858 Provide Best And Top Girl Service And No1 in...Girls Call Churchgate 9910780858 Provide Best And Top Girl Service And No1 in...
Girls Call Churchgate 9910780858 Provide Best And Top Girl Service And No1 in...
 

Succeed 2nd hackathon

  • 1. 2nd Succeed Hackathon 10-11 April 2014, University of Alicante Succeed is supported by the European Union under FP7-ICT and coordinated by Universidad de Alicante.
  • 2. 2nd Hackathon, 10-11 April, Alicante Background  Libraries, archives, museums are digitising large quantities of (mainly historic) documents like books, newspapers, journals.  To make these digital documents searchable, images must first be converted to electronic text with help of OCR (optical character recognition)  Off the shelf OCR software is not suitable for processing historical documents – problems e.g. with old fonts, historic language variation, quality of paper originals  2009-2012: EU project IMPACT (IMProving ACcess to Text) (www.impact-project.eu)  2011: Official launch of the IMPACT Centre of Competence (www.digitisation.eu)  2013-2014: EU project SUCCEED (www.succeed-project.eu) Succeed is supported by the European Union under FP7-ICT and coordinated by Universidad de Alicante.
  • 3. 2nd Hackathon, 10-11 April, Alicante Tools, tools, tools “I have this great tool, it does XYZ… …if you call it with the right parameters… …and run it in this environment… …with these minor tweaks…” Succeed maintains an extensive list of tools for digitisation: http://succeed-project.eu/publications/available-tools/index-succeed Succeed is supported by the European Union under FP7-ICT and coordinated by Universidad de Alicante.
  • 4. 2nd Hackathon, 10-11 April, Alicante Interoperability VS VS Succeed is supported by the European Union under FP7-ICT and coordinated by Universidad de Alicante.
  • 6. 2nd Hackathon, 10-11 April, Alicante Bintray/Maven integration All software binaries can be directly downloaded from https://bintray.com/impactocr/maven You can also integrate them into your Maven project by adding this to your pom.xml: <repositories> <repository> <id>impactocr</id> <url>http://dl.bintray.com/impactocr/maven/</url> </repository> </repositories> <dependency> <groupId>eu.impact_project.iif.ws</groupId> <artifactId>generic-soap-client</artifactId> <version>0.7.0</version> </dependency> Succeed is supported by the European Union under FP7-ICT and coordinated by Universidad de Alicante.
  • 7. 2nd Hackathon, 10-11 April, Alicante More and more... OCR evaluation tool (https://github.com/impactcentre/ocrevalUAtion) = compare ground truth with OCR result PAGE generator (https://github.com/psnc-dl/page-generator) = generate training data for Tesseract OCR from PAGE xml Franken+ (https://github.com/idhmc-tamu/FrankenPlus) = create new training sets for Tesseract OCR Format converter (https://github.com/subugoe/format-converter) Convert between different text formats, e.g. ALTO, TEI, FRXML Succeed is supported by the European Union under FP7-ICT and coordinated by Universidad de Alicante.
  • 8. 2nd Hackathon, 10-11 April, Alicante Last but not least... Be creative and have fun coding and experimenting! Succeed is supported by the European Union under FP7-ICT and coordinated by Universidad de Alicante.