Computer Lexica in OCR and Retrieval


Presented at the "IMPACT demonstration session at the BNE". October. Biblioteca Nacional de España

Published in: Education, Technology
Transcript

  • 1. Computer Lexica in OCR and Retrieval. Katrien Depuydt, Jesse de Does (Instituut voor Nederlandse Lexicologie, Leiden). IMPACT is supported by the European Community under the FP7 ICT Work Programme. The project is coordinated by the National Library of the Netherlands.
  • 2. Can we handle ‘de wereld’ (‘the world’)? werreid (4 March 2009 presentation, The Hague)
  • 3. OCR: ABBYY FineReader SDK with built-in standard Dutch dictionary. OCR: ABBYY FineReader SDK combining built-in modern Dutch dictionary with IMPACT external historical lexicon of Dutch: werreld. (IMPACT Demo Day BL, 12 July 2011)
  • 4. RETRIEVAL: key in modern WERELD and find all: werelt weerelt wereld weerelds wereldt werelden weereld werrelts waerelds weerlyt wereldts vveerelts waereld weerelden waerelden weerlt werlt werelds sweerels zwerlys swarels swerelts werelts swerrels weirelts tsweerelds werret vverelt werlts werrelt worreld werlden wareld weirelt weireld waerelt werreld werld vvereld weerelts werlde tswerels werreldts weereldt wereldje waereldje weurlt wald weëled
  • 5. The long s problem: an example. OCR at start of project: “A. De eerde was de gevaarlykflti om de verlei¬ . ding aan t Hof; de tweede de ftillie en veiligde; de derde de zwaarde, daar hy byna drie millioenen harde en onbefchaafde Menfchen beftieren moest.” (IMPACT workshop, Bratislava, May 7, 2010)
  • 6. The long s problem: an example. OCR at start of project: “A. De eerde was de gevaarlykflti om de verlei¬ding aan t Hof; de tweede de ftillie en veiligde; de derde de zwaarde, daar hy byna drie millioenen harde en onbefchaafde Menfchen beftieren moest.” Results April 2010: “A. De eerste was de gevaarlykste om de verlei-ding aan t Hof; de tweede de stilste en veiligste; de derde de zwaarste, daar hy byna drie millioenen harde en onbeschaafde Menschen bestieren moest.”
  • 7. The long s problem: an example (same OCR comparison as on the previous slide). Workaround: “integrated postcorrection”: tell the engine that “eerfte” is OK and postcorrect it afterwards with the lexicon. In this way we keep it from turning into “eerde” (earth) instead of “eerste” (first).
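The workaround on slide 7 can be illustrated with a small lexicon-backed post-correction step. This is a minimal sketch, not the IMPACT implementation: the function name `postcorrect_long_s` and the toy lexicon are assumptions made for illustration. It accepts a token containing "f" where a long s was printed and looks for a lexicon form reachable by reading some of those letters as "s".

```python
from itertools import product

# Toy lexicon of accepted word forms (assumption: a real lexicon is far larger).
LEXICON = {"eerste", "gevaarlykste", "stilste", "veiligste", "zwaarste",
           "onbeschaafde", "menschen", "bestieren", "hof"}

def postcorrect_long_s(token, lexicon=LEXICON):
    """Return a lexicon form reachable by reading some 'f' as long s, else the token."""
    lower = token.lower()
    if lower in lexicon:
        return token
    positions = [i for i, ch in enumerate(lower) if ch == "f"]
    # Try every combination of f -> s substitutions (acceptable for short words).
    for mask in product([False, True], repeat=len(positions)):
        chars = list(lower)
        for pos, swap in zip(positions, mask):
            if swap:
                chars[pos] = "s"
        candidate = "".join(chars)
        if candidate in lexicon:
            # Restore an initial capital if the OCR token had one.
            return candidate.capitalize() if token[:1].isupper() else candidate
    return token
```

Because the search is exponential in the number of "f" characters, a production system would more likely restrict or rank candidates; the sketch only shows why accepting "eerfte" first and correcting it later avoids the wrong dictionary word "eerde".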
  • 8. Overview: What is a computer lexicon; Lexica in IMPACT; Tools for lexicon building and applying lexica; Some results; Searching; Demonstration
  • 9. What is a computer lexicon?
  • 10. Computer lexicon vs electronic dictionary (1). An electronic dictionary is: digitised full text (no pictures); for human use; ideally searchable, with explicitly coded material (XML) such as a lemma, part of speech (PoS), meaning, quotes etc. Examples: OED online, WNT online
  • 11. Dictionary XML (example)
  • 13. Computer lexicon vs electronic dictionary (2). A computer lexicon is: always in a structured digital format (XML, relational database); main purpose: computer application; explicitly coded information (e.g. lemma wereld, part of speech noun, morphology werelden, werelds …, syntax). Examples of use: linguistic enrichment of text material; ‘advanced’ searching (words with all spelling variants and inflections); automatic summarization, keyword extraction…
  • 15. Lexica in IMPACT
  • 16. The OCR lexicon. An OCR lexicon is: a checked list of words in a language; based on a corpus (collection) of dated texts (selection!); preferably with frequency information; preferably from the same time period or of the same text type as the texts you wish to digitize
  • 17. OCR lexicon: example (word, frequency). 1550-1750: song 820, rihte 818, theire 818, manye 818, sume 815, Do 814, Whiche 811, fyrst 811, while 811, Water 810, wt 809, shalbe 808, thingis 807, again 806, sona 806, wa 805, mode 804, work 802, between 801, law 799, moder 798, mis 798, softe 798. > 1900: television 418, electronic 375, video 194, hormone 176, jazz 162, eco 142, software 136, vitamin 128, movie 121, taxi 113, isotopic 108, electronics 95, radar 86, basically 71, sabotage 71, homozygote 70, psychedelic 67, phonemic 66, insulin 64, zap 64, antibody 61, fungicidal 61
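A period-specific frequency list like the one on slide 17 could be derived from dated texts roughly as follows. This is a sketch under stated assumptions: the function name `build_ocr_lexicon`, the `(year, text)` input shape, and the simple letter-run tokenizer are all illustrative, and the manual checking that the slides require for an OCR lexicon is not modelled.

```python
import re
from collections import Counter

def build_ocr_lexicon(dated_texts, start_year, end_year, min_count=1):
    """Count word forms in texts dated within [start_year, end_year]."""
    counts = Counter()
    for year, text in dated_texts:
        if start_year <= year <= end_year:
            # Runs of letters only; original capitalisation is kept on purpose.
            counts.update(re.findall(r"[^\W\d_]+", text))
    return [(word, n) for word, n in counts.most_common() if n >= min_count]
```

Keeping the date filter explicit reflects the point of the slide: a lexicon drawn from 1550-1750 material contains forms such as "fyrst" and "shalbe", while one drawn from post-1900 material does not.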
  • 18. The IR lexicon. Most important information categories: word forms (lists of words), plus: frequency information; quotes (dated sources) from corpora or electronic dictionaries; MODERN LEMMA (cf. the headword of a dictionary entry) linked to spelling variants and inflected forms of the same word. The modern lemma is used for searching in texts. Standard use in corpus linguistics and modern historical lexicography
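The role of the modern lemma as a search key can be shown with a small query-expansion sketch. The pairs below are taken from the variant lists in this talk, but the data structure and the names `build_index` and `expand_query` are assumptions for illustration; the project's retrieval demonstrator is described later as using Lucene and MySQL, which this sketch does not reproduce.

```python
from collections import defaultdict

# Toy (word form, modern lemma) pairs taken from the variant lists in this talk.
ENTRIES = [("werelt", "wereld"), ("weereld", "wereld"), ("waereld", "wereld"),
           ("werelden", "wereld"), ("uyterlick", "uiterlijk")]

def build_index(entries):
    """Map each modern lemma to the set of forms attested for it."""
    forms_by_lemma = defaultdict(set)
    for form, lemma in entries:
        forms_by_lemma[lemma].add(form)
        forms_by_lemma[lemma].add(lemma)  # the lemma itself is also searchable
    return forms_by_lemma

def expand_query(lemma, forms_by_lemma):
    """Return all forms to search for when the user keys in a modern lemma."""
    key = lemma.lower()
    return sorted(forms_by_lemma.get(key, {key}))
```

A search engine would then match any of the returned forms, so keying in WERELD also retrieves documents that only contain "waereld" or "werelt".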
  • 19. <?xml version="1.0"?> <!DOCTYPE lexicon SYSTEM "NL_Structure.dtd"> <lexicon><lexical_entry><lemma_id>219490</lemma_id><modern_lemma>aantuilen</modern_lemma><gloss></gloss><POS>VRB</POS><ne_label></ne_label><language_id></language_id><portmanteau_lemma_id></portmanteau_lemma_id><wordform><form_representation><wordform_id>850026</wordform_id><written_form>tuyld</written_form><attestation><id>92141</id><token_id></token_id><quote>Verhael ick (<I>t.w. een als vrouw verkleede man</I>) haer mijn min in Vrouwelijcker schynen:Sy acht het boertery, en tuyld daer weer op an, Vermits een Vrou niet op een Vrou verlieven kan,</quote><derivation_id>0</derivation_id><document_id>204</document_id><start_pos>119</start_pos><end_pos>124</end_pos></attestation></form_representation></wordform>
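An entry in this format can be read with Python's standard `xml.etree.ElementTree` module. The sketch below is illustrative: the element names come from the slide, but `SAMPLE` is a trimmed copy in which the closing `</lexical_entry>` and `</lexicon>` tags are an assumption (the slide cuts the entry off), and `read_entries` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the slide's entry; the two final closing tags are assumed.
SAMPLE = """<lexicon><lexical_entry>
<lemma_id>219490</lemma_id><modern_lemma>aantuilen</modern_lemma><POS>VRB</POS>
<wordform><form_representation><wordform_id>850026</wordform_id>
<written_form>tuyld</written_form>
<attestation><document_id>204</document_id>
<start_pos>119</start_pos><end_pos>124</end_pos></attestation>
</form_representation></wordform>
</lexical_entry></lexicon>"""

def read_entries(xml_text):
    """Return (modern lemma, PoS, written form, number of attestations) rows."""
    root = ET.fromstring(xml_text)
    rows = []
    for entry in root.iter("lexical_entry"):
        lemma = entry.findtext("modern_lemma")
        pos = entry.findtext("POS")
        for form in entry.iter("form_representation"):
            rows.append((lemma, pos, form.findtext("written_form"),
                         len(form.findall("attestation"))))
    return rows
```

The rows make the link explicit that the IR lexicon relies on: the historical form "tuyld" is tied to the modern lemma "aantuilen" and backed by a dated attestation.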
  • 20. Tools for lexicon building and application of lexica
  • 21. Types of variation (spelling, inflection…). I: uytterlijcste uyterlijkste duyterlijke uiterlyke uyterlijcke uiterlijke uyterlijck uiterlyken uiterlijkste uiterlicke wterlicke wterlijcke ulterlijk uiterlyk uiterlijk uyterlick wterlicken duyterlijcke uiterlijken uiterlijks wterlijck uytterlicke uitterlijke ujterlijke uytterlijk uyterlycke uyterlicken uijterlicke duiterlijcke wtterlijcke wterlyke wtterlijk uuterlick uuterlic uyterlijke uyterlijcken uyterlicke duiterlyke wterlijke vuyterlijcke uuterlycke uuterlicke wterlijken uyterlijcksten uuyterlicke uuyterlick uuyterlycke uytterlijcke uytterlycke uytterlick vuytterlicke uiterlijker uyterlyck uterliek wterlijcken uiterlijkst uitterlijk uytterlijcken uyterlyk wterlick uutterlijck uuyterlicken uyttelijck uijterlijk uytterlijck uuterlijck uiterlick uitterlyk uuyterlic uuyterlyck uuyterlijck uiterlijck uytterlyck uterlyc wterlijk (patterns to predict variation). II: werelt weerelt wereld weerelds wereldt werelden weereld werrelts waerelds weerlyt wereldts vveerelts waereld weerelden waerelden weerlt werlt werelds sweerels zwerlys swarels swerelts werelts swerrels weirelts tsweerelds werret vverelt werlts werrelt worreld werlden wareld weirelt weireld waerelt werreld werld vvereld weerelts werlde tswerels werreldts weereldt wereldje waereldje weurlt wald weëled (a number are predictable with patterns, others need to be taken from a lexicon)
  • 22. Neil Fitzgerald, 7th July 2011
  • 23. Computer lexica: for OCR and OCR post-correction; improving searchability of historical text material by building a lexicon with variants and using a modern lemma as a search entry; tools for lexicon building; tools for application of the lexicon in search engines; lexicon cookbook
  • 24. Tools (more specific): lexicon building from corpus material and dictionaries; use of lexica in search engines; tool to extract spelling variation patterns from historical material; tool to relate previously unrecognised spelling variations to their standard form; tool to reduce previously unrecognised inflected forms to their basic form
  • 25. Spelling variation tools (pattern-based). Language-independent approach: supervised rule (pattern) induction from pairs (“modern” word, historical word), yielding patterns like aa/ae, s/z, …. Pattern weights are computed from example material. Additional approaches possible, e.g. use of aligned data (parallel historical text and modern version)
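Pattern induction from (modern, historical) pairs can be approximated with a character alignment. The sketch below uses `difflib.SequenceMatcher` from the Python standard library as a stand-in aligner and relative frequency as the pattern weight; both choices are assumptions for illustration and are not claimed to be the method used in IMPACT. The function name `induce_patterns` is hypothetical.

```python
from collections import Counter
from difflib import SequenceMatcher

def induce_patterns(pairs):
    """Return {(modern substring, historical substring): weight} from word pairs."""
    counts = Counter()
    for modern, historical in pairs:
        matcher = SequenceMatcher(None, modern, historical, autojunk=False)
        for tag, i1, i2, j1, j2 in matcher.get_opcodes():
            if tag != "equal":
                # Each non-matching stretch of the alignment is one rewrite pattern.
                counts[(modern[i1:i2], historical[j1:j2])] += 1
    total = sum(counts.values())
    # Weight = relative frequency of the pattern in the example material.
    return {pattern: n / total for pattern, n in counts.items()} if total else {}
```

Applied to the pairs wereld/werelt, distels/dystels and uiterlijk/uyterlick from this talk, the sketch yields the patterns d/t, i/y and j/c, with i/y weighted highest because it occurs twice.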
  • 26. Lemmatization: reduction of historical word forms to a modern lemma. Historical word → (1) pattern matching → standard (“modern”) spelling form → (2) lemmatizer → lemma. Example: Dystels → (1) distels → (2) distel. When we have a perfect or near-perfect modern full form lexicon, the second step is simply lexicon lookup. But: 1) we will not have full form information for many lemmata (especially the historical ones); 2) even lemmata present in modern language may have historical inflected forms different from the present-day paradigm
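The two-step reduction on slide 26 can be sketched as pattern-based normalisation followed by lookup in a modern full-form lexicon. The pattern list, the toy lexicon and the function names are assumptions for illustration; in particular, applying each pattern to all occurrences at once is a simplification of weighted pattern matching.

```python
# Historical -> modern rewrite patterns (assumed for illustration).
PATTERNS = [("y", "i"), ("ae", "aa"), ("ck", "k")]
# Toy modern full-form lexicon: inflected form -> lemma.
FULL_FORMS = {"distels": "distel", "distel": "distel"}

def modern_candidates(word, patterns=PATTERNS):
    """Step 1: generate candidate modern spellings by applying rewrite patterns."""
    candidates = {word}
    for old, new in patterns:
        candidates |= {c.replace(old, new) for c in candidates}
    return candidates

def lemmatize(historical, full_forms=FULL_FORMS):
    """Step 2: look candidates up in the full-form lexicon; None if nothing matches."""
    for candidate in sorted(modern_candidates(historical.lower())):
        if candidate in full_forms:
            return candidate, full_forms[candidate]
    return None
```

The `None` result corresponds to the two caveats on the slide: when the full-form lexicon lacks the inflected form, plain lookup fails and a further mechanism is needed.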
  • 27. Lemmatization and reverse lemmatization. We also need a lemmatization process for these situations. A typical lemmatizer assigns some standard form (infinitive, nominative, stem) to inflected forms, usually based on patterns relating the inflected form to the standard form. But: matching these patterns can be hard to combine with matching both spelling variation patterns and OCR errors (bok/bokken/bokkeu). We adopt the solution of actually expanding the lexicon into a “hypothetical modern full form lexicon” containing the most plausible paradigmatic expansions of lemmata. This construction is carried out by means of a statistical reverse lemmatizer
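The idea of expanding lemmata into a hypothetical full-form lexicon can be sketched with scored suffix rules. Everything in this block is an assumption for illustration: the rules and their plausibility scores are made up, whereas a statistical reverse lemmatizer would estimate them from known paradigms, and the name `expand_lemma` is hypothetical.

```python
# Hypothetical (strip, add, score) suffix rules with made-up plausibility scores.
RULES = [("", "", 1.0), ("", "s", 0.4), ("", "en", 0.3), ("", "je", 0.2)]

def expand_lemma(lemma, rules=RULES, threshold=0.25):
    """Return {hypothetical inflected form: score} for rules scoring >= threshold."""
    forms = {}
    for strip, add, score in rules:
        if lemma.endswith(strip) and score >= threshold:
            stem = lemma[: len(lemma) - len(strip)]
            form = stem + add
            # Keep the best score if several rules produce the same form.
            forms[form] = max(score, forms.get(form, 0.0))
    return forms
```

The toy rules produce "werelds" and "werelden" from "wereld", but they would not produce "bokken" from "bok", since consonant doubling is not modelled; the generated forms remain hypothetical until they are attested in real text.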
  • 28. Attestation: from hypothetical (non-witnessed) lexicon content to attested word forms in “real” text. Automatic selection of candidate attestations. Manual work: verification and correction. Two approaches: dictionary based (INL): Woordenboek der Nederlandsche Taal; corpus based (LMU, INL): Dutch DBNL corpus
  • 29. IMPACT Dictionary Attestation Tool. Lexicon building at work: verifying attestations in historical dictionaries. Task: find the variants of a headword as they occur in the quotations. Headword: work. Quotations (containing the variants): “We are working on what works.” “Depart from me, ye that worke iniquity.” “She worcketh knittinge of stockings.”
  • 30. IMPACT Dictionary Attestation Tool. Task: find the variants of a headword as they occur in the quotations. Input: electronic historical dictionary → database with lemmata and quotations. Automatically (preprocessing): match literally, e.g. work → work, Work; match using existing lexica and lists, e.g. work → works, worked, wrought; approximate matching, e.g. work → worke. By hand (using the tool): correct automatic mismatches, e.g. works → words, worms; find missed matches, e.g. work → worketh, wrowght
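The three automatic matching stages on slide 30 can be sketched as follows. This is a minimal illustration, not the Attestation Tool itself: the similarity measure (`difflib.SequenceMatcher.ratio`), the cutoff of 0.8, the toy `KNOWN_FORMS` table and the name `find_attestations` are all assumptions.

```python
import re
from difflib import SequenceMatcher

# Forms known from existing lexica and lists (toy data from the slide).
KNOWN_FORMS = {"work": {"works", "worked", "wrought"}}

def find_attestations(headword, quotation, known_forms=KNOWN_FORMS, cutoff=0.8):
    """Return (token, character offset, match kind) for candidate attestations."""
    hits = []
    for match in re.finditer(r"[^\W\d_]+", quotation):
        token = match.group().lower()
        if token == headword:
            kind = "literal"
        elif token in known_forms.get(headword, ()):
            kind = "lexicon"
        elif SequenceMatcher(None, headword, token).ratio() >= cutoff:
            kind = "approximate"
        else:
            continue
        hits.append((match.group(), match.start(), kind))
    return hits
```

With this cutoff "worke" is found by approximate matching, while "worketh" and "wrowght" are not, which mirrors the slide's point that missed matches still have to be added by hand.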
  • 31. IMPACT Attestation Tool (screenshot labels): up-to-date overview of what is done and needs to be done; done by this user so far; lemma (headword); quotations, sorted by uncertainty
  • 32. IMPACT Lexicon Tool. Task: find and verify attestations in a historical corpus. Automatically (preprocessing = apply lemmatizer): match literally, e.g. work → work, Work; match using existing lexica and lists, e.g. work → works, worked, wrought; match using the spelling variation module, e.g. uiterlijk → uyterlick. By hand (using the tool): assign correct lemma, e.g. was (N) → zijn (V); group tokens belonging together, e.g. konings zoon → koningszoon; select attestations
  • 33. Corpus-based lexicon building: IMPACT Lexicon Tool
  • 34. General vocabulary vs. named entities. Tools for lexicon building described so far: applicable to the general lexicon. Tools for NE recognition, classification and variant matching: library requirement; distinguish general vocabulary from NEs; avoid unpleasant mix-ups like Abimelech → apemelk! (b/p; i/e; e/0; k/ch)
  • 35. Improvement of state of the art / innovation. We use existing computational linguistic approaches, but figure out how to apply them to historical language. We develop a workflow to deal with the problems posed by historical language, figuring out how all pieces fit together: data selection and acquisition; manual work; computational linguistics tools
  • 36. Languages in IMPACT: Dutch, German, English, Spanish, French, Polish, Czech, Slovene and Bulgarian. Cross-language perspective paper; parallel OCR and IR experiments; GT datasets; language tools: language independent; except for the 3 core languages: proof-of-concept lexica
  • 37. OCR evaluation results (preliminary!)
  • 38. 1. Czech: Co jest konstituce?, čili, Krátký, prostonárodní wýklad hlawnějších zásad konstitucí ewropejských, 1848; Ferina Lišák z Kuliferdy a na Klukově, čili, Kratičká historye zlopověstných kousků starého Reinecke, 1848; Homerowa Iliada, 1802; Na den narození neimocněišího, a neijasněišího cysare rímského, téz dědičného rakauského a krále ceského, Frantiska II., w Praze 12. den mesyce Unora, léta 1805, 1805; Plody sborů učenců řeči českoslowanské prešporského, 1836; Rozprawy o gmenách, počátkách i starožitnostech národu Slawského a geho kmeni /, 1830; Sokol, 1872; Základowé pitwy (Anatomie), čili, Soustawnj rozbor a popis těla lidského a gednotliwých geho částek, 1840
  • 40. 2. Dutch: 18th and 19th century books, newspapers, parliamentary papers. Provinciale Overijsselsche en Zwolsche courant : staats-, handels-, nieuws- en advertentieblad, 1852-1852; Rechtsgeleerd advis in de zaak van den gewezen stadhouder, en over deszelfs schryven aan de gouverneurs van de Oost- en West-Indische bezittingen van den staat [...]. Ingelevert [...] op den 7 january 1796. / By B. Voorda et al, 1796-1796; Verhaal van het levensgevaar, waar in zig drie Rotterdamsche burgers [...] bevonden hebben, te Utrecht, 1784-1784; Vrijmoedige aanmerkingen, over de uitsluiting van allen die door publieke armkassen bedeeld worden, als stemgerechtigden [...] bij eene oproeping van het Nederlandsche volk tot eene Nationaale Conventie, 1795-1795
  • 41. Precision: 0.8432889410216431, Recall: 0.843331934927516
  • 43. English: 16th-19th century material. Sources for lexicon building: OED, ECCO
  • 45. French: 17th century books. Conduite du jugement naturel où tous les bons esprits de l'un et l'autre sexe pourront facilement puiser la pureté de la science, par M. Jacques Forton, sieur de S. Ange,..., 1653; Dissertation de la philosophie en général, 1668; La Dialectique du sieur de Launay, contenant l'art de raisonner juste sur toute sorte de matières..., 1673; Lettre de M. Gadroys à M. de La Grange Trianon,... pour servir de réponse à celle que M. de Castelet a écrite contre les raisons de M. Descartes touchant le flux et le reflux de la mer. - Seconde lettre de M. Gadroys... [au même, sur le même sujet.], 1677; Traitez de métaphysique démontrée selon la méthode des géomètres. [Par le sieur de La Coudraye.], 1693
  • 47. German: Das Buch des heyligen Römischen Reichs unnderhalltunge, 1501; Die Poesie ihr Wesen und ihre Formen mit Grundzügen der vergleichenden Literaturgeschichte, 1884; Echo Deß Hochzeitlichen Te Deum Laudamus, 1722; Ergebnisse der Erhebungen über die Beschäftigung gewerblicher Arbeiter an Sonn- und Festtagen, Bd.:1, Gruppe I bis VII der Gewerbestatistik, Berlin, 1887, 1887; Quedlinburgisches Kreis-Tags-Memorial, 1673; Von der Regierung der Kirche und den unterschiedlichen Würden der Geistlichkeit *(full title in comments), 1779; Warhaffter und grundlicher Bericht uß was Ursachen Martinus du Voysin (zu Basel verburgerter Krämer) inn der Statt Surseew im Aargöw, ..., den 13. Tag Octobris deß 1608. Jars erstlich enthauptet, und volgends verbrennt worden, 1609
  • 49. Polish: Adwersaria, albo terminata sprawy wojennej, która się toczyła w wołoskiej ziemi z tureckim cesarzem, 1621; Chorągiew Sarmacka w Wołoszech, to jest pospolite ruszenie i szczęśliwy powrót Polaków z Wołoch w roku 1621, 1621; Diariusz wiadomości od wyjazdu króla z Wilna do Smoleńska, 1610; Discurs o cenie pieniedzy teraznieyszey y o niektorych skutkach iey…, 1632; Nowe Ateny, albo Akademia wszelkiey scyencyi pełna, na różne tytuły iak na classes podzielona, mądrym dla memoryału, idiotom dla nauki, politykom dla praktyki, melancholikom dla rozrywki erygowana ... . Część 3 albo Supplement., 1746; Pasja żołnierzy obojga narodów w stolicy moskiewskiej krótko opisana, 1613; Powodzenia niebezpiecznego ale szczęśliwego wojska j. k. m. w Multanach opisanie, 1601; Relacja chwalebnej ekspedycji Jana Kazimierza, króla polskiego i szwedzkiego, 1650; Wyprawa i wyjazd sułtana Amurata, cesarza tureckiego, na wojnę do Korony Polskiej, 1634; Wyprawa i wyjazd sułtana Amurata, cesarza tureckiego, na wojnę do Korony Polskiej_BW, 1634; Żałosne opisanie upadku króla hiszpańskiego na morzu i na lądzie, 1589
  • 51. Slovene: Genovefa, 1841; Gosp. Krištofa Šmida korarja avgustanskiga, zgodBe S. Pisma za mlade ljud..., 1850; Kmetijske in rokodelske novice, 1844; Kratkozhasne uganke, 1788; Kuharske Bukve, 1799; Marianske Kempensar, ali Dvoje bukuvze, 1769; Novice kmetijskih, rokodelnih in narodskih reči, 1851; Sgodbe svetiga pisma za mlade ljudi, 1830; Ta male katechismus, 1768; Vezhna pratika od gospodarstva, 1789; Zerkviza na skali, 1855
  • 53. Retrieval demonstrator: indexing and retrieval library (Java) implemented on the Lucene search engine; lexicon in a MySQL database; OCR with the FineReader SDK and external dictionary interface of about 2000 images of the Dutch Ground Truth selection; Page XML output [in framework]; NE tagging; indexing and retrieval while using lexicon and NE tagging
