Datech2014 - Session 5 - Wittgenstein’s Nachlass: WiTTFind and Wittgenstein Advanced Search Tools (WAST)

Presentation of the paper Wittgenstein’s Nachlass: WiTTFind and Wittgenstein Advanced Search Tools (WAST) by Maximilian Hadersbeck, Alois Pichler, Florian Fink and Øyvind Liland Gjesdal in DATeCH 2014. #digidays

Presentation Transcript

  • Wittgenstein’s Nachlass: WiTTFind and Wittgenstein Advanced Search Tools (WAST)
    Presentation at DATeCH 2014
    Florian Fink, Centrum für Informations- und Sprachverarbeitung (CIS), LMU
    May 19, 2014
  • Ludwig Wittgenstein
    The Austrian philosopher Ludwig Wittgenstein (1889-1951) left behind 20,000 pages of philosophical manuscripts and typescripts, known as Wittgenstein’s Nachlass.
    • In 2000, the Wittgenstein Archives at the University of Bergen (WAB) published this Nachlass as an electronic edition, the Bergen Electronic Edition (BEE).
    • In 2009, WAB additionally made 5,000 pages freely available on the web.
    • In 2010, a cooperation on the Nachlass started between Dr. Alois Pichler and the Center for Information and Language Processing.
  • Wittgenstein in Co-Text
    The Wittgenstein in Co-Text project is a cooperation between the Wittgenstein Archive Bergen and the Center for Information and Language Processing, Ludwig-Maximilians University of Munich. Its goal is to develop new tools that help researchers from different fields explore and study the works of Ludwig Wittgenstein.
    • Dr. Maximilian Hadersbeck (LMU München)
    • Dr. Alois Pichler (Wittgenstein Archive Bergen)
    • Øyvind Liland Gjesdal (Wittgenstein Archive Bergen)
    • Florian Fink (LMU München)
  • Wittgenstein Advanced Search Tools (WAST)
    To open up new possibilities for researchers, the Wittgenstein Advanced Search Tools (WAST) offer local-grammar-based search on the Nachlass.
    • Simple interface, even for complex queries
    • Lemmatized as well as simple full-text search
    • Integration of semantic and syntactic information
    • Integration of part-of-speech information
    • Presentation of the search results within the original documents
  • The front end of WAST I
    The web-based front end of WAST provides access to the text and displays the results of queries. Results are shown in normalized form: each hit lists the sentence that matches the query together with a small snippet of the original document.
  • The front end of WAST II
    Each sentence of the result set contains links to further Wittgenstein resources:
    • The corresponding sentence in the Wittgenstein Source of the University of Bergen
    • Pundit
    • The highlighted sentence within the original facsimile
  • Highlighting of the facsimile
    The highlighter shows the result of a query directly in the scan of the corresponding facsimile page. This lets researchers see their hits in the wider context of the original page.
  • Expanding the search results
    The normalized view can also be expanded to study the context of the results further. Researchers can examine the text and see the changes and alternatives inserted by Ludwig Wittgenstein.
  • Searching the Nachlass
    The tool that searches the Nachlass of Ludwig Wittgenstein is called Wittfind. It searches for words and phrases within sentences, since Ludwig Wittgenstein regards the sentence as central to the meaning of words (Tractatus logico-philosophicus [22, 3.3]):
        Nur der Satz hat Sinn; nur im Zusammenhang des Satzes hat ein Name Bedeutung ¹
    ¹ Only propositions have sense; only in the nexus of a proposition does a name have meaning.
  • Wittfind
    To find phrases, Wittfind internally executes search graphs on the text. These search graphs support concatenations, alternatives and sequences of tokens in queries. You can specify queries that
    • search for a token A or B
    • search for a token A followed by a token B
    • search for a token A followed by zero or more occurrences of a token B
  • Search graphs
    The phrase search of Wittfind is comparable to algorithms that search for regular expressions in text. Regular expressions are transformed into nondeterministic finite automata (NFA), which are then used in the search phase. Whereas the automata of regular expressions match character by character, the automata of the search graphs match token by token.
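
The token-level matching described on this slide can be sketched as follows. This is a minimal illustration, not Wittfind's actual implementation: the graph layout, the state numbers and the names `NFA`, `step`, `match_at` and `find_all` are assumptions made for this example. The hand-built graph encodes "der", then "Satz" or "Name", then zero or more "hat", and the simulation tracks a set of active states in the way an NFA simulation does.

```python
# Transitions are labelled with whole tokens, not single characters.
# Pattern: "der", then "Satz" or "Name", then zero or more "hat".
NFA = {
    0: [("der", 1)],
    1: [("Satz", 2), ("Name", 2)],   # alternative: A or B
    2: [("hat", 2)],                 # loop: zero or more occurrences
}
START, ACCEPTING = 0, {2}


def step(nfa, states, token):
    """Follow every transition labelled `token` from the active states."""
    return {nxt for s in states for label, nxt in nfa.get(s, []) if label == token}


def match_at(nfa, tokens, start):
    """Return the end index of the longest match beginning at `start`, or None."""
    states, best, i = {START}, None, start
    while states and i < len(tokens):
        states = step(nfa, states, tokens[i])
        i += 1
        if states & ACCEPTING:
            best = i
    return best


def find_all(nfa, tokens):
    """Collect (start, end) spans of the longest match at each start position."""
    spans = []
    for start in range(len(tokens)):
        end = match_at(nfa, tokens, start)
        if end is not None:
            spans.append((start, end))
    return spans


tokens = "Nur der Satz hat Sinn".split()
print(find_all(NFA, tokens))  # [(1, 4)]
```

The span (1, 4) covers the tokens "der Satz hat"; a character-based regular expression engine would run the same state-set loop over characters instead of tokens.
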
  • Pattern matching I
    Wittfind can draw on different matching policies and resources in its search phase. The token-matching syntax builds on the syntax of Unitex and extends it where appropriate.
    • simple string matching
    • matching on part-of-speech tags
    • matching on regular expressions
    • matching on special token classes (words, numbers, etc.)
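
The four matching policies listed on this slide can be pictured as interchangeable predicates on a token. The sketch below is an assumption-laden illustration: the `Token` class, the helper names and the STTS-style tags (`ART`, `NN`, `VAFIN`) are chosen for this example only, and it does not reproduce the Unitex or Wittfind query syntax.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    text: str
    pos: str  # part-of-speech tag; the tag set used here is an assumption


def string_is(expected):
    """Simple string matching."""
    return lambda tok: tok.text == expected


def pos_is(tag):
    """Matching on a part-of-speech tag."""
    return lambda tok: tok.pos == tag


def regex_is(pattern):
    """Matching the whole token text against a regular expression."""
    compiled = re.compile(pattern)
    return lambda tok: compiled.fullmatch(tok.text) is not None


def is_word(tok):
    """Token class: alphabetic words."""
    return tok.text.isalpha()


def is_number(tok):
    """Token class: digit-only numbers."""
    return tok.text.isdigit()


def find_sequence(tokens, matchers):
    """Return start indices where consecutive tokens satisfy the matchers."""
    n = len(matchers)
    return [
        i
        for i in range(len(tokens) - n + 1)
        if all(m(tok) for m, tok in zip(matchers, tokens[i:i + n]))
    ]


sentence = [Token("der", "ART"), Token("Satz", "NN"),
            Token("hat", "VAFIN"), Token("Sinn", "NN")]
# A noun followed by any token whose text starts with "ha".
print(find_sequence(sentence, [pos_is("NN"), regex_is("ha.*")]))  # [1]
```

Because every policy has the same shape (token in, boolean out), such predicates could label the edges of a search graph in place of plain strings.
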
  • Pattern matching II
    Apart from simple string-based matching of tokens, Wittfind uses a background dictionary for additional morphological, syntactic and semantic information.
    • lemmatized matching (“dachte” matches queries for its lemma “denken”)
    • inverse lemmatized matching (“gedacht” matches “dachte” because they share the lemma “denken”)
    • matching on morphological and syntactic forms (verbs, adjectives, etc.)
    • matching based on the semantics of the token as provided in the dictionary
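
The two lemma-based policies above can be illustrated with a toy full-form dictionary. The dictionary contents and the function names below are hypothetical and stand in for the much larger background dictionary mentioned on the slide.

```python
# Hypothetical full-form dictionary: surface form -> lemma.
FULL_FORMS = {
    "denken": "denken",
    "denkt": "denken",
    "dachte": "denken",
    "gedacht": "denken",
    "Satz": "Satz",
    "Sätze": "Satz",
}


def lemma_of(form):
    """Look up the lemma of a surface form; None if the form is unknown."""
    return FULL_FORMS.get(form)


def matches_lemma(token, lemma):
    """Lemmatized matching: the token's lemma equals the queried lemma."""
    return lemma_of(token) == lemma


def matches_inverse(token, query_form):
    """Inverse lemmatized matching: token and query form share a lemma."""
    lemma = lemma_of(query_form)
    return lemma is not None and lemma_of(token) == lemma


print(matches_lemma("dachte", "denken"))     # True
print(matches_inverse("gedacht", "dachte"))  # True
print(matches_inverse("Satz", "dachte"))     # False
```
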
  • Queries
    While it is possible to draw search graphs directly and execute them ², Wittfind also accepts flat queries. These queries are transformed into search graphs automatically. The algorithm used for this is similar to the algorithms that create NFAs from regular expressions.
    ² There is an experimental online graph editor that allows you to do exactly this.
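
To make the flat-query-to-graph step concrete, here is a small sketch in the spirit of the classic regular-expression-to-NFA construction. The slides do not show Wittfind's query syntax, so the syntax used here is invented for illustration: words separated by spaces form a sequence, `|` separates alternative sequences, and a trailing `*` on a word means zero or more occurrences. The function names are likewise assumptions.

```python
from itertools import count

EPS = None  # label of an empty (epsilon) transition


def compile_query(query):
    """Compile a flat query (hypothetical syntax) into (start, accept, edges)."""
    new_state = count()
    edges = []  # list of (source, label, target)
    start, accept = next(new_state), next(new_state)
    for alternative in query.split("|"):
        current = next(new_state)
        edges.append((start, EPS, current))
        for item in alternative.split():
            nxt = next(new_state)
            if item.endswith("*"):
                edges.append((current, item[:-1], current))  # repeat the word
                edges.append((current, EPS, nxt))            # or skip it
            else:
                edges.append((current, item, nxt))
            current = nxt
        edges.append((current, EPS, accept))
    return start, accept, edges


def closure(states, edges):
    """Add every state reachable through epsilon transitions."""
    stack, seen = list(states), set(states)
    while stack:
        state = stack.pop()
        for src, label, dst in edges:
            if src == state and label is EPS and dst not in seen:
                seen.add(dst)
                stack.append(dst)
    return seen


def accepts(nfa, tokens):
    """Check whether the whole token sequence is accepted by the graph."""
    start, accept, edges = nfa
    states = closure({start}, edges)
    for tok in tokens:
        moved = {dst for src, label, dst in edges if src in states and label == tok}
        states = closure(moved, edges)
    return accept in states


graph = compile_query("der Satz hat* | ein Name")
print(accepts(graph, ["der", "Satz", "hat", "hat"]))  # True
print(accepts(graph, ["ein", "Name"]))                # True
print(accepts(graph, ["der", "Name"]))                # False
```

The compiled graph is the same kind of object a user could draw by hand in a graph editor; the flat query is only a more compact way to write it.
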
  • Wittfind client-server
    Wittfind uses additional lexical resources in the form of a background dictionary. To avoid reloading these resources for each query, Wittfind is split into a server part and a client part.
    • The server loads all resources and texts once and optimizes their usage.
    • The client applies a preprocessing step to the query and sends it to the server for execution.
    • The client applies the matches from the server to the text and displays the results of the query.
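
The division of labour between the two parts can be sketched in-process, without any networking. The class names, the lemma-only query and the bracket markup are assumptions made for this illustration; in a real deployment the call to `execute` would travel over a network connection.

```python
class SearchServer:
    """Server part: loads text and dictionary once, then answers many queries."""

    def __init__(self, sentences, full_forms):
        # Expensive resources are prepared a single time here.
        self.sentences = [s.split() for s in sentences]
        self.full_forms = full_forms  # surface form -> lemma

    def execute(self, lemma):
        """Return (sentence index, token index) for each token with this lemma."""
        return [
            (i, j)
            for i, sent in enumerate(self.sentences)
            for j, tok in enumerate(sent)
            if self.full_forms.get(tok, tok) == lemma
        ]


class SearchClient:
    """Client part: preprocesses the query and renders the server's matches."""

    def __init__(self, server, sentences):
        self.server = server
        self.sentences = [s.split() for s in sentences]

    def search(self, raw_query):
        query = raw_query.strip()          # preprocessing step
        hits = self.server.execute(query)  # a network call in a real setup
        results = []
        for i, j in hits:
            tokens = list(self.sentences[i])
            tokens[j] = f"[{tokens[j]}]"   # mark the hit in the text
            results.append(" ".join(tokens))
        return results


TEXT = ["Ich dachte lange", "Er hat gedacht"]
server = SearchServer(TEXT, {"dachte": "denken", "gedacht": "denken"})
client = SearchClient(server, TEXT)
print(client.search(" denken "))  # ['Ich [dachte] lange', 'Er hat [gedacht]']
```

The server object is constructed once and reused for every call to `search`, which mirrors the motivation given on the slide: the dictionary is not reloaded per query.
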
  • Expanding WAST to other texts
    WAST as presented here evolved around the work of Ludwig Wittgenstein, and many of its features come directly from the ideas and needs of Wittgenstein researchers. Most of the tools are nevertheless modular and able to work on any text. Wittfind, the core tool of the local grammar search, can easily be used to search any compatible collection of texts. Since it returns the sentences of the original document without modifying them, any tool that works on the original text should work on the results as well. To use all features, one needs:
    • A TEI P5-compatible text (optionally with part-of-speech tags)
    • An extensive full-form dictionary
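
As a rough idea of what a sentence-segmented, part-of-speech-tagged input could look like, the sketch below reads sentences from a small TEI-style fragment with Python's standard XML parser. The fragment is an assumption for illustration only: it is not taken from the Nachlass encoding, it omits the `teiHeader` that a complete TEI document requires, and the use of `s` and `w` elements with a `pos` attribute is one possible encoding rather than a documented WAST requirement.

```python
import xml.etree.ElementTree as ET

TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Illustrative fragment; not a complete, valid TEI document.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text><body><p>
    <s n="1"><w pos="ART">Der</w> <w pos="NN">Satz</w> <w pos="VAFIN">hat</w> <w pos="NN">Sinn</w></s>
    <s n="2"><w pos="ART">Ein</w> <w pos="NN">Name</w> <w pos="VAFIN">hat</w> <w pos="NN">Bedeutung</w></s>
  </p></body></text>
</TEI>"""


def read_sentences(tei_xml):
    """Return a list of (sentence id, [(token, pos), ...]) from the markup."""
    root = ET.fromstring(tei_xml)
    result = []
    for s in root.iterfind(".//tei:s", TEI_NS):
        tokens = [(w.text, w.get("pos")) for w in s.iterfind("tei:w", TEI_NS)]
        result.append((s.get("n"), tokens))
    return result


for sentence_id, tokens in read_sentences(SAMPLE):
    print(sentence_id, tokens)
```

Keeping the sentence identifiers from the markup is what would allow a hit to be traced back to its place in the unmodified original document.
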
  • Further work on WAST
    The development of WAST has always been supported by students of the CIS. Some of them centered their bachelor's or master's thesis on an aspect of the tool set. Other projects have been finished or are about to be finished:
    • Wittfind web service
    • Facsimile Highlighter (M. Lindinger)
    • Facsimile Reader based on the Highlighter
    • Extensive help page with a collection of example search queries (A. Krey)
    • Semantic Search offering simple usage of the different semantic classes in WAST, including specialized classes to exploit Ludwig Wittgenstein's theory of color (A. Krey)
    • Online graph editor (Y. Kalasouskaya)
  • Online graph editor
    The online graph editor is a tool that allows people to draw search graphs and execute them directly on the Nachlass.
  • URLs
    • Wittgenstein Archive Bergen: http://wab.uib.no/
    • Wittgenstein Source: http://www.wittgensteinsource.org
    • Pundit: http://feed.thepund.it/
    • WAST: http://wittfind.cis.lmu.de
    • CIS: http://www.cis.lmu.de
    • Wittgenstein in Co-Text: http://www.cis.uni-muenchen.de/forschung/ehumanities/research-group-co/index.html
    • Graph search presentation: http://www.cis.uni-muenchen.de/kurse/max/scholarship/finkwf.pdf
    • Wittgenstein scholarship: http://wastwiki.cis.uni-muenchen.de/wiki/Scholarship
  • Thanks for your attention!