This document describes ACAD (Automatic Coherence Analysis of Dutch), a tool for the automatic analysis of coherence in Dutch texts. ACAD lets users formulate sophisticated search queries across multiple Dutch corpora to analyze coherence relations and connectives, with the aim of making analyses more reproducible and transparent. Its search interface, Cesar, translates queries into XQuery and controls the output. It can search corpora such as SoNaR and works with formats such as Folia. ACAD's goals are to build this search interface and to extend the available corpora with newspaper texts and WhatsApp data. Future work includes writing manuals and investigating other connectives, constructions, and languages. The resulting annotated corpora will be released.
ACAD Presentation by Wilbert Spooren, CLARIAH Toogdag 19-10-2018
1. Automatic Coherence Analysis of Dutch
Henk van den Heuvel
Jet Hoek, Micha Hulsbosch, Erwin Komen, Ted Sanders, Wilbert Spooren
Student assistants: Iris Hofstra, Patrick Sonsma
2. Coherence relations in discourse
A coherence relation (CR) consists of two discourse segments and, optionally, a connective
[S1 The temperature rose] [connective because] [S2 the sun was shining]
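The structure described on this slide can be sketched as a small data type. This is only an illustrative model, not ACAD's internal representation: the class name `CoherenceRelation` and its fields and methods are hypothetical names introduced for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CoherenceRelation:
    """Hypothetical model: two discourse segments and an optional connective."""
    s1: str
    s2: str
    connective: Optional[str] = None  # None means the relation is not marked

    def is_explicit(self) -> bool:
        # A relation is explicit when a connective signals it.
        return self.connective is not None

    def bracketed(self) -> str:
        # Render the relation in the bracket notation used on the slide.
        parts = [f"[S1 {self.s1}]"]
        if self.connective is not None:
            parts.append(f"[connective {self.connective}]")
        parts.append(f"[S2 {self.s2}]")
        return " ".join(parts)


example = CoherenceRelation("The temperature rose", "the sun was shining", "because")
print(example.bracketed())
```

Keeping the connective optional mirrors the definition above: a relation without a connective is still a relation between the same two segments.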
4. The advantage of using automatic analyses
• Less dependent on manual analyses
- higher reliability
- larger samples
- larger number of genres
ACAD: Automatic Coherence Analysis of Dutch
5. Goals of ACAD
• Build a search interface on the basis of existing CLARIAH components
- corpora like SoNaR, VU-DNC, CGN
- parsers like Alpino
- formats like Folia
- search facilities like CorpusStudio
• Make it possible for computationally uninitiated discourse analysts to formulate sophisticated search queries
- translated into XQuery in the backend
• Make analyses reproducible (and consequently more transparent)
• Extend the available corpora
- newspaper texts (NRC and NRC.nl) from different genres (hard news,
opinion, background stories) on related topics
- WhatsApp data from different age groups (13/14, 20-25)
12. ACAD: where do we go from here?
• Need for manuals to instruct the computationally uninitiated discourse analyst
• The potential of ACAD
- investigate other connectives (contrastive, conditional, additive)
- investigate other issues, e.g.,
  - prototypical positioning of various connectives (i.e., before both segments or between the two segments)
  - which omdat-segments have verb-second word order?
- investigate constructions rather than words
- investigate other languages
• Resulting corpora with CMDI metadata released for VLO
- Newspaper texts (NRC and NRC.nl) from different genres (hard news,
opinion, background stories) on related topics (2011)
- Two WhatsApp datasets of different age groups (13/14, 20-25)
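The question about the prototypical positioning of connectives (before both segments or between the two segments) can be illustrated with a minimal sketch. It assumes that the segments have already been identified as half-open token spans and that the token index of the connective is known. The function name `connective_position` and the span representation are hypothetical and do not reflect how ACAD or Cesar work.

```python
from typing import Tuple

Span = Tuple[int, int]  # half-open token span [start, end), an assumed representation


def connective_position(conn_index: int, seg_a: Span, seg_b: Span) -> str:
    """Classify where a connective sits relative to two discourse segments."""
    # Order the segments by their start position in the text.
    first, second = sorted([seg_a, seg_b])
    if conn_index < first[0]:
        return "before both segments"
    if first[1] <= conn_index < second[0]:
        return "between the segments"
    return "other"


# "The temperature rose because the sun was shining"
# tokens 0-2 form one segment, token 3 is the connective, tokens 4-7 the other
medial = connective_position(3, (0, 3), (4, 8))

# "Because the sun was shining , the temperature rose"
# token 0 is the connective, tokens 1-4 and 6-8 are the segments
initial = connective_position(0, (1, 5), (6, 9))

print(medial, "|", initial)
```

Counting these labels per connective over a corpus would give a first impression of its prototypical position. The segmentation itself is the hard part and is not addressed here.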