Discourse Annotation for Arabic 3


Published in: Technology, Spiritual


  1. Imam University, College of Computer and Information Systems. Prepared by: Al-harbi.A, Al-Gumlas.H, Al-Otaibi.E
  2. Introduction: Discourse usually refers to a form of written text or spoken language. A text is not only a sequence of sentences or clauses; rather, it is a coherent object that has many cohesive devices linking its units (words, clauses and sentences).
  3. Discourse Relations: There are two types of discourse relations:
     (i) Relations that are signalled explicitly via so-called discourse connectives.
     (ii) Relations that can be inferred from the context without any signalling.
     Discourse relations are semantic relations.
  4. Discourse Relations
  5. Discourse Connective Types:
     • Simple connectives. Ex: because, after, and
     • Paired connectives. Ex: if - then, although -
     • Modified connectives. Ex: even if, and also
     • Combined connectives. Ex: except after, and but
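The four connective types on this slide can be sketched as a small lookup table. This is only an illustration: the names `ConnectiveType`, `LEXICON` and `connective_type` are hypothetical, the entries are just the English examples given on the slide, and storing a paired connective as a two-part tuple is an assumption made here, not something the slides specify.

```python
from enum import Enum
from typing import Optional


class ConnectiveType(Enum):
    SIMPLE = "simple"
    PAIRED = "paired"
    MODIFIED = "modified"
    COMBINED = "combined"


# Keys are tuples of connective parts: one part for most connectives,
# two parts for paired connectives (an assumption of this sketch).
# Only the complete examples from the slide are listed.
LEXICON = {
    ("because",): ConnectiveType.SIMPLE,
    ("after",): ConnectiveType.SIMPLE,
    ("and",): ConnectiveType.SIMPLE,
    ("if", "then"): ConnectiveType.PAIRED,
    ("even if",): ConnectiveType.MODIFIED,
    ("and also",): ConnectiveType.MODIFIED,
    ("except after",): ConnectiveType.COMBINED,
    ("and but",): ConnectiveType.COMBINED,
}


def connective_type(*parts: str) -> Optional[ConnectiveType]:
    """Return the type of a connective, or None if it is not in the lexicon."""
    # Normalise case and collapse repeated whitespace inside each part.
    key = tuple(" ".join(part.lower().split()) for part in parts)
    return LEXICON.get(key)
```

For example, `connective_type("if", "Then")` looks up the paired connective, while a word that is not in the table, such as `connective_type("however")`, returns `None`.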
  6. Related Work:
     • Several textual corpora of Arabic exist.
     • Some of them are available with part-of-speech and syntactic annotation, such as the Arabic Treebank (ATB). The Prague Arabic Dependency Treebank (PADT), which is smaller in scale than the ATB, contains multilevel annotations, including morphological and analytical levels of linguistic representation.
     • Also, a recent effort by Dukes and Habash (2010) has produced the Quranic Arabic Corpus, a free annotated linguistic resource which provides morphological annotation and syntactic analysis of the Holy Quran.
  7. Collecting Arabic Connectives: A large set of Arabic discourse connectives was collected using text analysis and corpus-based techniques.
     Example:
     A. [ ] DC [ ] Arg1, [ ] Arg2.
     B. [al-syārh mtṭwrh ǧdān.] Arg1 [lknhā] DC [bāhẓah alṯmn] Arg2
     C. [The car is so modern.] Arg1 [but] DC [it is too expensive] Arg2.
  8. Annotation Scheme: Annotation is based on lexicalized grammar theory.
     1. The anchor of the annotation is the lexical item: a discourse connective (DC).
     2. The Arg2 label is assigned to the argument with which the connective is syntactically associated.
     3. The Arg1 label can refer to an abstract object at any distance from the connective.
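One way to picture this scheme is as a record with three character spans: the connective, Arg1 and Arg2. The sketch below applies it to the English car example from the previous slide. The class and function names (`Span`, `DiscourseAnnotation`, `find_span`) are hypothetical, and storing character offsets is an assumption of this sketch, not a description of the actual annotation tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    start: int  # index of the first character
    end: int    # index one past the last character

    def text(self, source: str) -> str:
        return source[self.start:self.end]


@dataclass(frozen=True)
class DiscourseAnnotation:
    connective: Span  # the anchor of the annotation (DC)
    arg1: Span        # may lie at any distance from the connective
    arg2: Span        # the argument syntactically associated with the DC


def find_span(source: str, fragment: str) -> Span:
    """Locate the first occurrence of fragment; raises ValueError if absent."""
    start = source.index(fragment)
    return Span(start, start + len(fragment))


# The English example sentence from the slides.
text = "The car is so modern. but it is too expensive."
example = DiscourseAnnotation(
    connective=find_span(text, "but"),
    arg1=find_span(text, "The car is so modern."),
    arg2=find_span(text, "it is too expensive"),
)
```

Calling `example.connective.text(text)` recovers the connective string, and the two argument spans can be read back in the same way. Because Arg1 is stored as an independent span, nothing in the record forces it to be adjacent to the connective.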
  9. Theories of Discourse Structure:
     • Linguists have attempted to produce reasonable generalized theories to represent discourse structure.
     • Theories of discourse structure differ in their focus according to the type of discourse, such as written text or dialogue, and the type of organization, such as intentional organization (the speaker's plan).
     • One of the most popular discourse theories is Rhetorical Structure Theory (RST).
  10. RST:
     • RST is a theory of how coherence in text is achieved.
     • RST was originally developed as part of studies of computer-based text generation.
     • RST is designed to explain the coherence of texts, seen as a kind of function, linking parts of a text to each other.
  11. RST
     Relation Name | Nucleus | Satellite
     Background | text whose understanding is being facilitated | text for facilitating understanding
     Elaboration | basic information | additional information
     Preparation | text to be presented | text which prepares the reader to expect and interpret the text to be presented
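The relation table can also be held as data, so that each analysed pair of text units records which relation links its nucleus and satellite. The following is a minimal sketch: `RELATION_ROLES`, `RelationInstance` and `describe` are hypothetical names, the role descriptions are copied from the table, and the two sentences in the usage note are invented purely for illustration.

```python
from dataclasses import dataclass

# Relation name -> (role of the nucleus, role of the satellite),
# copied from the table on this slide.
RELATION_ROLES = {
    "Background": (
        "text whose understanding is being facilitated",
        "text for facilitating understanding",
    ),
    "Elaboration": ("basic information", "additional information"),
    "Preparation": (
        "text to be presented",
        "text which prepares the reader to expect and interpret "
        "the text to be presented",
    ),
}


@dataclass(frozen=True)
class RelationInstance:
    relation: str
    nucleus: str
    satellite: str

    def __post_init__(self) -> None:
        # Reject relation names that are not in the table.
        if self.relation not in RELATION_ROLES:
            raise ValueError(f"unknown relation: {self.relation}")


def describe(instance: RelationInstance) -> str:
    """Summarise an instance together with the roles from the table."""
    nucleus_role, satellite_role = RELATION_ROLES[instance.relation]
    return (
        f"{instance.relation}: nucleus ({nucleus_role}) = {instance.nucleus!r}; "
        f"satellite ({satellite_role}) = {instance.satellite!r}"
    )
```

As a usage example with invented sentences, `describe(RelationInstance("Elaboration", "The corpus is annotated.", "It covers news text."))` returns a one-line summary that pairs each unit with its role.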
  12. RST Example: With just those relations, we can illustrate the analysis of a text.
  13. Applications:
     • Question-Answering and Information Extraction systems
     • Speech Recognition
     • Text Generation
     • Essay Scoring
     • Text Summarization
  14. Discourse Annotation Tool for English and Arabic
  15. Thank you