Representing discourse and argumentation as an application of Web Science


Discourse on the Web currently cannot be represented appropriately, which hampers searching and querying. Based on insights from Web Science, DERI Galway has developed three different approaches for representing and mining discourse.

Published in: Education, Technology

  1. Representing discourse and argumentation as an application of Web Science
     Benjamin Heitmann, Dr. Conor Hayes
     Digital Enterprise Research Institute
     Digital Resources for the Humanities and Arts Conference 2009
     Copyright 2009 Digital Enterprise Research Institute. All rights reserved.
  2. Introduction
     • The Web mirrors most areas of today’s society (e.g. entertainment, science and humanities)
     • Current Web does not capture structure of critique, argumentation, interpretation
     • Representing types and granularity of discourse and links is necessary
     • DERI has 3 approaches to discourse representation
     • Foundation: Web Science as an interdisciplinary approach to understanding and engineering the Web (started by Tim Berners-Lee)
  3. Outline
     • Motivation: Knowledge representation techniques to enable more sophisticated searching and querying of discourse on the Web
     • Introducing Web Science: an interdisciplinary approach to understanding the Web and its evolution
     • Applying the Web Science method: three approaches for discourse representation
  4. Discourse and argumentation on the Web
     • The Web doesn't properly capture the dynamic argumentation structures in discourse
     • Current search only captures:
       – Plain text
       – General links
       – Citations
     • No search for:
       – Relations between concepts
       – Negative relations
       – Semantics of argumentation: argument, counter-argument; condition, evidence, solution
     [Figure labels: primary research, research paper, text, weblog, connected by "reference" links; elements Argument, Counter-argument, Evaluation, Motivation, Conclusion; annotation "publication frequency increases"]
  5. Representing the structure of discourse
     • Knowledge on the Web is not sufficiently connected
     • No standard vocabularies for representation of discourse structure and link granularity
     • Queries are unintuitive and imprecise, no negative queries
     • Links are untyped, and only on document level
     • No semantics of relationships
     Source: “Clickstream Data Yields High-Resolution Maps of Science,” Bollen, Van de Sompel, et al. PLoS ONE (2009)
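     The contrast between untyped document-level links and typed links between discourse elements can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Link` class, the relation names (`supports`, `disagrees_with`, `cites`) and the identifiers are hypothetical and are not taken from the slides or from any DERI vocabulary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    """A typed link between two discourse elements (hypothetical model)."""
    source: str    # element inside a document, e.g. "paperA#claim1"
    relation: str  # the link type; plain hyperlinks carry no such type
    target: str


# Hypothetical example data: links point at claims, not whole documents.
LINKS = [
    Link("paperA#claim1", "supports", "paperB#claim2"),
    Link("blogC#post7", "disagrees_with", "paperA#claim1"),
    Link("paperD#eval3", "cites", "paperA#claim1"),
]

# Assumed set of relation types that express a negative stance.
NEGATIVE_RELATIONS = {"disagrees_with"}


def links_to(target, relations=None, links=LINKS):
    """Return links pointing at `target`, optionally filtered by relation type."""
    return [
        link for link in links
        if link.target == target
        and (relations is None or link.relation in relations)
    ]


def critics_of(target, links=LINKS):
    """A 'negative query': which elements disagree with `target`?"""
    return [link.source for link in links_to(target, NEGATIVE_RELATIONS, links)]


if __name__ == "__main__":
    print(critics_of("paperA#claim1"))
```

     With untyped links, `links_to` could only report that two elements reference the claim. The relation type is what makes the question "who disagrees?" answerable.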
  6. Insights from Web Science
     The “Web Science” idea was started by Tim Berners-Lee and researchers from Southampton (see sources)
     1. Understanding the current Web requires an interdisciplinary and holistic view of the Web as a whole
     2. On the Web, engineering and social factors will influence each other and create a feedback loop
     3. Properties of the Web are based on emergent behaviour, which can be empirically measured
  7. A Systems-level view of the Web
     • Classical reductionist approach does not work
     • Understanding the current Web requires an interdisciplinary view
     • No delegation of research on one area to only one discipline
     [Figure © Web Science Research Initiative]
  8. The Web Science Process Model
     On the Web, engineering and social factors will influence each other.
     Increase in complexity: result is transition from micro to macro effects
     Source: CACM Web Science Article
     Example: Evolution of Blogs
     • Independent blogs: Track-backs, Comments, Spam
     • Twitter: Microblogging, HashTags, Location aware
     • Facebook: Lifestreaming, Privacy
     [Figure © Lawrence Lessig]
  9. Emergent properties of the Web
     • Empirical properties:
       – In- and out-degree distribution of links
       – Power laws
       – Growth: 7 million new pages a day in 2005
       Source: “Graph structure in the Web”, Broder, Kumar et al.
     • Emergent patterns:
       – Popular tags (folksonomies) on Web 2.0 sites
       – Emergence of an editorial elite on Wikipedia
     [Figure © Clay Shirky]
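     To make "in- and out-degree distribution of links" concrete, the following minimal sketch counts, for a toy link graph, how many pages have each in-degree and out-degree. The edge list and the function name `degree_distribution` are made up for illustration; on real Web crawls such histograms are reported to follow a power law, which this tiny example cannot show.

```python
from collections import Counter


def degree_distribution(edges):
    """Return (in_hist, out_hist) for a list of (source, target) links.

    Each histogram maps a degree k to the number of nodes with that degree.
    """
    in_degree = Counter(target for _, target in edges)
    out_degree = Counter(source for source, _ in edges)
    nodes = {node for edge in edges for node in edge}
    # Nodes that never appear as a target (or source) have degree 0.
    in_hist = Counter(in_degree.get(node, 0) for node in nodes)
    out_hist = Counter(out_degree.get(node, 0) for node in nodes)
    return dict(in_hist), dict(out_hist)


# Hypothetical toy graph: three pages link to one "hub" page.
EDGES = [("a", "hub"), ("b", "hub"), ("c", "hub"), ("a", "b")]

if __name__ == "__main__":
    in_hist, out_hist = degree_distribution(EDGES)
    print("in-degree histogram:", in_hist)
    print("out-degree histogram:", out_hist)
```

     Plotting such a histogram on log-log axes is the usual first check for a heavy-tailed distribution: a roughly straight line suggests, but does not prove, a power law.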
  10. Approaches for discourse representation
     • The Web Science method and discourse representation:
       – Interdisciplinary: theoretical foundation is based on Speech act theory and Language Game theory
       – Expect a feedback loop between Semantic Web solutions and usage patterns of community
     • Empirical approach: CORAAL: use knowledge extraction and integration on large data collections
     • Normative (engineering) approaches:
       – SIOC Argumentation vocabulary: light-weight and community-driven
       – SALT: annotation of argumentation semantics
  11. CORAAL: empirical discourse analysis
     • Knowledge extraction and integration
     • Pattern discovery
     • Use emergent patterns in large document collections
     • Go beyond text-based search:
       – Answer negative queries
       – Detect relations between concepts
     • Uses Natural Language Processing
     • No mark-up required
  12. [Figure: CORAAL screenshot of results for the search term “breast cancer”]
  13. SIOC argumentation vocabulary
     • Light-weight and informal
     • Express structure of argumentation:
       – Who is participating?
       – Where are the elements of the discourse distributed?
       – How are the elements connected?
     • Extensibility enables community involvement
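     The three questions above (who, where, how) can be illustrated with a small set of subject-predicate-object statements. This is a hedged sketch only: the predicate names (`created_by`, `hosted_on`, `argues_for`, `argues_against`) and the example data are hypothetical stand-ins and are not the actual terms of the SIOC argumentation vocabulary.

```python
# Hypothetical statements about three posts spread over two sites.
TRIPLES = [
    ("post1", "created_by", "alice"),
    ("post1", "hosted_on", "blog.example.org"),
    ("post2", "created_by", "bob"),
    ("post2", "hosted_on", "forum.example.org"),
    ("post2", "argues_against", "post1"),
    ("post3", "created_by", "alice"),
    ("post3", "hosted_on", "blog.example.org"),
    ("post3", "argues_for", "post1"),
]

# Predicates that describe a post itself rather than a link between posts.
DESCRIPTIVE = {"created_by", "hosted_on"}


def objects(predicate, triples=TRIPLES):
    """Distinct objects of a predicate, sorted for stable output."""
    return sorted({o for _, p, o in triples if p == predicate})


def connections(triples=TRIPLES):
    """Typed links between discourse elements, in input order."""
    return [(s, p, o) for s, p, o in triples if p not in DESCRIPTIVE]


if __name__ == "__main__":
    print("Who is participating?", objects("created_by"))
    print("Where is the discourse distributed?", objects("hosted_on"))
    print("How are the elements connected?", connections())
```

     Keeping the statements as flat triples reflects the light-weight, extensible style the slide describes: a community could add a new predicate without changing the query functions.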
  14. [Figure: SIOC argumentation vocabulary]
  15. SALT: Semantically Annotated LaTeX
     • Enables mark-up of documents for claim identification
     • Exposes the semantics of the argumentation. Examples:
       – Claims, explanations
       – Rhetorical structure (abstract, contribution, evaluation)
       – Argument, counter-argument
     • Creates PDF with content and structure
  16. [Figure: discourse representation in SALT]
  17. Summary
     • Representing discourse allows intuitive querying and searching of the argumentation semantics
     • The Web Science method provides insights into representing discourse:
       – Use interdisciplinary approach
       – Expect feedback loop between technical and social factors
       – Detect emergent properties and patterns
     • Three approaches at DERI for representing discourse:
       – CORAAL: empirical, knowledge extraction + integration
       – SIOC argumentation vocabulary: light-weight, bottom-up
       – SALT: annotate argumentation semantics in publications
  18. Questions? and Sources!
     These slides:
     • Web Science: “Web science: an interdisciplinary approach to understanding the web”, Hendler, Shadbolt, Hall, Berners-Lee, Weitzner, Communications of the ACM (2008)
     • CORAAL: demo at “CORAAL-Dive into publications, Bathe in the Knowledge,” Novacek, Groza, et al., Journal of Web Semantics, Elsevier (2009)
     • SIOC argumentation vocabulary: “Expressing Argumentative Discussions in Social Media Sites”, Lange, Bojars, et al., Workshop on Social Data on the Web at the International Semantic Web Conference (2008)
     • SALT: “SALT-Semantically Annotated LaTeX for Scientific Publications,” Groza, Handschuh, et al., European Semantic Web Conference (2007)