University Politehnica of Bucharest
Faculty of Automatic Control and Computers
Computer Science Department
A Tool for Discourse Analysis and Visualization
Costin-Gabriel Chiru and Ştefan Trăuşan-Matu
Content
• Introduction
• Theoretical Ideas
• The Application with the different views
• Conclusions
21.09.2012 A Tool for Discourse Analysis and Visualization EIDWT 2012
Introduction
• Purpose? Develop a method and a tool for analyzing and visualizing different types of discourse.
– Current analysis methods are biased towards one of the two types of text: narrations or conversations.
• How? By combining the cognitive and socio-cultural paradigms, using the concept of voice from the Polyphony Theory and ideas related to identifying polyphonic threads.
Theoretical Ideas
• Existing discourse analysis theories follow one of two paradigms:
– Cognitive paradigm (NLP): knowledge is situated in individuals’ minds (Hobbs, Grosz). It focuses on how utterances build up to create a hierarchically organized discourse. Problem: it fails to capture the complex interactions between these utterances.
– Socio-cultural paradigm: knowledge is socially constructed (Bakhtin, Vygotsky, Trausan-Matu). It focuses on collaboration aspects: “rather than speaking about ‘acquisition of knowledge,’ many people prefer to view learning as becoming a participant in a certain discourse” [Sfard, 2000]. It is better suited to today’s context of Web 2.0 and the wide use of chats, forums, blogs and wikis.
Polyphony & Inter-animation
• Bakhtin (1973) introduced the Polyphony Theory, stating that any text contains multiple voices that influence each other, which results in the inter-animation of the ideas they present.
• Voice = position taken by one or more of the participants
– Current implementations: a participant or an utterance.
– We consider a voice to be an idea that is rhythmically repeated.
• Voice identification is based on Tannen’s theory: “Repetition is a resource by which conversationalists together create a discourse, a relationship, a world.” We identify the words that can be used to express the same idea (through lexical chains); this works for both conversations and narrations.
The System
• Extracts the candidate voices from a discourse (lexical chains). Their importance determines which of them become voices, but the evaluation is left to the user.
• Extracts the paronyms of each concept (a different type of voice). These are also needed to counterbalance spelling errors and to identify rhetorical constructions (solution-resolution, presentation-representation, log-blog, etc.).
• Lets the user decide what to investigate: exact-repetition chains, chains of conceptually related words, paronym chains, or any combination of them.
• Can analyze both types of text. For visualization, it offers four different views of the discourse: view file (the default visualization method), a word-level representation of the voices, a sentence-level representation, and a visualization of the most important moments of the discourse.
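The slides do not say how paronyms are detected. As a minimal sketch, assuming that paronym candidates can be approximated by a small edit distance between word forms, the pairing step could look like the code below. The function names and the `max_distance` threshold are hypothetical and not part of the tool.

```python
from itertools import combinations


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed with a rolling one-row table."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute ca with cb
        prev = curr
    return prev[-1]


def paronym_candidates(words, max_distance=2):
    """Return sorted word pairs whose edit distance is in 1..max_distance.

    Assumption: a small edit distance is only a rough proxy for paronymy;
    it also pairs a word with its misspellings.
    """
    vocab = sorted({w.lower() for w in words})
    return [(a, b) for a, b in combinations(vocab, 2)
            if 0 < edit_distance(a, b) <= max_distance]


pairs = paronym_candidates(["solution", "resolution", "log", "blog", "cell"])
print(pairs)
```

With this threshold, the pairs quoted on the slide (solution-resolution, log-blog) are grouped, while an unrelated word such as "cell" is left alone. A real system would likely need a length-dependent threshold to avoid pairing short unrelated words.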
View File Visualization
[Figure: view-file visualization. Labels: repetitions, semantically related words, paronyms.]
Word-Level Representation (1)
[Figure: word-level representation. Callouts: the list of concepts and their frequency; the considered voices, each shown in a different color; the user can choose one or more voices and visualize their flow in the text.]
Word-Level Representation (2)
• Allows identifying:
– which voice is stronger and in what areas (“cell” is stronger than all the other voices);
– whether a voice is present in the whole text (e.g. “cell”) or is a local artifact (e.g. “genome”);
– whether a voice is more or less focused: for example, “rna” and “cytoplasm”, each with exactly 7 occurrences, can be differentiated by the higher density (rhythmicity) of “rna”;
– the voices that “work” together, i.e. that are found in the same areas of the text (e.g. “cell” and “cytoplasm”), and the ones that exclude each other, i.e. that are not found in the same areas of the discourse (e.g. “rna” and “genome”);
– collocations: concepts that occur together but cannot be related to each other (e.g. “world war”, “cold war”).
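The distinction between equally frequent voices can be illustrated with a small sketch. It assumes that "density" means occurrences per token over the span between a voice's first and last occurrence, and that "global" versus "local" can be read from the share of the text that span covers. Both definitions, the function names, and the positions in the example are assumptions made for illustration, not the tool's actual measures or data.

```python
def voice_positions(tokens, voice_terms):
    """Indices of the tokens that belong to the voice (its lexical chain)."""
    terms = {t.lower() for t in voice_terms}
    return [i for i, tok in enumerate(tokens) if tok.lower() in terms]


def density(positions):
    """Occurrences per token between the first and last occurrence."""
    if len(positions) < 2:
        return 0.0
    span = positions[-1] - positions[0] + 1
    return len(positions) / span


def coverage(positions, n_tokens):
    """Share of the text spanned by the voice (close to 1.0 = global)."""
    if not positions:
        return 0.0
    return (positions[-1] - positions[0] + 1) / n_tokens


tokens = "the cell has rna and the cell wall".split()
cell_positions = voice_positions(tokens, ["cell"])

# Made-up positions: two voices with 7 occurrences each in a 250-token text.
focused = [10, 12, 15, 18, 20, 23, 25]
scattered = [5, 40, 80, 120, 160, 200, 240]
print(density(focused), density(scattered))
print(coverage(focused, 250), coverage(scattered, 250))
```

Both made-up voices have the same frequency, but the first is concentrated in a short stretch (higher density, low coverage) while the second is spread over almost the whole text.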
Word-Level Representation (3)
“Cold war” is a collocation. Of its two words, “war” fits in the context (“cold” is only a modifier), since it appears in areas where “cold” is not present, whereas “cold” does not appear without “war”.
Sentence-Level Representation (1)
• Presents the distribution of the sentences that contain the voices considered important by the user.
• Inspired by the view offered by Google Books, but enhanced with options to:
– make multiple searches (multiple voices) in order to compare them;
– observe the semantically related terms or paronyms as well.
• Utility: detecting the rhythmicity of a given voice; identifying the voices that are stronger or more focused than others and their types (global or local); identifying the voices that work together and the ones that exclude each other; identifying the moments of shifting from one topic to another; identifying topic drifts (off-topic areas).
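A sentence-level distribution of this kind can be sketched as follows. The sentence splitter is deliberately naive, and the function names, the `voices` mapping (voice name to the set of terms that express it), and the sample text are all hypothetical; the slides do not describe the tool's own implementation.

```python
import re


def split_sentences(text):
    """Naive splitter: breaks after '.', '!' or '?' followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def sentence_distribution(text, voices):
    """Map each voice name to the indices of the sentences that contain it."""
    result = {name: [] for name in voices}
    for idx, sentence in enumerate(split_sentences(text)):
        words = set(re.findall(r"[a-z']+", sentence.lower()))
        for name, terms in voices.items():
            if words & {t.lower() for t in terms}:
                result[name].append(idx)
    return result


text = ("The cell stores its genome in the nucleus. "
        "RNA leaves the nucleus. "
        "The cytoplasm of the cell hosts the ribosomes.")
voices = {"cell": {"cell", "cells"}, "rna": {"rna"}}
dist = sentence_distribution(text, voices)
print(dist)

# Voices "work together" where their sentence sets overlap.
shared = set(dist["cell"]) & set(dist["rna"])
```

Comparing the index lists of two voices gives a rough indication of whether they share areas of the text or exclude each other; in this toy text the two voices never share a sentence.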
Sentence-Level Representation (2)
1. Stronger (“Christian” > “biblical”)
2. More focused (“biblical” > “Allah”)
3. Type: global (“Christian”, “humans”, “creation”, “life”) or local (“biblical”, “Allah”, “Muslim”, “Satanism”)
4. Voices that work together (“Allah” and “Muslim”) and voices that exclude each other (“Muslim”, “biblical”, “Satanism” and “Buddha”)
5. Moments of shifting from one topic to another (“Satanism”, “Muslim”, and “biblical”)
6. Disambiguation
Visualization of the Most Important Moments of the Discourse
• These moments are a consequence of the interactions that can be observed between different voices.
• We started from the important voices and investigated the areas where these voices inter-animate (influence each other or co-participate in the meaning of the utterance). These areas are the important moments of the discourse.
• Analyzing the observed types of interactions, we propose a classification of these moments into 5 classes:
– Pivotal moments (one voice substitutes the other),
– Moments of convergence (all voices die out),
– Singular moments (all voices but one die out),
– Moments of divergence (multiple voices meet and then are present in different areas), and
– Meeting points (multiple voices are constantly debating throughout the discourse).
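One possible reading of these classes can be written down as rules over the sets of voices that are active in two adjacent areas of the discourse. This is only an illustrative sketch: the slides give no detection algorithm, the function name and the set-based rules are assumptions, and divergence is left out because it needs more than two adjacent areas to be recognized.

```python
def classify_transition(before, after):
    """Label the change between two adjacent areas of a discourse.

    `before` and `after` are the sets of voices active in each area.
    The rules are a hypothetical reading of the class descriptions,
    not the tool's actual algorithm.
    """
    before, after = set(before), set(after)
    if len(before) >= 2 and not after:
        return "convergence"    # all voices die out
    if len(before) >= 2 and len(after) == 1 and after <= before:
        return "singular"       # all voices but one die out
    if len(before) == 1 and len(after) == 1 and before != after:
        return "pivotal"        # one voice substitutes the other
    if len(before) >= 2 and before == after:
        return "meeting point"  # the same voices keep debating
    return None                 # no moment recognized by these rules


print(classify_transition({"chat", "wiki"}, {"wiki"}))
print(classify_transition({"blog"}, {"forum"}))
```

How the discourse is cut into areas, and when a voice counts as active in an area, are left open here; both choices would strongly affect the moments that are found.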
System Evaluation
• The results of this system depend heavily on the purpose of its use and on the voices considered important by the user.
• Used to analyze CSCL chats in which 4 participants debated which tool is best for collaborative learning (chat, blog, forum, wiki) (Trausan-Matu & Rebedea, 2010; Chiru et al., 2011).
• The conversations were assessed by two CSCL experts and then automatically evaluated with our system, considering the inter-animation of the voices of chat, blog, forum and wiki.
        Convergence  Singular  Meeting  Reviewer 1  Reviewer 2  Average
        moments      moments   points                           reviews
Chat 1  0            1         5        7.8         7.7         7.75
Chat 2  1            0         15       10          9.3         9.65
Chat 3  0            0         8        9           8.9         8.95
Chat 4  1            0         7        8.4         8.6         8.5
Chat 5  0            0         11       9           9           9
Results Interpretation
• The chats that the experts considered good had more important moments than the others.
• The number of meeting points seems to be a good estimator of conversation quality: this criterion alone ranks the chats in the same way as the experts did.
• Differences between the grades given by the experts and the number of meeting points found in the chats signal that this cannot be the only criterion: the other important moments that were identified have their own role in this evaluation.
• Their importance depends on the task of the analyzed discourse:
– if a solution is needed at the end of a discourse, the convergence moments should have higher importance;
– if the best solution from multiple options is sought, then the singular moments should receive special treatment.
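The ranking claim can be checked directly against the table of results. The sketch below takes the meeting-point counts and the average review grades of the five chats and compares the two rankings with a Spearman rank correlation. The choice of this statistic and the helper names are assumptions made for this check; the slides only state that the rankings agree.

```python
def ranks(values):
    """Rank values from highest (rank 1) to lowest; assumes no ties."""
    order = sorted(values, reverse=True)
    return [order.index(v) + 1 for v in values]


def spearman(xs, ys):
    """Spearman rank correlation for tie-free data."""
    n = len(xs)
    d2 = sum((rx - ry) ** 2 for rx, ry in zip(ranks(xs), ranks(ys)))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))


meeting_points = [5, 15, 8, 7, 11]            # Chat 1..5, from the table
average_review = [7.75, 9.65, 8.95, 8.5, 9]   # Chat 1..5, from the table

rho = spearman(meeting_points, average_review)
print(ranks(meeting_points), ranks(average_review), rho)
```

Both columns order the chats as Chat 2, Chat 5, Chat 3, Chat 4, Chat 1, so the correlation is 1.0. With only five chats this agreement is suggestive rather than conclusive.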
Conclusions (1)
• We have built an application for discourse analysis and visualization that is:
– an adaptation of the Polyphony Theory, since it is based on the voice concept;
– flexible, since the user can select what information is shown;
– domain independent (examples from fields such as history, religion, CSCL, or medicine);
– language independent, as long as there is a means to extract the voices from the discourse.
Conclusions (2)
• This application could be used at both:
– Inter-document level (for comparing discourses from a corpus that debate the same topics) and
– Intra-document level, for:
• evaluating the “strength” of different voices in the discourse;
• evaluating how focused these voices are;
• determining their types: local or global;
• finding the voices that can/cannot be used in the same areas of discourse;
• finding the areas where topic drift is present;
• disambiguating polysemous words by considering the context provided by the voices found in their vicinity;
• identifying the most important moments of a discourse, which could also give information about the areas where specific topics are debated and about the collocations, syntagms and idioms encountered in that discourse.
Questions
Thank you!

 
Hot Sexy call girls in Moti Nagar,🔝 9953056974 🔝 escort Service
Hot Sexy call girls in  Moti Nagar,🔝 9953056974 🔝 escort ServiceHot Sexy call girls in  Moti Nagar,🔝 9953056974 🔝 escort Service
Hot Sexy call girls in Moti Nagar,🔝 9953056974 🔝 escort Service
 
Gas_Laws_powerpoint_notes.ppt for grade 10
Gas_Laws_powerpoint_notes.ppt for grade 10Gas_Laws_powerpoint_notes.ppt for grade 10
Gas_Laws_powerpoint_notes.ppt for grade 10
 
TOTAL CHOLESTEROL (lipid profile test).pptx
TOTAL CHOLESTEROL (lipid profile test).pptxTOTAL CHOLESTEROL (lipid profile test).pptx
TOTAL CHOLESTEROL (lipid profile test).pptx
 
Analytical Profile of Coleus Forskohlii | Forskolin .pdf
Analytical Profile of Coleus Forskohlii | Forskolin .pdfAnalytical Profile of Coleus Forskohlii | Forskolin .pdf
Analytical Profile of Coleus Forskohlii | Forskolin .pdf
 
Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |
Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |
Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |
 
THE ROLE OF PHARMACOGNOSY IN TRADITIONAL AND MODERN SYSTEM OF MEDICINE.pptx
THE ROLE OF PHARMACOGNOSY IN TRADITIONAL AND MODERN SYSTEM OF MEDICINE.pptxTHE ROLE OF PHARMACOGNOSY IN TRADITIONAL AND MODERN SYSTEM OF MEDICINE.pptx
THE ROLE OF PHARMACOGNOSY IN TRADITIONAL AND MODERN SYSTEM OF MEDICINE.pptx
 
Dashanga agada a formulation of Agada tantra dealt in 3 Rd year bams agada tanta
Dashanga agada a formulation of Agada tantra dealt in 3 Rd year bams agada tantaDashanga agada a formulation of Agada tantra dealt in 3 Rd year bams agada tanta
Dashanga agada a formulation of Agada tantra dealt in 3 Rd year bams agada tanta
 
Volatile Oils Pharmacognosy And Phytochemistry -I
Volatile Oils Pharmacognosy And Phytochemistry -IVolatile Oils Pharmacognosy And Phytochemistry -I
Volatile Oils Pharmacognosy And Phytochemistry -I
 
Bentham & Hooker's Classification. along with the merits and demerits of the ...
Bentham & Hooker's Classification. along with the merits and demerits of the ...Bentham & Hooker's Classification. along with the merits and demerits of the ...
Bentham & Hooker's Classification. along with the merits and demerits of the ...
 
Transposable elements in prokaryotes.ppt
Transposable elements in prokaryotes.pptTransposable elements in prokaryotes.ppt
Transposable elements in prokaryotes.ppt
 

A tool for discourse visualization and analysis

  • 1. A Tool for Discourse Analysis and Visualization • Costin-Gabriel Chiru & Ştefan Trăuşan-Matu (author and scientific supervisor) • Universitatea Politehnica București, Faculty of Automatic Control and Computers, Department of Computers
  • 2. Content • Introduction • Theoretical Ideas • The Application with the different views • Conclusions 21.09.2012 A Tool for Discourse Analysis and Visualization EIDWT 2012
  • 3. Introduction • Purpose? Develop a method and a tool for analyzing and visualizing different types of discourse. – Current analysis methods are biased towards one of two types of text: narrations or conversations. • How? By combining the cognitive and socio-cultural paradigms, using the concept of voice from the Polyphonic Theory and ideas related to identifying polyphonic threads.
  • 4. Theoretical Ideas • Existing theories of discourse analysis follow one of two paradigms: – Cognitive paradigm (NLP) – knowledge is situated in individuals’ minds (Hobbs, Grosz); it focuses on how utterances build up to create a hierarchically organized discourse. Problem: it fails to capture the complex interactions between these utterances. – Socio-cultural paradigm – knowledge is socially constructed (Bakhtin, Vygotsky, Trausan-Matu); it focuses on collaboration aspects – “rather than speaking about ‘acquisition of knowledge,’ many people prefer to view learning as becoming a participant in a certain discourse” [Sfard, 2000] – and is better suited today, in the context of Web 2.0 and the wide use of chats, forums, blogs and wikis.
  • 5. Polyphony & Inter-animation • Bakhtin (1973) introduced the Polyphony Theory, stating that in any text there are multiple voices that influence each other → inter-animation of the ideas presented by them. • Voice = position taken by one or more of the participants – Current implementations: a participant or an utterance. – We considered that a voice = an idea that is rhythmically repeated. • Voice identification – based on Tannen’s Theory: “Repetition is a resource by which conversationalists together create a discourse, a relationship, a world.” – identification of the words that can be used to express the same idea (through lexical chains) – works for both conversations and narrations.
  • 6. The System • Extracts the candidate voices from a discourse (lexical chains) – their importance determines which of them become voices, but the evaluation is left to the user. • Extracts the paronyms of each concept (a different type of voice) – also needed to counterbalance spelling errors and to identify rhetorical constructions (solution-resolution, presentation-representation, log-blog, etc.). • Lets the user decide what needs to be investigated: chains of exact repetitions, chains of conceptually related words, chains of paronyms, or any combination of them. • Analyzes both types of text; for visualization it offers four different views of the discourse: view file (the default visualization method), a word-level representation of the voices, a sentence-level representation, and a visualization of the most important moments of the discourse.
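The slides do not say how paronyms are detected. As a minimal sketch, assuming that paronyms can be approximated by a small edit distance between word forms, the following Python code pairs up words that differ by at most two character edits. The function names and the `max_distance` threshold are illustrative assumptions, not part of the presented system.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed with a rolling one-row table."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def paronym_pairs(words, max_distance=2):
    """Return pairs of distinct words within max_distance edits.

    max_distance is a hypothetical threshold chosen for this example.
    """
    vocab = sorted({w.lower() for w in words})
    pairs = []
    for i, w1 in enumerate(vocab):
        for w2 in vocab[i + 1:]:
            if edit_distance(w1, w2) <= max_distance:
                pairs.append((w1, w2))
    return pairs


# The three example pairs come from the slide; "wiki" is an unrelated control word.
words = ["solution", "resolution", "presentation", "representation",
         "log", "blog", "wiki"]
print(paronym_pairs(words))
# [('blog', 'log'), ('presentation', 'representation'), ('resolution', 'solution')]
```

A fixed threshold is crude for short words (almost any two three-letter words are within two edits), so a real implementation would probably scale the threshold with word length.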
  • 7. View File Visualization • [Figure: view-file visualization of a text; highlighted categories: semantically related words, repetitions, paronyms.]
  • 8. Word-Level Representation (1) • Allows the user to choose one or more voices and to visualize their flow in the text. • [Figure labels: concepts and their frequency; considered voices, each voice shown in a different color.]
  • 9. Word-Level Representation (2) • Allows identifying: – which voice is stronger and in what areas (“cell” is stronger than all the other voices); – whether a voice is present in the whole text (e.g. “cell”) or is a local artifact (e.g. “genome”); – whether a voice is more or less focused: for example, “rna” and “cytoplasm”, each having exactly 7 occurrences, can be differentiated by the higher density (rhythmicity) of “rna”; – the voices that “work” together, i.e. that are found in the same areas of the text (e.g. “cell” and “cytoplasm”), and the ones that exclude each other, i.e. that are not found in the same areas of the discourse (e.g. “rna” and “genome”); – collocations – concepts that work together but cannot be related to each other (e.g. “world war”, “cold war”).
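The properties listed on this slide (focus, global versus local presence, voices that work together or exclude each other) can be approximated from the token positions of each voice. The Python sketch below is only one possible formalization: the measures, the ten-segment split and the example positions are assumptions made for illustration, not values or formulas taken from the slides.

```python
def density(positions):
    """Occurrences per token between the first and last occurrence of a voice."""
    if len(positions) < 2:
        return 0.0
    span = max(positions) - min(positions) + 1
    return len(positions) / span


def occupied_bins(positions, text_length, n_bins=10):
    """Indices of the equal-sized text segments in which the voice occurs."""
    return {min(p * n_bins // text_length, n_bins - 1) for p in positions}


def coverage(positions, text_length, n_bins=10):
    """Fraction of segments containing the voice (high = global, low = local)."""
    return len(occupied_bins(positions, text_length, n_bins)) / n_bins


def co_occurrence(pos_a, pos_b, text_length, n_bins=10):
    """Jaccard overlap of occupied segments (0.0 = the voices exclude each other)."""
    a = occupied_bins(pos_a, text_length, n_bins)
    b = occupied_bins(pos_b, text_length, n_bins)
    return len(a & b) / len(a | b) if a | b else 0.0


# Made-up token positions in a 1000-token text; both voices occur 7 times.
rna = [100, 104, 109, 115, 120, 126, 130]
cytoplasm = [10, 150, 300, 420, 560, 700, 880]

print(density(rna) > density(cytoplasm))   # True: "rna" is the more focused voice
print(coverage(rna, 1000))                 # 0.1 -> local
print(coverage(cytoplasm, 1000))           # 0.7 -> global
```

Two voices with equal frequency are thus separated by how tightly their occurrences cluster, which is the distinction the slide draws between “rna” and “cytoplasm”.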
  • 10. Word-Level Representation (3) • “Cold war” is a collocation. Of its two words, “war” is the one that fits the context (“cold” is only a modifier), since “war” appears in areas where “cold” is not present, while the opposite does not happen.
  • 11. Sentence-Level Representation (1) • Presents the distribution of the sentences that contain the voices considered important by the user. • Inspired by the view offered by Google Books, but enhanced with the options to: – make multiple searches (multiple voices) in order to compare them; – observe the semantically related terms or paronyms as well. • Utility: detect the rhythmicity of a given voice; identify the voices that are stronger or more focused than others and their types (global or local); identify the voices that work together and the ones that exclude each other; identify the moments of shifting from one topic to another; identify topic drifts (off-topic areas).
  • 12. Sentence-Level Representation (2) • [Figure annotations:] 1. Stronger (“Christian” > “biblical”) 2. More focused (“biblical” > “Allah”) 3. Type – global (“Christian”, “humans”, “creation”, “life”) or local (“biblical”, “Allah”, “Muslim”, “Satanism”) 4. Voices that work together (“Allah” and “Muslim”) and voices that exclude each other (“Muslim”, “biblical”, “Satanism” and “Buddha”) 5. Moments of shifting from one topic to another (“Satanism”, “Muslim”, and “biblical”) 6. Disambiguation
  • 13. Visualization of the Most Important Moments of the Discourse • Is a consequence of the interactions that can be observed between different voices. • We started from the important voices and investigated the areas where these voices inter-animate (influence each other or co-participate in the meaning of the utterance) → the important moments of the discourse. • Analyzing the observed types of interactions, we propose a classification of these moments into 5 classes: – Pivotal moments (one voice substitutes the other), – Moments of convergence (all voices die out), – Singular moments (all voices but one die out), – Moments of divergence (multiple voices meet and then are present in different areas), and – Meeting points (multiple voices are constantly debating throughout the discourse).
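The slide defines the classes only informally. The Python sketch below is one possible reading of three of them (pivotal, convergence and singular moments), assuming the discourse has already been split into consecutive windows and that the set of active voices in each window is known. Moments of divergence and meeting points depend on longer stretches of text and are not covered here. All names and the example windows are hypothetical.

```python
def classify_moment(before, after):
    """Label the boundary between two adjacent windows by their active voices."""
    before, after = set(before), set(after)
    if len(before) > 1 and not after:
        return "convergence"   # all voices die out
    if len(before) > 1 and len(after) == 1 and after <= before:
        return "singular"      # all voices but one die out
    if len(before) == 1 and len(after) == 1 and before != after:
        return "pivotal"       # one voice substitutes the other
    return None                # no moment of these three kinds


def find_moments(windows):
    """Return (window index, label) for every boundary that gets a label."""
    moments = []
    for i in range(1, len(windows)):
        label = classify_moment(windows[i - 1], windows[i])
        if label:
            moments.append((i, label))
    return moments


# Made-up sequence of active voices per window.
windows = [{"chat", "forum"}, {"chat"}, {"wiki"}, {"wiki", "blog"}, set()]
print(find_moments(windows))
# [(1, 'singular'), (2, 'pivotal'), (4, 'convergence')]
```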
  • 14. System Evaluation • The results of this system depend very much on the purpose of its use and on the voices considered important by the user. • Used to analyze CSCL chats in which 4 participants debate which is the best tool for collaborative learning (chat, blog, forum, wiki) (Trausan-Matu & Rebedea, 2010; Chiru et al., 2011). • The conversations were assessed by two CSCL experts and then automatically evaluated with our system, considering the inter-animation of the voices of chat, blog, forum and wiki.

      Chat     Convergence moments   Singular moments   Meeting points   Reviewer 1   Reviewer 2   Average reviews
      Chat 1   0                     1                  5                7.8          7.7          7.75
      Chat 2   1                     0                  15               10           9.3          9.65
      Chat 3   0                     0                  8                9            8.9          8.95
      Chat 4   1                     0                  7                8.4          8.6          8.5
      Chat 5   0                     0                  11               9            9            9
  • 15. Results Interpretation • The chats that were considered good by the experts had more important moments than the others. • The number of meeting points seems to be a good estimator of conversation quality: this criterion alone ranks the chats in the same order as the experts did. • Differences between the grades given by the experts and the number of meeting points found in the chats signal that this cannot be the only criterion → the other important moments that were identified have their own role in this evaluation. • Their importance depends on the task of the analyzed discourse: – if a solution is needed at the end of a discourse, the convergence moments should have higher importance; – if the best solution among multiple options is sought, then the singular moments should receive special treatment.
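The claim that meeting points alone reproduce the experts' ranking can be checked directly against the figures in the System Evaluation table (slide 14). The short Python sketch below does so; the dictionary layout and function name are illustrative choices, while the numbers are copied from the table.

```python
# Meeting points and average expert review per chat, from the slide 14 table.
chats = {
    "Chat 1": {"meeting_points": 5,  "average_review": 7.75},
    "Chat 2": {"meeting_points": 15, "average_review": 9.65},
    "Chat 3": {"meeting_points": 8,  "average_review": 8.95},
    "Chat 4": {"meeting_points": 7,  "average_review": 8.5},
    "Chat 5": {"meeting_points": 11, "average_review": 9.0},
}


def ranking(key):
    """Chat names ordered from highest to lowest value of the given field."""
    return sorted(chats, key=lambda name: chats[name][key], reverse=True)


by_meeting_points = ranking("meeting_points")
by_experts = ranking("average_review")

print(by_meeting_points)                # ['Chat 2', 'Chat 5', 'Chat 3', 'Chat 4', 'Chat 1']
print(by_meeting_points == by_experts)  # True
```

The two orderings agree, but the gaps do not: Chat 5 and Chat 3 differ by three meeting points yet by only 0.05 in average grade, which matches the slide's remark that meeting points cannot be the only criterion.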
  • 16. Conclusions (1) • We have built an application for discourse analysis and visualization that is: – An adaptation of the Polyphony Theory, since it is based on the voice concept; – Flexible, since the user can select what information is shown; – Domain independent (examples from fields such as history, religion, CSCL, or medicine); – Language independent, as long as there is a means of extracting the voices from the discourse.
  • 17. Conclusions (2) • This application could be used both at: – Inter-document level (for comparing discourses from a corpus that debate the same topics) and – Intra-document level, for: • evaluating the “strength” of different voices in the discourse; • evaluating how focused these voices are; • determining their types: local or global; • finding the voices that can/cannot be used in the same areas of the discourse; • finding the areas where topic drift is present; • disambiguating polysemous words by considering the context provided by the voices found in their vicinity; • identifying the most important moments of a discourse, which could also give information about the areas where specific topics are debated and about the collocations, syntagms and idioms encountered in that discourse.
  • 18. Questions • Thank you!

Editor's Notes

  1. based on the idea that