Interaction-level relations for Opinion Analysis
               Putting forth the benefits of Textometry

               Sentiment Analysis Symposium 2011
               Manhattan Conference Center, New York, USA



                Marguerite Leenhardt - PhD Student in Applied Linguistics, NLP, Textometry
                SYLED/CLA2T - Paris 3 University
                mleenhardt@le-semiopole.fr




                                                                                                                                April 12th, 2011

Tuesday, April 12, 2011
TEXTOMETRY?

              - a branch of the statistical study of linguistic data

              - the text is considered to possess its own internal structure

              - bypasses the information extraction step (qualitative coding)

                 > applies statistical and probabilistic calculations to the units that make up comparable texts in a corpus
                 > mostly based on the hypergeometric model and proximity algorithms
                 > reveals structures that would otherwise remain hidden because of the quantity of data

              - a robust method that processes data without depending on external resources (lexicons, dictionaries, ontologies)

              - analyzes the distribution of objects within the corpus framework

              TWOFOLD TEXT SEGMENTATION PROCESS GENERATES THE DATASET'S CANVAS/FRAMEWORK

              - CONTENTS: textual sequences organized in sentences, paragraphs, ...

              - CONTAINERS: annotation systems (e.g. sentence or paragraph segmentation markers, considered a specific type of annotation on contents)

              [Diagram: a CORPUS whose elements are labeled a., b., c. and d.]
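The hypergeometric model mentioned above can be illustrated with a minimal sketch of a specificity test: given a form that occurs F times in a corpus of T tokens, how surprising is it to find it f times in a part of t tokens? The function names, the 0.05 threshold and the counts are assumptions made for illustration; they do not come from the slides or from any particular textometry tool.

```python
from math import comb


def hypergeom_pmf(k: int, T: int, F: int, t: int) -> float:
    """P(X = k) when t tokens are drawn without replacement from a corpus
    of T tokens in which the form occurs F times."""
    return comb(F, k) * comb(T - F, t - k) / comb(T, t)


def specificity(f: int, T: int, F: int, t: int, threshold: float = 0.05):
    """Classify an observed sub-frequency f as over-used ('+'),
    under-used ('-') or unremarkable ('banal') in a part of t tokens.
    The threshold is an assumed, adjustable cut-off."""
    k_min = max(0, t - (T - F))   # smallest possible count in the part
    k_max = min(F, t)             # largest possible count in the part
    p_upper = sum(hypergeom_pmf(k, T, F, t) for k in range(f, k_max + 1))
    p_lower = sum(hypergeom_pmf(k, T, F, t) for k in range(k_min, f + 1))
    if p_upper < threshold and p_upper <= p_lower:
        return "+", p_upper
    if p_lower < threshold:
        return "-", p_lower
    return "banal", min(p_upper, p_lower)


# Hypothetical counts: a form seen 20 times in a 1,000-token corpus,
# 8 of them inside a 100-token part (about 2 would be expected by chance).
label, p = specificity(f=8, T=1000, F=20, t=100)
print(label, p)
```

No lexicon or ontology is involved: the test only needs the frequencies of the units and the sizes of the parts, which is what makes the approach independent of external resources.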




IDENTIFYING MAJOR TRENDS AND OPPOSITIONS IN A DATASET

              - Cocoon corpus: online media analysis following a product launch (40,000 words)

              - Factorial Correspondence Analysis is used to determine the distance between textual objects, compared on the basis of a proximity algorithm (positioning sets of elements in the corpus space)

              - The closest objects cite the press release heavily; blogs cite Named Entities (brand and product) but diverge from the press release.

              [Figure: FCA (French: AFC) output comparing users' comments on different web platforms; French corpus]
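As a rough illustration of how correspondence analysis positions textual objects in the corpus space, the sketch below runs a basic correspondence analysis with NumPy on a small sources-by-forms frequency table. The table values and source names are invented for the example, and this is a textbook formulation (SVD of standardized residuals), not the software used for the figure.

```python
import numpy as np


def correspondence_analysis(counts: np.ndarray):
    """Return (row principal coordinates, total inertia) for a
    texts x word-forms contingency table."""
    P = counts / counts.sum()              # correspondence matrix
    r = P.sum(axis=1)                      # row masses (texts)
    c = P.sum(axis=0)                      # column masses (forms)
    # Standardized residuals from the independence model r c^T
    S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))
    U, s, _ = np.linalg.svd(S, full_matrices=False)
    rows = (U * s) / np.sqrt(r)[:, None]   # principal coordinates
    return rows, float((s ** 2).sum())


# Hypothetical frequencies of four forms in three sources.
sources = ["press_release", "news_site", "blog"]
table = np.array([
    [20, 10, 0, 2],    # press release
    [40, 20, 0, 4],    # news site: same profile as the press release
    [5, 3, 12, 9],     # blog: diverges from the press release
], dtype=float)

coords, inertia = correspondence_analysis(table)
# Euclidean distances between rows in the full principal space
# equal the chi-square distances between their profiles.
dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
print(np.round(dist, 3))
```

In this toy table the news site reuses the press release's vocabulary in the same proportions, so the two rows coincide, while the blog sits far from both.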


INTERACTION-LEVEL RELATIONS: WHY?

              - textual interactions are the main material for Opinion Mining/Sentiment Analysis

              - contextual analysis is an important challenge (Pang & Lee, 2008) and a major resource for interpretation (Somasundaran, 2010): interactional features are informative on a global scale (discourse ≠ interaction)

              - Textometry is a means of going beyond local context boundaries by taking global dimensions into account: the text is considered a component in and of itself (bottom-up approach)

              - "A lot of information is often not captured in the handbuilt model and lost." (Boiy et al., 2007)

              - qualitative coding should not be the first approach but a second step, after mining corpus-based knowledge




INTERACTION-LEVEL RELATIONS: HOW?

     Annotating interactional relations between users' contributions in a given discussion
     > linking and specifying containers

     > Corpus enhanced with qualitative information
     > Acquiring information on the context: conversational tree
     > Determining zones of intensity in a discussion feed (computer-assisted task)

     > Analyzing the linguistic specificity of linked containers vs. the whole corpus
     > Building corpus-driven linguistic resources (textometric objects)

     - Named Entity Recognition + matching paraphrases
     - Corpus-driven lexical resource (LR) for thematic analysis
     - Corpus-driven lexical resource (LR) for opinion
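One way to picture the linking of containers into a conversational tree is the sketch below. It stores each contribution with a pointer to the contribution it replies to, then flags "zones of intensity" as sub-threads with at least a given number of contributions. The `Contribution` class, the `min_size` cut-off and the size-based notion of intensity are assumptions for illustration; the slides do not specify how intensity is measured.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class Contribution:
    cid: str                 # container identifier
    parent: Optional[str]    # cid of the contribution replied to, None for a thread starter
    text: str                # content held by the container


def build_tree(contribs):
    """Map each contribution id to the ids of its direct replies."""
    children = defaultdict(list)
    for c in contribs:
        if c.parent is not None:
            children[c.parent].append(c.cid)
    return children


def subtree_size(children, cid):
    """Count a contribution plus every reply below it."""
    return 1 + sum(subtree_size(children, kid) for kid in children.get(cid, []))


def intensity_zones(contribs, min_size=3):
    """Return the contributions heading a sub-thread of at least min_size items."""
    children = build_tree(contribs)
    sizes = {c.cid: subtree_size(children, c.cid) for c in contribs}
    return {cid: n for cid, n in sizes.items() if n >= min_size}


# Hypothetical discussion feed with two threads.
feed = [
    Contribution("c1", None, "First post about the launch"),
    Contribution("c2", "c1", "Reply to c1"),
    Contribution("c3", "c2", "Reply to c2"),
    Contribution("c4", "c2", "Another reply to c2"),
    Contribution("c5", None, "Unrelated post"),
    Contribution("c6", "c5", "Reply to c5"),
]
zones = intensity_zones(feed)
print(zones)  # {'c1': 4, 'c2': 3}
```

The linked containers returned here are the natural candidates for a specificity comparison against the whole corpus.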




PROJECTING THE CORPUS-DRIVEN LINGUISTIC RESOURCES FOR OPINION

              - Cocoon corpus: the LR is projected onto the dataset's canvas/framework to highlight the distribution of opinions among UGCs (an adaptation of the Appraisal Theory scale for opinion orientation)

              - A Distributional Inventory is used to identify major trends in opinion expression; here, most UGCs are not relevant, as they only cite the brand in congratulation messages to the bloggers who posted about the product launch.

              [Figure: opinion distribution among users' comments]
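Projecting a lexical resource onto the containers can be sketched as follows: each UGC is tokenized, its tokens are looked up in the LR, and the resulting orientation labels are tallied over the whole set. The lexicon entries, the label set (`positive`, `negative`, `mixed`, `none`) and the sample comments are invented for the example; they stand in for the corpus-driven LR and the adapted Appraisal Theory scale, which the slides do not detail.

```python
import re
from collections import Counter

# Hypothetical opinion lexicon: form -> orientation label.
LEXICON = {
    "love": "positive",
    "great": "positive",
    "disappointing": "negative",
    "expensive": "negative",
}


def project(lexicon, text):
    """Count the orientation labels of the lexicon forms found in one UGC."""
    tokens = re.findall(r"\w+", text.lower())
    return Counter(lexicon[t] for t in tokens if t in lexicon)


def orientation(hits):
    """Reduce the hits of one UGC to a single label."""
    if not hits:
        return "none"          # no opinion form matched
    pos, neg = hits["positive"], hits["negative"]
    if pos > neg:
        return "positive"
    if neg > pos:
        return "negative"
    return "mixed"


def distribution(lexicon, ugcs):
    """Tally orientation labels over all UGCs."""
    return Counter(orientation(project(lexicon, text)) for text in ugcs)


# Hypothetical user comments.
ugcs = [
    "Congrats on the post about the launch!",
    "I love the design, great idea.",
    "Too expensive and frankly disappointing.",
    "Great concept but expensive.",
    "Thanks for sharing, nice blog!",
]
result = distribution(LEXICON, ugcs)
print(dict(result))
```

Comments that match no form of the lexicon end up under `none`, which mirrors the observation that congratulation messages carry no opinion on the product itself.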




«I» NETWORK IN THE ORANGE CORPUS




«FORFAIT» IN THE ORANGE CORPUS




ORANGE LEXICO-SEMANTIC NETWORK




Thank you!
           Marguerite Leenhardt PhD student
           mleenhardt@le-semiopole.fr



