Semi-supervised vs. Cross-domain Graph-based Learning for Sentiment Classification
Natalia Ponomareva
Statistical Cybermetrics Research Group, University of Wolverhampton, UK

December 21, 2013
Natalia Ponomareva (SCRG, UoW) Semi-supervised vs. Cross-domain Graphs December 21, 2013 1 / 98
What is sentiment classification?
A task within the research field of Sentiment Analysis.
It concerns the classification of documents on the basis of the overall sentiment expressed by their authors.
Different scales can be used:
positive/negative;
positive, negative and neutral;
rating: 1*, 2*, 3*, 4*, 5*.

Example
“The film was fun and I enjoyed it.” ⇒ positive
“The film lasted too long and I got bored.” ⇒ negative
Outline
1 Background
Introduction; Main approaches; Motivation
2 Graph-based algorithms
Label propagation; Modifications to LP; Graph construction
3 Data and their characteristics
Preprocessing; Baseline results; Data characteristics
4 Semi-supervised vs. cross-domain graph-based learning
Semi-supervised experiments; Cross-domain experiments; Discussion
Why do we need sentiment classification?
“What other people think” has always been important to people in their decision-making process.
Which cell phone to buy? Is it worth watching this movie?
Before the development of Internet technologies, people used to ask friends.
Now the Internet contains a huge number of opinions in forums, blogs, social networks, review sites, etc.
However, it is not easy to find and analyse these opinions.
Commercial applications
Understanding customer feedback. What are customers' opinions about specific company products?
Brand analysis, reputation management. What are customers' opinions about the company?
Market survey. What do customers think about competitors' products?
Trend prediction. Will people be satisfied with the products in the future?
Approaches to Sentiment Classification
Lexical approaches;
Supervised;
Semi-supervised and unsupervised;
Cross-domain.
Lexical approaches
Use of dictionaries of sentiment words with a given semantic orientation (SO).
Dictionaries are built either manually or (semi-)automatically.
A special scoring function is applied in order to calculate the final semantic orientation of a text.

Example
lightweight +3, good +4, ridiculous -2
“Lightweight, stores a ridiculous amount of books and good battery life.”
\[ SO_1 = \frac{3 + 4 - 2}{3} = 1\tfrac{2}{3} \]
\[ SO_2 = \max\{|3|, |4|, |-2|\} \cdot \operatorname{sign}(\max\{|3|, |4|, |-2|\}) = 4 \]
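The two scoring functions in the example can be sketched in a few lines of Python. This is only an illustration under assumptions: the function names, the regular-expression tokenizer, and the three-word lexicon are hypothetical, and the second score is read here as "the signed value of the word with the largest absolute score", which is one interpretation of the slide's formula.

```python
import re

# Toy lexicon taken from the example above (word -> semantic orientation).
LEXICON = {"lightweight": 3, "good": 4, "ridiculous": -2}

def word_scores(text, lexicon=LEXICON):
    """Return the lexicon scores of the sentiment words found in the text, in order."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [lexicon[t] for t in tokens if t in lexicon]

def so_average(scores):
    """First scoring function: mean of the sentiment word scores."""
    return sum(scores) / len(scores) if scores else 0.0

def so_strongest(scores):
    """Second scoring function (assumed reading): score with the largest magnitude, sign kept."""
    return max(scores, key=abs) if scores else 0

sentence = "Lightweight, stores a ridiculous amount of books and good battery life."
scores = word_scores(sentence)
print(scores, so_average(scores), so_strongest(scores))
```

The averaging score lets the negative word pull the result down, while the maximum-magnitude score ignores everything except the strongest word.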
Supervised machine learning approaches
Learn sentiment phenomena from an annotated corpus.
Different machine learning methods have been tested: Naive Bayes (NB), support vector machines (SVM), maximum entropy (ME).
They usually demonstrate relatively high performance even with surface features, e.g. 82.9% on movie reviews [Pang et al., 2002].
For review data, the ML approach performs better than the lexical one when training and test data belong to the same domain. For example, SO-CAL achieves only 76.4% on the same dataset [Taboada et al., 2011].
However, it needs a substantial amount of annotated data.
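As a rough illustration of learning from an annotated corpus with surface features, the sketch below trains a multinomial Naive Bayes classifier on unigram counts with add-one smoothing. It is a minimal toy, not the setup of the cited experiments: the function names are hypothetical, and the two-sentence "corpus" simply reuses the example sentences from the opening slide, which is far too little data for a real classifier.

```python
import math
import re
from collections import Counter, defaultdict

def tokenize(text):
    """Lowercase unigram features (an assumed, very simple tokenizer)."""
    return re.findall(r"[a-z']+", text.lower())

def train_nb(labeled_docs):
    """Count documents per class and word occurrences per class."""
    class_docs = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in labeled_docs:
        class_docs[label] += 1
        for tok in tokenize(text):
            word_counts[label][tok] += 1
            vocab.add(tok)
    return class_docs, word_counts, vocab

def predict_nb(model, text):
    """Return the class with the highest log-posterior under add-one smoothing."""
    class_docs, word_counts, vocab = model
    total_docs = sum(class_docs.values())
    best_label, best_logp = None, -math.inf
    for label in class_docs:
        logp = math.log(class_docs[label] / total_docs)
        denom = sum(word_counts[label].values()) + len(vocab)
        for tok in tokenize(text):
            if tok in vocab:  # words never seen in training are skipped
                logp += math.log((word_counts[label][tok] + 1) / denom)
        if logp > best_logp:
            best_label, best_logp = label, logp
    return best_label

# Toy annotated corpus (illustrative only).
train = [
    ("The film was fun and I enjoyed it.", "positive"),
    ("The film lasted too long and I got bored.", "negative"),
]
model = train_nb(train)
print(predict_nb(model, "I enjoyed the film, it was fun"))
```

With only two training documents the classifier merely memorises word overlap, which hints at why the approach needs a substantial amount of annotated data.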